WorldWideScience

Sample records for supercomputing applications ncsa

  1. Input/output behavior of supercomputing applications

    Science.gov (United States)

    Miller, Ethan L.

    1991-01-01

    The collection and analysis of supercomputer I/O traces and their use in a collection of buffering and caching simulations are described. This serves two purposes. First, it gives a model of how individual applications running on supercomputers request file system I/O, allowing system designers to optimize I/O hardware and file system algorithms to that model. Second, the buffering simulations show what resources are needed to maximize the CPU utilization of a supercomputer given a very bursty I/O request rate. By using read-ahead and write-behind in a large solid-state disk, one or two applications were sufficient to fully utilize a Cray Y-MP CPU.
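
    A minimal sketch of the kind of write-behind buffering simulation described above, assuming a synthetic bursty write trace; the function, its parameters, and the trace are illustrative inventions, not Miller's simulator:

        # Toy write-behind staging buffer: the disk drains it at a fixed rate
        # while the CPU issues bursty writes; the CPU stalls only when the
        # buffer is full. Assumes each individual burst fits in the buffer.
        def simulate_write_behind(trace, capacity, drain_per_tick):
            buffered, stalls = 0, 0
            for write in trace:
                buffered = max(0, buffered - drain_per_tick)   # disk drains first
                while buffered + write > capacity:             # CPU waits for space
                    stalls += 1
                    buffered = max(0, buffered - drain_per_tick)
                buffered += write
            return stalls

        # Bursty trace: nine idle ticks, then a 4 MiB write, repeated.
        trace = ([0] * 9 + [4 * 2**20]) * 100
        print(simulate_write_behind(trace, 16 * 2**20, 512 * 1024))  # fast drain: no stalls
        print(simulate_write_behind(trace, 16 * 2**20, 256 * 1024))  # slow drain: stalls

    With a drain rate above the average request rate, a modest buffer absorbs the bursts completely, which is the effect the solid-state disk exploits.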

  2. Improved Access to Supercomputers Boosts Chemical Applications.

    Science.gov (United States)

    Borman, Stu

    1989-01-01

    Supercomputing is described in terms of computing power and abilities. The increase in availability of supercomputers for use in chemical calculations and modeling is reported. Efforts of the National Science Foundation and Cray Research are highlighted. (CW)

  3. Low Cost Supercomputer for Applications in Physics

    Science.gov (United States)

    Ahmed, Maqsood; Ahmed, Rashid; Saeed, M. Alam; Rashid, Haris; Fazal-e-Aleem

    2007-02-01

    Using parallel processing techniques and commodity hardware, Beowulf supercomputers can be built at much lower cost. Research organizations and educational institutions are using this technique to build their own high-performance clusters. In this paper we discuss the architecture and design of a Beowulf supercomputer and our own experience of building the BURRAQ cluster.

  4. Application of the NCSA Habanero tool for collaboration on structural integrity assessments

    Energy Technology Data Exchange (ETDEWEB)

    Bass, B.R.; Kruse, K. [Oak Ridge National Lab., TN (United States); Dodds, R.H. Jr. [Univ. of Illinois, Urbana, IL (United States); Malik, S.N.M. [Nuclear Regulatory Commission, Washington, DC (United States)

    1998-11-01

    The Habanero software was developed by the National Center for Supercomputing Applications at the University of Illinois, Urbana-Champaign, as a framework for the collaborative sharing of Java applications. The Habanero tool distributes the interactions of single-user software to a multiuser collaborative environment. An investigation was conducted to evaluate the capabilities of the Habanero tool in providing an Internet-based collaborative framework for researchers located at different sites and operating on different workstations. These collaborative sessions focused on the sharing of test data and analysis results from materials engineering areas (i.e., fracture mechanics and structural integrity evaluations) related to reactor pressure vessel safety research sponsored by the US Nuclear Regulatory Commission. This report defines collaborative-system requirements for engineering applications and provides an overview of collaborative systems within the project. The installation, application, and detailed evaluation of the performance of the Habanero collaborative tool are compared to those of another commercially available collaborative product. Recommendations are given for future work in collaborative communications.

  5. Applications of parallel supercomputers: Scientific results and computer science lessons

    Energy Technology Data Exchange (ETDEWEB)

    Fox, G.C.

    1989-07-12

    Parallel computing has come of age with several commercial and in-house systems that deliver supercomputer performance. We illustrate this with several major computations completed or underway at Caltech on hypercubes, transputer arrays, and the SIMD Connection Machine CM-2 and AMT DAP. Applications covered are lattice gauge theory, computational fluid dynamics, subatomic string dynamics, statistical and condensed matter physics, theoretical and experimental astronomy, quantum chemistry, plasma physics, grain dynamics, computer chess, graphics ray tracing, and Kalman filters. We use these applications to compare the performance of several advanced-architecture computers, including the conventional CRAY and ETA-10 supercomputers. We describe which problems are suitable for which computers in terms of a matching between problem and computer architecture. This is part of a set of lessons we draw for hardware, software, and performance. We speculate on the emergence of new academic disciplines motivated by the growing importance of computers. 138 refs., 23 figs., 10 tabs.

  6. Porting Ordinary Applications to Blue Gene/Q Supercomputers

    Energy Technology Data Exchange (ETDEWEB)

    Maheshwari, Ketan C.; Wozniak, Justin M.; Armstrong, Timothy; Katz, Daniel S.; Binkowski, T. Andrew; Zhong, Xiaoliang; Heinonen, Olle; Karpeyev, Dmitry; Wilde, Michael

    2015-08-31

    Efficiently porting ordinary applications to Blue Gene/Q supercomputers is a significant challenge. Codes are often originally developed without considering advanced architectures and related tool chains. Science needs frequently lead users to want to run large numbers of relatively small jobs (often called many-task computing, an ensemble, or a workflow), which can conflict with supercomputer configurations. In this paper, we discuss techniques developed to execute ordinary applications over leadership-class supercomputers. We use the high-performance Swift parallel scripting framework and build two workflow execution techniques: sub-jobs and main-wrap. The sub-jobs technique, built on top of the IBM Blue Gene/Q resource manager Cobalt's sub-block jobs, lets users submit multiple, independent, repeated smaller jobs within a single larger resource block. The main-wrap technique is a scheme that enables C/C++ programs to be defined as functions that are wrapped by a high-performance Swift wrapper and that are invoked as a Swift script. We discuss the needs, benefits, technicalities, and current limitations of these techniques. We further discuss the real-world science enabled by these techniques and the results obtained.
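
    A minimal mpi4py sketch of the general many-small-jobs-within-one-allocation pattern that sub-jobs implement; this stand-in uses a plain round-robin partition and a hypothetical ./analyze command, not Swift's or Cobalt's actual interfaces:

        from mpi4py import MPI
        import subprocess

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        # Hypothetical task list: each entry is one small independent job.
        tasks = [f"./analyze input_{i}.dat" for i in range(1000)]

        # Static round-robin partition: rank r runs tasks r, r+size, r+2*size, ...
        for cmd in tasks[rank::size]:
            subprocess.run(cmd, shell=True, check=False)

        comm.Barrier()  # wait until every rank has drained its share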

  7. Supercomputational science

    CERN Document Server

    Wilson, S

    1990-01-01

    In contemporary research, the supercomputer now ranks, along with radio telescopes, particle accelerators and the other apparatus of "big science", as an expensive resource, which is nevertheless essential for state of the art research. Supercomputers are usually provided as shared central facilities. However, unlike telescopes and accelerators, they find a wide range of applications extending across a broad spectrum of research activity. The difference in performance between a "good" and a "bad" computer program on a traditional serial computer may be a factor of two or three, but on a contemporary supercomputer it can easily be a factor of one hundred or even more! Furthermore, this factor is likely to increase with future generations of machines. In keeping with the large capital and recurrent costs of these machines, it is appropriate to devote effort to training and familiarization so that supercomputers are employed to best effect. This volume records the lectures delivered at a Summer School ...

  8. Performance modeling of hybrid MPI/OpenMP scientific applications on large-scale multicore supercomputers

    KAUST Repository

    Wu, Xingfu

    2013-12-01

    In this paper, we present a performance modeling framework based on memory bandwidth contention time and a parameterized communication model to predict the performance of OpenMP, MPI and hybrid applications with weak scaling on three large-scale multicore supercomputers: IBM POWER4, POWER5+ and BlueGene/P, and analyze the performance of these MPI, OpenMP and hybrid applications. We use STREAM memory benchmarks and Intel's MPI benchmarks to provide initial performance analysis and model validation of MPI and OpenMP applications on these multicore supercomputers because the measured sustained memory bandwidth can provide insight into the memory bandwidth that a system should sustain on scientific applications with the same amount of workload per core. In addition to using these benchmarks, we also use a weak-scaling hybrid MPI/OpenMP large-scale scientific application: Gyrokinetic Toroidal Code (GTC) in magnetic fusion to validate our performance model of the hybrid application on these multicore supercomputers. The validation results for our performance modeling method show less than 7.77% error rate in predicting the performance of hybrid MPI/OpenMP GTC on up to 512 cores on these multicore supercomputers. © 2013 Elsevier Inc.
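
    As a rough illustration of the bandwidth-contention reasoning such a model rests on (a simplified form with our own symbols, not the paper's exact formulation): if p cores on a node each move D bytes through a shared sustained memory bandwidth B, then

        T_{\mathrm{mem}} \approx \frac{p\,D}{B_{\mathrm{sustained}}},
        \qquad
        T_{\mathrm{node}} \approx \max\bigl(T_{\mathrm{comp}},\, T_{\mathrm{mem}}\bigr) + T_{\mathrm{comm}},

    which is why the STREAM-measured sustained bandwidth, rather than the peak figure, drives the prediction.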

  9. Application of Supercomputer Technologies for Simulation Of Socio-Economic Systems

    Directory of Open Access Journals (Sweden)

    Vladimir Valentinovich Okrepilov

    2015-06-01

    To date, extensive experience has been accumulated in the investigation of problems related to quality, the assessment of management systems, and the modeling of economic system sustainability. These studies have created the basis for the development of a new research area, the Economics of Quality, whose tools make it possible to use model simulation to construct mathematical models that adequately reflect the role of quality in the natural, technical, and social regularities governing complex socio-economic systems. It is our firm belief that the extensive application and development of such models, together with system modeling using supercomputer technologies, will raise the study of socio-economic systems to an essentially new level. Moreover, the current research makes a significant contribution to the model simulation of multi-agent social systems and, no less important, belongs to the priority areas of science and technology development in our country. This article is devoted to the application of supercomputer technologies in the social sciences, first of all to the technical realization of large-scale agent-focused models (AFM). The essence of this tool is that, owing to the increase in computing power, it has become possible to describe the behavior of the many separate components of a complex system, such as a socio-economic system. The article also reviews the experience of foreign scientists and practitioners in running AFM on supercomputers, presents an AFM developed at CEMI RAS, and analyzes the stages and methods of efficiently mapping the computational kernel of a multi-agent system onto the architecture of a modern supercomputer. Experiments based on model simulation to forecast the population of St. Petersburg under three scenarios, the population being one of the major factors influencing the development of a socio-economic system and the quality of life in it, are presented in the article.

  10. Communication Characterization and Optimization of Applications Using Topology-Aware Task Mapping on Large Supercomputers

    Energy Technology Data Exchange (ETDEWEB)

    Sreepathi, Sarat [ORNL]; D'Azevedo, Eduardo [ORNL]; Philip, Bobby [ORNL]; Worley, Patrick H. [ORNL]

    2016-01-01

    On large supercomputers, the job scheduling system may assign a non-contiguous node allocation to a user application, depending on available resources. With parallel applications using MPI (Message Passing Interface), the default process ordering does not take into account the actual physical node layout available to the application. This contributes to non-locality in terms of physical network topology and impacts the communication performance of the application. In order to mitigate such performance penalties, this work describes techniques to identify a suitable task mapping that takes the layout of the allocated nodes as well as the application's communication behavior into account. During the first phase of this research, we instrumented and collected performance data to characterize the communication behavior of critical US DOE (Department of Energy) applications using an augmented version of the mpiP tool. Subsequently, we developed several reordering methods (spectral bisection, neighbor-join tree, etc.) to combine node layout and application communication data for optimized task placement. We developed a tool called mpiAproxy to facilitate detailed evaluation of the various reordering algorithms without requiring full application executions. This work presents a comprehensive performance evaluation (14,000 experiments) of the various task mapping techniques in lowering communication costs on Titan, the leadership-class supercomputer at Oak Ridge National Laboratory.
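
    A toy greedy placement showing the underlying idea of combining a communication matrix with node distances (our illustration only; the paper's spectral bisection and neighbor-join methods are more sophisticated):

        # Greedy topology-aware mapping: place the most talkative ranks first,
        # each on the free node that minimizes hop-weighted traffic to the
        # ranks already placed. One rank per node, for brevity.
        def map_tasks(comm_volume, hop_distance):
            n = len(comm_volume)
            ranks = sorted(range(n), key=lambda i: -sum(comm_volume[i]))
            placement, free_nodes = {}, set(range(n))
            for r in ranks:
                best = min(free_nodes, key=lambda node: sum(
                    comm_volume[r][q] * hop_distance[node][placement[q]]
                    for q in placement))
                placement[r] = best
                free_nodes.remove(best)
            return [placement[r] for r in range(n)]

        # Toy example: two chatty pairs of ranks, four nodes on a 1-D chain.
        vol = [[0, 9, 0, 1], [9, 0, 1, 0], [0, 1, 0, 9], [1, 0, 9, 0]]
        hops = [[abs(a - b) for b in range(4)] for a in range(4)]
        print(map_tasks(vol, hops))  # chatty pairs land on adjacent nodes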

  11. Efficient development of memory bounded geo-applications to scale on modern supercomputers

    Science.gov (United States)

    Räss, Ludovic; Omlin, Samuel; Licul, Aleksandar; Podladchikov, Yuri; Herman, Frédéric

    2016-04-01

    Numerical modeling is a key tool in the geosciences. The current challenge is to solve problems that are multi-physics and for which the length scale and the place of occurrence might not be known in advance. Also, the spatial extent of the investigated domain might strongly vary in size, ranging from millimeters for reactive transport to kilometers for glacier erosion dynamics. An efficient way to proceed is to develop simple but robust algorithms that perform well and scale on modern supercomputers and therefore permit very high-resolution simulations. We propose an efficient approach to solve memory-bounded real-world applications on modern supercomputer architectures. We optimize the software to run on our newly acquired state-of-the-art GPU cluster "octopus". Our approach shows promising preliminary results on important geodynamical and geomechanical problems: we have developed a Stokes solver for glacier flow and a poromechanical solver including complex rheologies for nonlinear waves in stressed porous rocks. We solve the system of partial differential equations on a regular Cartesian grid and use an iterative finite difference scheme with preconditioning of the residuals. The MPI communication happens only locally (point-to-point); this method is known to scale linearly by construction. The "octopus" GPU cluster, which we use for the computations, has been designed to achieve maximal data transfer throughput at minimal hardware cost. It is composed of twenty compute nodes, each hosting four Nvidia Titan X GPU accelerators. These high-density nodes are interconnected with a parallel (dual-rail) FDR InfiniBand network. Our efforts show promising preliminary results for the different physics investigated. The glacier flow solver achieves good accuracy in the relevant benchmarks and the coupled poromechanical solver makes it possible to explain previously unresolvable focused fluid flow as a natural outcome of the porosity setup. In both cases
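
    A single-process sketch of an iterative finite-difference scheme of this flavor (pseudo-transient relaxation with damping of the residual) on a 2-D Poisson model problem; the grid size, damping factor, and tolerance are assumed values:

        import numpy as np

        n, damp, tol = 128, 0.9, 1e-6
        h = 1.0 / (n - 1)
        u = np.zeros((n, n))
        f = np.ones((n, n))        # right-hand side of laplacian(u) = f
        du = np.zeros_like(u)      # damped (momentum-carrying) residual

        for it in range(20_000):
            # residual of the 5-point Laplacian at interior points
            r = (u[:-2, 1:-1] + u[2:, 1:-1] + u[1:-1, :-2] + u[1:-1, 2:]
                 - 4.0 * u[1:-1, 1:-1]) / h**2 - f[1:-1, 1:-1]
            du[1:-1, 1:-1] = damp * du[1:-1, 1:-1] + r
            u[1:-1, 1:-1] += (h**2 / 4.0) * du[1:-1, 1:-1]
            if np.max(np.abs(r)) < tol:
                break
        print(it, np.max(np.abs(r)))

    On a cluster the same update runs per subdomain, with only the one-cell-deep boundary layers exchanged point-to-point between neighbors, which is why such a method scales linearly by construction.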

  12. Cell-based Adaptive Mesh Refinement on the GPU with Applications to Exascale Supercomputing

    Science.gov (United States)

    Trujillo, Dennis; Robey, Robert; Davis, Neal; Nicholaeff, David

    2011-10-01

    We present an OpenCL implementation of a cell-based adaptive mesh refinement (AMR) scheme for the shallow water equations. The challenges associated with ensuring the locality of algorithm architecture to fully exploit the massive number of parallel threads on the GPU are discussed. This includes a proof of concept that a cell-based AMR code can be effectively implemented, even on a small scale, in the memory and threading model provided by OpenCL. Additionally, the program requires dynamic memory in order to properly implement the mesh; as this is not supported in the OpenCL 1.1 standard, a combination of CPU memory management and GPU computation effectively implements a dynamic memory allocation scheme. Load balancing is achieved through a new stencil-based implementation of a space-filling curve, eliminating the need for a complete recalculation of the indexing on the mesh. A Cartesian grid hash table scheme to allow fast parallel neighbor accesses is also discussed. Finally, the relative speedup of the GPU-enabled AMR code is compared to the original serial version. We conclude that parallelization using the GPU provides significant speedup for typical numerical applications and is feasible for scientific applications in the next generation of supercomputing.
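
    A small illustration of space-filling-curve load balancing using Morton (Z-order) indexing; the paper's stencil-based curve computation differs, so treat this as the generic idea only:

        # Interleave the bits of (x, y) into a Z-order key, sort the cells
        # along the curve, and cut the curve into equal contiguous chunks:
        # nearby cells tend to land on the same worker.
        def morton2d(x, y, bits=16):
            key = 0
            for b in range(bits):
                key |= ((x >> b) & 1) << (2 * b)
                key |= ((y >> b) & 1) << (2 * b + 1)
            return key

        cells = [(x, y) for x in range(8) for y in range(8)]
        curve = sorted(cells, key=lambda c: morton2d(*c))
        nworkers = 4
        chunk = len(curve) // nworkers
        parts = [curve[i * chunk:(i + 1) * chunk] for i in range(nworkers)]
        print([len(p) for p in parts])  # equal loads of spatially local cells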

  13. Federal Council on Science, Engineering and Technology: Committee on Computer Research and Applications, Subcommittee on Science and Engineering Computing: The US Supercomputer Industry

    Energy Technology Data Exchange (ETDEWEB)

    1987-12-01

    The Federal Coordinating Council on Science, Engineering, and Technology (FCCSET) Committee on Supercomputing was chartered by the Director of the Office of Science and Technology Policy in 1982 to examine the status of supercomputing in the United States and to recommend a role for the Federal Government in the development of this technology. In this study, the FCCSET Committee (now called the Subcommittee on Science and Engineering Computing of the FCCSET Committee on Computer Research and Applications) reports on the status of the supercomputer industry and addresses changes that have occurred since issuance of the 1983 and 1985 reports. The review is based upon periodic meetings with and site visits to supercomputer manufacturers and consultation with experts in high performance scientific computing. White papers have been contributed to this report by industry leaders and supercomputer experts.

  14. High Performance Distributed Computing in a Supercomputer Environment: Computational Services and Applications Issues

    Science.gov (United States)

    Kramer, Williams T. C.; Simon, Horst D.

    1994-01-01

    This tutorial is intended as a practical guide for the uninitiated to the main topics and themes of high-performance computing (HPC), with particular emphasis on distributed computing. The intent is first to provide some guidance and direction in the rapidly growing field of scientific computing using both massively parallel and traditional supercomputers. Because of their considerable potential computational power, loosely or tightly coupled clusters of workstations are increasingly considered as a third alternative to both conventional supercomputers based on a small number of powerful vector processors and massively parallel processors. Even though many research issues concerning the effective use of workstation clusters and their integration into a large-scale production facility are still unresolved, such clusters are already used for production computing. In this tutorial we draw on the unique experience gained at the NAS facility at NASA Ames Research Center. Over the last five years at NAS, massively parallel supercomputers such as the Connection Machines CM-2 and CM-5 from Thinking Machines Corporation and the iPSC/860 (Touchstone Gamma Machine) and Paragon machines from Intel were used in a production supercomputer center alongside traditional vector supercomputers such as the Cray Y-MP and C90.

  15. Enabling department-scale supercomputing

    Energy Technology Data Exchange (ETDEWEB)

    Greenberg, D.S.; Hart, W.E.; Phillips, C.A.

    1997-11-01

    The Department of Energy (DOE) national laboratories have one of the longest and most consistent histories of supercomputer use. The authors summarize the architecture of DOE's new supercomputers that are being built for the Accelerated Strategic Computing Initiative (ASCI). The authors then argue that in the near future scaled-down versions of these supercomputers with petaflop-per-weekend capabilities could become widely available to hundreds of research and engineering departments. The availability of such computational resources will allow simulation of physical phenomena to become a full-fledged third branch of scientific exploration, along with theory and experimentation. They describe the ASCI and other supercomputer applications at Sandia National Laboratories, and discuss which lessons learned from Sandia's long history of supercomputing can be applied in this new setting.

  16. Introduction to Reconfigurable Supercomputing

    CERN Document Server

    Lanzagorta, Marco; Rosenberg, Robert

    2010-01-01

    This book covers technologies, applications, tools, languages, procedures, advantages, and disadvantages of reconfigurable supercomputing using Field Programmable Gate Arrays (FPGAs). The target audience is the community of users of High Performance Computers (HPC) who may benefit from porting their applications into a reconfigurable environment. As such, this book is intended to guide the HPC user through the many algorithmic considerations, hardware alternatives, usability issues, programming languages, and design tools that need to be understood before embarking on the creation of reconfigurable ...

  17. Brief Exploration on Technical Development of Key Applications at Supercomputing Center

    Institute of Scientific and Technical Information of China (English)

    党岗; 程志全

    2013-01-01

    At present, China's national supercomputing centers have mostly been built on the model of local government investment with market-oriented application development. Local governments care most about the high-performance computing applications and services that involve local enterprises and institutions, so the centers are often devoted to ordinary applications, and it is difficult to bring the strategic role of supercomputing into full play. How to keep these enormously capable centers viable, and then to push forward and drive technical innovation, has long been a research topic in the field. This paper offers a preliminary discussion of the challenges facing the core applications of China's supercomputing centers and makes several suggestions on how those core applications can serve local development.

  18. Performance Evaluation of an Intel Haswell- and Ivy Bridge-Based Supercomputer Using Scientific and Engineering Applications

    Science.gov (United States)

    Saini, Subhash; Hood, Robert T.; Chang, Johnny; Baron, John

    2016-01-01

    We present a performance evaluation conducted on a production supercomputer of the Intel Xeon Processor E5-2680v3, a twelve-core implementation of the fourth-generation Haswell architecture, and compare it with the Intel Xeon Processor E5-2680v2, an Ivy Bridge implementation of the third-generation Sandy Bridge architecture. Several new architectural features have been incorporated in Haswell, including improvements in all levels of the memory hierarchy as well as improvements to vector instructions and power management. We critically evaluate these new features of Haswell and compare with Ivy Bridge using several low-level benchmarks, including a subset of HPCC, HPCG, and four full-scale scientific and engineering applications. We also present a model that predicts the performance of HPCG and Cart3D within 5%, and of Overflow within 10% accuracy.

  19. Parallel workflow manager for non-parallel bioinformatic applications to solve large-scale biological problems on a supercomputer.

    Science.gov (United States)

    Suplatov, Dmitry; Popova, Nina; Zhumatiy, Sergey; Voevodin, Vladimir; Švedas, Vytas

    2016-04-01

    Rapid expansion of online resources providing access to genomic, structural, and functional information associated with biological macromolecules opens an opportunity to gain a deeper understanding of the mechanisms of biological processes through systematic analysis of large datasets. This, however, requires novel strategies to optimally utilize computer processing power. Some methods in bioinformatics and molecular modeling require extensive computational resources. Other algorithms have fast implementations which take at most several hours to analyze a common input on a modern desktop workstation; however, because they must be invoked repeatedly over a large number of subtasks, the full task requires significant computing power. Therefore, an efficient computational solution to large-scale biological problems requires both a wise parallel implementation of resource-hungry methods and a smart workflow to manage multiple invocations of relatively fast algorithms. In this work, new software called mpiWrapper has been developed to accommodate non-parallel implementations of scientific algorithms within the parallel supercomputing environment. The Message Passing Interface is used to exchange information between nodes. Two specialized threads, one for task management and communication and another for subtask execution, are invoked on each processing unit to avoid deadlock while using blocking calls to MPI. mpiWrapper can be used to launch all conventional Linux applications without the need to modify their original source codes and supports resubmission of subtasks on node failure. We show that this approach can be used to process huge amounts of biological data efficiently by running non-parallel programs in parallel mode on a supercomputer. The C++ source code and documentation are available from http://biokinet.belozersky.msu.ru/mpiWrapper.
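
    A bare-bones mpi4py master/worker sketch of the pattern described (the published tool is C++ and dedicates a second thread per processing unit to communication to avoid blocking-call deadlock; this single-threaded simplification and the example command are our assumptions):

        from mpi4py import MPI
        import subprocess

        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()

        if rank == 0:                        # task management rank
            jobs = [f"blastp -query q{i}.fa -db nr" for i in range(100)]  # hypothetical
            status = MPI.Status()
            active = comm.Get_size() - 1
            while active:
                comm.recv(source=MPI.ANY_SOURCE, status=status)   # work request
                if jobs:
                    comm.send(jobs.pop(), dest=status.Get_source())
                else:
                    comm.send(None, dest=status.Get_source())     # shut worker down
                    active -= 1
        else:                                # subtask execution ranks
            while True:
                comm.send(rank, dest=0)      # ask for work
                cmd = comm.recv(source=0)
                if cmd is None:
                    break
                subprocess.run(cmd, shell=True, check=False)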

  20. Microprocessors: from desktops to supercomputers.

    Science.gov (United States)

    Baskett, F; Hennessy, J L

    1993-08-13

    Continuing improvements in integrated circuit technology and computer architecture have driven microprocessors to performance levels that rival those of supercomputers, at a fraction of the price. The use of sophisticated memory hierarchies enables microprocessor-based machines to have very large memories built from commodity dynamic random access memory while retaining the high bandwidth and low access time needed in a high-performance machine. Parallel processors composed of these high-performance microprocessors are becoming the supercomputing technology of choice for scientific and engineering applications. The challenges for these new supercomputers have been in developing multiprocessor architectures that are easy to program and that deliver high performance without extraordinary programming efforts by users. Recent progress in multiprocessor architecture has led to ways to meet these challenges.

  1. The Application of Cloud Computing to the Creation of Image Mosaics and Management of Their Provenance

    CERN Document Server

    Berriman, G Bruce; Groth, Paul; Juve, Gideon

    2010-01-01

    We have used the Montage image mosaic engine to investigate the cost and performance of processing images on the Amazon EC2 cloud, and to inform the requirements that higher-level products impose on provenance management technologies. We will present a detailed comparison of the performance of Montage on the cloud and on the Abe high performance cluster at the National Center for Supercomputing Applications (NCSA). Because Montage generates many intermediate products, we have used it to understand the science requirements that higher-level products impose on provenance management technologies. We describe experiments with provenance management technologies such as the "Provenance Aware Service Oriented Architecture" (PASOA).

  2. KAUST Supercomputing Laboratory

    KAUST Repository

    Bailey, April Renee

    2011-11-15

    KAUST has partnered with IBM to establish a Supercomputing Research Center. KAUST is hosting the Shaheen supercomputer, named after the Arabian falcon famed for its swiftness of flight. This 16-rack IBM Blue Gene/P system is equipped with 4 gigabytes of memory per node and is capable of 222 teraflops, making the KAUST campus the site of one of the world's fastest supercomputers in an academic environment. KAUST is targeting petaflop capability within 3 years.

  3. Ultrascalable petaflop parallel supercomputer

    Energy Technology Data Exchange (ETDEWEB)

    Blumrich, Matthias A. (Ridgefield, CT); Chen, Dong (Croton On Hudson, NY); Chiu, George (Cross River, NY); Cipolla, Thomas M. (Katonah, NY); Coteus, Paul W. (Yorktown Heights, NY); Gara, Alan G. (Mount Kisco, NY); Giampapa, Mark E. (Irvington, NY); Hall, Shawn (Pleasantville, NY); Haring, Rudolf A. (Cortlandt Manor, NY); Heidelberger, Philip (Cortlandt Manor, NY); Kopcsay, Gerard V. (Yorktown Heights, NY); Ohmacht, Martin (Yorktown Heights, NY); Salapura, Valentina (Chappaqua, NY); Sugavanam, Krishnan (Mahopac, NY); Takken, Todd (Brewster, NY)

    2010-07-20

    A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.
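
    A tiny illustration of the torus property that motivates such networks: wrap-around links halve the worst-case hop count in every dimension (coordinates and dimensions here are hypothetical, not the patent's configuration):

        # Shortest hop count between two nodes of a 3-D torus: in each
        # dimension a packet may travel either way around the ring.
        def torus_hops(a, b, dims):
            return sum(min((x - y) % d, (y - x) % d)
                       for x, y, d in zip(a, b, dims))

        print(torus_hops((0, 0, 0), (7, 1, 5), dims=(8, 8, 8)))  # 1 + 1 + 3 = 5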

  4. Advancements and performance of iterative methods in industrial applications codes on CRAY parallel/vector supercomputers

    Energy Technology Data Exchange (ETDEWEB)

    Poole, G.; Heroux, M. [Engineering Applications Group, Eagan, MN (United States)

    1994-12-31

    This paper will focus on recent work in two widely used industrial applications codes with iterative methods. The ANSYS program, a general-purpose finite element code widely used in structural analysis applications, has now added an iterative solver option. Some results are given from real applications comparing performance with the traditional parallel/vector frontal solver used in ANSYS. Discussion of the applicability of iterative solvers as a general-purpose solver will include the topics of robustness, as well as memory requirements and CPU performance. The FIDAP program is a widely used CFD code which uses iterative solvers routinely. A brief description of preconditioners used and some performance enhancements for CRAY parallel/vector systems is given. The solution of large-scale applications in structures and CFD includes examples from industry problems solved on CRAY systems.
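
    For readers unfamiliar with the solver class being compared against the frontal method, a plain conjugate-gradient iteration is the prototype (illustrative only, not the ANSYS or FIDAP implementation):

        import numpy as np

        def cg(A, b, tol=1e-10, maxit=1000):
            # Conjugate gradients for a symmetric positive definite matrix A.
            x = np.zeros_like(b)
            r = b - A @ x                 # residual
            p = r.copy()                  # search direction
            rs = r @ r
            for _ in range(maxit):
                Ap = A @ p
                alpha = rs / (p @ Ap)
                x += alpha * p
                r -= alpha * Ap
                rs_new = r @ r
                if np.sqrt(rs_new) < tol:
                    break
                p = r + (rs_new / rs) * p
                rs = rs_new
            return x

        rng = np.random.default_rng(0)
        M = rng.standard_normal((50, 50))
        A = M @ M.T + 50 * np.eye(50)     # well-conditioned SPD test matrix
        b = rng.standard_normal(50)
        print(np.linalg.norm(A @ cg(A, b) - b))

    The appeal over a frontal (direct) solver is memory: only a few vectors and a matrix-vector product are needed, at the price of the robustness questions the paper discusses.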

  5. Science Driven Supercomputing Architectures: AnalyzingArchitectural Bottlenecks with Applications and Benchmark Probes

    Energy Technology Data Exchange (ETDEWEB)

    Kamil, S.; Yelick, K.; Kramer, W.T.; Oliker, L.; Shalf, J.; Shan,H.; Strohmaier, E.

    2005-09-26

    There is a growing gap between the peak speed of parallel computing systems and the actual delivered performance for scientific applications. In general this gap is caused by inadequate architectural support for the requirements of modern scientific applications, as commercial applications, and the much larger market they represent, have driven the evolution of computer architectures. This gap has raised the importance of developing better benchmarking methodologies to characterize and to understand the performance requirements of scientific applications, and to communicate them efficiently in order to influence the design of future computer architectures. This improved understanding of the performance behavior of scientific applications will allow improved performance predictions, development of adequate benchmarks for identification of hardware and application features that work well or poorly together, and a more systematic performance evaluation in procurement situations. The Berkeley Institute for Performance Studies has developed a three-level approach to evaluating the design of high end machines and the software that runs on them: (1) A suite of representative applications; (2) A set of application kernels; and (3) Benchmarks to measure key system parameters. The three levels yield different types of information, all of which are useful in evaluating systems, and enable NSF and DOE centers to select computer architectures more suited for scientific applications. The analysis will further allow the centers to engage vendors in discussion of strategies to alleviate the present architectural bottlenecks using quantitative information. These may include small hardware changes or larger ones that may be of interest to non-scientific workloads. Providing quantitative models to the vendors allows them to assess the benefits of technology alternatives using their own internal cost-models in the broader marketplace, ideally facilitating the development of future computer architectures more suited for scientific applications.

  6. Emerging supercomputer architectures

    Energy Technology Data Exchange (ETDEWEB)

    Messina, P.C.

    1987-01-01

    This paper will examine the current and near-future trends for commercially available high-performance computers with architectures that differ from the mainstream "supercomputer" systems in use for the last few years. These emerging supercomputer architectures are just beginning to have an impact on the field of high performance computing. 7 refs., 1 tab.

  7. Programmable lithography engine (ProLE) grid-type supercomputer and its applications

    Science.gov (United States)

    Petersen, John S.; Maslow, Mark J.; Gerold, David J.; Greenway, Robert T.

    2003-06-01

    There are many variables that can affect lithographic dependent device yield. Because of this, it is not enough to make optical proximity corrections (OPC) based on the mask type, wavelength, lens, illumination type, and coherence. Resist chemistry and physics along with substrate, exposure, and all post-exposure processing must be considered too. Only a holistic approach to finding imaging solutions will accelerate yield and maximize performance. Since experiments are too costly in both time and money, accomplishing this takes massive amounts of accurate simulation capability. Our solution is to create a workbench that has a set of advanced user applications that utilize best-in-class simulator engines for solving litho-related DFM problems using distributed computing. Our product, ProLE (Programmable Lithography Engine), is an integrated system that combines Petersen Advanced Lithography Inc.'s (PAL's) proprietary applications and cluster management software wrapped around commercial software engines, along with optional commercial hardware and software. It uses the most rigorous lithography simulation engines to solve deep sub-wavelength imaging problems accurately and at speeds that are several orders of magnitude faster than current methods. Specifically, ProLE uses full vector thin-mask aerial image models or, when needed, full across-source 3D electromagnetic field simulation to make accurate aerial image predictions along with calibrated resist models. The ProLE workstation from Petersen Advanced Lithography, Inc., is the first commercial product that makes it possible to do these intensive calculations in a fraction of the time previously required, thus significantly reducing time to market for advanced technology devices. In this work, ProLE is introduced through model comparison to show why vector imaging and rigorous resist models work better than less rigorous models; then some applications that use our distributed computing solution are shown.

  8. Performance Characteristics of Hybrid MPI/OpenMP Scientific Applications on a Large-Scale Multithreaded BlueGene/Q Supercomputer

    KAUST Repository

    Wu, Xingfu

    2013-07-01

    In this paper, we investigate the performance characteristics of five hybrid MPI/OpenMP scientific applications (two NAS Parallel Benchmarks Multi-Zone codes, SP-MZ and BT-MZ; an earthquake simulation, PEQdyna; an aerospace application, PMLB; and a 3D particle-in-cell application, GTC) on a large-scale multithreaded Blue Gene/Q supercomputer at Argonne National Laboratory, and quantify the performance gap resulting from using different numbers of threads per node. We use performance tools and MPI profile and trace libraries available on the supercomputer to analyze and compare the performance of these hybrid scientific applications as the number of OpenMP threads per node increases, and find that beyond some point adding threads saturates or worsens performance of these hybrid applications. For the strong-scaling hybrid scientific applications such as SP-MZ, BT-MZ, PEQdyna and PMLB, using 32 threads per node results in much better application efficiency than using 64 threads per node; as the number of threads per node increases, the FPU (Floating Point Unit) percentage decreases, and the MPI percentage (except for PMLB) and IPC (Instructions Per Cycle) per core (except for BT-MZ) increase. For the weak-scaling hybrid scientific application GTC, the performance trend (relative speedup) is very similar with increasing number of threads per node no matter how many nodes (32, 128, 512) are used. © 2013 IEEE.

  9. Validation of a multidimensional deterministic nuclear data sensitivity and uncertainty code system: an application needing supercomputing

    Energy Technology Data Exchange (ETDEWEB)

    Bidaud, A.; Mastrangelo, V. [Conservatoire National des Arts et Metiers, Laboratoire de Physique (CNAM), 75 - Paris (France); Institut de Physique Nucleaire (IN2P3/CNRS) 91 - Orsay (France); Kodeli, I.; Sartori, E. [OECD NEA Data Bank, 92 - Issy les Moulineaux (France)

    2003-07-01

    The quality of nuclear core modelling is linked to the quality of basic nuclear data such as the probability of reaction (i.e., cross sections) between neutrons and the nuclei of the core materials. Perturbation theory, whose applications in nuclear science were largely developed in the sixties, provides tools for estimating the sensitivity of integral parameters such as k-eff, reaction rates, or breeding ratio to the cross sections. Computation with these tools requires approximations in the simulation of space-, angle- and energy-dependent neutron transport. To minimise the impact of geometry modelling approximations on the calculation, the use of 3-dimensional multigroup transport codes is recommended. Sensitivity and uncertainty analyses are the tools needed to estimate the accuracy that a code system with its data libraries can achieve. They can guide users as to the specific need for improved data to carry out reliable simulations. However, as full-scale models in 3 dimensions with refined descriptions of the phase space are used, high-performance computers and codes designed to run on parallel architectures are needed to obtain results within acceptable time limits.
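
    For reference, the standard first-order formulation that such sensitivity and uncertainty tools implement (textbook form, consistent with but not quoted from this paper): the relative sensitivity of k-eff to a cross section sigma, and the propagation of the cross-section covariance matrix C to the uncertainty on k, are

        S_{k,\sigma} = \frac{\sigma}{k}\,\frac{\partial k}{\partial \sigma},
        \qquad
        \left(\frac{\Delta k}{k}\right)^{2} \approx \mathbf{S}^{\mathsf{T}}\,\mathbf{C}\,\mathbf{S},

    where S collects the sensitivities over all nuclides, reactions, and energy groups; the size of that vector is what pushes full 3-dimensional multigroup calculations toward supercomputers.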

  10. NSF Commits to Supercomputers.

    Science.gov (United States)

    Waldrop, M. Mitchell

    1985-01-01

    The National Science Foundation (NSF) has allocated at least $200 million over the next five years to support four new supercomputer centers. Issues and trends related to this NSF initiative are examined. (JN)

  11. Energy sciences supercomputing 1990

    Energy Technology Data Exchange (ETDEWEB)

    Mirin, A.A.; Kaiper, G.V. (eds.)

    1990-01-01

    This report contains papers on the following topics: meeting the computational challenge; lattice gauge theory: probing the standard model; supercomputing for the superconducting super collider; an overview of ongoing studies in climate model diagnosis and intercomparison; MHD simulation of the fueling of a tokamak fusion reactor through the injection of compact toroids; gyrokinetic particle simulation of tokamak plasmas; analyzing chaos: a visual essay in nonlinear dynamics; supercomputing and research in theoretical chemistry; Monte Carlo simulations of light nuclei; parallel processing; and scientists of the future: learning by doing.

  12. Supercomputers to transform Science

    CERN Multimedia

    2006-01-01

    "New insights into the structure of space and time, climate modeling, and the design of novel drugs, are but a few of the many research areas that will be transforned by the installation of three supercomputers at the Unversity of Bristol." (1/2 page)

  13. Petaflop supercomputers of China

    Institute of Scientific and Technical Information of China (English)

    Guoliang CHEN

    2010-01-01

    After ten years of development, high performance computing (HPC) in China has made remarkable progress. In November 2010, the NUDT Tianhe-1A and the Dawning Nebulae respectively claimed the 1st and 3rd places in the TOP500 supercomputers list; this is international recognition of the level China has achieved in high performance computer manufacturing.

  14. A training program for scientific supercomputing users

    Energy Technology Data Exchange (ETDEWEB)

    Hanson, F.; Moher, T.; Sabelli, N.; Solem, A.

    1988-01-01

    There is a need for a mechanism to transfer supercomputing technology into the hands of scientists and engineers in such a way that they will acquire a foundation of knowledge that will permit integration of supercomputing as a tool in their research. Most computing center training emphasizes computer-specific information about how to use a particular computer system; most academic programs teach concepts to computer scientists. Only a few brief courses and new programs are designed for computational scientists. This paper describes an eleven-week training program aimed principally at graduate and postdoctoral students in computationally intensive fields. The program is designed to balance the specificity of computing center courses, the abstractness of computer science courses, and the personal contact of traditional apprentice approaches. It is based on the experience of computer scientists and computational scientists, and consists of seminars and clinics given by many visiting and local faculty. It covers a variety of supercomputing concepts, issues, and practices related to architecture, operating systems, software design, numerical considerations, code optimization, graphics, communications, and networks. Its research component encourages understanding of scientific computing and supercomputer hardware issues. Flexibility in thinking about computing needs is emphasized by the use of several different supercomputer architectures, such as the Cray X-MP/48 at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign, the IBM 3090 600E/VF at the Cornell National Supercomputer Facility, and the Alliant FX/8 at the Advanced Computing Research Facility at Argonne National Laboratory. 11 refs., 6 tabs.

  15. The Application of Cloud Computing to Astronomy: A Study of Cost and Performance

    CERN Document Server

    Berriman, G Bruce; Juve, Gideon; Regelson, Moira; Plavchan, Peter

    2010-01-01

    Cloud computing is a powerful new technology that is widely used in the business world. Recently, we have been investigating the benefits it offers to scientific computing. We have used three workflow applications to compare the performance of processing data on the Amazon EC2 cloud with the performance on the Abe high-performance cluster at the National Center for Supercomputing Applications (NCSA). We show that the Amazon EC2 cloud offers better performance and value for processor- and memory-limited applications than for I/O-bound applications. We provide an example of how the cloud is well suited to the generation of a science product: an atlas of periodograms for the 210,000 light curves released by the NASA Kepler Mission. This atlas will support the identification of periodic signals, including those due to transiting exoplanets, in the Kepler data sets.
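
    A sketch of the per-light-curve kernel behind such an atlas: a Lomb-Scargle periodogram of one synthetic, unevenly sampled light curve (the signal period, noise level, and sampling are made up):

        import numpy as np
        from scipy.signal import lombscargle

        rng = np.random.default_rng(0)
        t = np.sort(rng.uniform(0, 90, 2000))      # 90 days, uneven sampling
        flux = (1.0 + 0.01 * np.sin(2 * np.pi * t / 3.5)
                + 0.005 * rng.standard_normal(t.size))

        periods = np.linspace(0.5, 10, 5000)        # trial periods in days
        power = lombscargle(t, flux - flux.mean(), 2 * np.pi / periods)
        print("best period ~", periods[power.argmax()], "days")  # ~3.5

    Each light curve is independent of every other one, which is exactly the embarrassingly parallel shape that makes the cloud a good fit for this product.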

  16. Supercomputer debugging workshop 1991 proceedings

    Energy Technology Data Exchange (ETDEWEB)

    Brown, J.

    1991-01-01

    This report discusses the following topics on supercomputer debugging: distributed debugging; user interface to debugging tools and standards; debugging optimized codes; debugging parallel codes; and debugger performance and interface as analysis tools. (LSP)

  17. Investigation of supercomputer capabilities for the scalable numerical simulation of computational fluid dynamics problems in industrial applications

    Science.gov (United States)

    Kozelkov, A. S.; Kurulin, V. V.; Lashkin, S. V.; Shagaliev, R. M.; Yalozo, A. V.

    2016-08-01

    Two main issues in the efficient use of computational fluid dynamics (CFD) in industrial applications—simulation of turbulence and speedup of computations—are analyzed. Results of an investigation of the potential of eddy-resolving approaches to turbulence simulation in industrial applications with the use of arbitrary unstructured grids are presented. Algorithms for speeding up scalable high-performance computations based on multigrid technologies are proposed.
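
    A toy two-grid correction for a 1-D Poisson problem, the building block of the multigrid technologies referred to above (our illustration, not the authors' production algorithm):

        import numpy as np

        def jacobi(u, f, h, sweeps, w=2/3):
            # weighted-Jacobi smoother for -u'' = f, u(0) = u(1) = 0
            for _ in range(sweeps):
                u[1:-1] += w * 0.5 * (u[:-2] + u[2:] + h*h*f[1:-1] - 2*u[1:-1])
            return u

        def residual(u, f, h):
            r = np.zeros_like(u)
            r[1:-1] = f[1:-1] - (2*u[1:-1] - u[:-2] - u[2:]) / h**2
            return r

        n = 129
        h = 1.0 / (n - 1)
        x = np.linspace(0.0, 1.0, n)
        f = np.sin(np.pi * x)
        u = np.zeros(n)
        nc, H = (n + 1) // 2, 2 * h
        Ac = (np.diag(2.0 * np.ones(nc - 2))          # coarse-grid operator
              - np.diag(np.ones(nc - 3), 1)
              - np.diag(np.ones(nc - 3), -1)) / H**2

        for cycle in range(10):
            u = jacobi(u, f, h, sweeps=3)             # pre-smooth
            r = residual(u, f, h)
            rc = np.zeros(nc)                         # full-weighting restriction
            rc[1:-1] = 0.25 * (r[1:-2:2] + 2.0 * r[2:-1:2] + r[3::2])
            ec = np.zeros(nc)
            ec[1:-1] = np.linalg.solve(Ac, rc[1:-1])  # exact coarse solve
            e = np.zeros(n)                           # linear prolongation
            e[::2] = ec
            e[1::2] = 0.5 * (ec[:-1] + ec[1:])
            u = jacobi(u + e, f, h, sweeps=3)         # correct and post-smooth
            print(cycle, np.max(np.abs(residual(u, f, h))))

    The residual drops by a roughly grid-size-independent factor per cycle; recursing on the coarse solve instead of solving it exactly gives the full multigrid method whose scalability is at stake here.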

  18. Optimization of Applications with Non-blocking Neighborhood Collectives via Multisends on the Blue Gene/P Supercomputer.

    Science.gov (United States)

    Kumar, Sameer; Heidelberger, Philip; Chen, Dong; Hines, Michael

    2010-04-19

    We explore the multisend interface as a data-mover interface for optimizing applications with neighborhood collective communication operations. One of the limitations of the current MPI 2.1 standard is that the vector collective calls require counts and displacements (zero and nonzero bytes) to be specified for all the processors in the communicator. Further, all the collective calls in MPI 2.1 are blocking and do not permit overlap of communication with computation. We present the record-replay persistent optimization to the multisend interface that minimizes the processor overhead of initiating the collective. We present four different case studies with the multisend API on Blue Gene/P: (i) 3D-FFT, (ii) 4D nearest neighbor exchange as used in quantum chromodynamics, (iii) NAMD, and (iv) the neural network simulator NEURON. Performance results show 1.9× speedup with 32^3 3D-FFTs, 1.9× speedup for the 4D nearest neighbor exchange with the 2^4 problem, 1.6× speedup in NAMD, and almost 3× speedup in NEURON with 256K cells and 1k connections/cell.
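
    A sketch of the communication/computation overlap that non-blocking neighborhood exchanges make possible, written here as a plain mpi4py 1-D periodic halo exchange rather than the multisend API itself:

        from mpi4py import MPI
        import numpy as np

        comm = MPI.COMM_WORLD
        cart = comm.Create_cart([comm.Get_size()], periods=[True])
        left, right = cart.Shift(0, 1)          # ring neighbors

        u = np.random.rand(1024)                # local slab of a 1-D field
        halo_l, halo_r = np.empty(1), np.empty(1)

        reqs = [cart.Isend(u[-1:].copy(), dest=right),   # post all transfers
                cart.Irecv(halo_l, source=left),
                cart.Isend(u[:1].copy(), dest=left),
                cart.Irecv(halo_r, source=right)]

        u_new = np.empty_like(u)
        u_new[1:-1] = 0.5 * (u[:-2] + u[2:])    # interior overlaps communication
        MPI.Request.Waitall(reqs)               # halos needed only at the edges
        u_new[0] = 0.5 * (halo_l[0] + u[1])
        u_new[-1] = 0.5 * (u[-2] + halo_r[0])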

  1. Computational Dimensionalities of Global Supercomputing

    Directory of Open Access Journals (Sweden)

    Richard S. Segall

    2013-12-01

    This Invited Paper pertains to the subject of my Plenary Keynote Speech at the 17th World Multi-Conference on Systemics, Cybernetics and Informatics (WMSCI 2013), held in Orlando, Florida on July 9-12, 2013. The title of the speech was "Dimensionalities of Computation: from Global Supercomputing to Data, Text and Web Mining", but this Invited Paper will focus only on the "Computational Dimensionalities of Global Supercomputing" and is based upon a summary of the contents of several individual articles that have been previously written with myself as lead author and published in [75], [76], [77], [78], [79], [80] and [11]. The topics of the Plenary Speech included Overview of Current Research in Global Supercomputing [75], Open-Source Software Tools for Data Mining Analysis of Genomic and Spatial Images using High Performance Computing [76], Data Mining Supercomputing with SAS™ JMP® Genomics ([77], [79], [80]), and Visualization by Supercomputing Data Mining [81]. ______________________ [11] Committee on the Future of Supercomputing, National Research Council (2003), The Future of Supercomputing: An Interim Report, ISBN-13: 978-0-309-09016-2, http://www.nap.edu/catalog/10784.html [75] Segall, Richard S.; Zhang, Qingyu and Cook, Jeffrey S. (2013), "Overview of Current Research in Global Supercomputing", Proceedings of the Forty-Fourth Meeting of the Southwest Decision Sciences Institute (SWDSI), Albuquerque, NM, March 12-16, 2013. [76] Segall, Richard S. and Zhang, Qingyu (2010), "Open-Source Software Tools for Data Mining Analysis of Genomic and Spatial Images using High Performance Computing", Proceedings of the 5th INFORMS Workshop on Data Mining and Health Informatics, Austin, TX, November 6, 2010. [77] Segall, Richard S., Zhang, Qingyu and Pierce, Ryan M. (2010), "Data Mining Supercomputing with SAS™ JMP® Genomics: Research-in-Progress", Proceedings of the 2010 Conference on Applied Research in Information Technology, sponsored by

  2. Programming Environment for a High-Performance Parallel Supercomputer with Intelligent Communication

    OpenAIRE

    Gunzinger, A.; Bäumle, B.; Frey, M.; Klebl, M.; Kocheisen, M.; Kohler, P.; Morel, R.; Müller, U.; Rosenthal, M.

    1996-01-01

    At the Electronics Laboratory of the Swiss Federal Institute of Technology (ETH) in Zürich, the high-performance parallel supercomputer MUSIC (MUlti processor System with Intelligent Communication) has been developed. As applications like neural network simulation and molecular dynamics show, the Electronics Laboratory supercomputer is absolutely on par with conventional supercomputers, but electric power requirements are reduced by a factor of 1,000, weight is reduced by a factor of...

  3. World's fastest supercomputer opens up to users

    Science.gov (United States)

    Xin, Ling

    2016-08-01

    China's latest supercomputer - Sunway TaihuLight - has claimed the crown as the world's fastest computer according to the latest TOP500 list, released at the International Supercomputer Conference in Frankfurt in late June.

  4. Multi-petascale highly efficient parallel supercomputer

    Energy Technology Data Exchange (ETDEWEB)

    Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.; Blumrich, Matthias A.; Boyle, Peter; Brunheroto, Jose R.; Chen, Dong; Cher, Chen-Yong; Chiu, George L.; Christ, Norman; Coteus, Paul W.; Davis, Kristan D.; Dozsa, Gabor J.; Eichenberger, Alexandre E.; Eisley, Noel A.; Ellavsky, Matthew R.; Evans, Kahn C.; Fleischer, Bruce M.; Fox, Thomas W.; Gara, Alan; Giampapa, Mark E.; Gooding, Thomas M.; Gschwind, Michael K.; Gunnels, John A.; Hall, Shawn A.; Haring, Rudolf A.; Heidelberger, Philip; Inglett, Todd A.; Knudson, Brant L.; Kopcsay, Gerard V.; Kumar, Sameer; Mamidala, Amith R.; Marcella, James A.; Megerian, Mark G.; Miller, Douglas R.; Miller, Samuel J.; Muff, Adam J.; Mundy, Michael B.; O'Brien, John K.; O'Brien, Kathryn M.; Ohmacht, Martin; Parker, Jeffrey J.; Poole, Ruth J.; Ratterman, Joseph D.; Salapura, Valentina; Satterfield, David L.; Senger, Robert M.; Smith, Brian; Steinmacher-Burow, Burkhard; Stockdell, William M.; Stunkel, Craig B.; Sugavanam, Krishnan; Sugawara, Yutaka; Takken, Todd E.; Trager, Barry M.; Van Oosten, James L.; Wait, Charles D.; Walkup, Robert E.; Watson, Alfred T.; Wisniewski, Robert W.; Wu, Peng

    2015-07-14

    A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enable a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, each having full access to all system resources, enabling adaptive partitioning of the processors to functions such as compute or messaging I/O on an application-by-application basis, and preferably enabling adaptive partitioning of functions in accordance with various algorithmic phases within an application; if I/O or other processors are underutilized, they can participate in computation or communication. Nodes are interconnected by a five-dimensional torus network with DMA that optimally maximizes the throughput of packet communications between nodes and minimizes latency.

  5. Desktop supercomputers. Advance medical imaging.

    Science.gov (United States)

    Frisiello, R S

    1991-02-01

    Medical imaging tools that radiologists as well as a wide range of clinicians and healthcare professionals have come to depend upon are emerging into the next phase of functionality. The strides being made in supercomputing technologies--including reduction of size and price--are pushing medical imaging to a new level of accuracy and functionality.

  6. An integrated distributed processing interface for supercomputers and workstations

    Energy Technology Data Exchange (ETDEWEB)

    Campbell, J.; McGavran, L.

    1989-01-01

    Access to documentation, communication between multiple processes running on heterogeneous computers, and animation of simulations of engineering problems are typically weak in most supercomputer environments. This presentation will describe how we are improving this situation in the Computer Research and Applications group at Los Alamos National Laboratory. We have developed a tool using UNIX filters and a SunView interface that allows users simple access to documentation via mouse-driven menus. We have also developed a distributed application that integrates a two-point boundary value problem on one of our Cray supercomputers; it is controlled and displayed graphically by a window interface running on a workstation screen. Our motivation for this research has been to improve the usual typewriter/static interface, using language-independent controls to show the capabilities of the workstation/supercomputer combination. 8 refs.

  7. Combining Gprof and Event-Driven Monitoring for Analyzing Distributed Programs: A Rough View of NCSA Mosaic

    Institute of Scientific and Technical Information of China (English)

    彭澄廉; R.Klar

    1996-01-01

    There are several purposes for analyzing a program: functional or performance analysis, debugging, or, more recently, mapping a program to a new parallel or distributed architecture. In this paper, we introduce an effective method for deriving the Execution Graph (EG) of a program. First, the Unix profiling tool Gprof is used to obtain the Execution Model (EM) of a C program. Then the event-driven monitoring tool AICOS-SIMPLE is used to obtain the EG, which includes not only the call graph but also the execution time table of the program. This method is suitable for analyzing modern distributed programs. As an example of the analysis, the well-known HTTP protocol under NCSA Mosaic is chosen. An EG of NCSA Mosaic at the routing level is given.

  8. Seismic signal processing on heterogeneous supercomputers

    Science.gov (United States)

    Gokhberg, Alexey; Ermert, Laura; Fichtner, Andreas

    2015-04-01

    The processing of seismic signals - including the correlation of massive ambient noise data sets - represents an important part of a wide range of seismological applications. It is characterized by large data volumes as well as high computational input/output intensity. Development of efficient approaches towards seismic signal processing on emerging high performance computing systems is therefore essential. Heterogeneous supercomputing systems introduced in recent years provide numerous computing nodes interconnected via high throughput networks, every node containing a mix of processing elements of different architectures, such as several sequential processor cores and one or a few graphical processing units (GPUs) serving as accelerators. A typical representative of such computing systems is "Piz Daint", a supercomputer of the Cray XC 30 family operated by the Swiss National Supercomputing Center (CSCS), which we used in this research. Heterogeneous supercomputers provide an opportunity for manifold increases in application performance and are more energy-efficient; however, they have much higher hardware complexity and are therefore much more difficult to program. The programming effort may be substantially reduced by the introduction of modular libraries of software components that can be reused for a wide class of seismology applications. The ultimate goal of this research is the design of a prototype for such a library suitable for implementing various seismic signal processing applications on heterogeneous systems. As a representative use case we have chosen an ambient noise correlation application. Ambient noise interferometry has developed into one of the most powerful tools to image and monitor the Earth's interior. Future applications will require the extraction of increasingly small details from noise recordings. To meet this demand, more advanced correlation techniques combined with very large data volumes are needed. This poses new computational problems that

  9. An assessment of worldwide supercomputer usage

    Energy Technology Data Exchange (ETDEWEB)

    Wasserman, H.J.; Simmons, M.L.; Hayes, A.H.

    1995-01-01

    This report provides a comparative study of advanced supercomputing usage in Japan and the United States as of Spring 1994. It is based on the findings of a group of US scientists whose careers have centered on programming, evaluating, and designing high-performance supercomputers for over ten years. The report is a follow-on to an assessment of supercomputing technology in Europe and Japan that was published in 1993. Whereas the previous study focused on supercomputer manufacturing capabilities, the primary focus of the current work was to compare where and how supercomputers are used. Research for this report was conducted through both literature studies and field research in Japan.

  10. A workbench for tera-flop supercomputing

    Energy Technology Data Exchange (ETDEWEB)

    Resch, M.M.; Kuester, U.; Mueller, M.S.; Lang, U. [High Performance Computing Center Stuttgart (HLRS), Stuttgart (Germany)

    2003-07-01

    Supercomputers currently reach a peak performance in the range of TFlop/s. With but one exception - the Japanese Earth Simulator - none of these systems has so far been able to show a level of sustained performance for a variety of applications that comes close to the peak performance. Sustained TFlop/s are therefore rarely seen. The reasons are manifold and well known: bandwidth and latency, both for main memory and for the internal network, are the key technical problems. Cache hierarchies with large caches can bring relief but are no remedy. However, technical problems are not the only obstacles that prevent scientists from fully exploiting the potential of modern supercomputers; more and more organizational issues come to the forefront. This paper shows the approach of the High Performance Computing Center Stuttgart (HLRS) to delivering sustained TFlop/s performance for a wide range of applications from a large group of users spread over Germany. The core of the concept is the role of the data. Around this we design a simulation workbench that hides the complexity of interacting computers, networks and file systems from the user. (authors)

  11. NCSA: A New Protocol for Random Multiple Access Based on Physical Layer Network Coding

    CERN Document Server

    Bui, Huyen Chi; Boucheret, Marie-Laure

    2010-01-01

    This paper introduces a random multiple access method for satellite communications, named Network Coding-based Slotted Aloha (NCSA). The goal is to improve the diversity of data bursts on a slotted-ALOHA-like channel by means of error correcting codes and Physical-layer Network Coding (PNC). The scheme can be considered a generalization of Contention Resolution Diversity Slotted Aloha (CRDSA), in which the replicas sent by that system are replaced by different parts of a single codeword of an error correcting code. The performance of this scheme is first studied through a density evolution approach. Simulations then confirm the CRDSA results by showing that, for a time frame of $400$ slots, the achievable total throughput is greater than $0.7 \times C$, where $C$ is the maximal throughput achieved by a centralized scheme. This paper is a first analysis of the proposed scheme, which opens several perspectives. The most promising approach is to integrate collided bursts into the decoding process in order to im...
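
    For orientation, the sketch below gives a toy Monte-Carlo estimate of plain slotted-ALOHA throughput, the baseline that CRDSA- and NCSA-style schemes improve on. The 400-slot frame matches the figure quoted above, but the user counts are illustrative assumptions, not parameters from the paper.

    ```python
    # Baseline slotted-ALOHA throughput by Monte-Carlo: each user sends one
    # burst in a uniformly random slot; only singleton slots decode.
    import numpy as np

    rng = np.random.default_rng(0)
    slots, frames = 400, 2000

    for n_users in (100, 200, 400):
        decoded = 0
        for _ in range(frames):
            choices = rng.integers(0, slots, size=n_users)
            counts = np.bincount(choices, minlength=slots)
            decoded += np.count_nonzero(counts == 1)  # one burst -> decoded
        print(n_users, "throughput:", round(decoded / (frames * slots), 3))
    ```

    At a load of one user per slot this simulation approaches the classical 1/e limit of about 0.37, which makes the quoted 0.7 x C figure for NCSA a substantial gain.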

  12. The GF11 supercomputer

    Science.gov (United States)

    Beetem, J.; Denneau, M.; Weingarten, D.

    1987-01-01

    GF11 is a parallel computer currently under construction at the IBM Yorktown Research Center. The machine incorporates 576 floating-point processors arranged in a modified SIMD architecture. Each has space for 2 Mbytes of memory and is capable of 20 Mflops, giving the machine a total of 1.125 Gbytes of memory and a peak of 11.52 Gflops. The floating-point processors are interconnected by a dynamically reconfigurable non-blocking switching network. At each machine cycle any of 1024 pre-selected permutations of data can be realized among the processors. The main intended application of GF11 is a class of calculations arising from quantum chromodynamics.

  13. Power-constrained supercomputing

    Science.gov (United States)

    Bailey, Peter E.

    As we approach exascale systems, power is turning from an optimization goal to a critical operating constraint. With power bounds imposed by both stakeholders and the limitations of existing infrastructure, achieving practical exascale computing will therefore rely on optimizing performance subject to a power constraint. However, this requirement should not add to the burden of application developers; optimizing the runtime environment given restricted power will primarily be the job of high-performance system software. In this dissertation, we explore this area and develop new techniques that extract maximum performance subject to a particular power constraint. These techniques include a method to find theoretical optimal performance, a runtime system that shifts power in real time to improve performance, and a node-level prediction model for selecting power-efficient operating points. We use a linear programming (LP) formulation to optimize application schedules under various power constraints, where a schedule consists of a DVFS state and number of OpenMP threads for each section of computation between consecutive message passing events. We also provide a more flexible mixed integer-linear (ILP) formulation and show that the resulting schedules closely match schedules from the LP formulation. Across four applications, we use our LP-derived upper bounds to show that current approaches trail optimal, power-constrained performance by up to 41%. This demonstrates limitations of current systems, and our LP formulation provides future optimization approaches with a quantitative optimization target. We also introduce Conductor, a run-time system that intelligently distributes available power to nodes and cores to improve performance. The key techniques used are configuration space exploration and adaptive power balancing. Configuration exploration dynamically selects the optimal thread concurrency level and DVFS state subject to a hardware-enforced power bound
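
    A minimal sketch of the LP idea under stated assumptions: three phases, two configurations (slow/low-power vs fast/high-power), and an energy budget standing in for the power bound. All numbers are invented and this is not the dissertation's exact formulation; it only illustrates choosing per-phase configuration fractions to minimize total runtime under a budget.

    ```python
    # Toy LP: per phase, choose a convex combination of two configurations
    # minimizing total runtime subject to an invented energy budget.
    import numpy as np
    from scipy.optimize import linprog

    t = np.array([[1.0, 0.6],        # runtime (s) of phase i under config j
                  [2.0, 1.3],
                  [0.5, 0.35]])
    p = np.array([[80.0, 140.0]] * 3)  # power draw (W) of config j, per phase
    e = (t * p).ravel()                # energy (J) per (phase, config) choice
    budget = 300.0                     # total energy budget (J), invented

    n_phase, n_cfg = t.shape
    c = t.ravel()                                    # objective: total runtime
    A_eq = np.kron(np.eye(n_phase), np.ones(n_cfg))  # fractions sum to 1/phase
    res = linprog(c, A_ub=e[None, :], b_ub=[budget],
                  A_eq=A_eq, b_eq=np.ones(n_phase), bounds=(0, 1))
    print(round(res.fun, 3), res.x.reshape(n_phase, n_cfg).round(2))
    ```

    The LP relaxation shifts the cheapest phases (in time lost per joule saved) toward the low-power configuration first; an ILP variant would force each phase to pick exactly one configuration.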

  14. High Performance Networks From Supercomputing to Cloud Computing

    CERN Document Server

    Abts, Dennis

    2011-01-01

    Datacenter networks provide the communication substrate for large parallel computer systems that form the ecosystem for high performance computing (HPC) systems and modern Internet applications. The design of new datacenter networks is motivated by an array of applications ranging from communication intensive climatology, complex material simulations and molecular dynamics to such Internet applications as Web search, language translation, collaborative Internet applications, streaming video and voice-over-IP. For both Supercomputing and Cloud Computing the network enables distributed applicati

  15. Supercomputer debugging workshop '92

    Energy Technology Data Exchange (ETDEWEB)

    Brown, J.S.

    1993-02-01

    This report contains papers or viewgraphs on the following topics: The ABCs of Debugging in the 1990s; Cray Computer Corporation; Thinking Machines Corporation; Cray Research, Incorporated; Sun Microsystems, Inc; Kendall Square Research; The Effects of Register Allocation and Instruction Scheduling on Symbolic Debugging; Debugging Optimized Code: Currency Determination with Data Flow; A Debugging Tool for Parallel and Distributed Programs; Analyzing Traces of Parallel Programs Containing Semaphore Synchronization; Compile-time Support for Efficient Data Race Detection in Shared-Memory Parallel Programs; Direct Manipulation Techniques for Parallel Debuggers; Transparent Observation of XENOOPS Objects; A Parallel Software Monitor for Debugging and Performance Tools on Distributed Memory Multicomputers; Profiling Performance of Inter-Processor Communications in an iWarp Torus; The Application of Code Instrumentation Technology in the Los Alamos Debugger; and CXdb: The Road to Remote Debugging.

  16. A Reliability Calculation Method for Web Service Composition Using Fuzzy Reasoning Colored Petri Nets and Its Application on Supercomputing Cloud Platform

    Directory of Open Access Journals (Sweden)

    Ziyun Deng

    2016-09-01

    In order to develop a Supercomputing Cloud Platform (SCP) prototype system using Service-Oriented Architecture (SOA) and Petri nets, we researched several technologies for Web service composition. Specifically, in this paper we propose a reliability calculation method for Web service compositions that uses Fuzzy Reasoning Colored Petri Nets (FRCPN) to verify the compositions. We put forward a definition of semantic threshold similarity for Web services and a formal definition of FRCPN. We analyzed five kinds of production rules in FRCPN and applied our method to the SCP prototype, obtaining the reliability value of the end Web service as an indicator of the overall reliability of the FRCPN. The method can also test the activity of the FRCPN. Experimental results show that the reliability of a Web service composition is correlated with the number of Web services and the range of reliability transition values.

  17. Automatic discovery of the communication network topology for building a supercomputer model

    Science.gov (United States)

    Sobolev, Sergey; Stefanov, Konstantin; Voevodin, Vadim

    2016-10-01

    The Research Computing Center of Lomonosov Moscow State University is developing the Octotron software suite for automatic monitoring and mitigation of emergency situations in supercomputers so as to maximize hardware reliability. The suite is based on a software model of the supercomputer. The model uses a graph to describe the computing system components and their interconnections. One of the most complex components of a supercomputer that needs to be included in the model is its communication network. This work describes the proposed approach for automatically discovering the Ethernet communication network topology in a supercomputer and its description in terms of the Octotron model. This suite automatically detects computing nodes and switches, collects information about them and identifies their interconnections. The application of this approach is demonstrated on the "Lomonosov" and "Lomonosov-2" supercomputers.
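
    A minimal sketch of the graph-building step, under a large assumption: that neighbor information (for example, from LLDP or SNMP queries of the switches) has already been collected into a list of links. The device names and the switch/node naming convention are invented, and Octotron's actual model format is not reproduced here.

    ```python
    # Build a topology graph from hypothetical neighbor reports and classify
    # the devices; networkx stands in for the suite's internal graph model.
    import networkx as nx

    links = [
        ("switch-1", "switch-2"),
        ("switch-1", "node-001"), ("switch-1", "node-002"),
        ("switch-2", "node-003"), ("switch-2", "node-004"),
    ]

    g = nx.Graph()
    g.add_edges_from(links)
    for dev in g.nodes:
        # Classify by the naming convention of the fake data above.
        g.nodes[dev]["kind"] = "switch" if dev.startswith("switch") else "node"

    print(nx.is_connected(g))
    print(sorted(d for d in g if g.nodes[d]["kind"] == "switch"))
    ```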

  18. TOP500 Supercomputers for June 2004

    Energy Technology Data Exchange (ETDEWEB)

    Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack; Simon, Horst D.

    2004-06-23

    23rd Edition of TOP500 List of World's Fastest Supercomputers Released: Japan's Earth Simulator Enters Third Year in Top Position MANNHEIM, Germany; KNOXVILLE, Tenn.; BERKELEY, Calif. In what has become a closely watched event in the world of high-performance computing, the 23rd edition of the TOP500 list of the world's fastest supercomputers was released today (June 23, 2004) at the International Supercomputer Conference in Heidelberg, Germany.

  19. Programming Environment for a High-Performance Parallel Supercomputer with Intelligent Communication

    Directory of Open Access Journals (Sweden)

    A. Gunzinger

    1996-01-01

    At the Electronics Laboratory of the Swiss Federal Institute of Technology (ETH) in Zürich, the high-performance parallel supercomputer MUSIC (MUlti processor System with Intelligent Communication) has been developed. As applications like neural network simulation and molecular dynamics show, the MUSIC supercomputer is on par with conventional supercomputers in performance, while electric power requirements are reduced by a factor of 1,000, weight by a factor of 400, and price by a factor of 100. Software development is a key issue for such parallel systems. This article focuses on the programming environment of the MUSIC system and on its applications.

  20. Will Your Next Supercomputer Come from Costco?

    Energy Technology Data Exchange (ETDEWEB)

    Farber, Rob

    2007-04-15

    A fun topic for April, one that is not an April fool’s joke, is that you can purchase a commodity 200+ Gflop (single-precision) Linux supercomputer for around $600 from your favorite electronic vendor. Yes, it’s true. Just walk in and ask for a Sony Playstation 3 (PS3), take it home and install Linux on it. IBM has provided an excellent tutorial for installing Linux and building applications at http://www-128.ibm.com/developerworks/power/library/pa-linuxps3-1. If you want to raise some eyebrows at work, then submit a purchase request for a Sony PS3 game console and watch the reactions as your paperwork wends its way through the procurement process.

  1. INTEL: Intel based systems move up in supercomputing ranks

    CERN Multimedia

    2002-01-01

    "The TOP500 supercomputer rankings released today at the Supercomputing 2002 conference show a dramatic increase in the number of Intel-based systems being deployed in high-performance computing (HPC) or supercomputing areas" (1/2 page).

  2. Comparing Clusters and Supercomputers for Lattice QCD

    CERN Document Server

    Gottlieb, S

    2001-01-01

    Since the development of the Beowulf project to build a parallel computer from commodity PC components, many such clusters have been built. The MILC QCD code has been run on a variety of clusters and supercomputers. Key design features are identified, and the cost effectiveness of clusters and supercomputers is compared.

  3. INTEGRATION OF PANDA WORKLOAD MANAGEMENT SYSTEM WITH SUPERCOMPUTERS

    Energy Technology Data Exchange (ETDEWEB)

    De, K. [University of Texas at Arlington]; Jha, S. [Rutgers University]; Maeno, T. [Brookhaven National Laboratory (BNL)]; Mashinistov, R. [Russian Research Center, Kurchatov Institute, Moscow, Russia]; Nilsson, P. [Brookhaven National Laboratory (BNL)]; Novikov, A. [Russian Research Center, Kurchatov Institute, Moscow, Russia]; Oleynik, D. [University of Texas at Arlington]; Panitkin, S. [Brookhaven National Laboratory (BNL)]; Poyda, A. [Russian Research Center, Kurchatov Institute, Moscow, Russia]; Ryabinkin, E. [Russian Research Center, Kurchatov Institute, Moscow, Russia]; Teslyuk, A. [Russian Research Center, Kurchatov Institute, Moscow, Russia]; Tsulaia, V. [Lawrence Berkeley National Laboratory (LBNL)]; Velikhov, V. [Russian Research Center, Kurchatov Institute, Moscow, Russia]; Wen, G. [University of Wisconsin, Madison]; Wells, Jack C. [ORNL]; Wenaus, T. [Brookhaven National Laboratory (BNL)]

    2016-01-01

    Monte-Carlo workloads on several supercomputing platforms. We will present our current accomplishments in running PanDA WMS at supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facility's infrastructure for High Energy and Nuclear Physics, as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.

  4. Integration of Panda Workload Management System with supercomputers

    Science.gov (United States)

    De, K.; Jha, S.; Klimentov, A.; Maeno, T.; Mashinistov, R.; Nilsson, P.; Novikov, A.; Oleynik, D.; Panitkin, S.; Poyda, A.; Read, K. F.; Ryabinkin, E.; Teslyuk, A.; Velikhov, V.; Wells, J. C.; Wenaus, T.

    2016-09-01

    on several supercomputing platforms. We will present our current accomplishments in running PanDA WMS at supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facility's infrastructure for High Energy and Nuclear Physics, as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.

  5. Supercomputing '91; Proceedings of the 4th Annual Conference on High Performance Computing, Albuquerque, NM, Nov. 18-22, 1991

    Science.gov (United States)

    1991-01-01

    Various papers on supercomputing are presented. The general topics addressed include: program analysis/data dependence, memory access, distributed memory code generation, numerical algorithms, supercomputer benchmarks, latency tolerance, parallel programming, applications, processor design, networks, performance tools, mapping and scheduling, characterization affecting performance, parallelism packaging, computing climate change, combinatorial algorithms, hardware and software performance issues, system issues. (No individual items are abstracted in this volume)

  6. TOP500 Supercomputers for June 2005

    Energy Technology Data Exchange (ETDEWEB)

    Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack; Simon, Horst D.

    2005-06-22

    25th Edition of TOP500 List of World's Fastest Supercomputers Released: DOE/LLNL BlueGene/L and IBM gain Top Positions MANNHEIM, Germany; KNOXVILLE, Tenn.; BERKELEY, Calif. In what has become a closely watched event in the world of high-performance computing, the 25th edition of the TOP500 list of the world's fastest supercomputers was released today (June 22, 2005) at the 20th International Supercomputing Conference (ISC2005) in Heidelberg, Germany.

  7. TOP500 Supercomputers for November 2003

    Energy Technology Data Exchange (ETDEWEB)

    Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack; Simon, Horst D.

    2003-11-16

    22nd Edition of TOP500 List of World's Fastest Supercomputers Released MANNHEIM, Germany; KNOXVILLE, Tenn.; BERKELEY, Calif. In what has become a much-anticipated event in the world of high-performance computing, the 22nd edition of the TOP500 list of the world's fastest supercomputers was released today (November 16, 2003). The Earth Simulator supercomputer retains the number one position with its Linpack benchmark performance of 35.86 Tflop/s ('teraflops', or trillions of calculations per second). It was built by NEC and installed last year at the Earth Simulator Center in Yokohama, Japan.

  8. Lifetime Prevalence of Mental Disorders in U.S. Adolescents: Results from the National Comorbidity Survey Replication-Adolescent Supplement (NCS-A)

    Science.gov (United States)

    Merikangas, Kathleen Ries; He, Jian-ping; Burstein, Marcy; Swanson, Sonja A.; Avenevoli, Shelli; Cui, Lihong; Benjet, Corina; Georgiades, Katholiki; Swendsen, Joel

    2010-01-01

    Objective: To present estimates of the lifetime prevalence of "DSM-IV" mental disorders with and without severe impairment, their comorbidity across broad classes of disorder, and their sociodemographic correlates. Method: The National Comorbidity Survey-Adolescent Supplement (NCS-A) is a nationally representative face-to-face survey of…

  9. £16 million investment for 'virtual supercomputer'

    CERN Multimedia

    Holland, C

    2003-01-01

    "The Particle Physics and Astronomy Research Council is to spend 16million [pounds] to create a massive computing Grid, equivalent to the world's second largest supercomputer after Japan's Earth Simulator computer" (1/2 page)

  10. Supercomputers open window of opportunity for nursing.

    Science.gov (United States)

    Meintz, S L

    1993-01-01

    A window of opportunity was opened for nurse researchers with the High Performance Computing and Communications (HPCC) initiative in President Bush's 1992 fiscal-year budget. Nursing research moved into the high-performance computing environment through the University of Nevada Las Vegas/Cray Project for Nursing and Health Data Research (PNHDR). Using the Cray Y-MP 2/216 supercomputer, the PNHDR established the validity of a supercomputer platform for nursing research. In addition, the research has identified a paradigm shift in statistical analysis, delineated actual and potential barriers to nursing research in a supercomputing environment, conceptualized a new branch of nursing science called Nurmetrics, and discovered a new avenue for nursing research utilizing supercomputing tools.

  11. TOP500 Supercomputers for November 2004

    Energy Technology Data Exchange (ETDEWEB)

    Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack; Simon, Horst D.

    2004-11-08

    24th Edition of TOP500 List of World's Fastest Supercomputers Released: DOE/IBM BlueGene/L and NASA/SGI's Columbia gain Top Positions MANNHEIM, Germany; KNOXVILLE, Tenn.; BERKELEY, Calif. In what has become a closely watched event in the world of high-performance computing, the 24th edition of the TOP500 list of the world's fastest supercomputers was released today (November 8, 2004) at the SC2004 Conference in Pittsburgh, Pa.

  12. Misleading Performance Reporting in the Supercomputing Field

    Directory of Open Access Journals (Sweden)

    David H. Bailey

    1992-01-01

    In a previous humorous note, I outlined 12 ways in which performance figures for scientific supercomputers can be distorted. In this paper, the problem of potentially misleading performance reporting is discussed in detail. Included are some examples that have appeared in recently published scientific papers. This paper also includes some proposed guidelines for reporting performance, the adoption of which would raise the level of professionalism and reduce the level of confusion in the field of supercomputing.

  13. Simulating Galactic Winds on Supercomputers

    Science.gov (United States)

    Schneider, Evan

    2017-01-01

    Galactic winds are a ubiquitous feature of rapidly star-forming galaxies. Observations of nearby galaxies have shown that winds are complex, multiphase phenomena, comprised of outflowing gas at a large range of densities, temperatures, and velocities. Describing how starburst-driven outflows originate, evolve, and affect the circumgalactic medium and gas supply of galaxies is an important challenge for theories of galaxy evolution. In this talk, I will discuss how we are using a new hydrodynamics code, Cholla, to improve our understanding of galactic winds. Cholla is a massively parallel, GPU-based code that takes advantage of specialized hardware on the newest generation of supercomputers. With Cholla, we can perform large, three-dimensional simulations of multiphase outflows, allowing us to track the coupling of mass and momentum between gas phases across hundreds of parsecs at sub-parsec resolution. The results of our recent simulations demonstrate that the evolution of cool gas in galactic winds is highly dependent on the initial structure of embedded clouds. In particular, we find that turbulent density structures lead to more efficient mass transfer from cool to hot phases of the wind. I will discuss the implications of our results both for the incorporation of winds into cosmological simulations, and for interpretations of observed multiphase winds and the circumgalactic medium of nearby galaxies.

  14. GREEN SUPERCOMPUTING IN A DESKTOP BOX

    Energy Technology Data Exchange (ETDEWEB)

    HSU, CHUNG-HSING [Los Alamos National Laboratory]; FENG, WU-CHUN [non-LANL]; CHING, AVERY [non-LANL]

    2007-01-17

    The computer workstation, introduced by Sun Microsystems in 1982, was the tool of choice for scientists and engineers as an interactive computing environment for the development of scientific codes. However, by the mid-1990s, the performance of workstations began to lag behind high-end commodity PCs. This, coupled with the disappearance of BSD-based operating systems in workstations and the emergence of Linux as an open-source operating system for PCs, arguably led to the demise of the workstation as we knew it. Around the same time, computational scientists started to leverage PCs running Linux to create a commodity-based (Beowulf) cluster that provided dedicated computer cycles, i.e., supercomputing for the rest of us, as a cost-effective alternative to large supercomputers, i.e., supercomputing for the few. However, as the cluster movement has matured, with respect to cluster hardware and open-source software, these clusters have become much more like their large-scale supercomputing brethren - a shared (and power-hungry) datacenter resource that must reside in a machine-cooled room in order to operate properly. Consequently, the above observations, when coupled with the ever-increasing performance gap between the PC and cluster supercomputer, provide the motivation for a 'green' desktop supercomputer - a turnkey solution that provides an interactive and parallel computing environment with the approximate form factor of a Sun SPARCstation 1 'pizza box' workstation. In this paper, we present the hardware and software architecture of such a solution as well as its prowess as a developmental platform for parallel codes. In short, imagine a 12-node personal desktop supercomputer that achieves 14 Gflops on Linpack but sips only 185 watts of power at load, resulting in a performance-power ratio that is over 300% better than our reference SMP platform.

  15. Data mining method for anomaly detection in the supercomputer task flow

    Science.gov (United States)

    Voevodin, Vadim; Voevodin, Vladimir; Shaikhislamov, Denis; Nikitenko, Dmitry

    2016-10-01

    The efficiency of most supercomputer applications is extremely low, yet users rarely even suspect that their applications may be wasting computing resources. Software tools need to be developed to help detect inefficient applications and report them to the users. We suggest an algorithm for detecting anomalies in a supercomputer's task flow, based on data mining methods. System monitoring is used to calculate integral characteristics for every job executed, and this data is used as input for our classification method based on the Random Forest algorithm. The proposed approach can currently classify an application as one of three classes - normal, suspicious and definitely anomalous. The approach has been demonstrated on actual applications running on the "Lomonosov" supercomputer.
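
    A minimal sketch of the classification step with synthetic stand-in data: the feature set, thresholds, and rule-generated labels below are invented for illustration, while the real inputs would be the monitoring-derived integral job characteristics described above.

    ```python
    # Train a Random Forest to map per-job characteristics to one of three
    # classes; data and labeling rules are synthetic, not Lomonosov data.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(1)
    X = rng.random((300, 3))                     # [cpu_load, cache_miss, net_rate]
    y = np.where(X[:, 0] < 0.1, 2,               # near-idle CPU -> anomalous
                 np.where(X[:, 1] > 0.8, 1, 0))  # cache-thrashing -> suspicious

    clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

    labels = np.array(["normal", "suspicious", "anomalous"])
    jobs = [[0.05, 0.20, 0.9], [0.70, 0.90, 0.5], [0.60, 0.30, 0.2]]
    print(labels[clf.predict(jobs)])
    ```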

  16. Storage-Intensive Supercomputing Benchmark Study

    Energy Technology Data Exchange (ETDEWEB)

    Cohen, J; Dossa, D; Gokhale, M; Hysom, D; May, J; Pearce, R; Yoo, A

    2007-10-30

    Critical data science applications requiring frequent access to storage perform poorly on today's computing architectures. This project addresses efficient computation of data-intensive problems in national security and basic science by exploring, advancing, and applying a new form of computing called storage-intensive supercomputing (SISC). Our goal is to enable applications that simply cannot run on current systems, and, for a broad range of data-intensive problems, to deliver an order of magnitude improvement in price/performance over today's data-intensive architectures. This technical report documents much of the work done under LDRD 07-ERD-063 Storage Intensive Supercomputing during the period 05/07-09/07. The following chapters describe: (1) a new file I/O monitoring tool iotrace developed to capture the dynamic I/O profiles of Linux processes; (2) an out-of-core graph benchmark for level-set expansion of scale-free graphs; (3) an entity extraction benchmark consisting of a pipeline of eight components; and (4) an image resampling benchmark drawn from the SWarp program in the LSST data processing pipeline. The performance of the graph and entity extraction benchmarks was measured in three different scenarios: data sets residing on the NFS file server and accessed over the network; data sets stored on local disk; and data sets stored on the Fusion I/O parallel NAND Flash array. The image resampling benchmark compared the performance of software-only and GPU-accelerated implementations. In addition to the work reported here, an additional text processing application was developed that used an FPGA to accelerate n-gram profiling for language classification. The n-gram application will be presented at SC07 at the High Performance Reconfigurable Computing Technologies and Applications Workshop. The graph and entity extraction benchmarks were run on a Supermicro server housing the NAND Flash 40GB parallel disk array, the Fusion-io. The Fusion system specs are as follows

  17. TOP500 Supercomputers for June 2003

    Energy Technology Data Exchange (ETDEWEB)

    Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack; Simon, Horst D.

    2003-06-23

    21st Edition of TOP500 List of World's Fastest Supercomputers Released MANNHEIM, Germany; KNOXVILLE, Tenn.; BERKELEY, Calif. In what has become a much-anticipated event in the world of high-performance computing, the 21st edition of the TOP500 list of the world's fastest supercomputers was released today (June 23, 2003). The Earth Simulator supercomputer built by NEC and installed last year at the Earth Simulator Center in Yokohama, Japan, with its Linpack benchmark performance of 35.86 Tflop/s (teraflops, or trillions of calculations per second), retains the number one position. The number 2 position is held by the re-measured ASCI Q system at Los Alamos National Laboratory. With 13.88 Tflop/s, it is the second system ever to exceed the 10 Tflop/s mark. ASCI Q was built by Hewlett-Packard and is based on the AlphaServer SC computer system.

  18. TOP500 Supercomputers for June 2002

    Energy Technology Data Exchange (ETDEWEB)

    Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack; Simon, Horst D.

    2002-06-20

    19th Edition of TOP500 List of World's Fastest Supercomputers Released MANNHEIM, Germany; KNOXVILLE, Tenn.; BERKELEY, Calif. In what has become a much-anticipated event in the world of high-performance computing, the 19th edition of the TOP500 list of the world's fastest supercomputers was released today (June 20, 2002). The recently installed Earth Simulator supercomputer at the Earth Simulator Center in Yokohama, Japan, is, as expected, the clear new number 1. Its performance of 35.86 Tflop/s (trillions of calculations per second) running the Linpack benchmark is almost five times higher than the performance of the now No. 2 IBM ASCI White system at Lawrence Livermore National Laboratory (7.2 Tflop/s). Such a powerful leap to the top by a system so much faster than the previous leader is unparalleled in the history of the TOP500.

  19. TOP500 Supercomputers for November 2002

    Energy Technology Data Exchange (ETDEWEB)

    Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack; Simon, Horst D.

    2002-11-15

    20th Edition of TOP500 List of World's Fastest Supercomputers Released MANNHEIM, Germany; KNOXVILLE, Tenn.; BERKELEY, Calif. In what has become a much-anticipated event in the world of high-performance computing, the 20th edition of the TOP500 list of the world's fastest supercomputers was released today (November 15, 2002). The Earth Simulator supercomputer installed earlier this year at the Earth Simulator Center in Yokohama, Japan, retains the number one position with its Linpack benchmark performance of 35.86 Tflop/s (trillions of calculations per second). The No. 2 and No. 3 positions are held by two new, identical ASCI Q systems at Los Alamos National Laboratory (7.73 Tflop/s each). These systems are built by Hewlett-Packard and based on the AlphaServer SC computer system.

  20. GPUs: An Oasis in the Supercomputing Desert

    CERN Document Server

    Kamleh, Waseem

    2012-01-01

    A novel metric is introduced to compare the supercomputing resources available to academic researchers on a national basis. Data from the supercomputing Top 500 and the top 500 universities in the Academic Ranking of World Universities (ARWU) are combined to form the proposed "500/500" score for a given country. Australia scores poorly in the 500/500 metric when compared with other countries with a similar ARWU ranking, an indication that HPC-based researchers in Australia are at a relative disadvantage with respect to their overseas competitors. For HPC problems where single precision is sufficient, commodity GPUs provide a cost-effective means of quenching the computational thirst of otherwise parched Lattice practitioners traversing the Australian supercomputing desert. We explore some of the more difficult terrain in single precision territory, finding that BiCGStab is unreliable in single precision at large lattice sizes. We test the CGNE and CGNR forms of the conjugate gradient method on the normal equa...
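
    The single-precision failures reported here come down to accumulated rounding. A small, self-contained demonstration of the effect follows; the array sizes are illustrative, not the paper's lattice volumes, and a long sum stands in for the inner products of a CG iteration.

    ```python
    # Sequential float32 accumulation stalls once the running sum grows large
    # (past ~2^24 the increments fall below one ulp), while float64 keeps them.
    import numpy as np

    rng = np.random.default_rng(2)
    for n in (10**4, 10**6, 10**8):
        x = rng.random(n, dtype=np.float32)
        s32 = float(np.cumsum(x, dtype=np.float32)[-1])  # strictly sequential
        s64 = float(x.sum(dtype=np.float64))             # double-precision ref
        print(f"n={n:>9}  relative error = {abs(s32 - s64) / s64:.2e}")
    ```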

  1. Simulating functional magnetic materials on supercomputers.

    Science.gov (United States)

    Gruner, Markus Ernst; Entel, Peter

    2009-07-22

    The recent passing of the petaflop per second landmark by the Roadrunner project at the Los Alamos National Laboratory marks a preliminary peak of an impressive world-wide development in the high-performance scientific computing sector. Also, purely academic state-of-the-art supercomputers such as the IBM Blue Gene/P at Forschungszentrum Jülich allow us nowadays to investigate large systems of the order of 10(3) spin polarized transition metal atoms by means of density functional theory. Three applications will be presented where large-scale ab initio calculations contribute to the understanding of key properties emerging from a close interrelation between structure and magnetism. The first two examples discuss the size dependent evolution of equilibrium structural motifs in elementary iron and binary Fe-Pt and Co-Pt transition metal nanoparticles, which are currently discussed as promising candidates for ultra-high-density magnetic data storage media. However, the preference for multiply twinned morphologies at smaller cluster sizes counteracts the formation of a single-crystalline L1(0) phase, which alone provides the required hard magnetic properties. The third application is concerned with the magnetic shape memory effect in the Ni-Mn-Ga Heusler alloy, which is a technologically relevant candidate for magnetomechanical actuators and sensors. In this material strains of up to 10% can be induced by external magnetic fields due to the field induced shifting of martensitic twin boundaries, requiring an extremely high mobility of the martensitic twin boundaries, but also the selection of the appropriate martensitic structure from the rich phase diagram.

  2. Floating point arithmetic in future supercomputers

    Science.gov (United States)

    Bailey, David H.; Barton, John T.; Simon, Horst D.; Fouts, Martin J.

    1989-01-01

    Considerations in the floating-point design of a supercomputer are discussed. Particular attention is given to word size, hardware support for extended precision, format, and accuracy characteristics. These issues are discussed from the perspective of the Numerical Aerodynamic Simulation Systems Division at NASA Ames. The features believed to be most important for a future supercomputer floating-point design include: (1) a 64-bit IEEE floating-point format with 11 exponent bits, 52 mantissa bits, and one sign bit and (2) hardware support for reasonably fast double-precision arithmetic.
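
    The recommended format is exactly the IEEE 754 binary64 layout. A small check that decomposes a Python float into the three fields named above:

    ```python
    # Split an IEEE 754 binary64 value into 1 sign bit, 11 exponent bits
    # (bias 1023), and 52 mantissa (fraction) bits.
    import struct

    def fields(x: float):
        bits = struct.unpack(">Q", struct.pack(">d", x))[0]  # raw 64-bit word
        sign = bits >> 63
        exponent = (bits >> 52) & 0x7FF       # 11-bit biased exponent
        mantissa = bits & ((1 << 52) - 1)     # 52-bit fraction
        return sign, exponent, mantissa

    print(fields(1.0))    # (0, 1023, 0)
    print(fields(-2.5))   # (1, 1024, 1125899906842624), i.e. mantissa 2**50
    ```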

  3. Guide to dataflow supercomputing basic concepts, case studies, and a detailed example

    CERN Document Server

    Milutinovic, Veljko; Trifunovic, Nemanja; Giorgi, Roberto

    2015-01-01

    This unique text/reference describes an exciting and novel approach to supercomputing in the DataFlow paradigm. The major advantages and applications of this approach are clearly described, and a detailed explanation of the programming model is provided using simple yet effective examples. The work is developed from a series of lecture courses taught by the authors in more than 40 universities across more than 20 countries, and from research carried out by Maxeler Technologies, Inc. Topics and features: presents a thorough introduction to DataFlow supercomputing for big data problems; revie

  4. Adventures in Supercomputing: An innovative program

    Energy Technology Data Exchange (ETDEWEB)

    Summers, B.G.; Hicks, H.R.; Oliver, C.E.

    1995-06-01

    Within the realm of education, seldom does an innovative program become available with the potential to change an educator's teaching methodology and serve as a spur to systemic reform. The Adventures in Supercomputing (AiS) program, sponsored by the Department of Energy, is such a program. Adventures in Supercomputing is a program for high school and middle school teachers. It has helped to change the teaching paradigm of many of the teachers involved in the program from a teacher-centered classroom to a student-centered classroom. "A student-centered classroom offers better opportunities for development of internal motivation, planning skills, goal setting and perseverance than does the traditional teacher-directed mode." Not only is the process of teaching changed, but evidence of systemic reform is beginning to surface. After describing the program, the authors discuss the teaching strategies being used and the evidence of systemic change in many of the AiS schools in Tennessee.

  5. Solidification in a Supercomputer: From Crystal Nuclei to Dendrite Assemblages

    Science.gov (United States)

    Shibuta, Yasushi; Ohno, Munekazu; Takaki, Tomohiro

    2015-08-01

    Thanks to the recent progress in high-performance computational environments, the range of applications of computational metallurgy is expanding rapidly. In this paper, cutting-edge simulations of solidification from atomic to microstructural levels performed on a graphics processing unit (GPU) architecture are presented, together with a brief introduction to advances in computational studies of solidification. In particular, million-atom molecular dynamics simulations captured the spontaneous evolution of anisotropy in a solid nucleus in an undercooled melt and homogeneous nucleation without any inducing factor, which is followed by grain growth. At the microstructural level, the quantitative phase-field model has been gaining importance as a powerful tool for predicting solidification microstructures. In this paper, the convergence behavior of simulation results obtained with this model is discussed in detail. Such convergence ensures the reliability of results of phase-field simulations. Using the quantitative phase-field model, the competitive growth of dendrite assemblages during the directional solidification of a binary alloy bicrystal at the millimeter scale is examined by performing two- and three-dimensional large-scale simulations by multi-GPU computation on the supercomputer, TSUBAME2.5. This cutting-edge approach using a GPU supercomputer is opening a new phase in computational metallurgy.
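
    For readers unfamiliar with the method, here is a deliberately tiny one-dimensional sketch of an explicit phase-field (Allen-Cahn) update. The double-well potential and all parameters are generic illustrations, not the quantitative model or multi-GPU implementation used in the paper.

    ```python
    # Explicit 1-D Allen-Cahn step with a generic double-well potential
    # W = (1 - phi^2)^2 / 4, so W'(phi) = phi^3 - phi; periodic boundaries.
    import numpy as np

    nx, dx, dt, eps, steps = 200, 0.1, 0.001, 0.5, 2000
    phi = np.tanh(np.linspace(-5.0, 5.0, nx))    # initial diffuse interface

    for _ in range(steps):
        lap = (np.roll(phi, 1) + np.roll(phi, -1) - 2.0 * phi) / dx**2
        phi += dt * (eps**2 * lap - (phi**3 - phi))  # dphi/dt = eps^2 lap - W'

    print(phi.min(), phi.max())   # the two phases relax toward -1 and +1
    ```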

  6. Supercomputer debugging workshop '92

    Energy Technology Data Exchange (ETDEWEB)

    Brown, J.S.

    1993-01-01

    This report contains papers or viewgraphs on the following topics: The ABCs of Debugging in the 1990s; Cray Computer Corporation; Thinking Machines Corporation; Cray Research, Incorporated; Sun Microsystems, Inc; Kendall Square Research; The Effects of Register Allocation and Instruction Scheduling on Symbolic Debugging; Debugging Optimized Code: Currency Determination with Data Flow; A Debugging Tool for Parallel and Distributed Programs; Analyzing Traces of Parallel Programs Containing Semaphore Synchronization; Compile-time Support for Efficient Data Race Detection in Shared-Memory Parallel Programs; Direct Manipulation Techniques for Parallel Debuggers; Transparent Observation of XENOOPS Objects; A Parallel Software Monitor for Debugging and Performance Tools on Distributed Memory Multicomputers; Profiling Performance of Inter-Processor Communications in an iWarp Torus; The Application of Code Instrumentation Technology in the Los Alamos Debugger; and CXdb: The Road to Remote Debugging.

  7. [Teacher enhancement at Supercomputing '96]

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1998-02-13

    The SC'96 Education Program provided a three-day professional development experience for middle and high school science, mathematics, and computer technology teachers. The program theme was Computers at Work in the Classroom, and a majority of the sessions were presented by classroom teachers who have had several years of experience in using these technologies with their students. The teachers who attended the program were introduced to classroom applications of computing and networking technologies and were provided, to the greatest extent possible, with lesson plans, sample problems, and other resources that could immediately be used in their own classrooms. The attached At-a-Glance Schedule and Session Abstracts describe in detail the three-day SC'96 Education Program. Also included are the SC'96 Education Program evaluation report and the financial report.

  8. Integration Of PanDA Workload Management System With Supercomputers for ATLAS and Data Intensive Science

    Energy Technology Data Exchange (ETDEWEB)

    De, K. [University of Texas at Arlington]; Jha, S. [Rutgers University]; Klimentov, A. [Brookhaven National Laboratory (BNL)]; Maeno, T. [Brookhaven National Laboratory (BNL)]; Nilsson, P. [Brookhaven National Laboratory (BNL)]; Oleynik, D. [University of Texas at Arlington]; Panitkin, S. [Brookhaven National Laboratory (BNL)]; Wells, Jack C. [ORNL]; Wenaus, T. [Brookhaven National Laboratory (BNL)]

    2016-01-01

    was tested with a variety of Monte-Carlo workloads on several supercomputing platforms for the ALICE and ATLAS experiments, and has been in full production for the ATLAS experiment since September 2015. We will present our current accomplishments with running PanDA WMS at supercomputers and demonstrate our ability to use PanDA as a portal independent of the computing facility's infrastructure for High Energy and Nuclear Physics as well as other data-intensive science applications, such as bioinformatics and astro-particle physics.

  9. Data-intensive computing on numerically-insensitive supercomputers

    Energy Technology Data Exchange (ETDEWEB)

    Ahrens, James P. [Los Alamos National Laboratory]; Fasel, Patricia K. [Los Alamos National Laboratory]; Habib, Salman [Los Alamos National Laboratory]; Heitmann, Katrin [Los Alamos National Laboratory]; Lo, Li-Ta [Los Alamos National Laboratory]; Patchett, John M. [Los Alamos National Laboratory]; Williams, Sean J. [Los Alamos National Laboratory]; Woodring, Jonathan L. [Los Alamos National Laboratory]; Wu, Joshua [Los Alamos National Laboratory]; Hsu, Chung-Hsing [ORNL]

    2010-12-03

    With the advent of the era of petascale supercomputing, via the delivery of the Roadrunner supercomputing platform at Los Alamos National Laboratory, there is a pressing need to address the problem of visualizing massive petascale-sized results. In this presentation, I discuss progress on a number of approaches including in-situ analysis, multi-resolution out-of-core streaming and interactive rendering on the supercomputing platform. These approaches are placed in context by the emerging area of data-intensive supercomputing.

  10. Parallel supercomputers for lattice gauge theory.

    Science.gov (United States)

    Brown, F R; Christ, N H

    1988-03-18

    During the past 10 years, particle physicists have increasingly employed numerical simulation to answer fundamental theoretical questions about the properties of quarks and gluons. The enormous computer resources required by quantum chromodynamic calculations have inspired the design and construction of very powerful, highly parallel, dedicated computers optimized for this work. This article gives a brief description of the numerical structure and current status of these large-scale lattice gauge theory calculations, with emphasis on the computational demands they make. The architecture, present state, and potential of these special-purpose supercomputers is described. It is argued that a numerical solution of low energy quantum chromodynamics may well be achieved by these machines.

  11. Lectures in Supercomputational Neurosciences Dynamics in Complex Brain Networks

    CERN Document Server

    Graben, Peter beim; Thiel, Marco; Kurths, Jürgen

    2008-01-01

    Computational Neuroscience is a burgeoning field of research where only the combined effort of neuroscientists, biologists, psychologists, physicists, mathematicians, computer scientists, engineers and other specialists, e.g. from linguistics and medicine, seems able to expand the limits of our knowledge. The present volume is an introduction, largely from the physicists' perspective, to the subject matter, with in-depth contributions by system neuroscientists. A conceptual model for complex networks of neurons is introduced that incorporates many important features of the real brain, such as various types of neurons, various brain areas, inhibitory and excitatory coupling and the plasticity of the network. The computational implementation on supercomputers, which is introduced and discussed in detail in this book, will enable the readers to modify and adapt the algorithm for their own research. Worked-out examples of applications are presented for networks of Morris-Lecar neurons to model the cortical co...

  12. Toward the Graphics Turing Scale on a Blue Gene Supercomputer

    CERN Document Server

    McGuigan, Michael

    2008-01-01

    We investigate the raytracing performance that can be achieved on a class of Blue Gene supercomputers. We measure an 822-fold speedup over a Pentium IV on a 6144-processor Blue Gene/L. We measure the computational performance as a function of the number of processors and problem size to determine the scaling performance of the raytracing calculation on the Blue Gene. We find nontrivial scaling behavior at large numbers of processors. We discuss applications of this technology to scientific visualization with advanced lighting and high resolution. We utilize three racks of a Blue Gene/L in our calculations, which is less than three percent of the capacity of the world's largest Blue Gene computer.
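
    The scaling quantities behind such studies reduce to simple ratios of wall-clock times. A small helper with invented timings (not the paper's Blue Gene/L measurements) shows how speedup and relative parallel efficiency are computed:

    ```python
    # Speedup vs a serial reference, and relative efficiency between two
    # processor counts; all timings below are hypothetical illustrations.
    def speedup(t_reference: float, t_parallel: float) -> float:
        return t_reference / t_parallel

    def relative_efficiency(t_small: float, n_small: int,
                            t_large: float, n_large: int) -> float:
        # Perfect scaling from n_small to n_large would give a value of 1.0.
        return (t_small * n_small) / (t_large * n_large)

    t_serial = 3600.0                          # invented single-CPU render time
    runs = [(1536, 12.0), (3072, 6.8), (6144, 4.38)]
    for n, t in runs:
        print(n, "speedup:", round(speedup(t_serial, t), 1))
    print("efficiency 1536->6144:",
          round(relative_efficiency(12.0, 1536, 4.38, 6144), 2))
    ```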

  13. Internal computational fluid mechanics on supercomputers for aerospace propulsion systems

    Science.gov (United States)

    Andersen, Bernhard H.; Benson, Thomas J.

    1987-01-01

    The accurate calculation of three-dimensional internal flowfields for application towards aerospace propulsion systems requires computational resources available only on supercomputers. A survey is presented of three-dimensional calculations of hypersonic, transonic, and subsonic internal flowfields conducted at the Lewis Research Center. A steady state Parabolized Navier-Stokes (PNS) solution of flow in a Mach 5.0, mixed compression inlet, a Navier-Stokes solution of flow in the vicinity of a terminal shock, and a PNS solution of flow in a diffusing S-bend with vortex generators are presented and discussed. All of these calculations were performed on either the NAS Cray-2 or the Lewis Research Center Cray XMP.

  14. Supercomputing Centers and Electricity Service Providers

    DEFF Research Database (Denmark)

    Patki, Tapasya; Bates, Natalie; Ghatikar, Girish

    2016-01-01

    Supercomputing Centers (SCs) have high and variable power demands, which increase the challenges of the Electricity Service Providers (ESPs) with regards to efficient electricity distribution and reliable grid operation. High penetration of renewable energy generation further exacerbates this problem. We present results from a detailed, quantitative survey-based analysis and compare the perspectives of the European grid and SCs to those of the United States (US). We then show that, contrary to expectation, SCs in the US are more open toward cooperating and developing demand-management strategies with their ESPs … (LRZ). We conclude that perspectives on demand management are dependent on the electricity market and pricing in the geographical region and on the degree of control that a particular SC has in terms of power-purchase negotiation.

  15. Virtualizing Super-Computation On-Board UAS

    Science.gov (United States)

    Salami, E.; Soler, J. A.; Cuadrado, R.; Barrado, C.; Pastor, E.

    2015-04-01

    Unmanned aerial systems (UAS, also known as UAV, RPAS or drones) have great potential to support a wide variety of aerial remote sensing applications. Most UAS work by acquiring data using on-board sensors for later post-processing; some require the data gathered to be downlinked to the ground in real time. However, depending on the volume of data and the cost of the communications, the latter option is not sustainable in the long term. This paper develops the concept of virtualizing super-computation on-board UAS as a method to ease operation by facilitating the downlink of high-level information products instead of raw data. Exploiting recent developments in miniaturized multi-core devices is the way to speed up on-board computation, and this hardware must satisfy size, power and weight constraints. Several technologies are appearing with promising results for high-performance computing on unmanned platforms, such as the 36 cores of the TILE-Gx36 by Tilera (now EZchip) or the 64 cores of the Epiphany-IV by Adapteva. The strategy for virtualizing super-computation on-board includes benchmarking for hardware selection, the software architecture, and the communications-aware design. A parallelization strategy is given for the 36-core TILE-Gx36 for a UAS in a fire mission or in similar target-detection applications. The results are obtained for payload image processing algorithms and determine in real time the data snapshot to gather and transfer to ground according to the needs of the mission, the processing time, and consumed watts.

  16. The company's mainframes join CERN's openlab for DataGrid apps and are pivotal in a new $22 million Supercomputer in the U.K.

    CERN Multimedia

    2002-01-01

    Hewlett-Packard has installed a supercomputer system valued at more than $22 million at the Wellcome Trust Sanger Institute (WTSI) in the U.K. HP has also joined the CERN openlab for DataGrid applications (1 page).

  17. Most Social Scientists Shun Free Use of Supercomputers.

    Science.gov (United States)

    Kiernan, Vincent

    1998-01-01

    Social scientists, who frequently complain that the federal government spends too little on them, are passing up what scholars in the physical and natural sciences see as the government's best give-aways: free access to supercomputers. Some social scientists say the supercomputers are difficult to use; others find desktop computers provide…

  18. Supercomputing - Use Cases, Advances, The Future (2/2)

    CERN Document Server

    CERN. Geneva

    2017-01-01

    Supercomputing has become a staple of science and the poster child for aggressive developments in silicon technology, energy efficiency and programming. In this series we examine the key components of supercomputing setups and the various advances – recent and past – that made headlines and delivered bigger and bigger machines. We also take a closer look at the future prospects of supercomputing, and the extent of its overlap with high throughput computing, in the context of main use cases ranging from oil exploration to market simulation. On the second day, we will focus on software and software paradigms driving supercomputers, workloads that need supercomputing treatment, advances in technology and possible future developments. Lecturer's short bio: Andrzej Nowak has 10 years of experience in computing technologies, primarily from CERN openlab and Intel. At CERN, he managed a research lab collaborating with Intel and was part of the openlab Chief Technology Office. Andrzej also worked closely and i...

  19. HPL and STREAM Benchmarks on SANAM Supercomputer

    KAUST Repository

    Bin Sulaiman, Riman A.

    2017-03-13

    SANAM supercomputer was jointly built by KACST and FIAS in 2012, ranking second that year in the Green500 list with a power efficiency of 2.3 GFLOPS/W (Rohr et al., 2014). It is a heterogeneous accelerator-based HPC system that has 300 compute nodes. Each node includes two Intel Xeon E5-2650 CPUs, two AMD FirePro S10000 dual GPUs and 128 GiB of main memory. In this work, the seven benchmarks of HPCC were installed and configured to reassess the performance of SANAM, as part of an unpublished master thesis, after it was reassembled in the Kingdom of Saudi Arabia. We present here detailed results of the HPL and STREAM benchmarks.
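
    For context, the STREAM triad kernel is a(i) = b(i) + q*c(i). A rough NumPy approximation of the bandwidth measurement follows; real STREAM is a compiled C benchmark, and the extra array traversals NumPy incurs make the printed figure a loose lower bound only.

    ```python
    # Approximate the STREAM triad a[i] = b[i] + q*c[i] and estimate bandwidth.
    import time
    import numpy as np

    n = 20_000_000                 # ~160 MB per float64 array
    b = np.random.random(n)
    c = np.random.random(n)
    a = np.empty_like(b)
    q = 3.0

    t0 = time.perf_counter()
    np.multiply(c, q, out=a)       # a = q*c
    np.add(a, b, out=a)            # a = b + q*c
    dt = time.perf_counter() - t0

    bytes_moved = 3 * n * 8        # STREAM counts: read b, read c, write a
    print(f"approximate triad bandwidth: {bytes_moved / dt / 1e9:.1f} GB/s")
    ```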

  20. Multiprocessing on supercomputers for computational aerodynamics

    Science.gov (United States)

    Yarrow, Maurice; Mehta, Unmeel B.

    1991-01-01

    Little use is made of multiple processors available on current supercomputers (computers with a theoretical peak performance capability equal to 100 MFLOPS or more) to improve turnaround time in computational aerodynamics. The productivity of a computer user is directly related to this turnaround time. In a time-sharing environment, such improvement in this speed is achieved when multiple processors are used efficiently to execute an algorithm. The concept of multiple instructions and multiple data (MIMD) is applied through multitasking via a strategy that requires relatively minor modifications to an existing code for a single processor. This approach maps the available memory to multiple processors, exploiting the C-Fortran-Unix interface. The existing code is mapped without the need for developing a new algorithm. The procedure for building a code utilizing this approach is automated with the Unix stream editor.
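
    The record's strategy predates modern APIs, but the idea of mapping memory partitions to processors that run independently can be sketched with a present-day analogue; Python multiprocessing here is a stand-in, not the C-Fortran-Unix interface described above, and the kernel is an invented placeholder.

    ```python
    # Partition the data and let independent workers execute the same kernel
    # on their own partitions, in the spirit of multitasking an existing code.
    import multiprocessing as mp
    import numpy as np

    def kernel(chunk: np.ndarray) -> float:
        # Stand-in for one processor's share of the computation.
        return float(np.sum(chunk * chunk))

    if __name__ == "__main__":
        data = np.arange(1_000_000, dtype=np.float64)
        parts = np.array_split(data, 4)            # map memory across 4 workers
        with mp.Pool(processes=4) as pool:
            partials = pool.map(kernel, parts)     # workers run concurrently
        print(sum(partials))
    ```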

  1. The PMS project Poor Man's Supercomputer

    CERN Document Server

    Csikor, Ferenc; Hegedüs, P; Horváth, V K; Katz, S D; Piróth, A

    2001-01-01

    We briefly describe the Poor Man's Supercomputer (PMS) project carried out at Eötvös University, Budapest. The goal is to develop a cost-effective, scalable, fast parallel computer to perform numerical calculations of physical problems that can be implemented on a lattice with nearest-neighbour interactions. To reach this goal we developed the PMS architecture using PC components and designed a special, low-cost communication hardware and the driver software for Linux OS. Our first implementation of the PMS includes 32 nodes (PMS1). The performance of the PMS1 was tested by Lattice Gauge Theory simulations. Using SU(3) pure gauge theory or the bosonic MSSM on the PMS1 computer we obtained a price-per-sustained-performance ratio of $3/Mflops. The design of the special hardware and the communication driver are freely available upon request for non-profit organizations.

  2. The BlueGene/L Supercomputer

    CERN Document Server

    Bhanot, G V; Gara, A; Vranas, P M; Bhanot, Gyan; Chen, Dong; Gara, Alan; Vranas, Pavlos

    2002-01-01

    The architecture of the BlueGene/L massively parallel supercomputer is described. Each computing node consists of a single compute ASIC plus 256 MB of external memory. The compute ASIC integrates two 700 MHz PowerPC 440 integer CPU cores, two 2.8 Gflops floating point units, 4 MB of embedded DRAM as cache, a memory controller for external memory, six 1.4 Gbit/s bi-directional ports for a 3-dimensional torus network connection, three 2.8 Gbit/s bi-directional ports for connecting to a global tree network and a Gigabit Ethernet for I/O. 65,536 such nodes are connected into a 3-d torus with a geometry of 32x32x64. The total peak performance of the system is 360 Teraflops and the total amount of memory is 16 TeraBytes.
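
    The quoted system totals follow directly from the per-node figures; a quick sanity check of the arithmetic (the abstract's peak figure is a rounded value):

        nodes = 65_536
        flops_per_node = 2 * 2.8e9    # two 2.8 Gflops floating point units per ASIC
        mem_per_node = 256 * 2**20    # 256 MB external memory per node

        print(f"peak: {nodes * flops_per_node / 1e12:.0f} Tflops")  # ~367, quoted as 360
        print(f"memory: {nodes * mem_per_node / 2**40:.0f} TiB")    # 16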

  3. Supercomputer and cluster performance modeling and analysis efforts:2004-2006.

    Energy Technology Data Exchange (ETDEWEB)

    Sturtevant, Judith E.; Ganti, Anand; Meyer, Harold (Hal) Edward; Stevenson, Joel O.; Benner, Robert E., Jr.; Goudy, Susan Phelps; Doerfler, Douglas W.; Domino, Stefan Paul; Taylor, Mark A.; Malins, Robert Joseph; Scott, Ryan T.; Barnette, Daniel Wayne; Rajan, Mahesh; Ang, James Alfred; Black, Amalia Rebecca; Laub, Thomas William; Vaughan, Courtenay Thomas; Franke, Brian Claude

    2007-02-01

    This report describes efforts by the Performance Modeling and Analysis Team to investigate performance characteristics of Sandia's engineering and scientific applications on the ASC capability and advanced architecture supercomputers, and Sandia's capacity Linux clusters. Efforts to model various aspects of these computers are also discussed. The goals of these efforts are to quantify and compare Sandia's supercomputer and cluster performance characteristics; to reveal strengths and weaknesses in such systems; and to predict performance characteristics of, and provide guidelines for, future acquisitions and follow-on systems. Described herein are results from running benchmarks and applications to extract performance characteristics and comparisons, as well as modeling efforts, during the 2004-2006 time period. The format of the report, with hypertext links to numerous additional documents, purposefully minimizes the document size needed to disseminate the extensive results from our research.

  4. World's biggest 'virtual supercomputer' given the go-ahead

    CERN Multimedia

    2003-01-01

    "The Particle Physics and Astronomy Research Council has today announced GBP 16 million to create a massive computing Grid, equivalent to the world's second largest supercomputer after Japan's Earth Simulator computer" (1 page).

  5. The TianHe-1A Supercomputer: Its Hardware and Software

    Institute of Scientific and Technical Information of China (English)

    Xue-Jun Yang; Xiang-Ke Liao; Kai Lu; Qing-Feng Hu; Jun-Qiang Song; Jin-Shu Su

    2011-01-01

    This paper presents an overview of the TianHe-1A (TH-1A) supercomputer, which was built by the National University of Defense Technology of China (NUDT). TH-1A adopts a hybrid architecture by integrating CPUs and GPUs, and its interconnect network is a proprietary high-speed communication network. The theoretical peak performance of TH-1A is 4700 TFlops, and its LINPACK test result is 2566 TFlops. It was ranked No. 1 on the TOP500 list released in November 2010. TH-1A is now deployed in the National Supercomputer Center in Tianjin and provides high performance computing services. TH-1A has played an important role in many applications, such as oil exploration, weather forecasting, and biomedical research.

  6. Developing and Deploying Advanced Algorithms to Novel Supercomputing Hardware

    CERN Document Server

    Brunner, Robert J; Myers, Adam D

    2007-01-01

    The objective of our research is to demonstrate the practical usage and orders-of-magnitude speedup of real-world applications by using alternative technologies to support high performance computing. Currently, the main barrier to the widespread adoption of this technology is the lack of development tools and case studies, which typically impedes non-specialists who might otherwise develop applications that could leverage these technologies. By partnering with the Innovative Systems Laboratory at the National Center for Supercomputing Applications, we have obtained access to several novel technologies, including several Field-Programmable Gate Array (FPGA) systems, NVidia Graphics Processing Units (GPUs), and the STI Cell BE platform. Our goal is not only to demonstrate the capabilities of these systems, but also to serve as guides for others to follow in our path. To date, we have explored the efficacy of the SRC-6 MAP-C and MAP-E and SGI RASC Athena and RC100 reconfigurable computing platforms in supporting a two-point co...

  7. Building more powerful less expensive supercomputers using Processing-In-Memory (PIM) LDRD final report.

    Energy Technology Data Exchange (ETDEWEB)

    Murphy, Richard C.

    2009-09-01

    This report details the accomplishments of the 'Building More Powerful Less Expensive Supercomputers Using Processing-In-Memory (PIM)' LDRD ('PIM LDRD', number 105809) for FY07-FY09. Latency dominates all levels of supercomputer design. Within a node, increasing memory latency, relative to processor cycle time, limits CPU performance. Between nodes, the same increase in relative latency impacts scalability. Processing-In-Memory (PIM) is an architecture that directly addresses this problem using enhanced chip fabrication technology and machine organization. PIMs combine high-speed logic and dense, low-latency, high-bandwidth DRAM, and lightweight threads that tolerate latency by performing useful work during memory transactions. This work examines the potential of PIM-based architectures to support mission critical Sandia applications and an emerging class of more data intensive informatics applications. This work has resulted in a stronger architecture/implementation collaboration between 1400 and 1700. Additionally, key technology components have impacted vendor roadmaps, and we are in the process of pursuing these new collaborations. This work has the potential to impact future supercomputer design and construction, reducing power and increasing performance. This final report is organized as follows: this summary chapter discusses the impact of the project (Section 1), provides an enumeration of publications and other public discussion of the work (Section 1), and concludes with a discussion of future work and impact from the project (Section 1). The appendix contains reprints of the refereed publications resulting from this work.

  8. Integration Of PanDA Workload Management System With Supercomputers for ATLAS and Data Intensive Science

    Science.gov (United States)

    Klimentov, A.; De, K.; Jha, S.; Maeno, T.; Nilsson, P.; Oleynik, D.; Panitkin, S.; Wells, J.; Wenaus, T.

    2016-10-01

    The LHC, operating at CERN, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing on over 150 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3 petaFLOPS, LHC data taking runs require more resources than the grid can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of the PanDA WMS with supercomputers in the United States, in particular with the Titan supercomputer at Oak Ridge Leadership Computing Facility. The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and local data management, with light-weight MPI wrappers to run single-threaded workloads in parallel on the LCFs' multi-core worker nodes. This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms for the ALICE and ATLAS experiments and has been in full production for ATLAS since September 2015. We will present our current accomplishments with running PanDA at supercomputers and demonstrate our ability to use PanDA as a portal independent of the
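
    The light-weight MPI wrapper idea can be sketched with mpi4py: every rank of one large batch job launches an independent single-threaded payload on its worker core. The payload command and input naming below are hypothetical; the real PanDA pilot also handles staging, monitoring and recovery.

        import subprocess
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()

        # One single-threaded workload per rank; inputs indexed by rank (assumed layout).
        cmd = ["./run_payload", f"--input=events_{rank:05d}.dat"]
        result = subprocess.run(cmd, capture_output=True, text=True)

        # Gather exit codes so rank 0 can report one overall outcome to the batch system.
        codes = comm.gather(result.returncode, root=0)
        if rank == 0:
            failed = sum(1 for code in codes if code != 0)
            print(f"{len(codes) - failed}/{len(codes)} payloads succeeded")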

  9. Analyzing the Interplay of Failures and Workload on a Leadership-Class Supercomputer

    Energy Technology Data Exchange (ETDEWEB)

    Meneses, Esteban [University of Pittsburgh]; Ni, Xiang [University of Illinois at Urbana-Champaign]; Jones, Terry R [ORNL]; Maxwell, Don E [ORNL]

    2015-01-01

    The unprecedented computational power of current supercomputers now makes possible the exploration of complex problems in many scientific fields, from genomic analysis to computational fluid dynamics. Modern machines are powerful because they are massive: they assemble millions of cores and a huge quantity of disks, cards, routers, and other components. But it is precisely the size of these machines that clouds the future of supercomputing. A system that comprises many components has a high chance to fail, and fail often. In order to make the next generation of supercomputers usable, it is imperative to use some type of fault tolerance platform to run applications on large machines. Most fault tolerance strategies can be optimized for the peculiarities of each system and boost efficacy by keeping the system productive. In this paper, we aim to understand how failure characterization can improve resilience in several layers of the software stack: applications, runtime systems, and job schedulers. We examine the Titan supercomputer, one of the fastest systems in the world. We analyze a full year of Titan in production and distill the failure patterns of the machine. By looking into Titan's log files and using the criteria of experts, we provide a detailed description of the types of failures. In addition, we inspect the job submission files and describe how the system is used. Using those two sources, we cross correlate failures in the machine to executing jobs and provide a picture of how failures affect the user experience. We believe such characterization is fundamental in developing appropriate fault tolerance solutions for Cray systems similar to Titan.
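
    Cross-correlating failures with executing jobs reduces, in the simplest reading, to an interval join: a failure touches the user experience when its timestamp falls inside a job's execution window on one of the job's nodes. The record layouts below are illustrative assumptions, not Titan's actual log schemas.

        from datetime import datetime

        # Hypothetical pre-parsed records; real Titan logs require heavy parsing first.
        failures = [{"time": datetime(2014, 3, 1, 4, 17),
                     "node": "c5-3c2s7n1", "type": "GPU XID"}]
        jobs = [{"start": datetime(2014, 3, 1, 2, 0), "end": datetime(2014, 3, 1, 6, 0),
                 "nodes": {"c5-3c2s7n1", "c5-3c2s7n2"}, "user": "u123"}]

        def jobs_hit_by(failure, jobs):
            """Jobs whose execution window and node set contain the failure."""
            return [j for j in jobs
                    if j["start"] <= failure["time"] <= j["end"]
                    and failure["node"] in j["nodes"]]

        for f in failures:
            for j in jobs_hit_by(f, jobs):
                print(f"{f['type']} on {f['node']} hit a job of user {j['user']}")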

  10. Taking ASCI supercomputing to the end game.

    Energy Technology Data Exchange (ETDEWEB)

    DeBenedictis, Erik P.

    2004-03-01

    The ASCI supercomputing program is broadly defined as running physics simulations on progressively more powerful digital computers. What happens if we extrapolate the computer technology to its end? We have developed a model for key ASCI computations running on a hypothetical computer whose technology is parameterized in ways that account for advancing technology. This model includes technology information such as Moore's Law for transistor scaling and developments in cooling technology. The model also includes limits imposed by laws of physics, such as thermodynamic limits on power dissipation, limits on cooling, and the limitation of signal propagation velocity to the speed of light. We apply this model and show that ASCI computations will advance smoothly for another 10-20 years to an 'end game' defined by thermodynamic limits and the speed of light. Performance levels at the end game will vary greatly by specific problem, but will be in the exaflops to zettaflops range for currently anticipated problems. We have also found an architecture that would be within a constant factor of giving optimal performance at the end game. This architecture is an evolutionary derivative of the mesh-connected microprocessor (such as ASCI Red Storm or IBM Blue Gene/L). We provide designs for the necessary enhancement to microprocessor functionality and the power-efficiency of both the processor and memory system. The technology we develop in the foregoing provides a 'perfect' computer model with which we can rate the quality of realizable computer designs, both in this writing and as a way of designing future computers. This report focuses on classical computers based on irreversible digital logic, and more specifically on algorithms that simulate physical space; reversible logic, analog computers, and other ways to address stockpile stewardship are outside the scope of this report.
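
    The speed-of-light limit in such models is easy to make concrete: at multi-gigahertz clock rates a signal cannot cross a machine-room-sized computer within one cycle, so communication latency, not arithmetic, dominates. The machine size below is an arbitrary assumption for illustration.

        c = 3.0e8            # upper bound on signal speed, m/s
        clock = 3.0e9        # a 3 GHz processor clock, Hz
        machine_size = 30.0  # metres across a large machine room (assumption)

        per_cycle = c / clock  # distance a signal can travel in one clock cycle
        print(f"signal travel per cycle: {per_cycle * 100:.0f} cm")                 # ~10 cm
        print(f"cycles just to cross the machine: {machine_size / per_cycle:.0f}")  # ~300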

  11. Numerical infinities and infinitesimals in a new supercomputing framework

    Science.gov (United States)

    Sergeyev, Yaroslav D.

    2016-06-01

    Traditional computers are able to work numerically with finite numbers only. The Infinity Computer patented recently in the USA and EU overcomes this limitation. In fact, it is a computational device of a new kind able to work numerically not only with finite quantities but with infinities and infinitesimals as well. The new supercomputing methodology is not related to non-standard analysis and does not use either Cantor's infinite cardinals or ordinals. It is founded on Euclid's Common Notion 5, saying 'The whole is greater than the part'. This postulate is applied to all numbers (finite, infinite, and infinitesimal) and to all sets and processes (finite and infinite). It is shown that it becomes possible to write down finite, infinite, and infinitesimal numbers by a finite number of symbols as numerals belonging to a positional numeral system with an infinite radix described by a specific ad hoc introduced axiom. Numerous examples of the usage of the introduced computational tools are given during the lecture. In particular, algorithms for solving optimization problems and ODEs are considered among the computational applications of the Infinity Computer. Numerical experiments executed on a software prototype of the Infinity Computer are discussed.
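
    In Sergeyev's notation the infinite radix is the "grossone" unit, written ①, and a numeral is a finite record of grossdigits c multiplied by grosspowers of ①. As a sketch of the form such numerals take (this is the general schema from Sergeyev's publications, not a worked result from the lecture):

        C = c_{p_m} ①^{p_m} + \dots + c_{p_1} ①^{p_1} + c_{p_0} ①^{p_0}
              + c_{p_{-1}} ①^{p_{-1}} + \dots + c_{p_{-k}} ①^{p_{-k}}

    For example, 3①^2 + 0.5①^0 + 7①^{-1} records an infinite part 3①^2, a finite part 0.5, and an infinitesimal part 7①^{-1} in a single finite numeral.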

  12. Supercomputing - Use Cases, Advances, The Future (1/2)

    CERN Document Server

    CERN. Geneva

    2017-01-01

    Supercomputing has become a staple of science and the poster child for aggressive developments in silicon technology, energy efficiency and programming. In this series we examine the key components of supercomputing setups and the various advances – recent and past – that made headlines and delivered bigger and bigger machines. We also take a closer look at the future prospects of supercomputing, and the extent of its overlap with high throughput computing, in the context of main use cases ranging from oil exploration to market simulation. On the first day, we will focus on the history and theory of supercomputing, the top500 list and the hardware that makes supercomputers tick. Lecturer's short bio: Andrzej Nowak has 10 years of experience in computing technologies, primarily from CERN openlab and Intel. At CERN, he managed a research lab collaborating with Intel and was part of the openlab Chief Technology Office. Andrzej also worked closely and initiated projects with the private sector (e.g. HP an...

  13. Development of the general interpolants method for the CYBER 200 series of supercomputers

    Science.gov (United States)

    Stalnaker, J. F.; Robinson, M. A.; Spradley, L. W.; Kurzius, S. C.; Thoenes, J.

    1988-01-01

    The General Interpolants Method (GIM) is a 3-D, time-dependent, hybrid procedure for generating numerical analogs of the conservation laws. This study is directed toward the development and application of the GIM computer code for fluid dynamic research applications as implemented for the CYBER 200 series of supercomputers. Elliptic and quasi-parabolic versions of the GIM code are discussed. Turbulence models, both algebraic and differential-equation based, were added to the basic viscous code. An equilibrium reacting chemistry model and an implicit finite difference scheme are also included.

  14. Validation of the diagnoses of panic disorder and phobic disorders in the US National Comorbidity Survey Replication Adolescent (NCS-A) supplement.

    Science.gov (United States)

    Green, Jennifer Greif; Avenevoli, Shelli; Finkelman, Matthew; Gruber, Michael J; Kessler, Ronald C; Merikangas, Kathleen R; Sampson, Nancy A; Zaslavsky, Alan M

    2011-06-01

    Validity of the adolescent version of the World Health Organization Composite International Diagnostic Interview (CIDI) Version 3.0, a fully-structured research diagnostic interview designed to be used by trained lay interviewers, is assessed in comparison to independent clinical diagnoses based on the Schedule for Affective Disorders and Schizophrenia for School-age Children (K-SADS). This assessment is carried out in the clinical reappraisal sub-sample (n = 347) of the US National Comorbidity Survey Adolescent (NCS-A) supplement, a large (n = 10,148) community epidemiological survey of the prevalence and correlates of adolescent mental disorders in the United States. The diagnoses considered are panic disorder and phobic disorders (social phobia, specific phobia, agoraphobia). CIDI diagnoses are found to have good concordance with K-SADS diagnoses [area under the receiver operating characteristic curve (AUC) = 0.81-0.94], although the CIDI diagnoses are consistently somewhat higher than the K-SADS diagnoses. Data are also presented on criterion-level concordance in an effort to pinpoint CIDI question series that might be improved in future modifications of the instrument. Finally, data are presented on the factor structure of the fears associated with social phobia, the only disorder in this series where substantial controversy exists about disorder subtypes. Copyright © 2011 John Wiley & Sons, Ltd.
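
    For a single binary screening diagnosis judged against a clinical gold standard, the AUC collapses to the average of sensitivity and specificity, which is how a concordance in the reported 0.81-0.94 range can arise; the counts below are made up for illustration and are not NCS-A data.

        # 2x2 table of CIDI vs. K-SADS diagnoses (hypothetical counts)
        tp, fn = 40, 8     # K-SADS positive: CIDI positive / CIDI negative
        fp, tn = 12, 287   # K-SADS negative: CIDI positive / CIDI negative

        sensitivity = tp / (tp + fn)
        specificity = tn / (tn + fp)
        auc = (sensitivity + specificity) / 2   # AUC of one binary operating point
        print(f"sens={sensitivity:.2f} spec={specificity:.2f} AUC={auc:.2f}")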

  15. Visualization on supercomputing platform level II ASC milestone (3537-1B) results from Sandia.

    Energy Technology Data Exchange (ETDEWEB)

    Geveci, Berk (Kitware, Inc., Clifton Park, NY); Fabian, Nathan; Marion, Patrick (Kitware, Inc., Clifton Park, NY); Moreland, Kenneth D.

    2010-09-01

    This report provides documentation for the completion of the Sandia portion of the ASC Level II Visualization on the platform milestone. This ASC Level II milestone is a joint milestone between Sandia National Laboratories and Los Alamos National Laboratories. This milestone contains functionality required for performing visualization directly on a supercomputing platform, which is necessary for peta-scale visualization. Sandia's contribution concerns in-situ visualization, running a visualization in tandem with a solver. Visualization and analysis of petascale data is limited by several factors which must be addressed as ACES delivers the Cielo platform. Two primary difficulties are: (1) performance of interactive rendering, which is the most computationally intensive portion of the visualization process - for terascale platforms, commodity clusters with graphics processors (GPUs) have been used for interactive rendering, while for petascale platforms, visualization and rendering may be able to run efficiently on the supercomputer platform itself; and (2) I/O bandwidth, which limits how much information can be written to disk - if we simply analyze the sparse information that is saved to disk, we miss the opportunity to analyze the rich information produced every timestep by the simulation. For the first issue, we are pursuing in-situ analysis, in which simulations are coupled directly with analysis libraries at runtime. This milestone will evaluate the visualization and rendering performance of current and next generation supercomputers in contrast to GPU-based visualization clusters, and evaluate the performance of common analysis libraries coupled with the simulation that analyze and write data to disk during a running simulation. This milestone will explore, evaluate and advance the maturity level of these technologies and their applicability to problems of interest to the ASC program. Scientific simulation on parallel supercomputers is traditionally performed in four

  16. Argonne Leadership Computing Facility 2011 annual report : Shaping future supercomputing.

    Energy Technology Data Exchange (ETDEWEB)

    Papka, M.; Messina, P.; Coffey, R.; Drugan, C. (LCF)

    2012-08-16

    The ALCF's Early Science Program aims to prepare key applications for the architecture and scale of Mira and to solidify libraries and infrastructure that will pave the way for other future production applications. Two billion core-hours have been allocated to 16 Early Science projects on Mira. The projects, in addition to promising delivery of exciting new science, are all based on state-of-the-art, petascale, parallel applications. The project teams, in collaboration with ALCF staff and IBM, have undertaken intensive efforts to adapt their software to take advantage of Mira's Blue Gene/Q architecture, which, in a number of ways, is a precursor to future high-performance-computing architecture. The Argonne Leadership Computing Facility (ALCF) enables transformative science that solves some of the most difficult challenges in biology, chemistry, energy, climate, materials, physics, and other scientific realms. Users partnering with ALCF staff have reached research milestones previously unattainable, due to the ALCF's world-class supercomputing resources and expertise in computational science. In 2011, the ALCF's commitment to providing outstanding science and leadership-class resources was honored with several prestigious awards. Research on multiscale brain blood flow simulations was named a Gordon Bell Prize finalist. Intrepid, the ALCF's BG/P system, ranked No. 1 on the Graph 500 list for the second consecutive year. The next-generation BG/Q prototype again topped the Green500 list. Skilled experts at the ALCF enable researchers to conduct breakthrough science on the Blue Gene system in key ways. The Catalyst Team matches project PIs with experienced computational scientists to maximize and accelerate research in their specific scientific domains. The Performance Engineering Team facilitates the effective use of applications on the Blue Gene system by assessing and improving the algorithms used by applications and the techniques used to

  17. Recent results from the Swinburne supercomputer software correlator

    Science.gov (United States)

    Tingay, Steven; et al.

    I will describe the development of software correlators on the Swinburne Beowulf supercomputer and recent work using the Cray XD-1 machine. I will also describe recent Australian and global VLBI experiments that have been processed on the Swinburne software correlator, along with imaging results from these data. The role of the software correlator in Australia's eVLBI project will be discussed.

  18. Flux-Level Transit Injection Experiments with NASA Pleiades Supercomputer

    Science.gov (United States)

    Li, Jie; Burke, Christopher J.; Catanzarite, Joseph; Seader, Shawn; Haas, Michael R.; Batalha, Natalie; Henze, Christopher; Christiansen, Jessie; Kepler Project, NASA Advanced Supercomputing Division

    2016-06-01

    Flux-Level Transit Injection (FLTI) experiments are executed with NASA's Pleiades supercomputer for the Kepler Mission. The latest release (9.3, January 2016) of the Kepler Science Operations Center Pipeline is used in the FLTI experiments. Their purpose is to validate the Analytic Completeness Model (ACM), which can be computed for all Kepler target stars, thereby enabling exoplanet occurrence rate studies. Pleiades, a facility of NASA's Advanced Supercomputing Division, is one of the world's most powerful supercomputers and represents NASA's state-of-the-art technology. We discuss the details of implementing the FLTI experiments on the Pleiades supercomputer. For example, taking into account that ~16 injections are generated by one core of the Pleiades processors in an hour, the “shallow” FLTI experiment, in which ~2000 injections are required per target star, can be done for 16% of all Kepler target stars in about 200 hours. Stripping down the transit search to bare bones, i.e. only searching adjacent high/low periods at high/low pulse durations, makes the computationally intensive FLTI experiments affordable. The design of the FLTI experiments and the analysis of the resulting data are presented in “Validating an Analytic Completeness Model for Kepler Target Stars Based on Flux-level Transit Injection Experiments” by Catanzarite et al. (#2494058).Kepler was selected as the 10th mission of the Discovery Program. Funding for the Kepler Mission has been provided by the NASA Science Mission Directorate.
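
    The quoted throughput figures can be checked with back-of-the-envelope arithmetic; the Kepler target star count below is an assumption chosen only to show how the 200-hour estimate scales.

        inj_per_star = 2000          # the "shallow" FLTI experiment
        inj_per_core_hour = 16       # quoted Pleiades throughput per core
        targets = 200_000            # approximate Kepler target count (assumption)
        fraction = 0.16              # share of targets covered in the quoted run

        core_hours = targets * fraction * inj_per_star / inj_per_core_hour
        wall_hours = 200
        print(f"core-hours needed: {core_hours:.1e}")              # 4.0e+06
        print(f"cores implied: {core_hours / wall_hours:,.0f}")    # 20,000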

  19. Access to Supercomputers. Higher Education Panel Report 69.

    Science.gov (United States)

    Holmstrom, Engin Inel

    This survey was conducted to provide the National Science Foundation with baseline information on current computer use in the nation's major research universities, including the actual and potential use of supercomputers. Questionnaires were sent to 207 doctorate-granting institutions; after follow-ups, 167 institutions (91% of the institutions…

  20. The Sky's the Limit When Super Students Meet Supercomputers.

    Science.gov (United States)

    Trotter, Andrew

    1991-01-01

    In a few select high schools in the U.S., supercomputers are allowing talented students to attempt sophisticated research projects using simultaneous simulations of nature, culture, and technology not achievable by ordinary microcomputers. Schools can get their students online by entering contests and seeking grants and partnerships with…

  1. Feynman diagrams sampling for quantum field theories on the QPACE 2 supercomputer

    Energy Technology Data Exchange (ETDEWEB)

    Rappl, Florian

    2016-08-01

    This work discusses the application of Feynman diagram sampling in quantum field theories. The method uses a computer simulation to sample the diagrammatic space obtained in a series expansion. For running large physical simulations powerful computers are obligatory, effectively splitting the thesis into two parts. The first part deals with the method of Feynman diagram sampling. Here the theoretical background of the method itself is discussed. Additionally, important statistical concepts and the theory of the strong force, quantum chromodynamics, are introduced. This sets the context of the simulations. We create and evaluate a variety of models to estimate the applicability of diagrammatic methods. The method is then applied to sample the perturbative expansion of the vertex correction. In the end we obtain the value for the anomalous magnetic moment of the electron. The second part looks at the QPACE 2 supercomputer. This includes a short introduction to supercomputers in general, as well as a closer look at the architecture and the cooling system of QPACE 2. Guiding benchmarks of the InfiniBand network are presented. At the core of this part, a collection of best practices and useful programming concepts are outlined, which enables the development of efficient, yet easily portable, applications for the QPACE 2 system.
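
    At its core, diagram sampling is a Metropolis random walk over expansion orders, with each order weighted by the corresponding term of the series. A toy sketch under that reading (it assumes nothing about the QPACE 2 codes): sample the order n with weight x^n/n! and recover the series sum e^x from how often the walk visits order zero.

        import math
        import random

        def sample_series_sum(x, steps=1_000_000, seed=1):
            """Estimate sum_n x**n/n! (= e**x) by Metropolis sampling over orders n."""
            rng = random.Random(seed)
            weight = lambda n: x**n / math.factorial(n)
            n, zero_visits = 0, 0
            for _ in range(steps):
                m = n + rng.choice((-1, 1))            # propose a neighbouring order
                if m >= 0 and rng.random() < min(1.0, weight(m) / weight(n)):
                    n = m
                zero_visits += (n == 0)
            # The walk visits order 0 with probability w(0)/S and w(0) = 1, so S = steps/visits.
            return steps / zero_visits

        print(sample_series_sum(2.0))   # approaches e**2 = 7.389...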

  2. Unique Methodologies for Nano/Micro Manufacturing Job Training Via Desktop Supercomputer Modeling and Simulation

    Energy Technology Data Exchange (ETDEWEB)

    Kimball, Clyde [Northern Illinois Univ., DeKalb, IL (United States); Karonis, Nicholas [Northern Illinois Univ., DeKalb, IL (United States); Lurio, Laurence [Northern Illinois Univ., DeKalb, IL (United States); Piot, Philippe [Northern Illinois Univ., DeKalb, IL (United States); Xiao, Zhili [Northern Illinois Univ., DeKalb, IL (United States); Glatz, Andreas [Northern Illinois Univ., DeKalb, IL (United States); Pohlman, Nicholas [Northern Illinois Univ., DeKalb, IL (United States); Hou, Minmei [Northern Illinois Univ., DeKalb, IL (United States); Demir, Veysel [Northern Illinois Univ., DeKalb, IL (United States); Song, Jie [Northern Illinois Univ., DeKalb, IL (United States); Duffin, Kirk [Northern Illinois Univ., DeKalb, IL (United States); Johns, Mitrick [Northern Illinois Univ., DeKalb, IL (United States); Sims, Thomas [Northern Illinois Univ., DeKalb, IL (United States); Yin, Yanbin [Northern Illinois Univ., DeKalb, IL (United States)

    2012-11-21

    This project establishes an initiative in high speed (Teraflop)/large-memory desktop supercomputing for modeling and simulation of dynamic processes important for energy and industrial applications. It provides a training ground for employment of current students in an emerging field with skills necessary to access the large supercomputing systems now present at DOE laboratories. It also provides a foundation for NIU faculty to quantum leap beyond their current small cluster facilities. The funding extends faculty and student capability to a new level of analytic skills with concomitant publication avenues. The components of the Hewlett Packard computer obtained by the DOE funds create a hybrid combination of a Graphics Processing System (12 GPU/Teraflops) and a Beowulf CPU system (144 CPU), the first expandable via the NIU GAEA system to ~60 Teraflops integrated with a 720 CPU Beowulf system. The software is based on access to the NVIDIA/CUDA library and the ability through MATLAB multiple licenses to create additional local programs. A number of existing programs are being transferred to the CPU Beowulf Cluster. Since the expertise necessary to create the parallel processing applications has recently been obtained at NIU, this effort for software development is in an early stage. The educational program has been initiated via formal tutorials and classroom curricula designed for the coming year. Specifically, the cost focus was on hardware acquisitions and appointment of graduate students for a wide range of applications in engineering, physics and computer science.

  3. Heterogeneous Computing and Optimization on Tianhe-2 Supercomputer System for High-Order Accurate CFD Applications

    Institute of Scientific and Technical Information of China (English)

    王勇献; 张理论; 车永刚; 徐传福; 刘巍; 程兴华

    2015-01-01

    There remain great challenges in simulating large-scale computational fluid dynamics (CFD) applications on contemporary supercomputer systems with many-core heterogeneous architectures such as Tianhe-2, and this is one of the research hotspots in the field. In this paper, we focus on techniques for efficient parallel simulation of large-scale CFD applications with high-order accurate schemes on heterogeneous high-performance computing (HPC) platforms. Approaches and strategies for performance optimization, matched both to the characteristics of CFD applications and to the architecture of heterogeneous HPC platforms, are proposed from the perspectives of task decomposition, exploitation of parallelism, optimization of multi-threaded execution, vectorization employing single-instruction multiple-data (SIMD) units, optimization of the cooperation between CPUs and co-processors, and so on. To evaluate these techniques, numerical experiments were performed on the Tianhe-2 supercomputer system with the maximum number of grid points reaching 1.228 × 10^11 and the total number of processors and co-processors being 590,000. To our best knowledge, a large-scale CFD simulation with a high-order accurate scheme at this scale had never been attempted before. The optimized code achieves a speedup of 2.6× on the hybrid CPU/co-processor platform relative to the CPU-only platform, and near-perfect scalability is observed in the test results. The present work redefines the frontier of high performance computing for fluid dynamics simulations on heterogeneous platforms.

  4. Graph visualization for the analysis of the structure and dynamics of extreme-scale supercomputers

    Energy Technology Data Exchange (ETDEWEB)

    Berkbigler, K. P. (Kathryn P.); Bush, B. W. (Brian W.); Davis, Kei; Hoisie, A. (Adolfy); Smith, S. A. (Steve A.)

    2002-01-01

    We are exploring the development and application of information visualization techniques for the analysis of new extreme-scale supercomputer architectures. Modern supercomputers typically comprise very large clusters of commodity SMPs interconnected by possibly dense and often nonstandard networks. The scale, complexity, and inherent nonlocality of the structure and dynamics of this hardware, and the systems and applications distributed over it, challenge traditional analysis methods. As part of the a la carte team at Los Alamos National Laboratory, who are simulating these advanced architectures, we are exploring advanced visualization techniques and creating tools to provide intuitive exploration, discovery, and analysis of these simulations. This work complements existing and emerging algorithmic analysis tools. Here we give background on the problem domain, a description of a prototypical computer architecture of interest (on the order of 10,000 processors connected by a quaternary fat-tree network), and presentations of several visualizations of the simulation data that make clear the flow of data in the interconnection network.

  5. Extending ATLAS Computing to Commercial Clouds and Supercomputers

    CERN Document Server

    Nilsson, P; The ATLAS collaboration; Filipcic, A; Klimentov, A; Maeno, T; Oleynik, D; Panitkin, S; Wenaus, T; Wu, W

    2014-01-01

    The Large Hadron Collider will resume data collection in 2015 with substantially increased computing requirements relative to its first 2009-2013 run. A near doubling of the energy and the data rate, a high level of event pile-up, and detector upgrades will mean the number and complexity of events to be analyzed will increase dramatically. A naive extrapolation of the Run 1 experience would suggest that a 5-6 fold increase in computing resources is needed - impossible within the anticipated flat computing budgets in the near future. Consequently ATLAS is engaged in an ambitious program to expand its computing to all available resources, notably including opportunistic use of commercial clouds and supercomputers. Such resources present new challenges in managing heterogeneity, supporting data flows, parallelizing workflows, provisioning software, and other aspects of distributed computing, all while minimizing operational load. We will present the ATLAS experience to date with clouds and supercomputers, and des...

  6. Integration of Titan supercomputer at OLCF with ATLAS production system

    CERN Document Server

    Panitkin, Sergey; The ATLAS collaboration

    2016-01-01

    The PanDA (Production and Distributed Analysis) workload management system was developed to meet the scale and complexity of distributed computing for the ATLAS experiment. PanDA-managed resources are distributed worldwide, on hundreds of computing sites, with thousands of physicists accessing hundreds of petabytes of data, and the rate of data processing already exceeds an exabyte per year. While PanDA currently uses more than 200,000 cores at well over 100 Grid sites, future LHC data taking runs will require more resources than Grid computing can possibly provide. Additional computing and storage resources are required. Therefore ATLAS is engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. In this talk we will describe a project aimed at integration of the ATLAS Production System with the Titan supercomputer at Oak Ridge Leadership Computing Facility (OLCF). The current approach utilizes a modified PanDA Pilot framework for job...

  7. Supercomputers ready for use as discovery machines for neuroscience

    OpenAIRE

    Kunkel, Susanne; Schmidt, Maximilian; Helias, Moritz; Eppler, Jochen Martin; Igarashi, Jun; Masumoto, Gen; Fukai, Tomoki; Ishii, Shin; Plesser, Hans Ekkehard; Morrison, Abigail; Diesmann, Markus

    2013-01-01

    NEST is a widely used tool to simulate biological spiking neural networks [1]. The simulator is subject to continuous development, which is driven by the requirements of the current neuroscientific questions. At present, a major part of the software development focuses on the improvement of the simulator's fundamental data structures in order to enable brain-scale simulations on supercomputers such as the Blue Gene system in Jülich and the K computer in Kobe. Based on our memory-u...

  8. Scientists turn to supercomputers for knowledge about universe

    CERN Multimedia

    White, G

    2003-01-01

    The DOE is funding the computers at the Center for Astrophysical Thermonuclear Flashes which is based at the University of Chicago and uses supercomputers at the nation's weapons labs to study explosions in and on certain stars. The DOE is picking up the project's bill in the hope that the work will help the agency learn to better simulate the blasts of nuclear warheads (1 page).

  9. Study of ATLAS TRT performance with GRID and supercomputers

    Science.gov (United States)

    Krasnopevtsev, D. V.; Klimentov, A. A.; Mashinistov, R. Yu.; Belyaev, N. L.; Ryabinkin, E. A.

    2016-09-01

    One of the most important tasks for ATLAS physics analysis is the reconstruction of proton-proton events with a large number of interactions in the Transition Radiation Tracker. This paper includes Transition Radiation Tracker performance results obtained using the ATLAS GRID and the Kurchatov Institute's Data Processing Center, including its Tier-1 grid site and supercomputer, as well as an analysis of CPU efficiency during these studies.

  10. From Thread to Transcontinental Computer: Disturbing Lessons in Distributed Supercomputing

    CERN Document Server

    Groen, Derek

    2015-01-01

    We describe the political and technical complications encountered during the astronomical CosmoGrid project. CosmoGrid is a numerical study on the formation of large scale structure in the universe. The simulations are challenging due to the enormous dynamic range in spatial and temporal coordinates, as well as the enormous computer resources required. In CosmoGrid we dealt with the computational requirements by connecting up to four supercomputers via an optical network and making them operate as a single machine. This was challenging, if only for the fact that the supercomputers of our choice are separated by half the planet: three of them are scattered across Europe and the fourth is in Tokyo. The co-scheduling of multiple computers and the 'gridification' of the code enabled us to achieve an efficiency of up to 93% for this distributed intercontinental supercomputer. In this work, we find that high-performance computing on a grid can be done much more effectively if the sites involved are will...

  11. Proceedings of the first energy research power supercomputer users symposium

    Energy Technology Data Exchange (ETDEWEB)

    1991-01-01

    The Energy Research Power Supercomputer Users Symposium was arranged to showcase the richness of science that has been pursued and accomplished in this program through the use of supercomputers and now high performance parallel computers over the last year: this report is the collection of the presentations given at the Symposium. 'Power users' were invited by the ER Supercomputer Access Committee to show that the use of these computational tools and the associated data communications network, ESNet, goes beyond merely speeding up computations. Today the work often directly contributes to the advancement of the conceptual developments in their fields, and the computational and network resources form the very infrastructure of today's science. The Symposium also provided an opportunity, which is rare in this day of network access to computing resources, for the invited users to compare and discuss their techniques and approaches with those used in other ER disciplines. The significance of new parallel architectures was highlighted by the interesting evening talk given by Dr. Stephen Orszag of Princeton University.

  12. Extracting the Textual and Temporal Structure of Supercomputing Logs

    Energy Technology Data Exchange (ETDEWEB)

    Jain, S; Singh, I; Chandra, A; Zhang, Z; Bronevetsky, G

    2009-05-26

    Supercomputers are prone to frequent faults that adversely affect their performance, reliability and functionality. System logs collected on these systems are a valuable resource of information about their operational status and health. However, their massive size, complexity, and lack of standard format make it difficult to automatically extract information that can be used to improve system management. In this work we propose a novel method to succinctly represent the contents of supercomputing logs, by using textual clustering to automatically find the syntactic structures of log messages. This information is used to automatically classify messages into semantic groups via an online clustering algorithm. Further, we describe a methodology for using the temporal proximity between groups of log messages to identify correlated events in the system. We apply our proposed methods to two large, publicly available supercomputing logs and show that our technique features nearly perfect accuracy for online log-classification and extracts meaningful structural and temporal message patterns that can be used to improve the accuracy of other log analysis techniques.
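
    The syntactic-structure step can be approximated by masking the variable tokens of each message and grouping on the residue; the regular expressions and sample lines below are illustrative assumptions, not the clustering algorithm of the paper.

        import re
        from collections import defaultdict

        def template(msg):
            """Mask hex ids and numbers so messages group by their fixed syntax."""
            msg = re.sub(r"0x[0-9a-fA-F]+", "<HEX>", msg)
            msg = re.sub(r"\d+", "<NUM>", msg)
            return msg

        logs = [
            "node n1203 link error count 17",
            "node n0042 link error count 3",
            "kernel panic at 0xdeadbeef",
        ]

        groups = defaultdict(list)
        for line in logs:
            groups[template(line)].append(line)

        for tmpl, members in groups.items():
            print(f"{len(members):3d}x  {tmpl}")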

  13. OpenMC: Towards Simplifying Programming for TianHe Supercomputers

    Institute of Scientific and Technical Information of China (English)

    廖湘科; 杨灿群; 唐滔; 易会战; 王锋; 吴强; 薛京灵

    2014-01-01

    Modern petascale and future exascale systems are massively heterogeneous architectures. Developing productive intra-node programming models is crucial toward addressing their programming challenge. We introduce a directive-based intra-node programming model, OpenMC, and show that this new model can achieve ease of programming, high performance, and the degree of portability desired for heterogeneous nodes, especially those in TianHe supercomputers. While existing models are geared towards offloading computations to accelerators (typically one), OpenMC aims to more uniformly and adequately exploit the potential offered by multiple CPUs and accelerators in a compute node. OpenMC achieves this by providing a unified abstraction of hardware resources as workers and facilitating the exploitation of asynchronous task parallelism on the workers. We present an overview of OpenMC, a prototyping implementation, and results from some initial comparisons with OpenMP and hand-written code in developing six applications on two types of nodes from TianHe supercomputers.

  14. Visualization at Supercomputing Centers: The Tale of Little Big Iron and the Three Skinny Guys

    Energy Technology Data Exchange (ETDEWEB)

    Bethel, E. Wes; van Rosendale, John; Southard, Dale; Gaither, Kelly; Childs, Hank; Brugger, Eric; Ahern, Sean

    2010-12-01

    Supercomputing Centers (SCs) are unique resources that aim to enable scientific knowledge discovery through the use of large computational resources, the Big Iron. Design, acquisition, installation, and management of the Big Iron are activities that are carefully planned and monitored. Since these Big Iron systems produce a tsunami of data, it is natural to co-locate visualization and analysis infrastructure as part of the same facility. This infrastructure consists of hardware (Little Iron) and staff (Skinny Guys). Our collective experience suggests that design, acquisition, installation, and management of the Little Iron and Skinny Guys do not receive the same level of treatment as that of the Big Iron. The main focus of this article is to explore different aspects of planning, designing, fielding, and maintaining the visualization and analysis infrastructure at supercomputing centers. Some of the questions we explore include: How should the Little Iron be sized to adequately support visualization and analysis of data coming off the Big Iron? What sort of capabilities does it need to have? Related questions concern the size of the visualization support staff: How big should a visualization program be (number of persons) and what should the staff do? How much of the visualization should be provided as a support service, and how much should applications scientists be expected to do on their own?

  15. The QCDOC supercomputer: hardware, software, and performance

    CERN Document Server

    Boyle, P A; Wettig, T

    2003-01-01

    An overview is given of the QCDOC architecture, a massively parallel and highly scalable computer optimized for lattice QCD using system-on-a-chip technology. The heart of a single node is the PowerPC-based QCDOC ASIC, developed in collaboration with IBM Research, with a peak speed of 1 GFlop/s. The nodes communicate via high-speed serial links in a 6-dimensional mesh with nearest-neighbor connections. We find that highly optimized four-dimensional QCD code obtains over 50% efficiency in cycle accurate simulations of QCDOC, even for problems of fixed computational difficulty run on tens of thousands of nodes. We also provide an overview of the QCDOC operating system, which manages and runs QCDOC applications on partitions of variable dimensionality. Finally, the SciDAC activity for QCDOC and the message-passing interface QMP specified as a part of the SciDAC effort are discussed for QCDOC. We explain how to make optimal use of QMP routines on QCDOC in conjunction with existing C and C++ lattice QCD codes, inc...

  16. Supercomputer modeling of volcanic eruption dynamics

    Energy Technology Data Exchange (ETDEWEB)

    Kieffer, S.W. [Arizona State Univ., Tempe, AZ (United States); Valentine, G.A. [Los Alamos National Lab., NM (United States); Woo, Mahn-Ling [Arizona State Univ., Tempe, AZ (United States)

    1995-06-01

    Our specific goals are to: (1) provide a set of models based on well-defined assumptions about initial and boundary conditions to constrain interpretations of observations of active volcanic eruptions--including movies of flow front velocities, satellite observations of temperature in plumes vs. time, and still photographs of the dimensions of erupting plumes and flows on Earth and other planets; (2) to examine the influence of subsurface conditions on exit plane conditions and plume characteristics, and to compare the models of subsurface fluid flow with seismic constraints where possible; (3) to relate equations-of-state for magma-gas mixtures to flow dynamics; (4) to examine, in some detail, the interaction of the flowing fluid with the conduit walls and ground topography through boundary layer theory so that field observations of erosion and deposition can be related to fluid processes; and (5) to test the applicability of existing two-phase flow codes for problems related to the generation of volcanic long-period seismic signals; (6) to extend our understanding and simulation capability to problems associated with emplacement of fragmental ejecta from large meteorite impacts.

  17. Operational numerical weather prediction on a GPU-accelerated cluster supercomputer

    Science.gov (United States)

    Lapillonne, Xavier; Fuhrer, Oliver; Spörri, Pascal; Osuna, Carlos; Walser, André; Arteaga, Andrea; Gysi, Tobias; Rüdisühli, Stefan; Osterried, Katherine; Schulthess, Thomas

    2016-04-01

    The local area weather prediction model COSMO is used at MeteoSwiss to provide high resolution numerical weather predictions over the Alpine region. In order to benefit from the latest developments in computer technology, the model was optimized and adapted to run on Graphical Processing Units (GPUs). Thanks to these model adaptations and the acquisition of a dedicated hybrid supercomputer, a new set of operational applications has been introduced at MeteoSwiss: COSMO-1 (1 km deterministic), COSMO-E (2 km ensemble) and KENDA (data assimilation). These new applications correspond to a factor of 40 increase in computational load compared to the previous operational setup. We present an overview of the approach used to port the COSMO model to GPUs, together with a detailed description of, and performance results on, the new hybrid Cray CS-Storm computer, Piz Kesch.

  18. Palacios and Kitten : high performance operating systems for scalable virtualized and native supercomputing.

    Energy Technology Data Exchange (ETDEWEB)

    Widener, Patrick (University of New Mexico); Jaconette, Steven (Northwestern University); Bridges, Patrick G. (University of New Mexico); Xia, Lei (Northwestern University); Dinda, Peter (Northwestern University); Cui, Zheng; Lange, John (Northwestern University); Hudson, Trammell B.; Levenhagen, Michael J.; Pedretti, Kevin Thomas Tauke; Brightwell, Ronald Brian

    2009-09-01

    Palacios and Kitten are new open source tools that enable applications, whether ported or not, to achieve scalable high performance on large machines. They provide a thin layer over the hardware to support both full-featured virtualized environments and native code bases. Kitten is an OS under development at Sandia that implements a lightweight kernel architecture to provide predictable behavior and increased flexibility on large machines, while also providing Linux binary compatibility. Palacios is a VMM that is under development at Northwestern University and the University of New Mexico. Palacios, which can be embedded into Kitten and other OSes, supports existing, unmodified applications and operating systems by using virtualization that leverages hardware technologies. We describe the design and implementation of both Kitten and Palacios. Our benchmarks show that they provide near native, scalable performance. Palacios and Kitten provide an incremental path to using supercomputer resources that is not performance-compromised.

  19. Dust modelling and forecasting in the Barcelona Supercomputing Center: Activities and developments

    Energy Technology Data Exchange (ETDEWEB)

    Perez, C; Baldasano, J M; Jimenez-Guerrero, P; Jorba, O; Haustein, K; Basart, S [Earth Sciences Department, Barcelona Supercomputing Center, Barcelona (Spain)]; Cuevas, E [Izaña Atmospheric Research Center, Agencia Estatal de Meteorología, Tenerife (Spain)]; Nickovic, S [Atmospheric Research and Environment Branch, World Meteorological Organization, Geneva (Switzerland)], E-mail: carlos.perez@bsc.es

    2009-03-01

    The Barcelona Supercomputing Center (BSC) is the National Supercomputer Facility in Spain, hosting MareNostrum, one of the most powerful Supercomputers in Europe. The Earth Sciences Department of BSC operates daily regional dust and air quality forecasts and conducts intensive modelling research for short-term operational prediction. This contribution summarizes the latest developments and current activities in the field of sand and dust storm modelling and forecasting.

  20. Numerical simulations of astrophysical problems on massively parallel supercomputers

    Science.gov (United States)

    Kulikov, Igor; Chernykh, Igor; Glinsky, Boris

    2016-10-01

    In this paper, we present the latest version of our numerical model for simulating the dynamics of astrophysical objects, and a new realization of our AstroPhi code for Intel Xeon Phi based RSC PetaStream supercomputers. The co-design of a computational model for the description of astrophysical objects is described. The parallel implementation and scalability tests of the AstroPhi code are presented. We achieve 73% weak scaling efficiency using 256 Intel Xeon Phi accelerators with 61,440 threads.
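
    Weak scaling efficiency here is the standard ratio of single-accelerator runtime to N-accelerator runtime at fixed work per accelerator:

        E_{\mathrm{weak}}(N) = \frac{T(1)}{T(N)} \times 100\%

    so the reported run corresponds to E_weak(256) ≈ 73%.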

  1. AENEAS A Custom-built Parallel Supercomputer for Quantum Gravity

    CERN Document Server

    Hamber, H W

    1998-01-01

    Accurate Quantum Gravity calculations, based on the simplicial lattice formulation, are computationally very demanding and require vast amounts of computer resources. A custom-made 64-node parallel supercomputer capable of performing up to 2 × 10^10 floating point operations per second has been assembled entirely out of commodity components, and has been operational for the last ten months. It will allow the numerical computation of a variety of quantities of physical interest in quantum gravity and related field theories, including the estimate of the critical exponents in the vicinity of the ultraviolet fixed point to an accuracy of a few percent.

  2. Reliability Lessons Learned From GPU Experience With The Titan Supercomputer at Oak Ridge Leadership Computing Facility

    Energy Technology Data Exchange (ETDEWEB)

    Gallarno, George [Christian Brothers University; Rogers, James H [ORNL; Maxwell, Don E [ORNL

    2015-01-01

    The high computational capability of graphics processing units (GPUs) is enabling and driving the scientific discovery process at large scale. The world's second fastest supercomputer for open science, Titan, has more than 18,000 GPUs that computational scientists use to perform scientific simulations and data analysis. Understanding of GPU reliability characteristics, however, is still in its nascent stage, since GPUs have only recently been deployed at large scale. This paper presents a detailed study of GPU errors and their impact on system operations and applications, describing experiences with the 18,688 GPUs on the Titan supercomputer as well as lessons learned in the process of efficient operation of GPUs at scale. These experiences are helpful to HPC sites which already have large-scale GPU clusters or plan to deploy GPUs in the future.

  3. Mixed precision numerical weather prediction on hybrid GPU-CPU supercomputers

    Science.gov (United States)

    Lapillonne, Xavier; Osuna, Carlos; Spoerri, Pascal; Osterried, Katherine; Charpilloz, Christophe; Fuhrer, Oliver

    2017-04-01

    A new version of the climate and weather model COSMO has been developed that runs faster on traditional high performance computing systems with CPUs as well as on heterogeneous architectures using graphics processing units (GPUs). The model was additionally adapted to be able to run in "single precision" mode. After discussing the key changes introduced in this new model version and the tools used in the porting approach, we present three applications, namely the MeteoSwiss operational weather prediction system, COSMO-LEPS and the CALMO project, which already take advantage of the performance improvement, up to a factor of 4, by running on GPU systems and using the single precision mode. We discuss how the code changes open new perspectives for scientific research and can enable researchers to gain access to a new class of supercomputers.
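
    Concretely, "single precision" mode stores fields and performs arithmetic in 32-bit floats, halving memory traffic at the cost of roughly seven rather than sixteen significant decimal digits. A tiny NumPy illustration of both effects, unrelated to COSMO's actual code:

        import numpy as np

        a64 = np.full(1_000_000, 0.1, dtype=np.float64)
        a32 = a64.astype(np.float32)

        print(a32.nbytes / a64.nbytes)   # 0.5: half the memory footprint and traffic
        print(f"{a32[0]:.17f}")          # 0.10000000149011612 (~7 significant digits)
        print(f"{a64[0]:.17f}")          # 0.10000000000000001 (~16 significant digits)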

  4. A special purpose silicon compiler for designing supercomputing VLSI systems

    Science.gov (United States)

    Venkateswaran, N.; Murugavel, P.; Kamakoti, V.; Shankarraman, M. J.; Rangarajan, S.; Mallikarjun, M.; Karthikeyan, B.; Prabhakar, T. S.; Satish, V.; Venkatasubramaniam, P. R.

    1991-01-01

    Design of general/special purpose supercomputing VLSI systems for numeric algorithm execution involves tackling two important aspects, namely their computational and communication complexities. Development of software tools for designing such systems itself becomes complex, hence a novel design methodology has to be developed. Designing such complex systems calls for a special purpose silicon compiler in which the computational and communicational structures of different numeric algorithms are taken into account to simplify the compiler design, the approach is macrocell based, and the software tools at different levels (from the algorithm down to the VLSI circuit layout) are integrated. In this paper a special purpose silicon (SPS) compiler based on PACUBE macrocell VLSI arrays for designing supercomputing VLSI systems is presented. It is shown that turn-around time and silicon real estate are reduced relative to silicon compilers based on PLAs, SLAs, and gate arrays. The first two silicon compiler characteristics mentioned above enable the SPS compiler to perform systolic mapping (at the macrocell level) of algorithms whose computational structures are of GIPOP (generalized inner product outer product) form. Direct systolic mapping on PLAs, SLAs, and gate arrays is very difficult as they are micro-cell based. A novel GIPOP processor is under development using this special purpose silicon compiler.

  5. The TeraGyroid Experiment – Supercomputing 2003

    Directory of Open Access Journals (Sweden)

    R.J. Blake

    2005-01-01

    Full Text Available Amphiphiles are molecules with hydrophobic tails and hydrophilic heads. When dispersed in solvents, they self-assemble into complex mesophases including the beautiful cubic gyroid phase. The goal of the TeraGyroid experiment was to study defect pathways and dynamics in these gyroids. The UK's supercomputing and USA's TeraGrid facilities were coupled together, through a dedicated high-speed network, into a single computational Grid for research work that peaked around the Supercomputing 2003 conference. The gyroids were modeled using lattice Boltzmann methods, with parameter spaces explored using many 128^3 and …^3 grid-point simulations; this data was used to inform the world's largest three-dimensional time-dependent simulation, with 1024^3 grid points. The experiment generated some 2 TBytes of useful data. In terms of Grid technology the project demonstrated the migration of simulations (using Globus middleware) to and fro across the Atlantic, exploiting the availability of resources. Integration of the systems accelerated the time to insight. Distributed visualisation of the output datasets enabled the parameter space of the interactions within the complex fluid to be explored from a number of sites, informed by discourse over the Access Grid. The project was sponsored by EPSRC (UK) and NSF (USA), with trans-Atlantic optical bandwidth provided by British Telecommunications.

  6. Calibrating Building Energy Models Using Supercomputer Trained Machine Learning Agents

    Energy Technology Data Exchange (ETDEWEB)

    Sanyal, Jibonananda [ORNL]; New, Joshua Ryan [ORNL]; Edwards, Richard [ORNL]; Parker, Lynne Edwards [ORNL]

    2014-01-01

    Building Energy Modeling (BEM) is an approach to model the energy usage in buildings for design and retrofit purposes. EnergyPlus is the flagship Department of Energy software that performs BEM for different types of buildings. The input to EnergyPlus often comprises a few thousand parameters, which have to be calibrated manually by an expert for realistic energy modeling. This makes calibration challenging and expensive, rendering building energy modeling infeasible for smaller projects. In this paper, we describe the Autotune research, which employs machine learning algorithms to generate agents for the different kinds of standard reference buildings in the U.S. building stock. The parametric space and the variety of building locations and types make this a challenging computational problem necessitating the use of supercomputers. Millions of EnergyPlus simulations are run on supercomputers and subsequently used to train machine learning algorithms to generate agents. These agents, once created, can then run in a fraction of the time, thereby allowing cost-effective calibration of building models.

  7. Optimizing Linpack Benchmark on GPU-Accelerated Petascale Supercomputer

    Institute of Scientific and Technical Information of China (English)

    Feng Wang; Can-Qun Yang; Yun-Fei Du; Juan Chen; Hui-Zhan Yi; Wei-Xia Xu

    2011-01-01

    In this paper we present the programming of the Linpack benchmark on the TianHe-1 system, the first petascale supercomputer system of China, and the largest GPU-accelerated heterogeneous system ever attempted. A hybrid programming model consisting of MPI, OpenMP and streaming computing is described to exploit the task, thread and data parallelism of Linpack. We explain how we optimized the load distribution across the CPUs and GPUs using a two-level adaptive method and describe the implementation in detail. To overcome the low bandwidth of CPU-GPU communication, we present a software pipelining technique to hide the communication overhead. Combined with other traditional optimizations, the Linpack we developed achieved 196.7 GFLOPS on a single compute element of TianHe-1. This result is 70.1% of the peak compute capability, 3.3 times faster than the result obtained using the vendor's library. On the full configuration of TianHe-1 our optimizations resulted in a Linpack performance of 0.563 PFLOPS, which made TianHe-1 the 5th fastest supercomputer on the Top500 list in November 2009.
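
    The two-level adaptive method itself is only summarized above; the sketch below illustrates just the underlying idea of adapting a CPU/GPU work split from measured execution times, with the device speeds and update rule invented for illustration:

      /* Hedged sketch of adaptive CPU/GPU load balancing: rebalance the
         split so both devices are predicted to finish at the same time. */
      #include <stdio.h>

      static double rebalance(double gpu_share, double t_gpu, double t_cpu) {
          double v_gpu = gpu_share / t_gpu;          /* observed GPU speed */
          double v_cpu = (1.0 - gpu_share) / t_cpu;  /* observed CPU speed */
          return v_gpu / (v_gpu + v_cpu);            /* proportional split */
      }

      int main(void) {
          double share = 0.5;                 /* start with an even split */
          for (int iter = 0; iter < 3; iter++) {
              /* pretend measurements: GPU does 0.9, CPU 0.3 units/s */
              double t_gpu = share / 0.9;
              double t_cpu = (1.0 - share) / 0.3;
              share = rebalance(share, t_gpu, t_cpu);
              printf("iter %d: gpu share -> %.3f\n", iter, share);
          }
          return 0;
      }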

  8. The PVM (Parallel Virtual Machine) system: Supercomputer level concurrent computation on a network of IBM RS/6000 power stations

    Energy Technology Data Exchange (ETDEWEB)

    Sunderam, V.S. (Emory Univ., Atlanta, GA (USA). Dept. of Mathematics and Computer Science); Geist, G.A. (Oak Ridge National Lab., TN (USA))

    1991-01-01

    The PVM (Parallel Virtual Machine) system enables supercomputer level concurrent computations to be performed on interconnected networks of heterogeneous computer systems. Specifically, a network of 13 IBM RS/6000 powerstations has been successfully used to execute production quality runs of superconductor modeling codes at more than 250 Mflops. This work demonstrates the effectiveness of cooperative concurrent processing for high performance applications, and shows that supercomputer level computations may be attained at a fraction of the cost on distributed computing platforms. This paper describes the PVM programming environment and user facilities, as they apply to hardware platforms comprising a network of IBM RS/6000 powerstations. The salient design features of PVM will be discussed, including heterogeneity, scalability, multilanguage support, provisions for fault tolerance, the use of multiprocessors and scalar machines, an interactive graphical front end, and support for profiling, tracing, and visual analysis. The PVM system has been used extensively, and a range of production quality concurrent applications have been successfully executed using PVM on a variety of networked platforms. The paper will mention representative examples, and discuss two in detail. The first is a material sciences problem that was originally developed on a Cray 2. This application code calculates the electronic structure of metallic alloys from first principles and is based on the KKR-CPA algorithm. The second is a molecular dynamics simulation for calculating materials properties. Performance results for both applications on networks of RS/6000 powerstations will be presented, accompanied by discussions of the other advantages of PVM and its potential as a complement or alternative to conventional supercomputers.
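
    As a concrete flavor of this programming environment, here is a minimal PVM3 master/worker sketch in C; the task name "square", the message tags, and the problem are invented for illustration, but the pvm_* calls are the standard PVM3 interface:

      /* Master spawns NTASK copies of this same program ("square"); each
         worker squares one integer and sends it back. */
      #include <stdio.h>
      #include <pvm3.h>

      #define NTASK 4

      int main(void) {
          pvm_mytid();                            /* enroll in PVM */
          if (pvm_parent() == PvmNoParent) {      /* master branch */
              int tids[NTASK], x, i;
              pvm_spawn("square", NULL, PvmTaskDefault, "", NTASK, tids);
              for (i = 0; i < NTASK; i++) {       /* one int per worker */
                  x = i + 1;
                  pvm_initsend(PvmDataDefault);
                  pvm_pkint(&x, 1, 1);
                  pvm_send(tids[i], 1);
              }
              for (i = 0; i < NTASK; i++) {       /* collect the squares */
                  pvm_recv(-1, 2);
                  pvm_upkint(&x, 1, 1);
                  printf("got %d\n", x);
              }
          } else {                                /* worker branch */
              int x;
              pvm_recv(-1, 1);
              pvm_upkint(&x, 1, 1);
              x *= x;
              pvm_initsend(PvmDataDefault);
              pvm_pkint(&x, 1, 1);
              pvm_send(pvm_parent(), 2);
          }
          pvm_exit();
          return 0;
      }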

  9. Convergence: Computing and communications

    Energy Technology Data Exchange (ETDEWEB)

    Catlett, C. [National Center for Supercomputing Applications, Champaign, IL (United States)]

    1996-12-31

    This paper highlights the operations of the National Center for Supercomputing Applications (NCSA). NCSA is developing and implementing a national strategy to create, use, and transfer advanced computing and communication tools and information technologies for science, engineering, education, and business. The primary focus of the presentation is historical and expected growth in the computing capacity, personal computer performance, and Internet and WorldWide Web sites. Data are presented to show changes over the past 10 to 20 years in these areas. 5 figs., 4 tabs.

  10. Integration of Titan supercomputer at OLCF with ATLAS Production System

    CERN Document Server

    Barreiro Megino, Fernando Harald; The ATLAS collaboration

    2017-01-01

    The PanDA (Production and Distributed Analysis) workload management system was developed to meet the scale and complexity of distributed computing for the ATLAS experiment. PanDA managed resources are distributed worldwide, on hundreds of computing sites, with thousands of physicists accessing hundreds of petabytes of data, and the rate of data processing already exceeds an exabyte per year. While PanDA currently uses more than 200,000 cores at well over 100 Grid sites, future LHC data taking runs will require more resources than Grid computing can possibly provide. Additional computing and storage resources are required. Therefore ATLAS is engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. In this talk we will describe a project aimed at the integration of the ATLAS Production System with the Titan supercomputer at Oak Ridge Leadership Computing Facility (OLCF). The current approach utilizes a modified PanDA Pilot framework for ...

  11. Modeling the weather with a data flow supercomputer

    Science.gov (United States)

    Dennis, J. B.; Gao, G.-R.; Todd, K. W.

    1984-01-01

    A static concept of data flow architecture is considered for a supercomputer for weather modeling. The machine level instructions are loaded into specific memory locations before computation is initiated, with only one instruction active at a time. The machine would have processing element, functional unit, array memory, memory routing and distribution routing network elements, all contained on microprocessors. A value-oriented algorithmic language (VAL) would be employed and would have, as basic operations, simple functions deriving results from operand values. Details of the machine language format, computations with an array, and file processing procedures are outlined. A global weather model is discussed in terms of a static architecture and the potential computation rate is analyzed. The results indicate that detailed design studies are warranted to quantify costs and parts fabrication requirements.

  12. Direct numerical simulation of turbulence using GPU accelerated supercomputers

    Science.gov (United States)

    Khajeh-Saeed, Ali; Blair Perot, J.

    2013-02-01

    Direct numerical simulations of turbulence are optimized for up to 192 graphics processors. The results from two large GPU clusters are compared to the performance of corresponding CPU clusters. A number of important algorithm changes are necessary to access the full computational power of graphics processors and these adaptations are discussed. It is shown that the handling of subdomain communication becomes even more critical when using GPU based supercomputers. The potential for overlap of MPI communication with GPU computation is analyzed and then optimized. Detailed timings reveal that the internal calculations are now so efficient that the operations related to MPI communication are the primary scaling bottleneck at all but the very largest problem sizes that can fit on the hardware. This work gives a glimpse of the CFD performance issues that will dominate many hardware platforms in the near future.
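
    The overlap of MPI communication with computation discussed above follows a standard pattern; a minimal plain-MPI sketch on a 1-D halo exchange (the GPU kernel is stood in for by simple CPU loops, and the array contents are toy values):

      #include <mpi.h>
      #include <stdio.h>

      #define N 1024

      int main(int argc, char **argv) {
          int rank, size, nreq = 0;
          double u[N + 2];                 /* N interior cells + 2 halos */
          MPI_Request req[4];

          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);
          for (int i = 0; i < N + 2; i++) u[i] = rank;

          /* 1. post the non-blocking halo exchange */
          if (rank > 0) {
              MPI_Irecv(&u[0], 1, MPI_DOUBLE, rank - 1, 0, MPI_COMM_WORLD, &req[nreq++]);
              MPI_Isend(&u[1], 1, MPI_DOUBLE, rank - 1, 0, MPI_COMM_WORLD, &req[nreq++]);
          }
          if (rank < size - 1) {
              MPI_Irecv(&u[N + 1], 1, MPI_DOUBLE, rank + 1, 0, MPI_COMM_WORLD, &req[nreq++]);
              MPI_Isend(&u[N], 1, MPI_DOUBLE, rank + 1, 0, MPI_COMM_WORLD, &req[nreq++]);
          }

          /* 2. interior work proceeds while messages are in flight */
          double sum = 0;
          for (int i = 1; i <= N; i++) sum += u[i];

          /* 3. wait for the halos, then do the work that depends on them
             (at physical boundaries the halo keeps its initial value) */
          MPI_Waitall(nreq, req, MPI_STATUSES_IGNORE);
          sum += u[0] + u[N + 1];

          printf("rank %d: sum including halos = %g\n", rank, sum);
          MPI_Finalize();
          return 0;
      }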

  13. Refinement of herpesvirus B-capsid structure on parallel supercomputers.

    Science.gov (United States)

    Zhou, Z H; Chiu, W; Haskell, K; Spears, H; Jakana, J; Rixon, F J; Scott, L R

    1998-01-01

    Electron cryomicroscopy and icosahedral reconstruction are used to obtain the three-dimensional structure of the 1250-Å-diameter herpesvirus B-capsid. The centers and orientations of particles in focal pairs of 400-kV, spot-scan micrographs are determined and iteratively refined by common-lines-based local and global refinement procedures. We describe the rationale behind choosing shared-memory multiprocessor computers for executing the global refinement, which is the most computationally intensive step in the reconstruction procedure. This refinement has been implemented on three different shared-memory supercomputers. The speedup and efficiency are evaluated by using test data sets with different numbers of particles and processors. Using this parallel refinement program, we refine the herpesvirus B-capsid from 355-particle images to 13-Å resolution. The map shows new structural features and interactions of the protein subunits in the three distinct morphological units: penton, hexon, and triplex of this T = 16 icosahedral particle.

  14. Solving global shallow water equations on heterogeneous supercomputers.

    Science.gov (United States)

    Fu, Haohuan; Gan, Lin; Yang, Chao; Xue, Wei; Wang, Lanning; Wang, Xinliang; Huang, Xiaomeng; Yang, Guangwen

    2017-01-01

    The scientific demand for more accurate modeling of the climate system calls for more computing power to support higher resolutions, inclusion of more component models, more complicated physics schemes, and larger ensembles. As the recent improvements in computing power mostly come from the increasing number of nodes in a system and the integration of heterogeneous accelerators, how to scale the computing problems onto more nodes and various kinds of accelerators has become a challenge for model development. This paper describes our efforts on developing a highly scalable framework for performing global atmospheric modeling on heterogeneous supercomputers equipped with various accelerators, such as GPU (Graphics Processing Unit), MIC (Many Integrated Core), and FPGA (Field Programmable Gate Array) cards. We propose a generalized partition scheme of the problem domain, so as to keep a balanced utilization of both CPU resources and accelerator resources. With optimizations on both computing and memory access patterns, we manage to achieve around 8 to 20 times speedup when comparing one hybrid GPU or MIC node with one CPU node with 12 cores. Using customized FPGA-based data-flow engines, we see the potential to gain another 5 to 8 times improvement on performance. On heterogeneous supercomputers, such as Tianhe-1A and Tianhe-2, our framework is capable of achieving ideally linear scaling efficiency, and sustained double-precision performances of 581 Tflops on Tianhe-1A (using 3750 nodes) and 3.74 Pflops on Tianhe-2 (using 8644 nodes). Our study also provides an evaluation on the programming paradigm of various accelerator architectures (GPU, MIC, FPGA) for performing global atmospheric simulation, to form a picture of both the potential performance benefits and the programming efforts involved.
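
    The generalized partition scheme is not spelled out in the abstract; as a toy illustration of the basic ingredient of any such scheme, the sketch below assigns grid rows in proportion to assumed per-device speeds (the device list and weights are invented):

      /* Weighted static partition of NY grid rows across heterogeneous
         devices; the last device absorbs rounding remainders. */
      #include <stdio.h>

      int main(void) {
          const char  *dev[] = {"CPU", "GPU", "MIC"};
          const double w[]   = {1.0, 10.0, 6.0};   /* assumed speeds */
          const int ndev = 3, NY = 1024;
          double wsum = 0;
          for (int i = 0; i < ndev; i++) wsum += w[i];

          int start = 0;
          for (int i = 0; i < ndev; i++) {
              int rows = (i == ndev - 1) ? NY - start
                                         : (int)(NY * w[i] / wsum);
              printf("%s: rows [%d, %d)\n", dev[i], start, start + rows);
              start += rows;
          }
          return 0;
      }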

  15. Calculation of Free Energy Landscape in Multi-Dimensions with Hamiltonian-Exchange Umbrella Sampling on Petascale Supercomputer.

    Science.gov (United States)

    Jiang, Wei; Luo, Yun; Maragliano, Luca; Roux, Benoît

    2012-11-13

    An extremely scalable computational strategy is described for calculations of the potential of mean force (PMF) in multidimensions on massively distributed supercomputers. The approach involves coupling thousands of umbrella sampling (US) simulation windows distributed to cover the space of order parameters with a Hamiltonian molecular dynamics replica-exchange (H-REMD) algorithm to enhance the sampling of each simulation. In the present application, US/H-REMD is carried out in a two-dimensional (2D) space and exchanges are attempted alternately along the two axes corresponding to the two order parameters. The US/H-REMD strategy is implemented on the basis of a parallel/parallel multiple copy protocol at the MPI level, and therefore can fully exploit the computing power of large-scale supercomputers. Here the novel technique is illustrated using the leadership supercomputer IBM Blue Gene/P with an application to a typical biomolecular calculation of general interest, namely the binding of calcium ions to the small protein Calbindin D9k. The free energy landscape associated with two order parameters, the distance between the ion and its binding pocket and the root-mean-square deviation (rmsd) of the binding pocket relative to the crystal structure, was calculated using the US/H-REMD method. The results are then used to estimate the absolute binding free energy of calcium ion to Calbindin D9k. The tests demonstrate that the 2D US/H-REMD scheme greatly accelerates the configurational sampling of the binding pocket, thereby improving the convergence of the potential of mean force calculation.
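
    The exchange step of such a US/H-REMD scheme reduces to a Metropolis test on the biasing energies of the two windows; a minimal sketch for two harmonic umbrella potentials, with the force constant, window centers, temperature, and sampled order-parameter values all chosen for illustration:

      #include <stdio.h>
      #include <stdlib.h>
      #include <math.h>

      /* harmonic umbrella bias U_i(x) = 0.5 * k * (x - x0_i)^2 */
      static double bias(double x, double x0, double k) {
          return 0.5 * k * (x - x0) * (x - x0);
      }

      int main(void) {
          srand(42);
          double k = 10.0, beta = 1.0 / 0.596;  /* kcal/mol at ~300 K */
          double c_i = 3.0, c_j = 3.5;          /* window centers */
          double x_i = 3.2, x_j = 3.4;          /* current samples */

          /* Metropolis test for swapping configurations between windows:
             delta = beta * [U_i(x_j) + U_j(x_i) - U_i(x_i) - U_j(x_j)] */
          double delta = beta * (bias(x_j, c_i, k) + bias(x_i, c_j, k)
                               - bias(x_i, c_i, k) - bias(x_j, c_j, k));
          double r = (double)rand() / RAND_MAX;
          int accept = (delta <= 0) || (r < exp(-delta));
          printf("delta = %.4f, exchange %s\n",
                 delta, accept ? "accepted" : "rejected");
          return 0;
      }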

  16. Non-preconditioned conjugate gradient on cell and FPGA-based hybrid supercomputer nodes

    Energy Technology Data Exchange (ETDEWEB)

    Dubois, David H [Los Alamos National Laboratory]; Dubois, Andrew J [Los Alamos National Laboratory]; Boorman, Thomas M [Los Alamos National Laboratory]; Connor, Carolyn M [Los Alamos National Laboratory]

    2009-03-10

    This work presents a detailed implementation of a double precision, non-preconditioned Conjugate Gradient algorithm on a Roadrunner heterogeneous supercomputer node. These nodes utilize the Cell Broadband Engine Architecture(TM) in conjunction with x86 Opteron(TM) processors from AMD. We implement a common Conjugate Gradient algorithm, on a variety of systems, to compare and contrast performance. Implementation results are presented for the Roadrunner hybrid supercomputer, SRC Computers, Inc. MAPStation SRC-6 FPGA enhanced hybrid supercomputer, and AMD Opteron only. In all hybrid implementations wall clock time is measured, including all transfer overhead and compute timings.

  17. Non-preconditioned conjugate gradient on cell and FPGA based hybrid supercomputer nodes

    Energy Technology Data Exchange (ETDEWEB)

    Dubois, David H [Los Alamos National Laboratory]; Dubois, Andrew J [Los Alamos National Laboratory]; Boorman, Thomas M [Los Alamos National Laboratory]; Connor, Carolyn M [Los Alamos National Laboratory]

    2009-01-01

    This work presents a detailed implementation of a double precision, non-preconditioned Conjugate Gradient algorithm on a Roadrunner heterogeneous supercomputer node. These nodes utilize the Cell Broadband Engine Architecture(TM) in conjunction with x86 Opteron(TM) processors from AMD. We implement a common Conjugate Gradient algorithm, on a variety of systems, to compare and contrast performance. Implementation results are presented for the Roadrunner hybrid supercomputer, SRC Computers, Inc. MAPStation SRC-6 FPGA enhanced hybrid supercomputer, and AMD Opteron only. In all hybrid implementations wall clock time is measured, including all transfer overhead and compute timings.
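
    For reference, the algorithm that both of these records port to hybrid hardware is the textbook non-preconditioned Conjugate Gradient; a small self-contained C version, on a dense SPD test system invented for the sketch:

      #include <stdio.h>
      #include <math.h>

      #define N 3

      static void matvec(const double A[N][N], const double *x, double *y) {
          for (int i = 0; i < N; i++) {
              y[i] = 0;
              for (int j = 0; j < N; j++) y[i] += A[i][j] * x[j];
          }
      }

      static double dot(const double *a, const double *b) {
          double s = 0;
          for (int i = 0; i < N; i++) s += a[i] * b[i];
          return s;
      }

      int main(void) {
          /* symmetric positive definite test matrix */
          double A[N][N] = {{4, 1, 0}, {1, 3, 1}, {0, 1, 2}};
          double b[N] = {1, 2, 3}, x[N] = {0, 0, 0};
          double r[N], p[N], Ap[N];

          matvec(A, x, r);                    /* r = b - A x */
          for (int i = 0; i < N; i++) { r[i] = b[i] - r[i]; p[i] = r[i]; }
          double rr = dot(r, r);

          for (int it = 0; it < 100 && sqrt(rr) > 1e-10; it++) {
              matvec(A, p, Ap);
              double alpha = rr / dot(p, Ap);
              for (int i = 0; i < N; i++) { x[i] += alpha * p[i]; r[i] -= alpha * Ap[i]; }
              double rr_new = dot(r, r);
              double beta = rr_new / rr;
              for (int i = 0; i < N; i++) p[i] = r[i] + beta * p[i];
              rr = rr_new;
          }
          printf("x = (%.6f, %.6f, %.6f)\n", x[0], x[1], x[2]);
          return 0;
      }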

  18. High temporal resolution mapping of seismic noise sources using heterogeneous supercomputers

    Science.gov (United States)

    Gokhberg, Alexey; Ermert, Laura; Paitz, Patrick; Fichtner, Andreas

    2017-04-01

    Time- and space-dependent distribution of seismic noise sources is becoming a key ingredient of modern real-time monitoring of various geo-systems. Significant interest in seismic noise source maps with high temporal resolution (days) is expected to come from a number of domains, including natural resources exploration, analysis of active earthquake fault zones and volcanoes, as well as geothermal and hydrocarbon reservoir monitoring. Currently, knowledge of noise sources is insufficient for high-resolution subsurface monitoring applications. Near-real-time seismic data, as well as advanced imaging methods to constrain seismic noise sources, have recently become available. These methods are based on the massive cross-correlation of seismic noise records from all available seismic stations in the region of interest and are therefore very computationally intensive. Heterogeneous massively parallel supercomputing systems introduced in recent years combine conventional multi-core CPUs with GPU accelerators and provide an opportunity for a manifold increase in computing performance. Therefore, these systems represent an efficient platform for implementation of a noise source mapping solution. We present the first results of an ongoing research project conducted in collaboration with the Swiss National Supercomputing Centre (CSCS). The project aims at building a service that provides seismic noise source maps for Central Europe with high temporal resolution (days to a few weeks depending on frequency and data availability). The service is hosted on the CSCS computing infrastructure; all computationally intensive processing is performed on the massively parallel heterogeneous supercomputer "Piz Daint". The solution architecture is based on the Application-as-a-Service concept in order to provide interested external researchers regular access to the noise source maps. The solution architecture includes the following sub-systems: (1) data acquisition responsible for
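
    The computational core of such noise-source imaging is the massive cross-correlation of station records; a time-domain sketch for a single pair (production systems would work in the frequency domain, on GPUs, over many station pairs):

      #include <stdio.h>

      /* c[lag + maxlag] = sum_t a[t] * b[t + lag], lag in [-maxlag, maxlag] */
      static void xcorr(const double *a, const double *b, int n,
                        int maxlag, double *c) {
          for (int lag = -maxlag; lag <= maxlag; lag++) {
              double s = 0;
              for (int t = 0; t < n; t++) {
                  int u = t + lag;
                  if (u >= 0 && u < n) s += a[t] * b[u];
              }
              c[lag + maxlag] = s;
          }
      }

      int main(void) {
          /* b is a copy of a shifted by 2 samples, so the peak is at lag +2 */
          double a[8] = {0, 0, 1, 2, 1, 0, 0, 0};
          double b[8] = {0, 0, 0, 0, 1, 2, 1, 0};
          double c[9];
          xcorr(a, b, 8, 4, c);
          for (int lag = -4; lag <= 4; lag++)
              printf("lag %+d: %g\n", lag, c[lag + 4]);
          return 0;
      }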

  19. Requirements for supercomputing in energy research: The transition to massively parallel computing

    Energy Technology Data Exchange (ETDEWEB)

    1993-02-01

    This report discusses the emergence of a practical path to TeraFlop computing and beyond; the requirements of energy research programs at DOE; and two aspects of implementation: establishing a supercomputer production computing environment on massively parallel computers, and supporting user transition to massively parallel computing.

  20. Novel Supercomputing Approaches for High Performance Linear Algebra Using FPGAs Project

    Data.gov (United States)

    National Aeronautics and Space Administration — Supercomputing plays a major role in many areas of science and engineering, and it has had tremendous impact for decades in areas such as aerospace, defense, energy,...

  1. SUPERCOMPUTERS FOR AIDING ECONOMIC PROCESSES WITH REFERENCE TO THE FINANCIAL SECTOR

    Directory of Open Access Journals (Sweden)

    Jerzy Balicki

    2014-12-01

    Full Text Available The article discusses the use of supercomputers to support business processes with particular emphasis on the financial sector. Reference is made to selected projects that support economic development. In particular, we propose the use of supercomputers to perform artificial intelligence methods in banking. The proposed methods, combined with modern technology, enable a significant increase in the competitiveness of enterprises and banks by adding new functionality.

  2. A novel VLSI processor architecture for supercomputing arrays

    Science.gov (United States)

    Venkateswaran, N.; Pattabiraman, S.; Devanathan, R.; Ahmed, Ashaf; Venkataraman, S.; Ganesh, N.

    1993-01-01

    Design of the processor element for general purpose massively parallel supercomputing arrays is highly complex and cost ineffective. To overcome this, the architecture and organization of the functional units of the processor element should be such as to suit the diverse computational structures and simplify mapping of complex communication structures of different classes of algorithms. This demands that the computation and communication structures of different classes of algorithms be unified. While unifying the different communication structures is a difficult process, analysis of a wide class of algorithms reveals that their computation structures can be expressed in terms of basic IP, OP, CM, R, SM, and MAA operations. The execution of these operations is unified on the PAcube macro-cell array. Based on this PAcube macro-cell array, we present a novel processor element called the GIPOP processor, which has dedicated functional units to perform the above operations. The architecture and organization of these functional units are such as to satisfy the two important criteria mentioned above. The structure of the macro-cell and the unification process have led to a very regular and simpler design of the GIPOP processor. The production cost of the GIPOP processor is drastically reduced as it is designed on high performance mask programmable PAcube arrays.

  3. Accelerating Science Impact through Big Data Workflow Management and Supercomputing

    Science.gov (United States)

    De, K.; Klimentov, A.; Maeno, T.; Mashinistov, R.; Nilsson, P.; Oleynik, D.; Panitkin, S.; Ryabinkin, E.; Wenaus, T.

    2016-02-01

    The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. ATLAS, one of the largest collaborations ever assembled in the history of science, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. To manage the workflow for all data processing on hundreds of data centers, the PanDA (Production and Distributed Analysis) Workload Management System is used. An ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF), is being realized within the BigPanDA and megaPanDA projects. These projects are now exploring how PanDA might be used for managing computing jobs that run on supercomputers including OLCF's Titan and NRC-KI HPC2. The main idea is to reuse, as much as possible, existing components of the PanDA system that are already deployed on the LHC Grid for analysis of physics data. The next generation of PanDA will allow many data-intensive sciences employing a variety of computing platforms to benefit from ATLAS experience and proven tools in highly scalable processing.

  4. Micro-mechanical Simulations of Soils using Massively Parallel Supercomputers

    Directory of Open Access Journals (Sweden)

    David W. Washington

    2004-06-01

    Full Text Available In this research a computer program, Trubal version 1.51, based on the Discrete Element Method was converted to run on a Connection Machine (CM-5), a massively parallel supercomputer with 512 nodes, to expedite the computational times of simulating Geotechnical boundary value problems. The dynamic memory algorithm in the Trubal program did not perform efficiently on the CM-2 machine with its Single Instruction Multiple Data (SIMD) architecture. This was due to the communication overhead involving global array reductions, global array broadcast and random data movement. Therefore, the dynamic memory algorithm in the Trubal program was converted to a static memory arrangement, and the program was successfully converted to run on CM-5 machines. The converted program was called "TRUBAL for Parallel Machines (TPM)." Simulating two physical triaxial experiments and comparing simulation results with Trubal simulations validated the TPM program. With a 512-node CM-5 machine, TPM produced a nine-fold speedup, demonstrating the inherent parallelism within algorithms based on the Discrete Element Method.

  5. Astrophysical Supercomputing with GPUs: Critical Decisions for Early Adopters

    Science.gov (United States)

    Fluke, Christopher J.; Barnes, David G.; Barsdell, Benjamin R.; Hassan, Amr H.

    2011-01-01

    General-purpose computing on graphics processing units (GPGPU) is dramatically changing the landscape of high performance computing in astronomy. In this paper, we identify and investigate several key decision areas, with a goal of simplifying the early adoption of GPGPU in astronomy. We consider the merits of OpenCL as an open standard in order to reduce risks associated with coding in a native, vendor-specific programming environment, and present a GPU programming philosophy based on using brute force solutions. We assert that effective use of new GPU-based supercomputing facilities will require a change in approach from astronomers. This will likely include improved programming training, an increased need for software development best practice through the use of profiling and related optimisation tools, and a greater reliance on third-party code libraries. As with any new technology, those willing to take the risks and make the investment of time and effort to become early adopters of GPGPU in astronomy, stand to reap great benefits.

  6. Astrophysical Supercomputing with GPUs: Critical Decisions for Early Adopters

    CERN Document Server

    Fluke, Christopher J; Barsdell, Benjamin R; Hassan, Amr H

    2010-01-01

    General purpose computing on graphics processing units (GPGPU) is dramatically changing the landscape of high performance computing in astronomy. In this paper, we identify and investigate several key decision areas, with a goal of simplifying the early adoption of GPGPU in astronomy. We consider the merits of OpenCL as an open standard in order to reduce risks associated with coding in a native, vendor-specific programming environment, and present a GPU programming philosophy based on using brute force solutions. We assert that effective use of new GPU-based supercomputing facilities will require a change in approach from astronomers. This will likely include improved programming training, an increased need for software development best-practice through the use of profiling and related optimisation tools, and a greater reliance on third-party code libraries. As with any new technology, those willing to take the risks, and make the investment of time and effort to become early adopters of GPGPU in astronomy, stand to reap great benefits.

  7. Developing Fortran Code for Kriging on the Stampede Supercomputer

    Science.gov (United States)

    Hodgess, Erin

    2016-04-01

    Kriging is easily accessible in the open source statistical language R (R Core Team, 2015) in the gstat (Pebesma, 2004) package. It works very well, but can be slow on large data sets, particularly if the prediction space is large as well. We are working on the Stampede supercomputer at the Texas Advanced Computing Center to develop code using a combination of R and the Message Passing Interface (MPI) bindings to Fortran. We have a function similar to the autofitVariogram found in the automap (Hiemstra et al., 2008) package and it is very effective. We are comparing R with MPI/Fortran, MPI/Fortran alone, and R with the Rmpi package, which uses bindings to C. We will present results from simulation studies and real-world examples. References: Hiemstra, P.H., Pebesma, E.J., Twenhofel, C.J.W. and G.B.M. Heuvelink, 2008. Real-time automatic interpolation of ambient gamma dose rates from the Dutch Radioactivity Monitoring Network. Computers and Geosciences, accepted for publication. Pebesma, E.J., 2004. Multivariable geostatistics in S: the gstat package. Computers and Geosciences, 30: 683-691. R Core Team, 2015. R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/.

  8. Using the multistage cube network topology in parallel supercomputers

    Energy Technology Data Exchange (ETDEWEB)

    Siegel, H.J.; Nation, W.G. (Purdue Univ., Lafayette, IN (USA). School of Electrical Engineering); Kruskal, C.P. (Maryland Univ., College Park, MD (USA). Dept. of Computer Science); Napolitano, L.M. Jr. (Sandia National Labs., Livermore, CA (USA))

    1989-12-01

    A variety of approaches to designing the interconnection network to support communications among the processors and memories of supercomputers employing large-scale parallel processing have been proposed and/or implemented. These approaches are often based on the multistage cube topology. This topology is the subject of much ongoing research and study because of the ways in which the multistage cube can be used. The attributes of the topology that make it useful are described. These include O(N log2 N) cost for an N input/output network, decentralized control, a variety of implementation options, good data permuting capability to support single instruction stream/multiple data stream (SIMD) parallelism, good throughput to support multiple instruction stream/multiple data stream (MIMD) parallelism, and ability to be partitioned into independent subnetworks to support reconfigurable systems. Examples of existing systems that use multistage cube networks are overviewed. The multistage cube topology can be converted into a single-stage network by associating with each switch in the network a processor (and a memory). Properties of systems that use the multistage cube network in this way are also examined.
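
    The decentralized control noted among these attributes follows from destination-tag routing: at stage i, a switch needs only bit i of the destination address, not any global state. A sketch tracing one message through an 8-input (3-stage) network:

      #include <stdio.h>

      int main(void) {
          const int stages = 3;         /* N = 8 inputs/outputs */
          unsigned src = 5, dest = 2;   /* example endpoints */
          unsigned pos = src;

          printf("route %u -> %u:\n", src, dest);
          for (int i = 0; i < stages; i++) {
              unsigned bit = 1u << i;
              if ((pos & bit) != (dest & bit)) {
                  pos ^= bit;           /* "exchange" setting flips bit i */
                  printf("  stage %d: exchange -> %u\n", i, pos);
              } else {
                  printf("  stage %d: straight -> %u\n", i, pos);
              }
          }
          return 0;
      }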

  9. Accelerating Science Impact through Big Data Workflow Management and Supercomputing

    Directory of Open Access Journals (Sweden)

    De K.

    2016-01-01

    Full Text Available The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. ATLAS, one of the largest collaborations ever assembled in the history of science, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. To manage the workflow for all data processing on hundreds of data centers, the PanDA (Production and Distributed Analysis) Workload Management System is used. An ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF), is being realized within the BigPanDA and megaPanDA projects. These projects are now exploring how PanDA might be used for managing computing jobs that run on supercomputers including OLCF's Titan and NRC-KI HPC2. The main idea is to reuse, as much as possible, existing components of the PanDA system that are already deployed on the LHC Grid for analysis of physics data. The next generation of PanDA will allow many data-intensive sciences employing a variety of computing platforms to benefit from ATLAS experience and proven tools in highly scalable processing.

  10. Supercomputers ready for use as discovery machines for neuroscience

    Directory of Open Access Journals (Sweden)

    Moritz Helias

    2012-11-01

    Full Text Available NEST is a widely used tool to simulate biological spiking neural networks. Here we explain the improvements, guided by a mathematical model of memory consumption, that enable us to exploit for the first time the computational power of the K supercomputer for neuroscience. Multi-threaded components for wiring and simulation combine 8 cores per MPI process to achieve excellent scaling. K is capable of simulating networks corresponding to a brain area with 10^8 neurons and 10^12 synapses in the worst case scenario of random connectivity; for larger networks of the brain its hierarchical organization can be exploited to constrain the number of communicating computer nodes. We discuss the limits of the software technology, comparing maximum-filling scaling plots for K and the JUGENE BG/P system. The usability of these machines for network simulations has become comparable to running simulations on a single PC. Turn-around times in the range of minutes even for the largest systems enable a quasi-interactive working style and render simulations on this scale a practical tool for computational neuroscience.

  11. Supercomputers ready for use as discovery machines for neuroscience.

    Science.gov (United States)

    Helias, Moritz; Kunkel, Susanne; Masumoto, Gen; Igarashi, Jun; Eppler, Jochen Martin; Ishii, Shin; Fukai, Tomoki; Morrison, Abigail; Diesmann, Markus

    2012-01-01

    NEST is a widely used tool to simulate biological spiking neural networks. Here we explain the improvements, guided by a mathematical model of memory consumption, that enable us to exploit for the first time the computational power of the K supercomputer for neuroscience. Multi-threaded components for wiring and simulation combine 8 cores per MPI process to achieve excellent scaling. K is capable of simulating networks corresponding to a brain area with 10^8 neurons and 10^12 synapses in the worst case scenario of random connectivity; for larger networks of the brain its hierarchical organization can be exploited to constrain the number of communicating computer nodes. We discuss the limits of the software technology, comparing maximum-filling scaling plots for K and the JUGENE BG/P system. The usability of these machines for network simulations has become comparable to running simulations on a single PC. Turn-around times in the range of minutes even for the largest systems enable a quasi-interactive working style and render simulations on this scale a practical tool for computational neuroscience.

  12. A user-friendly web portal for T-Coffee on supercomputers

    Directory of Open Access Journals (Sweden)

    Koetsier Jos

    2011-05-01

    Full Text Available Abstract Background Parallel T-Coffee (PTC) was the first parallel implementation of the T-Coffee multiple sequence alignment tool. It is based on MPI and RMA mechanisms. Its purpose is to reduce the execution time of large-scale sequence alignments. It can be run on distributed memory clusters, allowing users to align data sets consisting of hundreds of proteins within a reasonable time. However, most of the potential users of this tool are not familiar with the use of grids or supercomputers. Results In this paper we show how PTC can be easily deployed and controlled on a supercomputer architecture using a web portal developed using Rapid. Rapid is a tool for efficiently generating standardized portlets for a wide range of applications and the approach described here is generic enough to be applied to other applications, or to deploy PTC on different HPC environments. Conclusions The PTC portal allows users to upload a large number of sequences to be aligned by the parallel version of TC that cannot be aligned by a single machine due to memory and execution time constraints. The web portal provides a user-friendly solution.

  13. A user-friendly web portal for T-Coffee on supercomputers.

    Science.gov (United States)

    Rius, Josep; Cores, Fernando; Solsona, Francesc; van Hemert, Jano I; Koetsier, Jos; Notredame, Cedric

    2011-05-12

    Parallel T-Coffee (PTC) was the first parallel implementation of the T-Coffee multiple sequence alignment tool. It is based on MPI and RMA mechanisms. Its purpose is to reduce the execution time of large-scale sequence alignments. It can be run on distributed memory clusters, allowing users to align data sets consisting of hundreds of proteins within a reasonable time. However, most of the potential users of this tool are not familiar with the use of grids or supercomputers. In this paper we show how PTC can be easily deployed and controlled on a supercomputer architecture using a web portal developed using Rapid. Rapid is a tool for efficiently generating standardized portlets for a wide range of applications and the approach described here is generic enough to be applied to other applications, or to deploy PTC on different HPC environments. The PTC portal allows users to upload a large number of sequences to be aligned by the parallel version of TC that cannot be aligned by a single machine due to memory and execution time constraints. The web portal provides a user-friendly solution.

  14. A Parallel Supercomputer Implementation of a Biological Inspired Neural Network and its use for Pattern Recognition

    Science.gov (United States)

    de Ladurantaye, Vincent; Lavoie, Jean; Bergeron, Jocelyn; Parenteau, Maxime; Lu, Huizhong; Pichevar, Ramin; Rouat, Jean

    2012-02-01

    A parallel implementation of a large spiking neural network is proposed and evaluated. The neural network implements the binding by synchrony process using the Oscillatory Dynamic Link Matcher (ODLM). Scalability, speed and performance are compared for 2 implementations: Message Passing Interface (MPI) and Compute Unified Device Architecture (CUDA) running on clusters of multicore supercomputers and NVIDIA graphical processing units respectively. A global spiking list that represents at each instant the state of the neural network is described. This list indexes each neuron that fires during the current simulation time so that the influence of their spikes is simultaneously processed on all computing units. Our implementation shows a good scalability for very large networks. A complex and large spiking neural network has been implemented in parallel with success, thus paving the way towards real-life applications based on networks of spiking neurons. MPI offers a better scalability than CUDA, while the CUDA implementation on a GeForce GTX 285 gives the best cost to performance ratio. When running the neural network on the GTX 285, the processing speed is comparable to the MPI implementation on RQCHP's Mammouth parallel cluster with 64 nodes (128 cores).
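
    A hedged MPI sketch of the global spiking list idea (the neuron ids and the one-spike-per-rank setup are invented; a real simulator would gather variable-length spike batches exactly as the Allgatherv pattern below does):

      #include <mpi.h>
      #include <stdio.h>
      #include <stdlib.h>

      int main(int argc, char **argv) {
          int rank, size;
          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          /* each rank pretends one local neuron fired this time step */
          int local_spike = rank * 100;   /* global id of firing neuron */
          int nlocal = 1;

          /* gather each rank's spike count, then the ids themselves */
          int *counts = malloc(size * sizeof(int));
          int *displs = malloc(size * sizeof(int));
          MPI_Allgather(&nlocal, 1, MPI_INT, counts, 1, MPI_INT,
                        MPI_COMM_WORLD);
          int total = 0;
          for (int i = 0; i < size; i++) { displs[i] = total; total += counts[i]; }
          int *spikes = malloc(total * sizeof(int));
          MPI_Allgatherv(&local_spike, nlocal, MPI_INT,
                         spikes, counts, displs, MPI_INT, MPI_COMM_WORLD);

          /* every rank now holds the full spike list and could apply it
             to its local neurons; here we just report it once */
          if (rank == 0)
              for (int i = 0; i < total; i++)
                  printf("spike from neuron %d visible everywhere\n", spikes[i]);
          free(spikes); free(counts); free(displs);
          MPI_Finalize();
          return 0;
      }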

  15. 369 TFlop/s molecular dynamics simulations on the Roadrunner general-purpose heterogeneous supercomputer

    Energy Technology Data Exchange (ETDEWEB)

    Swaminarayan, Sriram [Los Alamos National Laboratory]; Germann, Timothy C [Los Alamos National Laboratory]; Kadau, Kai [Los Alamos National Laboratory]; Fossum, Gordon C [IBM CORPORATION]

    2008-01-01

    The authors present timing and performance numbers for a short-range parallel molecular dynamics (MD) code, SPaSM, that has been rewritten for the heterogeneous Roadrunner supercomputer. Each Roadrunner compute node consists of two AMD Opteron dual-core microprocessors and four PowerXCell 8i enhanced Cell microprocessors, so that there are four MPI ranks per node, each with one Opteron and one Cell. The interatomic forces are computed on the Cells (each with one PPU and eight SPU cores), while the Opterons are used to direct inter-rank communication and perform I/O-heavy periodic analysis, visualization, and checkpointing tasks. The performance measured for our initial implementation of a standard Lennard-Jones pair potential benchmark reached a peak of 369 Tflop/s double-precision floating-point performance on the full Roadrunner system (27.7% of peak), corresponding to 124 MFlop/Watt/s at a price of approximately 3.69 MFlops/dollar. They demonstrate an initial target application, the jetting and ejection of material from a shocked surface.

  16. PFLOTRAN: Reactive Flow & Transport Code for Use on Laptops to Leadership-Class Supercomputers

    Energy Technology Data Exchange (ETDEWEB)

    Hammond, Glenn E.; Lichtner, Peter C.; Lu, Chuan; Mills, Richard T.

    2012-04-18

    PFLOTRAN, a next-generation reactive flow and transport code for modeling subsurface processes, has been designed from the ground up to run efficiently on machines ranging from leadership-class supercomputers to laptops. Based on an object-oriented design, the code is easily extensible to incorporate additional processes. It can interface seamlessly with Fortran 9X, C and C++ codes. Domain decomposition parallelism is employed, with the PETSc parallel framework used to manage parallel solvers, data structures and communication. Features of the code include a modular input file, implementation of high-performance I/O using parallel HDF5, ability to perform multiple realization simulations with multiple processors per realization in a seamless manner, and multiple modes for multiphase flow and multicomponent geochemical transport. Chemical reactions currently implemented in the code include homogeneous aqueous complexing reactions and heterogeneous mineral precipitation/dissolution, ion exchange, surface complexation and a multirate kinetic sorption model. PFLOTRAN has demonstrated petascale performance using 2^17 processor cores with over 2 billion degrees of freedom. Accomplishments achieved to date include applications to the Hanford 300 Area and modeling CO2 sequestration in deep geologic formations.

  17. Performance characteristics of hybrid MPI/OpenMP implementations of NAS parallel benchmarks SP and BT on large-scale multicore supercomputers

    KAUST Repository

    Wu, Xingfu

    2011-03-29

    The NAS Parallel Benchmarks (NPB) are well-known applications with fixed algorithms for evaluating parallel systems and tools. Multicore supercomputers provide a natural programming paradigm for hybrid programs, whereby OpenMP can be used for data sharing among the cores that comprise a node, and MPI can be used for communication between nodes. In this paper, we use the SP and BT benchmarks of MPI NPB 3.3 as a basis for a comparative approach to implement hybrid MPI/OpenMP versions of SP and BT. In particular, we compare the performance of the hybrid SP and BT with their MPI counterparts on large-scale multicore supercomputers. Our performance results indicate that the hybrid SP outperforms the MPI SP by up to 20.76%, and the hybrid BT outperforms the MPI BT by up to 8.58%, on up to 10,000 cores on BlueGene/P at Argonne National Laboratory and Jaguar (Cray XT4/5) at Oak Ridge National Laboratory. We also use performance tools and MPI trace libraries available on these supercomputers to further investigate the performance characteristics of the hybrid SP and BT.
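
    The hybrid model being compared has a small canonical skeleton: OpenMP threads share work within a rank, MPI communicates across ranks. A minimal sketch (the reduction computed is just a stand-in workload; compile with something like mpicc -fopenmp):

      #include <mpi.h>
      #include <omp.h>
      #include <stdio.h>

      #define N 1000000

      int main(int argc, char **argv) {
          int provided, rank, size;
          /* FUNNELED: only the master thread of each rank makes MPI calls */
          MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          /* thread-level parallelism inside the node */
          double local = 0.0;
          #pragma omp parallel for reduction(+:local)
          for (int i = 0; i < N; i++)
              local += 1.0 / ((double)i + 1.0);

          /* process-level parallelism across nodes */
          double global = 0.0;
          MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0,
                     MPI_COMM_WORLD);
          if (rank == 0)
              printf("%d ranks x %d threads: sum = %f\n",
                     size, omp_get_max_threads(), global);
          MPI_Finalize();
          return 0;
      }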

  18. Cyberdyn supercomputer - a tool for imaging geodynamic processes

    Science.gov (United States)

    Pomeran, Mihai; Manea, Vlad; Besutiu, Lucian; Zlagnean, Luminita

    2014-05-01

    More and more physical processes that develop within the deep interior of our planet, yet have a significant impact on the Earth's shape and structure, are becoming subject to numerical modelling using high performance computing facilities. Nowadays, an increasing number of research centers worldwide decide to make use of such powerful and fast computers for simulating complex phenomena involving fluid dynamics, to get deeper insight into intricate problems of Earth's evolution. With the CYBERDYN cybernetic infrastructure (CCI), the Solid Earth Dynamics Department in the Institute of Geodynamics of the Romanian Academy boldly steps into the 21st century by entering the research area of computational geodynamics. The project that made this advancement possible has been jointly supported by the EU and the Romanian Government through the Structural and Cohesion Funds. It lasted for about three years, ending October 2013. CCI is basically a modern high performance Beowulf-type supercomputer (HPCC), combined with a high performance visualization cluster (HPVC) and a GeoWall. The infrastructure is mainly structured around 1344 cores and 3 TB of RAM. The high speed interconnect is provided by a Qlogic InfiniBand switch, able to transfer up to 40 Gbps. The CCI storage component is a 40 TB Panasas NAS. The operating system is Linux (CentOS). For control and maintenance, the Bright Cluster Manager package is used. The SGE job scheduler manages the job queues. CCI has been designed for a theoretical peak performance up to 11.2 TFlops. Speed tests showed that a high resolution numerical model (256 × 256 × 128 FEM elements) could be resolved at a mean computational speed of one time step per 30 seconds, employing only a fraction of the computing power (20%). After passing the mandatory tests, the CCI has been involved in numerical modelling of various scenarios related to the East Carpathians tectonic and geodynamic evolution, including the Neogene magmatic activity, and the intriguing

  19. High-Performance Computing: Industry Uses of Supercomputers and High-Speed Networks. Report to Congressional Requesters.

    Science.gov (United States)

    General Accounting Office, Washington, DC. Information Management and Technology Div.

    This report was prepared in response to a request for information on supercomputers and high-speed networks from the Senate Committee on Commerce, Science, and Transportation, and the House Committee on Science, Space, and Technology. The following information was requested: (1) examples of how various industries are using supercomputers to…

  20. Robust Machine Learning Applied to Terascale Astronomical Datasets

    CERN Document Server

    Ball, Nicholas M; Myers, Adam D

    2007-01-01

    We present recent results from the Laboratory for Cosmological Data Mining (http://lcdm.astro.uiuc.edu) at the National Center for Supercomputing Applications (NCSA) to provide robust classifications and photometric redshifts for objects in the terascale-class Sloan Digital Sky Survey (SDSS). Through a combination of machine learning in the form of decision trees, k-nearest neighbor, and genetic algorithms, the use of supercomputing resources at NCSA, and the cyberenvironment Data-to-Knowledge, we are able to provide improved classifications for over 100 million objects in the SDSS, improved photometric redshifts, and a full exploitation of the powerful k-nearest neighbor algorithm. This work is the first to apply the full power of these algorithms to contemporary terascale astronomical datasets, and the improvement over existing results is demonstrable. We discuss issues that we have encountered in dealing with data on the terascale, and possible solutions that can be implemented to deal with upcoming petascale datasets.

  1. Robust Machine Learning Applied to Terascale Astronomical Datasets

    Science.gov (United States)

    Ball, N. M.; Brunner, R. J.; Myers, A. D.

    2008-08-01

    We present recent results from the Laboratory for Cosmological Data Mining (http://lcdm.astro.uiuc.edu) at the National Center for Supercomputing Applications (NCSA) to provide robust classifications and photometric redshifts for objects in the terascale-class Sloan Digital Sky Survey (SDSS). Through a combination of machine learning in the form of decision trees, k-nearest neighbor, and genetic algorithms, the use of supercomputing resources at NCSA, and the cyberenvironment Data-to-Knowledge, we are able to provide improved classifications for over 100 million objects in the SDSS, improved photometric redshifts, and a full exploitation of the powerful k-nearest neighbor algorithm. This work is the first to apply the full power of these algorithms to contemporary terascale astronomical datasets, and the improvement over existing results is demonstrable. We discuss issues that we have encountered in dealing with data on the terascale, and possible solutions that can be implemented to deal with upcoming petascale datasets.

  2. Argonne National Lab deploys Force10 networks' massively dense ethernet switch for supercomputing cluster

    CERN Multimedia

    2003-01-01

    "Force10 Networks, Inc. today announced that Argonne National Laboratory (Argonne, IL) has successfully deployed Force10 E-Series switch/routers to connect to the TeraGrid, the world's largest supercomputing grid, sponsored by the National Science Foundation (NSF)" (1/2 page).

  3. Design and performance characterization of electronic structure calculations on massively parallel supercomputers

    DEFF Research Database (Denmark)

    Romero, N. A.; Glinsvad, Christian; Larsen, Ask Hjorth

    2013-01-01

    Density functional theory (DFT) is the most widely employed electronic structure method because of its favorable scaling with system size and accuracy for a broad range of molecular and condensed-phase systems. The advent of massively parallel supercomputers has enhanced the scientific community's ...

  4. The impact of the U.S. supercomputing initiative will be global

    Energy Technology Data Exchange (ETDEWEB)

    Crawford, Dona [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)]

    2016-01-15

    Last July, President Obama issued an executive order that created a coordinated federal strategy for HPC research, development, and deployment called the U.S. National Strategic Computing Initiative (NSCI). This bold, necessary step toward building the next generation of supercomputers has inaugurated a new era for U.S. high performance computing (HPC).

  5. Congressional Panel Seeks To Curb Access of Foreign Students to U.S. Supercomputers.

    Science.gov (United States)

    Kiernan, Vincent

    1999-01-01

    Fearing security problems, a congressional committee on Chinese espionage recommends that foreign students and other foreign nationals be barred from using supercomputers at national laboratories unless they first obtain export licenses from the federal government. University officials dispute the data on which the report is based and find the…

  6. [Experience in simulating the structural and dynamic features of small proteins using desktop supercomputers].

    Science.gov (United States)

    Kondrat'ev, M S; Kabanov, A V; Komarov, V M; Khechinashvili, N N; Samchenko, A A

    2011-01-01

    The results of theoretical studies of the structural and dynamic features of peptides and small proteins are presented. The studies were carried out by quantum-chemical and molecular dynamics methods on high-performance graphics workstations ("desktop supercomputers"), using distributed calculations based on the CUDA technology.

  7. Enabling Loosely-Coupled Serial Job Execution on the IBM BlueGene/P Supercomputer and the SiCortex SC5832

    CERN Document Server

    Raicu, Ioan; Wilde, Mike; Foster, Ian

    2008-01-01

    Our work addresses the execution of highly parallel computations composed of loosely coupled serial jobs, with no modifications to the respective applications, on large-scale systems. This approach allows new (and potentially far larger) classes of applications to leverage systems such as the IBM Blue Gene/P supercomputer and similar emerging petascale architectures. We present here the challenges of I/O performance encountered in making this model practical, and show results using both micro-benchmarks and real applications on two large-scale systems, the BG/P and the SiCortex SC5832. Our preliminary benchmarks show that we can scale to 4096 processors on the Blue Gene/P and 5832 processors on the SiCortex with high efficiency, and can achieve thousands of tasks/sec sustained execution rates for parallel workloads of ordinary serial applications. We measured applications from two domains, economic energy modeling and molecular dynamics.
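
    The many-task model described here is commonly realized with a dispatcher/worker pattern; the MPI sketch below farms independent task ids out to workers (the systems in the paper use dedicated middleware rather than raw MPI, and the "run one serial job" step is stubbed with a print):

      #include <mpi.h>
      #include <stdio.h>

      #define NTASKS 20
      #define TAG_WORK 1
      #define TAG_DONE 2
      #define TAG_STOP 3

      int main(int argc, char **argv) {
          int rank, size;
          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          if (rank == 0) {                       /* dispatcher */
              int next = 0, active = 0, task;
              MPI_Status st;
              for (int w = 1; w < size; w++) {   /* seed every worker once */
                  if (next < NTASKS) {
                      MPI_Send(&next, 1, MPI_INT, w, TAG_WORK, MPI_COMM_WORLD);
                      next++; active++;
                  } else {
                      MPI_Send(&next, 1, MPI_INT, w, TAG_STOP, MPI_COMM_WORLD);
                  }
              }
              while (active > 0) {               /* refill as workers finish */
                  MPI_Recv(&task, 1, MPI_INT, MPI_ANY_SOURCE, TAG_DONE,
                           MPI_COMM_WORLD, &st);
                  active--;
                  if (next < NTASKS) {
                      MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_WORK,
                               MPI_COMM_WORLD);
                      next++; active++;
                  } else {
                      MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_STOP,
                               MPI_COMM_WORLD);
                  }
              }
          } else {                               /* worker */
              int task;
              MPI_Status st;
              for (;;) {
                  MPI_Recv(&task, 1, MPI_INT, 0, MPI_ANY_TAG,
                           MPI_COMM_WORLD, &st);
                  if (st.MPI_TAG == TAG_STOP) break;
                  /* a real harness would exec one ordinary serial job here */
                  printf("rank %d ran task %d\n", rank, task);
                  MPI_Send(&task, 1, MPI_INT, 0, TAG_DONE, MPI_COMM_WORLD);
              }
          }
          MPI_Finalize();
          return 0;
      }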

  8. Interactive steering of supercomputing simulation for aerodynamic noise radiated from square cylinder; Supercomputer wo mochiita steering system ni yoru kakuchu kara hoshasareru kurikion no suchi kaiseki

    Energy Technology Data Exchange (ETDEWEB)

    Yokono, Y. [Toshiba Corp., Tokyo (Japan)]; Fujita, H. [Tokyo Inst. of Technology, Tokyo (Japan). Precision Engineering Lab.]

    1995-03-25

    This paper describes extensive computer simulation for aerodynamic noise radiated from a square cylinder using an interactive steering supercomputing simulation system. The unsteady incompressible three-dimensional Navier-Stokes equations are solved by the finite volume method using a steering system which can visualize the numerical process during calculation and alter the numerical parameters. Using the fluctuating surface pressure of the square cylinder, the farfield sound pressure is calculated based on Lighthill-Curle's equation. The results are compared with those of low noise wind tunnel experiments, and good agreement is observed for the peak spectrum frequency of the sound pressure level. 14 refs., 10 figs.

  9. Explaining the Gap between Theoretical Peak Performance and Real Performance for Supercomputer Architectures

    Directory of Open Access Journals (Sweden)

    W. Schönauer

    1994-01-01

    Full Text Available The basic architectures of vector and parallel computers and their properties are presented, followed by a discussion of memory size and arithmetic operations in the context of memory bandwidth. For a single operation, micromeasurements of the vector triad for the IBM 3090 VF and the CRAY Y-MP/8 are presented, revealing in detail the losses for this operation. The global performance of a whole supercomputer is then considered by identifying reduction factors that reduce the theoretical peak performance to the poor real performance. The responsibilities of the manufacturer and of the user for these losses are discussed. The price-performance ratio for different architectures as of January 1991 is briefly mentioned. Finally, a user-friendly architecture for a supercomputer is proposed.
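
    The vector triad micromeasurement mentioned above, a[i] = b[i] + c[i]*d[i], is easy to reproduce; a portable C sketch follows (the reported MFLOPS will of course depend on compiler flags and timer granularity):

      #include <stdio.h>
      #include <stdlib.h>
      #include <time.h>

      #define N (1 << 22)
      #define REPS 10

      int main(void) {
          double *a = malloc(N * sizeof *a), *b = malloc(N * sizeof *b);
          double *c = malloc(N * sizeof *c), *d = malloc(N * sizeof *d);
          for (long i = 0; i < N; i++) { b[i] = 1.0; c[i] = 2.0; d[i] = 0.5; }

          clock_t t0 = clock();
          for (int r = 0; r < REPS; r++)
              for (long i = 0; i < N; i++)
                  a[i] = b[i] + c[i] * d[i];   /* 2 flops, 4 memory streams */
          double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

          printf("triad: %.1f MFLOPS (a[0]=%g)\n",
                 2.0 * N * REPS / secs / 1e6, a[0]);
          free(a); free(b); free(c); free(d);
          return 0;
      }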

  10. HACC: Simulating Sky Surveys on State-of-the-Art Supercomputing Architectures

    CERN Document Server

    Habib, Salman; Finkel, Hal; Frontiere, Nicholas; Heitmann, Katrin; Daniel, David; Fasel, Patricia; Morozov, Vitali; Zagaris, George; Peterka, Tom; Vishwanath, Venkatram; Lukic, Zarija; Sehrish, Saba; Liao, Wei-keng

    2014-01-01

    Current and future surveys of large-scale cosmic structure are associated with a massive and complex datastream to study, characterize, and ultimately understand the physics behind the two major components of the 'Dark Universe', dark energy and dark matter. In addition, the surveys also probe primordial perturbations and carry out fundamental measurements, such as determining the sum of neutrino masses. Large-scale simulations of structure formation in the Universe play a critical role in the interpretation of the data and extraction of the physics of interest. Just as survey instruments continue to grow in size and complexity, so do the supercomputers that enable these simulations. Here we report on HACC (Hardware/Hybrid Accelerated Cosmology Code), a recently developed and evolving cosmology N-body code framework, designed to run efficiently on diverse computing architectures and to scale to millions of cores and beyond. HACC can run on all current supercomputer architectures and supports a variety of prog...

  11. Direct exploitation of a top 500 Supercomputer for Analysis of CMS Data

    Science.gov (United States)

    Cabrillo, I.; Cabellos, L.; Marco, J.; Fernandez, J.; Gonzalez, I.

    2014-06-01

    The Altamira Supercomputer hosted at the Instituto de Fisica de Cantabria (IFCA) entered operation in summer 2012. Its last-generation FDR InfiniBand network, used for message passing in parallel jobs, also supports the connection to General Parallel File System (GPFS) servers, enabling efficient simultaneous processing of multiple data-demanding jobs. Sharing a common GPFS system and a single LDAP-based identification with the existing Grid clusters at IFCA allows CMS researchers to exploit the large instantaneous capacity of this supercomputer to execute analysis jobs. The detailed experience describing this opportunistic use for skimming and final analysis of CMS 2012 data for a specific physics channel, resulting in an order of magnitude reduction of the waiting time, is presented.

  12. Sandia's network for Supercomputing '95: Validating the progress of Asynchronous Transfer Mode (ATM) switching

    Energy Technology Data Exchange (ETDEWEB)

    Pratt, T.J.; Vahle, O.; Gossage, S.A.

    1996-04-01

    The Advanced Networking Integration Department at Sandia National Laboratories has used the annual Supercomputing conference sponsored by the IEEE and ACM for the past three years as a forum to demonstrate and focus communication and networking developments. For Supercomputing '95, Sandia elected to demonstrate the functionality and capability of an AT&T Globeview 20 Gbps Asynchronous Transfer Mode (ATM) switch, which represents the core of Sandia's corporate network; to build and utilize a three-node 622 megabit per second Paragon network; and to extend the DOD's ACTS ATM Internet from Sandia, New Mexico to the conference's show floor in San Diego, California, for video demonstrations. This paper documents those accomplishments, discusses the details of their implementation, and describes how these demonstrations support Sandia's overall strategies in ATM networking.

  13. BSMBench: a flexible and scalable supercomputer benchmark from computational particle physics

    CERN Document Server

    Bennett, Ed; Del Debbio, Luigi; Jordan, Kirk; Patella, Agostino; Pica, Claudio; Rago, Antonio

    2016-01-01

    Benchmarking plays a central role in the evaluation of High Performance Computing architectures. Several benchmarks have been designed that allow users to stress various components of supercomputers. In order for the figures they provide to be useful, benchmarks need to be representative of the most common real-world scenarios. In this work, we introduce BSMBench, a benchmarking suite derived from Monte Carlo code used in computational particle physics. The advantage of this suite (which can be freely downloaded from http://www.bsmbench.org/) over others is the capacity to vary the relative importance of computation and communication. This enables the tests to simulate various practical situations. To showcase BSMBench, we perform a wide range of tests on various architectures, from desktop computers to state-of-the-art supercomputers, and discuss the corresponding results. Possible future directions of development of the benchmark are also outlined.

  14. Towards 21st century stellar models: Star clusters, supercomputing and asteroseismology

    Science.gov (United States)

    Campbell, S. W.; Constantino, T. N.; D'Orazi, V.; Meakin, C.; Stello, D.; Christensen-Dalsgaard, J.; Kuehn, C.; De Silva, G. M.; Arnett, W. D.; Lattanzio, J. C.; MacLean, B. T.

    2016-09-01

    Stellar models provide a vital basis for many aspects of astronomy and astrophysics. Recent advances in observational astronomy - through asteroseismology, precision photometry, high-resolution spectroscopy, and large-scale surveys - are placing stellar models under greater quantitative scrutiny than ever. The model limitations are being exposed and the next generation of stellar models is needed as soon as possible. The current uncertainties in the models propagate to the later phases of stellar evolution, hindering our understanding of stellar populations and chemical evolution. Here we give a brief overview of the evolution, importance, and substantial uncertainties of core helium burning stars in particular, and then briefly discuss a range of methods, both theoretical and observational, that we are using to advance the modelling. This study uses observational data from HST, VLT, AAT, and Kepler, and supercomputing resources in Australia provided by the National Computational Infrastructure (NCI) and the Pawsey Supercomputing Centre.

  15. Enabling Diverse Software Stacks on Supercomputers using High Performance Virtual Clusters.

    Energy Technology Data Exchange (ETDEWEB)

    Younge, Andrew J. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Pedretti, Kevin [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Grant, Ryan [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Brightwell, Ron [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2017-05-01

    While large-scale simulations have been the hallmark of the High Performance Computing (HPC) community for decades, Large Scale Data Analytics (LSDA) workloads are gaining attention within the scientific community not only as a processing component to large HPC simulations, but also as standalone scientific tools for knowledge discovery. With the path towards Exascale, new HPC runtime systems are also emerging in a way that differs from classical distributed computing models. However, system software for such capabilities on the latest extreme-scale DOE supercomputers needs to be enhanced to more appropriately support these types of emerging software ecosystems. In this paper, we propose the use of Virtual Clusters on advanced supercomputing resources to enable systems to support not only HPC workloads, but also emerging big data stacks. Specifically, we have deployed the KVM hypervisor within Cray's Compute Node Linux on an XC-series supercomputer testbed. We also use libvirt and QEMU to manage and provision VMs directly on compute nodes, leveraging Ethernet-over-Aries network emulation. To our knowledge, this is the first known use of KVM on a true MPP supercomputer. We investigate the overhead of our solution using HPC benchmarks, evaluating both single-node performance and weak scaling of a 32-node virtual cluster. Overall, we find that single-node performance of our solution using KVM on a Cray is very efficient, with near-native performance. However, overhead increases by up to 20% as virtual cluster size increases, due to limitations of the Ethernet-over-Aries bridged network. Furthermore, we deploy Apache Spark with large data analysis workloads in a Virtual Cluster, effectively demonstrating how diverse software ecosystems can be supported by High Performance Virtual Clusters.

  16. TSP:A Heterogeneous Multiprocessor Supercomputing System Based on i860XP

    Institute of Scientific and Technical Information of China (English)

    黄国勇; 李三立

    1994-01-01

    Numerous new RISC processors provide support for supercomputing. By using the “mini-Cray” i860 superscalar processor, an add-on board has been developed to boost the performance of a real-time system. A parallel heterogeneous multiprocessor supercomputing system, TSP, is constructed. In this paper, we present the system design considerations and describe the architecture of the TSP and its features.

  17. US Department of Energy High School Student Supercomputing Honors Program: A follow-up assessment

    Energy Technology Data Exchange (ETDEWEB)

    1987-01-01

    The US DOE High School Student Supercomputing Honors Program was designed to recognize high school students with superior skills in mathematics and computer science and to provide them with formal training and experience with advanced computer equipment. This document reports on the participants who attended the first such program, which was held at the National Magnetic Fusion Energy Computer Center at the Lawrence Livermore National Laboratory (LLNL) during August 1985.

  18. Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.

    Science.gov (United States)

    Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias

    2011-01-01

    The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E.coli, Shigella and S.pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in 1/2 the time and with 1/4 the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.
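
A toy version of the sorted k-mer list idea mentioned above: record every k-length substring with its position, sort, and intersect the lists to find shared anchor seeds between two sequences. This is a simplification for illustration, not progressiveMauve's actual data structures.

```python
# Sorted k-mer lists: the sorted order lets shared seeds between genomes be
# found by list merging; here a dict stands in for the merge for brevity.
def sorted_kmer_list(seq, k):
    return sorted((seq[i:i + k], i) for i in range(len(seq) - k + 1))

def shared_kmers(seq_a, seq_b, k):
    """All (kmer, pos_in_a, pos_in_b) triples common to both sequences."""
    index = {}
    for kmer, pos in sorted_kmer_list(seq_a, k):
        index.setdefault(kmer, []).append(pos)
    return [(kmer, pa, pb)
            for kmer, pb in sorted_kmer_list(seq_b, k)
            for pa in index.get(kmer, [])]

print(shared_kmers("ACGTACGT", "TTACGTAA", 4))  # e.g., ('ACGT', 0, 2), ...
```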

  19. Sparse matrix-vector multiplication on a reconfigurable supercomputer

    Energy Technology Data Exchange (ETDEWEB)

    Dubois, David H [Los Alamos National Laboratory; Dubois, Andrew J [Los Alamos National Laboratory; Boorman, Thomas M [Los Alamos National Laboratory; Connor, Carolyn M [Los Alamos National Laboratory; Poole, Steve [ORNL

    2008-01-01

    Double precision floating point Sparse Matrix-Vector Multiplication (SMVM) is a critical computational kernel used in iterative solvers for systems of sparse linear equations. The poor data locality exhibited by sparse matrices, along with the high memory bandwidth requirements of SMVM, results in poor performance on general purpose processors. Field Programmable Gate Arrays (FPGAs) offer a possible alternative with their customizable and application-targeted memory sub-system and processing elements. In this work we investigate two separate implementations of SMVM on an SRC-6 MAPStation workstation. The first implementation investigates the peak performance capability, while the second balances the amount of instantiated logic with the available sustained bandwidth of the FPGA subsystem. Both implementations yield the same sustained performance, with the second producing a much more efficient solution. The metrics of processor and application balance are introduced to help provide some insight into the efficiencies of the FPGA and CPU based solutions, explicitly showing the tight coupling of the available bandwidth to peak floating point performance. Due to the FPGA's ability to match the amount of implemented logic to the available memory bandwidth, it can provide a much more efficient solution. Finally, making use of the lessons learned implementing the SMVM, we present a fully implemented non-preconditioned Conjugate Gradient algorithm utilizing the second SMVM design.
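
For readers unfamiliar with the kernel, a hand-rolled CSR (compressed sparse row) SpMV is sketched below; the indirect, irregular access to x through the column indices is precisely the poor data locality the abstract refers to. The 3x3 matrix is a made-up example.

```python
# CSR sparse matrix-vector multiply: y = A @ x with A stored as three arrays.
import numpy as np

def spmv_csr(values, col_idx, row_ptr, x):
    y = np.zeros(len(row_ptr) - 1)
    for i in range(len(y)):
        for jj in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[jj] * x[col_idx[jj]]   # gather: random access to x
    return y

# 3x3 example matrix: [[4, 0, 1], [0, 3, 0], [2, 0, 5]]
values  = np.array([4.0, 1.0, 3.0, 2.0, 5.0])
col_idx = np.array([0, 2, 1, 0, 2])
row_ptr = np.array([0, 2, 3, 5])
print(spmv_csr(values, col_idx, row_ptr, np.array([1.0, 1.0, 1.0])))  # [5. 3. 7.]
```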

  20. Internal fluid mechanics research on supercomputers for aerospace propulsion systems

    Science.gov (United States)

    Miller, Brent A.; Anderson, Bernhard H.; Szuch, John R.

    1988-01-01

    The Internal Fluid Mechanics Division of the NASA Lewis Research Center is combining the key elements of computational fluid dynamics, aerothermodynamic experiments, and advanced computational technology to bring internal computational fluid mechanics (ICFM) to a state of practical application for aerospace propulsion systems. The strategies used to achieve this goal are to: (1) pursue an understanding of flow physics, surface heat transfer, and combustion via analysis and fundamental experiments, (2) incorporate improved understanding of these phenomena into verified 3-D CFD codes, and (3) utilize state-of-the-art computational technology to enhance experimental and CFD research. Presented is an overview of the ICFM program in high-speed propulsion, including work in inlets, turbomachinery, and chemical reacting flows. Ongoing efforts to integrate new computer technologies, such as parallel computing and artificial intelligence, into high-speed aeropropulsion research are described.

  1. Radio Synthesis Imaging - A High Performance Computing and Communications Project

    Science.gov (United States)

    Crutcher, Richard M.

    The National Science Foundation has funded a five-year High Performance Computing and Communications project at the National Center for Supercomputing Applications (NCSA) for the direct implementation of several of the computing recommendations of the Astronomy and Astrophysics Survey Committee (the "Bahcall report"). This paper is a summary of the project goals and a progress report. The project will implement a prototype of the next generation of astronomical telescope systems - remotely located telescopes connected by high-speed networks to very high performance, scalable architecture computers and on-line data archives, which are accessed by astronomers over Gbit/sec networks. Specifically, a data link has been installed between the BIMA millimeter-wave synthesis array at Hat Creek, California and NCSA at Urbana, Illinois for real-time transmission of data to NCSA. Data are automatically archived, and may be browsed and retrieved by astronomers using the NCSA Mosaic software. In addition, an on-line digital library of processed images will be established. BIMA data will be processed on a very high performance distributed computing system, with I/O, user interface, and most of the software system running on the NCSA Convex C3880 supercomputer or Silicon Graphics Onyx workstations connected by HiPPI to the high performance, massively parallel Thinking Machines Corporation CM-5. The very computationally intensive algorithms for calibration and imaging of radio synthesis array observations will be optimized for the CM-5 and new algorithms which utilize the massively parallel architecture will be developed. Code running simultaneously on the distributed computers will communicate using the Data Transport Mechanism developed by NCSA. The project will also use the BLANCA Gbit/s testbed network between Urbana and Madison, Wisconsin to connect an Onyx workstation in the University of Wisconsin Astronomy Department to the NCSA CM-5, for development of long

  2. Combining density functional theory calculations, supercomputing, and data-driven methods to design new materials (Conference Presentation)

    Science.gov (United States)

    Jain, Anubhav

    2017-04-01

    Density functional theory (DFT) simulations solve for the electronic structure of materials starting from the Schrödinger equation. Many case studies have now demonstrated that researchers can often use DFT to design new compounds in the computer (e.g., for batteries, catalysts, and hydrogen storage) before synthesis and characterization in the lab. In this talk, I will focus on how DFT calculations can be executed on large supercomputing resources in order to generate very large data sets on new materials for functional applications. First, I will briefly describe the Materials Project, an effort at LBNL that has virtually characterized over 60,000 materials using DFT and has shared the results with over 17,000 registered users. Next, I will talk about how such data can help discover new materials, describing how preliminary computational screening led to the identification and confirmation of a new family of bulk AMX2 thermoelectric compounds with measured zT reaching 0.8. I will outline future plans for how such data-driven methods can be used to better understand the factors that control thermoelectric behavior, e.g., for the rational design of electronic band structures, in ways that are different from conventional approaches.

  3. HACC: Simulating sky surveys on state-of-the-art supercomputing architectures

    Science.gov (United States)

    Habib, Salman; Pope, Adrian; Finkel, Hal; Frontiere, Nicholas; Heitmann, Katrin; Daniel, David; Fasel, Patricia; Morozov, Vitali; Zagaris, George; Peterka, Tom; Vishwanath, Venkatram; Lukić, Zarija; Sehrish, Saba; Liao, Wei-keng

    2016-01-01

    Current and future surveys of large-scale cosmic structure are associated with a massive and complex datastream to study, characterize, and ultimately understand the physics behind the two major components of the 'Dark Universe', dark energy and dark matter. In addition, the surveys also probe primordial perturbations and carry out fundamental measurements, such as determining the sum of neutrino masses. Large-scale simulations of structure formation in the Universe play a critical role in the interpretation of the data and extraction of the physics of interest. Just as survey instruments continue to grow in size and complexity, so do the supercomputers that enable these simulations. Here we report on HACC (Hardware/Hybrid Accelerated Cosmology Code), a recently developed and evolving cosmology N-body code framework, designed to run efficiently on diverse computing architectures and to scale to millions of cores and beyond. HACC can run on all current supercomputer architectures and supports a variety of programming models and algorithms. It has been demonstrated at scale on Cell- and GPU-accelerated systems, standard multi-core node clusters, and Blue Gene systems. HACC's design allows for ease of portability, and at the same time, high levels of sustained performance on the fastest supercomputers available. We present a description of the design philosophy of HACC, the underlying algorithms and code structure, and outline implementation details for several specific architectures. We show selected accuracy and performance results from some of the largest high resolution cosmological simulations so far performed, including benchmarks evolving more than 3.6 trillion particles.

  4. Integration of PanDA workload management system with Titan supercomputer at OLCF

    Science.gov (United States)

    De, K.; Klimentov, A.; Oleynik, D.; Panitkin, S.; Petrosyan, A.; Schovancova, J.; Vaniachine, A.; Wenaus, T.

    2015-12-01

    The PanDA (Production and Distributed Analysis) workload management system (WMS) was developed to meet the scale and complexity of LHC distributed computing for the ATLAS experiment. While PanDA currently distributes jobs to more than 100,000 cores at well over 100 Grid sites, the future LHC data taking runs will require more resources than Grid computing can possibly provide. To alleviate these challenges, ATLAS is engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with Titan supercomputer at Oak Ridge Leadership Computing Facility (OLCF). The current approach utilizes a modified PanDA pilot framework for job submission to Titan's batch queues and local data management, with light-weight MPI wrappers to run single threaded workloads in parallel on Titan's multicore worker nodes. It also gives PanDA new capability to collect, in real time, information about unused worker nodes on Titan, which allows precise definition of the size and duration of jobs submitted to Titan according to available free resources. This capability significantly reduces PanDA job wait time while improving Titan's utilization efficiency. This implementation was tested with a variety of Monte-Carlo workloads on Titan and is being tested on several other supercomputing platforms. Notice: This manuscript has been authored, by employees of Brookhaven Science Associates, LLC under Contract No. DE-AC02-98CH10886 with the U.S. Department of Energy. The publisher by accepting the manuscript for publication acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes.

  5. Statistical correlations and risk analyses techniques for a diving dual phase bubble model and data bank using massively parallel supercomputers.

    Science.gov (United States)

    Wienke, B R; O'Leary, T R

    2008-05-01

    Linking model and data, we detail the LANL diving reduced gradient bubble model (RGBM), dynamical principles, and correlation with data in the LANL Data Bank. Table, profile, and meter risks are obtained from likelihood analysis and quoted for air, nitrox, and helitrox no-decompression time limits, repetitive dive tables, and selected mixed gas and repetitive profiles. Application analyses include the EXPLORER decompression meter algorithm, NAUI tables, University of Wisconsin Seafood Diver tables, comparative NAUI, PADI, and Oceanic NDLs and repetitive dives, comparative nitrogen and helium mixed gas risks, the USS Perry deep rebreather (RB) exploration dive, a world record open circuit (OC) dive, and Woodville Karst Plain Project (WKPP) extreme cave exploration profiles. The algorithm has seen extensive and utilitarian application in mixed gas diving, both in recreational and technical sectors, and forms the basis for released tables and decompression meters used by scientific, commercial, and research divers. The LANL Data Bank is described, and the methods used to deduce risk are detailed. Risk functions for dissolved gas and bubbles are summarized. Parameters that can be used to estimate profile risk are tallied. To fit data, a modified Levenberg-Marquardt routine is employed with an L2 error norm. Appendices sketch the numerical methods and list reports from field testing for (real) mixed gas diving. A Monte Carlo-like sampling scheme for fast numerical analysis of the data is also detailed, as a coupled variance reduction technique and an additional check on the canonical approach to estimating diving risk. The method suggests alternatives to the canonical approach. This work represents a first-time correlation effort linking a dynamical bubble model with deep stop data. Supercomputing resources are requisite to connect model and data in application.
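
The fitting step can be illustrated with SciPy's Levenberg-Marquardt driver: minimize the L2 norm of residuals between a parametric risk function and observed incidence. The risk function and data below are placeholders, not the RGBM's actual risk functions or the LANL Data Bank.

```python
# Levenberg-Marquardt least-squares fit of a toy risk model to synthetic data.
import numpy as np
from scipy.optimize import least_squares

profiles = np.linspace(0.5, 3.0, 40)             # stand-in profile severity
observed = 1 - np.exp(-0.02 * profiles**2)       # stand-in observed risk

def predicted_risk(params, severity):
    a, b = params
    return 1 - np.exp(-a * severity**b)

def residuals(params):
    # least_squares minimizes the L2 norm of this residual vector.
    return predicted_risk(params, profiles) - observed

fit = least_squares(residuals, x0=[0.01, 1.0], method="lm")  # "lm" = Levenberg-Marquardt
print("fitted parameters:", fit.x)               # approximately [0.02, 2.0]
```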

  6. Scalable parallel programming for high performance seismic simulation on petascale heterogeneous supercomputers

    Science.gov (United States)

    Zhou, Jun

    The 1994 Northridge earthquake in Los Angeles, California, killed 57 people, injured over 8,700 and caused an estimated $20 billion in damage. Petascale simulations are needed in California and elsewhere to provide society with a better understanding of the rupture and wave dynamics of the largest earthquakes at shaking frequencies required to engineer safe structures. As heterogeneous supercomputing infrastructures become more common, numerical developments in earthquake system research are particularly challenged by the dependence on accelerator elements to enable "the Big One" simulations with higher frequency and finer resolution. Reducing time to solution and power consumption are two primary focus areas today for the enabling technology of fault rupture dynamics and seismic wave propagation in realistic 3D models of the crust's heterogeneous structure. This dissertation presents scalable parallel programming techniques for high performance seismic simulation running on petascale heterogeneous supercomputers. A real-world earthquake simulation code, AWP-ODC, one of the most advanced earthquake codes to date, was chosen as the base code in this research, and the testbed is based on Titan at Oak Ridge National Laboratory, the world's largest heterogeneous supercomputer. The research work is primarily related to architecture study, computation performance tuning and software system scalability. An earthquake simulation workflow has also been developed to support efficient production sets of simulations. The highlights of the technical development are an aggressive performance optimization focusing on data locality and a notable data communication model that hides the data communication latency. This development results in optimal computation efficiency and throughput for the 13-point stencil code on heterogeneous systems, and can be extended to general high-order stencil codes. Started from scratch, the hybrid CPU/GPU version of AWP
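
The communication-hiding idea credited with much of the performance can be sketched in a 1D halo exchange: post non-blocking sends and receives, update the interior points that need no remote data while messages are in flight, then finish the boundary. The mpi4py sketch below is illustrative; AWP-ODC applies the same pattern to 3D stencils on CPU/GPU nodes.

```python
# Overlapping halo exchange with interior stencil computation (1D averaging
# stencil, periodic ring of ranks for brevity).
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()
n = 1024
u = np.random.rand(n + 2)            # one ghost cell on each side
new = np.empty_like(u)

left, right = (rank - 1) % size, (rank + 1) % size
reqs = [comm.Isend(u[1:2],   dest=left),    # send leftmost owned cell
        comm.Isend(u[n:n+1], dest=right),   # send rightmost owned cell
        comm.Irecv(u[0:1],   source=left),  # receive into left ghost
        comm.Irecv(u[n+1:],  source=right)] # receive into right ghost

new[2:n] = 0.5 * (u[1:n-1] + u[3:n+1])      # interior: no remote data needed
MPI.Request.Waitall(reqs)                   # halos have now arrived
new[1] = 0.5 * (u[0] + u[2])                # boundary points last
new[n] = 0.5 * (u[n-1] + u[n+1])
```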

  7. Accelerating Virtual High-Throughput Ligand Docking: current technology and case study on a petascale supercomputer.

    Science.gov (United States)

    Ellingson, Sally R; Dakshanamurthy, Sivanesan; Brown, Milton; Smith, Jeremy C; Baudry, Jerome

    2014-04-25

    In this paper we give the current state of high-throughput virtual screening. We describe a case study of using a task-parallel MPI (Message Passing Interface) version of Autodock4 [1], [2] to run a virtual high-throughput screen of one-million compounds on the Jaguar Cray XK6 Supercomputer at Oak Ridge National Laboratory. We include a description of scripts developed to increase the efficiency of the predocking file preparation and postdocking analysis. A detailed tutorial, scripts, and source code for this MPI version of Autodock4 are available online at http://www.bio.utk.edu/baudrylab/autodockmpi.htm.

  8. A New Hydrodynamic Model for Numerical Simulation of Interacting Galaxies on Intel Xeon Phi Supercomputers

    Science.gov (United States)

    Kulikov, Igor; Chernykh, Igor; Tutukov, Alexander

    2016-05-01

    This paper presents a new hydrodynamic model of interacting galaxies based on the joint solution of multicomponent hydrodynamic equations, first moments of the collisionless Boltzmann equation and the Poisson equation for gravity. Using this model, it is possible to formulate a unified numerical method for solving hyperbolic equations. This numerical method has been implemented for hybrid supercomputers with Intel Xeon Phi accelerators. The collision of spiral and disk galaxies considering the star formation process, supernova feedback and molecular hydrogen formation is shown as a simulation result.

  9. Scheduling Supercomputers.

    Science.gov (United States)

    1983-02-01

    no task is scheduled with overlap. Let numpi be the total number of preemptions and idle slots of size at most t0 that are introduced. We see that if ... no usable block remains on Qm-*, then numpi < m-k; otherwise, numpi ≤ m-k-1. If j > n when this procedure terminates, then all tasks have been scheduled

  10. Grassroots Supercomputing

    CERN Multimedia

    Buchanan, Mark

    2005-01-01

    What started out as a way for SETI to plow through its piles of radio-signal data from deep space has turned into a powerful research tool, as computer users across the globe donate their screen-saver time to projects as diverse as climate-change prediction, gravitational-wave searches, and protein folding (4 pages)

  11. Performance Analysis and Scaling Behavior of the Terrestrial Systems Modeling Platform TerrSysMP in Large-Scale Supercomputing Environments

    Science.gov (United States)

    Kollet, S. J.; Goergen, K.; Gasper, F.; Shresta, P.; Sulis, M.; Rihani, J.; Simmer, C.; Vereecken, H.

    2013-12-01

    In studies of the terrestrial hydrologic, energy and biogeochemical cycles, integrated multi-physics simulation platforms take a central role in characterizing non-linear interactions, variances and uncertainties of system states and fluxes in reciprocity with observations. Recently developed integrated simulation platforms attempt to honor the complexity of the terrestrial system across multiple time and space scales from the deeper subsurface including groundwater dynamics into the atmosphere. Technically, this requires the coupling of atmospheric, land surface, and subsurface-surface flow models in supercomputing environments, while ensuring a high-degree of efficiency in the utilization of e.g., standard Linux clusters and massively parallel resources. A systematic performance analysis including profiling and tracing in such an application is crucial in the understanding of the runtime behavior, to identify optimum model settings, and is an efficient way to distinguish potential parallel deficiencies. On sophisticated leadership-class supercomputers, such as the 28-rack 5.9 petaFLOP IBM Blue Gene/Q 'JUQUEEN' of the Jülich Supercomputing Centre (JSC), this is a challenging task, but even more so important, when complex coupled component models are to be analysed. Here we want to present our experience from coupling, application tuning (e.g. 5-times speedup through compiler optimizations), parallel scaling and performance monitoring of the parallel Terrestrial Systems Modeling Platform TerrSysMP. The modeling platform consists of the weather prediction system COSMO of the German Weather Service; the Community Land Model, CLM of NCAR; and the variably saturated surface-subsurface flow code ParFlow. The model system relies on the Multiple Program Multiple Data (MPMD) execution model where the external Ocean-Atmosphere-Sea-Ice-Soil coupler (OASIS3) links the component models. TerrSysMP has been instrumented with the performance analysis tool Scalasca and analyzed

  12. Integration of PanDA workload management system with Titan supercomputer at OLCF

    CERN Document Server

    Panitkin, Sergey; The ATLAS collaboration; Klimentov, Alexei; Oleynik, Danila; Petrosyan, Artem; Schovancova, Jaroslava; Vaniachine, Alexandre; Wenaus, Torre

    2015-01-01

    The PanDA (Production and Distributed Analysis) workload management system (WMS) was developed to meet the scale and complexity of LHC distributed computing for the ATLAS experiment. While PanDA currently uses more than 100,000 cores at well over 100 Grid sites with a peak performance of 0.3 petaFLOPS, the next LHC data taking run will require more resources than Grid computing can possibly provide. To alleviate these challenges, ATLAS is engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with the Titan supercomputer at Oak Ridge Leadership Computing Facility (OLCF). The current approach utilizes a modified PanDA pilot framework for job submission to Titan's batch queues and local data management, with light-weight MPI wrappers to run single threaded workloads in parallel on Titan's multi-core worker nodes. It also gives PanDA new capability to collect, in real tim...

  13. Integration of PanDA workload management system with Titan supercomputer at OLCF

    CERN Document Server

    De, Kaushik; Oleynik, Danila; Panitkin, Sergey; Petrosyan, Artem; Vaniachine, Alexandre; Wenaus, Torre; Schovancova, Jaroslava

    2015-01-01

    The PanDA (Production and Distributed Analysis) workload management system (WMS) was developed to meet the scale and complexity of LHC distributed computing for the ATLAS experiment. While PanDA currently distributes jobs to more than 100,000 cores at well over 100 Grid sites, the next LHC data taking run will require more resources than Grid computing can possibly provide. To alleviate these challenges, ATLAS is engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with the Titan supercomputer at Oak Ridge Leadership Computing Facility (OLCF). The current approach utilizes a modified PanDA pilot framework for job submission to Titan's batch queues and local data management, with light-weight MPI wrappers to run single threaded workloads in parallel on Titan's multi-core worker nodes. It also gives PanDA new capability to collect, in real time, information about unused...

  14. PREFACE: HITES 2012: 'Horizons of Innovative Theories, Experiments, and Supercomputing in Nuclear Physics'

    Science.gov (United States)

    Hecht, K. T.

    2012-12-01

    This volume contains the contributions of the speakers of an international conference in honor of Jerry Draayer's 70th birthday, entitled 'Horizons of Innovative Theories, Experiments and Supercomputing in Nuclear Physics'. The list of contributors includes not only international experts in these fields, but also many former collaborators, former graduate students, and former postdoctoral fellows of Jerry Draayer, stressing innovative theories such as special symmetries and supercomputing, both of particular interest to Jerry. The organizers of the conference intended to honor Jerry Draayer not only for his seminal contributions in these fields, but also for his administrative skills at the departmental, university, national and international levels. Signed: Ted Hecht, University of Michigan. Scientific Advisory Committee: Ani Aprahamian (University of Notre Dame), Baha Balantekin (University of Wisconsin), Bruce Barrett (University of Arizona), Umit Catalyurek (Ohio State University), David Dean (Oak Ridge National Laboratory), Jutta Escher, Chair (Lawrence Livermore National Laboratory), Jorge Hirsch (UNAM, Mexico), David Rowe (University of Toronto), Brad Sherill (Michigan State University), Joel Tohline (Louisiana State University), Edward Zganjar (Louisiana State University). Organizing Committee: Jeff Blackmon (Louisiana State University), Mark Caprio (University of Notre Dame), Tomas Dytrych (Louisiana State University), Ana Georgieva (INRNE, Bulgaria), Kristina Launey, Co-chair (Louisiana State University), Gabriella Popa (Ohio University Zanesville), James Vary, Co-chair (Iowa State University). Local Organizing Committee: Laura Linhardt (Louisiana State University), Charlie Rasco (Louisiana State University), Karen Richard, Coordinator (Louisiana State University).

  15. Groundwater cooling of a supercomputer in Perth, Western Australia: hydrogeological simulations and thermal sustainability

    Science.gov (United States)

    Sheldon, Heather A.; Schaubs, Peter M.; Rachakonda, Praveen K.; Trefry, Michael G.; Reid, Lynn B.; Lester, Daniel R.; Metcalfe, Guy; Poulet, Thomas; Regenauer-Lieb, Klaus

    2015-12-01

    Groundwater cooling (GWC) is a sustainable alternative to conventional cooling technologies for supercomputers. A GWC system has been implemented for the Pawsey Supercomputing Centre in Perth, Western Australia. Groundwater is extracted from the Mullaloo Aquifer at 20.8 °C and passes through a heat exchanger before returning to the same aquifer. Hydrogeological simulations of the GWC system were used to assess its performance and sustainability. Simulations were run with cooling capacities of 0.5 or 2.5 megawatts thermal (MWth), with scenarios representing various combinations of pumping rate, injection temperature and hydrogeological parameter values. The simulated system generates a thermal plume in the Mullaloo Aquifer and overlying Superficial Aquifer. Thermal breakthrough (transfer of heat from injection to production wells) occurred in 2.7-4.3 years for a 2.5 MWth system. Shielding (reinjection of cool groundwater between the injection and production wells) resulted in earlier thermal breakthrough but reduced the rate of temperature increase after breakthrough, such that shielding was beneficial after approximately 5 years of pumping. Increasing injection temperature was preferable to increasing flow rate for maintaining cooling capacity after thermal breakthrough. Thermal impacts on existing wells were small, with up to 10 wells experiencing a temperature increase ≥ 0.1 °C (largest increase 6 °C).

  16. Frequently updated noise threat maps created with use of supercomputing grid

    Directory of Open Access Journals (Sweden)

    Szczodrak Maciej

    2014-09-01

    Full Text Available Innovative supercomputing grid services devoted to noise threat evaluation were presented. The services described in this paper concern two issues: the first is related to noise mapping, while the second focuses on assessment of the noise dose and its influence on the human hearing system. The discussed services were developed within the PL-Grid Plus Infrastructure, which brings together Polish academic supercomputer centers. Selected experimental results achieved by the usage of the proposed services were presented. The assessment of environmental noise threats includes creation of noise maps using either offline or online data acquired through a grid of monitoring stations. A concept of estimating the source model parameters based on the measured sound level, for the purpose of creating frequently updated noise maps, was presented. Connecting the noise mapping grid service with a distributed sensor network enables automatic updating of noise maps for a specified time period. Moreover, a unique attribute of the developed software is the estimation of the auditory effects evoked by exposure to noise. The estimation method uses a modified psychoacoustic model of hearing and is based on the calculated noise level values and the given exposure period. Potential use scenarios of the grid services for research or educational purposes were introduced. Presentation of the results of predicted hearing threshold shift caused by exposure to excessive noise can raise public awareness of noise threats.

  17. Supercomputer Assisted Generation of Machine Learning Agents for the Calibration of Building Energy Models

    Energy Technology Data Exchange (ETDEWEB)

    Sanyal, Jibonananda [ORNL; New, Joshua Ryan [ORNL; Edwards, Richard [ORNL

    2013-01-01

    Building Energy Modeling (BEM) is an approach to model the energy usage in buildings for design and retrofit purposes. EnergyPlus is the flagship Department of Energy software that performs BEM for different types of buildings. The input to EnergyPlus can often extend to the order of a few thousand parameters which have to be calibrated manually by an expert for realistic energy modeling. This makes it challenging and expensive, thereby making building energy modeling unfeasible for smaller projects. In this paper, we describe the "Autotune" research which employs machine learning algorithms to generate agents for the different kinds of standard reference buildings in the U.S. building stock. The parametric space and the variety of building locations and types make this a challenging computational problem necessitating the use of supercomputers. Millions of EnergyPlus simulations are run on supercomputers and are subsequently used to train machine learning algorithms to generate agents. These agents, once created, can then run in a fraction of the time, thereby allowing cost-effective calibration of building models.
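
The agent idea can be pictured as training a fast surrogate on the bulk simulation outputs: once fitted, the surrogate answers in milliseconds what a full EnergyPlus run answers in minutes. The sketch below uses scikit-learn on synthetic stand-in data and is not the Autotune code itself.

```python
# Train a surrogate "agent" on (input parameters -> simulated output) pairs.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.random((20_000, 20))                      # stand-in sampled parameters
y = 3.0 * X[:, 0] + np.sin(X[:, 1]) + rng.normal(0, 0.01, len(X))  # stand-in output

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
agent = RandomForestRegressor(n_estimators=100, n_jobs=-1).fit(X_tr, y_tr)
print("surrogate R^2 on held-out runs:", agent.score(X_te, y_te))
```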

  18. Federal Market Information Technology in the Post Flash Crash Era: Roles for Supercomputing

    Energy Technology Data Exchange (ETDEWEB)

    Bethel, E. Wes; Leinweber, David; Ruebel, Oliver; Wu, Kesheng

    2011-09-16

    This paper describes collaborative work between active traders, regulators, economists, and supercomputing researchers to replicate and extend investigations of the Flash Crash and other market anomalies in a National Laboratory HPC environment. Our work suggests that supercomputing tools and methods will be valuable to market regulators in achieving the goal of market safety, stability, and security. Research results using high frequency data and analytics are described, and directions for future development are discussed. Currently the key mechanisms for preventing catastrophic market action are “circuit breakers.” We believe a more graduated approach, similar to the “yellow light” approach in motorsports to slow down traffic, might be a better way to achieve the same goal. To enable this objective, we study a number of indicators that could foresee hazards in market conditions and explore options to confirm such predictions. Our tests confirm that Volume Synchronized Probability of Informed Trading (VPIN) and a version of the volume Herfindahl-Hirschman Index (HHI) for measuring market fragmentation can indeed give strong signals ahead of the Flash Crash event on May 6, 2010. This is a preliminary step toward a full-fledged early-warning system for unusual market conditions.
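
A simplified sketch of the VPIN indicator named above: trades are grouped into equal-volume buckets, volume in each bucket is classified buy or sell (here via the naive tick rule), and VPIN is the mean order-flow imbalance over the most recent buckets. This compresses several details of the published estimator.

```python
# Toy VPIN: equal-volume bucketing + tick-rule classification of volume.
from collections import deque

def vpin(prices, volumes, bucket_volume, n_buckets=50):
    buckets = deque(maxlen=n_buckets)     # rolling window of imbalances
    buy = sell = filled = 0.0
    last_price = prices[0]
    for p, v in zip(prices[1:], volumes[1:]):
        side_buy = p >= last_price        # tick rule: uptick -> buyer-initiated
        last_price = p
        while v > 0:
            take = min(v, bucket_volume - filled)
            if side_buy: buy  += take
            else:        sell += take
            filled += take
            v -= take
            if filled >= bucket_volume:   # close the bucket
                buckets.append(abs(buy - sell) / bucket_volume)
                buy = sell = filled = 0.0
    return sum(buckets) / len(buckets) if buckets else None

# usage (made-up data): vpin(prices, volumes, bucket_volume=10_000)
```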

  19. Robust Machine Learning Applied to Terascale Astronomical Datasets

    CERN Document Server

    Ball, Nicholas M; Myers, Adam D

    2008-01-01

    We present recent results from the LCDM (Laboratory for Cosmological Data Mining; http://lcdm.astro.uiuc.edu) collaboration between UIUC Astronomy and NCSA to deploy supercomputing cluster resources and machine learning algorithms for the mining of terascale astronomical datasets. This is a novel application in the field of astronomy, because we are using such resources for data mining, and not just performing simulations. Via a modified implementation of the NCSA cyberenvironment Data-to-Knowledge, we are able to provide improved classifications for over 100 million stars and galaxies in the Sloan Digital Sky Survey, improved distance measures, and a full exploitation of the simple but powerful k-nearest neighbor algorithm. A driving principle of this work is that our methods should be extensible from current terascale datasets to upcoming petascale datasets and beyond. We discuss issues encountered to date, and further issues for the transition to petascale. In particular, disk I/O will become a major limit...
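
The k-nearest-neighbor step can be illustrated in a few lines with scikit-learn: fit on labeled training objects (e.g., photometric colors with star/galaxy labels) and classify new survey objects. The arrays below are synthetic stand-ins; the LCDM pipeline ran its own implementation at terascale.

```python
# Minimal kNN star/galaxy classification sketch on made-up photometric colors.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
X_train = rng.random((10_000, 4))              # e.g., u-g, g-r, r-i, i-z colors
y_train = (X_train[:, 0] > 0.5).astype(int)    # stand-in star/galaxy labels

knn = KNeighborsClassifier(n_neighbors=15).fit(X_train, y_train)
X_new = rng.random((5, 4))                     # new survey objects
print(knn.predict(X_new))                      # class labels
print(knn.predict_proba(X_new)[:, 1])          # class probabilities
```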

  20. Large-scale Particle Simulations for Debris Flows using Dynamic Load Balance on a GPU-rich Supercomputer

    Science.gov (United States)

    Tsuzuki, Satori; Aoki, Takayuki

    2016-04-01

    Numerical simulation of debris flows carrying countless objects is an important topic in fluid dynamics and many engineering applications. Particle-based methods are a promising approach to simulating flows interacting with objects. In this paper, we propose an efficient method to realize a large-scale simulation of fluid-structure interaction by combining the SPH (Smoothed Particle Hydrodynamics) method for the fluid with DEM (Discrete Element Method) for the objects on a multi-GPU system. By applying space filling curves to the decomposition of the computational domain, we are able to keep the same number of particles in each decomposed domain (see the sketch below). In our implementation, several techniques for particle counting and data movement have been introduced. Fragmentation of the memory used for particles happens during the time integration, and the frequency of de-fragmentation is examined by taking into account the computational load balance and the communication cost between CPU and GPU. A linked-list technique for the particle interaction is introduced to save memory drastically. It is found that sorting the particle data for the neighboring-particle list using the linked-list method greatly improves memory access when done at a certain interval. The weak and strong scalabilities of an SPH simulation using 111 million particles were measured from 4 GPUs to 512 GPUs for three types of space filling curves. A large-scale debris flow simulation of a tsunami with 10,368 floating rubble pieces, using 117 million particles, was successfully carried out with 256 GPUs on the TSUBAME 2.5 supercomputer at the Tokyo Institute of Technology.
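
The space-filling-curve decomposition can be sketched as follows: map each particle's cell coordinates to a Morton (Z-order) key by bit interleaving, sort particles along the curve, and cut the sorted list into equal-count chunks, one per GPU. This is 2D and illustrative only; the paper evaluates three curve types in 3D.

```python
# Equal-count domain decomposition along a Morton (Z-order) curve.
def interleave_bits(x, y, bits=16):
    """Morton key: interleave the bits of integer cell coordinates x and y."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i) | ((y >> i) & 1) << (2 * i + 1)
    return key

def decompose(particles, n_domains, cells=1024, box=1.0):
    # Sort particles along the curve, then cut into nearly equal-count chunks.
    def key(p):
        ix = int(p[0] / box * (cells - 1))
        iy = int(p[1] / box * (cells - 1))
        return interleave_bits(ix, iy)
    keyed = sorted(particles, key=key)
    chunk = -(-len(keyed) // n_domains)          # ceiling division
    return [keyed[i * chunk:(i + 1) * chunk] for i in range(n_domains)]
```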

  1. Intel 80860 or I860: The million transistor RISC microprocessor chip with supercomputer capability. April 1988-September 1989 (Citations from the Computer data base). Report for April 1988-September 1989

    Energy Technology Data Exchange (ETDEWEB)

    1989-10-01

    This bibliography contains citations concerning Intel's new microprocessor which has more than a million transistors and is capable of performing up to 80 million floating-point operations per second (80 mflops). The I860 (originally code named the N-10 during development) is to be used in workstation type applications. It will be suited for problems such as fluid dynamics, molecular modeling, structural analysis, and economic modeling which requires supercomputer number crunching and advanced graphics. (Contains 64 citations fully indexed and including a title list.)

  2. A Framework for HI Spectral Source Finding Using Distributed-Memory Supercomputing

    CERN Document Server

    Westerlund, Stefan

    2014-01-01

    The latest generation of radio astronomy interferometers will conduct all sky surveys with data products consisting of petabytes of spectral line data. Traditional approaches to identifying and parameterising the astrophysical sources within this data will not scale to datasets of this magnitude, since the performance of workstations will not keep up with the real-time generation of data. For this reason, it is necessary to employ high performance computing systems consisting of a large number of processors connected by a high-bandwidth network. In order to make use of such supercomputers substantial modifications must be made to serial source finding code. To ease the transition, this work presents the Scalable Source Finder Framework, a framework providing storage access, networking communication and data composition functionality, which can support a wide range of source finding algorithms provided they can be applied to subsets of the entire image. Additionally, the Parallel Gaussian Source Finder was imp...

  3. Diskless supercomputers: Scalable, reliable I/O for the Tera-Op technology base

    Science.gov (United States)

    Katz, Randy H.; Ousterhout, John K.; Patterson, David A.

    1993-01-01

    Computing is seeing an unprecedented improvement in performance; over the last five years there has been an order-of-magnitude improvement in the speeds of workstation CPU's. At least another order of magnitude seems likely in the next five years, to machines with 500 MIPS or more. The goal of the ARPA Teraop program is to realize even larger, more powerful machines, executing as many as a trillion operations per second. Unfortunately, we have seen no comparable breakthroughs in I/O performance; the speeds of I/O devices and the hardware and software architectures for managing them have not changed substantially in many years. We have completed a program of research to demonstrate hardware and software I/O architectures capable of supporting the kinds of internetworked 'visualization' workstations and supercomputers that will appear in the mid 1990s. The project had three overall goals: high performance, high reliability, and scalable, multipurpose system.

  4. An Optimized Parallel FDTD Topology for Challenging Electromagnetic Simulations on Supercomputers

    Directory of Open Access Journals (Sweden)

    Shugang Jiang

    2015-01-01

    Full Text Available It may not be a challenge to run a Finite-Difference Time-Domain (FDTD) code for electromagnetic simulations on a supercomputer with more than 10 thousand CPU cores; however, to make an FDTD code work with the highest efficiency is a challenge. In this paper, the performance of parallel FDTD is optimized through MPI (message passing interface) virtual topology, based on which a communication model is established. The general rules of optimal topology are presented according to the model. The performance of the method is tested and analyzed on three high performance computing platforms with different architectures in China. Simulations including an airplane with a 700-wavelength wingspan, and a complex microstrip antenna array with nearly 2000 elements, are performed very efficiently using a maximum of 10240 CPU cores.
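
The MPI virtual-topology mechanism at the heart of the optimization can be sketched with mpi4py: build a 3D Cartesian communicator matching the FDTD domain decomposition (letting MPI reorder ranks to fit the machine) and query the halo-exchange neighbors. Dimensions here are examples, not the paper's configuration.

```python
# MPI Cartesian virtual topology for a 3D domain decomposition.
from mpi4py import MPI

comm = MPI.COMM_WORLD
dims = MPI.Compute_dims(comm.Get_size(), [0, 0, 0])   # let MPI factor the grid
cart = comm.Create_cart(dims, periods=[False] * 3, reorder=True)  # may remap ranks

coords = cart.Get_coords(cart.Get_rank())
neighbors = {axis: cart.Shift(axis, 1) for axis in range(3)}  # (src, dst) per axis
print(f"rank {cart.Get_rank()} at {coords}, neighbors: {neighbors}")
```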

  5. Large-scale integrated super-computing platform for next generation virtual drug discovery.

    Science.gov (United States)

    Mitchell, Wayne; Matsumoto, Shunji

    2011-08-01

    Traditional drug discovery starts by experimentally screening chemical libraries to find hit compounds that bind to protein targets, modulating their activity. Subsequent rounds of iterative chemical derivatization and rescreening are conducted to enhance the potency, selectivity, and pharmacological properties of hit compounds. Although computational docking of ligands to targets has been used to augment the empirical discovery process, its historical effectiveness has been limited because of the poor correlation of ligand dock scores and experimentally determined binding constants. Recent progress in supercomputing, coupled to theoretical insights, allows the calculation of the Gibbs free energy, and therefore accurate binding constants, for unusually large ligand-receptor systems. This advance extends the potential of virtual drug discovery. A specific embodiment of the technology, integrating de novo, abstract fragment-based drug design, sophisticated molecular simulation, and the ability to calculate thermodynamic binding constants with unprecedented accuracy, is discussed. Copyright © 2011 Elsevier Ltd. All rights reserved.
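
The link between a computed Gibbs free energy and a binding constant is the standard relation K = exp(-ΔG/RT); a short worked example with an assumed ΔG value:

```python
# ΔG -> binding constant: K = exp(-ΔG/RT). Example value only.
import math

R, T = 8.314, 298.15            # J/(mol·K), K
dG = -42_000.0                  # J/mol, assumed computed binding free energy
K = math.exp(-dG / (R * T))     # association constant
print(f"K_a = {K:.3e}  ->  K_d = {1/K:.3e} M")   # ~4e-8 M, i.e. tens of nM
```

This also makes the accuracy demand concrete: since K depends exponentially on ΔG, an error of a few kJ/mol in the free energy shifts the predicted binding constant by roughly an order of magnitude.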

  6. A CPU/MIC Collaborated Parallel Framework for GROMACS on Tianhe-2 Supercomputer.

    Science.gov (United States)

    Peng, Shaoliang; Yang, Shunyun; Su, Wenhe; Zhang, Xiaoyu; Zhang, Tenglilang; Liu, Weiguo; Zhao, Xingming

    2017-06-16

    Molecular Dynamics (MD) is the simulation of the dynamic behavior of atoms and molecules. As the most popular software for molecular dynamics, GROMACS cannot work on large-scale data because of limited computing resources. In this paper, we propose a CPU and Intel® Xeon Phi Many Integrated Core (MIC) collaborated parallel framework to accelerate GROMACS using the offload mode on a MIC coprocessor, with which the performance of GROMACS is improved significantly, especially on the Tianhe-2 supercomputer. Furthermore, we optimize GROMACS so that it can run on both the CPU and MIC at the same time. In addition, we accelerate multi-node GROMACS so that it can be used in practice. Benchmarking on real data, our accelerated GROMACS performs very well and reduces computation time significantly. Source code: https://github.com/tianhe2/gromacs-mic.

  7. Modern Gyrokinetic Particle-In-Cell Simulation of Fusion Plasmas on Top Supercomputers

    CERN Document Server

    Wang, Bei; Tang, William; Ibrahim, Khaled; Madduri, Kamesh; Williams, Samuel; Oliker, Leonid

    2015-01-01

    The Gyrokinetic Toroidal Code at Princeton (GTC-P) is a highly scalable and portable particle-in-cell (PIC) code. It solves the 5D Vlasov-Poisson equation featuring efficient utilization of modern parallel computer architectures at the petascale and beyond. Motivated by the goal of developing a modern code capable of dealing with the physics challenge of increasing problem size with sufficient resolution, new thread-level optimizations have been introduced as well as a key additional domain decomposition. GTC-P's multiple levels of parallelism, including inter-node 2D domain decomposition and particle decomposition, as well as intra-node shared memory partition and vectorization have enabled pushing the scalability of the PIC method to extreme computational scales. In this paper, we describe the methods developed to build a highly parallelized PIC code across a broad range of supercomputer designs. This particularly includes implementations on heterogeneous systems using NVIDIA GPU accelerators and Intel Xeon...

  8. Dawning Nebulae: A PetaFLOPS Supercomputer with a Heterogeneous Structure

    Institute of Scientific and Technical Information of China (English)

    Ning-Hui Sun; Jing Xing; Zhi-Gang Huo; Guang-Ming Tan; Jin Xiong; Bo Li; Can Ma

    2011-01-01

    Dawning Nebulae is a heterogeneous system composed of 9280 multi-core x86 CPUs and 4640 NVIDIA Fermi GPUs. With a Linpack performance of 1.271 petaFLOPS, it was ranked the second in the TOP500 List released in June 2010. In this paper, key issues in the system design of Dawning Nebulae are introduced. System tuning methodologies aiming at petaFLOPS Linpack result are presented, including algorithmic optimization and communication improvement. The design of its file I/O subsystem, including HVFS and the underlying DCFS3, is also described. Performance evaluations show that the Linpack efficiency of each node reaches 69.89%, and 1024-node aggregate read and write bandwidths exceed 100 GB/s and 70 GB/s respectively. The success of Dawning Nebulae has demonstrated the viability of CPU/GPU heterogeneous structure for future designs of supercomputers.

  9. Scalability Test of multiscale fluid-platelet model for three top supercomputers

    Science.gov (United States)

    Zhang, Peng; Zhang, Na; Gao, Chao; Zhang, Li; Gao, Yuxiang; Deng, Yuefan; Bluestein, Danny

    2016-07-01

    We have tested the scalability of three supercomputers: the Tianhe-2, Stampede and CS-Storm with multiscale fluid-platelet simulations, in which a highly-resolved and efficient numerical model for nanoscale biophysics of platelets in microscale viscous biofluids is considered. Three experiments involving varying problem sizes were performed: Exp-S: 680,718-particle single-platelet; Exp-M: 2,722,872-particle 4-platelet; and Exp-L: 10,891,488-particle 16-platelet. Our implementations of multiple time-stepping (MTS) algorithm improved the performance of single time-stepping (STS) in all experiments. Using MTS, our model achieved the following simulation rates: 12.5, 25.0, 35.5 μs/day for Exp-S and 9.09, 6.25, 14.29 μs/day for Exp-M on Tianhe-2, CS-Storm 16-K80 and Stampede K20. The best rate for Exp-L was 6.25 μs/day for Stampede. Utilizing current advanced HPC resources, the simulation rates achieved by our algorithms bring within reach performing complex multiscale simulations for solving vexing problems at the interface of biology and engineering, such as thrombosis in blood flow which combines millisecond-scale hematology with microscale blood flow at resolutions of micro-to-nanoscale cellular components of platelets. This study of testing the performance characteristics of supercomputers with advanced computational algorithms that offer optimal trade-off to achieve enhanced computational performance serves to demonstrate that such simulations are feasible with currently available HPC resources.
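
The multiple time-stepping (MTS) scheme credited above with the speedup over single time-stepping (STS) can be sketched generically: evaluate cheap, fast-varying forces every small step and expensive, slowly varying forces only every k-th step. The integrator and toy forces below are illustrative, not the platelet model's physics.

```python
# Generic two-level multiple time-stepping (MTS) integrator sketch.
import numpy as np

def integrate_mts(x, v, fast_force, slow_force, dt, n_steps, k=4, mass=1.0):
    f_slow = slow_force(x)                 # expensive force, reused k steps
    for step in range(n_steps):
        if step % k == 0:
            f_slow = slow_force(x)         # refresh only every k-th step
        f = fast_force(x) + f_slow
        v += f / mass * dt                 # semi-implicit Euler, for brevity
        x += v * dt
    return x, v

# Example with toy forces (a stiff local spring plus a soft global pull):
x, v = np.zeros(3), np.ones(3)
x, v = integrate_mts(x, v,
                     fast_force=lambda x: -100.0 * x,
                     slow_force=lambda x: -0.1 * (x - 1.0),
                     dt=1e-3, n_steps=1000)
```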

  10. Harnessing Petaflop-Scale Multi-Core Supercomputing for Problems in Space Science

    Science.gov (United States)

    Albright, B. J.; Yin, L.; Bowers, K. J.; Daughton, W.; Bergen, B.; Kwan, T. J.

    2008-12-01

    The particle-in-cell kinetic plasma code VPIC has been migrated successfully to the world's fastest supercomputer, Roadrunner, a hybrid multi-core platform built by IBM for the Los Alamos National Laboratory. How this was achieved will be described and examples of state-of-the-art calculations in space science, in particular, the study of magnetic reconnection, will be presented. With VPIC on Roadrunner, we have performed, for the first time, plasma PIC calculations with over one trillion particles, >100× larger than calculations considered "heroic" by community standards. This allows examination of physics at unprecedented scale and fidelity. Roadrunner is an example of an emerging paradigm in supercomputing: the trend toward multi-core systems with deep hierarchies and where memory bandwidth optimization is vital to achieving high performance. Getting VPIC to perform well on such systems is a formidable challenge: the core algorithm is memory bandwidth limited with low compute-to-data ratio and requires random access to memory in its inner loop. That we were able to get VPIC to perform and scale well, achieving >0.374 Pflop/s and linear weak scaling on real physics problems on up to the full 12240-core Roadrunner machine, bodes well for harnessing these machines for our community's needs in the future. Many of the design considerations encountered carry over to other multi-core and accelerated (e.g., via GPU) platforms, and we modified VPIC with flexibility in mind. These will be summarized and strategies for how one might adapt a code for such platforms will be shared. Work performed under the auspices of the U.S. DOE by the LANS LLC Los Alamos National Laboratory. Dr. Bowers is a LANL Guest Scientist; he is presently at D. E. Shaw Research LLC, 120 W 45th Street, 39th Floor, New York, NY 10036.

  11. Air Force Maui Optical and Supercomputing Site (AMOS) Application Briefs 2004

    Science.gov (United States)

    2004-01-01

    ... behavior of distal residues in HemAT-Hs. Significance and Vision: Characterization of this protein, from the signaling domain of aerotaxis, by theoretical ... Bioinformatics project in collaboration with the University of Hawaii at MHPCC. The heme-containing globular protein from the aerotaxis region of ...

  12. Optimization of a Power Transient Stability Program on a Vector Supercomputer, Theory and Applications

    Science.gov (United States)

    1990-04-18

    Contents include system solution techniques: the Gauss-Seidel method and the Newton-Raphson method. Traditionally, load-flow and stability studies have been solved using either the Gauss-Seidel method or the Newton-Raphson method. The Gauss-Seidel method solves each equation in the system in turn, by assuming a value for each variable and solving the equation ...

  13. A visualization environment for supercomputing-based applications in computational mechanics

    Energy Technology Data Exchange (ETDEWEB)

    Pavlakos, C.J.; Schoof, L.A.; Mareda, J.F.

    1993-06-01

    In this paper, we characterize a visualization environment that has been designed and prototyped for a large community of scientists and engineers, with an emphasis on supercomputing-based computational mechanics. The proposed environment makes use of a visualization server concept to provide effective, interactive visualization at the user's desktop. Benefits of using the visualization server approach are discussed. Some thoughts regarding desirable features for visualization server hardware architectures are also addressed. A brief discussion of the software environment is included. The paper concludes by summarizing certain observations we have made regarding the implementation of such visualization environments.

  14. Easy Access to HPC Resources through the Application GUI

    KAUST Repository

    van Waveren, Matthijs

    2016-11-01

    The computing environment at the King Abdullah University of Science and Technology (KAUST) is growing in size and complexity. KAUST hosts the tenth fastest supercomputer in the world (Shaheen II) and several HPC clusters. Researchers can be inhibited by the complexity, as they need to learn new languages and execute many tasks in order to access the HPC clusters and the supercomputer. In order to simplify the access, we have developed an interface between the applications and the clusters and supercomputer that automates the transfer of input data and job submission and also the retrieval of results to the researcher’s local workstation. The innovation is that the user now submits his jobs from within the application GUI on his workstation, and does not have to directly log into the clusters or supercomputer anymore. This article details the solution and its benefits to the researchers.
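
    A back-end for such a GUI can be sketched in a few lines; the host name, directory layout, and the use of ssh/scp with a SLURM sbatch submission are assumptions for illustration, not details of the KAUST implementation:

    ```python
    import subprocess
    from pathlib import Path

    def run_remote_job(host, remote_dir, input_files, batch_script):
        """Stage input data, submit a batch job, and return the job id."""
        for f in input_files:                        # transfer inputs to the cluster
            subprocess.run(["scp", str(f), f"{host}:{remote_dir}/"], check=True)
        out = subprocess.run(                        # submit without an interactive login
            ["ssh", host, f"cd {remote_dir} && sbatch {batch_script}"],
            check=True, capture_output=True, text=True)
        return out.stdout.strip().split()[-1]        # "Submitted batch job <id>"

    def fetch_results(host, remote_dir, pattern, local_dir):
        """Retrieve result files back to the researcher's workstation."""
        Path(local_dir).mkdir(parents=True, exist_ok=True)
        subprocess.run(["scp", f"{host}:{remote_dir}/{pattern}", local_dir], check=True)
    ```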

  15. Implementation and scaling of the fully coupled Terrestrial Systems Modeling Platform (TerrSysMP v1.0) in a massively parallel supercomputing environment - a case study on JUQUEEN (IBM Blue Gene/Q)

    Science.gov (United States)

    Gasper, F.; Goergen, K.; Shrestha, P.; Sulis, M.; Rihani, J.; Geimer, M.; Kollet, S.

    2014-10-01

    Continental-scale hyper-resolution simulations constitute a grand challenge in characterizing nonlinear feedbacks of states and fluxes of the coupled water, energy, and biogeochemical cycles of terrestrial systems. Tackling this challenge requires advanced coupling and supercomputing technologies for earth system models that are discussed in this study, utilizing the example of the implementation of the newly developed Terrestrial Systems Modeling Platform (TerrSysMP v1.0) on JUQUEEN (IBM Blue Gene/Q) of the Jülich Supercomputing Centre, Germany. The applied coupling strategies rely on the Multiple Program Multiple Data (MPMD) paradigm using the OASIS suite of external couplers, and require memory and load balancing considerations in the exchange of the coupling fields between different component models and the allocation of computational resources, respectively. Using the advanced profiling and tracing tool Scalasca to determine an optimum load balancing leads to a 19% speedup. In massively parallel supercomputer environments, the coupler OASIS-MCT is recommended, which resolves memory limitations that may be significant in case of very large computational domains and exchange fields as they occur in these specific test cases and in many applications in terrestrial research. However, model I/O and initialization in the petascale range still require major attention, as they constitute true big data challenges in light of future exascale computing resources. Based on a factor-two speedup due to compiler optimizations, a refactored coupling interface using OASIS-MCT and an optimum load balancing, the problem size in a weak scaling study can be increased by a factor of 64 from 512 to 32 768 processes while maintaining parallel efficiencies above 80% for the component models.
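
    For reference, the weak-scaling efficiency quoted above follows the standard definition (per-process workload held constant while the process count grows); the formula below is a textbook convention, not taken from the paper:

    ```latex
    E_{\mathrm{weak}}(N) = \frac{T(N_0)}{T(N)}, \qquad N_0 = 512,\ N = 32\,768,
    ```

    so "parallel efficiencies above 80%" means the runtime grew by less than a factor of 1/0.8 = 1.25 while the problem size grew 64-fold.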

  16. Research center Juelich to install Germany's most powerful supercomputer new IBM System for science and research will achieve 5.8 trillion computations per second

    CERN Multimedia

    2002-01-01

    "The Research Center Juelich, Germany, and IBM today announced that they have signed a contract for the delivery and installation of a new IBM supercomputer at the Central Institute for Applied Mathematics" (1/2 page).

  17. Earth and environmental science in the 1980's: Part 1: Environmental data systems, supercomputer facilities and networks

    Science.gov (United States)

    1986-01-01

    Overview descriptions of on-line environmental data systems, supercomputer facilities, and networks are presented. Each description addresses the concepts of content, capability, and user access relevant to the point of view of potential utilization by the Earth and environmental science community. The information on similar systems or facilities is presented in parallel fashion to encourage and facilitate intercomparison. In addition, summary sheets are given for each description, and a summary table precedes each section.

  18. Federal Coordinating Council on Science, Engineering and Technology, Committee on Computer Research and Applications, Subcommittee on Science and Engineering Computing: Annual report, 1987

    Energy Technology Data Exchange (ETDEWEB)

    1988-03-01

    In the past year the committee initiated efforts that resulted in the report "A Research and Development Strategy for High Performance Computing" and will presently provide a government-wide implementation plan to address the technological opportunities made possible by significantly enhanced supercomputer capability. The committee met on a regular basis to review government-supported programs in the research, development, and application of new supercomputer technology. The committee annually visits supercomputer manufacturers to be briefed on their plans for future-generation machines. Cray Research and ETA Systems continue to make progress toward developing more advanced supercomputers. The US supercomputer manufacturers remain dependent upon their emerging Japanese competitors for high-performance ICs, although progress has been made toward achieving more adequate domestic sourcing. Reports by the Defense Science Board and the National Security Council/Economic Policy Council, which addressed semiconductor issues, were completed during the year with advice and input from the committee. IBM has re-entered the supercomputer marketplace. The current 3090 series with expandable vector processing capability occupies a low-end position in the supercomputer performance spectrum. Subsequent development and marketing by IBM of more powerful machines would have an important and far-reaching impact on the domestic and world supercomputer markets. Computers with massively parallel architectures--thousands of processors--are entering the marketplace and are beginning to become more of a factor in the computational productivity scale.

  19. Assessment techniques for a learning-centered curriculum: evaluation design for adventures in supercomputing

    Energy Technology Data Exchange (ETDEWEB)

    Helland, B. [Ames Lab., IA (United States); Summers, B.G. [Oak Ridge National Lab., TN (United States)

    1996-09-01

    As the classroom paradigm shifts from being teacher-centered to being learner-centered, student assessments are evolving from typical paper and pencil testing to other methods of evaluation. Students should be probed for understanding, reasoning, and critical thinking abilities rather than their ability to return memorized facts. The assessment of the Department of Energy's pilot program, Adventures in Supercomputing (AiS), offers one example of assessment techniques developed for learner-centered curricula. This assessment has employed a variety of methods to collect student data. Methods of assessment used were traditional testing, performance testing, interviews, short questionnaires via email, and student presentations of projects. The data obtained from these sources have been analyzed by a professional assessment team at the Center for Children and Technology. The results have been used to improve the AiS curriculum and establish the quality of the overall AiS program. This paper will discuss the various methods of assessment used and the results.

  20. Massively-parallel electrical-conductivity imaging of hydrocarbonsusing the Blue Gene/L supercomputer

    Energy Technology Data Exchange (ETDEWEB)

    Commer, M.; Newman, G.A.; Carazzone, J.J.; Dickens, T.A.; Green, K.E.; Wahrmund, L.A.; Willen, D.E.; Shiu, J.

    2007-05-16

    Large-scale controlled source electromagnetic (CSEM) three-dimensional (3D) geophysical imaging is now receiving considerable attention for electrical conductivity mapping of potential offshore oil and gas reservoirs. To cope with the typically large computational requirements of the 3D CSEM imaging problem, our strategies exploit computational parallelism and optimized finite-difference meshing. We report on an imaging experiment, utilizing 32,768 tasks/processors on the IBM Watson Research Blue Gene/L (BG/L) supercomputer. Over a 24-hour period, we were able to image a large-scale marine CSEM field data set that previously required over four months of computing time on distributed clusters utilizing 1024 tasks on an Infiniband fabric. The total initial data misfit could be decreased by 67 percent within 72 completed inversion iterations, indicating an electrically resistive region in the southern survey area below a depth of 1500 m below the seafloor. The major part of the residual misfit stems from transmitter-parallel receiver components that have an offset from the transmitter sail line (broadside configuration). Modeling confirms that improved broadside data fits can be achieved by considering anisotropic electrical conductivities. While delivering a satisfactory gross-scale image for the depths of interest, the experiment provides important evidence for the necessity of discriminating between horizontal and vertical conductivities for maximally consistent 3D CSEM inversions.

  1. Distributed computing as a virtual supercomputer: Tools to run and manage large-scale BOINC simulations

    Science.gov (United States)

    Giorgino, Toni; Harvey, M. J.; de Fabritiis, Gianni

    2010-08-01

    Distributed computing (DC) projects tackle large computational problems by exploiting the donated processing power of thousands of volunteered computers, connected through the Internet. To efficiently employ the computational resources of one of world's largest DC efforts, GPUGRID, the project scientists require tools that handle hundreds of thousands of tasks which run asynchronously and generate gigabytes of data every day. We describe RBoinc, an interface that allows computational scientists to embed the DC methodology into the daily work-flow of high-throughput experiments. By extending the Berkeley Open Infrastructure for Network Computing (BOINC), the leading open-source middleware for current DC projects, with mechanisms to submit and manage large-scale distributed computations from individual workstations, RBoinc turns distributed grids into cost-effective virtual resources that can be employed by researchers in work-flows similar to conventional supercomputers. The GPUGRID project is currently using RBoinc for all of its in silico experiments based on molecular dynamics methods, including the determination of binding free energies and free energy profiles in all-atom models of biomolecules.

  2. Benchmarking Further Single Board Computers for Building a Mini Supercomputer for Simulation of Telecommunication Systems

    Directory of Open Access Journals (Sweden)

    Gábor Lencse

    2016-01-01

    Parallel Discrete Event Simulation (PDES) with the conservative synchronization method can be efficiently used for the performance analysis of telecommunication systems because of their good lookahead properties. For PDES, a cost-effective execution platform may be built using single board computers (SBCs), which offer relatively high computation capacity compared to their price or power consumption, and especially to the space they take up. A benchmarking method is proposed and its operation is demonstrated by benchmarking ten different SBCs, namely Banana Pi, Beaglebone Black, Cubieboard2, Odroid-C1+, Odroid-U3+, Odroid-XU3 Lite, Orange Pi Plus, Radxa Rock Lite, Raspberry Pi Model B+, and Raspberry Pi 2 Model B+. Their benchmarking results are compared to find out which one should be used for building a mini supercomputer for parallel discrete-event simulation of telecommunication systems. The SBCs are also used to build a heterogeneous cluster, and the performance of the cluster is tested, too.

  3. Bringing ATLAS production to HPC resources - A use case with the Hydra supercomputer of the Max Planck Society

    Science.gov (United States)

    Kennedy, J. A.; Kluth, S.; Mazzaferro, L.; Walker, Rodney

    2015-12-01

    The possible usage of HPC resources by ATLAS is now becoming viable due to the changing nature of these systems, and it is also very attractive due to the need for increasing amounts of simulated data. In recent years the architecture of HPC systems has evolved, moving away from specialized monolithic systems to a more generic Linux-type platform. This change means that the deployment of non-HPC-specific codes has become much easier. The timing of this evolution perfectly suits the needs of ATLAS and opens a new window of opportunity. The ATLAS experiment at CERN will begin a period of high-luminosity data taking in 2015. This high-luminosity phase will be accompanied by a need for increasing amounts of simulated data, which is expected to exceed the capabilities of the current Grid infrastructure. ATLAS aims to address this need by opportunistically accessing resources such as cloud and HPC systems. This paper presents the results of a pilot project undertaken by ATLAS and the MPP/RZG to provide access to the Hydra supercomputer facility. Hydra, the supercomputer of the Max Planck Society, is a Linux-based supercomputer with over 80,000 cores and 4,000 physical nodes located at the RZG near Munich. This paper describes the work undertaken to integrate Hydra into the ATLAS production system by using the Nordugrid ARC-CE and other standard Grid components. The customization of these components and the strategies for HPC usage are discussed, as well as possibilities for future directions.

  4. Seismic Sensors to Supercomputers: Internet Mapping and Computational Tools for Teaching and Learning about Earthquakes and the Structure of the Earth from Seismology

    Science.gov (United States)

    Meertens, C. M.; Seber, D.; Hamburger, M.

    2004-12-01

    The Internet has become an integral resource in the classrooms and homes of teachers and students. Widespread Web-access to seismic data and analysis tools enhances opportunities for teaching and learning about earthquakes and the structure of the earth from seismic tomography. We will present an overview and demonstration of the UNAVCO Voyager Java- and Javascript-based mapping tools (jules.unavco.org) and the Cornell University/San Diego Supercomputer Center (www.discoverourearth.org) Java-based data analysis and mapping tools. These map tools, datasets, and related educational websites have been developed and tested by collaborative teams of scientific programmers, research scientists, and educators. Dual-use by research and education communities ensures persistence of the tools and data, motivates on-going development, and encourages fresh content. With these tools are curricular materials and on-going evaluation processes that are essential for an effective application in the classroom. The map tools provide not only seismological data and tomographic models of the earth's interior, but also a wealth of associated map data such as topography, gravity, sea-floor age, plate tectonic motions and strain rates determined from GPS geodesy, seismic hazard maps, stress, and a host of geographical data. These additional datasets help to provide context and enable comparisons leading to an integrated view of the planet and the on-going processes that shape it. Emerging Cyberinfrastructure projects such as the NSF-funded GEON Information Technology Research project (www.geongrid.org) are developing grid/web services, advanced visualization software, distributed databases and data sharing methods, concept-based search mechanisms, and grid-computing resources for earth science and education. These developments in infrastructure seek to extend the access to data and to complex modeling tools from the hands of a few researchers to a much broader set of users. The GEON

  5. Influence of Earth crust composition on continental collision style in Precambrian conditions: Results of supercomputer modelling

    Science.gov (United States)

    Zavyalov, Sergey; Zakharov, Vladimir

    2016-04-01

    A number of issues concerning Precambrian geodynamics remain unsolved because of the uncertainty of many physical (thermal regime, lithosphere thickness, crust thickness, etc.) and chemical (mantle composition, crust composition) parameters, which differed considerably from their present-day values. In this work, we show results of numerical supercomputations based on a petrological and thermomechanical 2D model, which simulates the process of collision between two continental plates, each 80-160 km thick, with convergence rates ranging from 5 to 15 cm/year. In the model, the upper mantle temperature is 150-200 °C higher than the modern value, while the continental crust radiogenic heat production is higher than the present value by a factor of 1.5. These settings correspond to Archean conditions. The present study investigates the dependence of collision style on various continental crust parameters, especially on crust composition. The following three archetypal settings of continental crust composition are examined: 1) completely felsic continental crust; 2) basic lower crust and felsic upper crust; 3) basic upper crust and felsic lower crust (hereinafter referred to as inverted crust). Modeling results show that collision with completely felsic crust is unlikely. In the case of a basic lower crust, continental subduction and subsequent exhumation of continental rocks can take place; therefore, the formation of ultra-high-pressure metamorphic rocks is possible. Continental subduction also occurs in the case of inverted continental crust. However, in the latter case, the exhumation of felsic rocks is blocked by the upper basic layer, and their subsequent interaction depends on their volume ratio. Thus, if the total inverted crust thickness is about 15 km and the thicknesses of the two layers are equal, felsic rocks cannot be exhumed. If the total thickness is 30 to 40 km and that of the felsic layer is 20 to 25 km, it breaks through the basic layer, leading to ...

  6. Scalable geocomputation: evolving an environmental model building platform from single-core to supercomputers

    Science.gov (United States)

    Schmitz, Oliver; de Jong, Kor; Karssenberg, Derek

    2017-04-01

    There is an increasing demand to run environmental models at large scale: simulations over large areas at high resolution. The heterogeneity of available computing hardware, such as multi-core CPUs, GPUs, or supercomputers, potentially provides significant computing power to fulfil this demand. However, this requires detailed knowledge of the underlying hardware, parallel algorithm design, and their implementation in an efficient systems programming language. Domain scientists such as hydrologists or ecologists often lack this specific software engineering knowledge; their emphasis is (and should be) on the exploratory building and analysis of simulation models. As a result, models constructed by domain specialists mostly do not take full advantage of the available hardware. A promising solution is to separate the model-building activity from software engineering by offering domain specialists a model-building framework with pre-programmed building blocks that they combine to construct a model. The model-building framework, consequently, needs built-in capabilities to make full use of the available hardware. Developing such a framework that provides understandable code for domain scientists while being runtime efficient poses several challenges for its developers. For example, optimisations can be performed on individual operations or on the whole model, and tasks need to be generated for a well-balanced execution without explicitly knowing the complexity of the domain problem provided by the modeller. Ideally, a modelling framework supports the optimal use of available hardware for whichever combination of model building blocks scientists use. We demonstrate our ongoing work on developing parallel algorithms for spatio-temporal modelling and demonstrate 1) PCRaster, an environmental software framework (http://www.pcraster.eu) providing spatio-temporal model building blocks, and 2) parallelisation of about 50 of these building blocks using ...
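
    The building-block idea can be made concrete with a generic sketch: the modeller composes whole-raster operations, and the framework is then free to parallelise each operation internally. The operations below are hypothetical NumPy stand-ins, not PCRaster's actual API:

    ```python
    import numpy as np

    def local_op(raster, func):
        """Point-wise ('local') building block."""
        return func(raster)

    def window_mean(raster):
        """Focal ('window') building block: 4-neighbour mean."""
        p = np.pad(raster, 1, mode="edge")
        return (p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:]) / 4.0

    # The modeller's script stays declarative; no parallel code appears here,
    # so the framework could map each block to threads, GPUs, or MPI ranks.
    rain = np.random.rand(512, 512)
    capacity = local_op(rain, lambda r: np.full_like(r, 0.3))
    infiltration = np.minimum(window_mean(rain), capacity)
    runoff = rain - infiltration
    ```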

  7. Comparison of neuronal spike exchange methods on a Blue Gene/P supercomputer.

    Science.gov (United States)

    Hines, Michael; Kumar, Sameer; Schürmann, Felix

    2011-01-01

    For neural network simulations on parallel machines, interprocessor spike communication can be a significant portion of the total simulation time. The performance of several spike exchange methods using a Blue Gene/P (BG/P) supercomputer has been tested with 8K-128K cores using randomly connected networks of up to 32 M cells with 1 k connections per cell and 4 M cells with 10 k connections per cell, i.e., on the order of 4·10¹⁰ connections (K is 1024, M is 1024², and k is 1000). The spike exchange methods used are the standard Message Passing Interface (MPI) collective, MPI_Allgather, and several variants of the non-blocking Multisend method, either implemented via non-blocking MPI_Isend or exploiting the very low-overhead direct memory access (DMA) communication available on the BG/P. In all cases, the worst performing method was that using MPI_Isend, due to the high overhead of initiating a spike communication. The two best performing methods, the persistent Multisend method using the Record-Replay feature of the Deep Computing Messaging Framework (DCMF_Multicast), and a two-phase multisend in which a DCMF_Multicast first sends to a subset of phase-one destination cores, which then pass the spike on to their subset of phase-two destination cores, had similar performance, with very low overhead for the initiation of spike communication. Departure from ideal scaling for the Multisend methods is almost completely due to load imbalance caused by the large variation in the number of cells that fire on each processor in the interval between synchronizations. Spike exchange time itself is negligible, since transmission overlaps with computation and is handled by a DMA controller. We conclude that ideal performance scaling will ultimately be limited by the imbalance in incoming processor spikes between synchronization intervals. Thus, counterintuitively, maximization of load balance requires that the distribution of cells on processors should not reflect ...
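
    The simplest of the compared methods, the MPI_Allgather exchange, can be sketched as follows with mpi4py; the buffer size and cell identifiers are illustrative, and the DCMF multisend variants are not reproduced here:

    ```python
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    MAX_SPIKES = 64                                  # fixed per-rank buffer slots

    def exchange_spikes(fired_gids):
        """Every rank receives every rank's spikes for this interval."""
        send = np.full(MAX_SPIKES, -1, dtype=np.int64)
        send[:len(fired_gids)] = fired_gids          # pad unused slots with -1
        recv = np.empty(MAX_SPIKES * comm.Get_size(), dtype=np.int64)
        comm.Allgather(send, recv)
        return recv[recv >= 0]                       # drop the padding

    # Each rank contributes the global ids of its cells that fired.
    fired = np.array([comm.Get_rank() * 100 + 1], dtype=np.int64)
    all_spikes = exchange_spikes(fired)
    ```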

  8. Proposal of a Desk-Side Supercomputer with Reconfigurable Data-Paths Using Rapid Single-Flux-Quantum Circuits

    Science.gov (United States)

    Takagi, Naofumi; Murakami, Kazuaki; Fujimaki, Akira; Yoshikawa, Nobuyuki; Inoue, Koji; Honda, Hiroaki

    We propose a desk-side supercomputer with large-scale reconfigurable data-paths (LSRDPs) using superconducting rapid single-flux-quantum (RSFQ) circuits. It has several computing units, each consisting of a general-purpose microprocessor, an LSRDP, and a memory. An LSRDP consists of a large number (e.g., a few thousand) of floating-point units (FPUs) and operand routing networks (ORNs) that connect the FPUs. We reconfigure the LSRDP to fit a computation, i.e., a group of floating-point operations that appears in a 'for' loop of a numerical program, by setting the routes in the ORNs before the loop executes. We propose to implement the LSRDPs with RSFQ circuits; the processors and the memories can be implemented in semiconductor technology. We expect that a 10 TFLOPS supercomputer, together with its refrigerating engine, will be housed in a desk-side rack using a near-future RSFQ process technology, such as a 0.35 μm process.

  9. Coherent 40 Gb/s SP-16QAM and 80 Gb/s PDM-16QAM in an Optimal Supercomputer Optical Switch Fabric

    DEFF Research Database (Denmark)

    Karinou, Fotini; Borkowski, Robert; Zibar, Darko

    2013-01-01

    We demonstrate, for the first time, the feasibility of using 40 Gb/s SP-16QAM and 80 Gb/s PDM-16QAM in an optimized cell switching supercomputer optical interconnect architecture based on semiconductor optical amplifiers as ON/OFF gates....

  10. Car2x with software defined networks, network functions virtualization and supercomputers technical and scientific preparations for the Amsterdam Arena telecoms fieldlab

    NARCIS (Netherlands)

    Meijer R.J.; Cushing R.; De Laat C.; Jackson P.; Klous S.; Koning R.; Makkes M.X.; Meerwijk A.

    2015-01-01

    In the invited talk 'Car2x with SDN, NFV and supercomputers' we report on how our past work with SDN [1, 2] allows the design of a smart mobility fieldlab in the huge parking lot of the Amsterdam Arena. We explain how we can engineer and test software that handles the complex conditions of the Car2X ...

  12. Nonperturbative Lattice Simulation of High Multiplicity Cross Section Bound in $\phi^4_3$ on Beowulf Supercomputer

    CERN Document Server

    Charng, Y Y

    2001-01-01

    In this thesis, we have investigated the possibility of large cross sections at large multiplicity in weakly coupled three-dimensional $\phi^4$ theory using Monte Carlo simulation methods. We have built a Beowulf supercomputer for this purpose. We use spectral function sum rules to derive a bound on the total cross section, where the quantity determining the bound can be measured by Monte Carlo simulation in Euclidean space. We determine the critical threshold energy for a large high-multiplicity cross section according to the analysis of M.B. Voloshin and of E.N. Argyres, R.M.P. Kleiss, and C.G. Papadopoulos. We compare the simulation results with the perturbative results and see no evidence for a large cross section in the range where tree-diagram estimates suggest it should exist.

  13. Use of World Wide Web and NCSA Mosaic at Langley

    Science.gov (United States)

    Nelson, Michael

    1994-01-01

    A brief history of the use of the World Wide Web at Langley Research Center is presented along with architecture of the Langley Web. Benefits derived from the Web and some Langley projects that have employed the World Wide Web are discussed.

  14. CERN LHC events simulated using NCSA, TeraGrid systems

    CERN Multimedia

    2006-01-01

    "A team of physicists at the California Institute of Technology and the university of California, San Diego, is on the hunt for the Higgs boson, the subatomic particle thought to be responsible for mass."(1/2 page)

  15. High Performance Simulation of Large-Scale Red Sea Ocean Bottom Seismic Data on the Supercomputer Shaheen II

    KAUST Repository

    Tonellot, Thierry

    2017-02-27

    A combination of both shallow and deepwater, plus islands and coral reefs, are some of the main features contributing to the complexity of subsalt seismic exploration in the Red Sea transition zone. These features often result in degrading effects on seismic images. State-of-the-art ocean bottom acquisition technologies are therefore required to record seismic data with optimal fold and offset, as well as advanced processing and imaging techniques. Numerical simulations of such complex seismic data can help improve acquisition design and also help in customizing, validating and benchmarking the processing and imaging workflows that will be applied on the field data. Subsequently, realistic simulation of wave propagation is a computationally intensive process requiring a realistic model and an efficient 3D wave equation solver. Large-scale computing resources are also required to meet turnaround time compatible with a production time frame. In this work, we present the numerical simulation of an ocean bottom seismic survey to be acquired in the Red Sea transition zone starting in summer 2016. The survey's acquisition geometry comprises nearly 300,000 unique shot locations and 21,000 unique receiver locations, covering about 760 km². Using well log measurements and legacy 2D seismic lines in this area, a 3D P-wave velocity model was built, with a maximum depth of 7 km. The model was sampled at 10 m in each direction, resulting in more than 5 billion cells. Wave propagation in this model was performed using a 3D finite difference solver in the time domain based on a staggered grid velocity-pressure formulation of acoustodynamics. To ensure that the resulting data could be generated sufficiently fast, the King Abdullah University of Science and Technology (KAUST) supercomputer Shaheen II Cray XC40 was used. A total of 21,000 three-component (pressure and vertical and horizontal velocity) common receiver gathers with a 50 Hz maximum frequency were computed in less ...

  16. Hurricane Modeling and Supercomputing: Can a global mesoscale model be useful in improving forecasts of tropical cyclogenesis?

    Science.gov (United States)

    Shen, B.; Tao, W.; Atlas, R.

    2007-12-01

    Hurricane modeling, along with guidance from observations, has been used to help construct hurricane theories since the 1960s. CISK (conditional instability of the second kind; Charney and Eliassen 1964; Ooyama 1964, 1969) and WISHE (wind-induced surface heat exchange; Emanuel 1986) are among the well-known theories used to understand hurricane intensification. For hurricane genesis, observations have indicated the importance of large-scale flows (e.g., the Madden-Julian Oscillation or MJO; Maloney and Hartmann, 2000) in the modulation of hurricane activity. Recent modeling studies have focused on the role of the MJO and Rossby waves (e.g., Ferreira and Schubert, 1996; Aiyyer and Molinari, 2003) and/or the interaction of small-scale vortices (e.g., Holland 1995; Simpson et al. 1997; Hendricks et al. 2004), the determinism of which could also be set by large-scale flows. The aforementioned studies suggest a unified view of hurricane formation, consisting of multiscale processes such as scale transition (e.g., from the MJO to equatorial Rossby waves and from waves to vortices) and scale interactions among vortices, convection, and surface heat and moisture fluxes. To depict these processes in the unified view, a high-resolution global model is needed. During the past several years, supercomputers have enabled the deployment of ultra-high-resolution global models, obtaining remarkable forecasts of hurricane track and intensity (Atlas et al. 2005; Shen et al. 2006). In this work, hurricane genesis is investigated with the aid of a global mesoscale model on the NASA Columbia supercomputer by conducting numerical experiments on the genesis of six consecutive tropical cyclones (TCs) in May 2002. These TCs include two pairs of twin TCs in the Indian Ocean, Supertyphoon Hagibis in the West Pacific Ocean, and Hurricane Alma in the East Pacific Ocean. It is found that the model is capable of predicting the genesis of five of these TCs about two to three days in advance. Our ...

  17. An efficient highly parallel implementation of a large air pollution model on an IBM blue gene supercomputer

    Science.gov (United States)

    Ostromsky, Tz.; Georgiev, K.; Zlatev, Z.

    2012-10-01

    In this paper we discuss an efficient distributed-memory parallelization strategy for the Unified Danish Eulerian Model (UNI-DEM). We apply an improved decomposition strategy to the spatial domain in order to get more parallel tasks (based on the larger number of subdomains) with less communication between them (due to optimization of the overlapping area when the advection-diffusion problem is solved numerically). This kind of rectangular block partitioning (with a square-shape trend) allows us not only to increase significantly the number of potential parallel tasks, but also to reduce the local memory requirements per task, which is critical for the distributed-memory implementation of the higher-resolution/finer-grid versions of UNI-DEM on some parallel systems, and particularly on the IBM BlueGene/P platform, our target hardware. We show by experiments that our new parallel implementation can use the resources of the powerful IBM BlueGene/P supercomputer, the largest in Bulgaria, rather efficiently, up to its full capacity. It proved extremely useful in the large and computationally expensive numerical experiments carried out to calculate initial data for a sensitivity analysis of the Danish Eulerian model.
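
    The partitioning trade-off described above, that more square-shaped blocks reduce the overlap (halo) communication for a fixed number of subdomains, is easy to quantify; the sketch below is a generic perimeter count, not UNI-DEM's decomposition code:

    ```python
    def halo_cells(nx, ny, px, py, overlap=1):
        """Halo cells exchanged for a px-by-py block partition of an
        nx-by-ny grid (domain-boundary savings ignored)."""
        sub_x, sub_y = nx / px, ny / py
        return px * py * 2 * (sub_x + sub_y) * overlap   # sum of block perimeters

    nx = ny = 480
    for px, py in [(16, 1), (8, 2), (4, 4)]:   # 16 tasks in three shapes
        print(px, py, halo_cells(nx, ny, px, py))
    # Prints 16320.0, 9600.0, 7680.0: the square-shape trend (4x4)
    # minimises the total perimeter and hence the communication volume.
    ```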

  18. The ASCI Network for SC '99: A Step on the Path to a 100 Gigabit Per Second Supercomputing Network

    Energy Technology Data Exchange (ETDEWEB)

    PRATT,THOMAS J.; TARMAN,THOMAS D.; MARTINEZ,LUIS M.; MILLER,MARC M.; ADAMS,ROGER L.; CHEN,HELEN Y.; BRANDT,JAMES M.; WYCKOFF,PETER S.

    2000-07-24

    This document highlights the DISCOM² distance computing and communication team's activities at the 1999 Supercomputing conference in Portland, Oregon. This conference is sponsored by the IEEE and ACM. Sandia, Lawrence Livermore, and Los Alamos National Laboratories have participated in this conference for eleven years. For the last four years the three laboratories have come together at the conference under the DOE's ASCI (Accelerated Strategic Computing Initiative) rubric. Communication support for the ASCI exhibit is provided by the ASCI DISCOM² project. The DISCOM² communication team uses this forum to demonstrate and focus communication and networking developments within the community. At SC 99, DISCOM built a prototype of the next-generation ASCI network, demonstrated remote clustering techniques, demonstrated the capabilities of emerging terabit router products, demonstrated the latest technologies for delivering visualization data to scientific users, and demonstrated the latest encryption methods, including IP VPN technologies and ATM encryption research. The authors also coordinated the other production networking activities within the booth and between their demonstration partners on the exhibit floor. This paper documents those accomplishments, discusses the details of their implementation, and describes how these demonstrations support Sandia's overall strategies in ASCI networking.

  19. BLAS (Basic Linear Algebra Subroutines), linear algebra modules, and supercomputers. Technical report for period ending 15 December 1984

    Energy Technology Data Exchange (ETDEWEB)

    Rice, J.R.

    1984-12-31

    On October 29 and 30, 1984, about 20 people met at Purdue University to consider extensions to the Basic Linear Algebra Subroutines (BLAS) and linear algebra software modules in general. The need for these extensions and new sets of modules is largely due to the advent of new supercomputer architectures, which make it difficult for ordinary coding techniques to achieve even a significant fraction of the potential computing power. The workshop format was one of informal presentations with ample discussion, followed by sessions of general discussion of the issues raised. This report is a summary of the presentations, the issues raised, the conclusions reached, and the open-issue discussions. Each participant had an opportunity to comment on this report, but it also clearly reflects the author's filtering of the extensive discussions. Section 2 describes seven proposals for linear algebra software modules, and Section 3 describes four presentations on the use of such modules. Discussion summaries are given next: Section 4 covers those topics where near consensus was reached, and Section 5 those where the issues were left open.

  20. Supercomputations and big-data analysis in strong-field ultrafast optical physics: filamentation of high-peak-power ultrashort laser pulses

    Science.gov (United States)

    Voronin, A. A.; Panchenko, V. Ya; Zheltikov, A. M.

    2016-06-01

    High-intensity ultrashort laser pulses propagating in gas media or in condensed matter undergo complex nonlinear spatiotemporal evolution where temporal transformations of optical field waveforms are strongly coupled to an intricate beam dynamics and ultrafast field-induced ionization processes. At the level of laser peak powers orders of magnitude above the critical power of self-focusing, the beam exhibits modulation instabilities, producing random field hot spots and breaking up into multiple noise-seeded filaments. This problem is described by a (3+1)-dimensional nonlinear field evolution equation, which needs to be solved jointly with the equation for ultrafast ionization of a medium. Analysis of this problem, which is equivalent to solving a billion-dimensional evolution problem, is only possible by means of supercomputer simulations augmented with coordinated big-data processing of large volumes of information acquired through theory-guiding experiments and supercomputations. Here, we review the main challenges of supercomputations and big-data processing encountered in strong-field ultrafast optical physics and discuss strategies to confront these challenges.
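
    The field evolution equation referred to above belongs to the nonlinear Schrödinger family; its numerical core can be illustrated with a 1D split-step Fourier toy model (purely illustrative, vastly smaller than the (3+1)-dimensional supercomputer problem described):

    ```python
    import numpy as np

    # 1D toy: i u_t + (1/2) u_xx + |u|^2 u = 0, solved by Strang splitting.
    n, L, dt, steps = 1024, 40.0, 1e-3, 2000
    x = np.linspace(-L / 2, L / 2, n, endpoint=False)
    k = 2 * np.pi * np.fft.fftfreq(n, d=L / n)
    u = 2.0 / np.cosh(2.0 * x)                      # bright-soliton initial data

    half_disp = np.exp(-0.25j * k**2 * dt)          # half-step of dispersion
    for _ in range(steps):
        u = np.fft.ifft(half_disp * np.fft.fft(u))  # linear half-step
        u *= np.exp(1j * np.abs(u)**2 * dt)         # full nonlinear phase step
        u = np.fft.ifft(half_disp * np.fft.fft(u))  # linear half-step
    ```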

  1. Supercomputing for molecular dynamics simulations handling multi-trillion particles in nanofluidics

    CERN Document Server

    Heinecke, Alexander; Horsch, Martin; Bungartz, Hans-Joachim

    2015-01-01

    This work presents modern implementations of relevant molecular dynamics algorithms using ls1 mardyn, a simulation program for engineering applications. The text focuses strictly on HPC-related aspects, covering implementation on HPC architectures and taking Intel Xeon and Intel Xeon Phi clusters as representatives of current platforms. The work describes distributed and shared-memory parallelization on these platforms, including load balancing, with a particular focus on the efficient implementation of the compute kernels. The text also discusses the software architecture of the resulting code.

  2. Application experiences with the Globus toolkit.

    Energy Technology Data Exchange (ETDEWEB)

    Brunett, S.

    1998-06-09

    The Globus grid toolkit is a collection of software components designed to support the development of applications for high-performance distributed computing environments, or "computational grids" [14]. The Globus toolkit is an implementation of a "bag of services" architecture, which provides application and tool developers not with a monolithic system but rather with a set of stand-alone services. Each Globus component provides a basic service, such as authentication, resource allocation, information, communication, fault detection, or remote data access. Different applications and tools can combine these services in different ways to construct "grid-enabled" systems. The Globus toolkit has been used to construct the Globus Ubiquitous Supercomputing Testbed, or GUSTO: a large-scale testbed spanning 20 sites and including over 4,000 compute nodes, for a total compute power of over 2 TFLOPS. Over the past six months, we and others have used this testbed to conduct a variety of application experiments, including multi-user collaborative environments (tele-immersion), computational steering, distributed supercomputing, and high-throughput computing. The goal of this paper is to review what has been learned from these experiments regarding the effectiveness of the toolkit approach. To this end, we describe two of the application experiments in detail, noting what worked well and what worked less well. The two applications are a distributed supercomputing application, SF-Express, in which multiple supercomputers are harnessed to perform large distributed interactive simulations, and a tele-immersion application, CAVERNsoft, in which the focus is on connecting multiple people to a distributed simulated world.

  3. Report for CS 698-95, "Directed Research - Performance Modeling": Using Queueing Network Modeling to Analyze the University of San Francisco Keck Cluster Supercomputer

    Energy Technology Data Exchange (ETDEWEB)

    Elliott, M L

    2005-09-28

    In today's world, the need for computing power is becoming more pressing daily. Our need to process, analyze, and store data is quickly exceeding the capabilities of small self-contained serial machines, such as the modern desktop PC. Initially, this gap was filled by the creation of supercomputers: large-scale self-contained parallel machines. However, current markets, as well as the costs to develop and maintain such machines, are quickly making such machines a rarity, used only in highly specialized environments. A third type of machine exists, however. This relatively new type of machine, known as a cluster, is built from common, and often inexpensive, commodity self-contained desktop machines. But how well do these clustered machines work? There have been many attempts to quantify the performance of clustered computers. One approach, Queueing Network Modeling (QNM), appears to be a potentially useful and rarely tried method of modeling such systems. QNM, which has its beginnings in the modeling of traffic patterns, has expanded, and is now used to model everything from CPU and disk services, to computer systems, to service rates in store checkout lines. This history of successful usage, as well as the correspondence of QNM components to commodity clusters, suggests that QNM can be a useful tool for both the cluster designer, interested in the best value for the cost, and the user of existing machines, interested in performance rates and time-to-solution. So, what is QNM? Queueing Network Modeling is an approach to computer system modeling where the computer is represented as a network of queues and evaluated analytically. How does this correspond to clusters? There is a neat one-to-one relationship between the components of a QNM model and a cluster. For example: A cluster is made from a combination of computational nodes and network switches. Both of these fit nicely with the QNM descriptions of service centers (delay, queueing, and load
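
    The "evaluated analytically" step that QNM rests on can be shown with exact Mean Value Analysis for a closed, single-class network; this is the textbook algorithm, not the report's actual Keck-cluster model, and the service demands below are invented:

    ```python
    def mva(service_demands, n_users, think_time=0.0):
        """Exact Mean Value Analysis for a closed, single-class queueing
        network; service_demands[k] = visit ratio x service time at center k.
        Returns system throughput and per-center queue lengths."""
        q = [0.0] * len(service_demands)
        for n in range(1, n_users + 1):
            # Response time per center: service inflated by the queue found there.
            r = [d * (1 + q[k]) for k, d in enumerate(service_demands)]
            x = n / (think_time + sum(r))        # throughput, by Little's law
            q = [x * rk for rk in r]             # updated queue lengths
        return x, q

    # Toy cluster model: a "compute node" center and a "network switch" center.
    throughput, queues = mva([0.05, 0.02], n_users=32)
    ```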

  4. Modeling cardiovascular hemodynamics using the lattice Boltzmann method on massively parallel supercomputers

    Science.gov (United States)

    Randles, Amanda Elizabeth

    Accurate and reliable modeling of cardiovascular hemodynamics has the potential to improve understanding of the localization and progression of heart diseases, which are currently the most common cause of death in Western countries. However, building a detailed, realistic model of human blood flow is a formidable mathematical and computational challenge. The simulation must combine the motion of the fluid, the intricate geometry of the blood vessels, continual changes in flow and pressure driven by the heartbeat, and the behavior of suspended bodies such as red blood cells. Such simulations can provide insight into factors like endothelial shear stress that act as triggers for the complex biomechanical events that can lead to atherosclerotic pathologies. Currently, it is not possible to measure endothelial shear stress in vivo, making these simulations a crucial component to understanding and potentially predicting the progression of cardiovascular disease. In this thesis, an approach for efficiently modeling the fluid movement coupled to the cell dynamics in real-patient geometries while accounting for the additional force from the expansion and contraction of the heart will be presented and examined. First, a novel method to couple a mesoscopic lattice Boltzmann fluid model to the microscopic molecular dynamics model of cell movement is elucidated. A treatment of red blood cells as extended structures, a method to handle highly irregular geometries through topology driven graph partitioning, and an efficient molecular dynamics load balancing scheme are introduced. These result in a large-scale simulation of the cardiovascular system, with a realistic description of the complex human arterial geometry, from centimeters down to the spatial resolution of red-blood cells. The computational methods developed to enable scaling of the application to 294,912 processors are discussed, thus empowering the simulation of a full heartbeat. Second, further extensions to enable
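
    The fluid half of such a coupled model is the lattice Boltzmann method; its kernel can be sketched with a minimal D2Q9 BGK step on a periodic box (an illustration of the method only, not the thesis implementation):

    ```python
    import numpy as np

    c = np.array([[0,0],[1,0],[0,1],[-1,0],[0,-1],[1,1],[-1,1],[-1,-1],[1,-1]])
    w = np.array([4/9] + [1/9]*4 + [1/36]*4)         # D2Q9 lattice weights
    nx, ny, tau = 64, 64, 0.8                        # grid size and relaxation time

    def equilibrium(rho, ux, uy):
        cu = c[:, 0, None, None] * ux + c[:, 1, None, None] * uy
        return rho * w[:, None, None] * (1 + 3*cu + 4.5*cu**2 - 1.5*(ux**2 + uy**2))

    f = equilibrium(np.ones((nx, ny)), np.zeros((nx, ny)), np.zeros((nx, ny)))
    for _ in range(100):
        rho = f.sum(axis=0)                          # macroscopic density
        ux = (f * c[:, 0, None, None]).sum(axis=0) / rho
        uy = (f * c[:, 1, None, None]).sum(axis=0) / rho
        f += (equilibrium(rho, ux, uy) - f) / tau    # BGK collision
        for i in range(9):                           # streaming along lattice links
            f[i] = np.roll(np.roll(f[i], c[i, 0], axis=0), c[i, 1], axis=1)
    ```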

  5. New Mexico High School Supercomputing Challenge, 1990--1995: Five years of making a difference to students, teachers, schools, and communities. Progress report

    Energy Technology Data Exchange (ETDEWEB)

    Foster, M.; Kratzer, D.

    1996-02-01

    The New Mexico High School Supercomputing Challenge is an academic program dedicated to increasing interest in science and math among high school students by introducing them to high performance computing. This report provides a summary and evaluation of the first five years of the program, describes the program and shows the impact that it has had on high school students, their teachers, and their communities. Goals and objectives are reviewed and evaluated, growth and development of the program are analyzed, and future directions are discussed.

  6. Super-computer architecture

    CERN Document Server

    Hockney, R W

    1977-01-01

    This paper examines the design of top-of-the-range, scientific, number-crunching computers. The market for such computers is not as large as that for smaller machines, but on the other hand it is by no means negligible. The present work-horse machines in this category are the CDC 7600 and IBM 360/195, and over fifty of the former machines have been sold. The types of installation that form the market for such machines are not only the major scientific research laboratories in the major countries (such as Los Alamos, CERN, and the Rutherford laboratory) but also major universities or university networks. It is also true that, as with sports cars, innovations made to satisfy the top of the market today often become the standard for the medium-scale computer of tomorrow. Hence there is considerable interest in examining present developments in this area. (0 refs).

  7. Associative Memories for Supercomputers

    Science.gov (United States)

    1992-12-01

    ... the Fast Fourier Transform (FFT) is computed. The real part is extracted and a bias equal to its minimum is added to it in order to make all the values positive. [The remainder of this record is OCR residue of a French figure caption; translated: "... mask number one ... Figure 12: Photograph of the reconstruction obtained with the IOCDL plate corresponding to the binary phase, in rotation, showing ..."]

  8. MPI/OpenMP Hybrid Parallel Algorithm of Resolution of Identity Second-Order Møller-Plesset Perturbation Calculation for Massively Parallel Multicore Supercomputers.

    Science.gov (United States)

    Katouda, Michio; Nakajima, Takahito

    2013-12-10

    A new algorithm for massively parallel calculations of the electron correlation energy of large molecules, based on the resolution-of-identity second-order Møller-Plesset perturbation (RI-MP2) technique, is developed and implemented in the quantum chemistry software NTChem. In this algorithm, a Message Passing Interface (MPI) and Open Multi-Processing (OpenMP) hybrid parallel programming model is applied to attain efficient parallel performance on massively parallel supercomputers. An in-core storage scheme for the intermediate data of the three-center electron repulsion integrals, utilizing the distributed memory, is developed to eliminate input/output (I/O) overhead. The parallel performance of the algorithm is tested on massively parallel supercomputers such as the K computer (using up to 45 992 central processing unit (CPU) cores) and a commodity Intel Xeon cluster (using up to 8192 CPU cores). The parallel RI-MP2/cc-pVTZ calculation of two-layer nanographene sheets (C150H30)2 (number of atomic orbitals: 9640) is performed using 8991 nodes and 71 288 CPU cores of the K computer.
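
    The distributed in-core storage idea, keeping integral batches in the ranks' aggregate memory instead of on disk, can be sketched with mpi4py; the tensor shapes, round-robin distribution, and final contraction are placeholders, not the NTChem implementation:

    ```python
    import numpy as np
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
    n_aux, n_occ, n_virt = 1200, 40, 400             # hypothetical dimensions

    # Each rank computes and *keeps* its slice of the three-center tensor,
    # eliminating the I/O that a disk-based scheme would incur.
    n_local = len(range(rank, n_aux, size))          # round-robin share of aux functions
    local_b = np.random.rand(n_local, n_occ, n_virt) # stand-in for real integrals

    # Later contractions reduce partial results across ranks instead of reading files.
    e_local = np.einsum('pia,pia->', local_b, local_b)   # placeholder contraction
    e_total = comm.allreduce(e_local, op=MPI.SUM)
    ```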

  9. Building the interspace: Digital library infrastructure for a University Engineering Community

    Energy Technology Data Exchange (ETDEWEB)

    Schatz, B.

    1995-12-31

    A large-scale digital library is being constructed and evaluated at the University of Illinois, with the goal of bringing professional search and display to Internet information services. A testbed planned to grow to 10K documents and 100K users is being constructed in the Grainger Engineering Library Information Center, as a joint effort of the University Library and the National Center for Supercomputing Applications (NCSA), with evaluation and research by the Graduate School of Library and Information Science and the Department of Computer Science. The electronic collection will be articles from engineering and science journals and magazines, obtained directly from publishers in SGML format and displayed containing all text, figures, tables, and equations. The publisher partners include IEEE Computer Society, AIAA (Aerospace Engineering), American Physical Society, and Wiley & Sons. The software will be based upon NCSA Mosaic as a network engine connected to commercial SGML displayers and full-text searchers. The users will include faculty/students across the midwestern universities in the Big Ten, with evaluations via interviews, surveys, and transaction logs. Concurrently, research into scaling the testbed is being conducted. This includes efforts in computer science, information science, library science, and information systems. These efforts will evaluate different semantic retrieval technologies, including automatic thesaurus and subject classification graphs. New architectures will be designed and implemented for a next generation digital library infrastructure, the Interspace, which supports interaction with information spread across information spaces within the Net.

  10. Use of QUADRICS supercomputer as embedded simulator in emergency management systems; Utilizzo del calcolatore QUADRICS come simulatore in linea in un sistema di gestione delle emergenze

    Energy Technology Data Exchange (ETDEWEB)

    Bove, R.; Di Costanzo, G.; Ziparo, A. [ENEA, Centro Ricerche Casaccia, Rome (Italy). Dip. Energia

    1996-07-01

    The experience gained in implementing MRBT, an atmospheric dispersion model for short-duration releases, is reported. The model was implemented on a QUADRICS-Q1 supercomputer. A description of the MRBT model is given first: it is an analytical model for studying the dispersion of light gases released into the atmosphere by accidents. The solution of the diffusion equation is Gaussian-like and yields the concentration of the released pollutant as a function of space and time. The QUADRICS architecture is then introduced, and the implementation of the model is described. Finally, the integration of the QUADRICS-based model as an on-line simulator in an emergency management system is considered.
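
    A standard Gaussian puff solution of the kind referred to has the textbook form below (a generic expression; the report's exact parametrisation is not given here):

    ```latex
    C(x,y,z,t) = \frac{Q}{(2\pi)^{3/2}\,\sigma_x\sigma_y\sigma_z}
    \exp\!\left[-\frac{(x-\bar{u}t)^2}{2\sigma_x^2}\right]
    \exp\!\left[-\frac{y^2}{2\sigma_y^2}\right]
    \exp\!\left[-\frac{z^2}{2\sigma_z^2}\right],
    ```

    where $Q$ is the released mass, $\bar{u}$ the mean wind speed, and $\sigma_x, \sigma_y, \sigma_z$ the dispersion coefficients, so the concentration is indeed a function of space and time.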

  11. Sandia's network for Supercomputing '94: Linking the Los Alamos, Lawrence Livermore, and Sandia National Laboratories using switched multimegabit data service

    Energy Technology Data Exchange (ETDEWEB)

    Vahle, M.O.; Gossage, S.A.; Brenkosh, J.P. [Sandia National Labs., Albuquerque, NM (United States). Advanced Networking Integration Dept.

    1995-01-01

    Supercomputing '94, a high-performance computing and communications conference, was held November 14th through 18th, 1994 in Washington DC. For the past four years, Sandia National Laboratories has used this conference to showcase and focus its communications and networking endeavors. At the 1994 conference, Sandia built a Switched Multimegabit Data Service (SMDS) network running at 44.736 megabits per second linking its private SMDS network between its facilities in Albuquerque, New Mexico and Livermore, California to the convention center in Washington, D.C. For the show, the network was also extended from Sandia, New Mexico to Los Alamos National Laboratory and from Sandia, California to Lawrence Livermore National Laboratory. This paper documents and describes this network and how it was used at the conference.

  12. Tera Scale Systems and Applications

    Science.gov (United States)

    Niggley, Chuck; Ciotti, Bob; Parks, John W. (Technical Monitor)

    2002-01-01

    This presentation discusses NASA's efforts to develop tera-scale systems designed to push the envelope of supercomputing research. Topics covered include: NASA's existing supercomputing facilities and capabilities, NASA's computational challenges in developing these systems, the development of a production supercomputer, and potential research projects that could benefit from these types of systems.

  13. Research on Customer Segmentation and Differentiated Services in a Supercomputing Center

    Institute of Scientific and Technical Information of China (English)

    赵芸卿

    2013-01-01

    This paper takes the customer service work of the Supercomputing Center of the Computer Network Information Center, Chinese Academy of Sciences (hereinafter, the Supercomputing Center) as its subject. It applies the K-means method to the customers' supercomputer renting records, segmenting the customers into a number of groups, and proposes a differentiated service strategy for each group. Implementing these differentiated service strategies allows supercomputing resources to be allocated more effectively and customer service to be delivered more conveniently.
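
    The segmentation step can be sketched with scikit-learn's K-means; the feature columns and cluster count are invented for illustration and do not come from the paper's renting records:

    ```python
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.preprocessing import StandardScaler

    # Hypothetical per-customer features:
    # [core-hours/month, typical job size (cores), jobs/month, annual spend]
    X = np.array([[5e4,  512,  40, 30000],
                  [2e2,   16, 120,   400],
                  [1e5, 2048,  10, 80000],
                  [3e2,   32,  90,   600]], dtype=float)

    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(
        StandardScaler().fit_transform(X))
    # Each resulting group can then be matched to a service tier
    # (support level, reservation policy, pricing), as the paper proposes.
    ```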

  14. Implementation of a distributed adaptive routing algorithm on the intel IPSC (Intel Personal Supercomputer). Master's thesis

    Energy Technology Data Exchange (ETDEWEB)

    Farinelli, T.C.

    1987-12-01

    The purpose of this study was to examine the use of distributed adaptive routing algorithms on concurrent-class computers. The implemented routing algorithm allowed each node to select the next node based on two criteria: the fewest number of hops and the smallest delay time. This study was limited to the comparison of a distributed adaptive routing algorithm, implemented at the applications layer, with the current static routing and with a simulation of the current routing implemented at the applications layer. The comparison with the simulated current static routing provides a measure of the possible performance gain had the adaptive routing algorithm been implemented at the network layer. Each of the three configurations comprised four processes: a Host Process, a Routing Process, a Ring Control Process, and a Network Loading Process. The Host Process controlled the loading of the processes onto the IPSC, the Routing Process controlled the message routing, the Ring Control Process provided the baseline message passing, and the Network Loading Process provided communications congestion on selected links. The metric used to compare the Routing Process performance was the average delay time for passing a message around the ring.
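
    The two-criterion next-hop rule described above can be captured in one line: prefer the neighbour with the fewest remaining hops and break ties on measured delay. The tables below are illustrative, not the thesis code:

    ```python
    def next_hop(neighbors, hops_to_dest, delay):
        """Pick the neighbour minimising (hop count, link delay), in that order."""
        return min(neighbors, key=lambda n: (hops_to_dest[n], delay[n]))

    hops = {1: 2, 2: 2, 3: 3}                 # remaining hops via each neighbour
    link_delay = {1: 0.8, 2: 0.3, 3: 0.1}     # measured link delays
    assert next_hop([1, 2, 3], hops, link_delay) == 2   # fewest hops, then delay
    ```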

  15. DCA++: A case for science driven application development for leadership computing platforms

    Energy Technology Data Exchange (ETDEWEB)

    Summers, Michael S; Alvarez, Gonzalo; Meredith, Jeremy; Maier, Thomas A [Computer Science and Mathematics Division, Oak Ridge National Laboratory, P. O. Box 2008, Mail Stop 6164, Oak Ridge, TN 37831 (United States)]; Schulthess, Thomas C, E-mail: schulthess@cscs.c [Swiss National Supercomputer Center and Institute for Theoretical Physics, ETH Zurich, CSCS MAN E 133, Galeria 2, CH-9628 Manno (Switzerland)]

    2009-07-01

    The DCA++ code was one of the early science applications that ran on Jaguar at the National Center for Computational Sciences, and the first application code to sustain a petaflop/s under production conditions on a general-purpose supercomputer. The code implements a quantum cluster method with a Quantum Monte Carlo kernel to solve the 2D Hubbard model for high-temperature superconductivity. It is implemented in C++, making heavy use of the generic programming model. In this paper, we discuss how this code was developed, reaching scalability and high efficiency on the world's fastest supercomputer in only a few years. We show how the use of generic concepts combined with systematic refactoring of codes is a better strategy for computational sciences than a comprehensive upfront design.

  16. Harnessing the power of the new SMP cluster architecture

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, S E; Cohen, R H; Curtis, B C; Dannevik, W P; Dimits, A M; Dinge, D; Eliason, D E; Hodsons, S; Jacobs, M; Mirin, A A; Porter, D H; Ruwart, T; Synne, I; Winkler, K; Woodward, P R

    1999-06-16

    In 1993, members of our team collaborated with Silicon Graphics to perform the first full-scale demonstration of the computational power of the SMP cluster supercomputer architecture. That demonstration involved the simulation of homogeneous, compressible turbulence on a uniform grid of a billion cells, using our PPM gas dynamics code. This computation was embarrassingly parallel, the ideal test case, and it achieved only 4.9 Gflop/s performance, slightly over half that achievable by this application on the most expensive supercomputers of that day. After four to five solid days of computation, when the prototype machine had to be dismantled, the simulation was only about 20% completed. Nevertheless, this computation gave us important new insights into compressible turbulence and also into a powerful new mode of cost-effective, commercially sustainable supercomputing. In the intervening 6 years, the SMP cluster architecture has become a fundamental strategy for several large supercomputer centers in the US, including the DOE's ASCI centers at Los Alamos National Laboratory and at the Lawrence Livermore National Laboratory and the NSF's NCSA center at the University of Illinois. This SMP cluster architecture now underlies product offerings at the high end of performance from SGI, IBM, and HP, among others. Nevertheless, despite many successes, it is our opinion that the computational science community is only now beginning to exploit the full promise of these new computing platforms. In this paper, we will briefly discuss two key architectural issues, vector computing and the flat multiprocessor architecture, which continue to drive spirited discussions among computational scientists, and then we will describe the hierarchical shared memory programming paradigm that we feel is best suited to the creative use of SMP cluster systems. Finally, we will give examples of recent large-scale simulations carried out by our team on these kinds of systems…

  17. Introducing "É VIVO! Virtual Eruptions on a Supercomputer". A DVD aimed at sharing results from numerical simulations of explosive eruptions

    Science.gov (United States)

    de'Michieli Vitturi, M.; Todesco, M.; Neri, A.; Esposti Ongaro, T.; Tola, E.; Rocco, G.

    2011-12-01

    We present a new DVD in the INGV outreach series, aimed at illustrating our research work on pyroclastic flow modeling. Pyroclastic flows (or pyroclastic density currents) are hot, devastating clouds of gas and ash generated during explosive eruptions. Understanding their dynamics and impact is crucial for proper hazard assessment. We employ a 3D numerical model which describes the main features of the multi-phase and multi-component process, from the generation of the flows to their propagation along complex terrains. Our numerical results can be translated into color animations, which describe the temporal evolution of flow variables such as temperature or ash concentration. The animations provide a detailed and effective description of the natural phenomenon, which can be used to present this geological process to a general public and to improve hazard perception in volcanic areas. In our DVD, the computer animations are introduced and commented on by professionals and researchers who deal at various levels with the study of pyroclastic flows and their impact. Their comments are recorded as short interviews and edited into a short video (about 10 minutes) which describes the natural process, as well as the model and its applications to explosive volcanoes such as Vesuvio, Campi Flegrei, Mt. St. Helens and Soufriere Hills (Montserrat). The ensemble of different voices and faces conveys a direct sense of the multi-disciplinary effort involved in the assessment of pyroclastic flow hazard. The video also introduces the people who address this complex problem, and their personal involvement beyond the scientific results. The full, uncommented animations of pyroclastic flow propagation in the different volcanic settings are also provided on the DVD, which is meant to be a general, flexible outreach tool.

  18. MAEviz: Seismic Risk Assessment Environment - bridging the gap between research and practice

    Science.gov (United States)

    Navarro, C.; Lee, J. S.; Tolbert, N.; Hampton, S.; McLaren, T.; Myers, J.

    2008-12-01

    In the field of hazard risk assessment, a new generation of tools is needed to allow researchers and practicing engineers to leverage investments in new methodologies and software infrastructure while enabling customization to local conditions. MAEviz represents such a next-generation seismic risk assessment environment, based on Mid-America Earthquake (MAE) Center research and designed to be extended, customized, and evolved to meet the needs of specific organizations and regions. It is built upon an extensible Open Services Gateway Initiative (OSGi) based GIS application platform and leverages distributed content management, workflow, and virtual-organization based design concepts. MAEviz has been developed as a collaboration between the MAE Center community and the National Center for Supercomputing Applications (NCSA) and is an implementation of the MAE Center's Consequence-based Risk Management (CRM) methodology. MAEviz is open source and provides a modern GIS application interface with sophisticated visualization and reporting capabilities. It also incorporates mechanisms to integrate distributed data sources, provides approximately 50 reusable analyses, and has the ability to save and share scenarios to coordinate work in distributed teams. As an Eclipse Rich Client Platform (RCP) application, MAEviz is composed of multiple plugins and clearly defined extension points that leverage numerous open source libraries such as Geotools, iText, kTable, JFreeChart and the Visualization Toolkit (VTK), as well as middleware components developed at NCSA. This architecture enables MAEviz to be rapidly extended with new scientific analyses and allows reuse of the base GIS environment capabilities. MAEviz helps bridge the gap between researchers, practitioners and policy-makers by integrating the latest research findings and most accurate data with state-of-the-art methodologies in an extensible open source platform.

  19. Collaborating CPU and GPU for large-scale high-order CFD simulations with complex grids on the TianHe-1A supercomputer

    Energy Technology Data Exchange (ETDEWEB)

    Xu, Chuanfu, E-mail: xuchuanfu@nudt.edu.cn [College of Computer Science, National University of Defense Technology, Changsha 410073 (China)]; Deng, Xiaogang; Zhang, Lilun [College of Computer Science, National University of Defense Technology, Changsha 410073 (China)]; Fang, Jianbin [Parallel and Distributed Systems Group, Delft University of Technology, Delft 2628CD (Netherlands)]; Wang, Guangxue; Jiang, Yi [State Key Laboratory of Aerodynamics, P.O. Box 211, Mianyang 621000 (China)]; Cao, Wei; Che, Yonggang; Wang, Yongxian; Wang, Zhenghua; Liu, Wei; Cheng, Xinghua [College of Computer Science, National University of Defense Technology, Changsha 410073 (China)]

    2014-12-01

    Programming and optimizing complex, real-world CFD codes on current many-core accelerated HPC systems is very challenging, especially when collaborating CPUs and accelerators to fully tap the potential of heterogeneous systems. In this paper, with a tri-level hybrid and heterogeneous programming model using MPI + OpenMP + CUDA, we port and optimize our high-order multi-block structured CFD software HOSTA on the GPU-accelerated TianHe-1A supercomputer. HOSTA adopts two self-developed high-order compact finite difference schemes, WCNS and HDCS, that can simulate flows with complex geometries. We present a dual-level parallelization scheme for efficient multi-block computation on GPUs and perform particular kernel optimizations for high-order CFD schemes. The GPU-only approach achieves a speedup of about 1.3 when comparing one Tesla M2050 GPU with two Xeon X5670 CPUs. To achieve a greater speedup, we collaborate CPU and GPU for HOSTA instead of using a naive GPU-only approach. We present a novel scheme to balance the loads between the store-poor GPU and the store-rich CPU. Taking CPU and GPU load balance into account, we improve the maximum simulation problem size per TianHe-1A node for HOSTA by 2.3×; meanwhile, the collaborative approach improves performance by around 45% compared to the GPU-only approach. Further, to scale HOSTA on TianHe-1A, we propose a gather/scatter optimization to minimize PCI-e data transfer times for ghost and singularity data of 3D grid blocks, and overlap the collaborative computation and communication as far as possible using some advanced CUDA and MPI features. Scalability tests show that HOSTA can achieve a parallel efficiency of above 60% on 1024 TianHe-1A nodes. With our method, we have successfully simulated an EET high-lift airfoil configuration containing 800M cells and China's large civil airplane configuration containing 150M cells. To the best of our knowledge, these are the largest-scale CPU–GPU collaborative simulations…
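
    As a rough illustration of the CPU/GPU load-balance idea, the following Python sketch splits grid blocks in proportion to measured throughput while respecting the GPU's smaller memory; the rates and sizes are invented for the demo and are not the paper's actual scheme.

```python
# Sketch: static CPU/GPU block split under a GPU memory cap ("store-poor GPU").
# All numbers are illustrative assumptions, not figures from the paper.
def split_blocks(n_blocks, cells_per_block, gpu_rate, cpu_rate, gpu_mem_cells):
    """Return (gpu_blocks, cpu_blocks) so that time ~ work / rate balances."""
    ideal_gpu = round(n_blocks * gpu_rate / (gpu_rate + cpu_rate))
    max_gpu = gpu_mem_cells // cells_per_block   # memory cap on the GPU side
    gpu_blocks = min(ideal_gpu, max_gpu)
    return gpu_blocks, n_blocks - gpu_blocks

# e.g. GPU ~1.3x faster than the two CPUs together, but memory-limited:
print(split_blocks(n_blocks=64, cells_per_block=2_000_000,
                   gpu_rate=1.3, cpu_rate=1.0, gpu_mem_cells=60_000_000))
```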

  20. Advanced Architectures for Astrophysical Supercomputing

    CERN Document Server

    Barsdell, Benjamin R; Fluke, Christopher J

    2010-01-01

    Astronomers have come to rely on the increasing performance of computers to reduce, analyze, simulate and visualize their data. In this environment, faster computation can mean more science outcomes or the opening up of new parameter spaces for investigation. If we are to avoid major issues when implementing codes on advanced architectures, it is important that we have a solid understanding of our algorithms. A recent addition to the high-performance computing scene that highlights this point is the graphics processing unit (GPU). The hardware originally designed for speeding up graphics rendering in video games is now achieving speed-ups of O(100×) in general-purpose computation -- performance that cannot be ignored. We are using a generalized approach, based on the analysis of astronomy algorithms, to identify the optimal problem-types and techniques for taking advantage of both current GPU hardware and future developments in computing architectures.

  1. Supercomputing "Grid" passes latest test

    CERN Multimedia

    Dumé, Belle

    2005-01-01

    When the Large Hadron Collider (LHC) comes online at CERN in 2007, it will produce more data than any other experiment in the history of physics. Particle physicists have now passed another milestone in their preparations for the LHC by sustaining a continuous flow of 600 megabytes of data per second (MB/s) for 10 days from the Geneva laboratory to seven sites in Europe and the US (1/2 page)

  2. Implementation and scaling of the fully coupled Terrestrial Systems Modeling Platform (TerrSysMP) in a massively parallel supercomputing environment – a case study on JUQUEEN (IBM Blue Gene/Q)

    Directory of Open Access Journals (Sweden)

    F. Gasper

    2014-06-01

    Continental-scale hyper-resolution simulations constitute a grand challenge in characterizing non-linear feedbacks of states and fluxes of the coupled water, energy, and biogeochemical cycles of terrestrial systems. Tackling this challenge requires advanced coupling and supercomputing technologies for earth system models that are discussed in this study, utilizing the example of the implementation of the newly developed Terrestrial Systems Modeling Platform (TerrSysMP) on JUQUEEN (IBM Blue Gene/Q) of the Jülich Supercomputing Centre, Germany. The applied coupling strategies rely on the Multiple Program Multiple Data (MPMD) paradigm and require memory and load balancing considerations in the exchange of the coupling fields between different component models and allocation of computational resources, respectively. These considerations can be reached with advanced profiling and tracing tools leading to the efficient use of massively parallel computing environments, which is then mainly determined by the parallel performance of individual component models. However, the problem of model I/O and initialization in the peta-scale range requires major attention, because this constitutes a true big data challenge in the perspective of future exa-scale capabilities, which is unsolved.

  3. Implementation and scaling of the fully coupled Terrestrial Systems Modeling Platform (TerrSysMP) in a massively parallel supercomputing environment - a case study on JUQUEEN (IBM Blue Gene/Q)

    Science.gov (United States)

    Gasper, F.; Goergen, K.; Kollet, S.; Shrestha, P.; Sulis, M.; Rihani, J.; Geimer, M.

    2014-06-01

    Continental-scale hyper-resolution simulations constitute a grand challenge in characterizing non-linear feedbacks of states and fluxes of the coupled water, energy, and biogeochemical cycles of terrestrial systems. Tackling this challenge requires advanced coupling and supercomputing technologies for earth system models that are discussed in this study, utilizing the example of the implementation of the newly developed Terrestrial Systems Modeling Platform (TerrSysMP) on JUQUEEN (IBM Blue Gene/Q) of the Jülich Supercomputing Centre, Germany. The applied coupling strategies rely on the Multiple Program Multiple Data (MPMD) paradigm and require memory and load balancing considerations in the exchange of the coupling fields between different component models and allocation of computational resources, respectively. These considerations can be reached with advanced profiling and tracing tools leading to the efficient use of massively parallel computing environments, which is then mainly determined by the parallel performance of individual component models. However, the problem of model I/O and initialization in the peta-scale range requires major attention, because this constitutes a true big data challenge in the perspective of future exa-scale capabilities, which is unsolved.

  4. Load Balancing Scientific Applications

    Energy Technology Data Exchange (ETDEWEB)

    Pearce, Olga Tkachyshyn [Texas A & M Univ., College Station, TX (United States)]

    2014-12-01

    The largest supercomputers have millions of independent processors, and concurrency levels are rapidly increasing. For ideal efficiency, developers of the simulations that run on these machines must ensure that computational work is evenly balanced among processors. Assigning work evenly is challenging because many large modern parallel codes simulate behavior of physical systems that evolve over time, and their workloads change over time. Furthermore, the cost of imbalanced load increases with scale because most large-scale scientific simulations today use a Single Program Multiple Data (SPMD) parallel programming model, and an increasing number of processors will wait for the slowest one at the synchronization points. To address load imbalance, many large-scale parallel applications use dynamic load balance algorithms to redistribute work evenly. The research objective of this dissertation is to develop methods to decide when and how to load balance the application, and to balance it effectively and affordably. We measure and evaluate the computational load of the application, and develop strategies to decide when and how to correct the imbalance. Depending on the simulation, a fast, local load balance algorithm may be suitable, or a more sophisticated and expensive algorithm may be required. We developed a model for comparison of load balance algorithms for a specific state of the simulation that enables the selection of a balancing algorithm that will minimize overall runtime.
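
    A hedged sketch of the "when to balance" decision the dissertation studies: rebalance only if the time lost to imbalance over the upcoming steps exceeds the cost of balancing. The cost model and numbers below are illustrative assumptions.

```python
# Sketch: decide whether rebalancing pays off before the next N timesteps.
def should_rebalance(loads, steps_ahead, balance_cost):
    """loads: per-processor work per step (seconds); SPMD ranks wait for the
    slowest one, so the imbalance penalty is (max - mean) per step."""
    t_max = max(loads)
    t_avg = sum(loads) / len(loads)
    imbalance_penalty = (t_max - t_avg) * steps_ahead   # time lost to waiting
    return imbalance_penalty > balance_cost

loads = [1.0, 1.1, 1.0, 1.9]    # one overloaded rank dominates each step
print(should_rebalance(loads, steps_ahead=50, balance_cost=10.0))  # True
```

    The same comparison generalizes to choosing between a cheap local algorithm and an expensive global one: pick the candidate whose cost plus residual imbalance minimizes predicted overall runtime.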

  5. Status and future perspective of applications of high temperature superconductors

    Science.gov (United States)

    Tanaka, Shoji

    Materials research on high-temperature superconductivity over the past ten years has given us substantial information on the new phenomena of these new materials, and new applications across a very wide range of industries are increasing rapidly. In this report three main topics of application are given: [a] progress of superconducting bulk materials and their applications to the flywheel electricity storage system and others; [b] progress in the development of superconducting tapes and their applications to power cables, high-field superconducting magnets for SMES and for the pulling system for large silicon single crystals; and [c] development of new superconducting electronic devices (SFQ) and the possibility of their application to next-generation supercomputers. These examples show the great capability of superconductivity technology, and it is expected that a real superconductivity industry will take off around the year 2005.

  6. An Object-Oriented Smartphone Application for Structural Finite Element Analysis

    Directory of Open Access Journals (Sweden)

    B.J. Mac Donald

    2014-08-01

    Smartphones are becoming increasingly ubiquitous both in general society and the workplace. Recent increases in mobile processing power mean that the current generation of smartphones has processing power equivalent to a supercomputer from the early 1990s. Many industries have abandoned desktop computing and are now entirely reliant on mobile devices. Given these facts, it is logical that smartphones be considered as the next platform for finite element analysis (FEA). This paper presents an architecture for a smartphone FEA application using object-oriented programming. An MVC design pattern is adopted, and a demonstration FEA application for the Android smartphone platform is presented.

  7. Simplifying the Access to HPC Resources by Integrating them in the Application GUI

    KAUST Repository

    van Waveren, Matthijs

    2016-06-22

    The computing landscape of KAUST is increasing in complexity. Researchers have access to the 9th fastest supercomputer in the world (Shaheen II) and several other HPC clusters. They work on local Windows, Mac, or Linux workstations. In order to facilitate access to the HPC systems, we have developed interfaces to several research applications that automate input data transfer, job submission and retrieval of results. The user now submits jobs to the cluster from within the application GUI on a local workstation, and no longer has to log in to the cluster directly.

  8. Surety applications in transportation

    Energy Technology Data Exchange (ETDEWEB)

    Matalucci, R.V.; Miyoshi, D.S.

    1998-01-01

    Infrastructure surety can make a valuable contribution to the transportation engineering industry. The lessons learned at Sandia National Laboratories in developing surety principles and technologies for the nuclear weapons complex and the nuclear power industry hold direct applications to the safety, security, and reliability of the critical infrastructure. This presentation introduces the concepts of infrastructure surety, including identification of the normal, abnormal, and malevolent threats to the transportation infrastructure. National problems are identified, and examples of failures and successes in response to environmental loads and other structural and systemic vulnerabilities are presented. The infrastructure surety principles developed at Sandia National Laboratories are described. Currently available technologies are discussed, including (a) three-dimensional computer-assisted drawing packages interactively combined with virtual reality systems, (b) the complex calculational and computational modeling and code-coupling capabilities associated with the new generation of supercomputers, and (c) risk-management methodologies applicable to solving the national problems associated with threats to the critical transportation infrastructure.

  9. Detecting Silent Data Corruption for Extreme-Scale Applications through Data Mining

    Energy Technology Data Exchange (ETDEWEB)

    Bautista-Gomez, Leonardo [Argonne National Lab. (ANL), Argonne, IL (United States)]; Cappello, Franck [Argonne National Lab. (ANL), Argonne, IL (United States)]

    2014-01-16

    Supercomputers allow scientists to study natural phenomena by means of computer simulations. Next-generation machines are expected to have more components and, at the same time, consume several times less energy per operation. These trends are pushing supercomputer construction to the limits of miniaturization and energy-saving strategies. Consequently, the number of soft errors is expected to increase dramatically in the coming years. While mechanisms are in place to correct or at least detect some soft errors, a significant percentage of those errors pass unnoticed by the hardware. Such silent errors are extremely damaging because they can make applications silently produce wrong results. In this work we propose a technique that leverages certain properties of high-performance computing applications in order to detect silent errors at the application level. Our technique detects corruption solely based on the behavior of the application datasets and is completely application-agnostic. We propose multiple corruption detectors, and we couple them to work together in a fashion transparent to the user. We demonstrate that this strategy can detect the majority of corruptions while incurring negligible overhead. We show that with the help of these detectors, applications can have up to 80% coverage against data corruption.
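
    One way to render the behavior-based detection idea in code is a per-point prediction interval: flag a value that deviates sharply from an extrapolation of its own history. The linear predictor and threshold below are assumptions for illustration, not the paper's actual detectors.

```python
# Sketch of one behavior-based silent-data-corruption check (assumed detector).
import numpy as np

def detect_sdc(history, new_value, k=5.0):
    """history: the last few timesteps of one dataset point (1D array)."""
    predicted = history[-1] + (history[-1] - history[-2])  # linear extrapolation
    scale = np.std(np.diff(history)) + 1e-12               # typical step size
    return abs(new_value - predicted) > k * scale

h = np.array([1.00, 1.02, 1.04, 1.06])
print(detect_sdc(h, 1.08))   # False: follows the smooth trend
print(detect_sdc(h, 3.00))   # True: likely a corrupted value
```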

  10. Design of a Charging Model for a Supercomputing CAE Cloud Platform Based on a User-Feedback Mechanism

    Institute of Scientific and Technical Information of China (English)

    马亿旿; 池鹏; 陈磊; 梁小林; 蔡立军

    2015-01-01

    The traditional charging model of a CAE cloud platform has several shortcomings: user behavior and feedback are not considered, a single charging mode cannot support differentiated services, and business flexibility is poor. To address these, this paper establishes a plug-in charging model for a supercomputing CAE cloud platform and proposes a charging algorithm based on a user-feedback mechanism. The plug-in charging model treats the service as its basic unit and provides different charging schemes for a user's services in the form of plug-ins, which resolves the single-mode and inflexibility problems of the traditional model and strengthens the business dynamics of the supercomputing CAE cloud platform. The charging algorithm dynamically adjusts a user's charging parameters according to the user's historical behavior and feedback, and reduces service costs according to the user's activity and importance, which guarantees service quality and improves the user experience.
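
    A schematic Python rendering of the plug-in charging model and the feedback-based discount described above; the rates, plug-in names, and discount formula are invented for illustration.

```python
# Sketch of a plug-in charging model with a feedback-based discount; all
# rates, class names, and the discount formula are illustrative assumptions.
class FlatRate:
    def charge(self, core_hours):
        return 0.05 * core_hours                     # flat price per core-hour

class PeakAwareRate:
    def charge(self, core_hours, peak_frac=0.5):
        # different prices for peak and off-peak usage
        return core_hours * (0.08 * peak_frac + 0.03 * (1.0 - peak_frac))

PLUGINS = {"flat": FlatRate(), "peak": PeakAwareRate()}  # service -> plug-in

def bill(user, core_hours, plugin="flat"):
    base = PLUGINS[plugin].charge(core_hours)
    # feedback mechanism: activity and importance lower the effective price
    discount = min(0.3, 0.02 * user["activity"] + 0.1 * user["importance"])
    return base * (1.0 - discount)

alice = {"activity": 5, "importance": 1}             # active, valued customer
print(f"bill: {bill(alice, 1000, plugin='peak'):.2f}")   # 44.00 in this model
```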

  11. Compiler-Enhanced Incremental Checkpointing for OpenMP Applications

    Energy Technology Data Exchange (ETDEWEB)

    Bronevetsky, G; Marques, D; Pingali, K; Rugina, R; McKee, S A

    2008-01-21

    As modern supercomputing systems reach the peta-flop performance range, they grow in both size and complexity. This makes them increasingly vulnerable to failures from a variety of causes. Checkpointing is a popular technique for tolerating such failures, enabling applications to periodically save their state and restart computation after a failure. Although a variety of automated system-level checkpointing solutions are currently available to HPC users, manual application-level checkpointing remains more popular due to its superior performance. This paper improves performance of automated checkpointing via a compiler analysis for incremental checkpointing. This analysis, which works with both sequential and OpenMP applications, reduces checkpoint sizes by as much as 80% and enables asynchronous checkpointing.
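
    The paper's technique is a compiler analysis, but the effect of incremental checkpointing can be sketched at runtime with hash-based dirty-block detection: only blocks that changed since the last checkpoint are written. The block size and in-memory storage here are assumptions for the demo.

```python
# Runtime analogue of incremental checkpointing: write only dirty blocks.
import hashlib

BLOCK = 4096

def incremental_checkpoint(data: bytes, prev_hashes: dict):
    """Return ({block_index: bytes} to write, updated block hashes)."""
    dirty, hashes = {}, {}
    for i in range(0, len(data), BLOCK):
        block = data[i:i + BLOCK]
        h = hashlib.sha256(block).digest()
        hashes[i // BLOCK] = h
        if prev_hashes.get(i // BLOCK) != h:     # changed since last checkpoint
            dirty[i // BLOCK] = block
    return dirty, hashes

state = bytearray(3 * BLOCK)
_, hashes = incremental_checkpoint(bytes(state), {})   # full first checkpoint
state[5000] = 1                                        # touch the second block
dirty, hashes = incremental_checkpoint(bytes(state), hashes)
print(sorted(dirty))                                   # [1]: one block, not three
```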

  12. Compiler-Enhanced Incremental Checkpointing for OpenMP Applications

    Energy Technology Data Exchange (ETDEWEB)

    Bronevetsky, G; Marques, D; Pingali, K; McKee, S; Rugina, R

    2009-02-18

    As modern supercomputing systems reach the peta-flop performance range, they grow in both size and complexity. This makes them increasingly vulnerable to failures from a variety of causes. Checkpointing is a popular technique for tolerating such failures, enabling applications to periodically save their state and restart computation after a failure. Although a variety of automated system-level checkpointing solutions are currently available to HPC users, manual application-level checkpointing remains more popular due to its superior performance. This paper improves performance of automated checkpointing via a compiler analysis for incremental checkpointing. This analysis, which works with both sequential and OpenMP applications, significantly reduces checkpoint sizes and enables asynchronous checkpointing.

  13. Performance Analysis, Modeling and Scaling of HPC Applications and Tools

    Energy Technology Data Exchange (ETDEWEB)

    Bhatele, Abhinav [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)]

    2016-01-13

    Efficient use of supercomputers at DOE centers is vital for maximizing system throughput, minimizing energy costs and enabling science breakthroughs faster. This requires complementary efforts along several directions to optimize the performance of scientific simulation codes and the underlying runtimes and software stacks. This in turn requires providing scalable performance analysis tools and modeling techniques that can provide feedback to physicists and computer scientists developing the simulation codes and runtimes, respectively. The PAMS project is using time allocations on supercomputers at ALCF, NERSC and OLCF to further the goals described above by performing research along the following fronts: 1. Scaling Study of HPC applications; 2. Evaluation of Programming Models; 3. Hardening of Performance Tools; 4. Performance Modeling of Irregular Codes; and 5. Statistical Analysis of Historical Performance Data. We are a team of computer and computational scientists funded by both DOE/NNSA and DOE/ASCR programs such as ECRP, XStack (Traleika Glacier, PIPER), ExaOSR (ARGO), SDMAV II (MONA) and PSAAP II (XPACC). This allocation will enable us to study big data issues when analyzing performance on leadership computing class systems and to assist the HPC community in making the most effective use of these resources.

  14. Application of lightweight threading techniques to computational chemistry

    Science.gov (United States)

    Thornley, John; Muller, Richard P.; Mainz, Daniel T.; Çağin, Tahir; Goddard, William A.

    2001-05-01

    The recent advent of inexpensive commodity multiprocessor computers with standardized operating system support for lightweight threads provides computational chemists and other scientists with an exciting opportunity to develop sophisticated new approaches to materials simulation. We contrast the flexible performance characteristics of lightweight threading with the restrictions of traditional scientific supercomputing, based on our experiences with multithreaded molecular dynamics simulation. Motivated by the results of our molecular dynamics experiments, we propose an approach to multi-scale materials simulation using highly dynamic thread creation and synchronization within and between concurrent simulations at many different scales. This approach will enable extremely realistic simulations, with computing resources dynamically directed to areas where they are needed. Multi-scale simulations of this kind require large amounts of processing power, but are too sophisticated to be expressed using traditional supercomputing programming models. As a result, we have developed a high-level programming system called Sthreads that allows highly dynamic, nested multithreaded algorithms to be expressed. Program development is simplified through the use of innovative synchronization operations that allow multithreaded programs to be tested and debugged using standard sequential methods and tools. For this reason, Sthreads is very well suited to the complex multi-scale simulation applications that we are developing.

  15. Numerical relativity in a distributed environment.

    Energy Technology Data Exchange (ETDEWEB)

    Benger, W.; Foster, I.; Novotny, J.; Seidel, E.; Shalf, J.; Smith, W.; Walker, P.

    1999-02-08

    We have found that the hardware and software infrastructure exists to simulate general relativity problems in a distributed computational environment, at some cost in performance. We examine two different issues for running the Cactus code in such a distributed environment. The first issue is running a Cactus simulation on multiple parallel computer systems. Our objective is to perform larger simulations than are currently possible on a single parallel computer. We distribute Cactus simulations across multiple supercomputers using the mechanisms provided by the Globus toolkit. In particular, we use Globus mechanisms for authentication, access to remote computer systems, file transfer, and communication. The Cactus code uses MPI for communication and makes use of an MPI implementation layered atop Globus communication mechanisms. These communication mechanisms allow an MPI application to be executed on distributed resources. We find that without performing any code optimizations, our simulations ran 48% to 100% slower when using an Origin at the National Center for Supercomputing Applications (NCSA) and an Onyx2 at Argonne National Laboratory (ANL). We also ran simulations between Cray T3Es in Germany and a T3E at the San Diego Supercomputing Center (SDSC). Running between the T3Es in Germany resulted in an increase in execution time of 79% to 133%, and running between a German T3E and the T3E at the San Diego Supercomputing Center resulted in an execution time increase of 114% to 186%. We are very encouraged that we are able to run simulations on parallel computers that are geographically distributed, and we have identified several areas to investigate to improve the performance of Cactus simulations in this environment. The second issue we examine here is remote visualization and steering of the Cactus code. Cactus is a modular framework and we have implemented a module for this task. This module performs isosurfacing operations on the same parallel computers that are…

  16. Fault Tolerance Assistant (FTA): An Exception Handling Programming Model for MPI Applications

    Energy Technology Data Exchange (ETDEWEB)

    Fang, Aiman [Univ. of Chicago, IL (United States). Dept. of Computer Science]; Laguna, Ignacio [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)]; Sato, Kento [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)]; Islam, Tanzima [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)]; Mohror, Kathryn [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)]

    2016-05-23

    Future high-performance computing systems may face frequent failures with their rapid increase in scale and complexity. Resilience to faults has become a major challenge for large-scale applications running on supercomputers, which demands fault tolerance support for prevalent MPI applications. Among failure scenarios, process failures are one of the most severe issues as they usually lead to termination of applications. However, the widely used MPI implementations do not provide mechanisms for fault tolerance. We propose FTA-MPI (Fault Tolerance Assistant MPI), a programming model that provides support for failure detection, failure notification and recovery. Specifically, FTA-MPI exploits a try/catch model that enables failure localization and transparent recovery of process failures in MPI applications. We demonstrate FTA-MPI with synthetic applications and a molecular dynamics code CoMD, and show that FTA-MPI provides high programmability for users and enables convenient and flexible recovery of process failures.
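
    Without real MPI, the try/catch pattern the record describes can be rendered schematically as follows; ProcessFailure, communicate, and recover_ranks are hypothetical stand-ins, not FTA-MPI's actual interfaces.

```python
# Schematic (non-MPI) rendering of try/catch process-failure recovery.
class ProcessFailure(Exception):
    def __init__(self, failed_ranks):
        self.failed_ranks = failed_ranks

_injected = {"done": False}

def communicate(step):
    """Pretend halo exchange; injects a single process failure at step 3."""
    if step == 3 and not _injected["done"]:
        _injected["done"] = True
        raise ProcessFailure(failed_ranks=[2])
    return f"exchanged halos for step {step}"

def recover_ranks(ranks):
    print(f"respawning ranks {ranks} and restoring their checkpoints")

step = 0
while step < 5:
    try:
        print(communicate(step))
        step += 1                        # advance only on success
    except ProcessFailure as f:          # failure notification is localized here
        recover_ranks(f.failed_ranks)
        step = max(0, step - 1)          # resume from the last consistent step
```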

  17. Performance Measurement and Analysis of Large-Scale Parallel Applications on Leadership Computing Systems

    Directory of Open Access Journals (Sweden)

    Brian J.N. Wylie

    2008-01-01

    Developers of applications with large-scale computing requirements are currently presented with a variety of high-performance systems optimised for message-passing; however, effectively exploiting the available computing resources remains a major challenge. In addition to fundamental application scalability characteristics, application and system peculiarities often only manifest at extreme scales, requiring highly scalable performance measurement and analysis tools that are convenient to incorporate in application development and tuning activities. We present our experiences with a multigrid solver benchmark and state-of-the-art real-world applications for numerical weather prediction and computational fluid dynamics, on three quite different multi-thousand-processor supercomputer systems – Cray XT3/4, MareNostrum & Blue Gene/L – using the newly-developed SCALASCA toolset to quantify and isolate a range of significant performance issues.

  18. An overview of the activities of the OECD/NEA Task Force on adapting computer codes in nuclear applications to parallel architectures

    Energy Technology Data Exchange (ETDEWEB)

    Kirk, B.L. [Oak Ridge National Lab., TN (United States)]; Sartori, E. [OCDE/OECD NEA Data Bank, Issy-les-Moulineaux (France)]; Viedma, L.G. de [Consejo de Seguridad Nuclear, Madrid (Spain)]

    1997-06-01

    Subsequent to the introduction of High Performance Computing in the developed countries, the Organization for Economic Cooperation and Development/Nuclear Energy Agency (OECD/NEA) created the Task Force on Adapting Computer Codes in Nuclear Applications to Parallel Architectures (under the guidance of the Nuclear Science Committee's Working Party on Advanced Computing) to study the growth area in supercomputing and its applicability to the nuclear community's computer codes. The result has been four years of investigation for the Task Force in different subject fields - deterministic and Monte Carlo radiation transport, computational mechanics and fluid dynamics, nuclear safety, atmospheric models and waste management.

  19. Power-aware applications for scientific cluster and distributed computing

    CERN Document Server

    Abdurachmanov, David; Eulisse, Giulio; Grosso, Paola; Hillegas, Curtis; Holzman, Burt; Klous, Sander; Knight, Robert; Muzaffar, Shahzad

    2014-01-01

    The aggregate power use of computing hardware is an important cost factor in scientific cluster and distributed computing systems. The Worldwide LHC Computing Grid (WLCG) is a major example of such a distributed computing system, used primarily for high throughput computing (HTC) applications. It has a computing capacity and power consumption rivaling that of the largest supercomputers. The computing capacity required from this system is also expected to grow over the next decade. Optimizing the power utilization and cost of such systems is thus of great interest. A number of trends currently underway will provide new opportunities for power-aware optimizations. We discuss how power-aware software applications and scheduling might be used to reduce power consumption, both as autonomous entities and as part of a (globally) distributed system. As concrete examples of computing centers we provide information on the large HEP-focused Tier-1 at FNAL, and the Tigress High Performance Computing Center at Princeton U...

  20. Applications of Discrete Molecular Dynamics in biology and medicine.

    Science.gov (United States)

    Proctor, Elizabeth A; Dokholyan, Nikolay V

    2016-04-01

    Discrete Molecular Dynamics (DMD) is a physics-based simulation method using discrete energetic potentials rather than traditional continuous potentials, allowing microsecond time scale simulations of biomolecular systems to be performed on personal computers rather than supercomputers or specialized hardware. With the ongoing explosion in processing power even in personal computers, applications of DMD have similarly multiplied. In the past two years, researchers have used DMD to model structures of disease-implicated protein folding intermediates, study assembly of protein complexes, predict protein-protein binding conformations, engineer rescue mutations in disease-causative protein mutants, design a protein conformational switch to control cell signaling, and describe the behavior of polymeric dispersants for environmental cleanup of oil spills, among other innovative applications.

  1. "0" and "1" of Supercomputer: Design of Yinhe Building in the University of Defense Technology%超级计算机的"0"与"1"——国防科技大学银河楼设计

    Institute of Scientific and Technical Information of China (English)

    宋明星; 魏春雨; 尹佳斌

    2011-01-01

    Through an analysis of the planning, spatial composition, modeling, and appropriate ecological technology in the design of the Yinhe Building at the National University of Defense Technology, this article explains the connections between the architectural design methods and the functions of the supercomputer research and testing center. The planning considers the connection between the main machine room and the north and south buildings; the spatial composition is achieved with architectural vocabularies such as atriums, gardens and panoramic lifts; the modeling reflects the computer language of 0 and 1 through the contrast between simple columns and glass; and the appropriate ecological technology of green roofs is used on several decks.

  2. A Foundation for the Accurate Prediction of the Soft Error Vulnerability of Scientific Applications

    Energy Technology Data Exchange (ETDEWEB)

    Bronevetsky, G; de Supinski, B; Schulz, M

    2009-02-13

    Understanding the soft error vulnerability of supercomputer applications is critical as these systems are using ever larger numbers of devices that have decreasing feature sizes and, thus, increasing frequency of soft errors. As many large-scale parallel scientific applications use BLAS and LAPACK linear algebra routines, the soft error vulnerability of these methods constitutes a large fraction of the applications' overall vulnerability. This paper analyzes the vulnerability of these routines to soft errors by characterizing how their outputs are affected by injected errors and by evaluating several techniques for predicting how errors propagate from the input to the output of each routine. The resulting error profiles can be used to understand the fault vulnerability of full applications that use these routines.
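
    The characterization approach can be illustrated with a tiny fault-injection experiment: flip one bit in an input matrix, run a solve, and measure how far the output moves. The matrix size, bit position, and error metric below are illustrative choices, not the paper's protocol.

```python
# Sketch: single-bit fault injection into a linear solve (assumed experiment).
import struct
import numpy as np

def flip_bit(x: float, bit: int) -> float:
    """Flip one bit of a float64 via its IEEE-754 representation."""
    (u,) = struct.unpack("<Q", struct.pack("<d", x))
    (y,) = struct.unpack("<d", struct.pack("<Q", u ^ (1 << bit)))
    return y

rng = np.random.default_rng(1)
A = rng.standard_normal((64, 64)) + 64 * np.eye(64)   # well-conditioned matrix
b = rng.standard_normal(64)
x_clean = np.linalg.solve(A, b)

A_faulty = A.copy()
A_faulty[10, 20] = flip_bit(A_faulty[10, 20], bit=52)  # hit an exponent bit
x_faulty = np.linalg.solve(A_faulty, b)
rel_err = np.linalg.norm(x_faulty - x_clean) / np.linalg.norm(x_clean)
print(f"relative output error after one bit flip: {rel_err:.2e}")
```

    Sweeping the injected bit position over sign, exponent, and mantissa bits yields the kind of error profile the paper uses to predict input-to-output propagation.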

  3. National Comorbidity Survey Replication Adolescent Supplement (NCS-A): III. Concordance of DSM-IV/CIDI Diagnoses with Clinical Reassessments

    Science.gov (United States)

    Kessler, Ronald C.; Avenevoli, Shelli; Green, Jennifer; Gruber, Michael J.; Guyer, Margaret; He, Yulei; Jin, Robert; Kaufman, Joan; Sampson, Nancy A.; Zaslavsky, Alan M.; Merikangas, Kathleen R.

    2009-01-01

    Diagnostic and Statistical Manual of Mental Disorders (DSM-IV) diagnoses based on the World Health Organization's Composite International Diagnostic Interview (CIDI) and implemented in the National Comorbidity Survey Replication Adolescent Supplement are found to have good individual-level concordance with diagnoses based on blinded…

  4. Havens: Explicit Reliable Memory Regions for HPC Applications

    Energy Technology Data Exchange (ETDEWEB)

    Hukerikar, Saurabh [ORNL]; Engelmann, Christian [ORNL]

    2016-01-01

    Supporting error resilience in future exascale-class supercomputing systems is a critical challenge. Due to transistor scaling trends and increasing memory density, scientific simulations are expected to experience more interruptions caused by transient errors in the system memory. Existing hardware-based detection and recovery techniques will be inadequate to manage the presence of high memory fault rates. In this paper we propose a partial memory protection scheme based on region-based memory management. We define the concept of regions called havens that provide fault protection for program objects. We provide reliability for the regions through a software-based parity protection mechanism. Our approach enables critical program objects to be placed in these havens. The fault coverage provided by our approach is application agnostic, unlike algorithm-based fault tolerance techniques.
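
    A toy rendering of software parity protection over a region: one XOR parity block lets a haven rebuild a single lost block from the survivors. The block geometry is an assumption; real havens manage live program objects rather than byte strings.

```python
# Sketch: XOR parity over a region's blocks, recovering one lost block.
def xor_blocks(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

class Haven:
    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.parity = b"\x00" * len(blocks[0])       # running XOR of all blocks
        for blk in self.blocks:
            self.parity = xor_blocks(self.parity, blk)

    def recover(self, lost_index):
        """Rebuild one lost block from parity plus the surviving blocks."""
        rebuilt = self.parity
        for i, blk in enumerate(self.blocks):
            if i != lost_index:
                rebuilt = xor_blocks(rebuilt, blk)
        return rebuilt

blocks = [b"critical", b"program ", b"objects!"]     # equal-sized blocks
haven = Haven(blocks)
print(haven.recover(1))   # b'program ' reconstructed after block 1 is lost
```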

  5. Ubiquitous Green Computing Techniques for High Demand Applications in Smart Environments

    Directory of Open Access Journals (Sweden)

    Jose M. Moya

    2012-08-01

    Ubiquitous sensor network deployments, such as the ones found in Smart cities and Ambient intelligence applications, impose constantly increasing computational demands in order to process data and offer services to users. The nature of these applications implies the usage of data centers. Research has paid much attention to the energy consumption of the sensor nodes in WSN infrastructures. However, supercomputing facilities are the ones presenting a higher economic and environmental impact due to their very high power consumption. The latter problem, however, has been disregarded in the field of smart environment services. This paper proposes an energy-minimization workload assignment technique, based on heterogeneity and application-awareness, that redistributes low-demand computational tasks from high-performance facilities to idle nodes with low and medium resources in the WSN infrastructure. These non-optimal allocation policies reduce the energy consumed by the whole infrastructure and the total execution time.

  6. Ubiquitous green computing techniques for high demand applications in Smart environments.

    Science.gov (United States)

    Zapater, Marina; Sanchez, Cesar; Ayala, Jose L; Moya, Jose M; Risco-Martín, José L

    2012-01-01

    Ubiquitous sensor network deployments, such as the ones found in Smart cities and Ambient intelligence applications, impose constantly increasing computational demands in order to process data and offer services to users. The nature of these applications implies the usage of data centers. Research has paid much attention to the energy consumption of the sensor nodes in WSN infrastructures. However, supercomputing facilities are the ones presenting a higher economic and environmental impact due to their very high power consumption. The latter problem, however, has been disregarded in the field of smart environment services. This paper proposes an energy-minimization workload assignment technique, based on heterogeneity and application-awareness, that redistributes low-demand computational tasks from high-performance facilities to idle nodes with low and medium resources in the WSN infrastructure. These non-optimal allocation policies reduce the energy consumed by the whole infrastructure and the total execution time.
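
    The assignment idea in the two records above can be sketched as a greedy heuristic: send the lowest-demand tasks to the node with the lowest energy per operation that still has room. Node and task parameters below are invented for the demo, not values from the papers.

```python
# Greedy sketch of energy-minimizing workload assignment (assumed heuristic).
def assign(tasks, nodes):
    """tasks: [(name, ops, mem)], nodes: [(name, joules_per_op, free_mem)]."""
    plan, free = {}, {n: m for n, _, m in nodes}
    for tname, ops, mem in sorted(tasks, key=lambda t: t[1]):  # low demand first
        fits = [(jpo, n) for n, jpo, _ in nodes if free[n] >= mem]
        if not fits:
            continue                       # leave it on the HPC facility
        jpo, node = min(fits)              # cheapest energy per operation
        plan[tname] = (node, ops * jpo)    # task -> (node, energy in joules)
        free[node] -= mem
    return plan

nodes = [("datacenter", 2e-9, 10_000), ("wsn-gateway", 5e-10, 64)]
tasks = [("aggregate", 1e6, 32), ("render", 1e9, 512)]
print(assign(tasks, nodes))   # the small task moves to the idle low-power node
```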

  7. Computational Fluid Dynamics: Algorithms and Supercomputers

    Science.gov (United States)

    1988-03-01

    [Abstract not available: only disconnected fragments of the report text were extracted. They touch on vector-machine design (a claim attributed to SCS architect Hanon Potash about superimposing a scalar design), the restructuring of iterative algorithms, and grid generation using Laplace's and Poisson's equations as developed by Thompson (1979) and co-workers [123].]

  8. Using Supercomputers to Probe the Early Universe

    Energy Technology Data Exchange (ETDEWEB)

    Giorgi, Elena Edi [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)]

    2016-05-17

    For decades physicists have been trying to decipher the first moments after the Big Bang. Using very large telescopes, for example, scientists scan the skies and look at how fast galaxies move. Satellites study the relic radiation left from the Big Bang, called the cosmic microwave background radiation. And finally, particle colliders, like the Large Hadron Collider at CERN, allow researchers to smash protons together and analyze the debris left behind by such collisions. Physicists at Los Alamos National Laboratory, however, are taking a different approach: they are using computers. In collaboration with colleagues at University of California San Diego, the Los Alamos researchers developed a computer code, called BURST, that can simulate conditions during the first few minutes of cosmological evolution.

  9. Supercomputing: HPCMP, Performance Measures and Opportunities

    Science.gov (United States)

    2007-11-02

    [Abstract not available: the extracted text consists of flattened briefing tables listing HPCMP centers (e.g., the Redstone Technical Test Center, the Simulations & Analysis Facility, and the Army Research Laboratory MSRC) together with their systems (SGI Origin 3800/3900, IBM P3/P4, Linux Networx, Xeon, IBM Opteron and SGI Altix clusters) and processor counts ranging from 24 to 2,372 PEs.]

  10. LAPACK: Linear algebra software for supercomputers

    Energy Technology Data Exchange (ETDEWEB)

    Bischof, C.H.

    1991-01-01

    This paper presents an overview of the LAPACK library, a portable, public-domain library to solve the most common linear algebra problems. This library provides a uniformly designed set of subroutines for solving systems of simultaneous linear equations, least-squares problems, and eigenvalue problems for dense and banded matrices. We elaborate on the design methodologies incorporated to make the LAPACK codes efficient on today's high-performance architectures. In particular, we discuss the use of block algorithms and the reliance on the Basic Linear Algebra Subprograms. We present performance results that show the suitability of the LAPACK approach for vector uniprocessors and shared-memory multiprocessors. We also address some issues that have to be dealt with in tuning LAPACK for specific architectures. Lastly, we present results that show that the LAPACK software can be adapted with little effort to distributed-memory environments, and we discuss future efforts resulting from this project. 31 refs., 10 figs., 2 tabs.
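
    Although the record predates scripting-language bindings, the same LAPACK drivers it describes are reachable today from Python through SciPy; a minimal linear solve via dgesv might look as follows (the matrix contents are arbitrary demo data).

```python
# Calling a LAPACK driver (dgesv: LU factorize and solve A x = b) via SciPy.
import numpy as np
from scipy.linalg import lapack

A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
b = np.array([1.0, 2.0])

lu, piv, x, info = lapack.dgesv(A, b)   # returns LU factors, pivots, solution
assert info == 0                        # nonzero info signals a LAPACK failure
print(x)                                # ~[0.0909, 0.6364]
```

    Internally such drivers use the blocked, BLAS-3-based algorithms the record describes, which is what makes them efficient on cache-based and shared-memory machines.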

  11. Foundry provides the network backbone for supercomputing

    CERN Multimedia

    2003-01-01

    Some of the results from the fourth annual High-Performance Bandwidth Challenge, held in conjunction with SC2003, the international conference on high-performance computing and networking, which took place last week in Phoenix, AZ (1/2 page)

  12. Supercomputers and biological sequence comparison algorithms.

    Science.gov (United States)

    Core, N G; Edmiston, E W; Saltz, J H; Smith, R M

    1989-12-01

    Comparison of biological (DNA or protein) sequences provides insight into molecular structure, function, and homology and is increasingly important as the available databases become larger and more numerous. One method of increasing the speed of the calculations is to perform them in parallel. We present the results of initial investigations using two dynamic programming algorithms on the Intel iPSC hypercube and the Connection Machine as well as an inexpensive, heuristically-based algorithm on the Encore Multimax.
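
    One of the dynamic-programming kernels the record refers to, written out serially for reference: Smith-Waterman local alignment scoring. The scoring constants are common textbook defaults chosen for illustration, not values from the paper, and the parallel versions distribute this score matrix across processors.

```python
# Serial Smith-Waterman local alignment score (textbook scoring assumed).
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]   # score matrix, first row/col = 0
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i-1][j-1] + (match if a[i-1] == b[j-1] else mismatch)
            # local alignment: scores never drop below zero
            H[i][j] = max(0, diag, H[i-1][j] + gap, H[i][j-1] + gap)
            best = max(best, H[i][j])
    return best

print(smith_waterman("ACACACTA", "AGCACACA"))   # best local alignment score
```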

  13. Determining Application Runtimes Using Queueing Network Modeling

    Energy Technology Data Exchange (ETDEWEB)

    Elliott, Michael L. [Univ. of San Francisco, CA (United States)]

    2006-12-14

    Determination of application times-to-solution for large-scale clustered computers continues to be a difficult problem in high-end computing, which will only become more challenging as multi-core consumer machines become more prevalent in the market. Both researchers and consumers of these multi-core systems desire reasonable estimates of how long their programs will take to run (time-to-solution, or TTS), and how many resources will be consumed in the execution. Currently there are few methods of determining these values, and those that do exist are either overly simplistic in their assumptions or require great amounts of effort to parameterize and understand. One previously untried method is queuing network modeling (QNM), which is easy to parameterize and solve, and produces results that typically fall within 10 to 30% of the actual TTS for our test cases. Using characteristics of the computer network (bandwidth, latency) and communication patterns (number of messages, message length, time spent in communication), the QNM model of the NAS-PB CG application was applied to MCR and ALC, supercomputers at LLNL, and the Keck Cluster at USF, with average errors of 2.41%, 3.61%, and -10.73%, respectively, compared to the actual TTS observed. While additional work is necessary to improve the predictive capabilities of QNM, current results show that QNM has a great deal of promise for determining application TTS for multi-processor computer systems.
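
    In the spirit of the record, a back-of-envelope estimate of time-to-solution can combine compute time with a latency/bandwidth model of the messages; this additive sketch is a simplification for illustration, not the thesis's full queueing network model.

```python
# Simplified TTS estimate from compute time plus a message-cost model.
def estimate_tts(compute_s, n_msgs, msg_bytes, latency_s, bandwidth_Bps,
                 overlap=0.0):
    """overlap: fraction of communication hidden behind computation (0..1)."""
    comm_s = n_msgs * (latency_s + msg_bytes / bandwidth_Bps)
    return compute_s + (1.0 - overlap) * comm_s

# e.g. 120 s of compute, 50k messages of 8 KiB on a 5 us / 1 GB/s network:
tts = estimate_tts(120.0, 50_000, 8192, 5e-6, 1e9)
print(f"estimated TTS: {tts:.1f} s")
```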

  14. CaKernel – A Parallel Application Programming Framework for Heterogenous Computing Architectures

    Directory of Open Access Journals (Sweden)

    Marek Blazewicz

    2011-01-01

    With the recent advent of new heterogeneous computing architectures there is still a lack of parallel problem solving environments that can help scientists to use hybrid supercomputers easily and efficiently. Many scientific simulations that use structured grids to solve partial differential equations in fact rely on stencil computations. Stencil computations have become crucial in solving many challenging problems in various domains, e.g., engineering or physics. Although many parallel stencil computing approaches have been proposed, in most cases they solve only particular problems. As a result, scientists are struggling when it comes to the subject of implementing a new stencil-based simulation, especially on high-performance hybrid supercomputers. In response to the presented need we extend our previous work on a parallel programming framework for CUDA, CaCUDA, which now supports OpenCL. We present CaKernel, a tool that simplifies the development of parallel scientific applications on hybrid systems. CaKernel is built on the highly scalable and portable Cactus framework. In the CaKernel framework, Cactus manages the inter-process communication via MPI while CaKernel manages the code running on Graphics Processing Units (GPUs) and interactions between them. As a non-trivial test case we have developed a 3D CFD code to demonstrate the performance and scalability of the automatically generated code.
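
    The stencil pattern at the heart of CaKernel can be shown as a plain NumPy kernel: one Jacobi sweep of the 7-point 3D Laplacian. CaKernel targets GPU code generation for kernels of this shape; the serial form below is only for illustration.

```python
# A 7-point 3D stencil sweep, the kind of kernel a stencil framework generates.
import numpy as np

def jacobi_sweep(u):
    """Return one Jacobi update over the interior of a 3D grid."""
    v = u.copy()
    v[1:-1, 1:-1, 1:-1] = (u[:-2, 1:-1, 1:-1] + u[2:, 1:-1, 1:-1] +
                           u[1:-1, :-2, 1:-1] + u[1:-1, 2:, 1:-1] +
                           u[1:-1, 1:-1, :-2] + u[1:-1, 1:-1, 2:]) / 6.0
    return v

grid = np.zeros((32, 32, 32))
grid[0, :, :] = 100.0                  # hot boundary face
print(jacobi_sweep(grid)[1, 16, 16])   # interior point warmed by the boundary
```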

  15. Medici 2: A Scalable Content Management System for Cultural Heritage Datasets

    Directory of Open Access Journals (Sweden)

    Constantinos Sophocleous

    2017-04-01

    Digitizing large collections of Cultural Heritage (CH) resources and providing tools for their management, analysis and visualization is critical to CH research. A key element in achieving the above goal is to provide user-friendly software offering an abstract interface for interaction with a variety of digital content types. To address these needs, the Medici content management system is being developed in a collaborative effort between the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign, Bibliotheca Alexandrina (BA) in Egypt, and the Cyprus Institute (CyI). The project is pursued in the framework of the European project "Linking Scientific Computing in Europe and Eastern Mediterranean 2" (LinkSCEEM2) and supported by work funded through the U.S. National Science Foundation (NSF), the U.S. National Archives and Records Administration (NARA), the U.S. National Institutes of Health (NIH), the U.S. National Endowment for the Humanities (NEH), the U.S. Office of Naval Research (ONR), and the U.S. Environmental Protection Agency (EPA), as well as other private sector efforts. Medici is a Web 2.0 environment integrating analysis tools for the auto-curation of un-curated digital data, allowing automatic processing of input (CH) datasets and visualization of both data and collections. It offers a simple user interface for dataset preprocessing, previewing, automatic metadata extraction, user input of metadata and provenance support, storage, archiving and management, representation and reproduction. Building on previous experience (Medici 1), NCSA and CyI are working towards the improvement of the technical, performance and functionality aspects of the system. The current version of Medici (Medici 2) is the result of these efforts. It is a scalable, flexible, robust distributed framework with wide data format support (including 3D models and Reflectance Transformation Imaging, RTI) and metadata functionality…

  16. High End Computing Technologies for Earth Science Applications: Trends, Challenges, and Innovations

    Science.gov (United States)

    Parks, John (Technical Monitor); Biswas, Rupak; Yan, Jerry C.; Brooks, Walter F.; Sterling, Thomas L.

    2003-01-01

    Earth science applications of the future will stress the capabilities of even the highest performance supercomputers in the areas of raw compute power, mass storage management, and software environments. These NASA mission critical problems demand usable multi-petaflops and exabyte-scale systems to fully realize their science goals. With an exciting vision of the technologies needed, NASA has established a comprehensive program of advanced research in computer architecture, software tools, and device technology to ensure that, in partnership with US industry, it can meet these demanding requirements with reliable, cost effective, and usable ultra-scale systems. NASA will exploit, explore, and influence emerging high end computing architectures and technologies to accelerate the next generation of engineering, operations, and discovery processes for NASA Enterprises. This article captures this vision and describes the concepts, accomplishments, and the potential payoff of the key thrusts that will help meet the computational challenges in Earth science applications.

  17. Automatic Energy Schemes for High Performance Applications

    Energy Technology Data Exchange (ETDEWEB)

    Sundriyal, Vaibhav [Iowa State Univ., Ames, IA (United States)]

    2013-01-01

    Although high-performance computing traditionally focuses on the efficient execution of large-scale applications, both energy and power have become critical concerns when approaching exascale. Drastic increases in the power consumption of supercomputers affect significantly their operating costs and failure rates. In modern microprocessor architectures, equipped with dynamic voltage and frequency scaling (DVFS) and CPU clock modulation (throttling), the power consumption may be controlled in software. Additionally, network interconnects, such as InfiniBand, may be exploited to maximize energy savings while the application performance loss and frequency switching overheads must be carefully balanced. This work first studies two important collective communication operations, all-to-all and allgather, and proposes energy saving strategies on a per-call basis. Next, it targets point-to-point communications to group them into phases and apply frequency scaling to them to save energy by exploiting the architectural and communication stalls. Finally, it proposes an automatic runtime system which combines both collective and point-to-point communications into phases, and applies throttling to them apart from DVFS to maximize energy savings. The experimental results are presented for NAS parallel benchmark problems as well as for the realistic parallel electronic structure calculations performed by the widely used quantum chemistry package GAMESS. Close to the maximum energy savings were obtained with a substantially low performance loss on the given platform.
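
    The per-phase decision described above can be sketched as an energy comparison: lower the frequency for a communication phase only when the power saved outweighs the stretched runtime and switching overhead. The power model, frequencies, and overhead below are illustrative assumptions, not measurements from the thesis.

```python
# Sketch: choose the frequency minimizing energy for one communication phase.
def pick_frequency(phase_s, freqs_ghz, power_w, cpu_bound_frac,
                   switch_overhead_s=5e-4):
    """Return (frequency, energy). power_w[f]: package power at frequency f;
    only the CPU-bound fraction of the phase stretches when the clock slows."""
    best = None
    f_max = max(freqs_ghz)
    for f in freqs_ghz:
        t = phase_s * (cpu_bound_frac * f_max / f + (1.0 - cpu_bound_frac))
        t += switch_overhead_s if f != f_max else 0.0
        energy = power_w[f] * t
        if best is None or energy < best[1]:
            best = (f, energy)
    return best

power = {2.6: 95.0, 1.2: 45.0}
print(pick_frequency(phase_s=0.2, freqs_ghz=[2.6, 1.2], power_w=power,
                     cpu_bound_frac=0.1))   # the slow clock wins in a wait-heavy phase
```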

  18. PoPLAR: Portal for Petascale Lifescience Applications and Research

    Science.gov (United States)

    2013-01-01

    Background We are focusing specifically on fast data analysis and retrieval in bioinformatics that will have a direct impact on the quality of human health and the environment. The exponential growth of data generated in biology research, from small atoms to big ecosystems, necessitates an increasingly large computational component to perform analyses. Novel DNA sequencing technologies and complementary high-throughput approaches--such as proteomics, genomics, metabolomics, and meta-genomics--drive data-intensive bioinformatics. While individual research centers or universities could once provide for these applications, this is no longer the case. Today, only specialized national centers can deliver the level of computing resources required to meet the challenges posed by rapid data growth and the resulting computational demand. Consequently, we are developing massively parallel applications to analyze the growing flood of biological data and contribute to the rapid discovery of novel knowledge. Methods The efforts of previous National Science Foundation (NSF) projects provided for the generation of parallel modules for widely used bioinformatics applications on the Kraken supercomputer. We have profiled and optimized the code of some of the scientific community's most widely used desktop and small-cluster-based applications, including BLAST from the National Center for Biotechnology Information (NCBI), HMMER, and MUSCLE; scaled them to tens of thousands of cores on high-performance computing (HPC) architectures; made them robust and portable to next-generation architectures; and incorporated these parallel applications in science gateways with a web-based portal. Results This paper will discuss the various developmental stages, challenges, and solutions involved in taking bioinformatics applications from the desktop to petascale with a front-end portal for very-large-scale data analysis in the life sciences. Conclusions This research will help to bridge the gap

  19. Alya: Towards Exascale for Engineering Simulation Codes

    CERN Document Server

    Vazquez, Mariano; Koric, Seid; Artigues, Antoni; Aguado-Sierra, Jazmin; Aris, Ruth; Mira, Daniel; Calmet, Hadrien; Cucchietti, Fernando; Owen, Herbert; Taha, Ahmed; Cela, Jose Maria

    2014-01-01

    Alya is the BSC in-house HPC-based multi-physics simulation code. It is designed from scratch to run efficiently on parallel supercomputers, solving coupled problems. The target domain is engineering, with all its particular features: complex geometries and unstructured meshes, coupled multi-physics with exotic coupling schemes and physical models, ill-posed problems, flexibility needs for rapidly including new models, etc. Since its conception in 2004, Alya has shown scaling behaviour on an increasing number of cores. In this paper, we present its performance up to 100,000 cores on Blue Waters, the NCSA supercomputer. The selected tests are representative of the engineering world, with all the problematic features included: incompressible flow in a human respiratory system, a low-Mach combustion problem in a kiln furnace, and a coupled electro-mechanical problem in a heart. We show scalability plots for all cases, discussing all the aspects of such simulations, including solver convergence.

  20. 3rd International Conference on Numerical Analysis and Optimization : Theory, Methods, Applications and Technology Transfer

    CERN Document Server

    Grandinetti, Lucio; Purnama, Anton

    2015-01-01

    Presenting the latest findings in the field of numerical analysis and optimization, this volume balances pure research with practical applications of the subject. Accompanied by detailed tables, figures, and examinations of useful software tools, this volume will equip the reader to perform detailed and layered analysis of complex datasets. Many real-world complex problems can be formulated as optimization tasks. Such problems can be characterized as large scale, unconstrained, constrained, non-convex, non-differentiable, and discontinuous, and therefore require adequate computational methods, algorithms, and software tools. These same tools are often employed by researchers working in current IT hot topics such as big data, optimization and other complex numerical algorithms on the cloud, devising special techniques for supercomputing systems. The list of topics covered include, but are not limited to: numerical analysis, numerical optimization, numerical linear algebra, numerical differential equations, opt...

  1. A meshfree method and its applications to elasto-plastic problems

    Institute of Scientific and Technical Information of China (English)

    ZHANG Ji-fa; ZHANG Wen-pu; ZHENG Yao

    2005-01-01

    Standard finite element approaches are still ineffective in handling extreme material deformation, such as cases of large deformations and moving discontinuities, due to severe mesh distortion. Among the meshfree methods developed to overcome this ineffectiveness, the Reproducing Kernel Particle Method (RKPM) has demonstrated its great suitability for structural analysis. This paper presents applications of RKPM to elasto-plastic problems after a review of meshfree methods and an introduction to RKPM. A slope stability problem in geotechnical engineering is analyzed as an illustrative case. The corresponding numerical simulations were carried out on an SGI Onyx3900 supercomputer. Comparison of RKPM and the FEM under identical conditions showed that RKPM is more suitable for problems where extremely large strains occur, such as slope sliding.
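
    For reference, the standard reproducing kernel approximation underlying RKPM (textbook form, not quoted from this paper) writes the displacement field as a corrected kernel sum:

        u^h(\mathbf{x}) = \sum_{I=1}^{NP} \Phi_I(\mathbf{x})\, d_I ,
        \qquad
        \Phi_I(\mathbf{x}) = C(\mathbf{x};\,\mathbf{x}-\mathbf{x}_I)\,
                             \phi_a(\mathbf{x}-\mathbf{x}_I),

    where \phi_a is a compactly supported kernel with dilation parameter a, the d_I are nodal coefficients, and the correction function C is constructed so that the reproducing conditions \sum_I \Phi_I(\mathbf{x})\,\mathbf{x}_I^{\alpha} = \mathbf{x}^{\alpha} hold for all monomials up to the chosen order. Because the shape functions are built from scattered nodes alone, no mesh connectivity is needed, which is what lets the method tolerate the extreme strains described above.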

  2. RZBENCH: Performance evaluation of current HPC architectures using low-level and application benchmarks

    CERN Document Server

    Hager, Georg; Zeiser, Thomas; Wellein, Gerhard

    2007-01-01

    RZBENCH is a benchmark suite that was specifically developed to reflect the requirements of scientific supercomputer users at the University of Erlangen-Nuremberg (FAU). It comprises a number of application and low-level codes under a common build infrastructure that fosters maintainability and expandability. This paper reviews the structure of the suite and briefly introduces the most relevant benchmarks. In addition, some widely known standard benchmark codes are reviewed in order to emphasize the need for a critical review of often-cited performance results. Benchmark data is presented for the HLRB-II at LRZ Munich and a local InfiniBand Woodcrest cluster as well as two uncommon system architectures: A bandwidth-optimized InfiniBand cluster based on single socket nodes ("Port Townsend") and an early version of Sun's highly threaded T2 architecture ("Niagara 2").
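
    The low-level codes in such a suite are typically only a few lines each; a generic stand-in (not an actual RZBENCH kernel) is the STREAM-style triad below, which estimates achievable memory bandwidth, one of the numbers a critical reader should check published results against.

        /* Hedged sketch of a STREAM-style triad bandwidth test; the array
         * size and threading are illustrative, not taken from RZBENCH. */
        #include <stdio.h>
        #include <stdlib.h>
        #include <omp.h>

        int main(void) {
            const size_t n = 1u << 25;           /* ~33M doubles per array */
            double *a = malloc(n * sizeof *a);
            double *b = malloc(n * sizeof *b);
            double *c = malloc(n * sizeof *c);
            for (size_t i = 0; i < n; i++) { b[i] = 1.0; c[i] = 2.0; }

            double t0 = omp_get_wtime();
            #pragma omp parallel for
            for (size_t i = 0; i < n; i++)
                a[i] = b[i] + 3.0 * c[i];        /* triad: a = b + s*c */
            double dt = omp_get_wtime() - t0;

            /* three arrays move through memory: two reads and one write */
            printf("triad bandwidth: %.2f GB/s\n",
                   3.0 * n * sizeof(double) / dt / 1e9);
            free(a); free(b); free(c);
            return 0;
        }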

  3. A portable platform for accelerated PIC codes and its application to GPUs using OpenACC

    CERN Document Server

    Hariri, F; Jocksch, A; Lanti, E; Progsch, J; Messmer, P; Brunner, S; Gheller, G; Villard, L

    2016-01-01

    We present a portable platform, called PIC_ENGINE, for accelerating Particle-In-Cell (PIC) codes on heterogeneous many-core architectures such as Graphic Processing Units (GPUs). The aim of this development is efficient simulations on future exascale systems by allowing different parallelization strategies depending on the application problem and the specific architecture. To this end, this platform contains the basic steps of the PIC algorithm and has been designed as a test bed for different algorithmic options and data structures. Among the architectures that this engine can explore, particular attention is given here to systems equipped with GPUs. The study demonstrates that our portable PIC implementation based on the OpenACC programming model can achieve performance closely matching theoretical predictions. Using the Cray XC30 system, Piz Daint, at the Swiss National Supercomputing Centre (CSCS), we show that PIC_ENGINE running on an NVIDIA Kepler K20X GPU can outperform the one on an Intel Sandybridge ...
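
    To make the programming model concrete, the hedged sketch below offloads a simplified 1D particle push with an OpenACC parallel loop. The field layout, names, and interpolation are illustrative; PIC_ENGINE's actual kernels and data structures are more elaborate.

        /* Hedged sketch: a PIC-style particle push under OpenACC. Assumes the
         * arrays were previously copied to the device (hence 'present'). */
        void push(int np, double dt, double qm,
                  const double *restrict E,  /* field sampled at each particle */
                  double *restrict x, double *restrict v)
        {
            #pragma acc parallel loop present(E[0:np], x[0:np], v[0:np])
            for (int i = 0; i < np; i++) {
                v[i] += qm * E[i] * dt;   /* accelerate */
                x[i] += v[i] * dt;        /* move */
            }
        }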

  4. Pulling the Internet Together with Mosaic.

    Science.gov (United States)

    Sheehan, Mark

    1995-01-01

    Presents the history of the Internet with specific emphasis on Mosaic; discusses hypertext and hypermedia information; and describes software and hardware requirements. Sidebars include information on the National Center for Supercomputing Applications (NCSA); World Wide Web browsers for use in Windows, Macintosh, and X-Windows (UNIX); and…

  5. Artificial intelligence applications in the nuclear field: Achievements and prospects: The new challenge

    Energy Technology Data Exchange (ETDEWEB)

    Thomas, J.B. (Service d' Etudes de Reacteurs et de Mathematiques Appliquees, Centre d' Etudes de Saclay, 91 - Gif-sur-Yvette (France))

    1993-04-01

    Applications of Artificial Intelligence in the nuclear field started with the development of expert systems dedicated to off-line problems of diagnosis and maintenance. This demonstrated the capability of solving limited but complex problems by the use of explicit symbolic knowledge driven by the simple logic of early 'inference engines'. A second step aimed at solving more ambitious problems related to plant design and operation, with improved and generalized methodologies and tools: combining objects and first-order logic, developing deep knowledge representations of plants, structuring the knowledge bases, and extending the reasoning models towards time- and assumption-based truth maintenance. New limits appeared: for instance, the validation problem became critical. In order to work out the problems faced in the late eighties, powerful principles and methods are available: - integrating available knowledge bases and developing background knowledge bases gathering conceptual knowledge used in several fields of application; - carrying on the development of high-level reusable reasoning models and of distributed intelligence models and tools; - providing the systems with a self-assessment and self-criticism capability, through the cooperation of several agents reasoning at different levels; - bringing in neural networks and connecting them to knowledge-base systems. The previous developments require extensive resources. Big projects that can bear their costs exist in the following areas: - 'Knowledge-Aided Supercomputing', where the supervision of the computing process by intelligent software could ensure the synergy between modern, highly modular and versatile software, supercomputing capabilities, and the end user in charge of the specifications of the computation. In the field of Reactor Physics, a project of extended integration is specified in CEA (CARENE), in order to improve the connection between methods and numerical schemes available in APOLLO2, CRONOS2...

  6. Pervasive Restart In MOOSE-based Applications

    Energy Technology Data Exchange (ETDEWEB)

    Derek Gaston; Cody Permann; David Andrs; John Peterson; Andrew Slaughter; Jason Miller

    2014-01-01

    Multiphysics applications are inherently complicated. Solving for multiple, interacting physical phenomena involves the solution of multiple equations, and each equation has its own data dependencies. Feeding the correct data to these equations at exactly the right time requires extensive effort in software design. In an ideal world, multiphysics applications always run to completion and produce correct answers. Unfortunately, in reality, there can be many reasons why a simulation might fail: power outage, system failure, exceeding a runtime allotment on a supercomputer, failure of the solver to converge, etc. A failure after many hours spent computing can be a significant setback for a project. Therefore, the ability to “continue” a solve from the point of failure, rather than starting again from scratch, is an essential component of any high-quality simulation tool. This process of “continuation” is commonly termed “restart” in the computational community. While the concept of restarting an application sounds ideal, the aforementioned complexities and data dependencies present in multiphysics applications make its implementation decidedly non-trivial. A running multiphysics calculation accumulates an enormous amount of “state”: current time, solution history, material properties, status of mechanical contact, etc. This “state” data comes in many different forms, including scalar, tensor, vector, and arbitrary, application-specific data types. To be able to restart an application, you must be able to both store and retrieve this data, effectively recreating the state of the application before the failure. When utilizing the Multiphysics Object Oriented Simulation Environment (MOOSE) framework developed at Idaho National Laboratory, this state data is stored both internally within the framework itself (such as solution vectors and the current time) and within the applications that use the framework. In order to implement restart in MOOSE
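
    Stripped to its essentials, the store/retrieve pattern described above looks like the hedged C sketch below, which checkpoints just a solution vector and the current time. This is a generic illustration of the principle, not MOOSE's actual restart API, which handles many typed, application-registered data blocks.

        /* Hedged sketch of checkpoint/restart for a minimal "state":
         * one solution vector plus the current simulation time. */
        #include <stdio.h>
        #include <stdlib.h>

        typedef struct { double time; size_t n; double *solution; } State;

        int checkpoint_write(const char *path, const State *s) {
            FILE *f = fopen(path, "wb");
            if (!f) return -1;
            fwrite(&s->time, sizeof s->time, 1, f);
            fwrite(&s->n, sizeof s->n, 1, f);
            fwrite(s->solution, sizeof *s->solution, s->n, f);
            return fclose(f);
        }

        int checkpoint_read(const char *path, State *s) {
            FILE *f = fopen(path, "rb");
            if (!f) return -1;                  /* no checkpoint: cold start */
            if (fread(&s->time, sizeof s->time, 1, f) != 1) { fclose(f); return -1; }
            if (fread(&s->n, sizeof s->n, 1, f) != 1) { fclose(f); return -1; }
            s->solution = malloc(s->n * sizeof *s->solution);
            if (fread(s->solution, sizeof *s->solution, s->n, f) != s->n) {
                fclose(f); return -1;
            }
            return fclose(f);                   /* warm restart from here */
        }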

  7. Predictive Performance Tuning of OpenACC Accelerated Applications

    KAUST Repository

    Siddiqui, Shahzeb

    2014-05-04

    Graphics Processing Units (GPUs) are gradually becoming mainstream in supercomputing as their capabilities to significantly accelerate a large spectrum of scientific applications have been clearly identified and proven. Moreover, with the introduction of high level programming models such as OpenACC [1] and OpenMP 4.0 [2], these devices are becoming more accessible and practical to use by a larger scientific community. However, performance optimization of OpenACC accelerated applications usually requires an in-depth knowledge of the hardware and software specifications. We suggest a prediction-based performance tuning mechanism [3] to quickly tune OpenACC parameters for a given application to dynamically adapt to the execution environment on a given system. This approach is applied to a finite difference kernel to tune the OpenACC gang and vector clauses for mapping the compute kernels into the underlying accelerator architecture. Our experiments show a significant performance improvement against the default compiler parameters and a faster tuning by an order of magnitude compared to the brute force search tuning.
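
    The parameters being tuned are the gang and vector clauses on the OpenACC compute construct. A hedged example on a generic finite-difference kernel follows; the values 256 and 128 are placeholders, not tuned results from the paper.

        /* Hedged sketch: the num_gangs/vector_length knobs a tuner searches over. */
        void stencil(int n, const double *restrict in, double *restrict out)
        {
            #pragma acc parallel loop num_gangs(256) vector_length(128) \
                        copyin(in[0:n]) copyout(out[0:n])
            for (int i = 1; i < n - 1; i++)
                out[i] = 0.5 * (in[i - 1] + in[i + 1]) - in[i];
        }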

  8. The Dark Energy Survey Data Management System

    Energy Technology Data Exchange (ETDEWEB)

    Mohr, Joseph J.; /Illinois U., Urbana, Astron. Dept. /Illinois U., Urbana; Barkhouse, Wayne; /North Dakota U.; Beldica, Cristina; /Illinois U., Urbana; Bertin, Emmanuel; /Paris, Inst. Astrophys.; Dora Cai, Y.; /NCSA, Urbana; Nicolaci da Costa, Luiz A.; /Rio de Janeiro Observ.; Darnell, J.Anthony; /Illinois U., Urbana, Astron. Dept.; Daues, Gregory E.; /NCSA, Urbana; Jarvis, Michael; /Pennsylvania U.; Gower, Michelle; /NCSA, Urbana; Lin, Huan; /Fermilab /Rio de Janeiro Observ.

    2008-07-01

    The Dark Energy Survey (DES) collaboration will study cosmic acceleration with a 5000 deg² grizY survey in the southern sky over 525 nights from 2011-2016. The DES data management (DESDM) system will be used to process and archive these data and the resulting science-ready data products. The DESDM system consists of an integrated archive, a processing framework, an ensemble of astronomy codes and a data access framework. We are developing the DESDM system for operation in the high performance computing (HPC) environments at the National Center for Supercomputing Applications (NCSA) and Fermilab. Operating the DESDM system in an HPC environment offers both speed and flexibility. We will employ it for our regular nightly processing needs, and for more compute-intensive tasks such as large-scale image coaddition campaigns, extraction of weak lensing shear from the full survey dataset, and massive seasonal reprocessing of the DES data. Data products will be available to the Collaboration and later to the public through a virtual-observatory-compatible web portal. Our approach leverages investments in publicly available HPC systems, greatly reducing hardware and maintenance costs to the project, which must deploy and maintain only the storage, database platforms, and orchestration and web portal nodes that are specific to DESDM. In Fall 2007, we tested the current DESDM system on both simulated and real survey data. We used TeraGrid to process 10 simulated DES nights (3 TB of raw data), ingesting and calibrating approximately 250 million objects into the DES Archive database. We also used DESDM to process and calibrate over 50 nights of survey data acquired with the Mosaic2 camera. Comparison to truth tables in the case of the simulated data and internal crosschecks in the case of the real data indicate that astrometric and photometric data quality is excellent.

  9. RPF: An Extensible, Cross-Platform, Binary File Format for Radiation Physics Data

    Energy Technology Data Exchange (ETDEWEB)

    Ham, C L

    2002-09-10

    Lawrence Livermore National Laboratory's Radiation Technology Group (RTG) uses a number of computer codes for simulation and analysis of radiation data. The number of incompatible formats in which these data are presented has continued to multiply. In the 1980s a Common Data Format (CDF, see Appendix A) was devised for internal use by the RTG. This format represented a single gamma-ray spectrum as ASCII energy/count pairs preceded by an ASCII header. The ASCII representation of the data assured that it was compatible on any computing platform, and this format is still in use. In the mid-1990s it became apparent that instrument systems of greater complexity would demand a file format of larger capacity to support systems then on the drawing board, including networks of sensors collecting time series of gamma-ray spectra. These systems were in the planning stage and defined data structures were not available. It became apparent that a new storage format for nuclear measurements data would be needed, and it would have to be flexible and extensible to accommodate the requirements of systems of the future. As part of an LDRD, we began to investigate what others were doing, especially in the high-energy physics community, to deal with the large volumes of data being generated. Of particular interest was the very general Hierarchical Data Format (HDF), developed and maintained by the National Center for Supercomputing Applications (NCSA), which we ultimately used to develop the Radiation Physics Format (RPF). The HDF subroutine library provides users with the ability to customize a data file format based on standard calls to the HDF subroutine library. The RPF was developed and deployed on Sun and Hewlett-Packard workstations running their proprietary versions of UNIX.
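
    To show the flavor of building a custom format on top of the HDF library, the hedged sketch below writes a single gamma-ray spectrum with a calibration attribute. It uses the modern HDF5 C API purely for illustration (RPF was built on the HDF release of its day), and the file, dataset, and attribute names are invented.

        /* Hedged sketch: a spectrum dataset plus calibration metadata,
         * in the spirit of RPF but written with today's HDF5 C API. */
        #include <hdf5.h>

        int main(void) {
            float counts[4096] = {0};
            hsize_t dims[1] = {4096};

            hid_t file = H5Fcreate("spectrum.rpf", H5F_ACC_TRUNC,
                                   H5P_DEFAULT, H5P_DEFAULT);
            hid_t space = H5Screate_simple(1, dims, NULL);
            hid_t dset = H5Dcreate2(file, "/gamma_spectrum", H5T_NATIVE_FLOAT,
                                    space, H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
            H5Dwrite(dset, H5T_NATIVE_FLOAT, H5S_ALL, H5S_ALL, H5P_DEFAULT, counts);

            /* Attach calibration metadata, as the old ASCII header once did. */
            float gain = 0.5f;                       /* keV per channel, invented */
            hid_t aspace = H5Screate(H5S_SCALAR);
            hid_t attr = H5Acreate2(dset, "keV_per_channel", H5T_NATIVE_FLOAT,
                                    aspace, H5P_DEFAULT, H5P_DEFAULT);
            H5Awrite(attr, H5T_NATIVE_FLOAT, &gain);

            H5Aclose(attr); H5Sclose(aspace);
            H5Dclose(dset); H5Sclose(space); H5Fclose(file);
            return 0;
        }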

  10. Are Cloud Environments Ready for Scientific Applications?

    Science.gov (United States)

    Mehrotra, P.; Shackleford, K.

    2011-12-01

    Cloud computing environments are becoming widely available both in the commercial and government sectors. They provide flexibility to rapidly provision resources in order to meet dynamic and changing computational needs without the customers incurring capital expenses and/or requiring technical expertise. Clouds also provide reliable access to resources even though the end-user may not have in-house expertise for acquiring or operating such resources. Consolidation and pooling in a cloud environment allow organizations to achieve economies of scale in provisioning or procuring computing resources and services. Because of these and other benefits, many businesses and organizations are migrating their business applications (e.g., websites, social media, and business processes) to cloud environments, as evidenced by the commercial success of offerings such as the Amazon EC2. In this paper, we focus on the feasibility of utilizing cloud environments for scientific workloads and workflows particularly of interest to NASA scientists and engineers. There is a wide spectrum of such technical computations. These applications range from small workstation-level computations to mid-range computing requiring small clusters to high-performance simulations requiring supercomputing systems with high-bandwidth/low-latency interconnects. Data-centric applications manage and manipulate large data sets such as satellite observational data and/or data previously produced by high-fidelity modeling and simulation computations. Most of the applications are run in batch mode with static resource requirements. However, there do exist situations that have dynamic demands, particularly ones with public-facing interfaces providing information to the general public, collaborators and partners, as well as to internal NASA users. In the last few months we have been studying the suitability of cloud environments for NASA's technical and scientific workloads. We have ported several applications to

  11. Framework Application for Core Edge Transport Simulation (FACETS)

    Energy Technology Data Exchange (ETDEWEB)

    Malony, Allen D; Shende, Sameer S; Huck, Kevin A; Mr. Alan Morris, and Mr. Wyatt Spear

    2012-03-14

    The goal of the FACETS project (Framework Application for Core-Edge Transport Simulations) was to provide a multiphysics, parallel framework application (FACETS) that will enable whole-device modeling for the U.S. fusion program, to provide the modeling infrastructure needed for ITER, the next step fusion confinement device. Through use of modern computational methods, including component technology and object oriented design, FACETS is able to switch from one model to another for a given aspect of the physics in a flexible manner. This enables use of simplified models for rapid turnaround or high-fidelity models that can take advantage of the largest supercomputer hardware. FACETS does so in a heterogeneous parallel context, where different parts of the application execute in parallel by utilizing task farming, domain decomposition, and/or pipelining as needed and applicable. ParaTools, Inc. was tasked with supporting the performance analysis and tuning of the FACETS components and framework in order to achieve the parallel scaling goals of the project. The TAU Performance System® was used for instrumentation, measurement, archiving, and profile / tracing analysis. ParaTools, Inc. also assisted in FACETS performance engineering efforts. Through the use of the TAU Performance System, ParaTools provided instrumentation, measurement, analysis and archival support for the FACETS project. Performance optimization of key components has yielded significant performance speedups. TAU was integrated into the FACETS build for both the full coupled application and the UEDGE component. The performance database provided archival storage of the performance regression testing data generated by the project, and helped to track improvements in the software development.

  12. LDRD final report : managing shared memory data distribution in hybrid HPC applications.

    Energy Technology Data Exchange (ETDEWEB)

    Merritt, Alexander M. (Georgia Institute of Technology, Atlanta, GA); Pedretti, Kevin Thomas Tauke

    2010-09-01

    MPI is the dominant programming model for distributed memory parallel computers, and is often used as the intra-node programming model on multi-core compute nodes. However, application developers are increasingly turning to hybrid models that use threading within a node and MPI between nodes. In contrast to MPI, most current threaded models do not require application developers to deal explicitly with data locality. With the increasing core counts and deeper NUMA hierarchies seen in the upcoming LANL/SNL 'Cielo' capability supercomputer, data distribution places an upper bound on intra-node scalability within threaded applications. Data locality therefore has to be identified at runtime using static memory allocation policies such as first-touch or next-touch, or specified by the application user at launch time. We evaluate several existing techniques for managing data distribution using micro-benchmarks on an AMD 'Magny-Cours' system with 24 cores among 4 NUMA domains, and argue for the adoption of a dynamic runtime system implemented at the kernel level, employing a novel page table replication scheme to gather per-NUMA-domain memory access traces.
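
    The first-touch policy mentioned above fits in a few lines: the thread that first writes a page causes Linux to place that page in its NUMA domain, so initializing with the same OpenMP schedule as the compute loop keeps accesses local. A minimal hedged sketch:

        /* Hedged sketch of first-touch NUMA placement with OpenMP. */
        #include <omp.h>
        #include <stdlib.h>

        int main(void) {
            const size_t n = 1u << 26;
            double *a = malloc(n * sizeof *a);

            /* Parallel first touch: page placement follows this distribution. */
            #pragma omp parallel for schedule(static)
            for (size_t i = 0; i < n; i++)
                a[i] = 0.0;

            /* The compute phase with the same static schedule reuses the
             * same thread-to-page mapping, keeping accesses NUMA-local. */
            #pragma omp parallel for schedule(static)
            for (size_t i = 0; i < n; i++)
                a[i] += 1.0;

            free(a);
            return 0;
        }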

  13. An Application-Based Performance Evaluation of NASAs Nebula Cloud Computing Platform

    Science.gov (United States)

    Saini, Subhash; Heistand, Steve; Jin, Haoqiang; Chang, Johnny; Hood, Robert T.; Mehrotra, Piyush; Biswas, Rupak

    2012-01-01

    The high performance computing (HPC) community has shown tremendous interest in exploring cloud computing because of its high potential. In this paper, we examine the feasibility, performance, and scalability of production-quality scientific and engineering applications of interest to NASA on NASA's cloud computing platform, called Nebula, hosted at Ames Research Center. This work represents a comprehensive evaluation of Nebula using NUTTCP, HPCC, NPB, I/O, and MPI function benchmarks as well as four applications representative of the NASA HPC workload. Specifically, we compare Nebula performance on some of these benchmarks and applications to that of NASA's Pleiades supercomputer, a traditional HPC system. We also investigate the impact of virtIO and jumbo frames on interconnect performance. Overall results indicate that on Nebula (i) virtIO and jumbo frames improve network bandwidth by a factor of 5x, (ii) there is a significant virtualization layer overhead of about 10% to 25%, (iii) write performance is lower by a factor of 25x, (iv) latency for short MPI messages is very high, and (v) overall performance is 15% to 48% lower than that on Pleiades for NASA HPC applications. We also comment on the usability of the cloud platform.

  14. Will Allis Prize Talk: Electron Collisions - Experiment, Theory and Applications

    Science.gov (United States)

    Bartschat, Klaus

    2016-05-01

    Electron collisions with atoms, ions, and molecules represent one of the very early topics of quantum mechanics. In spite of the field's maturity, a number of recent developments in detector technology (e.g., the ``reaction microscope'' or the ``magnetic-angle changer'') and the rapid increase in computational resources have resulted in significant progress in the measurement, understanding, and theoretical/computational description of few-body Coulomb problems. Close collaborations between experimentalists and theorists worldwide continue to produce high-quality benchmark data, which allow for thoroughly testing and further developing a variety of theoretical approaches. As a result, it has now become possible to reliably calculate the vast amount of atomic data needed for detailed modelling of the physics and chemistry of planetary atmospheres, the interpretation of astrophysical data, optimizing the energy transport in reactive plasmas, and many other topics - including light-driven processes, in which electrons are produced by continuous or short-pulse ultra-intense electromagnetic radiation. In this talk, I will highlight some of the recent developments that have had a major impact on the field. This will be followed by showcasing examples, in which accurate electron collision data enabled applications in fields beyond traditional AMO physics. Finally, open problems and challenges for the future will be outlined. I am very grateful for fruitful scientific collaborations with many colleagues, and the long-term financial support by the NSF through the Theoretical AMO and Computational Physics programs, as well as supercomputer resources through TeraGrid and XSEDE.

  15. Performance optimization of scientific applications on emerging architectures

    Science.gov (United States)

    Dursun, Hikmet

    The shift to the many-core architecture design paradigm in the computer market has provided unprecedented computational capabilities. This also marks the end of the free-ride era: scientific software must now evolve with new chips. Hence, it is of great importance to develop large legacy-code optimization frameworks to achieve an optimal system architecture-algorithm mapping that maximizes processor utilization and thereby achieves higher application performance. To address this challenge, this thesis studies and develops scalable algorithms for leveraging many-core resources optimally to improve the performance of massively parallel scientific applications. This work presents a systematic approach to optimizing scientific codes on emerging architectures, which consists of three major steps: (1) develop a performance profiling framework to identify application performance bottlenecks on clusters of emerging architectures; (2) explore common algorithmic kernels in a suite of real-world scientific applications and develop performance tuning strategies to provide insight into how to maximally utilize the underlying hardware; and (3) unify experience in performance optimization to develop a top-down optimization framework for the optimization of scientific applications on emerging high-performance computing platforms. This thesis makes the following contributions. First, we have designed and implemented a performance analysis methodology for Cell-accelerated clusters. Two parallel scientific applications, lattice Boltzmann (LB) flow simulation and atomistic molecular dynamics (MD) simulation, are analyzed and valuable performance insights are gained on a Cell processor based PlayStation3 cluster as well as a hybrid Opteron+Cell based cluster similar to the design of Roadrunner, the world's first petaflop supercomputer. Second, we have developed a novel parallelization framework for finite-difference time-domain applications. The approach is validated in a seismic

  16. Research on Non-intervention Information Acquisition and Public Sentiment Analysis System for Public Wi-Fi Wireless Networks Based on Supercomputer Platform

    Institute of Scientific and Technical Information of China (English)

    Yang Ming; Shu Minglei; Gu Weidong; Guo Qiang; Zhou Shuwang

    2013-01-01

    An information acquisition and public sentiment analysis system for city-wide public Wi-Fi wireless networks is presented, which uses the petaflops computing platform of the National Supercomputer Center in Ji'nan. Based on non-intervention wireless packet capture technology, Web page recovery and fault-tolerant reassembly technology, multiple text mining technologies, and mass data processing technology, the system can collect forensic evidence on illegal behavior in public Wi-Fi wireless networks, accurately analyze and predict network public sentiment, and provide comprehensive and accurate references for the public-opinion guidance work of the relevant departments.

  17. Differential associations between Social Anxiety Disorder, family cohesion, and suicidality across racial/ethnic groups: Findings from the National Comorbidity Survey-Adolescent (NCS-A).

    Science.gov (United States)

    Rapp, Amy M; Lau, Anna; Chavira, Denise A

    2016-09-20

    The proposed research seeks to introduce a novel model relating Social Anxiety Disorder (SAD) and suicide outcomes (i.e., passive suicidal ideation, active suicidal ideation, and suicide attempts) in diverse adolescents. This model posits that family cohesion is one pathway by which suicide risk is increased for socially anxious youth, and predicts that the relationships between these variables may be of different strength in Latino and White subgroups and across gender. Data from a sample of Latino (n=1922) and non-Hispanic White (hereafter referred to as White throughout) (n=5648) male and female adolescents who participated in the National Comorbidity Survey-Adolescent were used for this study. Analyses were conducted using generalized structural equation modeling. Results showed that the mediation model held for White females. Further examination of direct pathways highlighted SAD as a risk factor unique to Latinos for active suicidal ideation and suicide attempt, over and above comorbid depression and other relevant contextual factors. Additionally, family cohesion showed a strong association with suicide outcomes across groups, with some inconsistent findings for White males. Overall, it appears that the mechanism by which SAD increases risk for suicidality is different across groups, indicating further need to identify relevant mediators, especially for racial/ethnic minority youth.

  18. Service Utilization for Lifetime Mental Disorders in U.S. Adolescents: Results of the National Comorbidity Survey-Adolescent Supplement (NCS-A)

    Science.gov (United States)

    Merikangas, Kathleen Ries; He, Jian-ping; Burstein, Marcy; Swendsen, Joel; Avenevoli, Shelli; Case, Brady; Georgiades, Katholiki; Heaton, Leanne; Swanson, Sonja; Olfson, Mark

    2011-01-01

    Objective: Mental health policy for youth has been constrained by a paucity of nationally representative data concerning patterns and correlates of mental health service utilization in this segment of the population. The objectives of this investigation were to examine the rates and sociodemographic correlates of lifetime mental health service use…

  19. Neighborhood communication paradigm to increase scalability in large-scale dynamic scientific applications

    KAUST Repository

    Ovcharenko, Aleksandr

    2012-03-01

    This paper introduces a general-purpose communication package built on top of MPI which is aimed at improving inter-processor communications independently of the supercomputer architecture being considered. The package is developed to support parallel applications that rely on computation characterized by a large number of messages of various sizes, often small, that are focused within processor neighborhoods. In some cases, such as solvers having static mesh partitions, the number and size of messages are known a priori. However, in other cases, such as mesh adaptation, the messages evolve and vary in number and size and include the dynamic movement of partition objects. The current package provides a utility for dynamic applications based on two key attributes: (i) explicit consideration of the neighborhood communication pattern to avoid many-to-many calls and also to reduce the number of collective calls to a minimum, and (ii) use of non-blocking MPI functions along with message packing to manage message flow control and reduce the number and time of communication calls. The test application demonstrated is parallel unstructured mesh adaptation. Results on IBM Blue Gene/P and Cray XE6 computers show that the use of neighborhood-based communication control leads to scalable results when executing generally imbalanced mesh adaptation runs. © 2011 Elsevier B.V. All rights reserved.
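
    A minimal hedged sketch of attribute (ii), non-blocking point-to-point exchange restricted to a precomputed neighbor list (the actual package adds message packing, flow control, and support for dynamically changing neighborhoods):

        /* Hedged sketch: exchange with known neighbors only, using
         * non-blocking MPI calls instead of many-to-many collectives. */
        #include <mpi.h>
        #include <stdlib.h>

        void neighborhood_exchange(int nnbr, const int *nbr,
                                   double **sendbuf, const int *sendcnt,
                                   double **recvbuf, const int *recvcnt)
        {
            MPI_Request *req = malloc(2 * nnbr * sizeof *req);
            for (int i = 0; i < nnbr; i++)       /* post receives first */
                MPI_Irecv(recvbuf[i], recvcnt[i], MPI_DOUBLE, nbr[i], 0,
                          MPI_COMM_WORLD, &req[i]);
            for (int i = 0; i < nnbr; i++)       /* then the sends */
                MPI_Isend(sendbuf[i], sendcnt[i], MPI_DOUBLE, nbr[i], 0,
                          MPI_COMM_WORLD, &req[nnbr + i]);
            MPI_Waitall(2 * nnbr, req, MPI_STATUSES_IGNORE);
            free(req);
        }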

  20. ExM:System Support for Extreme-Scale, Many-Task Applications

    Energy Technology Data Exchange (ETDEWEB)

    Katz, Daniel S

    2011-05-31

    The ever-increasing power of supercomputer systems is both driving and enabling the emergence of new problem-solving methods that require the efficient execution of many concurrent and interacting tasks. Methodologies such as rational design (e.g., in materials science), uncertainty quantification (e.g., in engineering), parameter estimation (e.g., for chemical and nuclear potential functions, and in economic energy systems modeling), massive dynamic graph pruning (e.g., in phylogenetic searches), Monte-Carlo-based iterative fixing (e.g., in protein structure prediction), and inverse modeling (e.g., in reservoir simulation) all have these requirements. These many-task applications frequently have aggregate computing needs that demand the fastest computers. For example, proposed next-generation climate model ensemble studies will involve 1,000 or more runs, each requiring 10,000 cores for a week, to characterize model sensitivity to initial condition and parameter uncertainty. The goal of the ExM project is to achieve the technical advances required to execute such many-task applications efficiently, reliably, and easily on petascale and exascale computers. In this way, we will open up extreme-scale computing to new problem solving methods and application classes. In this document, we report on the combined technical progress of the collaborative ExM project, and the institutional financial status of the portion of the project at the University of Chicago, over the first 8 months (through April 30, 2011)

  1. Prometheus: Scalable and Accurate Emulation of Task-Based Applications on Many-Core Systems.

    Energy Technology Data Exchange (ETDEWEB)

    Kestor, Gokcen; Gioiosa, Roberto; Chavarría-Miranda, Daniel

    2015-03-01

    Modeling the performance of non-deterministic parallel applications on future many-core systems requires the development of novel simulation and emulation techniques and tools. We present “Prometheus”, a fast, accurate and modular emulation framework for task-based applications. By raising the level of abstraction and focusing on runtime synchronization, Prometheus can accurately predict applications’ performance on very large many-core systems. We validate our emulation framework against two real platforms (AMD Interlagos and Intel MIC) and report error rates generally below 4%. We then evaluate Prometheus’ performance and scalability: our results show that Prometheus can emulate a task-based application on a system with 512K cores in 11.5 hours. We present two test cases that show how Prometheus can be used to study the performance and behavior of systems that present some of the characteristics expected from exascale supercomputer nodes, such as active power management and processors with a high number of cores but reduced cache per core.

  2. Mapping PetaSHA Applications to TeraGrid Architectures

    Science.gov (United States)

    Cui, Y.; Moore, R.; Olsen, K.; Zhu, J.; Dalguer, L. A.; Day, S.; Cruz-Atienza, V.; Maechling, P.; Jordan, T.

    2007-12-01

    The Southern California Earthquake Center (SCEC) has a science program in developing an integrated cyberfacility - PetaSHA - for executing physics-based seismic hazard analysis (SHA) computations. The NSF has awarded PetaSHA 15 million allocation service units this year on the fastest supercomputers available within the NSF TeraGrid. However, one size does not fit all; a range of systems is needed to support this effort at different stages of the simulations. Enabling PetaSHA simulations on these TeraGrid architectures to solve both dynamic rupture and seismic wave propagation problems has been a challenge at both the hardware and software levels, requiring an adaptation procedure to meet the specific requirements of each architecture. It is important to determine how fundamental system attributes affect application performance. We present an adaptive approach in our PetaSHA application that enables the simultaneous optimization of both computation and communication at run-time using flexible settings. These techniques optimize initialization, source/media partitioning, and MPI-IO output in different ways to achieve optimal performance on the target machines. The resulting code is a factor of four faster than the original version. New MPI-IO capabilities have been added for the accurate Staggered-Grid Split-Node (SGSN) method for dynamic rupture propagation in the velocity-stress staggered-grid finite difference scheme (Dalguer and Day, JGR, 2007). We use execution workflows across TeraGrid sites for managing the resulting data volumes. Our lessons learned indicate that minimizing time to solution is most critical, in particular when scheduling large-scale simulations across supercomputer sites. The TeraShake platform has been ported to multiple architectures including the TACC Dell Lonestar and Abe clusters, the Cray XT3 BigBen, and Blue Gene/L. Parallel efficiency of 96% with the PetaSHA application Olsen-AWM has been demonstrated on 40,960 Blue Gene/L processors at the IBM TJ Watson Center. Notable

  3. Accelerating Communication-Intensive Applications via Novel Data Compression Techniques Project

    Data.gov (United States)

    National Aeronautics and Space Administration — Processor speed has traditionally grown at a rate faster than that of communication speed in computer and supercomputer networks, and it is expected that this trend...

  4. JESPP: Joint Experimentation on Scalable Parallel Processors Supercomputers

    Science.gov (United States)

    2010-03-01

    for traceability of individual programmers during the tumult of operations in a simulation bay, where many operators will need to log in, use...of which remains to be apprehended. The author’s experience in teaching an introductory course on Data Mining at the Viterbi School of Engineering

  5. Associative memories for supercomputers. Final report, July 1989-January 1991

    Energy Technology Data Exchange (ETDEWEB)

    Esener, S.C.; Marchand, P.; Krishnamoorthy, A.

    1992-12-01

    A motionless-head 2-D parallel readout system for optical disks is presented. Its unique features are discussed and it is compared to various parallel-access optical storage media. The motionless-head parallel readout system for optical disks is shown to meet current and near-term future requirements for high-performance secondary storage. In order to select a memory architecture compatible with the motionless-head disk, inner-product and outer-product associative memory algorithms are compared in terms of their storage requirements, search times, system complexities, and fault tolerance. Based on this comparison, the page-serial, bit-parallel inner-product method is shown to be well suited to implementation with parallel readout optical disks and opto-electronic XNOR gate arrays, using for instance the Si/PLZT technology. Finally, the associative memory system design is presented. Keywords: Memory, Associative memory, Optical disks.

  6. Benchmarking and tuning the MILC code on clusters and supercomputers

    CERN Document Server

    Gottlieb, S

    2002-01-01

    Recently, we have benchmarked and tuned the MILC code on a number of architectures including Intel Itanium and Pentium IV (PIV), dual-CPU Athlon, and the latest Compaq Alpha nodes. Results will be presented for many of these, and we shall discuss some simple code changes that can result in a very dramatic speedup of the KS conjugate gradient on processors with more advanced memory systems such as PIV, IBM SP and Alpha.

  7. Parametric Parallel Simulation of Discrete Event Systems on SIMD Supercomputers

    Science.gov (United States)

    1994-05-01

    [Equation residue from the thesis abstract: equations (5.20) and (5.21), giving the probabilities of accepting an arrival or a departure at node i (and of a null event) in the queueing network model, did not survive extraction. The recoverable text states that the departure rate from node j is 0 when that node is in state 0 and mu_j otherwise.]

  8. Parallel Earthquake Simulations on Large-Scale Multicore Supercomputers

    KAUST Repository

    Wu, Xingfu

    2011-01-01

    Earthquakes are one of the most destructive natural hazards on our planet Earth. Huge earthquakes striking offshore may cause devastating tsunamis, as evidenced by the 11 March 2011 Japan (moment magnitude Mw9.0) and the 26 December 2004 Sumatra (Mw9.1) earthquakes. Earthquake prediction (in terms of the precise time, place, and magnitude of a coming earthquake) is arguably unfeasible in the foreseeable future. To mitigate seismic hazards from future earthquakes in earthquake-prone areas, such as California and Japan, scientists have been using numerical simulations to study earthquake rupture propagation along faults and seismic wave propagation in the surrounding media on ever-advancing modern computers over the past several decades. In particular, ground motion simulations for past and future (possible) significant earthquakes have been performed to understand factors that affect ground shaking in populated areas, and to provide ground shaking characteristics and synthetic seismograms for emergency preparation and design of earthquake-resistant structures. These simulation results can guide the development of more rational seismic provisions, leading to safer, more efficient, and economical structures in earthquake-prone regions.

  9. Supercomputing for weather and climate modelling: convenience or necessity

    CSIR Research Space (South Africa)

    Landman, WA

    2009-12-01

    Full Text Available Weather and climate modelling require dedicated computer infrastructure to generate high-resolution, large-ensemble runs of various models with different configurations, in order to optimise operational forecasts and climate projections. High...

  10. Multiscale Hy3S: Hybrid stochastic simulation for supercomputers

    Directory of Open Access Journals (Sweden)

    Kaznessis Yiannis N

    2006-02-01

    Full Text Available Abstract Background Stochastic simulation has become a useful tool to both study natural biological systems and design new synthetic ones. By capturing the intrinsic molecular fluctuations of "small" systems, these simulations produce a more accurate picture of single cell dynamics, including interesting phenomena missed by deterministic methods, such as noise-induced oscillations and transitions between stable states. However, the computational cost of the original stochastic simulation algorithm can be high, motivating the use of hybrid stochastic methods. Hybrid stochastic methods partition the system into multiple subsets and describe each subset as a different representation, such as a jump Markov, Poisson, continuous Markov, or deterministic process. By applying valid approximations and self-consistently merging disparate descriptions, a method can be considerably faster, while retaining accuracy. In this paper, we describe Hy3S, a collection of multiscale simulation programs. Results Building on our previous work on developing novel hybrid stochastic algorithms, we have created the Hy3S software package to enable scientists and engineers to both study and design extremely large well-mixed biological systems with many thousands of reactions and chemical species. We have added adaptive stochastic numerical integrators to permit the robust simulation of dynamically stiff biological systems. In addition, Hy3S has many useful features, including embarrassingly parallelized simulations with MPI; special discrete events, such as transcriptional and translation elongation and cell division; mid-simulation perturbations in both the number of molecules of species and reaction kinetic parameters; combinatorial variation of both initial conditions and kinetic parameters to enable sensitivity analysis; use of NetCDF optimized binary format to quickly read and write large datasets; and a simple graphical user interface, written in Matlab, to help users create biological systems and analyze data. We demonstrate the accuracy and efficiency of Hy3S with examples, including a large-scale system benchmark and a complex bistable biochemical network with positive feedback. The software itself is open-sourced under the GPL license and is modular, allowing users to modify it for their own purposes. Conclusion Hy3S is a powerful suite of simulation programs for simulating the stochastic dynamics of networks of biochemical reactions. Its first public version enables computational biologists to more efficiently investigate the dynamics of realistic biological systems.
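
    For orientation, the "original stochastic simulation algorithm" whose cost motivates the hybrid approach is Gillespie's direct method. A toy C version for the two-reaction chain A -> B -> C is sketched below; the rate constants and populations are arbitrary, and Hy3S itself is a far more capable package.

        /* Hedged sketch of Gillespie's direct SSA for A -> B -> C. */
        #include <stdio.h>
        #include <stdlib.h>
        #include <math.h>

        int main(void) {
            double k1 = 1.0, k2 = 0.5;          /* illustrative rate constants */
            long A = 1000, B = 0, C = 0;
            double t = 0.0, t_end = 10.0;
            srand(42);

            while (t < t_end) {
                double a1 = k1 * A, a2 = k2 * B, a0 = a1 + a2;
                if (a0 <= 0.0) break;            /* no reaction can fire */
                double r1 = (rand() + 1.0) / (RAND_MAX + 2.0);
                double r2 = rand() / (RAND_MAX + 1.0);
                t += -log(r1) / a0;              /* exponential waiting time */
                if (r2 * a0 < a1) { A--; B++; }  /* fire A -> B */
                else              { B--; C++; }  /* fire B -> C */
            }
            printf("t=%.3f A=%ld B=%ld C=%ld\n", t, A, B, C);
            return 0;
        }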

  11. Study of ATLAS TRT performance with GRID and supercomputers.

    CERN Document Server

    Krasnopevtsev, Dimitriy; The ATLAS collaboration; Mashinistov, Ruslan; Belyaev, Nikita; Ryabinkin, Evgeny

    2015-01-01

    After the early success in discovering a new particle consistent with the long-awaited Higgs boson, the Large Hadron Collider experiments are ready for the precision measurements and further discoveries that will be made possible by much higher LHC collision rates from spring 2015. A proper understanding of the detectors' performance at high-occupancy conditions is important for many ongoing physics analyses. The ATLAS Transition Radiation Tracker (TRT) is one of these detectors. TRT is a large straw tube tracking system that is the outermost of the three subsystems of the ATLAS Inner Detector (ID). TRT contributes significantly to the resolution for high-pT tracks in the ID, providing excellent particle identification capabilities and electron-pion separation. The ATLAS experiment uses the Worldwide LHC Computing Grid. WLCG is a global collaboration of computer centers and provides seamless access to computing resources which include data storage capacity, processing power, sensors, visualisation tools and more. WLCG...

  12. ATLAS FTK a - very complex - custom parallel supercomputer

    CERN Document Server

    Kimura, Naoki; The ATLAS collaboration

    2016-01-01

    In the ever-increasing pile-up LHC environment, advanced techniques of analysing the data are implemented in order to increase the rate of relevant physics processes with respect to background processes. The Fast TracKer (FTK) is a track-finding implementation at the hardware level that is designed to deliver full-scan tracks with $p_{T}$ above 1 GeV to the ATLAS trigger system for every L1 accept (at a maximum rate of 100 kHz). In order to achieve this performance a highly parallel system was designed, and it is now under installation in ATLAS. In the beginning of 2016 it will provide tracks for the trigger system in a region covering the central part of the ATLAS detector, and during the year its coverage will be extended to the full detector. The system relies on matching hits coming from the silicon tracking detectors against 1 billion patterns stored in specially designed ASIC chips (Associative Memory - AM06). In a first stage coarse resolution hits are matched against the patterns and the accepted h

  13. Integration Of PanDA Workload Management System With Supercomputers

    CERN Document Server

    Klimentov, Alexei; The ATLAS collaboration; Maeno, Tadashi; Mashinistov, Ruslan; Nilsson, Paul; Oleynik, Danila; Panitkin, Sergey; Read, Kenneth; Ryabinkin, Evgeny; Wenaus, Torre

    2015-01-01

    The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing on over 140 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 100,000 co...

  15. International Conference Nuclear Theory in the Supercomputing Era 2014

    CERN Document Server

    2014-01-01

    The conference focuses on forefront challenges in physics, namely the fundamentals of nuclear structure and reactions, the origin of the strong inter-nucleon interactions from QCD, and computational nuclear physics with leadership-class computer facilities to provide forefront simulations leading to new discoveries. This is the fourth in the series of NTSE-HITES conferences aimed at bringing together nuclear theorists, computer scientists and applied mathematicians.

  16. San Diego supercomputer center reaches data transfer milestone

    CERN Multimedia

    2002-01-01

    The SDSC's huge, updated tape storage system has demonstrated its effectiveness by transferring data at 828 megabytes per second, making it the fastest data archive system, according to program director Phil Andrews (1/2 page).

  17. PPARC: World's biggest 'virtual supercomputer' given the go-ahead

    CERN Multimedia

    2003-01-01

    The Particle Physics and Astronomy Research Council has today announced a grant of 16 million pounds to create a massive computing Grid. This Grid, known as GridPP2, will eventually form part of a larger European Grid, to be used to process the data deluge from CERN when the Large Hadron Collider (LHC) comes online in 2007 (1 page).

  18. Wafer-level micro-optics: trends in manufacturing, testing, packaging, and applications

    Science.gov (United States)

    Voelkel, Reinhard; Gong, Li; Rieck, Juergen; Zheng, Alan

    2012-11-01

    Micro-optics is an indispensable key enabling technology (KET) for many products and applications today. Probably the most prestigious examples are the diffractive light shaping elements used in high-end DUV lithography steppers. Highly efficient refractive and diffractive micro-optical elements are used for precise beam and pupil shaping. Micro-optics had a major impact on the reduction of aberrations and diffraction effects in projection lithography, allowing a resolution enhancement from 250 nm to 45 nm within the last decade. Micro-optics also plays a decisive role in medical devices (endoscopes, ophthalmology), in all laser-based devices and fiber communication networks (supercomputer, ROADM), bringing high-speed internet to our homes (FTTH). Even our modern smart phones contain a variety of micro-optical elements. For example, LED flashlight shaping elements, the secondary camera, and ambient light and proximity sensors. Wherever light is involved, micro-optics offers the chance to further miniaturize a device, to improve its performance, or to reduce manufacturing and packaging costs. Wafer-scale micro-optics fabrication is based on technology established by semiconductor industry. Thousands of components are fabricated in parallel on a wafer. We report on the state of the art in wafer-based manufacturing, testing, packaging and present examples and applications for micro-optical components and systems.

  19. I/O Performance Characterization of Lustre and NASA Applications on Pleiades

    Science.gov (United States)

    Saini, Subhash; Rappleye, Jason; Chang, Johnny; Barker, David Peter; Biswas, Rupak; Mehrotra, Piyush

    2012-01-01

    In this paper we study the performance of the Lustre file system using five scientific and engineering applications representative of the NASA workload on large-scale supercomputing systems such as NASA's Pleiades. In order to facilitate the collection of Lustre performance metrics, we have developed a software tool that exports a wide variety of client- and server-side metrics using SGI's Performance Co-Pilot (PCP), and generates a human-readable report on key metrics at the end of a batch job. These performance metrics are (a) amount of data read and written, (b) number of files opened and closed, and (c) remote procedure call (RPC) size distribution (4 KB to 1024 KB, in powers of 2) for I/O operations. The RPC size distribution measures the efficiency of the Lustre client and can pinpoint problems such as small write sizes, disk fragmentation, etc. These extracted statistics are useful in determining the I/O pattern of an application and can assist in identifying possible improvements for users' applications. Information on the number of file operations enables a scientist to optimize the I/O performance of their applications. The amount of I/O data helps users choose the optimal stripe size and stripe count to enhance I/O performance. In this paper, we demonstrate the usefulness of this tool on Pleiades for five production-quality NASA scientific and engineering applications. We compare the latency of read and write operations under Lustre to that with NFS by tracing system calls and signals. We also investigate the read and write policies and study the effect of page cache size on I/O operations. We examine the performance impact of Lustre stripe size and stripe count, along with a performance evaluation of file-per-process and single-shared-file access patterns for the NASA workload using the parameterized IOR benchmark.
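
    As a hedged illustration of where such client-side metrics come from, the sketch below scrapes the read/write counters from a Lustre llite stats file. The mount-specific path is invented, and the actual tool exports these counters through PCP rather than printing them.

        /* Hedged sketch: read Lustre client counters from the llite stats
         * file; the mount-specific directory name must be adjusted. */
        #include <stdio.h>
        #include <string.h>

        int main(void) {
            const char *path =
                "/proc/fs/lustre/llite/nbp2-ffff/stats";  /* illustrative name */
            FILE *f = fopen(path, "r");
            if (!f) { perror(path); return 1; }

            char line[256];
            while (fgets(line, sizeof line, f))
                if (!strncmp(line, "read_bytes", 10) ||
                    !strncmp(line, "write_bytes", 11))
                    fputs(line, stdout);          /* raw counter lines */
            fclose(f);
            return 0;
        }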

  20. Exploring the Use of Concept Spaces to Improve Medical Information Retrieval

    Science.gov (United States)

    2000-01-01

    medical informatics, digital libraries and electronic publishing, human factors in Human-Computer Interaction, and Natural Language Processing. She is... Urbana–Champaign. He is the Principal Investigator of the Digital Libraries Initiative project and the DARPA Information Management Program, which... Applications (NCSA), serving as the scientific advisor for digital libraries and information systems. He has served in this role since 1989

  1. Will Allis Prize for the Study of Ionized Gases: Electron Collisions - Experiment, Theory, and Applications

    Science.gov (United States)

    Bartschat, Klaus

    2016-09-01

    Electron collisions with atoms, ions, and molecules represent one of the very early topics of quantum mechanics. In spite of the field's maturity, a number of recent developments in detector technology (e.g., the ``reaction microscope'' or the ``magnetic-angle changer'') and the rapid increase in computational resources have resulted in significant progress in the measurement, understanding, and theoretical/computational description of few-body Coulomb problems. Close collaborations between experimentalists and theorists worldwide continue to produce high-quality benchmark data, which allow for thoroughly testing and further developing a variety of theoretical approaches. As a result, it has now become possible to reliably calculate the vast amount of atomic data needed for detailed modelling of the physics and chemistry of planetary atmospheres, the interpretation of astrophysical data, optimizing the energy transport in reactive plasmas, and many other topics - including light-driven processes, in which electrons are produced by continuous or short-pulse ultra-intense electromagnetic radiation. I will highlight some of the recent developments that have had a major impact on the field. This will be followed by showcasing examples, in which accurate electron collision data enabled applications in fields beyond traditional AMO physics. Finally, open problems and challenges for the future will be outlined. I am very grateful for fruitful scientific collaborations with many colleagues, and the long-term financial support by the NSF through the Theoretical AMO and Computational Physics programs, as well as supercomputer resources through TeraGrid and XSEDE.

  2. Integrated Performance Monitoring of a Cosmology Application on Leading HEC Platforms

    Energy Technology Data Exchange (ETDEWEB)

    Borrill, Julian; Carter, Jonathan; Oliker, Leonid; Skinner,David; Biswas, Rupak

    2005-04-01

    The Cosmic Microwave Background (CMB) is an exquisitely sensitive probe of the fundamental parameters of cosmology. Extracting this information is computationally intensive, requiring massively parallel computing and sophisticated numerical algorithms. In this work we present MADbench, a lightweight version of the MADCAP CMB power spectrum estimation code that retains the operational complexity and integrated system requirements. In addition, to quantify communication behavior across a variety of architectural platforms, we introduce the Integrated Performance Monitoring (IPM) package: a portable, lightweight, and scalable tool for effectively extracting MPI message-passing overheads. A performance characterization study is conducted on some of the world's most powerful supercomputers, including the superscalar Seaborg (IBM Power3+) and CC-NUMA Columbia (SGI Altix) systems, as well as the vector-based Earth Simulator (NEC SX-6 enhanced) and Phoenix (Cray X1) systems. In-depth analysis shows that in order to bridge the gap between theoretical and sustained system performance, it is critical to gain a clear understanding of how the distinct parts of large-scale parallel applications interact with the individual subcomponents of HEC platforms.
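
    The idea behind an IPM-style profiling layer can be sketched in a few lines: interpose on message-passing calls and accumulate the time spent in each. The fragment below uses mpi4py as a stand-in for IPM's C-level interposition; it is an illustration of the concept, not IPM itself.

        # Illustrative sketch of what a lightweight MPI profiling layer does:
        # wrap message-passing calls and accumulate time per operation.
        import time
        from collections import defaultdict
        from mpi4py import MPI

        _totals = defaultdict(float)

        def timed(name, fn, *args, **kwargs):
            """Run fn, charging its wall-clock time to the named operation."""
            t0 = time.perf_counter()
            result = fn(*args, **kwargs)
            _totals[name] += time.perf_counter() - t0
            return result

        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()
        # Profile a collective the way IPM would record MPI_Allreduce time.
        total = timed("Allreduce", comm.allreduce, rank, op=MPI.SUM)
        if rank == 0:
            for op, seconds in _totals.items():
                print(f"{op}: {seconds:.6f} s")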

  3. A portable platform for accelerated PIC codes and its application to GPUs using OpenACC

    Science.gov (United States)

    Hariri, F.; Tran, T. M.; Jocksch, A.; Lanti, E.; Progsch, J.; Messmer, P.; Brunner, S.; Gheller, C.; Villard, L.

    2016-10-01

    We present a portable platform, called PIC_ENGINE, for accelerating Particle-In-Cell (PIC) codes on heterogeneous many-core architectures such as Graphics Processing Units (GPUs). The aim of this development is to enable efficient simulations on future exascale systems by allowing different parallelization strategies depending on the application problem and the specific architecture. To this end, the platform contains the basic steps of the PIC algorithm and has been designed as a test bed for different algorithmic options and data structures. Among the architectures that this engine can explore, particular attention is given here to systems equipped with GPUs. The study demonstrates that our portable PIC implementation based on the OpenACC programming model can achieve performance closely matching theoretical predictions. Using the Cray XC30 system, Piz Daint, at the Swiss National Supercomputing Centre (CSCS), we show that PIC_ENGINE running on an NVIDIA Kepler K20X GPU can outperform the one on an Intel Sandy Bridge 8-core CPU by a factor of 3.4.
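
    For readers unfamiliar with the PIC algorithm the platform encapsulates, the toy 1D electrostatic step below shows the field gather, particle push, and charge deposition stages in numpy. It is a minimal sketch for orientation only and bears no relation to PIC_ENGINE's actual data structures.

        # Toy 1D electrostatic PIC step: field gather, particle push, and
        # charge deposition on a periodic grid. Illustrative only.
        import numpy as np

        def pic_step(x, v, efield, dx, dt, qm=-1.0):
            """Advance particle positions/velocities one step."""
            ngrid = efield.size
            cells = (x / dx).astype(int) % ngrid
            v = v + qm * efield[cells] * dt          # gather + accelerate
            x = (x + v * dt) % (ngrid * dx)          # push with periodic wrap
            rho = np.bincount((x / dx).astype(int) % ngrid, minlength=ngrid)
            return x, v, rho                          # rho: deposited charge counts

        rng = np.random.default_rng(0)
        x = rng.uniform(0.0, 64.0, size=10_000)
        v = rng.normal(0.0, 1.0, size=10_000)
        E = np.zeros(64)
        x, v, rho = pic_step(x, v, E, dx=1.0, dt=0.1)
        print(rho[:8])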

  4. Efficient Parallelization of a Dynamic Unstructured Application on the Tera MTA

    Science.gov (United States)

    Oliker, Leonid; Biswas, Rupak

    1999-01-01

    The success of parallel computing in solving real-life computationally-intensive problems relies on their efficient mapping and execution on large-scale multiprocessor architectures. Many important applications are both unstructured and dynamic in nature, making their efficient parallel implementation a daunting task. This paper presents the parallelization of a dynamic unstructured mesh adaptation algorithm using three popular programming paradigms on three leading supercomputers. We examine an MPI message-passing implementation on the Cray T3E and the SGI Origin2000, a shared-memory implementation using cache coherent nonuniform memory access (CC-NUMA) of the Origin2000, and a multi-threaded version on the newly-released Tera Multi-threaded Architecture (MTA). We compare several critical factors of this parallel code development, including runtime, scalability, programmability, and memory overhead. Our overall results demonstrate that multi-threaded systems offer tremendous potential for quickly and efficiently solving some of the most challenging real-life problems on parallel computers.

  5. Communication Requirements and Interconnect Optimization for High-End Scientific Applications

    Energy Technology Data Exchange (ETDEWEB)

    Kamil, Shoaib; Oliker, Leonid; Pinar, Ali; Shalf, John

    2007-11-12

    The path towards realizing peta-scale computing is increasingly dependent on building supercomputers with unprecedented numbers of processors. To prevent the interconnect from dominating the overall cost of these ultra-scale systems, there is a critical need for high-performance network solutions whose costs scale linearly with system size. This work makes several unique contributions towards attaining that goal. First, we conduct one of the broadest studies to date of high-end application communication requirements, whose computational methods include: finite-difference, lattice-Boltzmann, particle-in-cell, sparse linear algebra, particle-mesh Ewald, and FFT-based solvers. To efficiently collect this data, we use the IPM (Integrated Performance Monitoring) profiling layer to gather detailed messaging statistics with minimal impact on code performance. Using the derived communication characterizations, we next present fit-tree interconnects, a novel approach for designing network infrastructure at a fraction of the component cost of traditional fat-tree solutions. Finally, we propose the Hybrid Flexibly Assignable Switch Topology (HFAST) infrastructure, which uses both passive (circuit) and active (packet) commodity switch components to dynamically reconfigure interconnects to suit the topological requirements of scientific applications. Overall, our exploration leads to promising directions for practically addressing the interconnect requirements of future peta-scale systems.

  6. An application of computational fluid mechanics to the air flow in an infant incubator.

    Science.gov (United States)

    Yamaguchi, T; Hanai, S; Horio, H; Hasegawa, T

    1992-01-01

    An application of the computational fluid mechanical method to the air flow in a two-dimensional model of an infant incubator is described. The air flow in a numerical model was simulated by directly solving the Navier-Stokes equations using a finite-volume method incorporating a body-fitted coordinate system on a mini-supercomputer. The model was based on a real infant incubator, slightly simplified for the sake of computing speed, and included a model of a baby. The computational grid comprised 101 x 61 = 6161 points. The calculation was carried out under the condition of unsteady, starting airflow, and the results were examined by means of color graphics animation. There were several very large scale eddies in the incubator free space, and their global structure did not show strong changes once they were established. Although the global structure did not change, small scale eddies were shown to be produced around the air inlet and convected down through the free space of the incubator. From these results, we believe that assuming steady and uniform flow in the incubator may not always be valid when considering the heat loss of a baby in an incubator. Steady and uniform flow has previously been assumed, either implicitly or explicitly, by most authors.
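
    The flavor of an explicit finite-volume update can be conveyed with a far simpler problem than the one solved in the paper: first-order upwind transport of a scalar by a prescribed velocity field, on a grid sized like the study's 101 x 61 mesh. This is an illustrative sketch under those assumptions, not the authors' Navier-Stokes solver.

        # Minimal sketch of an explicit finite-volume (upwind) update for
        # d(c)/dt + u dc/dx + v dc/dy = 0 with constant u, v > 0.
        import numpy as np

        def upwind_step(c, u, v, dx, dy, dt):
            """One explicit first-order upwind step; boundaries held fixed."""
            cn = c.copy()
            cn[1:, 1:] = (c[1:, 1:]
                          - u * dt / dx * (c[1:, 1:] - c[:-1, 1:])
                          - v * dt / dy * (c[1:, 1:] - c[1:, :-1]))
            return cn

        c = np.zeros((101, 61))        # grid sized like the paper's 101 x 61 mesh
        c[10:20, 25:35] = 1.0          # initial blob of transported quantity
        for _ in range(50):            # CFL = u*dt/dx + v*dt/dy = 0.75 <= 1: stable
            c = upwind_step(c, u=1.0, v=0.5, dx=1.0, dy=1.0, dt=0.5)
        print(c.max())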

  7. Medical Applications

    CERN Document Server

    Biscari, C.

    2014-12-19

    The use of accelerators for medical applications has evolved from initial experimentation to turn-key devices commonly operating in hospitals. New applications are continuously being developed around the world, and the hadrontherapy facilities of the newest generation are placed at the frontier between industrial production and advanced R&D. An introduction to the different medical application accelerators is followed by a description of the hadrontherapy facilities, with special emphasis on CNAO, and the report closes with a brief outlook on the future of this field.

  8. High performance parallel computing of flows in complex geometries: II. Applications

    Energy Technology Data Exchange (ETDEWEB)

    Gourdain, N; Gicquel, L; Staffelbach, G; Vermorel, O; Duchaine, F; Boussuge, J-F [Computational Fluid Dynamics Team, CERFACS, Toulouse, 31057 (France); Poinsot, T [Institut de Mecanique des Fluides de Toulouse, Toulouse, 31400 (France)], E-mail: Nicolas.gourdain@cerfacs.fr

    2009-01-01

    Present regulations on pollutant emissions and noise, together with economic constraints, require new approaches and designs in the fields of energy supply and transportation. It is now well established that the next breakthrough will come from a better understanding of unsteady flow effects and from considering the entire system rather than isolated components. However, these aspects are still neither well captured by numerical approaches nor well understood, whatever the design stage considered. The main challenge lies in the computational requirements that such complex systems impose if they are to be simulated on supercomputers. This paper shows how these new challenges can be addressed by using parallel computing platforms for distinct elements of more complex systems, as encountered in aeronautical applications. Based on numerical simulations performed with modern aerodynamic and reactive flow solvers, this work underlines the interest of high-performance computing for solving flow in complex industrial configurations such as aircraft, combustion chambers and turbomachines. Performance indicators related to parallel computing efficiency are presented, showing that establishing fair criteria is a difficult task for complex industrial applications. Examples of numerical simulations performed in industrial systems are also described, with particular attention to the computational time and the potential design improvements obtained with high-fidelity and multi-physics computing methods. These simulations use either unsteady Reynolds-averaged Navier-Stokes methods or large eddy simulation and deal with turbulent unsteady flows, such as coupled flow phenomena (thermo-acoustic instabilities, buffet, etc.). Some examples of the difficulties with grid generation and data analysis are also presented for these complex industrial applications.
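
    The parallel-efficiency indicators mentioned above are conventionally defined as speedup S(p) = T(p0)/T(p) and efficiency E(p) = S(p)/(p/p0) relative to a baseline core count p0. A minimal sketch, with made-up timings rather than CERFACS data:

        # Standard strong-scaling indicators: speedup and parallel efficiency
        # relative to the smallest measured core count. Timings are illustrative.

        def scaling_report(timings):
            """timings: dict mapping core count -> wall-clock seconds."""
            p0 = min(timings)
            t0 = timings[p0]
            for p in sorted(timings):
                speedup = t0 / timings[p]
                efficiency = speedup / (p / p0)
                print(f"{p:6d} cores: speedup {speedup:6.2f}, efficiency {efficiency:5.1%}")

        scaling_report({128: 1000.0, 256: 520.0, 512: 280.0, 1024: 160.0})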

  9. Towards a Scalable and Adaptive Application Support Platform for Large-Scale Distributed E-Sciences in High-Performance Network Environments

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Chase Qishi [New Jersey Inst. of Technology, Newark, NJ (United States); Univ. of Memphis, TN (United States); Zhu, Michelle Mengxia [Southern Illinois Univ., Carbondale, IL (United States)

    2016-06-06

    The advent of large-scale collaborative scientific applications has demonstrated the potential for broad scientific communities to pool globally distributed resources to produce unprecedented data acquisition, movement, and analysis. System resources including supercomputers, data repositories, computing facilities, network infrastructures, storage systems, and display devices have been increasingly deployed at national laboratories and academic institutes. These resources are typically shared by large communities of users over the Internet or dedicated networks and hence exhibit an inherent dynamic nature in their availability, accessibility, capacity, and stability. Scientific applications using either experimental facilities or computation-based simulations with various physical, chemical, climatic, and biological models feature diverse scientific workflows, as simple as linear pipelines or as complex as directed acyclic graphs, which must be executed and supported over wide-area networks with massively distributed resources. Application users oftentimes need to manually configure their computing tasks over networks in an ad hoc manner, significantly limiting the productivity of scientists and constraining the utilization of resources. The success of these large-scale distributed applications requires a highly adaptive and massively scalable workflow platform that provides automated and optimized computing and networking services. This project is to design and develop a generic Scientific Workflow Automation and Management Platform (SWAMP), which contains a web-based user interface specially tailored for a target application, a set of user libraries, and several easy-to-use computing and networking toolkits for application scientists to conveniently assemble, execute, monitor, and control complex computing workflows in heterogeneous high-performance network environments. SWAMP will enable the automation and management of the entire process of scientific
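
    The core abstraction such a platform automates can be sketched as a tiny DAG executor: run each task once all of its prerequisites have completed. The task names and dependencies below are hypothetical placeholders, not part of SWAMP.

        # Tiny sketch of workflow-DAG execution: dispatch a task as soon as
        # all of its dependencies have finished (Kahn-style topological order).
        from collections import deque

        def run_workflow(tasks, deps):
            """tasks: name -> callable; deps: name -> set of prerequisite names."""
            indegree = {t: len(deps.get(t, set())) for t in tasks}
            dependents = {t: [] for t in tasks}
            for t, reqs in deps.items():
                for r in reqs:
                    dependents[r].append(t)
            ready = deque(t for t, d in indegree.items() if d == 0)
            while ready:
                t = ready.popleft()
                tasks[t]()  # a real platform would dispatch to a remote resource
                for child in dependents[t]:
                    indegree[child] -= 1
                    if indegree[child] == 0:
                        ready.append(child)

        run_workflow(
            tasks={n: (lambda n=n: print("running", n)) for n in "ABCD"},
            deps={"B": {"A"}, "C": {"A"}, "D": {"B", "C"}},
        )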

  11. Uniprocessor Performance Analysis of a Representative Workload of Sandia National Laboratories' Scientific Applications.

    Energy Technology Data Exchange (ETDEWEB)

    Charles Laverty

    2005-10-01

    Master of Science thesis in Electrical Engineering, New Mexico State University, Las Cruces, New Mexico, 2005; advisor: Dr. Jeanine Cook. Throughout the last decade, computer performance analysis has become absolutely necessary to maximize the performance of some workloads. Sandia National Laboratories (SNL), located in Albuquerque, New Mexico, is no different: to achieve maximum performance of large scientific, parallel workloads, performance analysis is needed at the uni-processor level. A representative workload has been chosen as the basis of a computer performance study to determine optimal processor characteristics in order to better specify the next generation of supercomputers. Cube3, a finite element test problem developed at SNL, is representative of their scientific workloads. This workload has been studied at the uni-processor level to understand characteristics of the microarchitecture that will lead to overall performance improvement at the multi-processor level. The goal of studying this workload at the uni-processor level is to build a performance prediction model that will be integrated into a multi-processor performance model currently being developed at SNL. Through the use of performance counters on the Itanium 2 microarchitecture, performance statistics are studied to determine bottlenecks in the microarchitecture and/or changes in the application code that will maximize performance. From source code analysis, a performance-degrading loop kernel was identified, and through the use of compiler optimizations a performance gain of around 20% was achieved.

  12. Application note :

    Energy Technology Data Exchange (ETDEWEB)

    Russo, Thomas V.

    2013-08-01

    The development of the Xyce™ Parallel Electronic Simulator has focused entirely on the creation of a fast, scalable simulation tool, and has not included any schematic capture or data visualization tools. This application note describes how to use the open source schematic capture tool gschem and its associated netlist creation tool gnetlist to create basic circuit designs for Xyce, and how to access advanced features of Xyce that are not directly supported by either gschem or gnetlist.

  13. Technology advances and market forces: Their impact on high performance architectures

    Science.gov (United States)

    Best, D. R.

    1978-01-01

    Reasonable projections of future supercomputer architectures and technology require an analysis of the computer industry market environment, the current capabilities and trends within the component industry, and the research activities on computer architecture in the industrial and academic communities. Management, programmers, architects, and users must cooperate to increase the efficiency of supercomputer development efforts. Care must be taken to match the funding, compiler, architecture and application, with greater attention to testability, maintainability, reliability, and usability than in supercomputer development programs of the past.

  14. Photography applications

    Science.gov (United States)

    Cochran, Susan A.; Goodman, James A.; Purkis, Samuel J.; Phinn, Stuart R.

    2013-01-01

    Photographic imaging is the oldest form of remote sensing used in coral reef studies. This chapter briefly explores the history of photography from the 1850s to the present, and delves into its application for coral reef research. The investigation focuses on both photographs collected from low-altitude fixed-wing and rotary aircraft, and those collected from space by astronauts. Different types of classification and analysis techniques are discussed, and several case studies are presented as examples of the broad use of photographs as a tool in coral reef research.

  15. 21St Century Atmospheric Forecasting for Space Based Applications

    Science.gov (United States)

    Alliss, R.; Felton, B.; Craddock, M.; Kiley, H.; Mason, M.

    2016-09-01

    Many space-based applications, from imaging to communications, are impacted by the atmosphere. Atmospheric effects such as optical turbulence and clouds are the main drivers for these types of systems. For example, in space-based optical communications, clouds will produce channel fades on the order of many hundreds of decibels (dB), thereby breaking the communication link. Optical turbulence can also produce fades, but these can be compensated for by adaptive optics. The ability to forecast the current and future location and optical thickness of clouds for space-to-ground electro-optical systems or optical communications is therefore critical in order to achieve a highly reliable system. We have developed an innovative method for producing such forecasts. These forecasts are intended to provide lead times on the order of several hours to days so that communication links can be transferred from a currently cloudy ground location to another more desirable ground site. The system uses high resolution Numerical Weather Prediction (NWP) along with a variational data assimilation (DA) scheme to improve the initial conditions and forecasts. DA is used to provide an improved estimate of the atmospheric state by combining meteorological observations with NWP products and their respective error statistics. Variational DA accomplishes this through the minimization of a prescribed cost function, whereby differences between the observations and the analysis are damped according to their perceived error. The NWP model is a fully three-dimensional (3D) physics-based model of the atmosphere initialized with gridded atmospheric data obtained from a global-scale model. The global model input data have a horizontal resolution of approximately 25 km, which is insufficient for the desired atmospheric forecasts required at near 1 km resolution. Therefore, a variational DA system is used to improve the quality and resolution of the initial conditions first prescribed by the global model. Data used by the
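
    The variational DA step described here minimizes, in its standard 3D-Var form, the textbook cost function (written out for orientation; the abstract does not spell out the exact formulation used):

        J(x) = (1/2) (x - x_b)^T B^{-1} (x - x_b) + (1/2) (H(x) - y)^T R^{-1} (H(x) - y)

    where x_b is the background state, y the observations, H the observation operator, and B and R the background- and observation-error covariance matrices. Minimizing J(x) produces the analysis state that best balances the model background against the observations, weighted by their respective errors.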

  16. GOSAT-2 : Science Plan, Products, Validation, and Application

    Science.gov (United States)

    Matsunaga, T.; Morino, I.; Yoshida, Y.; Saito, M.; Hiraki, K.; Yokota, Y.; Kamei, A.; Oishi, Y.; Dupuy, E.; Murakami, K.; Ninomiya, K.; Pang, J. S.; Yokota, T.; Maksyutov, S. S.; Machida, T.; Saigusa, N.; Mukai, H.; Nakajima, M.; Imasu, R.; Nakajima, T.

    2013-12-01

    Based on the success of the Greenhouse Gases Observing Satellite (GOSAT) launched in 2009, the Ministry of the Environment (MOE), the Japan Aerospace Exploration Agency (JAXA), and the National Institute for Environmental Studies (NIES) started preparations for the follow-on satellite, GOSAT-2, in FY2011. The current target launch year of GOSAT-2 is FY2017. The objectives of GOSAT-2 include: 1) continue and enhance the spaceborne greenhouse gas observations started by GOSAT, 2) improve our understanding of global and regional carbon cycles, and 3) contribute to climate change related policies as one of the MRV (Measurement, Reporting, and Verification) tools for carbon emission reduction. As a scientific background and rationale for GOSAT-2, the GOSAT-2 Science Plan is being edited by the GOSAT-2 Science Team Preparation Committee. Not only carbon dioxide and methane but also carbon monoxide, tropospheric ozone, and aerosols are discussed in the plan. GOSAT-2 Level 2 (gas concentrations) and Level 4 (gas fluxes) products will be operationally generated at and distributed from the GOSAT-2 Data Handling Facility located at NIES. In addition, a new supercomputer dedicated to GOSAT-2 research and development will also be installed at NIES. The GOSAT-2 validation plan is also being discussed. Its baseline is similar to that of the current GOSAT, but various efforts will be made to extend the coverage of validation data for GOSAT-2. These efforts include an increased number of commercial passenger aircraft volunteering atmospheric measurements and additional ground-based Fourier transform spectrometers to be newly installed in Asian countries. In addition, a compact accelerator mass spectrometer is being introduced at NIES to investigate the contributions of anthropogenic emissions, which are important for GOSAT-2. Climate change related policies include the JCM (Joint Crediting Mechanism), in which MRV plays a critical role. MRV tools used in the existing JCM projects are mostly ground-based and site-specific. Satellite atmospheric

  17. SCEC Earthquake System Science Using High Performance Computing

    Science.gov (United States)

    Maechling, P. J.; Jordan, T. H.; Archuleta, R.; Beroza, G.; Bielak, J.; Chen, P.; Cui, Y.; Day, S.; Deelman, E.; Graves, R. W.; Minster, J. B.; Olsen, K. B.

    2008-12-01

    were run on NSF TeraGrid sites, including simulations that used the full PSC Big Ben supercomputer (4096 cores) and simulations that ran on more than 10K cores at TACC Ranger. The SCEC/CME group used scientific workflow tools and grid computing to run more than 1.5 million jobs at NCSA for the CyberShake project. Visualizations of the 10 Hz ShakeOut 1.2 scenario simulation data produced by a SCEC/CME researcher were used by the USGS in ShakeOut publications and public outreach efforts. OpenSHA was ported onto an NSF supercomputer and was used to produce very high resolution probabilistic seismic hazard analysis (PSHA) maps containing more than 1.6 million hazard curves.

  18. Toolkit for high performance Monte Carlo radiation transport and activation calculations for shielding applications in ITER

    Energy Technology Data Exchange (ETDEWEB)

    Serikov, A.; Fischer, U.; Grosse, D.; Leichtle, D.; Majerle, M., E-mail: arkady.serikov@kit.edu [Karlsruhe Institute of Technology (KIT), Eggenstein-Leopoldshafen (Germany)

    2011-07-01

    The Monte Carlo (MC) method is the most suitable computational technique of radiation transport for shielding applications in fusion neutronics. This paper shares the results of the long-term experience of the fusion neutronics group at Karlsruhe Institute of Technology (KIT) in radiation shielding calculations with the MCNP5 code for the ITER fusion reactor, with emphasis on the use of several ITER project-driven computer programs developed at KIT. Two of them, McCad and R2S, seem to be the most useful in radiation shielding analyses. The McCad computer graphical tool performs automatic conversion of MCNP models from the underlying CAD (CATIA) data files, while the R2S activation interface couples MCNP radiation transport with FISPACT activation, allowing estimation of nuclear responses such as dose rate and nuclear heating after the ITER reactor shutdown. The cell-based R2S scheme was applied in shutdown photon dose analysis for the design of the In-Vessel Viewing System (IVVS) and the Glow Discharge Cleaning (GDC) unit in ITER. The mesh-based R2S feature newly developed at KIT was successfully tested on shutdown dose rate calculations for the upper port in the Neutral Beam (NB) cell of ITER. The merits of the McCad graphical program have been broadly acknowledged by neutronic analysts, and its continuous improvement at KIT has made its operation stable and more convenient through its Graphical User Interface. Detailed 3D ITER neutronic modeling with the MCNP Monte Carlo method requires a lot of computational resources, inevitably leading to parallel calculations on clusters. Performance assessments of MCNP5 parallel runs on the JUROPA/HPC-FF supercomputer cluster made it possible to find the optimal number of processors for ITER-type runs. (author)
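
    The R2S idea, decaying a transport-generated activation inventory to the cooling time at which the photon dose field is wanted, can be miniaturized as follows. The nuclide inventory here is an illustrative placeholder with textbook half-lives (Co-60: 5.27 y, Mn-56: 2.58 h), not FISPACT output.

        # Miniature of the R2S decay step: exponential decay of each nuclide's
        # activity from shutdown to the requested cooling time.
        import math

        def activity_after_shutdown(a0_bq, half_life_s, cooling_s):
            """Activity (Bq) of one nuclide after a given cooling time."""
            return a0_bq * math.exp(-math.log(2.0) * cooling_s / half_life_s)

        inventory = {                       # nuclide: (activity at shutdown, half-life in s)
            "Co-60": (1.0e9, 5.27 * 3.156e7),
            "Mn-56": (5.0e10, 9283.0),
        }
        for nuclide, (a0, t_half) in inventory.items():
            a = activity_after_shutdown(a0, t_half, cooling_s=12 * 3600.0)
            print(f"{nuclide}: {a:.3e} Bq after 12 h cooling")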

  19. Cloud Computing and Its Applications

    Institute of Scientific and Technical Information of China (English)

    梁东莺; 高潮

    2011-01-01

    As a model of supercomputing, cloud computing is based on virtualization and delivers infrastructure, platform, and software services (IaaS, PaaS, SaaS) over the network, integrating large-scale, extensible distributed computing resources such as computing, storage, data and applications for collaborative work. In view of the current confusion among cloud computing concepts, a comprehensive reference definition is put forward, and the differences with distributed computing, grid computing, parallel computing and utility computing are discussed. The mainstream cloud computing platforms are then surveyed to analyze the essence of cloud computing from the perspective of platform layers. Finally, cloud computing security and the future development directions of cloud computing are discussed.

  20. Quantum simulation of superconductors on quantum computers. Toward the first applications of quantum processors

    Energy Technology Data Exchange (ETDEWEB)

    Dallaire-Demers, Pierre-Luc

    2016-10-07

    Quantum computers are the ideal platform for quantum simulations. Given enough coherent operations and qubits, such machines can be leveraged to simulate strongly correlated materials, where intricate quantum effects give rise to counter-intuitive macroscopic phenomena such as high-temperature superconductivity. Many phenomena of strongly correlated materials are encapsulated in the Fermi-Hubbard model. In general, no closed-form solution is known for lattices of more than one spatial dimension, but they can be numerically approximated using cluster methods. To model long-range effects such as order parameters, a powerful method to compute the cluster's Green's function consists in finding its self-energy through a variational principle. As shown in this thesis, this makes it possible to study various phase transitions at finite temperature in the Fermi-Hubbard model. However, a classical cluster solver quickly hits an exponential wall in the memory (or computation time) required to store the computation variables. We show theoretically that the cluster solver can be mapped to a subroutine on a quantum computer whose quantum memory usage scales linearly with the number of orbitals in the simulated cluster and whose number of measurements scales quadratically. We also provide a gate decomposition of the cluster Hamiltonian and a simple planar architecture for a quantum simulator that can also be used to simulate more general fermionic systems. We briefly analyze the Trotter-Suzuki errors and estimate the scaling properties of the algorithm for more complex applications. A quantum computer with a few tens of qubits could therefore simulate the thermodynamic properties of complex fermionic lattices inaccessible to classical supercomputers.
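
    The Trotter-Suzuki errors mentioned above come from the standard first-order splitting of a Hamiltonian H = H_1 + H_2 (quoted here for orientation, not from the thesis):

        e^{-i(H_1 + H_2)t} = ( e^{-i H_1 t/n} e^{-i H_2 t/n} )^n + O( ||[H_1, H_2]|| t^2 / n )

    so the splitting error can be suppressed by increasing the number of Trotter steps n, at the cost of a proportionally deeper circuit.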

  1. Energy efficiency vs. performance of the numerical solution of PDEs: An application study on a low-power ARM-based cluster

    Science.gov (United States)

    Göddeke, Dominik; Komatitsch, Dimitri; Geveler, Markus; Ribbrock, Dirk; Rajovic, Nikola; Puzovic, Nikola; Ramirez, Alex

    2013-03-01

    Power consumption and energy efficiency are becoming critical aspects in the design and operation of large scale HPC facilities, and it is unanimously recognised that future exascale supercomputers will be strongly constrained by their power requirements. At current electricity costs, operating an HPC system over its lifetime can already be on par with the initial deployment cost. These power consumption constraints, and the benefits a more energy-efficient HPC platform may have on other societal areas, have motivated the HPC research community to investigate the use of energy-efficient technologies originally developed for the embedded and especially mobile markets. However, lower power does not always mean lower energy consumption, since execution time often also increases. In order to achieve competitive performance, applications then need to efficiently exploit a larger number of processors. In this article, we discuss how applications can efficiently exploit this new class of low-power architectures to achieve competitive performance. We evaluate if they can benefit from the increased energy efficiency that the architecture is supposed to achieve. The applications that we consider cover three different classes of numerical solution methods for partial differential equations, namely a low-order finite element multigrid solver for huge sparse linear systems of equations, a Lattice-Boltzmann code for fluid simulation, and a high-order spectral element method for acoustic or seismic wave propagation modelling. We evaluate weak and strong scalability on a cluster of 96 ARM Cortex-A9 dual-core processors and demonstrate that the ARM-based cluster can be more efficient in terms of energy to solution when executing the three applications compared to an x86-based reference machine.
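
    The article's central caveat fits in one formula: energy to solution is E = P_avg x T_wall, so a lower-power machine only wins if its runtime does not grow faster than its power drops. A minimal sketch with illustrative numbers (not measurements from the paper):

        # Energy-to-solution comparison: E = average power * wall-clock time.
        # The power and runtime figures below are made-up illustrative values.

        def energy_to_solution(power_watts, runtime_s):
            return power_watts * runtime_s  # joules

        arm_J = energy_to_solution(power_watts=40.0, runtime_s=3000.0)   # slow but frugal
        x86_J = energy_to_solution(power_watts=350.0, runtime_s=500.0)   # fast but hungry
        print(f"ARM cluster : {arm_J / 1e3:.0f} kJ")   # 120 kJ
        print(f"x86 machine : {x86_J / 1e3:.0f} kJ")   # 175 kJ: here ARM wins
                                                        # despite a 6x longer runtime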

  2. Accelerating Scientific Applications using High Performance Dense and Sparse Linear Algebra Kernels on GPUs

    KAUST Repository

    Abdelfattah, Ahmad

    2015-01-15

    High performance computing (HPC) platforms are evolving to more heterogeneous configurations to support the workloads of various applications. The current hardware landscape is composed of traditional multicore CPUs equipped with hardware accelerators that can handle high levels of parallelism. Graphical Processing Units (GPUs) are popular high performance hardware accelerators in modern supercomputers. GPU programming has a different model than that for CPUs, which means that many numerical kernels have to be redesigned and optimized specifically for this architecture. GPUs usually outperform multicore CPUs in some compute intensive and massively parallel applications that have regular processing patterns. However, most scientific applications rely on crucial memory-bound kernels and may witness bottlenecks due to the overhead of the memory bus latency. They can still take advantage of the GPU compute power capabilities, provided that an efficient architecture-aware design is achieved. This dissertation presents a uniform design strategy for optimizing critical memory-bound kernels on GPUs. Based on hierarchical register blocking, double buffering and latency hiding techniques, this strategy leverages the performance of a wide range of standard numerical kernels found in dense and sparse linear algebra libraries. The work presented here focuses on matrix-vector multiplication kernels (MVM) as representative and most important memory-bound operations in this context. Each kernel inherits the benefits of the proposed strategies. By exposing a proper set of tuning parameters, the strategy is flexible enough to suit different types of matrices, ranging from large dense matrices, to sparse matrices with dense block structures, while high performance is maintained. Furthermore, the tuning parameters are used to maintain the relative performance across different GPU architectures. Multi-GPU acceleration is proposed to scale the performance on several devices. The
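
    Why MVM is memory-bound is easy to quantify with a roofline-style estimate: the kernel performs about 2n^2 flops while streaming n^2 matrix elements from memory, giving roughly 0.25 flop/byte in double precision, far below the ridge point of any modern GPU. The peak numbers below are illustrative, not taken from the dissertation.

        # Roofline-style bound for matrix-vector multiplication (MVM):
        # attainable performance = min(compute peak, bandwidth * intensity).

        def mvm_attainable_gflops(peak_gflops, mem_bw_gbs, n, bytes_per_elem=8):
            flops = 2.0 * n * n                 # one multiply-add per matrix element
            traffic = n * n * bytes_per_elem    # dominant term: streaming the matrix
            intensity = flops / traffic         # ~0.25 flop/byte in double precision
            return min(peak_gflops, mem_bw_gbs * intensity)

        # Illustrative GPU: 1300 GFLOP/s peak, 250 GB/s memory bandwidth.
        print(mvm_attainable_gflops(peak_gflops=1300.0, mem_bw_gbs=250.0, n=8192))
        # -> 62.5 GFLOP/s: bandwidth-limited, ~5% of compute peak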

  3. TYPOLOGIES OF MOBILE APPLICATIONS

    OpenAIRE

    Ion Ivan; Alin Zamfiroiu; Dragoş Palaghiţă

    2013-01-01

    Mobile applications and their particularities are analyzed, and mobile-application-specific characteristics are defined. Types of applications are identified and analyzed. The paper establishes differences between mobile applications and mobile application categories. For each identified type, the specific structures and development model are identified.

  4. ICASE: Scientific Visualization Solutions 6

    Science.gov (United States)

    1997-01-01

    ICASE: Institute for Computer Applications in Science and Engineering. Visualizing the results of supercomputer simulations can be a computationally demanding process. Research in applying supercomputing technology to the problem of data visualization is being conducted at ICASE, at NASA Langley. These clips look at the work of ICASE and are illustrated with examples of complex 3D renderings of data sets.

  5. ICASE: Scientific Visualization Solutions 3

    Science.gov (United States)

    1997-01-01

    ICASE: Institute for Computer Applications in Science and Engineering. Visualizing the results of supercomputer simulations can be a computationally demanding process. Research in applying supercomputing technology to the problem of data visualization is being conducted at ICASE, at NASA Langley. These clips look at the work of ICASE and are illustrated with examples of complex 3D renderings of data sets.

  6. ICASE: Scientific Visualization Solutions 1

    Science.gov (United States)

    1997-01-01

    ICASE: Institute for Computer Applications in Science and Engineering. Visualizing the results of supercomputer simulations can be a computationally demanding process. Research in applying supercomputing technology to the problem of data visualization is being conducted at ICASE, at NASA Langley. These clips look at the work of ICASE and are illustrated with examples of complex 3D renderings of data sets.

  7. ICASE: Scientific Visualization Solutions 7

    Science.gov (United States)

    1997-01-01

    ICASE: Institute for Computer Applications in Science and Engineering. Visualizing the results of supercomputer simulations can be a computationally demanding process. Research in applying supercomputing technology to the problem of data visualization is being conducted at ICASE, at NASA Langley. These clips look at the work of ICASE and are illustrated with examples of complex 3D renderings of data sets.

  8. ICASE: Scientific Visualization Solutions 8

    Science.gov (United States)

    1997-01-01

    ICASE: Institute for Computer Applications in Science and Engineering. Visualizing the results of supercomputer simulations can be a computationally demanding process. Research in applying supercomputing technology to the problem of data visualization is being conducted at ICASE, at NASA Langley. These clips look at the work of ICASE and are illustrated with examples of complex 3D renderings of data sets.

  9. ICASE: Scientific Visualization Solutions 2

    Science.gov (United States)

    1997-01-01

    ICASE: Institute for Computer Applications in Science and Engineering. Visualizing the results of supercomputer simulations can be a computationally demanding process. Research in applying supercomputing technology to the problem of data visualization is being conducted at ICASE, at NASA Langley. These clips look at the work of ICASE and are illustrated with examples of complex 3D renderings of data sets.

  10. ICASE: Scientific Visualization Solutions 4

    Science.gov (United States)

    1997-01-01

    ICASE: Institute for Computer Applications in Science and Engineering. Visualizing the results of supercomputer simulations can be a computationally demanding process. Research in applying supercomputing technology to the problem of data visualization is being conducted at ICASE, at NASA Langley. These clips look at the work of ICASE and are illustrated with examples of complex 3D renderings of data sets.

  11. ICASE: Scientific Visualization Solutions 5

    Science.gov (United States)

    1997-01-01

    ICASE: Institute for Computer Applications in Science and Engineering. Visualizing the results of supercomputer simulations can be a computationally demanding process. Research in applying supercomputing technology to the problem of data visualization is being conducted at ICASE, at NASA Langley. These clips look at the work of ICASE and are illustrated with examples of complex 3D renderings of data sets.

  12. Learning Android application testing

    CERN Document Server

    Blundell, Paul

    2015-01-01

    If you are an Android developer looking to test your applications or optimize your application development process, then this book is for you. No previous experience in application testing is required.

  13. Promise Zones for Applicants

    Data.gov (United States)

    Department of Housing and Urban Development — This tool assists applicants to HUD's Promise Zone initiative prepare data to submit with their application by allowing applicants to draw the exact location of the...

  14. Credential Application Awaiting Information

    Data.gov (United States)

    Department of Homeland Security — When a Credential application or required documentation is incomplete, an Awaiting Information letter is issued. The application process cannot continue until all...

  15. Sight Application Analysis Tool

    Energy Technology Data Exchange (ETDEWEB)

    Bronevetsky, G. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2014-09-17

    The scale and complexity of scientific applications makes it very difficult to optimize, debug and extend them to support new capabilities. We have developed a tool that supports developers’ efforts to understand the logical flow of their applications and interactions between application components and hardware in a way that scales with application complexity and parallelism.

  16. Microwave power engineering applications

    CERN Document Server

    Okress, Ernest C

    2013-01-01

    Microwave Power Engineering, Volume 2: Applications introduces the electronics technology of microwave power and its applications. This technology emphasizes microwave electronics for direct power utilization and transmission purposes. This volume presents the accomplishments with respect to components, systems, and applications and their prevailing limitations in the light of knowledge of the microwave power technology. The applications discussed include the microwave heating and other processes of materials, which utilize the magnetron predominantly. Other applications include microwave ioni

  17. Photorefractive Materials and Their Applications 3 Applications

    CERN Document Server

    Günter, Peter

    2007-01-01

    In this third volume a series of applications on photorefractive nonlinear optics and optical data storage are presented. This and the other two volumes on photorefractive effects, materials and applications have been prepared mainly for researchers in the field, but also for physics, engineering and materials science students. Several chapters contain sufficient introductory material for those not so familiar with the topic to obtain a thorough understanding of the photorefractive effect. We hope that researchers active in the field will find these books to be a very valuable reference source. The other two volumes are: Photorefractive Materials and Their Applications 1: Basic Effects Photorefractive Materials and Their Applications 2: Materials

  18. Engineering Adaptive Applications

    DEFF Research Database (Denmark)

    Dolog, Peter

    ... The different requirements might be satisfied by different variants of features maintained and provided by Web applications. An adaptive Web application can be seen as a family of Web applications where application instances are those generated for a particular user based on his characteristics relevant... for a domain. In this book, we propose a new domain engineering framework which extends the development process of Web applications with techniques required when designing such adaptive customizable Web applications. The framework is provided with design abstractions which deal separately with information served...

  19. Statistical methods of SNP data analysis with applications

    CERN Document Server

    Bulinski, Alexander; Shashkin, Alexey; Yaskov, Pavel

    2011-01-01

    Various statistical methods important for genetic analysis are considered and developed. Namely, we concentrate on multifactor dimensionality reduction, logic regression, random forests and stochastic gradient boosting. These methods and their new modifications, e.g., the MDR method with an "independent rule", are used to study the risk of complex diseases such as cardiovascular ones. The roles of certain combinations of single nucleotide polymorphisms and external risk factors are examined. To perform the data analysis concerning ischemic heart disease and myocardial infarction, the supercomputer SKIF "Chebyshev" of Lomonosov Moscow State University was employed.
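
    As a hedged illustration of one of the named methods, the sketch below fits a random forest to a synthetic SNP genotype matrix (0/1/2 minor-allele counts) with a planted two-SNP interaction. The data are synthetic; the study itself used real cardiovascular cohorts and additional methods such as MDR and logic regression.

        # Illustrative random-forest analysis of a synthetic SNP matrix.
        import numpy as np
        from sklearn.ensemble import RandomForestClassifier
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(42)
        X = rng.integers(0, 3, size=(500, 100))       # 500 subjects x 100 SNPs
        y = (X[:, 7] + X[:, 23] > 2).astype(int)      # toy disease rule on two SNPs

        model = RandomForestClassifier(n_estimators=300, random_state=0)
        print("CV accuracy:", cross_val_score(model, X, y, cv=5).mean())
        top = np.argsort(model.fit(X, y).feature_importances_)[::-1][:5]
        print("Top SNP indices:", top)                # should recover SNPs 7 and 23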

  20. Industrial applications of high-performance computing best global practices

    CERN Document Server

    Osseyran, Anwar

    2015-01-01

    ""This book gives a comprehensive and up-to-date overview of the rapidly expanding field of the industrial use of supercomputers. It is just a pleasure reading through informative country reports and in-depth case studies contributed by leading researchers in the field.""-Jysoo Lee, Principal Researcher, Korea Institute of Science and Technology Information""From telescopes to microscopes, from vacuums to hyperbaric chambers, from sonar waves to laser beams, scientists have perpetually strived to apply technology and invention to new frontiers of scientific advancement. Along the way, they hav

  1. CyberIntegrator: A Highly Interactive Problem Solving Environment to Support Environmental Observatories

    Science.gov (United States)

    Marini, L.; Minsker, B.; Kooper, R.; Myers, J.; Bajcsy, P.

    2006-12-01

    This work presents CyberIntegrator, a component of the Environmental Cyber-Infrastructure Demonstration (ECID) project at the National Center for Supercomputing Applications (NCSA), which is exploring cyberinfrastructure for environmental observatories with an emphasis on supporting exploratory analysis and end-to-end productivity. CyberIntegrator is a novel workflow-based system that supports interactive workflow creation, connection to external data and event streams, provenance tracking, and incorporation of workflow fragments and functionality from other systems and applications, all of which enable the types of tasks expected in observatories. This presentation describes CyberIntegrator's use in three environmental use cases that reveal its novel aspects. The three environmental use cases are: modeling fecal coliform concentrations in Copano Bay, TX; discovering vegetation variability from large remotely sensed images at the US continental scale; detecting anomalous data from sensors in Corpus Christi Bay, TX. The use cases show how CyberIntegrator supports scientists through: 1. Integration of heterogeneous tools: CyberIntegrator can execute tools from multiple (currently six) heterogeneous software packages while hiding integration complexity. It passes the outputs of a software tool to the input of another tool transparently to an end user. 2. Provenance information gathering: Meta-data about scientific analyses are gathered and stored in a meta- data repository as Resource Description Framework (RDF) triples. The information enables tracking of provenance from data to workflow to publication and provides recommendations to users based on previous community activities. 3. Interactive human-computer interface: CyberIntegrator editor provides a user-friendly interface for browsing registries of data, tools and computational resources; creating workflows in a step-by-step exploration mode; re-using and re-purposing workflows; executing process flows locally or
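
    The provenance pattern described in point 2 can be sketched with RDF triples directly. The snippet below uses rdflib as an example toolkit; the namespace and resource names are hypothetical, not those of the ECID metadata repository.

        # Sketch of storing "data -> workflow -> publication" provenance as
        # RDF triples. Namespace and resource names are hypothetical.
        from rdflib import Graph, Literal, Namespace

        EX = Namespace("http://example.org/ecid/")   # hypothetical namespace
        g = Graph()

        dataset = EX["corpus-christi-sensor-feed"]
        workflow = EX["anomaly-detection-workflow"]
        g.add((workflow, EX.usedInput, dataset))
        g.add((workflow, EX.executedBy, Literal("CyberIntegrator")))
        g.add((EX["coliform-paper-2006"], EX.derivedFrom, workflow))

        # Walk the stored provenance triples.
        for s, p, o in g:
            print(s, p, o)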

  2. Engineering Web Applications

    DEFF Research Database (Denmark)

    Casteleyn, Sven; Daniel, Florian; Dolog, Peter

    Nowadays, Web applications are almost omnipresent. The Web has become a platform not only for information delivery, but also for eCommerce systems, social networks, mobile services, and distributed learning environments. Engineering Web applications involves many intrinsic challenges due...

  3. Applicant Satisfaction Survey

    Data.gov (United States)

    Office of Personnel Management — The Chief Human Capital Officers developed 3 surveys that asks applicants to assess their satisfaction with the application process on a 1-10 point scale, with 10...

  4. Nanoplasmonics advanced device applications

    CERN Document Server

    Chon, James W M

    2013-01-01

    Focusing on control and manipulation of plasmons at nanometer dimensions, nanoplasmonics combines the strength of electronics and photonics, and is predicted to replace existing integrated circuits and photonic devices. It is one of the fastest growing fields of science, with applications in telecommunication, consumer electronics, data storage, medical diagnostics, and energy. Nanoplasmonics: Advanced Device Applications provides a scientific and technological background of a particular nanoplasmonic application and outlines the progress and challenges of the application. It reviews the latest

  5. Engineering electrochemical capacitor applications

    Science.gov (United States)

    Miller, John R.

    2016-09-01

    Electrochemical capacitor (EC) applications have broadened tremendously since EC energy storage devices were introduced in 1978. Then typical applications operated below 10 V at power levels below 1 W. Today many EC applications operate at voltages approaching 1000 V at power levels above 100 kW. This paper briefly reviews EC energy storage technology, shows representative applications using EC storage, and describes engineering approaches to design EC storage systems. Comparisons are made among storage systems designed to meet the same application power requirement but using different commercial electrochemical capacitor products.

  6. SIMS applications workshop. Proceedings

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-04-01

    The first ANSTO/AINSE SIMS Workshop drew together a mixture of Surface Analysis experts and Surface Analysis users with the concept that SIMS analysis has to be enfolded within the spectrum of surface analysis techniques and that the user should select the technique most applicable to the problem. With this concept in mind, the program was structured as sessions on SIMS Facilities; Applications to Mineral Surfaces; Applications to Biological Systems; Applications to Surfaces as Semiconductors, Catalysts and Surface Coatings; and Applications to Ceramics.

  7. Cyberinfrastructure for Data Authorship, Publication and Application Interoperability

    Science.gov (United States)

    Helly, J. J.

    2012-12-01

    Since the mid-1990s, at the San Diego Supercomputer Center (SDSC) at the University of California, San Diego (UCSD), we have been building digital library systems for a range of disciplines and evolving the underlying cyberinfrastructure components through generations of deployed, operational systems. These include applications for coastal resource management (California Coastal Atlas), blue-water oceanography (SIOExplorer), deep-ocean drilling (Integrated Ocean Drilling Program Site Survey Data Bank), atmospheric science (Center for Multi-scale Modeling of Atmospheric Processes (CMMAP)) and geospatial data-sharing across the State of California (CSDI). SIOExplorer and the IODP SSDB have been operational for about ten years under the control of staff at SIO, using earlier versions of the technologies we propose to leverage here. Recently, CLIDEEP (Climate Impacts on the Deep Ocean), a part of the International Network for Scientific Investigation of Deep-Sea Ecosystems (INDEEP), has been added to the list of projects to which this technology will be applied, thereby entraining a new community of ecologists in best-management practices for scientific data and data publication. Since those earlier systems were made operational, continuing developments have led to the evolution of the Digital Library Framework and Digital Library System technologies to facilitate the production of multi-lateral metadata conforming to a variety of metadata standards (e.g., Dublin Core, FGDC, ISO19139) and to automate the production of the metadata required to obtain a digital object identifier (DOI) from the CrossRef and DataCite cross-referencing systems. Since the emergence of CrossRef, there is now also a citation service called DataCite. For about the past two years, a Data Citation Standards and Practices Task Group under the International Council for Science / Committee on Data for Science and Technology (ICSU/CODATA) was formed to develop a report for its international membership
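
    For orientation, the sketch below assembles the kind of minimal descriptive metadata a DataCite-style DOI registration requires. The field names follow the public DataCite metadata kernel; the dataset and its values are hypothetical, and this is not the Digital Library System's actual code.

        # Assemble minimal DataCite-kernel-style metadata for a DOI request.
        # All values below are hypothetical placeholders.
        import json

        def doi_metadata(title, creators, publisher, year, resource_type):
            return {
                "titles": [{"title": title}],
                "creators": [{"name": c} for c in creators],
                "publisher": publisher,
                "publicationYear": year,
                "types": {"resourceTypeGeneral": resource_type},
            }

        record = doi_metadata(
            title="CLIDEEP deep-ocean temperature profiles",   # hypothetical dataset
            creators=["Helly, J. J."],
            publisher="San Diego Supercomputer Center",
            year=2012,
            resource_type="Dataset",
        )
        print(json.dumps(record, indent=2))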

  8. Electronic Submissions of Pesticide Applications

    Science.gov (United States)

    Applications for pesticide registration can be submitted electronically, including forms, studies, and draft product labeling. Applicants need not submit multiple electronic copies of any pieces of their applications.

  9. Criteria for Social Applications

    DEFF Research Database (Denmark)

    Atzenbeck, Claus

    2007-01-01

    Social networks are becoming increasingly important for a wide number of applications. This is in particular true in the context of the Web 2.0 movement, where a number of Web-based applications emerged - termed social networking applications or services - that allow the articulation of social relationships between individuals, thus creating social networks. Although Web 2.0 applications are a popular and characteristic class of such applications, they are not the only representatives that permit such functionality. Applications in the Personal Information Management domain exhibit similar characteristics but have never been mentioned in the context of social networking. The increasing number and diversity of such applications makes their study, analysis and evaluation from a systems point of view critical and important, as their study may help identify relationships that are useful when attempting to represent a community. In this paper we outline a framework for analyzing applications that permit the construction of social networks. Our main focus is on the abstractions and mechanisms that a number of applications provide to facilitate the building of such networks.

  11. A Survey of Distributed Capability File Systems and Their Application to Cloud Environments

    Science.gov (United States)

    2014-09-01

    Department of the Navy memorandum N2N6/4U119014. [2] I. R. Porche III, B. Wilson, E.-E. Johnson, S. Tierney, and E. Saltzman, “Data flood: Helping...file systems,” in Proceedings of the 2007 ACM/IEEE Conference on Supercomputing (SC’07), Nov. 2007, pp. 1–12. [61] J. G. Steiner, C. Neuman, and J. I

  12. Enabling Extreme Scale Earth Science Applications at the Oak Ridge Leadership Computing Facility

    Science.gov (United States)

    Anantharaj, V. G.; Mozdzynski, G.; Hamrud, M.; Deconinck, W.; Smith, L.; Hack, J.

    2014-12-01

    The Oak Ridge Leadership Computing Facility (OLCF), established at the Oak Ridge National Laboratory (ORNL) under the auspices of the U.S. Department of Energy (DOE), welcomes investigators from universities, government agencies, national laboratories and industry who are prepared to perform breakthrough research across a broad domain of scientific disciplines, including earth and space sciences. Titan, the OLCF flagship system, is currently listed as #2 in the Top500 list of supercomputers in the world, and is the largest available for open science. The computational resources are allocated primarily via the Innovative and Novel Computational Impact on Theory and Experiment (INCITE) program, sponsored by the U.S. DOE Office of Science. In 2014, over 2.25 billion core hours on Titan were awarded via INCITE projects, including 14% of the allocation toward earth sciences. The INCITE competition is also open to research scientists based outside the USA; in fact, international research projects account for 12% of the INCITE awards in 2014, and the INCITE scientific review panel includes 20% participation from international experts. Recent accomplishments in earth sciences at OLCF include the world's first continuous simulation of 21,000 years of earth's climate history (2009) and an unprecedented simulation of a magnitude 8 earthquake over 125 sq. miles. One of the ongoing international projects involves scaling the ECMWF Integrated Forecasting System (IFS) model to over 200K cores of Titan. ECMWF is a partner in the EU-funded Collaborative Research into Exascale Systemware, Tools and Applications (CRESTA) project. The significance of the research carried out within this project is the demonstration of techniques required to scale current-generation Petascale-capable simulation codes towards the performance levels required for running on future Exascale systems. One of the techniques pursued by ECMWF is to use Fortran 2008 coarrays to overlap computations and communications.
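
    The coarray technique mentioned above overlaps communication with computation. As an analogous illustration only (not ECMWF's IFS code, which uses Fortran 2008 coarrays natively), the following Python sketch achieves the same overlap with non-blocking MPI via mpi4py: halo messages are posted, interior work proceeds while they are in flight, and boundary-dependent work waits for completion.

        # Illustrative sketch (not IFS): overlap communication with computation
        # using non-blocking MPI. Run under mpirun with mpi4py installed.
        import numpy as np
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()
        left, right = (rank - 1) % size, (rank + 1) % size

        local = np.random.rand(1_000_000)   # this rank's slab of a global field
        halo = np.empty(2)                  # ghost values from the two neighbours

        # 1. Post non-blocking sends/receives of the slab edges.
        reqs = [comm.Isend(local[:1],  dest=left,  tag=0),    # my left edge
                comm.Isend(local[-1:], dest=right, tag=1),    # my right edge
                comm.Irecv(halo[0:1],  source=left,  tag=1),  # neighbour's right edge
                comm.Irecv(halo[1:2],  source=right, tag=0)]  # neighbour's left edge

        # 2. Interior work that needs no neighbour data runs while messages fly.
        interior_sum = local[1:-1].sum()

        # 3. Wait for the halo exchange, then finish the boundary-dependent work.
        MPI.Request.Waitall(reqs)
        total = interior_sum + local[0] + local[-1] + halo.sum()
        print(f"rank {rank}: combined value {total:.3f}")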

  13. Towards 21st Century Stellar Models: Star Clusters, Supercomputing, and Asteroseismology

    DEFF Research Database (Denmark)

    Campbell, S. W.; Constantino, T. N.; D'Orazi, V.;

    2016-01-01

    Stellar models provide a vital basis for many aspects of astronomy and astrophysics. Recent advances in observational astronomy -- through asteroseismology, precision photometry, high-resolution spectroscopy, and large-scale surveys -- are placing stellar models under greater quantitative scrutiny than ever. Here we give a brief overview of the evolution, importance, and substantial uncertainties of core helium burning stars in particular, and then briefly discuss a range of methods, both theoretical and observational, that we are using to advance the modelling.

  14. Theory, design, and simulation of GASP: A block data flow architecture for gallium arsenide supercomputers

    Energy Technology Data Exchange (ETDEWEB)

    Fouts, D.J.

    1990-01-01

    The advantages and disadvantages of using high-speed gallium arsenide (GaAs) logic for implementing digital systems are reviewed. A set of design guidelines is presented for systems that will be constructed with high-speed technologies such as GaAs and silicon emitter coupled logic (ECL). A new class of computer and digital system architectures, known as functionally modular architectures, is defined and explained. Functionally modular architectures are ideal for implementation in GaAs because they adhere to the design guidelines. GASP, a new, functionally modular, block data flow computer architecture is then described. SPICE simulations indicate that if constructed with existing GaAs IC technology, parts of GASP could run at a clock speed of 1 GHz, with the rest of the architecture using a 500 MHz clock. The new architecture uses data flow techniques at a program block level, which allows efficient execution of parallel programs while maintaining reasonably good performance on sequential programs. A simulation study of the architecture's best case and worst case performance is presented. Simulations of GASP executing a highly parallel program indicate that an instruction execution rate of over 30,000 MIPS can be attained with a 65 processor system.

  15. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers

    Directory of Open Access Journals (Sweden)

    Mark James Abraham

    2015-09-01

    Full Text Available GROMACS is one of the most widely used open-source and free software codes in chemistry, used primarily for dynamical simulations of biomolecules. It provides a rich set of calculation types and preparation and analysis tools, and several advanced techniques for free-energy calculations are supported. In version 5, it reaches new performance heights through several new and enhanced parallelization algorithms. These work at every level: SIMD registers inside cores, multithreading, heterogeneous CPU–GPU acceleration, state-of-the-art 3D domain decomposition, and ensemble-level parallelization through built-in replica exchange and the separate Copernicus framework. The latest best-in-class compressed trajectory storage format is supported.

  16. Distributed Processing of PIV images with a low power cluster supercomputer

    Science.gov (United States)

    Smith, Barton; Horne, Kyle; Hauser, Thomas

    2007-11-01

    Recent advances in digital photography and solid-state lasers make it possible to acquire images at up to 3000 frames per second. However, as the ability to acquire large samples very quickly has been realized, processing speed has not kept pace. A 2-D Particle Image Velocimetry (PIV) acquisition computer would require over five hours to process the data that can be acquired in one second with a Time-resolved Stereo PIV (TRSPIV) system. To decrease the computational time, parallel processing using a Beowulf cluster has been applied. At USU we have developed a low-power Beowulf cluster integrated with the data acquisition system of a TRSPIV system. This approach of integrating the PIV system and the Beowulf cluster eliminates the communication time, thus speeding up the process. In addition to improving the practicality of TRSPIV, this system will also be useful to researchers performing any PIV measurement where a large number of samples are required. Our presentation will describe the hardware and software implementation of our approach.
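
    The per-sample workload being distributed is dominated by the cross-correlation of interrogation windows between consecutive frames. A minimal Python sketch of one FFT-based window correlation follows; it is illustrative only, not the USU implementation, and since each window is independent such calls parallelize trivially across cluster nodes.

        # Minimal sketch of the core PIV kernel: FFT-based circular
        # cross-correlation of two interrogation windows. Synthetic data only.
        import numpy as np
        from numpy.fft import fft2, ifft2

        def piv_displacement(win_a, win_b):
            """Estimate particle displacement (dx, dy) of window b relative to a."""
            a = win_a - win_a.mean()
            b = win_b - win_b.mean()
            corr = np.real(ifft2(fft2(b) * np.conj(fft2(a))))
            peak = np.unravel_index(np.argmax(corr), corr.shape)
            # Wrap peak indices so the displacement is signed, not modulo N.
            dy = peak[0] if peak[0] <= corr.shape[0] // 2 else peak[0] - corr.shape[0]
            dx = peak[1] if peak[1] <= corr.shape[1] // 2 else peak[1] - corr.shape[1]
            return dx, dy

        rng = np.random.default_rng(0)
        frame_a = rng.random((32, 32))
        frame_b = np.roll(frame_a, shift=(2, 3), axis=(0, 1))  # 3 right, 2 down
        print(piv_displacement(frame_a, frame_b))              # -> (3, 2)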

  17. Time Parallel Solution of Linear Partial Differential Equations on the Intel Touchstone Delta Supercomputer

    Science.gov (United States)

    Toomarian, N.; Fijany, A.; Barhen, J.

    1993-01-01

    Evolutionary partial differential equations are usually solved by discretization in time and space, and by applying a marching-in-time procedure to data and algorithms potentially parallelized in the spatial domain.
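
    A minimal Python sketch of this conventional approach, applied to the 1-D heat equation u_t = alpha * u_xx: discretize in space, then march forward in time. Every interior point updates independently at each step, which is where spatial-domain parallelism applies. The grid size and step count are arbitrary illustrative choices.

        # Explicit time marching for the 1-D heat equation (illustrative only).
        import numpy as np

        alpha, L, nx = 1.0, 1.0, 101
        dx = L / (nx - 1)
        dt = 0.4 * dx**2 / alpha                   # within the explicit stability bound
        u = np.sin(np.pi * np.linspace(0, L, nx))  # initial condition, u = 0 at the ends

        for _ in range(200):                       # marching in time
            u[1:-1] += alpha * dt / dx**2 * (u[2:] - 2 * u[1:-1] + u[:-2])

        print(f"peak after 200 steps: {u.max():.4f}")  # decays toward zero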

  18. Erasmus Computing Grid : Het Bouwen van een 20 TeraFLOP Virtuelle Supercomputer

    NARCIS (Netherlands)

    T.A. Knoch (Tobias); L.V. de Zeeuw (Luc)

    2007-01-01

    The Erasmus Medical Center (Erasmus MC) and Hogeschool Rotterdam (HR) began a unique collaboration in 2005 to make 95% of the capacity of all their computers, and those of others, available for research and education. This collaboration has led to the Erasmus Computing Grid.

  19. Solving sparse linear least squares problems on some supercomputers by using large dense blocks

    DEFF Research Database (Denmark)

    Hansen, Per Christian; Ostromsky, T; Sameh, A;

    1997-01-01

    Dense matrix technique is preferable to sparse matrix technique when the matrices are not large, because the high computational speed fully compensates for the disadvantages of using more arithmetic operations and more storage. For very large matrices the computations must be organized as a sequence of tasks, in each of which ... the matrix is reordered so that dense blocks can be constructed and treated with some standard software, say LAPACK or NAG. These ideas are implemented for linear least-squares problems. The rectangular matrices that appear in such problems are decomposed by an orthogonal method. Results obtained on a CRAY C92A...
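
    A minimal dense Python sketch of the orthogonal-decomposition step described above: the rectangular matrix of a least-squares problem is QR-factorized and the resulting triangular system is solved. Production sparse codes apply the same idea block-wise through LAPACK-style kernels; the matrix here is random and purely illustrative.

        # Solve min ||Ax - b|| via QR factorization (dense, illustrative sketch).
        import numpy as np

        rng = np.random.default_rng(1)
        A = rng.standard_normal((200, 5))          # tall rectangular matrix
        x_true = np.arange(1.0, 6.0)
        b = A @ x_true + 0.01 * rng.standard_normal(200)

        Q, R = np.linalg.qr(A)                     # A = QR, Q orthonormal, R triangular
        x = np.linalg.solve(R, Q.T @ b)            # solve R x = Q^T b
        print(np.round(x, 3))                      # close to [1, 2, 3, 4, 5]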

  20. Comparison of neuronal spike exchange methods on a Blue Gene/P supercomputer

    Directory of Open Access Journals (Sweden)

    Michael eHines

    2011-11-01

    Full Text Available The performance of several spike exchange methods using a Blue Gene/P supercomputer has been tested with 8K to 128K cores, using randomly connected networks of up to 32M cells with 1k connections per cell and 4M cells with 10k connections per cell. The spike exchange methods used are the standard Message Passing Interface collective, MPI_Allgather, and several variants of the non-blocking multisend method, either implemented via non-blocking MPI_Isend or exploiting the very low overhead direct memory access communication available on the Blue Gene/P. In all cases the worst performing method was the one using MPI_Isend, due to the high overhead of initiating a spike communication. The two best performing methods, the persistent multisend method using the Record-Replay feature of the Deep Computing Messaging Framework (DCMF_Multicast), and a two-phase multisend in which a DCMF_Multicast is used to first send to a subset of phase-1 destination cores which then pass it on to their subset of phase-2 destination cores, had similar performance, with very low overhead for the initiation of spike communication. Departure from ideal scaling for the multisend methods is almost completely due to load imbalance caused by the large variation in the number of cells that fire on each processor in the interval between synchronizations. Spike exchange time itself is negligible, since transmission overlaps with computation and is handled by a direct memory access controller. We conclude that ideal performance scaling will ultimately be limited by the imbalance in incoming spikes between synchronization intervals. Thus, counterintuitively, maximizing load balance requires that the distribution of cells on processors should not reflect the neural net architecture but be random, so that sets of cells which burst-fire together are on different processors, with their targets spread over as large a set of processors as possible.
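
    For reference, the baseline exchange pattern studied above (every rank receives all spikes fired anywhere since the last synchronization) can be sketched in a few lines with mpi4py. This illustrates MPI_Allgather semantics only and is unrelated to the Blue Gene/P implementation; the cell-id scheme is made up for the example.

        # All-to-all spike exchange via allgather (illustrative sketch only).
        import random
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()

        # Spikes generated locally this interval: (cell id, firing time) pairs.
        local_spikes = [(100 * rank + i, random.random())
                        for i in range(random.randint(0, 5))]

        all_spikes = comm.allgather(local_spikes)   # one list per rank, on every rank
        incoming = [s for per_rank in all_spikes for s in per_rank]
        print(f"rank {rank}: delivered {len(incoming)} spikes this interval")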

  1. Defect generation and motion in polyethylene-like crystals, analyzed by simulation with supercomputers

    Energy Technology Data Exchange (ETDEWEB)

    Wunderlich, B.; Xenopoulos, A.; Noid, D.W.; Sumpter, B.G. (Oak Ridge National Lab., TN (USA) Tennessee Univ., Knoxville, TN (USA). Dept. of Chemistry)

    1990-01-01

    Defects in polymers were observed by high-resolution electron microscopy and inferred from their mechanical and dielectric behavior, but the details of their generation were not previously known. During the last few years we have been able to extend the molecular dynamics simulation of polyethylene to crystals containing up to 6100 atoms and to times as long as 100 ps. The major observation was that single-bond rotations of more than 90° become possible at temperatures more than 100 K below the melting temperature. These defects have lifetimes of only a few ps; by coupling to kinks (2g1) they can extend their lifetime considerably. Adding a thermal, mechanical, or dielectric free-energy gradient to the thermally created defects seems able to account for the microscopic motion needed to explain the macroscopically observed annealing, deformation, and relaxation effects. Key to the mechanical and dielectric properties is thus the existence of conformational disorder (condis crystal). 47 refs., 9 figs.

  2. UbiWorld: An environment integrating virtual reality, supercomputing, and design

    Energy Technology Data Exchange (ETDEWEB)

    Disz, T.; Papka, M.E.; Stevens, R. [Argonne National Lab., IL (United States). Mathematics and Computer Science Div.

    1997-07-01

    UbiWorld is a concept being developed by the Futures Laboratory group at Argonne National Laboratory that ties together the notion of ubiquitous computing (Ubicomp) with that of using virtual reality for rapid prototyping. The goal is to develop an environment where one can explore Ubicomp-type concepts without having to build real Ubicomp hardware. The basic notion is to extend object models in a virtual world by using distributed wide area heterogeneous computing technology to provide complex networking and processing capabilities to virtual reality objects.

  3. A Sixty-Year Timeline of the Air Force Maui Optical and Supercomputing Site

    Science.gov (United States)

    2013-01-01

    The Advanced Research Projects Agency (ARPA) proposed the ARPA Midcourse Observation Station (AMOS) as an astronomical-quality observatory, with guidance later provided on the implementation of a basic research program at the site; in 2001, AFRL Det 15 initiated a project to develop it further.

  4. DNS of MHD turbulent flow via the HELIOS supercomputer system at IFERC-CSC

    Science.gov (United States)

    Satake, Shin-ichi; Kimura, Masato; Yoshimori, Hajime; Kunugi, Tomoaki; Takase, Kazuyuki

    2014-06-01

    Simulation plays an important role in estimating the cooling characteristics of a blanket under the high plasma heating expected in ITER-BA. The objective of this study is to perform large-scale direct numerical simulation (DNS) of heat transfer in magnetohydrodynamic (MHD) turbulent flow for coolant materials ranging from Flibe to lithium. The coolant flow conditions in ITER-BA are assumed to correspond to high Reynolds and Hartmann numbers. Based on the benchmark results on Helios at IFERC-CSC for Project cycle 1, the maximum DNS target assumed in this study is 116 TB (2048 nodes). We also tested direct visualization of the large-scale computational results with ParaView. If such large-scale DNS becomes possible, it will contribute greatly to an essential understanding and modelling of MHD turbulent flow and to the design of nuclear fusion reactors.

  5. Using Mitrion-C to Implement Floating-Point Arithmetic on a Cray XD1 Supercomputer

    Science.gov (United States)

    2008-01-01

  6. LDRD final report : a lightweight operating system for multi-core capability class supercomputers.

    Energy Technology Data Exchange (ETDEWEB)

    Kelly, Suzanne Marie; Hudson, Trammell B. (OS Research); Ferreira, Kurt Brian; Bridges, Patrick G. (University of New Mexico); Pedretti, Kevin Thomas Tauke; Levenhagen, Michael J.; Brightwell, Ronald Brian

    2010-09-01

    The two primary objectives of this LDRD project were to create a lightweight kernel (LWK) operating system (OS) designed to take maximum advantage of multi-core processors, and to leverage the virtualization capabilities in modern multi-core processors to create a more flexible and adaptable LWK environment. The most significant technical accomplishments of this project were the development of the Kitten lightweight kernel, the co-development of the SMARTMAP intra-node memory mapping technique, and the development and demonstration of a scalable virtualization environment for HPC. Each of these topics is presented in this report by the inclusion of a published or submitted research paper. The results of this project are being leveraged by several ongoing and new research projects.

  7. What would a data scientist do with 10 seconds on a supercomputer?

    Science.gov (United States)

    Nychka, D. W.

    2014-12-01

    The statistical problems of large climate datasets, the flexibility of high-level data languages such as R, and the architectures of current supercomputers have motivated a different paradigm for data analysis problems that are amenable to being parallelized. Part of the switch in thinking is to harness many cores for a short amount of time to produce interactive-like exploratory data analysis for the space-time data sets typically encountered in the geosciences. As motivation we consider the near-interactive analysis of daily observed temperature and rainfall fields for North America over the past 30 years. For certain kinds of analysis the potential is for speedups on the order of a factor of 1000 or more, which would change traditional workflows of statistical modeling and inference for large geophysical datasets.
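
    A minimal Python sketch of this paradigm: an embarrassingly parallel per-grid-cell statistic (here, a 30-year least-squares trend) fanned out over many cores for a burst of exploratory analysis. The data are synthetic; on a supercomputer the local process pool would be replaced by many distributed workers.

        # Per-grid-cell trend estimation in parallel (illustrative sketch).
        import numpy as np
        from multiprocessing import Pool

        YEARS = np.arange(30, dtype=float)

        def trend_slope(series):
            """Least-squares slope of one grid cell's 30-year series."""
            return np.polyfit(YEARS, series, 1)[0]

        if __name__ == "__main__":
            rng = np.random.default_rng(42)
            # 10,000 grid cells x 30 years, with a 0.02/yr trend plus noise.
            grid = rng.standard_normal((10_000, 30)) + 0.02 * YEARS
            with Pool() as pool:
                slopes = pool.map(trend_slope, grid)
            print(f"mean trend: {np.mean(slopes):.3f} per year")  # ~0.02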

  9. Assessing the Need for Supercomputing Resources Within the Pacific Area of Responsibility

    Science.gov (United States)

    2015-05-26

    A large database can be divided into smaller shards that are distributed to many nodes, and each node performs the search in parallel with the rest; for example, when searching a large database for records satisfying a particular condition, the matching records are collected after the parallel searches on each shard are completed.
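
    In a minimal Python sketch, this shard-and-gather pattern looks like the following (the records and matching condition are hypothetical):

        # Sharded parallel search with a gather step (illustrative sketch).
        from multiprocessing import Pool

        def search_shard(shard):
            """Return the records in one shard satisfying the condition."""
            return [r for r in shard if r % 1000 == 0]

        if __name__ == "__main__":
            records = list(range(1_000_000))
            n_shards = 8
            shards = [records[i::n_shards] for i in range(n_shards)]
            with Pool(n_shards) as pool:
                partial = pool.map(search_shard, shards)   # parallel searches
            matches = [r for part in partial for r in part]  # gather
            print(f"{len(matches)} matching records")        # 1000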

  10. The Erasmus Computing Grid – Building a Super-Computer for Free

    NARCIS (Netherlands)

    T.A. Knoch (Tobias); A. Abuseiris (Anis); R.M. de Graaf (Rob); M. Lesnussa (Michael); F.G. Grosveld (Frank)

    2011-01-01

    Today advances in scientific research as well as clinical diagnostics and treatment are inevitably connected with information solutions concerning computation power and information storage. The needs for information technology are enormous and are in many cases the limiting factor.

  11. The Erasmus Computing Grid - Building a Super-Computer for FREE

    NARCIS (Netherlands)

    T.A. Knoch (Tobias); L.V. de Zeeuw (Luc)

    2007-01-01

    Today advances in scientific research as well as clinical diagnostics and treatment are inevitably connected with information solutions concerning computation power and information storage. The needs for information technology are enormous and are in many cases the limiting factor.

  12. Erasmus Computing Grid: Het bouwen van een 20 Tera-FLOPS Virtuele Supercomputer.

    NARCIS (Netherlands)

    L.V. de Zeeuw (Luc); T.A. Knoch (Tobias); J.H. van den Berg (Jan); F.G. Grosveld (Frank)

    2007-01-01

    The Erasmus Medical Center and Hogeschool Rotterdam began a collaboration in 2005 to make the roughly 95% of unused computing capacity of their computers available for research and education. This collaboration has led to the Erasmus Computing GRID (ECG), a virtual supercomputer.

  13. Installation of the CDC 7600 supercomputer system in the computer centre in 1972

    CERN Multimedia

    Nettz, William

    1972-01-01

    The CDC 7600 was installed in 1972 in the newly built computer centre. It was said to be the largest and most powerful computer system in Europe at that time and remained the fastest machine at CERN for 9 years. It was replaced after 12 years. Dr. Julian Blake (CERN), Dr. Tor Bloch (CERN), Erwin Gasser (Control Data Corporation), Jean-Marie LaPorte (Control Data Corporation), Peter McWilliam (Control Data Corporation), Hans Oeshlein (Control Data Corporation), and Peter Warn (Control Data Corporation) were heavily involved in this project and may appear on the pictures. William Nettz (who took the pictures) was in charge of the installation. Excerpt from CERN annual report 1972: 'Data handling and evaluation is becoming an increasingly important part of physics experiments. In order to meet these requirements a new central computer system, CDC 7600/6400, has been acquired and it was brought into more or less regular service during the year. Some initial hardware problems have disappeared but work has still to...

  16. Optimization of the computational load of a hypercube supercomputer onboard a mobile robot

    Energy Technology Data Exchange (ETDEWEB)

    Barhen, J.; Toomarian, N.; Protopopescu, V.

    1987-12-01

    A combinatorial optimization methodology is developed, which enables the efficient use of hypercube multiprocessors onboard mobile intelligent robots dedicated to time-critical missions. The methodology is implemented in terms of large-scale concurrent algorithms based either on fast simulated annealing, or on nonlinear asynchronous neural networks. In particular, analytic expressions are given for the effect of single-neuron perturbations on the systems' configuration energy. Compact neuromorphic data structures are used to model effects such as precedence constraints, processor idling times, and task-schedule overlaps. Results for a typical robot-dynamics benchmark are presented.
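
    As a much simplified illustration of annealing-based schedule optimization (far simpler than the paper's neuromorphic formulation, and not its actual algorithm), the following Python sketch assigns tasks to processors by simulated annealing, minimizing the load of the busiest processor and hence the idling time of the others. All costs and parameters are made up.

        # Simulated annealing for task-to-processor assignment (illustrative).
        import math
        import random

        random.seed(0)
        tasks = [random.randint(1, 20) for _ in range(40)]    # per-task costs
        n_procs = 8
        assign = [random.randrange(n_procs) for _ in tasks]   # random initial schedule

        def makespan(schedule):
            loads = [0] * n_procs
            for cost, proc in zip(tasks, schedule):
                loads[proc] += cost
            return max(loads)                                 # busiest-processor load

        current, temp = makespan(assign), 50.0
        while temp > 0.01:
            i = random.randrange(len(tasks))                  # perturb: move one task
            old_proc = assign[i]
            assign[i] = random.randrange(n_procs)
            delta = makespan(assign) - current
            if delta <= 0 or random.random() < math.exp(-delta / temp):
                current = makespan(assign)                    # accept the move
            else:
                assign[i] = old_proc                          # reject and roll back
            temp *= 0.999                                     # geometric cooling
        print(f"busiest-processor load after annealing: {current}")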

  17. Introductory User’s Guide to the ARL Supercomputer Facility at APG

    Science.gov (United States)

    1993-06-01

    Among other topics, the guide covers program analysis tools such as flint and interlanguage communication between Fortran and C programs under UNICOS (see the UNICOS Standard C Library Reference Manual, SR-2080 6.0).

  18. Novel Supercomputing Approaches for High Performance Linear Algebra Using FPGAs Project

    Data.gov (United States)

    National Aeronautics and Space Administration — We propose to develop novel FPGA-based algorithmic technology that will enable unprecedented computational power for the solution of large sparse linear equation...

  19. Integration Of PanDA Workload Management System With Supercomputers for ATLAS

    CERN Document Server

    Oleynik, Danila; The ATLAS collaboration; De, Kaushik; Wenaus, Torre; Maeno, Tadashi; Barreiro Megino, Fernando Harald; Nilsson, Paul; Guan, Wen; Panitkin, Sergey

    2016-01-01

    The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited with the discovery of a Higgs boson. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses PanDA (Production ANd Distributed Analysis system) Workload Management System for managing the workflow for all data processing on over 150 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more t...

  20. Towards 21st Century Stellar Models: Star Clusters, Supercomputing, and Asteroseismology

    CERN Document Server

    Campbell, S W; D'Orazi, V; Meakin, C; Stello, D; Christensen-Dalsgaard, J; Kuehn, C; De Silva, G M; Arnett, W D; Lattanzio, J C; MacLean, B T

    2015-01-01

    Stellar models provide a vital basis for many aspects of astronomy and astrophysics. Recent advances in observational astronomy -- through asteroseismology, precision photometry, high-resolution spectroscopy, and large-scale surveys -- are placing stellar models under greater quantitative scrutiny than ever. The model limitations are being exposed and the next generation of stellar models is needed as soon as possible. The current uncertainties in the models propagate to the later phases of stellar evolution, hindering our understanding of stellar populations and chemical evolution. Here we give a brief overview of the evolution, importance, and substantial uncertainties of core helium burning stars in particular and then briefly discuss a range of methods, both theoretical and observational, that we are using to advance the modelling.

  1. Parallel Supercomputing PC Cluster and Some Physical Results in Lattice QCD

    Institute of Scientific and Technical Information of China (English)

    LUO Xiang-Qian; MEI Zhong-Hao; Eric B. Gregory; YANG Jie-Chao; WANG Yu-Li; LIN Yin

    2003-01-01

    We describe the construction of a high performance parallel computer composed of PC components, present some physical results for light hadron and hybrid meson masses from lattice QCD. We also show that the smearing technique is very useful for improving the spectrum calculations.

  3. Good Seeing: Best Practices for Sustainable Operations at the Air Force Maui Optical and Supercomputing Site

    Science.gov (United States)

    2016-01-01

    To support remote operation and instrument switching, a steady stream of telemetry must be supplied by detectors and actuators situated at all points of possible failure, and remote staff must have constant access to this telemetry. The telescope must be able to be opened and closed remotely, and weather changes must be monitored. Wide- and narrow-field facilities rarely occupy the same site, so this dual capacity is part of AMOS's unique value proposition.

  4. Cyberinfrastructure for Atmospheric Discovery

    Science.gov (United States)

    Wilhelmson, R.; Moore, C. W.

    2004-12-01

    Each year across the United States, floods, tornadoes, hail, strong winds, lightning, hurricanes, and winter storms cause hundreds of deaths, routinely disrupt transportation and commerce, and result in billions of dollars in annual economic losses. MEAD and LEAD are two recent efforts aimed at developing the cyberinfrastructure for studying and forecasting these events through collection, integration, and analysis of observational data coupled with numerical simulation, data mining, and visualization. MEAD (Modeling Environment for Atmospheric Discovery) has been funded for two years as an NCSA (National Center for Supercomputing Applications) Alliance Expedition. The goal of this expedition has been the development/adaptation of cyberinfrastructure that will enable research simulations, data mining, machine learning and visualization of hurricanes and storms utilizing high performance computing environments including the TeraGrid. Portal grid and web infrastructure are being tested that will enable launching of hundreds of individual WRF (Weather Research and Forecasting) simulations. In a similar way, multiple Regional Ocean Modeling System (ROMS) or WRF/ROMS simulations can be carried out. Metadata and the resulting large volumes of data will then be made available for further study and for educational purposes using analysis, mining, and visualization services. Management of these activities (services) is being enabled through Grid workflow technologies (e.g. OGCE). Initial coupling of the ROMS and WRF codes has been completed and parallel I/O is being implemented for these models. LEAD (Linked Environments for Atmospheric Discovery) is a recently funded 5-year, large NSF ITR grant that involves 9 institutions that are developing a comprehensive national cyberinfrastructure in mesoscale meteorology, particularly one that can interoperate with others being developed. LEAD is addressing the fundamental information technology (IT) research challenges needed

  5. Technical applications of aerogels

    Energy Technology Data Exchange (ETDEWEB)

    Hrubesh, L.W.

    1997-08-18

    Aerogel materials possess such a wide variety of exceptional properties that a striking number of applications have developed for them. Many of the commercial applications of aerogels, such as catalysts, thermal insulation, windows, and particle detectors, are still under development, and new applications have been publicized since the ISA4 Conference in 1994: e.g., supercapacitors, insulation for heat storage in automobiles, electrodes for capacitive deionization, etc. More applications are evolving as the scientific and engineering community becomes familiar with the unusual and exceptional physical properties of aerogels; there are scientific and technical applications as well. This paper discusses a variety of applications under development at Lawrence Livermore National Laboratory, for which several types of aerogels are formed in custom sizes and shapes. Particular discussion will focus on the uses of aerogels in physics experiments which rely on the exceptional, sometimes unique, properties of aerogels.

  6. Applications of Photocatalytic Disinfection

    Directory of Open Access Journals (Sweden)

    Joanne Gamage

    2010-01-01

    Full Text Available Due to the superior ability of photocatalysis to inactivate a wide range of harmful microorganisms, it is being examined as a viable alternative to traditional disinfection methods such as chlorination, which can produce harmful byproducts. Photocatalysis is a versatile and effective process that can be adapted for use in many applications for disinfection in both air and water matrices. Additionally, photocatalytic surfaces are being developed and tested for use in the context of “self-disinfecting” materials. Studies on the photocatalytic technique for disinfection demonstrate this process to have potential for widespread applications in indoor air and environmental health, biological, and medical applications, laboratory and hospital applications, pharmaceutical and food industry, plant protection applications, wastewater and effluents treatment, and drinking water disinfection. Studies on photocatalytic disinfection using a variety of techniques and test organisms are reviewed, with an emphasis on the end-use application of developed technologies and methods.

  7. Industrial Application of Accelerators

    CERN Document Server

    CERN. Geneva

    2017-01-01

    At CERN, we are very familiar with large, high energy particle accelerators. However, in the world outside CERN, there are more than 35000 accelerators which are used for applications ranging from treating cancer, through making better electronics to removing harmful micro-organisms from food and water. These are responsible for around $0.5T of commerce each year. Almost all are less than 20 MeV and most use accelerator types that are somewhat different from what is at CERN. These lectures will describe some of the most common applications, some of the newer applications in development and the accelerator technology used for them. It will also show examples of where technology developed for particle physics is now being studied for these applications. Rob Edgecock is a Professor of Accelerator Science, with a particular interest in the medical applications of accelerators. He works jointly for the STFC Rutherford Appleton Laboratory and the International Institute for Accelerator Applications at the Univer...

  9. Microcomputer interfacing and applications

    CERN Document Server

    Mustafa, M A

    1990-01-01

    This is the applications guide to interfacing microcomputers. It offers practical non-mathematical solutions to interfacing problems in many applications including data acquisition and control. Emphasis is given to the definition of the objectives of the interface, then comparing possible solutions and producing the best interface for every situation. Dr Mustafa A Mustafa is a senior designer of control equipment and has written many technical articles and papers on the subject of computers and their application to control engineering.

  10. Exploiting chaos for applications

    Energy Technology Data Exchange (ETDEWEB)

    Ditto, William L., E-mail: wditto@hawaii.edu [Department of Physics and Astronomy, University of Hawaii at Mānoa, Honolulu, Hawaii 96822 (United States); Sinha, Sudeshna, E-mail: sudeshna@iisermohali.ac.in [Indian Institute of Science Education and Research (IISER), Mohali, Knowledge City, Sector 81, SAS Nagar, PO Manauli 140306, Punjab (India)

    2015-09-15

    We discuss how understanding the nature of chaotic dynamics allows us to control these systems. A controlled chaotic system can then serve as a versatile pattern generator that can be used for a range of applications. Specifically, we will discuss the application of controlled chaos to the design of novel computational paradigms. Thus, we present an illustrative research arc: starting with ideas of control based on the general understanding of chaos, and moving over to applications that influence the course of building better devices.

  11. Applications of combinatorial optimization

    CERN Document Server

    Paschos, Vangelis Th

    2013-01-01

    Combinatorial optimization is a multidisciplinary scientific area, lying at the interface of three major scientific domains: mathematics, theoretical computer science, and management. The three volumes of the Combinatorial Optimization series aim to cover a wide range of topics in this area. These topics deal with fundamental notions and approaches as well as with several classical applications of combinatorial optimization. "Applications of Combinatorial Optimization" presents a number of the most common and well-known applications of combinatorial optimization.

  12. Refrigeration systems and applications

    CERN Document Server

    Dincer, Ibrahim

    2010-01-01

    Refrigeration Systems and Applications, 2nd edition offers a comprehensive treatise that addresses real-life technical and operational problems, enabling the reader to gain an understanding of the fundamental principles and the practical applications of refrigeration technology. New and unique analysis techniques (including exergy as a potential tool), models, correlations, procedures and applications are covered, and recent developments in the field are included - many of which are taken from the author's own research activities in this area. The book also includes so

  13. REST based mobile applications

    Science.gov (United States)

    Rambow, Mark; Preuss, Thomas; Berdux, Jörg; Conrad, Marc

    2008-02-01

    Simplicity is the major advantage of REST-based web services. Whereas SOAP is widespread in complex, security-sensitive business-to-business applications, REST is widely used for mashups and end-user-centric applications. In that context we give an overview of REST and compare it to SOAP. Furthermore, we use the GeoDrawing application as an example of a REST-based mobile application and emphasize the pros and cons of using REST in mobile application scenarios.
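
    The simplicity argument can be made concrete: a REST resource is fetched with a single stateless HTTP GET returning a lightweight payload, with no SOAP envelope to assemble or parse. A minimal Python sketch follows; the URL and resource shape are hypothetical stand-ins for a GeoDrawing-style service, not its real API.

        # Fetch one resource via a stateless HTTP GET (illustrative sketch).
        import requests

        resp = requests.get(
            "https://api.example.com/drawings/42",  # hypothetical resource URL
            params={"format": "json"},
            timeout=5,
        )
        resp.raise_for_status()
        drawing = resp.json()   # plain JSON payload, no XML envelope to parse
        print(drawing)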

  14. Support vector machines applications

    CERN Document Server

    Guo, Guodong

    2014-01-01

    Support vector machines (SVM) have both a solid mathematical background and good performance in practical applications. This book focuses on the recent advances and applications of the SVM in different areas, such as image processing, medical practice, computer vision, pattern recognition, machine learning, applied statistics, business intelligence, and artificial intelligence. The aim of this book is to create a comprehensive source on support vector machine applications, especially some recent advances.

  15. Microprocessors principles and applications

    CERN Document Server

    Debenham, Michael J

    1979-01-01

    Microprocessors: Principles and Applications deals with the principles and applications of microprocessors and covers topics ranging from computer architecture and programmed machines to microprocessor programming, support systems and software, and system design. A number of microprocessor applications are considered, including data processing, process control, and telephone switching. This book is comprised of 10 chapters and begins with a historical overview of computers and computing, followed by a discussion on computer architecture and programmed machines, paying particular attention to t

  16. GNSS applications and methods

    CERN Document Server

    Gleason, Scott

    2009-01-01

    Placing emphasis on applications development, this unique resource offers a highly practical overview of GNSS (global navigation satellite systems), including GPS. The applications presented in the book range from the traditional location applications to combining GNSS with other sensors and systems and into more exotic areas, such as remote sensing and space weather monitoring. Written by leading experts in the field, this book presents the fundamental underpinnings of GNSS and provides you with detailed examples of various GNSS applications. Moreover, the software included with the book cont

  17. Application Coherency Manager Project

    Data.gov (United States)

    National Aeronautics and Space Administration — This proposal describes an Application Coherency Manager that implements and manages the interdependencies of simulation, data, and platform information. It will...

  18. Hardening Azure applications

    CERN Document Server

    Gaurav, Suraj

    2015-01-01

    Learn what it takes to build large scale, mission critical applications -hardened applications- on the Azure cloud platform. This 208 page book covers the techniques and engineering principles that every architect and developer needs to know to harden their Azure/.NET applications to ensure maximum reliability and high availability when deployed at scale. While the techniques are implemented in .NET and optimized for Azure, the principles here will also be valuable for users of other cloud-based development platforms. Applications come in a variety of forms, from simple apps that can be bui

  19. Mongoose for application development

    CERN Document Server

    Holmes, Simon

    2013-01-01

    This book is a mini tutorial full of code examples and strategies to give you plenty of options when building your own applications with MongoDB.This book is ideal for people who want to develop applications on the Node.js stack quickly and efficiently. Prior knowledge of the stack is not essential as the book briefly covers the installation of the core components and builds all aspects of the example application. The focus of the book is on what Mongoose adds to you applications, so experienced Node.js developers will also benefit.

  20. Nanomaterials for Defense Applications

    Science.gov (United States)

    Turaga, Uday; Singh, Vinitkumar; Lalagiri, Muralidhar; Kiekens, Paul; Ramkumar, Seshadri S.

    Nanotechnology has found a number of applications in electronics and healthcare. Within the textile field, applications of nanotechnology have been limited to filters, protective liners for chemical and biological clothing and nanocoatings. This chapter presents an overview of the applications of nanomaterials such as nanofibers and nanoparticles that are of use to military and industrial sectors. An effort has been made to categorize nanofibers based on the method of production. This chapter particularly focuses on a few latest developments that have taken place with regard to the application of nanomaterials such as metal oxides in the defense arena.

  1. Biomedical Application of Laser

    Institute of Scientific and Technical Information of China (English)

    K. X. He; Alan Chow; Jiada Mo; Wang Zhuo

    2004-01-01

    INTRODUCTION: Lasers have revolutionized research and development in medicine and dentistry and have led to the development and production of many new products. Laser applications in diagnosis, treatment, and surgery are enormous and have led to speedier and more efficient results, as well as better and quicker healing processes. The applications could be classified in terms of areas of use or in terms of instruments/products. In this paper, discussions will not be grouped in a particular fashion, but will address specific applications. A lot of information on these applications can be found on the Internet; such information will be mentioned in the related discussions and given in the appendix.

  2. Application Technology Research Unit

    Data.gov (United States)

    Federal Laboratory Consortium — To conduct fundamental and developmental research on new and improved application technologies to protect floricultural, nursery, landscape, turf, horticultural, and...

  3. Expert Oracle application express

    CERN Document Server

    Scott, John Edward

    2011-01-01

    Expert Oracle Application Express brings you groundbreaking insights into developing with Oracle's enterprise-level, rapid-development tool from some of the best practitioners in the field today. Oracle Application Express (APEX) is an entirely web-based development framework that is built into every edition of Oracle Database. The framework rests upon Oracle's powerful PL/SQL language, enabling power users and developers to rapidly develop applications that easily scale to hundreds, even thousands of concurrent users. The 13 authors of Expert Oracle Application Express build their careers aro

  4. Electrical applications 2

    CERN Document Server

    Tyler, David W

    1998-01-01

    Electrical Applications 2 covers the BTEC NII level objectives in Electrical Applications U86/330. To understand the applications, a knowledge of the underlying principles is needed and these are covered briefly in the text. Key topics discussed are: the transmission and distribution of electrical energy; safety and regulations; tariffs and power factor correction; materials and their applications in the electrical industry; transformers; DC machines; illumination; and fuse protection. Included in each chapter are worked examples which should be carefully worked through before progressing to t

  5. A platform independent communication library for distributed computing

    NARCIS (Netherlands)

    Groen, D.; Rieder, S.; Grosso, P.; de Laat, C.; Portegies Zwart, S.

    2010-01-01

    We present MPWide, a platform independent communication library for performing message passing between supercomputers. Our library couples several local MPI applications through a long distance network using, for example, optical links. The implementation is deliberately kept light-weight and platform independent.
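
    As a generic illustration of the underlying idea only (this is not MPWide's API), the following Python sketch passes one message between two endpoints over a plain TCP socket, the kind of long-distance link such a library manages beneath its MPI-coupling interface. The host, port, and payload are placeholders.

        # One message over a TCP socket between two endpoints (illustrative).
        import socket
        import threading

        HOST, PORT = "127.0.0.1", 50507      # stand-ins for two remote sites
        ready = threading.Event()

        def receiving_site():
            with socket.socket() as srv:
                srv.bind((HOST, PORT))
                srv.listen(1)
                ready.set()                  # safe for the sender to connect now
                conn, _ = srv.accept()
                with conn:
                    print("received:", conn.recv(1024).decode())

        t = threading.Thread(target=receiving_site)
        t.start()
        ready.wait()

        with socket.socket() as cli:         # the sending application side
            cli.connect((HOST, PORT))
            cli.sendall(b"coupled-simulation field data")
        t.join()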

  6. Engineering Adaptive Applications

    DEFF Research Database (Denmark)

    Dolog, Peter

    In this book, we propose a new domain engineering framework which extends the development process of Web applications with the techniques required when designing adaptive, customizable Web applications. The framework provides design abstractions which deal separately with the information served...

  7. Nanomaterials in biomedical applications

    DEFF Research Database (Denmark)

    Christiansen, Jesper de Claville; Potarniche, Catalina-Gabriela; Vuluga, Z.

    2011-01-01

    Advances in nanomaterials have led to applications in many areas, from automotive to electronics and medicine. Nanocomposites are a popular group of nanomaterials. Nanocomposites in medical applications provide novel solutions to common problems: materials for implants, biosensors and drug delivery...

  8. Application Security Automation

    Science.gov (United States)

    Malaika, Majid A.

    2011-01-01

    With today's high demand for online applications and services running on the Internet, software has become a vital component in our lives. With every revolutionary technology comes challenges unique to its characteristics; for online applications, security is one huge concern and challenge. Currently, there are several schemes that address…

  10. Application Statistics 1987.

    Science.gov (United States)

    Council of Ontario Universities, Toronto.

    Summary statistics on application and registration patterns of applicants wishing to pursue full-time study in first-year places in Ontario universities (for the fall of 1987) are given. Data on registrations were received indirectly from the universities as part of their annual submission of USIS/UAR enrollment data to Statistics Canada and MCU.…

  11. Glasses for photonic applications

    NARCIS (Netherlands)

    Richardson, K.; Krol, D.M.; Hirao, K.

    2010-01-01

    Recent advances in the application of glassy materials in planar and fiber-based photonic structures have led to novel devices and components that go beyond the original thinking of the use of glass in the 1960s, when glass fibers were developed for low-loss optical communication applications.

  12. Modelling Foundations and Applications

    DEFF Research Database (Denmark)

    This book constitutes the refereed proceedings of the 8th European Conference on Modelling Foundations and Applications, held in Kgs. Lyngby, Denmark, in July 2012. The 20 revised full foundations track papers and 10 revised full applications track papers presented were carefully reviewed...

  13. Progressive Web applications

    CERN Document Server

    CERN. Geneva

    2017-01-01

    Progressive Web Applications are native-like applications running inside a browser context. In my presentation I would like to describe their characteristics, benchmarks, and building process, using a quick and simple case-study example with a focus on the Service Workers API.

  14. Database Application Schema Forensics

    Directory of Open Access Journals (Sweden)

    Hector Quintus Beyers

    2014-12-01

    Full Text Available The application schema layer of a Database Management System (DBMS) can be modified to deliver results that may warrant a forensic investigation. Table structures can be corrupted by changing the metadata of a database, or operators of the database can be altered to deliver incorrect results when used in queries. This paper discusses categories of possibilities that exist to alter the application schema, with some practical examples. Two forensic environments in which an investigation can take place are introduced, and arguments are provided for why these environments are important. Methods are presented for how these environments can be achieved for the application schema layer of a DBMS, and a process is proposed for how forensic evidence should be extracted from this layer. The application schema forensic evidence identification process can be applied to a wide range of forensic settings.

  15. Mobile Learning Applications Audit

    Directory of Open Access Journals (Sweden)

    Paul POCATILU

    2010-01-01

    Full Text Available While mobile learning (m-learning) applications have proven their value in educational activities, there is a need to measure their reliability, accessibility, and, furthermore, their trustworthiness. Mobile devices are far more vulnerable than classic computers and present inconvenient interfaces due to their size, hardware limitations, and mobile connectivity. Mobile learning applications should be audited to determine whether they should be trusted or not, while technologies such as automatic speech recognition (ASR) can improve their accessibility. This article starts with a brief introduction to m-learning applications, then presents the audit process for m-learning applications, enumerates their specific security threats, defines the ASR process, and elaborates on how ASR can enhance the accessibility of these types of applications.

  16. Geometry and its applications

    CERN Document Server

    Meyer, Walter J

    2006-01-01

    Meyer's Geometry and Its Applications, Second Edition, combines traditional geometry with current ideas to present a modern approach that is grounded in real-world applications. It balances the deductive approach with discovery learning, and introduces axiomatic, Euclidean geometry, non-Euclidean geometry, and transformational geometry. The text integrates applications and examples throughout and includes historical notes in many chapters. The Second Edition of Geometry and Its Applications is a significant text for any college or university that focuses on geometry's usefulness in other disciplines. It is especially appropriate for engineering and science majors, as well as future mathematics teachers. Realistic applications are integrated throughout the text, including (but not limited to): symmetries of artistic patterns, physics, robotics, computer vision, computer graphics, stability of architectural structures, molecular biology, medicine, and pattern recognition. Historical notes are included in many chapters...

  17. Lasers Fundamentals and Applications

    CERN Document Server

    Thyagarajan, K

    2010-01-01

    Lasers: Fundamentals and Applications, serves as a vital textbook to accompany undergraduate and graduate courses on lasers and their applications. Ever since their invention in 1960, lasers have assumed tremendous importance in the fields of science, engineering and technology because of their diverse uses in basic research and countless technological applications. This book provides a coherent presentation of the basic physics behind the way lasers work, and presents some of their most important applications in vivid detail. After reading this book, students will understand how to apply the concepts found within to practical, tangible situations. This textbook includes worked-out examples and exercises to enhance understanding, and the preface shows lecturers how to most beneficially match the textbook with their course curricula. The book includes several recent Nobel Lectures, which will further expose students to the emerging applications and excitement of working with lasers. Students who study lasers, ...

  18. LACCASE: PROPERTIES AND APPLICATIONS

    Directory of Open Access Journals (Sweden)

    Vernekar Madhavi

    2009-11-01

    Full Text Available Laccases (benzenediol:oxygen oxidoreductase, EC 1.10.3.2) are multi-copper oxidases that are widely distributed among plants, insects, and fungi. They have been described in different genera of ascomycetes, some deuteromycetes, and mainly in basidiomycetes. These enzymes catalyze the one-electron oxidation of a wide variety of organic and inorganic substrates, including mono-, di-, and polyphenols, aminophenols, methoxyphenols, aromatic amines, and ascorbate, with the concomitant four-electron reduction of oxygen to water. Laccase is currently the focus of much attention because of its diverse applications, such as delignification of lignocellulosics, crosslinking of polysaccharides, bioremediation applications such as waste detoxification and textile dye transformation, food technology uses, personal and medical care applications, and biosensor and analytical applications. This review helps to understand the properties of this important enzyme for its efficient utilization in biotechnological and environmental applications.

  19. Applications of ionizing radiations

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2014-07-01

    Developments in standard applications and brand-new nuclear technologies have a high impact on the future of agriculture, medicine, industry, and environmental preservation. The Radiation Technology Center (CTR) mission is to apply radiation and radioisotope technologies in industry, health, agriculture, and environmental protection, expanding scientific knowledge, improving human resources, transferring technology, generating products, and offering services for Brazilian society. The CTR main R and D activities are in consonance with the IPEN Director Plan (2011-2013) and the Applications of Ionizing Radiation Program, with four subprograms: Irradiation of Food and Agricultural Products; Radiation and Radioisotope Applications in Industry and Environment; Radioactive Sources and Radiation Applications in Human Health; and Radioactive Facilities and Equipment for the Applications of Nuclear Techniques.

  20. GPU computing and applications

    CERN Document Server

    See, Simon

    2015-01-01

    This book presents a collection of state-of-the-art research on GPU computing and applications. The major part of this book is selected from the work presented at the 2013 Symposium on GPU Computing and Applications held at Nanyang Technological University, Singapore (Oct 9, 2013). Three major domains of GPU application are covered: (1) engineering design and simulation; (2) biomedical sciences; and (3) interactive and digital media. The book also addresses fundamental issues in GPU computing, with a focus on big data processing. Researchers and developers in GPU computing and applications will benefit from this book, and training professionals and educators can use it to learn about possible applications of GPU technology in various areas.

  1. Stirling engine application study

    Science.gov (United States)

    Teagan, W. P.; Cunningham, D.

    1983-01-01

    A range of potential applications for Stirling engines in the power range from 0.5 to 5000 hp is surveyed. Over one hundred such engine applications are grouped into a small number of classes (10), with the applications in each class having a high degree of commonality in technical performance and cost requirements. A review of conventional engines (usually spark ignition or Diesel) was then undertaken to determine the degree to which commercial engine practice now serves the needs of the application classes and to determine the nature of the competition faced by a new engine system. In each application class the Stirling engine was compared to the conventional engines, assuming that the objectives of ongoing Stirling engine development programs are met. This ranking process indicated that Stirling engines show potential for use in all application classes except very light duty applications (lawn mowers, etc.). However, this potential is contingent on demonstrating much greater operating life and reliability than developmental Stirling engine systems have demonstrated to date. This implies that future initiatives in developing Stirling engine systems should give more emphasis to life and reliability issues than ongoing programs have.

  2. User Types in Online Applications

    Directory of Open Access Journals (Sweden)

    Ion IVAN

    2011-08-01

    Full Text Available Online applications are presented in the context of the information society. The characteristics of online applications are analyzed, and quality characteristics are presented in relation to their users. Types of users for the AVIO application are presented, use cases for the AVIO application are identified, and the limitations of the AVIO application are defined. Types of users in online applications are identified, and the three-dimensional matrix of access to online application resources is built, as sketched below. The user-type-oriented database is structured, and access management for the fields of the database tables is analyzed. Finally, a classification of online application users is given.
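
    A minimal sketch of the three-dimensional access matrix mentioned above, assuming its three axes are user type, resource, and operation (the axis names and the sample values below are illustrative assumptions, not taken from the paper):

        # Sketch of a three-dimensional access matrix with axes
        # (user_type, resource, operation). All names here are
        # illustrative assumptions, not taken from the original paper.
        USER_TYPES = ("anonymous", "registered", "administrator")
        RESOURCES = ("catalog", "orders", "user_accounts")
        OPERATIONS = ("read", "create", "update", "delete")

        # Store the matrix sparsely as the set of permitted cells.
        access = {
            ("anonymous", "catalog", "read"),
            ("registered", "catalog", "read"),
            ("registered", "orders", "read"),
            ("registered", "orders", "create"),
            ("administrator", "user_accounts", "read"),
            ("administrator", "user_accounts", "update"),
        }

        def is_allowed(user_type, resource, operation):
            """True if the (user type, resource, operation) cell is set."""
            return (user_type, resource, operation) in access

        print(is_allowed("registered", "orders", "create"))  # True
        print(is_allowed("anonymous", "orders", "read"))     # False

    A dense boolean cube indexed by the three axes would serve equally well; the sparse set of triples simply keeps the sketch short.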

  3. Building Social Web Applications

    CERN Document Server

    Bell, Gavin

    2009-01-01

    Building a web application that attracts and retains regular visitors is tricky enough, but creating a social application that encourages visitors to interact with one another requires careful planning. This book provides practical solutions to the tough questions you'll face when building an effective community site -- one that makes visitors feel like they've found a new home on the Web. If your company is ready to take part in the social web, this book will help you get started. Whether you're creating a new site from scratch or reworking an existing site, Building Social Web Applications ...

  4. Express web application development

    CERN Document Server

    Yaapa, Hage

    2013-01-01

    Express Web Application Development is a practical introduction to learning about Express. Each chapter introduces you to a different area of Express, using screenshots and examples to get you up and running as quickly as possible. If you are looking to use Express to build your next web application, "Express Web Application Development" will help you get started and take you right through to Express' advanced features. You will need to have an intermediate knowledge of JavaScript to get the most out of this book.

  5. Developing Large Web Applications

    CERN Document Server

    Loudon, Kyle

    2010-01-01

    How do you create a mission-critical site that provides exceptional performance while remaining flexible, adaptable, and reliable 24/7? Written by the manager of a UI group at Yahoo!, Developing Large Web Applications offers practical steps for building rock-solid applications that remain effective even as you add features, functions, and users. You'll learn how to develop large web applications with the extreme precision required for other types of software. Avoid common coding and maintenance headaches as small websites add more pages, more code, and more programmers; get comprehensive solutions ...

  6. Biomedical applications of polymers

    CERN Document Server

    Gebelein, C G

    1991-01-01

    The biomedical applications of polymers span an extremely wide spectrum of uses, including artificial organs, skin and soft tissue replacements, orthopaedic applications, dental applications, and controlled release of medications. No single, short review can possibly cover all these items in detail, and dozens of books and hundreds of reviews exist on biomedical polymers. Only a few relatively recent examples will be cited here; additional reviews are listed under most of the major topics in this book. We will consider each of the major classifications of biomedical polymers to some extent, including ...

  7. Optical materials and applications

    CERN Document Server

    Wakaki, Moriaki; Kudo, Keiei

    2012-01-01

    The definition of optical material has expanded in recent years, largely because of IT advances that have led to rapid growth in optoelectronics applications. Helping to explain this evolution, Optical Materials and Applications presents contributions from leading experts who explore the basic concepts of optical materials and the many typical applications in which they are used. An invaluable reference for readers ranging from professionals to technical managers to graduate engineering students, this book covers everything from traditional principles to more cutting-edge topics. It also details ...

  8. Polythiophenes in biological applications.

    Science.gov (United States)

    Sista, Prakash; Ghosh, Koushik; Martinez, Jennifer S; Rocha, Reginaldo C

    2014-01-01

    Polythiophene and its derivatives have shown tremendous potential for interfacing electrically conducting polymers with biological applications. These semiconducting organic polymers are relatively soft, conduct electrons and ions, have low cytotoxicity, and can undergo facile chemical modifications. In addition, the reduction in electrical impedance of electrodes coated with polythiophenes may prove to be invaluable for a stable and permanent connection between devices and biological tissues. This review article focuses on the synthesis and some key applications of polythiophenes in multidisciplinary areas at the interface with biology. These polymers show particular promise in biological applications such as diagnostics, therapy, drug delivery, imaging, implant devices and artificial organs.

  9. Wind energy applications guide

    Energy Technology Data Exchange (ETDEWEB)

    anon.

    2001-01-01

    The brochure is an introduction to various wind power applications for locations with underdeveloped transmission systems, from remote water pumping to village electrification. It includes an introductory section on wind energy, covering wind power basics and system components, and then provides examples of applications, including water pumping, stand-alone systems for home and business, systems for community centers, schools, and health clinics, and examples in the industrial area. There is also a page of contacts, plus two specific examples: a wind-diesel system for a remote station in Antarctica and wind-diesel village electrification in Russia.

  10. Professional Tizen application development

    CERN Document Server

    Jaygarl, HoJun; Kim, YoonSoo; Choi, Eunyoung; Bradwick, Kevin; Lansdell

    2014-01-01

    Create powerful, marketable applications with Tizen for the smartphone and beyond. Tizen is the only platform designed for multiple device categories that is HTML5-centric and entirely open source. Written by experts in the field, this comprehensive guide includes chapters on both web and native application development, covering subjects such as location and social features, advanced UIs, animations, sensors and multimedia. This book is a comprehensive resource for learning how to develop Tizen web and native applications that are polished, bug-free and ready to sell on a range of smart devices ...

  11. Handbook of satellite applications

    CERN Document Server

    Madry, Scott; Camacho-Lara, Sergio

    2017-01-01

    The first edition of this groundbreaking reference work was the most comprehensive reference source available about the key aspects of the satellite applications field. This updated second edition covers the technology, the markets, applications and regulations related to satellite telecommunications, broadcasting and networking, including civilian and military systems; precise satellite navigation and timing networks (i.e., GPS and others); and remote sensing and meteorological satellite systems. Created under the auspices of the International Space University based in France, this brand new edition is now expanded to cover innovative small satellite constellations, new commercial launching systems, innovation in military application satellites and their acquisition, updated appendices, a useful glossary and more.

  12. Professional Cocoa Application Security

    CERN Document Server

    Lee, Graham J

    2010-01-01

    The first comprehensive security resource for Mac and iPhone developers. The Mac platform is legendary for security, but consequently Apple developers have had little appropriate security information available to help them ensure that their applications are equally secure. This Wrox guide provides the first comprehensive go-to resource for Apple developers on the available frameworks and features that support secure application development. While Macs are noted for security, developers still need to design applications for the Mac and the iPhone with security in mind; this guide offers the first ...

  13. Mobile Augmented Reality Applications

    CERN Document Server

    Prochazka, David; Popelka, Ondrej; Stastny, Jiri

    2011-01-01

    Augmented reality has undergone considerable improvement in recent years. Many special techniques and hardware devices have been developed, but the crucial breakthrough came with the spread of intelligent mobile phones, which enabled the mass adoption of augmented reality applications. However, mobile devices have limited hardware capabilities, which narrows down the methods usable for scene analysis. In this article we propose an augmented reality application that uses cloud computing to enable more complex computational methods such as neural networks. Our goal is to create an affordable augmented reality application that will help car designers by 'virtualizing' car modifications.
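
    A minimal client-side sketch of this cloud-offload pattern, in Python: one camera frame is posted to a remote service that runs the heavyweight scene analysis and returns the detected objects. The endpoint URL and the JSON payload/response format are hypothetical assumptions, not taken from the article.

        # Client-side sketch of offloading scene analysis to the cloud.
        # The endpoint URL and the response format are hypothetical.
        import base64
        import json
        import urllib.request

        ANALYSIS_URL = "http://example.com/api/analyze"  # hypothetical endpoint

        def analyze_frame(jpeg_bytes):
            """Send one JPEG camera frame to the cloud service and return
            the objects it detects, e.g. for drawing AR overlays."""
            payload = json.dumps(
                {"frame": base64.b64encode(jpeg_bytes).decode("ascii")}
            ).encode("utf-8")
            request = urllib.request.Request(
                ANALYSIS_URL,
                data=payload,
                headers={"Content-Type": "application/json"},
            )
            with urllib.request.urlopen(request, timeout=5.0) as reply:
                return json.load(reply)["objects"]

    Keeping the device side to frame capture and overlay rendering while the server runs the neural network is what makes such methods feasible on limited mobile hardware.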

  14. Android Applications Security

    Directory of Open Access Journals (Sweden)

    Paul POCATILU

    2011-01-01

    Full Text Available The use of smartphones worldwide is growing very fast, and malicious attacks have increased as well. Mobile security application development keeps pace with this trend. The paper presents the vulnerabilities of mobile applications. Android applications and devices are analyzed from the security perspective, and the usage of restricted APIs is also presented. The paper also focuses on how users can prevent these malicious attacks and proposes some prevention measures, including the architecture of a mobile security system for Android devices.

  15. Biomaterials and therapeutic applications

    Science.gov (United States)

    Ferraro, Angelo

    2016-03-01

    A number of organic and inorganic, synthetic or naturally derived materials have been classified as not harmful to the human body and are appropriate for medical applications. These materials are usually named biomaterials, since they are suitable for introducing prostheses into living human tissue, as well as for drug delivery, diagnosis, therapy, tissue regeneration and many other clinical applications. Recently, nanomaterials and bioabsorbable polymers have greatly enlarged the fields of application of biomaterials, attracting much more attention from the biomedical community. In this review paper I discuss the most recent advances in the use of magnetic nanoparticles and biodegradable materials as new biomedical tools.

  16. Graphene: synthesis and applications

    Directory of Open Access Journals (Sweden)

    Phaedon Avouris

    2012-03-01

    Full Text Available Graphene, since the demonstration of its easy isolation by the exfoliation of graphite in 2004 by Novoselov, Geim and co-workers, has been attracting enormous attention in the scientific community. Because of its unique properties, high hopes have been placed on it for technological applications in many areas. Here we will briefly review aspects of two of these application areas: analog electronics and photonics/optoelectronics. We will discuss the relevant material properties, device physics, and some of the available results. Of course, we cannot rely on graphite exfoliation as the source of graphene for technological applications, so we will start by introducing large scale graphene growth techniques.

  17. Lift application development cookbook

    CERN Document Server

    Garcia, Gilberto T

    2013-01-01

    Lift Application Development Cookbook contains practical recipes on everything you will need to create secure web applications using this amazing framework. The book first teaches you basic topics such as starting a new application and gradually moves on to teach you advanced topics to achieve a certain task. Then, it explains every step in detail so that you can build your knowledge about how things work. This book is for developers who have at least some basic knowledge about Scala and who are looking for a functional, secure, and modern web framework. Prior experience with HTML and JavaScript ...

  18. Professional mobile application development

    CERN Document Server

    McWherter, Jeff

    2012-01-01

    Create applications for all major smartphone platforms. Creating applications for the myriad versions and varieties of mobile phone platforms on the market can be daunting to even the most seasoned developer. This authoritative guide is written in such a way that it takes your existing skills and experience and uses that background as a solid foundation for developing applications that cross over between platforms, freeing you from having to learn a new platform from scratch each time. Concise explanations walk you through the tools and patterns for developing for all the mobile platforms ...

  19. Introducing ZEUS-MP A 3D, Parallel, Multiphysics Code for Astrophysical Fluid Dynamics

    CERN Document Server

    Norman, M L

    2000-01-01

    We describe ZEUS-MP: a Multi-Physics, Massively-Parallel, Message-Passing code for astrophysical fluid dynamics simulations in three dimensions. ZEUS-MP is a follow-on to the sequential ZEUS-2D and ZEUS-3D codes developed and disseminated by the Laboratory for Computational Astrophysics (lca.ncsa.uiuc.edu) at NCSA. V1.0, released 1/1/2000, includes the following physics modules: ideal hydrodynamics, ideal MHD, and self-gravity. Future releases will include flux-limited radiation diffusion, thermal heat conduction, two-temperature plasma, and heating and cooling functions. The covariant equations are cast on a moving Eulerian grid, with Cartesian, cylindrical, and spherical polar coordinates currently supported. Parallelization is done by domain decomposition and is implemented in F77 and MPI, as sketched below. The code is portable across a wide range of platforms, from networks of workstations to massively parallel processors. Some parallel performance results are presented, as well as an application to turbulent star formation.
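
    The domain-decomposition strategy mentioned above can be illustrated in a few lines. The sketch below uses Python with mpi4py rather than the F77/MPI of ZEUS-MP itself: each rank owns a 1-D slab of the grid and exchanges one-cell ghost zones with its neighbors before each stencil update.

        # 1-D domain decomposition with ghost-zone exchange, sketched in
        # Python/mpi4py (ZEUS-MP itself is written in F77 + MPI).
        # Run with, e.g.: mpiexec -n 4 python halo.py
        import numpy as np
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        N_LOCAL = 8                # interior cells owned by this rank
        u = np.zeros(N_LOCAL + 2)  # plus one ghost cell at each end
        u[1:-1] = rank             # toy data: fill interior with rank id

        left = rank - 1 if rank > 0 else MPI.PROC_NULL
        right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

        # Swap ghost zones: send my edge cells, receive neighbors' edges.
        comm.Sendrecv(sendbuf=u[1:2], dest=left,
                      recvbuf=u[-1:], source=right)
        comm.Sendrecv(sendbuf=u[-2:-1], dest=right,
                      recvbuf=u[0:1], source=left)

        # With ghosts filled, each rank updates its own interior cells,
        # here a toy diffusion step on the decomposed grid.
        u[1:-1] = 0.5 * u[1:-1] + 0.25 * (u[:-2] + u[2:])

    In a production code such as ZEUS-MP the same pattern is applied in all three dimensions, with the hydrodynamic and MHD stencils in place of the toy update.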

  20. Neutron sources and applications

    Energy Technology Data Exchange (ETDEWEB)

    Price, D.L. [ed.] [Argonne National Lab., IL (United States); Rush, J.J. [ed.] [National Inst. of Standards and Technology, Gaithersburg, MD (United States)

    1994-01-01

    The Review of Neutron Sources and Applications was held at Oak Brook, Illinois, during September 8-10, 1992. The review involved some 70 national and international experts in different areas of neutron research, sources, and applications. Separate working groups were asked to (1) review the current status of advanced research reactors and spallation sources and (2) provide an update on scientific, technological, and medical applications, including neutron scattering research in a number of disciplines, isotope production, materials irradiation, and other important uses of neutron sources such as materials analysis and fundamental neutron physics. This report summarizes the findings and conclusions of the different working groups involved in the review and contains some of the best current expertise on neutron sources and applications.