WorldWideScience

Sample records for high-performance distributed computations

  1. Distributed metadata in a high performance computing environment

    Science.gov (United States)

    Bent, John M.; Faibish, Sorin; Zhang, Zhenhua; Liu, Xuezhao; Tang, Haiying

    2017-07-11

    A computer-executable method, system, and computer program product for managing meta-data in a distributed storage system, wherein the distributed storage system includes one or more burst buffers enabled to operate with a distributed key-value store, the co computer-executable method, system, and computer program product comprising receiving a request for meta-data associated with a block of data stored in a first burst buffer of the one or more burst buffers in the distributed storage system, wherein the meta data is associated with a key-value, determining which of the one or more burst buffers stores the requested metadata, and upon determination that a first burst buffer of the one or more burst buffers stores the requested metadata, locating the key-value in a portion of the distributed key-value store accessible from the first burst buffer.

  2. High Performance Distributed Computing in a Supercomputer Environment: Computational Services and Applications Issues

    Science.gov (United States)

    Kramer, Williams T. C.; Simon, Horst D.

    1994-01-01

    This tutorial proposes to be a practical guide for the uninitiated to the main topics and themes of high-performance computing (HPC), with particular emphasis to distributed computing. The intent is first to provide some guidance and directions in the rapidly increasing field of scientific computing using both massively parallel and traditional supercomputers. Because of their considerable potential computational power, loosely or tightly coupled clusters of workstations are increasingly considered as a third alternative to both the more conventional supercomputers based on a small number of powerful vector processors, as well as high massively parallel processors. Even though many research issues concerning the effective use of workstation clusters and their integration into a large scale production facility are still unresolved, such clusters are already used for production computing. In this tutorial we will utilize the unique experience made at the NAS facility at NASA Ames Research Center. Over the last five years at NAS massively parallel supercomputers such as the Connection Machines CM-2 and CM-5 from Thinking Machines Corporation and the iPSC/860 (Touchstone Gamma Machine) and Paragon Machines from Intel were used in a production supercomputer center alongside with traditional vector supercomputers such as the Cray Y-MP and C90.

  3. A Distributed Multi Agents Based Platform for High Performance Computing Infrastructures

    OpenAIRE

    Kiourt, Chairi; Kalles, Dimitris

    2016-01-01

    This work introduces a novel, modular, layered web based platform for managing machine learning experiments on grid-based High Performance Computing infrastructures. The coupling of the communication services offered by the grid, with an administration layer and conventional web server programming, via a data synchronization utility, leads to the straightforward development of a web-based user interface that allows the monitoring and managing of diverse online distributed computing applicatio...

  4. A configurable distributed high-performance computing framework for satellite's TDI-CCD imaging simulation

    Science.gov (United States)

    Xue, Bo; Mao, Bingjing; Chen, Xiaomei; Ni, Guoqiang

    2010-11-01

    This paper renders a configurable distributed high performance computing(HPC) framework for TDI-CCD imaging simulation. It uses strategy pattern to adapt multi-algorithms. Thus, this framework help to decrease the simulation time with low expense. Imaging simulation for TDI-CCD mounted on satellite contains four processes: 1) atmosphere leads degradation, 2) optical system leads degradation, 3) electronic system of TDI-CCD leads degradation and re-sampling process, 4) data integration. Process 1) to 3) utilize diversity data-intensity algorithms such as FFT, convolution and LaGrange Interpol etc., which requires powerful CPU. Even uses Intel Xeon X5550 processor, regular series process method takes more than 30 hours for a simulation whose result image size is 1500 * 1462. With literature study, there isn't any mature distributing HPC framework in this field. Here we developed a distribute computing framework for TDI-CCD imaging simulation, which is based on WCF[1], uses Client/Server (C/S) layer and invokes the free CPU resources in LAN. The server pushes the process 1) to 3) tasks to those free computing capacity. Ultimately we rendered the HPC in low cost. In the computing experiment with 4 symmetric nodes and 1 server , this framework reduced about 74% simulation time. Adding more asymmetric nodes to the computing network, the time decreased namely. In conclusion, this framework could provide unlimited computation capacity in condition that the network and task management server are affordable. And this is the brand new HPC solution for TDI-CCD imaging simulation and similar applications.

  5. VLab: A Science Gateway for Distributed First Principles Calculations in Heterogeneous High Performance Computing Systems

    Science.gov (United States)

    da Silveira, Pedro Rodrigo Castro

    2014-01-01

    This thesis describes the development and deployment of a cyberinfrastructure for distributed high-throughput computations of materials properties at high pressures and/or temperatures--the Virtual Laboratory for Earth and Planetary Materials--VLab. VLab was developed to leverage the aggregated computational power of grid systems to solve…

  6. Performance tuning for high performance computing systems

    OpenAIRE

    Pahuja, Himanshu

    2017-01-01

    A Distributed System is composed by integration between loosely coupled software components and the underlying hardware resources that can be distributed over the standard internet framework. High Performance Computing used to involve utilization of supercomputers which could churn a lot of computing power to process massively complex computational tasks, but is now evolving across distributed systems, thereby having the ability to utilize geographically distributed computing resources. We...

  7. High-Performance Computation of Distributed-Memory Parallel 3D Voronoi and Delaunay Tessellation

    Energy Technology Data Exchange (ETDEWEB)

    Peterka, Tom; Morozov, Dmitriy; Phillips, Carolyn

    2014-11-14

    Computing a Voronoi or Delaunay tessellation from a set of points is a core part of the analysis of many simulated and measured datasets: N-body simulations, molecular dynamics codes, and LIDAR point clouds are just a few examples. Such computational geometry methods are common in data analysis and visualization; but as the scale of simulations and observations surpasses billions of particles, the existing serial and shared-memory algorithms no longer suffice. A distributed-memory scalable parallel algorithm is the only feasible approach. The primary contribution of this paper is a new parallel Delaunay and Voronoi tessellation algorithm that automatically determines which neighbor points need to be exchanged among the subdomains of a spatial decomposition. Other contributions include periodic and wall boundary conditions, comparison of our method using two popular serial libraries, and application to numerous science datasets.

  8. The High Performance Computing Initiative

    Science.gov (United States)

    Holcomb, Lee B.; Smith, Paul H.; Macdonald, Michael J.

    1991-01-01

    The paper discusses NASA High Performance Computing Initiative (HPCI), an essential component of the Federal High Performance Computing Program. The HPCI program is designed to provide a thousandfold increase in computing performance, and apply the technologies to NASA 'Grand Challenges'. The Grand Challenges chosen include integrated multidisciplinary simulations and design optimizations of aerospace vehicles throughout the mission profiles; the multidisciplinary modeling and data analysis of the earth and space science physical phenomena; and the spaceborne control of automated systems, handling, and analysis of sensor data and real-time response to sensor stimuli.

  9. Optimization procedure for algorithms of task scheduling in high performance heterogeneous distributed computing systems

    Directory of Open Access Journals (Sweden)

    Nirmeen A. Bahnasawy

    2011-11-01

    Full Text Available In distributed computing, the schedule by which tasks are assigned to processors is critical to minimizing the execution time of the application. However, the problem of discovering the schedule that gives the minimum execution time is NP-complete. In this paper, a new task scheduling algorithm called Sorted Nodes in Leveled DAG Division (SNLDD is introduced and developed for HeDCSs with consider a bounded number of processors. The main principle of the developed algorithm is to divide the Directed Acyclic Graph (DAG into levels and sort the tasks in each level according to their computation size in descending order. To evaluate the performance of the developed SNLDD algorithm, a comparative study has been done between the developed SNLDD algorithm and the Longest Dynamic Critical Path (LDCP algorithm which is considered the most efficient existing algorithm. According to the comparative results, it is found that the performance of the developed algorithm provides better performance than the LDCP algorithm in terms of speedup, efficiency, complexity, and quality. Also, a new procedure called Superior Performance Optimization Procedure (SPOP has been introduced and implemented in the developed SNLDD algorithm and the LDCP algorithm to minimize the sleek time of the processors in the system. Again, the performance of the SNLDD algorithm outperforms the existing LDCP algorithm after adding the SPOP procedure.

  10. Technologies and tools for high-performance distributed computing. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Karonis, Nicholas T.

    2000-05-01

    In this project we studied the practical use of the MPI message-passing interface in advanced distributed computing environments. We built on the existing software infrastructure provided by the Globus Toolkit{trademark}, the MPICH portable implementation of MPI, and the MPICH-G integration of MPICH with Globus. As a result of this project we have replaced MPICH-G with its successor MPICH-G2, which is also an integration of MPICH with Globus. MPICH-G2 delivers significant improvements in message passing performance when compared to its predecessor MPICH-G and was based on superior software design principles resulting in a software base that was much easier to make the functional extensions and improvements we did. Using Globus services we replaced the default implementation of MPI's collective operations in MPICH-G2 with more efficient multilevel topology-aware collective operations which, in turn, led to the development of a new timing methodology for broadcasts [8]. MPICH-G2 was extended to include client/server functionality from the MPI-2 standard [23] to facilitate remote visualization applications and, through the use of MPI idioms, MPICH-G2 provided application-level control of quality-of-service parameters as well as application-level discovery of underlying Grid-topology information. Finally, MPICH-G2 was successfully used in a number of applications including an award-winning record-setting computation in numerical relativity. In the sections that follow we describe in detail the accomplishments of this project, we present experimental results quantifying the performance improvements, and conclude with a discussion of our applications experiences. This project resulted in a significant increase in the utility of MPICH-G2.

  11. High Performance Computing at NASA

    Science.gov (United States)

    Bailey, David H.; Cooper, D. M. (Technical Monitor)

    1994-01-01

    The speaker will give an overview of high performance computing in the U.S. in general and within NASA in particular, including a description of the recently signed NASA-IBM cooperative agreement. The latest performance figures of various parallel systems on the NAS Parallel Benchmarks will be presented. The speaker was one of the authors of the NAS (National Aerospace Standards) Parallel Benchmarks, which are now widely cited in the industry as a measure of sustained performance on realistic high-end scientific applications. It will be shown that significant progress has been made by the highly parallel supercomputer industry during the past year or so, with several new systems, based on high-performance RISC processors, that now deliver superior performance per dollar compared to conventional supercomputers. Various pitfalls in reporting performance will be discussed. The speaker will then conclude by assessing the general state of the high performance computing field.

  12. Computational Biology and High Performance Computing 2000

    Energy Technology Data Exchange (ETDEWEB)

    Simon, Horst D.; Zorn, Manfred D.; Spengler, Sylvia J.; Shoichet, Brian K.; Stewart, Craig; Dubchak, Inna L.; Arkin, Adam P.

    2000-10-19

    The pace of extraordinary advances in molecular biology has accelerated in the past decade due in large part to discoveries coming from genome projects on human and model organisms. The advances in the genome project so far, happening well ahead of schedule and under budget, have exceeded any dreams by its protagonists, let alone formal expectations. Biologists expect the next phase of the genome project to be even more startling in terms of dramatic breakthroughs in our understanding of human biology, the biology of health and of disease. Only today can biologists begin to envision the necessary experimental, computational and theoretical steps necessary to exploit genome sequence information for its medical impact, its contribution to biotechnology and economic competitiveness, and its ultimate contribution to environmental quality. High performance computing has become one of the critical enabling technologies, which will help to translate this vision of future advances in biology into reality. Biologists are increasingly becoming aware of the potential of high performance computing. The goal of this tutorial is to introduce the exciting new developments in computational biology and genomics to the high performance computing community.

  13. Information Power Grid: Distributed High-Performance Computing and Large-Scale Data Management for Science and Engineering

    Science.gov (United States)

    Johnston, William E.; Gannon, Dennis; Nitzberg, Bill

    2000-01-01

    We use the term "Grid" to refer to distributed, high performance computing and data handling infrastructure that incorporates geographically and organizationally dispersed, heterogeneous resources that are persistent and supported. This infrastructure includes: (1) Tools for constructing collaborative, application oriented Problem Solving Environments / Frameworks (the primary user interfaces for Grids); (2) Programming environments, tools, and services providing various approaches for building applications that use aggregated computing and storage resources, and federated data sources; (3) Comprehensive and consistent set of location independent tools and services for accessing and managing dynamic collections of widely distributed resources: heterogeneous computing systems, storage systems, real-time data sources and instruments, human collaborators, and communications systems; (4) Operational infrastructure including management tools for distributed systems and distributed resources, user services, accounting and auditing, strong and location independent user authentication and authorization, and overall system security services The vision for NASA's Information Power Grid - a computing and data Grid - is that it will provide significant new capabilities to scientists and engineers by facilitating routine construction of information based problem solving environments / frameworks. Such Grids will knit together widely distributed computing, data, instrument, and human resources into just-in-time systems that can address complex and large-scale computing and data analysis problems. Examples of these problems include: (1) Coupled, multidisciplinary simulations too large for single systems (e.g., multi-component NPSS turbomachine simulation); (2) Use of widely distributed, federated data archives (e.g., simultaneous access to metrological, topological, aircraft performance, and flight path scheduling databases supporting a National Air Space Simulation systems}; (3

  14. High performance computing and communications program

    Science.gov (United States)

    Holcomb, Lee

    1992-01-01

    A review of the High Performance Computing and Communications (HPCC) program is provided in vugraph format. The goals and objectives of this federal program are as follows: extend U.S. leadership in high performance computing and computer communications; disseminate the technologies to speed innovation and to serve national goals; and spur gains in industrial competitiveness by making high performance computing integral to design and production.

  15. Inclusive vision for high performance computing at the CSIR

    CSIR Research Space (South Africa)

    Gazendam, A

    2006-02-01

    Full Text Available gaining popularity in South Africa. One reason for this relatively slow adoption is the lack of appropriate scientific computing infrastructure. Open and distributed high-performance computing (HPC) represents a radically new computing concept for data...

  16. High performance computing in Windows Azure cloud

    OpenAIRE

    Ambruš, Dejan

    2013-01-01

    High performance, security, availability, scalability, flexibility and lower costs of maintenance have essentially contributed to the growing popularity of cloud computing in all spheres of life, especially in business. In fact cloud computing offers even more than this. With usage of virtual computing clusters a runtime environment for high performance computing can be efficiently implemented also in a cloud. There are many advantages but also some disadvantages of cloud computing, some ...

  17. High-performance computing using FPGAs

    CERN Document Server

    Benkrid, Khaled

    2013-01-01

    This book is concerned with the emerging field of High Performance Reconfigurable Computing (HPRC), which aims to harness the high performance and relative low power of reconfigurable hardware–in the form Field Programmable Gate Arrays (FPGAs)–in High Performance Computing (HPC) applications. It presents the latest developments in this field from applications, architecture, and tools and methodologies points of view. We hope that this work will form a reference for existing researchers in the field, and entice new researchers and developers to join the HPRC community.  The book includes:  Thirteen application chapters which present the most important application areas tackled by high performance reconfigurable computers, namely: financial computing, bioinformatics and computational biology, data search and processing, stencil computation e.g. computational fluid dynamics and seismic modeling, cryptanalysis, astronomical N-body simulation, and circuit simulation.     Seven architecture chapters which...

  18. Molecular simulation workflows as parallel algorithms: the execution engine of Copernicus, a distributed high-performance computing platform.

    Science.gov (United States)

    Pronk, Sander; Pouya, Iman; Lundborg, Magnus; Rotskoff, Grant; Wesén, Björn; Kasson, Peter M; Lindahl, Erik

    2015-06-09

    Computational chemistry and other simulation fields are critically dependent on computing resources, but few problems scale efficiently to the hundreds of thousands of processors available in current supercomputers-particularly for molecular dynamics. This has turned into a bottleneck as new hardware generations primarily provide more processing units rather than making individual units much faster, which simulation applications are addressing by increasingly focusing on sampling with algorithms such as free-energy perturbation, Markov state modeling, metadynamics, or milestoning. All these rely on combining results from multiple simulations into a single observation. They are potentially powerful approaches that aim to predict experimental observables directly, but this comes at the expense of added complexity in selecting sampling strategies and keeping track of dozens to thousands of simulations and their dependencies. Here, we describe how the distributed execution framework Copernicus allows the expression of such algorithms in generic workflows: dataflow programs. Because dataflow algorithms explicitly state dependencies of each constituent part, algorithms only need to be described on conceptual level, after which the execution is maximally parallel. The fully automated execution facilitates the optimization of these algorithms with adaptive sampling, where undersampled regions are automatically detected and targeted without user intervention. We show how several such algorithms can be formulated for computational chemistry problems, and how they are executed efficiently with many loosely coupled simulations using either distributed or parallel resources with Copernicus.

  19. High performance computing at Sandia National Labs

    Energy Technology Data Exchange (ETDEWEB)

    Cahoon, R.M.; Noe, J.P.; Vandevender, W.H.

    1995-10-01

    Sandia`s High Performance Computing Environment requires a hierarchy of resources ranging from desktop, to department, to centralized, and finally to very high-end corporate resources capable of teraflop performance linked via high-capacity Asynchronous Transfer Mode (ATM) networks. The mission of the Scientific Computing Systems Department is to provide the support infrastructure for an integrated corporate scientific computing environment that will meet Sandia`s needs in high-performance and midrange computing, network storage, operational support tools, and systems management. This paper describes current efforts at SNL/NM to expand and modernize centralized computing resources in support of this mission.

  20. Debugging a high performance computing program

    Science.gov (United States)

    Gooding, Thomas M.

    2013-08-20

    Methods, apparatus, and computer program products are disclosed for debugging a high performance computing program by gathering lists of addresses of calling instructions for a plurality of threads of execution of the program, assigning the threads to groups in dependence upon the addresses, and displaying the groups to identify defective threads.

  1. High-performance computing reveals missing genes

    OpenAIRE

    Whyte, Barry James

    2010-01-01

    Scientists at the Virginia Bioinformatics Institute and the Department of Computer Science at Virginia Tech have used high-performance computing to locate small genes that have been missed by scientists in their quest to define the microbial DNA sequences of life.

  2. High performance computing on vector systems

    CERN Document Server

    Roller, Sabine

    2008-01-01

    Presents the developments in high-performance computing and simulation on modern supercomputer architectures. This book covers trends in hardware and software development in general and specifically the vector-based systems and heterogeneous architectures. It presents innovative fields like coupled multi-physics or multi-scale simulations.

  3. High Performance Computing and Communications Panel Report.

    Science.gov (United States)

    President's Council of Advisors on Science and Technology, Washington, DC.

    This report offers advice on the strengths and weaknesses of the High Performance Computing and Communications (HPCC) initiative, one of five presidential initiatives launched in 1992 and coordinated by the Federal Coordinating Council for Science, Engineering, and Technology. The HPCC program has the following objectives: (1) to extend U.S.…

  4. Scalable resource management in high performance computers.

    Energy Technology Data Exchange (ETDEWEB)

    Frachtenberg, E. (Eitan); Petrini, F. (Fabrizio); Fernandez Peinador, J. (Juan); Coll, S. (Salvador)

    2002-01-01

    Clusters of workstations have emerged as an important platform for building cost-effective, scalable and highly-available computers. Although many hardware solutions are available today, the largest challenge in making large-scale clusters usable lies in the system software. In this paper we present STORM, a resource management tool designed to provide scalability, low overhead and the flexibility necessary to efficiently support and analyze a wide range of job scheduling algorithms. STORM achieves these feats by closely integrating the management daemons with the low-level features that are common in state-of-the-art high-performance system area networks. The architecture of STORM is based on three main technical innovations. First, a sizable part of the scheduler runs in the thread processor located on the network interface. Second, we use hardware collectives that are highly scalable both for implementing control heartbeats and to distribute the binary of a parallel job in near-constant time, irrespective of job and machine sizes. Third, we use an I/O bypass protocol that allows fast data movements from the file system to the communication buffers in the network interface and vice versa. The experimental results show that STORM can launch a job with a binary of 12MB on a 64 processor/32 node cluster in less than 0.25 sec on an empty network, in less than 0.45 sec when all the processors are busy computing other jobs, and in less than 0.65 sec when the network is flooded with a background traffic. This paper provides experimental and analytical evidence that these results scale to a much larger number of nodes. To the best of our knowledge, STORM is at least two orders of magnitude faster than existing production schedulers in launching jobs, performing resource management tasks and gang scheduling.

  5. A Globally Distributed Grid Monitoring System to Facilitate High-Performance Computing at D0/SAM-Grid

    Energy Technology Data Exchange (ETDEWEB)

    Rana, Abhishek S. [Texas U., Arlington

    2002-01-01

    A grid environment involves large scale sharing of resources that are distributed from a geographical or an administrative perspective. There is a need for systems that enable continuous discovery and monitoring of the components of a grid. In this work, we discuss the development and deployment of a monitoring system that has been designed as a prototype for the D0/SAM-Grid. We have developed a system that uses a layered architecture for information generation and processing, utilizes the various grid middleware tools, and implements Integration and Enquiry Protocols using existing Discovery Protocols to provide a user with a coherent view of all current activity in this grid - in the form of a web portal interface. The prototype system has been deployed for monitoring of 11 sites geographically distributed in 5 countries across 3 continents. This work focuses on the D0/SAM-Grid, and is based on the SAM system developed at Fermilab.

  6. High-performance computing for airborne applications

    Energy Technology Data Exchange (ETDEWEB)

    Quinn, Heather M [Los Alamos National Laboratory; Manuzzato, Andrea [Los Alamos National Laboratory; Fairbanks, Tom [Los Alamos National Laboratory; Dallmann, Nicholas [Los Alamos National Laboratory; Desgeorges, Rose [Los Alamos National Laboratory

    2010-06-28

    Recently, there has been attempts to move common satellite tasks to unmanned aerial vehicles (UAVs). UAVs are significantly cheaper to buy than satellites and easier to deploy on an as-needed basis. The more benign radiation environment also allows for an aggressive adoption of state-of-the-art commercial computational devices, which increases the amount of data that can be collected. There are a number of commercial computing devices currently available that are well-suited to high-performance computing. These devices range from specialized computational devices, such as field-programmable gate arrays (FPGAs) and digital signal processors (DSPs), to traditional computing platforms, such as microprocessors. Even though the radiation environment is relatively benign, these devices could be susceptible to single-event effects. In this paper, we will present radiation data for high-performance computing devices in a accelerated neutron environment. These devices include a multi-core digital signal processor, two field-programmable gate arrays, and a microprocessor. From these results, we found that all of these devices are suitable for many airplane environments without reliability problems.

  7. High Performance Computing Operations Review Report

    Energy Technology Data Exchange (ETDEWEB)

    Cupps, Kimberly C. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2013-12-19

    The High Performance Computing Operations Review (HPCOR) meeting—requested by the ASC and ASCR program headquarters at DOE—was held November 5 and 6, 2013, at the Marriott Hotel in San Francisco, CA. The purpose of the review was to discuss the processes and practices for HPC integration and its related software and facilities. Experiences and lessons learned from the most recent systems deployed were covered in order to benefit the deployment of new systems.

  8. PREFACE: High Performance Computing Symposium 2011

    Science.gov (United States)

    Talon, Suzanne; Mousseau, Normand; Peslherbe, Gilles; Bertrand, François; Gauthier, Pierre; Kadem, Lyes; Moitessier, Nicolas; Rouleau, Guy; Wittig, Rod

    2012-02-01

    HPCS (High Performance Computing Symposium) is a multidisciplinary conference that focuses on research involving High Performance Computing and its application. Attended by Canadian and international experts and renowned researchers in the sciences, all areas of engineering, the applied sciences, medicine and life sciences, mathematics, the humanities and social sciences, it is Canada's pre-eminent forum for HPC. The 25th edition was held in Montréal, at the Université du Québec à Montréal, from 15-17 June and focused on HPC in Medical Science. The conference was preceded by tutorials held at Concordia University, where 56 participants learned about HPC best practices, GPU computing, parallel computing, debugging and a number of high-level languages. 274 participants from six countries attended the main conference, which involved 11 invited and 37 contributed oral presentations, 33 posters, and an exhibit hall with 16 booths from our sponsors. The work that follows is a collection of papers presented at the conference covering HPC topics ranging from computer science to bioinformatics. They are divided here into four sections: HPC in Engineering, Physics and Materials Science, HPC in Medical Science, HPC Enabling to Explore our World and New Algorithms for HPC. We would once more like to thank the participants and invited speakers, the members of the Scientific Committee, the referees who spent time reviewing the papers and our invaluable sponsors. To hear the invited talks and learn about 25 years of HPC development in Canada visit the Symposium website: http://2011.hpcs.ca/lang/en/conference/keynote-speakers/ Enjoy the excellent papers that follow, and we look forward to seeing you in Vancouver for HPCS 2012! Gilles Peslherbe Chair of the Scientific Committee Normand Mousseau Co-Chair of HPCS 2011 Suzanne Talon Chair of the Organizing Committee UQAM Sponsors The PDF also contains photographs from the conference banquet.

  9. High Performance Networks From Supercomputing to Cloud Computing

    CERN Document Server

    Abts, Dennis

    2011-01-01

    Datacenter networks provide the communication substrate for large parallel computer systems that form the ecosystem for high performance computing (HPC) systems and modern Internet applications. The design of new datacenter networks is motivated by an array of applications ranging from communication intensive climatology, complex material simulations and molecular dynamics to such Internet applications as Web search, language translation, collaborative Internet applications, streaming video and voice-over-IP. For both Supercomputing and Cloud Computing the network enables distributed applicati

  10. Resource estimation in high performance medical image computing.

    Science.gov (United States)

    Banalagay, Rueben; Covington, Kelsie Jade; Wilkes, D M; Landman, Bennett A

    2014-10-01

    Medical imaging analysis processes often involve the concatenation of many steps (e.g., multi-stage scripts) to integrate and realize advancements from image acquisition, image processing, and computational analysis. With the dramatic increase in data size for medical imaging studies (e.g., improved resolution, higher throughput acquisition, shared databases), interesting study designs are becoming intractable or impractical on individual workstations and servers. Modern pipeline environments provide control structures to distribute computational load in high performance computing (HPC) environments. However, high performance computing environments are often shared resources, and scheduling computation across these resources necessitates higher level modeling of resource utilization. Submission of 'jobs' requires an estimate of the CPU runtime and memory usage. The resource requirements for medical image processing algorithms are difficult to predict since the requirements can vary greatly between different machines, different execution instances, and different data inputs. Poor resource estimates can lead to wasted resources in high performance environments due to incomplete executions and extended queue wait times. Hence, resource estimation is becoming a major hurdle for medical image processing algorithms to efficiently leverage high performance computing environments. Herein, we present our implementation of a resource estimation system to overcome these difficulties and ultimately provide users with the ability to more efficiently utilize high performance computing resources.

  11. Resource Estimation in High Performance Medical Image Computing

    Science.gov (United States)

    Banalagay, Rueben; Covington, Kelsie Jade; Wilkes, D.M.

    2015-01-01

    Medical imaging analysis processes often involve the concatenation of many steps (e.g., multi-stage scripts) to integrate and realize advancements from image acquisition, image processing, and computational analysis. With the dramatic increase in data size for medical imaging studies (e.g., improved resolution, higher throughput acquisition, shared databases), interesting study designs are becoming intractable or impractical on individual workstations and servers. Modern pipeline environments provide control structures to distribute computational load in high performance computing (HPC) environments. However, high performance computing environments are often shared resources, and scheduling computation across these resources necessitates higher level modeling of resource utilization. Submission of ‘jobs’ requires an estimate of the CPU runtime and memory usage. The resource requirements for medical image processing algorithms are difficult to predict since the requirements can vary greatly between different machines, different execution instances, and different data inputs. Poor resource estimates can lead to wasted resources in high performance environments due to incomplete executions and extended queue wait times. Hence, resource estimation is becoming a major hurdle for medical image processing algorithms to efficiently leverage high performance computing environments. Herein, we present our implementation of a resource estimation system to overcome these difficulties and ultimately provide users with the ability to more efficiently utilize high performance computing resources. PMID:24906466

  12. A Component Architecture for High-Performance Scientific Computing

    Energy Technology Data Exchange (ETDEWEB)

    Bernholdt, David E; Allan, Benjamin A; Armstrong, Robert C; Bertrand, Felipe; Chiu, Kenneth; Dahlgren, Tamara L; Damevski, Kostadin; Elwasif, Wael R; Epperly, Thomas G; Govindaraju, Madhusudhan; Katz, Daniel S; Kohl, James A; Krishnan, Manoj Kumar; Kumfert, Gary K; Larson, J Walter; Lefantzi, Sophia; Lewis, Michael J; Malony, Allen D; McInnes, Lois C; Nieplocha, Jarek; Norris, Boyana; Parker, Steven G; Ray, Jaideep; Shende, Sameer; Windus, Theresa L; Zhou, Shujia

    2006-07-03

    The Common Component Architecture (CCA) provides a means for software developers to manage the complexity of large-scale scientific simulations and to move toward a plug-and-play environment for high-performance computing. In the scientific computing context, component models also promote collaboration using independently developed software, thereby allowing particular individuals or groups to focus on the aspects of greatest interest to them. The CCA supports parallel and distributed computing as well as local high-performance connections between components in a language-independent manner. The design places minimal requirements on components and thus facilitates the integration of existing code into the CCA environment. The CCA model imposes minimal overhead to minimize the impact on application performance. The focus on high performance distinguishes the CCA from most other component models. The CCA is being applied within an increasing range of disciplines, including combustion research, global climate simulation, and computational chemistry.

  13. A Component Architecture for High-Performance Scientific Computing

    Energy Technology Data Exchange (ETDEWEB)

    Bernholdt, D E; Allan, B A; Armstrong, R; Bertrand, F; Chiu, K; Dahlgren, T L; Damevski, K; Elwasif, W R; Epperly, T W; Govindaraju, M; Katz, D S; Kohl, J A; Krishnan, M; Kumfert, G; Larson, J W; Lefantzi, S; Lewis, M J; Malony, A D; McInnes, L C; Nieplocha, J; Norris, B; Parker, S G; Ray, J; Shende, S; Windus, T L; Zhou, S

    2004-12-14

    The Common Component Architecture (CCA) provides a means for software developers to manage the complexity of large-scale scientific simulations and to move toward a plug-and-play environment for high-performance computing. In the scientific computing context, component models also promote collaboration using independently developed software, thereby allowing particular individuals or groups to focus on the aspects of greatest interest to them. The CCA supports parallel and distributed computing as well as local high-performance connections between components in a language-independent manner. The design places minimal requirements on components and thus facilitates the integration of existing code into the CCA environment. The CCA model imposes minimal overhead to minimize the impact on application performance. The focus on high performance distinguishes the CCA from most other component models. The CCA is being applied within an increasing range of disciplines, including combustion research, global climate simulation, and computational chemistry.

  14. High Performance Photogrammetric Processing on Computer Clusters

    Science.gov (United States)

    Adrov, V. N.; Drakin, M. A.; Sechin, A. Y.

    2012-07-01

    Most cpu consuming tasks in photogrammetric processing can be done in parallel. The algorithms take independent bits as input and produce independent bits as output. The independence of bits comes from the nature of such algorithms since images, stereopairs or small image blocks parts can be processed independently. Many photogrammetric algorithms are fully automatic and do not require human interference. Photogrammetric workstations can perform tie points measurements, DTM calculations, orthophoto construction, mosaicing and many other service operations in parallel using distributed calculations. Distributed calculations save time reducing several days calculations to several hours calculations. Modern trends in computer technology show the increase of cpu cores in workstations, speed increase in local networks, and as a result dropping the price of the supercomputers or computer clusters that can contain hundreds or even thousands of computing nodes. Common distributed processing in DPW is usually targeted for interactive work with a limited number of cpu cores and is not optimized for centralized administration. The bottleneck of common distributed computing in photogrammetry can be in the limited lan throughput and storage performance, since the processing of huge amounts of large raster images is needed.

  15. HIGH PERFORMANCE PHOTOGRAMMETRIC PROCESSING ON COMPUTER CLUSTERS

    Directory of Open Access Journals (Sweden)

    V. N. Adrov

    2012-07-01

    Full Text Available Most cpu consuming tasks in photogrammetric processing can be done in parallel. The algorithms take independent bits as input and produce independent bits as output. The independence of bits comes from the nature of such algorithms since images, stereopairs or small image blocks parts can be processed independently. Many photogrammetric algorithms are fully automatic and do not require human interference. Photogrammetric workstations can perform tie points measurements, DTM calculations, orthophoto construction, mosaicing and many other service operations in parallel using distributed calculations. Distributed calculations save time reducing several days calculations to several hours calculations. Modern trends in computer technology show the increase of cpu cores in workstations, speed increase in local networks, and as a result dropping the price of the supercomputers or computer clusters that can contain hundreds or even thousands of computing nodes. Common distributed processing in DPW is usually targeted for interactive work with a limited number of cpu cores and is not optimized for centralized administration. The bottleneck of common distributed computing in photogrammetry can be in the limited lan throughput and storage performance, since the processing of huge amounts of large raster images is needed.

  16. The path toward HEP High Performance Computing

    CERN Document Server

    Apostolakis, John; Carminati, Federico; Gheata, Andrei; Wenzel, Sandro

    2014-01-01

    High Energy Physics code has been known for making poor use of high performance computing architectures. Efforts in optimising HEP code on vector and RISC architectures have yield limited results and recent studies have shown that, on modern architectures, it achieves a performance between 10% and 50% of the peak one. Although several successful attempts have been made to port selected codes on GPUs, no major HEP code suite has a 'High Performance' implementation. With LHC undergoing a major upgrade and a number of challenging experiments on the drawing board, HEP cannot any longer neglect the less-than-optimal performance of its code and it has to try making the best usage of the hardware. This activity is one of the foci of the SFT group at CERN, which hosts, among others, the Root and Geant4 project. The activity of the experiments is shared and coordinated via a Concurrency Forum, where the experience in optimising HEP code is presented and discussed. Another activity is the Geant-V project, centred on th...

  17. High performance computing applications in neurobiological research

    Science.gov (United States)

    Ross, Muriel D.; Cheng, Rei; Doshay, David G.; Linton, Samuel W.; Montgomery, Kevin; Parnas, Bruce R.

    1994-01-01

    The human nervous system is a massively parallel processor of information. The vast numbers of neurons, synapses and circuits is daunting to those seeking to understand the neural basis of consciousness and intellect. Pervading obstacles are lack of knowledge of the detailed, three-dimensional (3-D) organization of even a simple neural system and the paucity of large scale, biologically relevant computer simulations. We use high performance graphics workstations and supercomputers to study the 3-D organization of gravity sensors as a prototype architecture foreshadowing more complex systems. Scaled-down simulations run on a Silicon Graphics workstation and scale-up, three-dimensional versions run on the Cray Y-MP and CM5 supercomputers.

  18. Replica-Based High-Performance Tuple Space Computing

    DEFF Research Database (Denmark)

    Andric, Marina; De Nicola, Rocco; Lluch Lafuente, Alberto

    2015-01-01

    We present the tuple-based coordination language RepliKlaim, which enriches Klaim with primitives for replica-aware coordination. Our overall goal is to offer suitable solutions to the challenging problems of data distribution and locality in large-scale high performance computing. In particular,...

  19. High Performance Computing in Science and Engineering '17 : Transactions of the High Performance Computing Center

    CERN Document Server

    Kröner, Dietmar; Resch, Michael; HLRS 2017

    2018-01-01

    This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS) in 2017. The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance.The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.

  20. High Performance Computing in Science and Engineering '15 : Transactions of the High Performance Computing Center

    CERN Document Server

    Kröner, Dietmar; Resch, Michael

    2016-01-01

    This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS) in 2015. The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.

  1. Visualization and Data Analysis for High-Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Sewell, Christopher Meyer [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2016-09-27

    This is a set of slides from a guest lecture for a class at the University of Texas, El Paso on visualization and data analysis for high-performance computing. The topics covered are the following: trends in high-performance computing; scientific visualization, such as OpenGL, ray tracing and volume rendering, VTK, and ParaView; data science at scale, such as in-situ visualization, image databases, distributed memory parallelism, shared memory parallelism, VTK-m, "big data", and then an analysis example.

  2. Flow simulation and high performance computing

    Science.gov (United States)

    Tezduyar, T.; Aliabadi, S.; Behr, M.; Johnson, A.; Kalro, V.; Litke, M.

    1996-10-01

    Flow simulation is a computational tool for exploring science and technology involving flow applications. It can provide cost-effective alternatives or complements to laboratory experiments, field tests and prototyping. Flow simulation relies heavily on high performance computing (HPC). We view HPC as having two major components. One is advanced algorithms capable of accurately simulating complex, real-world problems. The other is advanced computer hardware and networking with sufficient power, memory and bandwidth to execute those simulations. While HPC enables flow simulation, flow simulation motivates development of novel HPC techniques. This paper focuses on demonstrating that flow simulation has come a long way and is being applied to many complex, real-world problems in different fields of engineering and applied sciences, particularly in aerospace engineering and applied fluid mechanics. Flow simulation has come a long way because HPC has come a long way. This paper also provides a brief review of some of the recently-developed HPC methods and tools that has played a major role in bringing flow simulation where it is today. A number of 3D flow simulations are presented in this paper as examples of the level of computational capability reached with recent HPC methods and hardware. These examples are, flow around a fighter aircraft, flow around two trains passing in a tunnel, large ram-air parachutes, flow over hydraulic structures, contaminant dispersion in a model subway station, airflow past an automobile, multiple spheres falling in a liquid-filled tube, and dynamics of a paratrooper jumping from a cargo aircraft.

  3. High Performance Spaceflight Computing (HPSC) Project

    Data.gov (United States)

    National Aeronautics and Space Administration — In 2012, the NASA Game Changing Development Program (GCDP), residing in the NASA Space Technology Mission Directorate (STMD), commissioned a High Performance...

  4. Idle waves in high-performance computing.

    Science.gov (United States)

    Markidis, Stefano; Vencels, Juris; Peng, Ivy Bo; Akhmetova, Dana; Laure, Erwin; Henri, Pierre

    2015-01-01

    The vast majority of parallel scientific applications distributes computation among processes that are in a busy state when computing and in an idle state when waiting for information from other processes. We identify the propagation of idle waves through processes in scientific applications with a local information exchange between the two processes. Idle waves are nondispersive and have a phase velocity inversely proportional to the average busy time. The physical mechanism enabling the propagation of idle waves is the local synchronization between two processes due to remote data dependency. This study provides a description of the large number of processes in parallel scientific applications as a continuous medium. This work also is a step towards an understanding of how localized idle periods can affect remote processes, leading to the degradation of global performance in parallel scientific applications.

  5. High Performance Computing in Science and Engineering '99 : Transactions of the High Performance Computing Center

    CERN Document Server

    Jäger, Willi

    2000-01-01

    The book contains reports about the most significant projects from science and engineering of the Federal High Performance Computing Center Stuttgart (HLRS). They were carefully selected in a peer-review process and are showcases of an innovative combination of state-of-the-art modeling, novel algorithms and the use of leading-edge parallel computer technology. The projects of HLRS are using supercomputer systems operated jointly by university and industry and therefore a special emphasis has been put on the industrial relevance of results and methods.

  6. High Performance Computing in Science and Engineering '98 : Transactions of the High Performance Computing Center

    CERN Document Server

    Jäger, Willi

    1999-01-01

    The book contains reports about the most significant projects from science and industry that are using the supercomputers of the Federal High Performance Computing Center Stuttgart (HLRS). These projects are from different scientific disciplines, with a focus on engineering, physics and chemistry. They were carefully selected in a peer-review process and are showcases for an innovative combination of state-of-the-art physical modeling, novel algorithms and the use of leading-edge parallel computer technology. As HLRS is in close cooperation with industrial companies, special emphasis has been put on the industrial relevance of results and methods.

  7. The design of linear algebra libraries for high performance computers

    Energy Technology Data Exchange (ETDEWEB)

    Dongarra, J.J. [Tennessee Univ., Knoxville, TN (United States). Dept. of Computer Science]|[Oak Ridge National Lab., TN (United States); Walker, D.W. [Oak Ridge National Lab., TN (United States)

    1993-08-01

    This paper discusses the design of linear algebra libraries for high performance computers. Particular emphasis is placed on the development of scalable algorithms for MIMD distributed memory concurrent computers. A brief description of the EISPACK, LINPACK, and LAPACK libraries is given, followed by an outline of ScaLAPACK, which is a distributed memory version of LAPACK currently under development. The importance of block-partitioned algorithms in reducing the frequency of data movement between different levels of hierarchical memory is stressed. The use of such algorithms helps reduce the message startup costs on distributed memory concurrent computers. Other key ideas in our approach are the use of distributed versions of the Level 3 Basic Linear Algebra Subprograms (BLAS) as computational building blocks, and the use of Basic Linear Algebra Communication Subprograms (BLACS) as communication building blocks. Together the distributed BLAS and the BLACS can be used to construct higher-level algorithms, and hide many details of the parallelism from the application developer. The block-cyclic data distribution is described, and adopted as a good way of distributing block-partitioned matrices. Block-partitioned versions of the Cholesky and LU factorizations are presented, and optimization issues associated with the implementation of the LU factorization algorithm on distributed memory concurrent computers are discussed, together with its performance on the Intel Delta system. Finally, approaches to the design of library interfaces are reviewed.

  8. Integrating reconfigurable hardware-based grid for high performance computing.

    Science.gov (United States)

    Dondo Gazzano, Julio; Sanchez Molina, Francisco; Rincon, Fernando; López, Juan Carlos

    2015-01-01

    FPGAs have shown several characteristics that make them very attractive for high performance computing (HPC). The impressive speed-up factors that they are able to achieve, the reduced power consumption, and the easiness and flexibility of the design process with fast iterations between consecutive versions are examples of benefits obtained with their use. However, there are still some difficulties when using reconfigurable platforms as accelerator that need to be addressed: the need of an in-depth application study to identify potential acceleration, the lack of tools for the deployment of computational problems in distributed hardware platforms, and the low portability of components, among others. This work proposes a complete grid infrastructure for distributed high performance computing based on dynamically reconfigurable FPGAs. Besides, a set of services designed to facilitate the application deployment is described. An example application and a comparison with other hardware and software implementations are shown. Experimental results show that the proposed architecture offers encouraging advantages for deployment of high performance distributed applications simplifying development process.

  9. Integrating Reconfigurable Hardware-Based Grid for High Performance Computing

    Science.gov (United States)

    Dondo Gazzano, Julio; Sanchez Molina, Francisco; Rincon, Fernando; López, Juan Carlos

    2015-01-01

    FPGAs have shown several characteristics that make them very attractive for high performance computing (HPC). The impressive speed-up factors that they are able to achieve, the reduced power consumption, and the easiness and flexibility of the design process with fast iterations between consecutive versions are examples of benefits obtained with their use. However, there are still some difficulties when using reconfigurable platforms as accelerator that need to be addressed: the need of an in-depth application study to identify potential acceleration, the lack of tools for the deployment of computational problems in distributed hardware platforms, and the low portability of components, among others. This work proposes a complete grid infrastructure for distributed high performance computing based on dynamically reconfigurable FPGAs. Besides, a set of services designed to facilitate the application deployment is described. An example application and a comparison with other hardware and software implementations are shown. Experimental results show that the proposed architecture offers encouraging advantages for deployment of high performance distributed applications simplifying development process. PMID:25874241

  10. High Performance Computing in Science and Engineering '02 : Transactions of the High Performance Computing Center

    CERN Document Server

    Jäger, Willi

    2003-01-01

    This book presents the state-of-the-art in modeling and simulation on supercomputers. Leading German research groups present their results achieved on high-end systems of the High Performance Computing Center Stuttgart (HLRS) for the year 2002. Reports cover all fields of supercomputing simulation ranging from computational fluid dynamics to computer science. Special emphasis is given to industrially relevant applications. Moreover, by presenting results for both vector sytems and micro-processor based systems the book allows to compare performance levels and usability of a variety of supercomputer architectures. It therefore becomes an indispensable guidebook to assess the impact of the Japanese Earth Simulator project on supercomputing in the years to come.

  11. High Performance Human-Computer Interfaces

    National Research Council Canada - National Science Library

    Despain, a

    1997-01-01

    Human interfaces to the computer have remained fairly crude since the use of teletypes despite the fact that computer, storage and communication performance have continued to improve by many orders of magnitude...

  12. aRNApipe: a balanced, efficient and distributed pipeline for processing RNA-seq data in high-performance computing environments.

    Science.gov (United States)

    Alonso, Arnald; Lasseigne, Brittany N; Williams, Kelly; Nielsen, Josh; Ramaker, Ryne C; Hardigan, Andrew A; Johnston, Bobbi; Roberts, Brian S; Cooper, Sara J; Marsal, Sara; Myers, Richard M

    2017-06-01

    The wide range of RNA-seq applications and their high-computational needs require the development of pipelines orchestrating the entire workflow and optimizing usage of available computational resources. We present aRNApipe, a project-oriented pipeline for processing of RNA-seq data in high-performance cluster environments. aRNApipe is highly modular and can be easily migrated to any high-performance computing (HPC) environment. The current applications included in aRNApipe combine the essential RNA-seq primary analyses, including quality control metrics, transcript alignment, count generation, transcript fusion identification, alternative splicing and sequence variant calling. aRNApipe is project-oriented and dynamic so users can easily update analyses to include or exclude samples or enable additional processing modules. Workflow parameters are easily set using a single configuration file that provides centralized tracking of all analytical processes. Finally, aRNApipe incorporates interactive web reports for sample tracking and a tool for managing the genome assemblies available to perform an analysis. https://github.com/HudsonAlpha/aRNAPipe ; DOI: 10.5281/zenodo.202950. rmyers@hudsonalpha.org. Supplementary data are available at Bioinformatics online.

  13. Portability Support for High Performance Computing

    Science.gov (United States)

    Cheng, Doreen Y.; Cooper, D. M. (Technical Monitor)

    1994-01-01

    While a large number of tools have been developed to support application portability, high performance application developers often prefer to use vendor-provided, non-portable programming interfaces. This phenomena indicates the mismatch between user priorities and tool capabilities. This paper summarizes the results of a user survey and a developer survey. The user survey has revealed the user priorities and resulted in three criteria for evaluating tool support for portability. The developer survey has resulted in the evaluation of portability support and indicated the possibilities and difficulties of improvements.

  14. Efficient High Performance Computing on Heterogeneous Platforms

    NARCIS (Netherlands)

    Shen, J.

    2015-01-01

    Heterogeneous platforms are mixes of different processing units in a compute node (e.g., CPUs+GPUs, CPU+MICs) or a chip package (e.g., APUs). This type of platforms keeps gaining popularity in various computer systems ranging from supercomputers to mobile devices. In this context, improving their

  15. DOE research in utilization of high-performance computers

    Energy Technology Data Exchange (ETDEWEB)

    Buzbee, B.L.; Worlton, W.J.; Michael, G.; Rodrigue, G.

    1980-12-01

    Department of Energy (DOE) and other Government research laboratories depend on high-performance computer systems to accomplish their programatic goals. As the most powerful computer systems become available, they are acquired by these laboratories so that advances can be made in their disciplines. These advances are often the result of added sophistication to numerical models whose execution is made possible by high-performance computer systems. However, high-performance computer systems have become increasingly complex; consequently, it has become increasingly difficult to realize their potential performance. The result is a need for research on issues related to the utilization of these systems. This report gives a brief description of high-performance computers, and then addresses the use of and future needs for high-performance computers within DOE, the growing complexity of applications within DOE, and areas of high-performance computer systems warranting research. 1 figure.

  16. A History of High-Performance Computing

    Science.gov (United States)

    2006-01-01

    Faster than most speedy computers. More powerful than its NASA data-processing predecessors. Able to leap large, mission-related computational problems in a single bound. Clearly, it s neither a bird nor a plane, nor does it need to don a red cape, because it s super in its own way. It's Columbia, NASA s newest supercomputer and one of the world s most powerful production/processing units. Named Columbia to honor the STS-107 Space Shuttle Columbia crewmembers, the new supercomputer is making it possible for NASA to achieve breakthroughs in science and engineering, fulfilling the Agency s missions, and, ultimately, the Vision for Space Exploration. Shortly after being built in 2004, Columbia achieved a benchmark rating of 51.9 teraflop/s on 10,240 processors, making it the world s fastest operational computer at the time of completion. Putting this speed into perspective, 20 years ago, the most powerful computer at NASA s Ames Research Center, home of the NASA Advanced Supercomputing Division (NAS), ran at a speed of about 1 gigaflop (one billion calculations per second). The Columbia supercomputer is 50,000 times faster than this computer and offers a tenfold increase in capacity over the prior system housed at Ames. What s more, Columbia is considered the world s largest Linux-based, shared-memory system. The system is offering immeasurable benefits to society and is the zenith of years of NASA/private industry collaboration that has spawned new generations of commercial, high-speed computing systems.

  17. High Performance Computing for Medical Image Interpretation

    Science.gov (United States)

    1993-10-01

    patterns from which the diagnoses can be made. A general problem arising from this modality is the detection of small I rom td physics poit of view...applied in, e.g., chest radiography and orthodontics (Scott and Symons (1982)). Computer Tomography (CT) applies to all techniques by which...density in the z-direction towards its equilibrium value. T2 is the transverse or spin-spin relaxation time which governs the evolution of the

  18. NASA High Performance Computing and Communications program

    Science.gov (United States)

    Holcomb, Lee; Smith, Paul; Hunter, Paul

    1994-01-01

    The National Aeronautics and Space Administration's HPCC program is part of a new Presidential initiative aimed at producing a 1000-fold increase in supercomputing speed and a 1(X)-fold improvement in available communications capability by 1997. As more advanced technologies are developed under the HPCC program, they will be used to solve NASA's 'Grand Challenge' problems, which include improving the design and simulation of advanced aerospace vehicles, allowing people at remote locations to communicate more effectively and share information, increasing scientists' abilities to model the Earth's climate and forecast global environmental trends, and improving the development of advanced spacecraft. NASA's HPCC program is organized into three projects which are unique to the agency's mission: the Computational Aerosciences (CAS) project, the Earth and Space Sciences (ESS) project, and the Remote Exploration and Experimentation (REE) project. An additional project, the Basic Research and Human Resources (BRHR) project, exists to promote long term research in computer science and engineering and to increase the pool of trained personnel in a variety of scientific disciplines. This document presents an overview of the objectives and organization of these projects, as well as summaries of early accomplishments and the significance, status, and plans for individual research and development programs within each project. Areas of emphasis include benchmarking, testbeds, software and simulation methods.

  19. High-performance computing MRI simulations.

    Science.gov (United States)

    Stöcker, Tony; Vahedipour, Kaveh; Pflugfelder, Daniel; Shah, N Jon

    2010-07-01

    A new open-source software project is presented, JEMRIS, the Jülich Extensible MRI Simulator, which provides an MRI sequence development and simulation environment for the MRI community. The development was driven by the desire to achieve generality of simulated three-dimensional MRI experiments reflecting modern MRI systems hardware. The accompanying computational burden is overcome by means of parallel computing. Many aspects are covered that have not hitherto been simultaneously investigated in general MRI simulations such as parallel transmit and receive, important off-resonance effects, nonlinear gradients, and arbitrary spatiotemporal parameter variations at different levels. The latter can be used to simulate various types of motion, for instance. The JEMRIS user interface is very simple to use, but nevertheless it presents few limitations. MRI sequences with arbitrary waveforms and complex interdependent modules are modeled in a graphical user interface-based environment requiring no further programming. This manuscript describes the concepts, methods, and performance of the software. Examples of novel simulation results in active fields of MRI research are given. (c) 2010 Wiley-Liss, Inc.

  20. High Performance Computing in Science and Engineering '08 : Transactions of the High Performance Computing Center

    CERN Document Server

    Kröner, Dietmar; Resch, Michael

    2009-01-01

    The discussions and plans on all scienti?c, advisory, and political levels to realize an even larger “European Supercomputer” in Germany, where the hardware costs alone will be hundreds of millions Euro – much more than in the past – are getting closer to realization. As part of the strategy, the three national supercomputing centres HLRS (Stuttgart), NIC/JSC (Julic ¨ h) and LRZ (Munich) have formed the Gauss Centre for Supercomputing (GCS) as a new virtual organization enabled by an agreement between the Federal Ministry of Education and Research (BMBF) and the state ministries for research of Baden-Wurttem ¨ berg, Bayern, and Nordrhein-Westfalen. Already today, the GCS provides the most powerful high-performance computing - frastructure in Europe. Through GCS, HLRS participates in the European project PRACE (Partnership for Advances Computing in Europe) and - tends its reach to all European member countries. These activities aligns well with the activities of HLRS in the European HPC infrastructur...

  1. RISC Processors and High Performance Computing

    Science.gov (United States)

    Bailey, David H.; Saini, Subhash; Craw, James M. (Technical Monitor)

    1995-01-01

    This tutorial will discuss the top five RISC microprocessors and the parallel systems in which they are used. It will provide a unique cross-machine comparison not available elsewhere. The effective performance of these processors will be compared by citing standard benchmarks in the context of real applications. The latest NAS Parallel Benchmarks, both absolute performance and performance per dollar, will be listed. The next generation of the NPB will be described. The tutorial will conclude with a discussion of future directions in the field. Technology Transfer Considerations: All of these computer systems are commercially available internationally. Information about these processors is available in the public domain, mostly from the vendors themselves. The NAS Parallel Benchmarks and their results have been previously approved numerous times for public release, beginning back in 1991.

  2. Benchmarking: More Aspects of High Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Ravindrudu, Rahul [Iowa State Univ., Ames, IA (United States)

    2004-01-01

    pattern for the left-looking factorization. The right-looking algorithm performs better for in-core data, but the left-looking will perform better for out-of-core data due to the reduced I/O operations. Hence the conclusion that out-of-core algorithms will perform better when designed from start. The out-of-core and thread based computation do not interact in this case, since I/O is not done by the threads. The performance of the thread based computation does not depend on I/O as the algorithms are in the BLAS algorithms which assumes all the data to be in memory. This is the reason the out-of-core results and OpenMP threads results were presented separately and no attempt to combine them was made. In general, the modified HPL performs better with larger block sizes, due to less I/O involved for out-of-core part and better cache utilization for the thread based computation.

  3. NINJA: Java for High Performance Numerical Computing

    Directory of Open Access Journals (Sweden)

    José E. Moreira

    2002-01-01

    Full Text Available When Java was first introduced, there was a perception that its many benefits came at a significant performance cost. In the particularly performance-sensitive field of numerical computing, initial measurements indicated a hundred-fold performance disadvantage between Java and more established languages such as Fortran and C. Although much progress has been made, and Java now can be competitive with C/C++ in many important situations, significant performance challenges remain. Existing Java virtual machines are not yet capable of performing the advanced loop transformations and automatic parallelization that are now common in state-of-the-art Fortran compilers. Java also has difficulties in implementing complex arithmetic efficiently. These performance deficiencies can be attacked with a combination of class libraries (packages, in Java that implement truly multidimensional arrays and complex numbers, and new compiler techniques that exploit the properties of these class libraries to enable other, more conventional, optimizations. Two compiler techniques, versioning and semantic expansion, can be leveraged to allow fully automatic optimization and parallelization of Java code. Our measurements with the NINJA prototype Java environment show that Java can be competitive in performance with highly optimized and tuned Fortran code.

  4. From the Editor: The High Performance Computing Act of 1991.

    Science.gov (United States)

    McClure, Charles R.

    1992-01-01

    Discusses issues related to the High Performance Computing and Communication program and National Research and Education Network (NREN) established by the High Performance Computing Act of 1991, including program management, specific program development, affecting policy decisions, access to the NREN, the Department of Education role, and…

  5. Defining a Comprehensive Threat Model for High Performance Computational Clusters

    OpenAIRE

    Mogilevsky, Dmitry; Lee, Adam; Yurcik, William

    2005-01-01

    Over the past decade, high performance computational (HPC) clusters have become mainstream in academic and industrial settings as accessible means of computation. Throughout their proliferation, HPC security has been a secondary concern to performance. It is evident, however, that ensuring HPC security presents different challenges than the ones faced when dealing with traditional networks. To design suitable security measures for high performance computing, it is necessary to first realize t...

  6. Contemporary high performance computing from petascale toward exascale

    CERN Document Server

    Vetter, Jeffrey S

    2013-01-01

    Contemporary High Performance Computing: From Petascale toward Exascale focuses on the ecosystems surrounding the world's leading centers for high performance computing (HPC). It covers many of the important factors involved in each ecosystem: computer architectures, software, applications, facilities, and sponsors. The first part of the book examines significant trends in HPC systems, including computer architectures, applications, performance, and software. It discusses the growth from terascale to petascale computing and the influence of the TOP500 and Green500 lists. The second part of the

  7. Mastering the Challenge of High-Performance Computing.

    Science.gov (United States)

    Roach, Ronald

    2003-01-01

    Discusses how, just as all of higher education got serious with wiring individual campuses for the Internet, the nation's leading research institutions have initiated "high-performance computing." Describes several such initiatives involving historically black colleges and universities. (EV)

  8. Export Control of High Performance Computing: Analysis and Alternative Strategies

    National Research Council Canada - National Science Library

    Holland, Charles

    2001-01-01

    High performance computing has historically played an important role in the ability of the United States to develop and deploy a wide range of national security capabilities, such as stealth aircraft...

  9. Architecture and Programming Models for High Performance Intensive Computation

    Science.gov (United States)

    2016-06-29

    AFRL-AFOSR-VA-TR-2016-0230 Architecture and Programming Models for High Performance Intensive Computation XiaoMing Li UNIVERSITY OF DELAWARE Final...TITLE AND SUBTITLE Architecture and Programming Models for High Performance Intensive Computation 5a. CONTRACT NUMBER 5b. GRANT NUMBER FA9550-13-1-0213...developing an efficient system architecture and software tools for building and running Dynamic Data Driven Application Systems (DDDAS). The foremost

  10. High-performance Computing Applied to Semantic Databases

    Energy Technology Data Exchange (ETDEWEB)

    Goodman, Eric L.; Jimenez, Edward; Mizell, David W.; al-Saffar, Sinan; Adolf, Robert D.; Haglin, David J.

    2011-06-02

    To-date, the application of high-performance computing resources to Semantic Web data has largely focused on commodity hardware and distributed memory platforms. In this paper we make the case that more specialized hardware can offer superior scaling and close to an order of magnitude improvement in performance. In particular we examine the Cray XMT. Its key characteristics, a large, global shared-memory, and processors with a memory-latency tolerant design, offer an environment conducive to programming for the Semantic Web and have engendered results that far surpass current state of the art. We examine three fundamental pieces requisite for a fully functioning semantic database: dictionary encoding, RDFS inference, and query processing. We show scaling up to 512 processors (the largest configuration we had available), and the ability to process 20 billion triples completely in-memory.

  11. High Performance Input/Output Systems for High Performance Computing and Four-Dimensional Data Assimilation

    Science.gov (United States)

    Fox, Geoffrey C.; Ou, Chao-Wei

    1997-01-01

    The approach of this task was to apply leading parallel computing research to a number of existing techniques for assimilation, and extract parameters indicating where and how input/output limits computational performance. The following was used for detailed knowledge of the application problems: 1. Developing a parallel input/output system specifically for this application 2. Extracting the important input/output characteristics of data assimilation problems; and 3. Building these characteristics s parameters into our runtime library (Fortran D/High Performance Fortran) for parallel input/output support.

  12. A Research and Development Strategy for High Performance Computing.

    Science.gov (United States)

    Office of Science and Technology Policy, Washington, DC.

    This report is the result of a systematic review of the status and directions of high performance computing and its relationship to federal research and development. Conducted by the Federal Coordinating Council for Science, Engineering, and Technology (FCCSET), the review involved a series of workshops attended by numerous computer scientists and…

  13. Achieving High Performance with FPGA-Based Computing

    Science.gov (United States)

    Herbordt, Martin C.; VanCourt, Tom; Gu, Yongfeng; Sukhwani, Bharat; Conti, Al; Model, Josh; DiSabello, Doug

    2011-01-01

    Numerous application areas, including bioinformatics and computational biology, demand increasing amounts of processing capability. In many cases, the computation cores and data types are suited to field-programmable gate arrays. The challenge is identifying the design techniques that can extract high performance potential from the FPGA fabric. PMID:21603088

  14. Promoting High-Performance Computing and Communications. A CBO Study.

    Science.gov (United States)

    Webre, Philip

    In 1991 the Federal Government initiated the multiagency High Performance Computing and Communications program (HPCC) to further the development of U.S. supercomputer technology and high-speed computer network technology. This overview by the Congressional Budget Office (CBO) concentrates on obstacles that might prevent the growth of the…

  15. Nuclear Forces and High-Performance Computing: The Perfect Match

    Energy Technology Data Exchange (ETDEWEB)

    Luu, T; Walker-Loud, A

    2009-06-12

    High-performance computing is now enabling the calculation of certain nuclear interaction parameters directly from Quantum Chromodynamics, the quantum field theory that governs the behavior of quarks and gluons and is ultimately responsible for the nuclear strong force. We briefly describe the state of the field and describe how progress in this field will impact the greater nuclear physics community. We give estimates of computational requirements needed to obtain certain milestones and describe the scientific and computational challenges of this field.

  16. High performance computing and communications: FY 1997 implementation plan

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1996-12-01

    The High Performance Computing and Communications (HPCC) Program was formally authorized by passage, with bipartisan support, of the High-Performance Computing Act of 1991, signed on December 9, 1991. The original Program, in which eight Federal agencies participated, has now grown to twelve agencies. This Plan provides a detailed description of the agencies` FY 1996 HPCC accomplishments and FY 1997 HPCC plans. Section 3 of this Plan provides an overview of the HPCC Program. Section 4 contains more detailed definitions of the Program Component Areas, with an emphasis on the overall directions and milestones planned for each PCA. Appendix A provides a detailed look at HPCC Program activities within each agency.

  17. High Performance Computing in Science and Engineering '14

    CERN Document Server

    Kröner, Dietmar; Resch, Michael

    2015-01-01

    This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS). The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and   engineers. The book comes with a wealth of color illustrations and tables of results.  

  18. Quantum Accelerators for High-performance Computing Systems

    Energy Technology Data Exchange (ETDEWEB)

    Humble, Travis S. [ORNL; Britt, Keith A. [ORNL; Mohiyaddin, Fahd A. [ORNL

    2017-11-01

    We define some of the programming and system-level challenges facing the application of quantum processing to high-performance computing. Alongside barriers to physical integration, prominent differences in the execution of quantum and conventional programs challenges the intersection of these computational models. Following a brief overview of the state of the art, we discuss recent advances in programming and execution models for hybrid quantum-classical computing. We discuss a novel quantum-accelerator framework that uses specialized kernels to offload select workloads while integrating with existing computing infrastructure. We elaborate on the role of the host operating system to manage these unique accelerator resources, the prospects for deploying quantum modules, and the requirements placed on the language hierarchy connecting these different system components. We draw on recent advances in the modeling and simulation of quantum computing systems with the development of architectures for hybrid high-performance computing systems and the realization of software stacks for controlling quantum devices. Finally, we present simulation results that describe the expected system-level behavior of high-performance computing systems composed from compute nodes with quantum processing units. We describe performance for these hybrid systems in terms of time-to-solution, accuracy, and energy consumption, and we use simple application examples to estimate the performance advantage of quantum acceleration.

  19. Optical interconnection networks for high-performance computing systems.

    Science.gov (United States)

    Biberman, Aleksandr; Bergman, Keren

    2012-04-01

    Enabled by silicon photonic technology, optical interconnection networks have the potential to be a key disruptive technology in computing and communication industries. The enduring pursuit of performance gains in computing, combined with stringent power constraints, has fostered the ever-growing computational parallelism associated with chip multiprocessors, memory systems, high-performance computing systems and data centers. Sustaining these parallelism growths introduces unique challenges for on- and off-chip communications, shifting the focus toward novel and fundamentally different communication approaches. Chip-scale photonic interconnection networks, enabled by high-performance silicon photonic devices, offer unprecedented bandwidth scalability with reduced power consumption. We demonstrate that the silicon photonic platforms have already produced all the high-performance photonic devices required to realize these types of networks. Through extensive empirical characterization in much of our work, we demonstrate such feasibility of waveguides, modulators, switches and photodetectors. We also demonstrate systems that simultaneously combine many functionalities to achieve more complex building blocks. We propose novel silicon photonic devices, subsystems, network topologies and architectures to enable unprecedented performance of these photonic interconnection networks. Furthermore, the advantages of photonic interconnection networks extend far beyond the chip, offering advanced communication environments for memory systems, high-performance computing systems, and data centers.

  20. Interaction and Impact Studies for Distributed Energy Resource, Transactive Energy, and Electric Grid, using High Performance Computing ?based Modeling and Simulation

    Energy Technology Data Exchange (ETDEWEB)

    Kelley, B M

    2017-02-10

    The electric utility industry is undergoing significant transformations in its operation model, including a greater emphasis on automation, monitoring technologies, and distributed energy resource management systems (DERMS). With these changes and new technologies, while driving greater efficiencies and reliability, these new models may introduce new vectors of cyber attack. The appropriate cybersecurity controls to address and mitigate these newly introduced attack vectors and potential vulnerabilities are still widely unknown and performance of the control is difficult to vet. This proposal argues that modeling and simulation (M&S) is a necessary tool to address and better understand these problems introduced by emerging technologies for the grid. M&S will provide electric utilities a platform to model its transmission and distribution systems and run various simulations against the model to better understand the operational impact and performance of cybersecurity controls.

  1. GPU-based high-performance computing for radiation therapy.

    Science.gov (United States)

    Jia, Xun; Ziegenhein, Peter; Jiang, Steve B

    2014-02-21

    Recent developments in radiotherapy therapy demand high computation powers to solve challenging problems in a timely fashion in a clinical environment. The graphics processing unit (GPU), as an emerging high-performance computing platform, has been introduced to radiotherapy. It is particularly attractive due to its high computational power, small size, and low cost for facility deployment and maintenance. Over the past few years, GPU-based high-performance computing in radiotherapy has experienced rapid developments. A tremendous amount of study has been conducted, in which large acceleration factors compared with the conventional CPU platform have been observed. In this paper, we will first give a brief introduction to the GPU hardware structure and programming model. We will then review the current applications of GPU in major imaging-related and therapy-related problems encountered in radiotherapy. A comparison of GPU with other platforms will also be presented.

  2. High Performance Data Transfer for Distributed Data Intensive Sciences

    Energy Technology Data Exchange (ETDEWEB)

    Fang, Chin [Zettar Inc., Mountain View, CA (United States); Cottrell, R ' Les' A. [SLAC National Accelerator Lab., Menlo Park, CA (United States); Hanushevsky, Andrew B. [SLAC National Accelerator Lab., Menlo Park, CA (United States); Kroeger, Wilko [SLAC National Accelerator Lab., Menlo Park, CA (United States); Yang, Wei [SLAC National Accelerator Lab., Menlo Park, CA (United States)

    2017-03-06

    We report on the development of ZX software providing high performance data transfer and encryption. The design scales in: computation power, network interfaces, and IOPS while carefully balancing the available resources. Two U.S. patent-pending algorithms help tackle data sets containing lots of small files and very large files, and provide insensitivity to network latency. It has a cluster-oriented architecture, using peer-to-peer technologies to ease deployment, operation, usage, and resource discovery. Its unique optimizations enable effective use of flash memory. Using a pair of existing data transfer nodes at SLAC and NERSC, we compared its performance to that of bbcp and GridFTP and determined that they were comparable. With a proof of concept created using two four-node clusters with multiple distributed multi-core CPUs, network interfaces and flash memory, we achieved 155Gbps memory-to-memory over a 2x100Gbps link aggregated channel and 70Gbps file-to-file with encryption over a 5000 mile 100Gbps link.

  3. Contemporary high performance computing from petascale toward exascale

    CERN Document Server

    Vetter, Jeffrey S

    2015-01-01

    A continuation of Contemporary High Performance Computing: From Petascale toward Exascale, this second volume continues the discussion of HPC flagship systems, major application workloads, facilities, and sponsors. The book includes of figures and pictures that capture the state of existing systems: pictures of buildings, systems in production, floorplans, and many block diagrams and charts to illustrate system design and performance.

  4. Enabling High-Performance Computing as a Service

    KAUST Repository

    AbdelBaky, Moustafa

    2012-10-01

    With the right software infrastructure, clouds can provide scientists with as a service access to high-performance computing resources. An award-winning prototype framework transforms the Blue Gene/P system into an elastic cloud to run a representative HPC application. © 2012 IEEE.

  5. Computer science of the high performance; Informatica del alto rendimiento

    Energy Technology Data Exchange (ETDEWEB)

    Moraleda, A.

    2008-07-01

    The high performance computing is taking shape as a powerful accelerator of the process of innovation, to drastically reduce the waiting times for access to the results and the findings in a growing number of processes and activities as complex and important as medicine, genetics, pharmacology, environment, natural resources management or the simulation of complex processes in a wide variety of industries. (Author)

  6. High Performance Computing and Networking for Science--Background Paper.

    Science.gov (United States)

    Congress of the U.S., Washington, DC. Office of Technology Assessment.

    The Office of Technology Assessment is conducting an assessment of the effects of new information technologies--including high performance computing, data networking, and mass data archiving--on research and development. This paper offers a view of the issues and their implications for current discussions about Federal supercomputer initiatives…

  7. Seeking Solution: High-Performance Computing for Science. Background Paper.

    Science.gov (United States)

    Congress of the U.S., Washington, DC. Office of Technology Assessment.

    This is the second publication from the Office of Technology Assessment's assessment on information technology and research, which was requested by the House Committee on Science and Technology and the Senate Committee on Commerce, Science, and Transportation. The first background paper, "High Performance Computing & Networking for…

  8. High Performance Computing and Communications: Toward a National Information Infrastructure.

    Science.gov (United States)

    Federal Coordinating Council for Science, Engineering and Technology, Washington, DC.

    This report describes the High Performance Computing and Communications (HPCC) initiative of the Federal Coordinating Council for Science, Engineering, and Technology. This program is supportive of and coordinated with the National Information Infrastructure Initiative. Now halfway through its 5-year effort, the HPCC program counts among its…

  9. High Performance Computing in Science and Engineering '16 : Transactions of the High Performance Computing Center, Stuttgart (HLRS) 2016

    CERN Document Server

    Kröner, Dietmar; Resch, Michael

    2016-01-01

    This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS) in 2016. The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.

  10. High performance computing in power and energy systems

    Energy Technology Data Exchange (ETDEWEB)

    Khaitan, Siddhartha Kumar [Iowa State Univ., Ames, IA (United States); Gupta, Anshul (eds.) [IBM Watson Research Center, Yorktown Heights, NY (United States)

    2013-07-01

    The twin challenge of meeting global energy demands in the face of growing economies and populations and restricting greenhouse gas emissions is one of the most daunting ones that humanity has ever faced. Smart electrical generation and distribution infrastructure will play a crucial role in meeting these challenges. We would need to develop capabilities to handle large volumes of data generated by the power system components like PMUs, DFRs and other data acquisition devices as well as by the capacity to process these data at high resolution via multi-scale and multi-period simulations, cascading and security analysis, interaction between hybrid systems (electric, transport, gas, oil, coal, etc.) and so on, to get meaningful information in real time to ensure a secure, reliable and stable power system grid. Advanced research on development and implementation of market-ready leading-edge high-speed enabling technologies and algorithms for solving real-time, dynamic, resource-critical problems will be required for dynamic security analysis targeted towards successful implementation of Smart Grid initiatives. This books aims to bring together some of the latest research developments as well as thoughts on the future research directions of the high performance computing applications in electric power systems planning, operations, security, markets, and grid integration of alternate sources of energy, etc.

  11. Vecpar'08: High Performance Computing for Computational Science

    OpenAIRE

    Invernizzi, Alice

    2008-01-01

    The 8th International Meeting of High Perfomance Computing for Computational Science was held between 24 and 27 June 2008 in Toulouse at the ENSEEIHT Ecole Nationale Superiéure d’Electrotechnique, d’Electronique, d’Informatique, d’Hydraulique et des Téléccomunications of the University of Toulouse. The conference was the opportunity for gathering of an enlarged scientific community made up of mathematicians, physicists, engineers reverting to computer simulations for analysis of complex syst...

  12. High performance computing and communications: FY 1996 implementation plan

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1995-05-16

    The High Performance Computing and Communications (HPCC) Program was formally authorized by passage of the High Performance Computing Act of 1991, signed on December 9, 1991. Twelve federal agencies, in collaboration with scientists and managers from US industry, universities, and research laboratories, have developed the Program to meet the challenges of advancing computing and associated communications technologies and practices. This plan provides a detailed description of the agencies` HPCC implementation plans for FY 1995 and FY 1996. This Implementation Plan contains three additional sections. Section 3 provides an overview of the HPCC Program definition and organization. Section 4 contains a breakdown of the five major components of the HPCC Program, with an emphasis on the overall directions and milestones planned for each one. Section 5 provides a detailed look at HPCC Program activities within each agency.

  13. High performance computing in structural determination by electron cryomicroscopy.

    Science.gov (United States)

    Fernández, J J

    2008-10-01

    Computational advances have significantly contributed to the current role of electron cryomicroscopy (cryoEM) in structural biology. The needs for computational power are constantly growing with the increasing complexity of algorithms and the amount of data needed to push the resolution limits. High performance computing (HPC) is becoming paramount in cryoEM to cope with those computational needs. Since the nineties, different HPC strategies have been proposed for some specific problems in cryoEM and, in fact, some of them are already available in common software packages. Nevertheless, the literature is scattered in the areas of computer science and structural biology. In this communication, the HPC approaches devised for the computation-intensive tasks in cryoEM (single particles and tomography) are retrospectively reviewed and the future trends are discussed. Moreover, the HPC capabilities available in the most common cryoEM packages are surveyed, as an evidence of the importance of HPC in addressing the future challenges.

  14. 3rd International Conference on High Performance Scientific Computing

    CERN Document Server

    Kostina, Ekaterina; Phu, Hoang; Rannacher, Rolf

    2008-01-01

    This proceedings volume contains a selection of papers presented at the Third International Conference on High Performance Scientific Computing held at the Hanoi Institute of Mathematics, Vietnamese Academy of Science and Technology (VAST), March 6-10, 2006. The conference has been organized by the Hanoi Institute of Mathematics, Interdisciplinary Center for Scientific Computing (IWR), Heidelberg, and its International PhD Program ``Complex Processes: Modeling, Simulation and Optimization'', and Ho Chi Minh City University of Technology. The contributions cover the broad interdisciplinary spectrum of scientific computing and present recent advances in theory, development of methods, and applications in practice. Subjects covered are mathematical modelling, numerical simulation, methods for optimization and control, parallel computing, software development, applications of scientific computing in physics, chemistry, biology and mechanics, environmental and hydrology problems, transport, logistics and site loca...

  15. 6th International Conference on High Performance Scientific Computing

    CERN Document Server

    Phu, Hoang; Rannacher, Rolf; Schlöder, Johannes

    2017-01-01

    This proceedings volume highlights a selection of papers presented at the Sixth International Conference on High Performance Scientific Computing, which took place in Hanoi, Vietnam on March 16-20, 2015. The conference was jointly organized by the Heidelberg Institute of Theoretical Studies (HITS), the Institute of Mathematics of the Vietnam Academy of Science and Technology (VAST), the Interdisciplinary Center for Scientific Computing (IWR) at Heidelberg University, and the Vietnam Institute for Advanced Study in Mathematics, Ministry of Education The contributions cover a broad, interdisciplinary spectrum of scientific computing and showcase recent advances in theory, methods, and practical applications. Subjects covered numerical simulation, methods for optimization and control, parallel computing, and software development, as well as the applications of scientific computing in physics, mechanics, biomechanics and robotics, material science, hydrology, biotechnology, medicine, transport, scheduling, and in...

  16. 5th International Conference on High Performance Scientific Computing

    CERN Document Server

    Hoang, Xuan; Rannacher, Rolf; Schlöder, Johannes

    2014-01-01

    This proceedings volume gathers a selection of papers presented at the Fifth International Conference on High Performance Scientific Computing, which took place in Hanoi on March 5-9, 2012. The conference was organized by the Institute of Mathematics of the Vietnam Academy of Science and Technology (VAST), the Interdisciplinary Center for Scientific Computing (IWR) of Heidelberg University, Ho Chi Minh City University of Technology, and the Vietnam Institute for Advanced Study in Mathematics. The contributions cover the broad interdisciplinary spectrum of scientific computing and present recent advances in theory, development of methods, and practical applications. Subjects covered include mathematical modeling; numerical simulation; methods for optimization and control; parallel computing; software development; and applications of scientific computing in physics, mechanics and biomechanics, material science, hydrology, chemistry, biology, biotechnology, medicine, sports, psychology, transport, logistics, com...

  17. High-Performance Java Codes for Computational Fluid Dynamics

    Science.gov (United States)

    Riley, Christopher; Chatterjee, Siddhartha; Biswas, Rupak; Biegel, Bryan (Technical Monitor)

    2001-01-01

    The computational science community is reluctant to write large-scale computationally -intensive applications in Java due to concerns over Java's poor performance, despite the claimed software engineering advantages of its object-oriented features. Naive Java implementations of numerical algorithms can perform poorly compared to corresponding Fortran or C implementations. To achieve high performance, Java applications must be designed with good performance as a primary goal. This paper presents the object-oriented design and implementation of two real-world applications from the field of Computational Fluid Dynamics (CFD): a finite-volume fluid flow solver (LAURA, from NASA Langley Research Center), and an unstructured mesh adaptation algorithm (2D_TAG, from NASA Ames Research Center). This work builds on our previous experience with the design of high-performance numerical libraries in Java. We examine the performance of the applications using the currently available Java infrastructure and show that the Java version of the flow solver LAURA performs almost within a factor of 2 of the original procedural version. Our Java version of the mesh adaptation algorithm 2D_TAG performs within a factor of 1.5 of its original procedural version on certain platforms. Our results demonstrate that object-oriented software design principles are not necessarily inimical to high performance.

  18. High performance flight computer developed for deep space applications

    Science.gov (United States)

    Bunker, Robert L.

    1993-01-01

    The development of an advanced space flight computer for real time embedded deep space applications which embodies the lessons learned on Galileo and modern computer technology is described. The requirements are listed and the design implementation that meets those requirements is described. The development of SPACE-16 (Spaceborne Advanced Computing Engine) (where 16 designates the databus width) was initiated to support the MM2 (Marine Mark 2) project. The computer is based on a radiation hardened emulation of a modern 32 bit microprocessor and its family of support devices including a high performance floating point accelerator. Additional custom devices which include a coprocessor to improve input/output capabilities, a memory interface chip, and an additional support chip that provide management of all fault tolerant features, are described. Detailed supporting analyses and rationale which justifies specific design and architectural decisions are provided. The six chip types were designed and fabricated. Testing and evaluation of a brass/board was initiated.

  19. High performance computing in science and engineering '09: transactions of the High Performance Computing Center, Stuttgart (HLRS) 2009

    National Research Council Canada - National Science Library

    Nagel, Wolfgang E; Kröner, Dietmar; Resch, Michael

    2010-01-01

    ...), NIC/JSC (J¨ u lich), and LRZ (Munich). As part of that strategic initiative, in May 2009 already NIC/JSC has installed the first phase of the GCS HPC Tier-0 resources, an IBM Blue Gene/P with roughly 300.000 Cores, this time in J¨ u lich, With that, the GCS provides the most powerful high-performance computing infrastructure in Europe alread...

  20. Overview of Parallel Platforms for Common High Performance Computing

    Directory of Open Access Journals (Sweden)

    T. Fryza

    2012-04-01

    Full Text Available The paper deals with various parallel platforms used for high performance computing in the signal processing domain. More precisely, the methods exploiting the multicores central processing units such as message passing interface and OpenMP are taken into account. The properties of the programming methods are experimentally proved in the application of a fast Fourier transform and a discrete cosine transform and they are compared with the possibilities of MATLAB's built-in functions and Texas Instruments digital signal processors with very long instruction word architectures. New FFT and DCT implementations were proposed and tested. The implementation phase was compared with CPU based computing methods and with possibilities of the Texas Instruments digital signal processing library on C6747 floating-point DSPs. The optimal combination of computing methods in the signal processing domain and new, fast routines' implementation is proposed as well.

  1. High performance computing and communications: FY 1995 implementation plan

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1994-04-01

    The High Performance Computing and Communications (HPCC) Program was formally established following passage of the High Performance Computing Act of 1991 signed on December 9, 1991. Ten federal agencies in collaboration with scientists and managers from US industry, universities, and laboratories have developed the HPCC Program to meet the challenges of advancing computing and associated communications technologies and practices. This plan provides a detailed description of the agencies` HPCC implementation plans for FY 1994 and FY 1995. This Implementation Plan contains three additional sections. Section 3 provides an overview of the HPCC Program definition and organization. Section 4 contains a breakdown of the five major components of the HPCC Program, with an emphasis on the overall directions and milestones planned for each one. Section 5 provides a detailed look at HPCC Program activities within each agency. Although the Department of Education is an official HPCC agency, its current funding and reporting of crosscut activities goes through the Committee on Education and Health Resources, not the HPCC Program. For this reason the Implementation Plan covers nine HPCC agencies.

  2. Fundamentals of Modeling, Data Assimilation, and High-performance Computing

    Science.gov (United States)

    Rood, Richard B.

    2005-01-01

    This lecture will introduce the concepts of modeling, data assimilation and high- performance computing as it relates to the study of atmospheric composition. The lecture will work from basic definitions and will strive to provide a framework for thinking about development and application of models and data assimilation systems. It will not provide technical or algorithmic information, leaving that to textbooks, technical reports, and ultimately scientific journals. References to a number of textbooks and papers will be provided as a gateway to the literature.

  3. High Performance Computing - Power Application Programming Interface Specification.

    Energy Technology Data Exchange (ETDEWEB)

    Laros, James H.,; Kelly, Suzanne M.; Pedretti, Kevin; Grant, Ryan; Olivier, Stephen Lecler; Levenhagen, Michael J.; DeBonis, David

    2014-08-01

    Measuring and controlling the power and energy consumption of high performance computing systems by various components in the software stack is an active research area [13, 3, 5, 10, 4, 21, 19, 16, 7, 17, 20, 18, 11, 1, 6, 14, 12]. Implementations in lower level software layers are beginning to emerge in some production systems, which is very welcome. To be most effective, a portable interface to measurement and control features would significantly facilitate participation by all levels of the software stack. We present a proposal for a standard power Application Programming Interface (API) that endeavors to cover the entire software space, from generic hardware interfaces to the input from the computer facility manager.

  4. Benchmarking high performance computing architectures with CMS’ skeleton framework

    Science.gov (United States)

    Sexton-Kennedy, E.; Gartung, P.; Jones, C. D.

    2017-10-01

    In 2012 CMS evaluated which underlying concurrency technology would be the best to use for its multi-threaded framework. The available technologies were evaluated on the high throughput computing systems dominating the resources in use at that time. A skeleton framework benchmarking suite that emulates the tasks performed within a CMSSW application was used to select Intel’s Thread Building Block library, based on the measured overheads in both memory and CPU on the different technologies benchmarked. In 2016 CMS will get access to high performance computing resources that use new many core architectures; machines such as Cori Phase 1&2, Theta, Mira. Because of this we have revived the 2012 benchmark to test it’s performance and conclusions on these new architectures. This talk will discuss the results of this exercise.

  5. Mass storage: The key to success in high performance computing

    Science.gov (United States)

    Lee, Richard R.

    1993-01-01

    There are numerous High Performance Computing & Communications Initiatives in the world today. All are determined to help solve some 'Grand Challenges' type of problem, but each appears to be dominated by the pursuit of higher and higher levels of CPU performance and interconnection bandwidth as the approach to success, without any regard to the impact of Mass Storage. My colleagues and I at Data Storage Technologies believe that all will have their performance against their goals ultimately measured by their ability to efficiently store and retrieve the 'deluge of data' created by end-users who will be using these systems to solve Scientific Grand Challenges problems, and that the issue of Mass Storage will become then the determinant of success or failure in achieving each projects goals. In today's world of High Performance Computing and Communications (HPCC), the critical path to success in solving problems can only be traveled by designing and implementing Mass Storage Systems capable of storing and manipulating the truly 'massive' amounts of data associated with solving these challenges. Within my presentation I will explore this critical issue and hypothesize solutions to this problem.

  6. High performance transcription factor-DNA docking with GPU computing.

    Science.gov (United States)

    Wu, Jiadong; Hong, Bo; Takeda, Takako; Guo, Jun-Tao

    2012-06-21

    Protein-DNA docking is a very challenging problem in structural bioinformatics and has important implications in a number of applications, such as structure-based prediction of transcription factor binding sites and rational drug design. Protein-DNA docking is very computational demanding due to the high cost of energy calculation and the statistical nature of conformational sampling algorithms. More importantly, experiments show that the docking quality depends on the coverage of the conformational sampling space. It is therefore desirable to accelerate the computation of the docking algorithm, not only to reduce computing time, but also to improve docking quality. In an attempt to accelerate the sampling process and to improve the docking performance, we developed a graphics processing unit (GPU)-based protein-DNA docking algorithm. The algorithm employs a potential-based energy function to describe the binding affinity of a protein-DNA pair, and integrates Monte-Carlo simulation and a simulated annealing method to search through the conformational space. Algorithmic techniques were developed to improve the computation efficiency and scalability on GPU-based high performance computing systems. The effectiveness of our approach is tested on a non-redundant set of 75 TF-DNA complexes and a newly developed TF-DNA docking benchmark. We demonstrated that the GPU-based docking algorithm can significantly accelerate the simulation process and thereby improving the chance of finding near-native TF-DNA complex structures. This study also suggests that further improvement in protein-DNA docking research would require efforts from two integral aspects: improvement in computation efficiency and energy function design. We present a high performance computing approach for improving the prediction accuracy of protein-DNA docking. The GPU-based docking algorithm accelerates the search of the conformational space and thus increases the chance of finding more near

  7. High-performance computer management based on Java

    OpenAIRE

    Sander, Volker; Erwin, Dietmar; Huber, Valentina

    1998-01-01

    Coupling of distributed computer resources connected by a high speed network to one virtual computer is the basic idea of a metacomputer. Access to the metacomputer should be provided by an intuitive graphical user interface (GUI), ideally WWW based. This paper presents a metacomputer architecture using a Java based GUI. The concept will be discussed with regard to security, communication, scalability, and the integration into existing frameworks.

  8. Opportunities and challenges of high-performance computing in chemistry

    Energy Technology Data Exchange (ETDEWEB)

    Guest, M.F.; Kendall, R.A.; Nichols, J.A. [eds.] [and others

    1995-06-01

    The field of high-performance computing is developing at an extremely rapid pace. Massively parallel computers offering orders of magnitude increase in performance are under development by all the major computer vendors. Many sites now have production facilities that include massively parallel hardware. Molecular modeling methodologies (both quantum and classical) are also advancing at a brisk pace. The transition of molecular modeling software to a massively parallel computing environment offers many exciting opportunities, such as the accurate treatment of larger, more complex molecular systems in routine fashion, and a viable, cost-effective route to study physical, biological, and chemical `grand challenge` problems that are impractical on traditional vector supercomputers. This will have a broad effect on all areas of basic chemical science at academic research institutions and chemical, petroleum, and pharmaceutical industries in the United States, as well as chemical waste and environmental remediation processes. But, this transition also poses significant challenges: architectural issues (SIMD, MIMD, local memory, global memory, etc.) remain poorly understood and software development tools (compilers, debuggers, performance monitors, etc.) are not well developed. In addition, researchers that understand and wish to pursue the benefits offered by massively parallel computing are often hindered by lack of expertise, hardware, and/or information at their site. A conference and workshop organized to focus on these issues was held at the National Institute of Health, Bethesda, Maryland (February 1993). This report is the culmination of the organized workshop. The main conclusion: a drastic acceleration in the present rate of progress is required for the chemistry community to be positioned to exploit fully the emerging class of Teraflop computers, even allowing for the significant work to date by the community in developing software for parallel architectures.

  9. Power/energy use cases for high performance computing

    Energy Technology Data Exchange (ETDEWEB)

    Laros, James H. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Kelly, Suzanne M. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Hammond, Steven [National Renewable Energy Lab. (NREL), Golden, CO (United States); Elmore, Ryan [National Renewable Energy Lab. (NREL), Golden, CO (United States); Munch, Kristin [National Renewable Energy Lab. (NREL), Golden, CO (United States)

    2013-12-01

    Power and Energy have been identified as a first order challenge for future extreme scale high performance computing (HPC) systems. In practice the breakthroughs will need to be provided by the hardware vendors. But to make the best use of the solutions in an HPC environment, it will likely require periodic tuning by facility operators and software components. This document describes the actions and interactions needed to maximize power resources. It strives to cover the entire operational space in which an HPC system occupies. The descriptions are presented as formal use cases, as documented in the Unified Modeling Language Specification [1]. The document is intended to provide a common understanding to the HPC community of the necessary management and control capabilities. Assuming a common understanding can be achieved, the next step will be to develop a set of Application Programing Interfaces (APIs) to which hardware vendors and software developers could utilize to steer power consumption.

  10. Trends in high-performance computing for engineering calculations.

    Science.gov (United States)

    Giles, M B; Reguly, I

    2014-08-13

    High-performance computing has evolved remarkably over the past 20 years, and that progress is likely to continue. However, in recent years, this progress has been achieved through greatly increased hardware complexity with the rise of multicore and manycore processors, and this is affecting the ability of application developers to achieve the full potential of these systems. This article outlines the key developments on the hardware side, both in the recent past and in the near future, with a focus on two key issues: energy efficiency and the cost of moving data. It then discusses the much slower evolution of system software, and the implications of all of this for application developers. © 2014 The Author(s) Published by the Royal Society. All rights reserved.

  11. Next-generation sequencing: big data meets high performance computing.

    Science.gov (United States)

    Schmidt, Bertil; Hildebrandt, Andreas

    2017-04-01

    The progress of next-generation sequencing has a major impact on medical and genomic research. This high-throughput technology can now produce billions of short DNA or RNA fragments in excess of a few terabytes of data in a single run. This leads to massive datasets used by a wide range of applications including personalized cancer treatment and precision medicine. In addition to the hugely increased throughput, the cost of using high-throughput technologies has been dramatically decreasing. A low sequencing cost of around US$1000 per genome has now rendered large population-scale projects feasible. However, to make effective use of the produced data, the design of big data algorithms and their efficient implementation on modern high performance computing systems is required. Copyright © 2017 Elsevier Ltd. All rights reserved.

  12. Scalability of DL_POLY on High Performance Computing Platform

    Directory of Open Access Journals (Sweden)

    Mabule Samuel Mabakane

    2017-12-01

    Full Text Available This paper presents a case study on the scalability of several versions of the molecular dynamics code (DL_POLY performed on South Africa‘s Centre for High Performance Computing e1350 IBM Linux cluster, Sun system and Lengau supercomputers. Within this study different problem sizes were designed and the same chosen systems were employed in order to test the performance of DL_POLY using weak and strong scalability. It was found that the speed-up results for the small systems were better than large systems on both Ethernet and Infiniband network. However, simulations of large systems in DL_POLY performed well using Infiniband network on Lengau cluster as compared to e1350 and Sun supercomputer.

  13. Scout: high-performance heterogeneous computing made simple

    Energy Technology Data Exchange (ETDEWEB)

    Jablin, James [Los Alamos National Laboratory; Mc Cormick, Patrick [Los Alamos National Laboratory; Herlihy, Maurice [BROWN UNIV.

    2011-01-26

    Researchers must often write their own simulation and analysis software. During this process they simultaneously confront both computational and scientific problems. Current strategies for aiding the generation of performance-oriented programs do not abstract the software development from the science. Furthermore, the problem is becoming increasingly complex and pressing with the continued development of many-core and heterogeneous (CPU-GPU) architectures. To acbieve high performance, scientists must expertly navigate both software and hardware. Co-design between computer scientists and research scientists can alleviate but not solve this problem. The science community requires better tools for developing, optimizing, and future-proofing codes, allowing scientists to focus on their research while still achieving high computational performance. Scout is a parallel programming language and extensible compiler framework targeting heterogeneous architectures. It provides the abstraction required to buffer scientists from the constantly-shifting details of hardware while still realizing higb-performance by encapsulating software and hardware optimization within a compiler framework.

  14. Chip-to-board interconnects for high-performance computing

    Science.gov (United States)

    Riester, Markus B. K.; Houbertz-Krauss, Ruth; Steenhusen, Sönke

    2013-02-01

    Super computing is reaching out to ExaFLOP processing speeds, creating fundamental challenges for the way that computing systems are designed and built. One governing topic is the reduction of power used for operating the system, and eliminating the excess heat generated from the system. Current thinking sees optical interconnects on most interconnect levels to be a feasible solution to many of the challenges, although there are still limitations to the technical solutions, in particular with regard to manufacturability. This paper explores drivers for enabling optical interconnect technologies to advance into the module and chip level. The introduction of optical links into High Performance Computing (HPC) could be an option to allow scaling the manufacturing technology to large volume manufacturing. This will drive the need for manufacturability of optical interconnects, giving rise to other challenges that add to the realization of this type of interconnection. This paper describes a solution that allows the creation of optical components on module level, integrating optical chips, laser diodes or PIN diodes as components much like the well known SMD components used for electrical components. The paper shows the main challenges and potential solutions to this challenge and proposes a fundamental paradigm shift in the manufacturing of 3-dimensional optical links for the level 1 interconnect (chip package).

  15. Simple, Parallel, High-Performance Virtual Machines for Extreme Computations

    CERN Document Server

    Nejad, Bijan Chokoufe; Reuter, Jürgen

    2014-01-01

    We introduce a high-performance virtual machine (VM) written in a numerically fast language like Fortran or C to evaluate very large expressions. We discuss the general concept of how to perform computations in terms of a VM and present specifically a VM that is able to compute tree-level cross sections for any number of external legs, given the corresponding byte code from the optimal matrix element generator, O'Mega. Furthermore, this approach allows to formulate the parallel computation of a single phase space point in a simple and obvious way. We analyze hereby the scaling behaviour with multiple threads as well as the benefits and drawbacks that are introduced with this method. Our implementation of a VM can run faster than the corresponding native, compiled code for certain processes and compilers, especially for very high multiplicities, and has in general runtimes in the same order of magnitude. By avoiding the tedious compile and link steps, which may fail for source code files of gigabyte sizes, new...

  16. Multicore Challenges and Benefits for High Performance Scientific Computing

    Directory of Open Access Journals (Sweden)

    Ida M.B. Nielsen

    2008-01-01

    Full Text Available Until recently, performance gains in processors were achieved largely by improvements in clock speeds and instruction level parallelism. Thus, applications could obtain performance increases with relatively minor changes by upgrading to the latest generation of computing hardware. Currently, however, processor performance improvements are realized by using multicore technology and hardware support for multiple threads within each core, and taking full advantage of this technology to improve the performance of applications requires exposure of extreme levels of software parallelism. We will here discuss the architecture of parallel computers constructed from many multicore chips as well as techniques for managing the complexity of programming such computers, including the hybrid message-passing/multi-threading programming model. We will illustrate these ideas with a hybrid distributed memory matrix multiply and a quantum chemistry algorithm for energy computation using Møller–Plesset perturbation theory.

  17. High-performance Scientific Computing using Parallel Computing to Improve Performance Optimization Problems

    Directory of Open Access Journals (Sweden)

    Florica Novăcescu

    2011-10-01

    Full Text Available HPC (High Performance Computing has become essential for the acceleration of innovation and the companies’ assistance in creating new inventions, better models and more reliable products as well as obtaining processes and services at low costs. The information in this paper focuses particularly on: description the field of high performance scientific computing, parallel computing, scientific computing, parallel computers, and trends in the HPC field, presented here reveal important new directions toward the realization of a high performance computational society. The practical part of the work is an example of use of the HPC tool to accelerate solving an electrostatic optimization problem using the Parallel Computing Toolbox that allows solving computational and data-intensive problems using MATLAB and Simulink on multicore and multiprocessor computers.

  18. Optimizing high performance computing workflow for protein functional annotation.

    Science.gov (United States)

    Stanberry, Larissa; Rekepalli, Bhanu; Liu, Yuan; Giblock, Paul; Higdon, Roger; Montague, Elizabeth; Broomall, William; Kolker, Natali; Kolker, Eugene

    2014-09-10

    Functional annotation of newly sequenced genomes is one of the major challenges in modern biology. With modern sequencing technologies, the protein sequence universe is rapidly expanding. Newly sequenced bacterial genomes alone contain over 7.5 million proteins. The rate of data generation has far surpassed that of protein annotation. The volume of protein data makes manual curation infeasible, whereas a high compute cost limits the utility of existing automated approaches. In this work, we present an improved and optmized automated workflow to enable large-scale protein annotation. The workflow uses high performance computing architectures and a low complexity classification algorithm to assign proteins into existing clusters of orthologous groups of proteins. On the basis of the Position-Specific Iterative Basic Local Alignment Search Tool the algorithm ensures at least 80% specificity and sensitivity of the resulting classifications. The workflow utilizes highly scalable parallel applications for classification and sequence alignment. Using Extreme Science and Engineering Discovery Environment supercomputers, the workflow processed 1,200,000 newly sequenced bacterial proteins. With the rapid expansion of the protein sequence universe, the proposed workflow will enable scientists to annotate big genome data.

  19. Implementing an Affordable High-Performance Computing for Teaching-Oriented Computer Science Curriculum

    Science.gov (United States)

    Abuzaghleh, Omar; Goldschmidt, Kathleen; Elleithy, Yasser; Lee, Jeongkyu

    2013-01-01

    With the advances in computing power, high-performance computing (HPC) platforms have had an impact on not only scientific research in advanced organizations but also computer science curriculum in the educational community. For example, multicore programming and parallel systems are highly desired courses in the computer science major. However,…

  20. A Heterogeneous High-Performance System for Computational and Computer Science

    Science.gov (United States)

    2016-11-15

    System for Computational and Computer Science Report Title This DoD HBC/MI Equipment/Instrumentation grant was awarded in October 2014 for the purchase... Computing (HPC) course taught in the department of computer science as to attract more graduate students from many disciplines where their research...AND SUBTITLE A Heterogeneous High-Performance System for Computational and Computer Science 5a. CONTRACT NUMBER W911NF-15-1-0023 5b

  1. Multi-Language Programming Environments for High Performance Java Computing

    Directory of Open Access Journals (Sweden)

    Vladimir Getov

    1999-01-01

    Full Text Available Recent developments in processor capabilities, software tools, programming languages and programming paradigms have brought about new approaches to high performance computing. A steadfast component of this dynamic evolution has been the scientific community’s reliance on established scientific packages. As a consequence, programmers of high‐performance applications are reluctant to embrace evolving languages such as Java. This paper describes the Java‐to‐C Interface (JCI tool which provides application programmers wishing to use Java with immediate accessibility to existing scientific packages. The JCI tool also facilitates rapid development and reuse of existing code. These benefits are provided at minimal cost to the programmer. While beneficial to the programmer, the additional advantages of mixed‐language programming in terms of application performance and portability are addressed in detail within the context of this paper. In addition, we discuss how the JCI tool is complementing other ongoing projects such as IBM’s High‐Performance Compiler for Java (HPCJ and IceT’s metacomputing environment.

  2. The Future of Software Engineering for High Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Pope, G [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2015-07-16

    DOE ASCR requested that from May through mid-July 2015 a study group identify issues and recommend solutions from a software engineering perspective transitioning into the next generation of High Performance Computing. The approach used was to ask some of the DOE complex experts who will be responsible for doing this work to contribute to the study group. The technique used was to solicit elevator speeches: a short and concise write up done as if the author was a speaker with only a few minutes to convince a decision maker of their top issues. Pages 2-18 contain the original texts of the contributed elevator speeches and end notes identifying the 20 contributors. The study group also ranked the importance of each topic, and those scores are displayed with each topic heading. A perfect score (and highest priority) is three, two is medium priority, and one is lowest priority. The highest scoring topic areas were software engineering and testing resources; the lowest scoring area was compliance to DOE standards. The following two paragraphs are an elevator speech summarizing the contributed elevator speeches. Each sentence or phrase in the summary is hyperlinked to its source via a numeral embedded in the text. A risk one liner has also been added to each topic to allow future risk tracking and mitigation.

  3. Towards New Metrics for High-Performance Computing Resilience

    Energy Technology Data Exchange (ETDEWEB)

    Hukerikar, Saurabh [ORNL; Ashraf, Rizwan A [ORNL; Engelmann, Christian [ORNL

    2017-01-01

    Ensuring the reliability of applications is becoming an increasingly important challenge as high-performance computing (HPC) systems experience an ever-growing number of faults, errors and failures. While the HPC community has made substantial progress in developing various resilience solutions, it continues to rely on platform-based metrics to quantify application resiliency improvements. The resilience of an HPC application is concerned with the reliability of the application outcome as well as the fault handling efficiency. To understand the scope of impact, effective coverage and performance efficiency of existing and emerging resilience solutions, there is a need for new metrics. In this paper, we develop new ways to quantify resilience that consider both the reliability and the performance characteristics of the solutions from the perspective of HPC applications. As HPC systems continue to evolve in terms of scale and complexity, it is expected that applications will experience various types of faults, errors and failures, which will require applications to apply multiple resilience solutions across the system stack. The proposed metrics are intended to be useful for understanding the combined impact of these solutions on an application's ability to produce correct results and to evaluate their overall impact on an application's performance in the presence of various modes of faults.

  4. Lightweight Provenance Service for High-Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Dai, Dong; Chen, Yong; Carns, Philip; Jenkins, John; Ross, Robert

    2017-09-09

    Provenance describes detailed information about the history of a piece of data, containing the relationships among elements such as users, processes, jobs, and workflows that contribute to the existence of data. Provenance is key to supporting many data management functionalities that are increasingly important in operations such as identifying data sources, parameters, or assumptions behind a given result; auditing data usage; or understanding details about how inputs are transformed into outputs. Despite its importance, however, provenance support is largely underdeveloped in highly parallel architectures and systems. One major challenge is the demanding requirements of providing provenance service in situ. The need to remain lightweight and to be always on often conflicts with the need to be transparent and offer an accurate catalog of details regarding the applications and systems. To tackle this challenge, we introduce a lightweight provenance service, called LPS, for high-performance computing (HPC) systems. LPS leverages a kernel instrument mechanism to achieve transparency and introduces representative execution and flexible granularity to capture comprehensive provenance with controllable overhead. Extensive evaluations and use cases have confirmed its efficiency and usability. We believe that LPS can be integrated into current and future HPC systems to support a variety of data management needs.

  5. FY 1995 Blue Book: High Performance Computing and Communications: Technology for the National Information Infrastructure

    Data.gov (United States)

    Networking and Information Technology Research and Development, Executive Office of the President — The Federal High Performance Computing and Communications HPCC Program was created to accelerate the development of future generations of high performance computers...

  6. Mixed-Language High-Performance Computing for Plasma Simulations

    Directory of Open Access Journals (Sweden)

    Quanming Lu

    2003-01-01

    Full Text Available Java is receiving increasing attention as the most popular platform for distributed computing. However, programmers are still reluctant to embrace Java as a tool for writing scientific and engineering applications due to its still noticeable performance drawbacks compared with other programming languages such as Fortran or C. In this paper, we present a hybrid Java/Fortran implementation of a parallel particle-in-cell (PIC algorithm for plasma simulations. In our approach, the time-consuming components of this application are designed and implemented as Fortran subroutines, while less calculation-intensive components usually involved in building the user interface are written in Java. The two types of software modules have been glued together using the Java native interface (JNI. Our mixed-language PIC code was tested and its performance compared with pure Java and Fortran versions of the same algorithm on a Sun E6500 SMP system and a Linux cluster of Pentium~III machines.

  7. High Performance Polar Decomposition on Distributed Memory Systems

    KAUST Repository

    Sukkari, Dalal E.

    2016-08-08

    The polar decomposition of a dense matrix is an important operation in linear algebra. It can be directly calculated through the singular value decomposition (SVD) or iteratively using the QR dynamically-weighted Halley algorithm (QDWH). The former is difficult to parallelize due to the preponderant number of memory-bound operations during the bidiagonal reduction. We investigate the latter scenario, which performs more floating-point operations but exposes at the same time more parallelism, and therefore, runs closer to the theoretical peak performance of the system, thanks to more compute-bound matrix operations. Profiling results show the performance scalability of QDWH for calculating the polar decomposition using around 9200 MPI processes on well and ill-conditioned matrices of 100K×100K problem size. We study then the performance impact of the QDWH-based polar decomposition as a pre-processing step toward calculating the SVD itself. The new distributed-memory implementation of the QDWH-SVD solver achieves up to five-fold speedup against current state-of-the-art vendor SVD implementations. © Springer International Publishing Switzerland 2016.

  8. Architectural Principles and Experimentation of Distributed High Performance Virtual Clusters

    Science.gov (United States)

    Younge, Andrew J.

    2016-01-01

    With the advent of virtualization and Infrastructure-as-a-Service (IaaS), the broader scientific computing community is considering the use of clouds for their scientific computing needs. This is due to the relative scalability, ease of use, advanced user environment customization abilities, and the many novel computing paradigms available for…

  9. High-Performance Special-Purpose Computers in Science

    OpenAIRE

    Fukushige, Toshiyuki; Hut, Piet; Makino, Jun

    1998-01-01

    The next decade will be an exciting time for computational physicists. After 50 years of being forced to use standardized commercial equipment, it will finally become relatively straightforward to adapt one's computing tools to one's own needs. The breakthrough that opens this new era is the now wide-spread availability of programmable chips that allow virtually every computational scientist to design his or her own special-purpose computer.

  10. 15 CFR 743.2 - High performance computers: Post shipment verification reporting.

    Science.gov (United States)

    2010-01-01

    ... 15 Commerce and Foreign Trade 2 2010-01-01 2010-01-01 false High performance computers: Post... ADMINISTRATION REGULATIONS SPECIAL REPORTING § 743.2 High performance computers: Post shipment verification... certain computers to destinations in Computer Tier 3, see § 740.7(d) for a list of these destinations...

  11. Scalable High Performance Computing: Direct and Large-Eddy Turbulent Flow Simulations Using Massively Parallel Computers

    Science.gov (United States)

    Morgan, Philip E.

    2004-01-01

    This final report contains reports of research related to the tasks "Scalable High Performance Computing: Direct and Lark-Eddy Turbulent FLow Simulations Using Massively Parallel Computers" and "Devleop High-Performance Time-Domain Computational Electromagnetics Capability for RCS Prediction, Wave Propagation in Dispersive Media, and Dual-Use Applications. The discussion of Scalable High Performance Computing reports on three objectives: validate, access scalability, and apply two parallel flow solvers for three-dimensional Navier-Stokes flows; develop and validate a high-order parallel solver for Direct Numerical Simulations (DNS) and Large Eddy Simulation (LES) problems; and Investigate and develop a high-order Reynolds averaged Navier-Stokes turbulence model. The discussion of High-Performance Time-Domain Computational Electromagnetics reports on five objectives: enhancement of an electromagnetics code (CHARGE) to be able to effectively model antenna problems; utilize lessons learned in high-order/spectral solution of swirling 3D jets to apply to solving electromagnetics project; transition a high-order fluids code, FDL3DI, to be able to solve Maxwell's Equations using compact-differencing; develop and demonstrate improved radiation absorbing boundary conditions for high-order CEM; and extend high-order CEM solver to address variable material properties. The report also contains a review of work done by the systems engineer.

  12. Configurable computing for high-security/high-performance ambient systems

    OpenAIRE

    Gogniat, Guy; Bossuet, Lilian; Burleson, Wayne

    2005-01-01

    This paper stresses why configurable computing is a promising target to guarantee the hardware security of ambient systems. Many works have focused on configurable computing to demonstrate its efficiency but as far as we know none have addressed the security issue from system to circuit levels. This paper recalls main hardware attacks before focusing on issues to build secure systems on configurable computing. Two complementary views are presented to provide a guide for security and main issues ...

  13. Scientific and high-performance computing at FAIR

    Directory of Open Access Journals (Sweden)

    Kisel Ivan

    2015-01-01

    Full Text Available Future FAIR experiments have to deal with very high input rates, large track multiplicities, make full event reconstruction and selection on-line on a large dedicated computer farm equipped with heterogeneous many-core CPU/GPU compute nodes. To develop efficient and fast algorithms, which are optimized for parallel computations, is a challenge for the groups of experts dealing with the HPC computing. Here we present and discuss the status and perspectives of the data reconstruction and physics analysis software of one of the future FAIR experiments, namely, the CBM experiment.

  14. Building a High Performance Computing Infrastructure for Novosibirsk Scientific Center

    Science.gov (United States)

    Adakin, A.; Belov, S.; Chubarov, D.; Kalyuzhny, V.; Kaplin, V.; Kuchin, N.; Lomakin, S.; Nikultsev, V.; Sukharev, A.; Zaytsev, A.

    2011-12-01

    Novosibirsk Scientific Center (NSC), also known worldwide as Akademgorodok, is one of the largest Russian scientific centers hosting Novosibirsk State University (NSU) and more than 35 research organizations of the Siberian Branch of Russian Academy of Sciences including Budker Institute of Nuclear Physics (BINP), Institute of Computational Technologies (ICT), and Institute of Computational Mathematics and Mathematical Geophysics (ICM&MG). Since each institute has specific requirements on the architecture of the computing farms involved in its research field, currentiy we've got several computing facilities hosted by NSC institutes, each optimized for the particular set of tasks, of which the largest are the NSU Supercomputer Center, Siberian Supercomputer Center (ICM&MG), and a Grid Computing Facility of BINP. Recendy a dedicated optical network with the initial bandwidth of 10 Gbps connecting these three facilities was built in order to make it possible to share the computing resources among the research communities of participating institutes, thus providing a common platform for building the computing infrastructure for various scientific projects. Unification of the computing infrastructure is achieved by extensive use of virtualization technologies based on XEN and KVM platforms. The solution implemented was tested thoroughly within the computing environment of KEDR detector experiment which is being carried out at BINP, and foreseen to be applied to the use cases of other HEP experiments in the upcoming future.

  15. Securing Cloud Infrastructure for High Performance Scientific Computations Using Cryptographic Techniques

    OpenAIRE

    Patra, G. K.; Nilotpal Chakraborty

    2014-01-01

    In today's scenario, a large scale of engineering and scientific applications requires high performance computation power in order to simulate various models. Scientific and Engineering models such as Climate Modeling, Weather Forecasting, Large Scale Ocean Modeling, Cyclone Prediction etc require parallel processing of data on high performance computing infrastructure. With the rise of cloud computing, it would be great if such high performance computations can be provided as a service to th...

  16. Ceph: A Scalable, High-Performance Distributed File System

    OpenAIRE

    Weil, Sage; Brandt, Scott A.; Miller, Ethan L; Long, Darrell D. E.; Maltzahn, Carlos

    2006-01-01

    We have developed Ceph, a distributed file system that provides excellent performance, reliability, and scala- bility. Ceph maximizes the separation between data and metadata management by replacing allocation ta- bles with a pseudo-random data distribution function (CRUSH) designed for heterogeneous and dynamic clus- ters of unreliable object storage devices (OSDs). We leverage device intelligence by distributing data replica- tion, failure detection and recovery to semi-autonomous ...

  17. High performance computing methods for the integration and analysis of biomedical data using SAS.

    Science.gov (United States)

    Brown, Justin R; Dinu, Valentin

    2013-12-01

    From microarrays and next generation sequencing to clinical records, the amount of biomedical data is growing at an exponential rate. Handling and analyzing these large amounts of data demands that computing power and methodologies keep pace. The goal of this paper is to illustrate how high performance computing methods in SAS can be easily implemented without the need of extensive computer programming knowledge or access to supercomputing clusters to help address the challenges posed by large biomedical datasets. We illustrate the utility of database connectivity, pipeline parallelism, multi-core parallel process and distributed processing across multiple machines. Simulation results are presented for parallel and distributed processing. Finally, a discussion of the costs and benefits of such methods compared to traditional HPC supercomputing clusters is given. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  18. Department of Energy research in utilization of high-performance computers

    Energy Technology Data Exchange (ETDEWEB)

    Buzbee, B.L.; Worlton, W.J.; Michael, G.; Rodrigue, G.

    1980-08-01

    Department of Energy (DOE) and other Government research laboratories depend on high-performance computer systems to accomplish their programmatic goals. As the most powerful computer systems become available, they are acquired by these laboratories so that advances can be made in their disciplines. These advances are often the result of added sophistication to numerical models, the execution of which is made possible by high-performance computer systems. However, high-performance computer systems have become increasingly complex, and consequently it has become increasingly difficult to realize their potential performance. The result is a need for research on issues related to the utilization of these systems. This report gives a brief description of high-performance computers, and then addresses the use of and future needs for high-performance computers within DOE, the growing complexity of applications within DOE, and areas of high-performance computer systems warranting research. 1 figure.

  19. RAPPORT: running scientific high-performance computing applications on the cloud

    National Research Council Canada - National Science Library

    Cohen, Jeremy; Filippis, Ioannis; Woodbridge, Mark; Bauer, Daniela; Hong, Neil Chue; Jackson, Mike; Butcher, Sarah; Colling, David; Darlington, John; Fuchs, Brian; Harvey, Matt

    2013-01-01

    Cloud computing infrastructure is now widely used in many domains, but one area where there has been more limited adoption is research computing, in particular for running scientific high-performance computing (HPC) software...

  20. RAPPORT: running scientific high-performance computing applications on the cloud

    National Research Council Canada - National Science Library

    Jeremy Cohen; Ioannis Filippis; Mark Woodbridge; Daniela Bauer; Neil Chue Hong; Mike Jackson; Sarah Butcher; David Colling; John Darlington; Brian Fuchs; Matt Harvey

    Cloud computing infrastructure is now widely used in many domains, but one area where there has been more limited adoption is research computing, in particular for running scientific high-performance computing (HPC) software...

  1. Benchmark Numerical Toolkits for High Performance Computing Project

    Data.gov (United States)

    National Aeronautics and Space Administration — Computational codes in physics and engineering often use implicit solution algorithms that require linear algebra tools such as Ax=b solvers, eigenvalue,...

  2. Scalability of DL_POLY on High Performance Computing Platform

    CSIR Research Space (South Africa)

    Mabakane, Mabule S

    2017-12-01

    Full Text Available running on up to 100 processors (cores) (W. Smith & Todorov, 2006). DL_POLY_3 (including 3.09) utilises a static/equi-spacial Domain Decomposition parallelisation strategy in which the simulation cell (comprising the atoms, ions or molecules) is divided..., 1997; Lange et al., 2011). Traditionally, it is expected that codes should scale linearly when one increases computational resources such as compute nodes or servers (Chamberlain, Chace, & Patil, 1998; Gropp & Snir, 2009). However, several studies...

  3. High Performance Computing Innovation Service Portal Study (HPC-ISP)

    Science.gov (United States)

    2009-04-01

    based electronic commerce interface for the goods and services available through the brokerage service. This infrastructure will also support the... electronic commerce backend functionality for third parties that want to sell custom computing services. • Tailored Industry Portals are web portals for...broker shown in Figure 8 is essentially a web server that provides remote access to computing and software resources through an electronic commerce

  4. Full custom VLSI - A technology for high performance computing

    Science.gov (United States)

    Maki, Gary K.; Whitaker, Sterling R.

    1990-01-01

    Full custom VLSI is presented as a viable technology for addressing the need for the computing capabilities required for the real-time health monitoring of spacecraft systems. This technology presents solutions that cannot be realized with stored program computers or semicustom VLSI; also, it is not dependent on current IC processes. It is argued that, while design time is longer, full custom VLSI produces the fastest and densest VLSI solution and that high density normally also yields low manufacturing costs.

  5. Cooperative high-performance storage in the accelerated strategic computing initiative

    Science.gov (United States)

    Gary, Mark; Howard, Barry; Louis, Steve; Minuzzo, Kim; Seager, Mark

    1996-01-01

    The use and acceptance of new high-performance, parallel computing platforms will be impeded by the absence of an infrastructure capable of supporting orders-of-magnitude improvement in hierarchical storage and high-speed I/O (Input/Output). The distribution of these high-performance platforms and supporting infrastructures across a wide-area network further compounds this problem. We describe an architectural design and phased implementation plan for a distributed, Cooperative Storage Environment (CSE) to achieve the necessary performance, user transparency, site autonomy, communication, and security features needed to support the Accelerated Strategic Computing Initiative (ASCI). ASCI is a Department of Energy (DOE) program attempting to apply terascale platforms and Problem-Solving Environments (PSEs) toward real-world computational modeling and simulation problems. The ASCI mission must be carried out through a unified, multilaboratory effort, and will require highly secure, efficient access to vast amounts of data. The CSE provides a logically simple, geographically distributed, storage infrastructure of semi-autonomous cooperating sites to meet the strategic ASCI PSE goal of highperformance data storage and access at the user desktop.

  6. High performance computing system for flight simulation at NASA Langley

    Science.gov (United States)

    Cleveland, Jeff I., II; Sudik, Steven J.; Grove, Randall D.

    1991-01-01

    The computer architecture and components used in the NASA Langley Advanced Real-Time Simulation System (ARTSS) are briefly described and illustrated with diagrams and graphs. Particular attention is given to the advanced Convex C220 processing units, the UNIX-based operating system, the software interface to the fiber-optic-linked Computer Automated Measurement and Control system, configuration-management and real-time supervisor software, ARTSS hardware modifications, and the current implementation status. Simulation applications considered include the Transport Systems Research Vehicle, the Differential Maneuvering Simulator, the General Aviation Simulator, and the Visual Motion Simulator.

  7. High Performance Computing Assets for Ocean Acoustics Research

    Science.gov (United States)

    2016-11-18

    can be tested against theory , practicality of computational methods can be determined, and studies of underwater acoustics phenomena can be...our models. APPROACH As outlined above, each processor in the acoustic calculation must access a large amount of memory at tills time. This is

  8. High Performance Computing Modernization Program Kerberos Throughput Test Report

    Science.gov (United States)

    2017-10-26

    ORGANIZATION REPORT NUMBER 7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES) 10. SPONSOR / MONITOR’S ACRONYM(S)9. SPONSORING / MONITORING AGENCY...the high computing power of the main supercomputer. Each supercomputer is different in node architecture as well as hardware specifications. 2

  9. Computational Aeroscience at NAS and Future Trends in High Performance Computing

    Science.gov (United States)

    Cooper, David M.

    1994-01-01

    NASA's Numerical Aerodynamic Simulation (NAS) program provides a unique world class supercomputing capability which is readily accessed by the U.S. aeronautics community and offers them the opportunity to perform high speed computations and simulations for a broad range of aerospace research applications. In addition, the NAS program performs on-going research and advanced technology development to ensure the innovative application of newly emerging technologies to computational fluid dynamics and other important aeroscience disciplines. The NAS system and its interaction with industry are described. Furthermore, the results of an analysis of current and future technologies and their potential impact on high performance computing are also presented.

  10. High performance graphical data trending in a distributed system

    Science.gov (United States)

    Maureira, Cristián; Hoffstadt, Arturo; López, Joao; Troncoso, Nicolás; Tobar, Rodrigo; von Brand, Horst H.

    2010-07-01

    Trending near real-time data is a complex task, specially in distributed environments. This problem was typically tackled in financial and transaction systems, but it now applies to its utmost in other contexts, such as hardware monitoring in large-scale projects. Data handling requires subscription to specific data feeds that need to be implemented avoiding replication, and rate of transmission has to be assured. On the side of the graphical client, rendering needs to be fast enough so it may be perceived as real-time processing and display. ALMA Common Software (ACS) provides a software infrastructure for distributed projects which may require trending large volumes of data. For theses requirements ACS offers a Sampling System, which allows sampling selected data feeds at different frequencies. Along with this, it provides a graphical tool to plot the collected information, which needs to perform as well as possible. Currently there are many graphical libraries available for data trending. This imposes a problem when trying to choose one: It is necessary to know which has the best performance, and which combination of programming language and library is the best decision. This document analyzes the performance of different graphical libraries and languages in order to present the optimal environment when writing or re-factoring an application using trending technologies in distributed systems. To properly address the complexity of the problem, a specific set of alternative was pre-selected, including libraries in Java and Python, languages which are part of ACS. A stress benchmark will be developed in a simulated distributed environment using ACS in order to test the trending libraries.

  11. Detecting Distributed Scans Using High-Performance Query-DrivenVisualization

    Energy Technology Data Exchange (ETDEWEB)

    Stockinger, Kurt; Bethel, E. Wes; Campbell, Scott; Dart, Eli; Wu,Kesheng

    2006-09-01

    Modern forensic analytics applications, like network trafficanalysis, perform high-performance hypothesis testing, knowledgediscovery and data mining on very large datasets. One essential strategyto reduce the time required for these operations is to select only themost relevant data records for a given computation. In this paper, wepresent a set of parallel algorithms that demonstrate how an efficientselection mechanism -- bitmap indexing -- significantly speeds up acommon analysist ask, namely, computing conditional histogram on verylarge datasets. We present a thorough study of the performancecharacteristics of the parallel conditional histogram algorithms. Asacase study, we compute conditional histograms for detecting distributedscans hidden in a dataset consisting of approximately 2.5 billion networkconnection records. We show that these conditional histograms can becomputed on interactive timescale (i.e., in seconds). We also show how toprogressively modify the selection criteria to narrow the analysis andfind the sources of the distributed scans.

  12. Using High Performance Computing to Support Water Resource Planning

    Energy Technology Data Exchange (ETDEWEB)

    Groves, David G. [RAND Corporation, Santa Monica, CA (United States); Lembert, Robert J. [RAND Corporation, Santa Monica, CA (United States); May, Deborah W. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Leek, James R. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Syme, James [RAND Corporation, Santa Monica, CA (United States)

    2015-10-22

    In recent years, decision support modeling has embraced deliberation-withanalysis— an iterative process in which decisionmakers come together with experts to evaluate a complex problem and alternative solutions in a scientifically rigorous and transparent manner. Simulation modeling supports decisionmaking throughout this process; visualizations enable decisionmakers to assess how proposed strategies stand up over time in uncertain conditions. But running these simulation models over standard computers can be slow. This, in turn, can slow the entire decisionmaking process, interrupting valuable interaction between decisionmakers and analytics.

  13. Energy-efficient high performance computing measurement and tuning

    CERN Document Server

    III, James H Laros; Kelly, Sue

    2012-01-01

    In this work, the unique power measurement capabilities of the Cray XT architecture were exploited to gain an understanding of power and energy use, and the effects of tuning both CPU and network bandwidth. Modifications were made to deterministically halt cores when idle. Additionally, capabilities were added to alter operating P-state. At the application level, an understanding of the power requirements of a range of important DOE/NNSA production scientific computing applications running at large scale is gained by simultaneously collecting current and voltage measurements on the hosting nod

  14. High-performance computational solutions in protein bioinformatics

    CERN Document Server

    Mrozek, Dariusz

    2014-01-01

    Recent developments in computer science enable algorithms previously perceived as too time-consuming to now be efficiently used for applications in bioinformatics and life sciences. This work focuses on proteins and their structures, protein structure similarity searching at main representation levels and various techniques that can be used to accelerate similarity searches. Divided into four parts, the first part provides a formal model of 3D protein structures for functional genomics, comparative bioinformatics and molecular modeling. The second part focuses on the use of multithreading for

  15. Interactive Data Exploration for High-Performance Fluid Flow Computations through Porous Media

    KAUST Repository

    Perovic, Nevena

    2014-09-01

    © 2014 IEEE. Huge data advent in high-performance computing (HPC) applications such as fluid flow simulations usually hinders the interactive processing and exploration of simulation results. Such an interactive data exploration not only allows scientiest to \\'play\\' with their data but also to visualise huge (distributed) data sets in both an efficient and easy way. Therefore, we propose an HPC data exploration service based on a sliding window concept, that enables researches to access remote data (available on a supercomputer or cluster) during simulation runtime without exceeding any bandwidth limitations between the HPC back-end and the user front-end.

  16. Matrix element method for high performance computing platforms

    Science.gov (United States)

    Grasseau, G.; Chamont, D.; Beaudette, F.; Bianchini, L.; Davignon, O.; Mastrolorenzo, L.; Ochando, C.; Paganini, P.; Strebler, T.

    2015-12-01

    Lot of efforts have been devoted by ATLAS and CMS teams to improve the quality of LHC events analysis with the Matrix Element Method (MEM). Up to now, very few implementations try to face up the huge computing resources required by this method. We propose here a highly parallel version, combining MPI and OpenCL, which makes the MEM exploitation reachable for the whole CMS datasets with a moderate cost. In the article, we describe the status of two software projects under development, one focused on physics and one focused on computing. We also showcase their preliminary performance obtained with classical multi-core processors, CUDA accelerators and MIC co-processors. This let us extrapolate that with the help of 6 high-end accelerators, we should be able to reprocess the whole LHC run 1 within 10 days, and that we have a satisfying metric for the upcoming run 2. The future work will consist in finalizing a single merged system including all the physics and all the parallelism infrastructure, thus optimizing implementation for best hardware platforms.

  17. A high-performance brain-computer interface

    Science.gov (United States)

    Santhanam, Gopal; Ryu, Stephen I.; Yu, Byron M.; Afshar, Afsheen; Shenoy, Krishna V.

    2006-07-01

    Recent studies have demonstrated that monkeys and humans can use signals from the brain to guide computer cursors. Brain-computer interfaces (BCIs) may one day assist patients suffering from neurological injury or disease, but relatively low system performance remains a major obstacle. In fact, the speed and accuracy with which keys can be selected using BCIs is still far lower than for systems relying on eye movements. This is true whether BCIs use recordings from populations of individual neurons using invasive electrode techniques or electroencephalogram recordings using less- or non-invasive techniques. Here we present the design and demonstration, using electrode arrays implanted in monkey dorsal premotor cortex, of a manyfold higher performance BCI than previously reported. These results indicate that a fast and accurate key selection system, capable of operating with a range of keyboard sizes, is possible (up to 6.5 bits per second, or ~15wordsperminute, with 96 electrodes). The highest information throughput is achieved with unprecedentedly brief neural recordings, even as recording quality degrades over time. These performance results and their implications for system design should substantially increase the clinical viability of BCIs in humans.

  18. High performance computing in power and energy systems

    CERN Document Server

    Khaitan, Siddhartha Kumar

    2012-01-01

    The twin challenge of meeting global energy demands in the face of growing economies and populations and restricting greenhouse gas emissions is one of the most daunting ones that humanity has ever faced. Smart electrical generation and distribution infrastructure will play a crucial role in meeting these challenges. We would  need to develop capabilities to handle large volumes of data generated by the power system components like PMUs, DFRs and other data acquisition devices as well as by the capacity to process these data at high resolution via multi-scale and multi-period simulations, casc

  19. A Linear Algebra Framework for Static High Performance Fortran Code Distribution

    Directory of Open Access Journals (Sweden)

    Corinne Ancourt

    1997-01-01

    Full Text Available High Performance Fortran (HPF was developed to support data parallel programming for single-instruction multiple-data (SIMD and multiple-instruction multiple-data (MIMD machines with distributed memory. The programmer is provided a familiar uniform logical address space and specifies the data distribution by directives. The compiler then exploits these directives to allocate arrays in the local memories, to assign computations to elementary processors, and to migrate data between processors when required. We show here that linear algebra is a powerful framework to encode HPF directives and to synthesize distributed code with space-efficient array allocation, tight loop bounds, and vectorized communications for INDEPENDENT loops. The generated code includes traditional optimizations such as guard elimination, message vectorization and aggregation, and overlap analysis. The systematic use of an affine framework makes it possible to prove the compilation scheme correct.

  20. High performance computing and communications grand challenges program

    Energy Technology Data Exchange (ETDEWEB)

    Solomon, J.E.; Barr, A.; Chandy, K.M.; Goddard, W.A., III; Kesselman, C.

    1994-10-01

    The so-called protein folding problem has numerous aspects, however it is principally concerned with the {ital de novo} prediction of three-dimensional (3D) structure from the protein primary amino acid sequence, and with the kinetics of the protein folding process. Our current project focuses on the 3D structure prediction problem which has proved to be an elusive goal of molecular biology and biochemistry. The number of local energy minima is exponential in the number of amino acids in the protein. All current methods of 3D structure prediction attempt to alleviate this problem by imposing various constraints that effectively limit the volume of conformational space which must be searched. Our Grand Challenge project consists of two elements: (1) a hierarchical methodology for 3D protein structure prediction; and (2) development of a parallel computing environment, the Protein Folding Workbench, for carrying out a variety of protein structure prediction/modeling computations. During the first three years of this project, we are focusing on the use of two proteins selected from the Brookhaven Protein Data Base (PDB) of known structure to provide validation of our prediction algorithms and their software implementation, both serial and parallel. Both proteins, protein L from {ital peptostreptococcus magnus}, and {ital streptococcal} protein G, are known to bind to IgG, and both have an {alpha} {plus} {beta} sandwich conformation. Although both proteins bind to IgG, they do so at different sites on the immunoglobin and it is of considerable biological interest to understand structurally why this is so. 12 refs., 1 fig.

  1. Generalized Portable SHMEM Library for High Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Parzyszek, Krzysztof [Iowa State Univ., Ames, IA (United States)

    2003-01-01

    This dissertation describes the efforts to design and implement the Generalized Portable SHMEM library, GPSHMEM, as well as supplementary tools. There are two major components of the GPSHMEM project: the GPSHMEM library itself and the Fortran 77 source-to-source translator. The rest of this thesis is divided into two parts. Part I introduces the shared memory model and the distributed shared memory model. It explains the motivation behind GPSHMEM and presents its functionality and performance results. Part II is entirely devoted to the Fortran 77 translator call fgpp. The need for such a tool is demonstrated, functionality goals are stated, and the design issues are presented along with the development of the solutions.

  2. FY 1992 Blue Book: Grand Challenges: High Performance Computing and Communications

    Data.gov (United States)

    Networking and Information Technology Research and Development, Executive Office of the President — High performance computing and computer communications networks are becoming increasingly important to scientific advancement, economic competition, and national...

  3. FY 1993 Blue Book: Grand Challenges 1993: High Performance Computing and Communications

    Data.gov (United States)

    Networking and Information Technology Research and Development, Executive Office of the President — High performance computing and computer communications networks are becoming increasingly important to scientific advancement, economic competition, and national...

  4. A New Countermeasure against Brute-Force Attacks That Use High Performance Computers for Big Data Analysis

    OpenAIRE

    Hyun-Ju Jo; Ji Won Yoon

    2015-01-01

    Using high performance parallel and distributed computing systems, we can collect, generate, handle, and transmit ever-increasing amounts of data. However, these technical advancements also allow malicious individuals to obtain high computational power to attack cryptosystems. Traditional cryptosystem countermeasures have been somewhat passive in response to this change, because they simply increase computational costs by increasing key lengths. Cryptosystems that use the conventional counter...

  5. Analysis and modeling of social influence in high performance computing workloads

    KAUST Repository

    Zheng, Shuai

    2011-01-01

    Social influence among users (e.g., collaboration on a project) creates bursty behavior in the underlying high performance computing (HPC) workloads. Using representative HPC and cluster workload logs, this paper identifies, analyzes, and quantifies the level of social influence across HPC users. We show the existence of a social graph that is characterized by a pattern of dominant users and followers. This pattern also follows a power-law distribution, which is consistent with those observed in mainstream social networks. Given its potential impact on HPC workloads prediction and scheduling, we propose a fast-converging, computationally-efficient online learning algorithm for identifying social groups. Extensive evaluation shows that our online algorithm can (1) quickly identify the social relationships by using a small portion of incoming jobs and (2) can efficiently track group evolution over time. © 2011 Springer-Verlag.

  6. High-Performance Compute Infrastructure in Astronomy: 2020 Is Only Months Away

    Science.gov (United States)

    Berriman, B.; Deelman, E.; Juve, G.; Rynge, M.; Vöckler, J. S.

    2012-09-01

    , and so the costs of running applications vary widely according to how they use resources. The cloud is well suited to processing CPU-bound (and memory bound) workflows such as the periodogram code, given the relatively low cost of processing in comparison with I/O operations. I/O-bound applications such as Montage perform best on high-performance clusters with fast networks and parallel file-systems. Science-driven Cyberinfrastructure: Montage has been widely used as a driver application to develop workflow management services, such as task scheduling in distributed environments, designing fault tolerance techniques for job schedulers, and developing workflow orchestration techniques. Running Parallel Applications Across Distributed Cloud Environments: Data processing will eventually take place in parallel distributed across cyber infrastructure environments having different architectures. We have used the Pegasus Work Management System (WMS) to successfully run applications across three very different environments: TeraGrid, OSG (Open Science Grid), and FutureGrid. Provisioning resources across different grids and clouds (also referred to as Sky Computing), involves establishing a distributed environment, where issues of, e.g, remote job submission, data management, and security need to be addressed. This environment also requires building virtual machine images that can run in different environments. Usually, each cloud provides basic images that can be customized with additional software and services. In most of our work, we provisioned compute resources using a custom application, called Wrangler. Pegasus WMS abstracts the architectures of the compute environments away from the end-user, and can be considered a first-generation tool suitable for scientists to run their applications on disparate environments.

  7. Demonstration of Cost-Effective, High-Performance Computing at Performance and Reliability Levels Equivalent to a 1994 Vector Supercomputer

    Science.gov (United States)

    Babrauckas, Theresa

    2000-01-01

    The Affordable High Performance Computing (AHPC) project demonstrated that high-performance computing based on a distributed network of computer workstations is a cost-effective alternative to vector supercomputers for running CPU and memory intensive design and analysis tools. The AHPC project created an integrated system called a Network Supercomputer. By connecting computer work-stations through a network and utilizing the workstations when they are idle, the resulting distributed-workstation environment has the same performance and reliability levels as the Cray C90 vector Supercomputer at less than 25 percent of the C90 cost. In fact, the cost comparison between a Cray C90 Supercomputer and Sun workstations showed that the number of distributed networked workstations equivalent to a C90 costs approximately 8 percent of the C90.

  8. High-performance computing and networking as tools for accurate emission computed tomography reconstruction

    Energy Technology Data Exchange (ETDEWEB)

    Passeri, A. [Dipartimento di Fisiopatologia Clinica - Sezione di Medicina Nucleare, Universita` di Firenze (Italy); Formiconi, A.R. [Dipartimento di Fisiopatologia Clinica - Sezione di Medicina Nucleare, Universita` di Firenze (Italy); De Cristofaro, M.T.E.R. [Dipartimento di Fisiopatologia Clinica - Sezione di Medicina Nucleare, Universita` di Firenze (Italy); Pupi, A. [Dipartimento di Fisiopatologia Clinica - Sezione di Medicina Nucleare, Universita` di Firenze (Italy); Meldolesi, U. [Dipartimento di Fisiopatologia Clinica - Sezione di Medicina Nucleare, Universita` di Firenze (Italy)

    1997-04-01

    It is well known that the quantitative potential of emission computed tomography (ECT) relies on the ability to compensate for resolution, attenuation and scatter effects. Reconstruction algorithms which are able to take these effects into account are highly demanding in terms of computing resources. The reported work aimed to investigate the use of a parallel high-performance computing platform for ECT reconstruction taking into account an accurate model of the acquisition of single-photon emission tomographic (SPET) data. An iterative algorithm with an accurate model of the variable system response was ported on the MIMD (Multiple Instruction Multiple Data) parallel architecture of a 64-node Cray T3D massively parallel computer. The system was organized to make it easily accessible even from low-cost PC-based workstations through standard TCP/IP networking. A complete brain study of 30 (64 x 64) slices could be reconstructed from a set of 90 (64 x 64) projections with ten iterations of the conjugate gradients algorithm in 9 s, corresponding to an actual speed-up factor of 135. This work demonstrated the possibility of exploiting remote high-performance computing and networking resources from hospital sites by means of low-cost workstations using standard communication protocols without particular problems for routine use. The achievable speed-up factors allow the assessment of the clinical benefit of advanced reconstruction techniques which require a heavy computational burden for the compensation effects such as variable spatial resolution, scatter and attenuation. The possibility of using the same software on the same hardware platform with data acquired in different laboratories with various kinds of SPET instrumentation is appealing for software quality control and for the evaluation of the clinical impact of the reconstruction methods. (orig.). With 4 figs., 1 tab.

  9. CUDA/GPU Technology : Parallel Programming For High Performance Scientific Computing

    OpenAIRE

    YUHENDRA; KUZE, Hiroaki; JOSAPHAT, Tetuko Sri Sumantyo

    2009-01-01

    [ABSTRACT]Graphics processing units (GP Us) originally designed for computer video cards have emerged as the most powerful chip in a high-performance workstation. In the high performance computation capabilities, graphic processing units (GPU) lead to much more powerful performance than conventional CPUs by means of parallel processing. In 2007, the birth of Compute Unified Device Architecture (CUDA) and CUDA-enabled GPUs by NVIDIA Corporation brought a revolution in the general purpose GPU a...

  10. High performance computing for three-dimensional agent-based molecular models.

    Science.gov (United States)

    Pérez-Rodríguez, G; Pérez-Pérez, M; Fdez-Riverola, F; Lourenço, A

    2016-07-01

    Agent-based simulations are increasingly popular in exploring and understanding cellular systems, but the natural complexity of these systems and the desire to grasp different modelling levels demand cost-effective simulation strategies and tools. In this context, the present paper introduces novel sequential and distributed approaches for the three-dimensional agent-based simulation of individual molecules in cellular events. These approaches are able to describe the dimensions and position of the molecules with high accuracy and thus, study the critical effect of spatial distribution on cellular events. Moreover, two of the approaches allow multi-thread high performance simulations, distributing the three-dimensional model in a platform independent and computationally efficient way. Evaluation addressed the reproduction of molecular scenarios and different scalability aspects of agent creation and agent interaction. The three approaches simulate common biophysical and biochemical laws faithfully. The distributed approaches show improved performance when dealing with large agent populations while the sequential approach is better suited for small to medium size agent populations. Overall, the main new contribution of the approaches is the ability to simulate three-dimensional agent-based models at the molecular level with reduced implementation effort and moderate-level computational capacity. Since these approaches have a generic design, they have the major potential of being used in any event-driven agent-based tool. Copyright © 2016 Elsevier Inc. All rights reserved.

  11. Department of Energy Mathematical, Information, and Computational Sciences Division: High Performance Computing and Communications Program

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1996-11-01

    This document is intended to serve two purposes. Its first purpose is that of a program status report of the considerable progress that the Department of Energy (DOE) has made since 1993, the time of the last such report (DOE/ER-0536, The DOE Program in HPCC), toward achieving the goals of the High Performance Computing and Communications (HPCC) Program. The second purpose is that of a summary report of the many research programs administered by the Mathematical, Information, and Computational Sciences (MICS) Division of the Office of Energy Research under the auspices of the HPCC Program and to provide, wherever relevant, easy access to pertinent information about MICS-Division activities via universal resource locators (URLs) on the World Wide Web (WWW).

  12. Department of Energy: MICS (Mathematical Information, and Computational Sciences Division). High performance computing and communications program

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1996-06-01

    This document is intended to serve two purposes. Its first purpose is that of a program status report of the considerable progress that the Department of Energy (DOE) has made since 1993, the time of the last such report (DOE/ER-0536, {open_quotes}The DOE Program in HPCC{close_quotes}), toward achieving the goals of the High Performance Computing and Communications (HPCC) Program. The second purpose is that of a summary report of the many research programs administered by the Mathematical, Information, and Computational Sciences (MICS) Division of the Office of Energy Research under the auspices of the HPCC Program and to provide, wherever relevant, easy access to pertinent information about MICS-Division activities via universal resource locators (URLs) on the World Wide Web (WWW). The information pointed to by the URL is updated frequently, and the interested reader is urged to access the WWW for the latest information.

  13. Calculating CyberShake Map 1.0 on Shared Open Science High Performance Computing Resources

    Science.gov (United States)

    Callaghan, S.; Maechling, P. J.; Deelman, E.; Small, P.; Vahi, K.; Mehta, G.; Juve, G.; Milner, K.; Graves, R. W.; Field, E. H.; Okaya, D. A.; Jordan, T. H.

    2009-12-01

    Current Probabilistic Seismic Hazard Analysis (PSHA) calculations produce predictive seismic hazard curves using an earthquake rupture forecast (ERF) and a ground motion prediction equation (GMPE) that defines how ground motions decay with distance from an earthquake. Traditionally, GMPEs are empirically-based attenuation models. However, these GMPEs have important limitations. The observational data used to develop the attenuation relationships do not cover the full range of possible earthquake magnitudes. These GMPEs predict only peak ground motions, and do not produce ground motion time series. Phenomena such as rupture directivity and basin effects may not be well captured with these GMPEs. To improve the accuracy of PSHA calculations, researchers at the Southern California Earthquake Center (SCEC) are performing physics-based PSHA using the CyberShake project. CyberShake utilizes full 3D waveform modeling as the GMPE to compute PSHA hazard curves for various sites in Southern California. For each rupture in the Uniform California Earthquake Rupture Forecast (UCERF) 2.0, we capture rupture variability by varying the hypocenter and slip distribution to produce about 410,000 different events (“rupture variations”) per site of interest. We calculate strain Green’s tensors for each site and use seismic reciprocity to compute the intensity measure of interest for each rupture variation. These intensity measures are then synthesized into a hazard curve, relating shaking levels to probability of exceedance. The goal of CyberShake is to calculate more accurate hazard curves using high performance computing techniques. A set of hazard curves, computed for many sites in a region, can be integrated to produce a regional hazard map. In 2009, a series of CyberShake calculations, called the CyberShake 1.0 Map calculation, were performed using distributed resources on Texas Advanced Computing Center’s Ranger system, part of the NSF-funded TeraGrid, and the University

  14. Scientific Grand Challenges: Forefront Questions in Nuclear Science and the Role of High Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Khaleel, Mohammad A.

    2009-10-01

    This report is an account of the deliberations and conclusions of the workshop on "Forefront Questions in Nuclear Science and the Role of High Performance Computing" held January 26-28, 2009, co-sponsored by the U.S. Department of Energy (DOE) Office of Nuclear Physics (ONP) and the DOE Office of Advanced Scientific Computing (ASCR). Representatives from the national and international nuclear physics communities, as well as from the high performance computing community, participated. The purpose of this workshop was to 1) identify forefront scientific challenges in nuclear physics and then determine which-if any-of these could be aided by high performance computing at the extreme scale; 2) establish how and why new high performance computing capabilities could address issues at the frontiers of nuclear science; 3) provide nuclear physicists the opportunity to influence the development of high performance computing; and 4) provide the nuclear physics community with plans for development of future high performance computing capability by DOE ASCR.

  15. Parameters that affect parallel processing for computational electromagnetic simulation codes on high performance computing clusters

    Science.gov (United States)

    Moon, Hongsik

    changing computer hardware platforms in order to provide fast, accurate and efficient solutions to large, complex electromagnetic problems. The research in this dissertation proves that the performance of parallel code is intimately related to the configuration of the computer hardware and can be maximized for different hardware platforms. To benchmark and optimize the performance of parallel CEM software, a variety of large, complex projects are created and executed on a variety of computer platforms. The computer platforms used in this research are detailed in this dissertation. The projects run as benchmarks are also described in detail and results are presented. The parameters that affect parallel CEM software on High Performance Computing Clusters (HPCC) are investigated. This research demonstrates methods to maximize the performance of parallel CEM software code.

  16. FY 1996 Blue Book: High Performance Computing and Communications: Foundations for America`s Information Future

    Data.gov (United States)

    Networking and Information Technology Research and Development, Executive Office of the President — The Federal High Performance Computing and Communications HPCC Program will celebrate its fifth anniversary in October 1996 with an impressive array of...

  17. Workshop on programming languages for high performance computing (HPCWPL): final report.

    Energy Technology Data Exchange (ETDEWEB)

    Murphy, Richard C.

    2007-05-01

    This report summarizes the deliberations and conclusions of the Workshop on Programming Languages for High Performance Computing (HPCWPL) held at the Sandia CSRI facility in Albuquerque, NM on December 12-13, 2006.

  18. Comprehensive Simulation Lifecycle Management for High Performance Computing Modeling and Simulation Project

    Data.gov (United States)

    National Aeronautics and Space Administration — There are significant logistical barriers to entry-level high performance computing (HPC) modeling and simulation (M&S) users. Performing large-scale, massively...

  19. FY 1997 Blue Book: High Performance Computing and Communications: Advancing the Frontiers of Information Technology

    Data.gov (United States)

    Networking and Information Technology Research and Development, Executive Office of the President — The Federal High Performance Computing and Communications HPCC Program will celebrate its fifth anniversary in October 1996 with an impressive array of...

  20. Optical high-performance computing: introduction to the JOSA A and Applied Optics feature.

    Science.gov (United States)

    Caulfield, H John; Dolev, Shlomi; Green, William M J

    2009-08-01

    The feature issues in both Applied Optics and the Journal of the Optical Society of America A focus on topics of immediate relevance to the community working in the area of optical high-performance computing.

  1. An Embedded System for applying High Performance Computing in Educational Learning Activity

    Directory of Open Access Journals (Sweden)

    Irene Erlyn Wina Rachmawan

    2016-08-01

    Full Text Available HPC (High Performance Computing has become more popular in the last few years. With the benefits on high computational power, HPC has impact on industry, scientific research and educational activities. Implementing HPC as a curriculum in universities could be consuming a lot of resources because well-known HPC system are using Personal Computer or Server. By using PC as the practical moduls it is need great resources and spaces.  This paper presents an innovative high performance computing cluster system to support education learning activities in HPC course with small size, low cost, and yet powerful enough. In recent years, High Performance computing usually implanted in cluster computing and require high specification computer and expensive cost. It is not efficient applying High Performance Computing in Educational research activiry such as learning in Class. Therefore, our proposed system is created with inexpensive component by using Embedded System to make High Performance Computing applicable for leaning in the class. Students involved in the construction of embedded system, built clusters from basic embedded and network components, do benchmark performance, and implement simple parallel case using the cluster.  In this research we performed evaluation of embedded systems comparing with i5 PC, the results of our embedded system performance of NAS benchmark are similar with i5 PCs. We also conducted surveys about student learning satisfaction that with embedded system students are able to learn about HPC from building the system until making an application that use HPC system.

  2. DPR-tree: a distributed parallel spatial index structure for high performance spatial databases

    Science.gov (United States)

    Zhou, Yan; Zhu, Qing; Liu, Qiang

    2008-12-01

    Parallelism of spatial index could significantly improve the performance of spatial queries, special for massive spatial databases, so the research of parallel spatial index takes a important role in high performance spatial databases. Existing parallel spatial index methods have two main shortcoming: one is accessing hotspot and bottleneck of index items located in main server, the other is high costs and complicated operations for maintaining index consistency. Aim at these, a distributed parallel spatial index structure called DPR-tree is proposed. It splits whole index region into partition sub-regions by using Hilbert space-filling curve grid and organizes index sub-regions according to locality of spatial objects, then maps index sub-regions to partition sub-regions and assigns these index sub-regions to different computer nodes by a appointed map function, Each computer node manages a multi-level distributed sub-Rtree which is built from a index sub-region. Our experimental results indicate that the proposed parallel spatial index can achieve speedup well and offer significant potential for reducing query response time.

  3. A Convolve-And-MErge Approach for Exact Computations on High-Performance Reconfigurable Computers

    Directory of Open Access Journals (Sweden)

    Esam El-Araby

    2012-01-01

    Full Text Available This work presents an approach for accelerating arbitrary-precision arithmetic on high-performance reconfigurable computers (HPRCs. Although faster and smaller, fixed-precision arithmetic has inherent rounding and overflow problems that can cause errors in scientific or engineering applications. This recurring phenomenon is usually referred to as numerical nonrobustness. Therefore, there is an increasing interest in the paradigm of exact computation, based on arbitrary-precision arithmetic. There are a number of libraries and/or languages supporting this paradigm, for example, the GNU multiprecision (GMP library. However, the performance of computations is significantly reduced in comparison to that of fixed-precision arithmetic. In order to reduce this performance gap, this paper investigates the acceleration of arbitrary-precision arithmetic on HPRCs. A Convolve-And-MErge approach is proposed, that implements virtual convolution schedules derived from the formal representation of the arbitrary-precision multiplication problem. Additionally, dynamic (nonlinear pipeline techniques are also exploited in order to achieve speedups ranging from 5x (addition to 9x (multiplication, while keeping resource usage of the reconfigurable device low, ranging from 11% to 19%.

  4. Resilient and Robust High Performance Computing Platforms for Scientific Computing Integrity

    Energy Technology Data Exchange (ETDEWEB)

    Jin, Yier [Univ. of Central Florida, Orlando, FL (United States)

    2017-07-14

    As technology advances, computer systems are subject to increasingly sophisticated cyber-attacks that compromise both their security and integrity. High performance computing platforms used in commercial and scientific applications involving sensitive, or even classified data, are frequently targeted by powerful adversaries. This situation is made worse by a lack of fundamental security solutions that both perform efficiently and are effective at preventing threats. Current security solutions fail to address the threat landscape and ensure the integrity of sensitive data. As challenges rise, both private and public sectors will require robust technologies to protect its computing infrastructure. The research outcomes from this project try to address all these challenges. For example, we present LAZARUS, a novel technique to harden kernel Address Space Layout Randomization (KASLR) against paging-based side-channel attacks. In particular, our scheme allows for fine-grained protection of the virtual memory mappings that implement the randomization. We demonstrate the effectiveness of our approach by hardening a recent Linux kernel with LAZARUS, mitigating all of the previously presented side-channel attacks on KASLR. Our extensive evaluation shows that LAZARUS incurs only 0.943% overhead for standard benchmarks, and is therefore highly practical. We also introduced HA2lloc, a hardware-assisted allocator that is capable of leveraging an extended memory management unit to detect memory errors in the heap. We also perform testing using HA2lloc in a simulation environment and find that the approach is capable of preventing common memory vulnerabilities.

  5. High Performance Computing Facility Operational Assessment, FY 2010 Oak Ridge Leadership Computing Facility

    Energy Technology Data Exchange (ETDEWEB)

    Bland, Arthur S Buddy [ORNL; Hack, James J [ORNL; Baker, Ann E [ORNL; Barker, Ashley D [ORNL; Boudwin, Kathlyn J. [ORNL; Kendall, Ricky A [ORNL; Messer, Bronson [ORNL; Rogers, James H [ORNL; Shipman, Galen M [ORNL; White, Julia C [ORNL

    2010-08-01

    Oak Ridge National Laboratory's (ORNL's) Cray XT5 supercomputer, Jaguar, kicked off the era of petascale scientific computing in 2008 with applications that sustained more than a thousand trillion floating point calculations per second - or 1 petaflop. Jaguar continues to grow even more powerful as it helps researchers broaden the boundaries of knowledge in virtually every domain of computational science, including weather and climate, nuclear energy, geosciences, combustion, bioenergy, fusion, and materials science. Their insights promise to broaden our knowledge in areas that are vitally important to the Department of Energy (DOE) and the nation as a whole, particularly energy assurance and climate change. The science of the 21st century, however, will demand further revolutions in computing, supercomputers capable of a million trillion calculations a second - 1 exaflop - and beyond. These systems will allow investigators to continue attacking global challenges through modeling and simulation and to unravel longstanding scientific questions. Creating such systems will also require new approaches to daunting challenges. High-performance systems of the future will need to be codesigned for scientific and engineering applications with best-in-class communications networks and data-management infrastructures and teams of skilled researchers able to take full advantage of these new resources. The Oak Ridge Leadership Computing Facility (OLCF) provides the nation's most powerful open resource for capability computing, with a sustainable path that will maintain and extend national leadership for DOE's Office of Science (SC). The OLCF has engaged a world-class team to support petascale science and to take a dramatic step forward, fielding new capabilities for high-end science. This report highlights the successful delivery and operation of a petascale system and shows how the OLCF fosters application development teams, developing cutting-edge tools

  6. High Performance Computing (HPC) Innovation Service Portal Pilots Cloud Computing (HPC-ISP Pilot Cloud Computing)

    Science.gov (United States)

    2011-08-01

    connection. The login node runs Redhat Enterprise Linux (RHEL) 5, and has several responsibilities. It manages the Ghost licenses, runs the compute node...Center PaaS Platform as a service PNCC Power node control center QDR Quad data rate RHEL Redhat Enterprise Linux ROI Return on investment SaaS

  7. Process for selecting NEAMS applications for access to Idaho National Laboratory high performance computing resources

    Energy Technology Data Exchange (ETDEWEB)

    Michael Pernice

    2010-09-01

    INL has agreed to provide participants in the Nuclear Energy Advanced Mod- eling and Simulation (NEAMS) program with access to its high performance computing (HPC) resources under sponsorship of the Enabling Computational Technologies (ECT) program element. This report documents the process used to select applications and the software stack in place at INL.

  8. The DOE Program in HPCC: High-Performance Computing and Communications.

    Science.gov (United States)

    Department of Energy, Washington, DC. Office of Energy Research.

    This document reports to Congress on the progress that the Department of Energy has made in 1992 toward achieving the goals of the High Performance Computing and Communications (HPCC) program. Its second purpose is to provide a picture of the many programs administered by the Office of Scientific Computing under the auspices of the HPCC program.…

  9. High-performance computing on GPUs for resistivity logging of oil and gas wells

    Science.gov (United States)

    Glinskikh, V.; Dudaev, A.; Nechaev, O.; Surodina, I.

    2017-10-01

    We developed and implemented into software an algorithm for high-performance simulation of electrical logs from oil and gas wells using high-performance heterogeneous computing. The numerical solution of the 2D forward problem is based on the finite-element method and the Cholesky decomposition for solving a system of linear algebraic equations (SLAE). Software implementations of the algorithm used the NVIDIA CUDA technology and computing libraries are made, allowing us to perform decomposition of SLAE and find its solution on central processor unit (CPU) and graphics processor unit (GPU). The calculation time is analyzed depending on the matrix size and number of its non-zero elements. We estimated the computing speed on CPU and GPU, including high-performance heterogeneous CPU-GPU computing. Using the developed algorithm, we simulated resistivity data in realistic models.

  10. Analysis and Modeling of Social In uence in High Performance Computing Workloads

    KAUST Repository

    Zheng, Shuai

    2011-06-01

    High Performance Computing (HPC) is becoming a common tool in many research areas. Social influence (e.g., project collaboration) among increasing users of HPC systems creates bursty behavior in underlying workloads. This bursty behavior is increasingly common with the advent of grid computing and cloud computing. Mining the user bursty behavior is important for HPC workloads prediction and scheduling, which has direct impact on overall HPC computing performance. A representative work in this area is the Mixed User Group Model (MUGM), which clusters users according to the resource demand features of their submissions, such as duration time and parallelism. However, MUGM has some difficulties when implemented in real-world system. First, representing user behaviors by the features of their resource demand is usually difficult. Second, these features are not always available. Third, measuring the similarities among users is not a well-defined problem. In this work, we propose a Social Influence Model (SIM) to identify, analyze, and quantify the level of social influence across HPC users. The advantage of the SIM model is that it finds HPC communities by analyzing user job submission time, thereby avoiding the difficulties of MUGM. An offline algorithm and a fast-converging, computationally-efficient online learning algorithm for identifying social groups are proposed. Both offline and online algorithms are applied on several HPC and grid workloads, including Grid 5000, EGEE 2005 and 2007, and KAUST Supercomputing Lab (KSL) BGP data. From the experimental results, we show the existence of a social graph, which is characterized by a pattern of dominant users and followers. In order to evaluate the effectiveness of identified user groups, we show the pattern discovered by the offline algorithm follows a power-law distribution, which is consistent with those observed in mainstream social networks. We finally conclude the thesis and discuss future directions of our work.

  11. High Performance Computing Facility Operational Assessment 2015: Oak Ridge Leadership Computing Facility

    Energy Technology Data Exchange (ETDEWEB)

    Barker, Ashley D. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Bernholdt, David E. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Bland, Arthur S. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Gary, Jeff D. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Hack, James J. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; McNally, Stephen T. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Rogers, James H. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Smith, Brian E. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Straatsma, T. P. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Sukumar, Sreenivas Rangan [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Thach, Kevin G. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Tichenor, Suzy [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Vazhkudai, Sudharshan S. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Wells, Jack C. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility

    2016-03-01

    Oak Ridge National Laboratory’s (ORNL’s) Leadership Computing Facility (OLCF) continues to surpass its operational target goals: supporting users; delivering fast, reliable systems; creating innovative solutions for high-performance computing (HPC) needs; and managing risks, safety, and security aspects associated with operating one of the most powerful computers in the world. The results can be seen in the cutting-edge science delivered by users and the praise from the research community. Calendar year (CY) 2015 was filled with outstanding operational results and accomplishments: a very high rating from users on overall satisfaction that ties the highest-ever mark set in CY 2014; the greatest number of core-hours delivered to research projects; the largest percentage of capability usage since the OLCF began tracking the metric in 2009; and success in delivering on the allocation of 60, 30, and 10% of core hours offered for the INCITE (Innovative and Novel Computational Impact on Theory and Experiment), ALCC (Advanced Scientific Computing Research Leadership Computing Challenge), and Director’s Discretionary programs, respectively. These accomplishments, coupled with the extremely high utilization rate, represent the fulfillment of the promise of Titan: maximum use by maximum-size simulations. The impact of all of these successes and more is reflected in the accomplishments of OLCF users, with publications this year in notable journals Nature, Nature Materials, Nature Chemistry, Nature Physics, Nature Climate Change, ACS Nano, Journal of the American Chemical Society, and Physical Review Letters, as well as many others. The achievements included in the 2015 OLCF Operational Assessment Report reflect first-ever or largest simulations in their communities; for example Titan enabled engineers in Los Angeles and the surrounding region to design and begin building improved critical infrastructure by enabling the highest-resolution Cybershake map for Southern

  12. Towards a high performance parallel library to compute fluid and flexible structures interactions

    Science.gov (United States)

    Nagar, Prateek

    LBM-IB method is useful and popular simulation technique that is adopted ubiquitously to solve Fluid-Structure interaction problems in computational fluid dynamics. These problems are known for utilizing computing resources intensively while solving mathematical equations involved in simulations. Problems involving such interactions are omnipresent, therefore, it is eminent that a faster and accurate algorithm exists for solving these equations, to reproduce a real-life model of such complex analytical problems in a shorter time period. LBM-IB being inherently parallel, proves to be an ideal candidate for developing a parallel software. This research focuses on developing a parallel software library, LBM-IB based on the algorithm proposed by [1] which is first of its kind that utilizes the high performance computing abilities of supercomputers procurable today. An initial sequential version of LBM-IB is developed that is used as a benchmark for correctness and performance evaluation of shared memory parallel versions. Two shared memory parallel versions of LBM-IB have been developed using OpenMP and Pthread library respectively. The OpenMP version is able to scale well enough, as good as 83% speedup on multicore machines for 8 cores. Based on the profiling and instrumentation done on this version, to improve the data-locality and increase the degree of parallelism, Pthread based data centric version is developed which is able to outperform the OpenMP version by 53% on manycore machines. A distributed version using the MPI interfaces on top of the cube based Pthread version has also been designed to be used by extreme scale distributed memory manycore systems.

  13. Issues in undergraduate education in computational science and high performance computing

    Energy Technology Data Exchange (ETDEWEB)

    Marchioro, T.L. II; Martin, D. [Ames Lab., IA (United States)

    1994-12-31

    The ever increasing need for mathematical and computational literacy within their society and among members of the work force has generated enormous pressure to revise and improve the teaching of related subjects throughout the curriculum, particularly at the undergraduate level. The Calculus Reform movement is perhaps the best known example of an organized initiative in this regard. The UCES (Undergraduate Computational Engineering and Science) project, an effort funded by the Department of Energy and administered through the Ames Laboratory, is sponsoring an informal and open discussion of the salient issues confronting efforts to improve and expand the teaching of computational science as a problem oriented, interdisciplinary approach to scientific investigation. Although the format is open, the authors hope to consider pertinent questions such as: (1) How can faculty and research scientists obtain the recognition necessary to further excellence in teaching the mathematical and computational sciences? (2) What sort of educational resources--both hardware and software--are needed to teach computational science at the undergraduate level? Are traditional procedural languages sufficient? Are PCs enough? Are massively parallel platforms needed? (3) How can electronic educational materials be distributed in an efficient way? Can they be made interactive in nature? How should such materials be tied to the World Wide Web and the growing ``Information Superhighway``?

  14. Possibility of high performance quantum computation by superluminal evanescent photons in living systems.

    Science.gov (United States)

    Musha, Takaaki

    2009-06-01

    Penrose and Hameroff have suggested that microtubules in living systems function as quantum computers by utilizing evanescent photons. On the basis of the theorem that the evanescent photon is a superluminal particle, the possibility of high performance computation in living systems has been studied. From the theoretical analysis, it is shown that the biological brain can achieve large quantum bits computation compared with the conventional processors at room temperature.

  15. Guest Editorial High Performance Computing (HPC) Applications for a More Resilient and Efficient Power Grid

    Energy Technology Data Exchange (ETDEWEB)

    Huang, Zhenyu Henry; Tate, Zeb; Abhyankar, Shrirang; Dong, Zhaoyang; Khaitan, Siddhartha; Min, Liang; Taylor, Gary

    2017-05-01

    The power grid has been evolving over the last 120 years, but it is seeing more changes in this decade and next than it has seen over the past century. In particular, the widespread deployment of intermittent renewable generation, smart loads and devices, hierarchical and distributed control technologies, phasor measurement units, energy storage, and widespread usage of electric vehicles will require fundamental changes in methods and tools for the operation and planning of the power grid. The resulting new dynamic and stochastic behaviors will demand the inclusion of more complexity in modeling the power grid. Solving such complex models in the traditional computing environment will be a major challenge. Along with the increasing complexity of power system models, the increasing complexity of smart grid data further adds to the prevailing challenges. In this environment, the myriad of smart sensors and meters in the power grid increase by multiple orders of magnitude, so do the volume and speed of the data. The information infrastructure will need to drastically change to support the exchange of enormous amounts of data as smart grid applications will need the capability to collect, assimilate, analyze and process the data, to meet real-time grid functions. High performance computing (HPC) holds the promise to enhance these functions, but it is a great resource that has not been fully explored and adopted for the power grid domain.

  16. Exploring Graphics Processing Unit (GPU Resource Sharing Efficiency for High Performance Computing

    Directory of Open Access Journals (Sweden)

    Teng Li

    2013-11-01

    Full Text Available The increasing incorporation of Graphics Processing Units (GPUs as accelerators has been one of the forefront High Performance Computing (HPC trends and provides unprecedented performance; however, the prevalent adoption of the Single-Program Multiple-Data (SPMD programming model brings with it challenges of resource underutilization. In other words, under SPMD, every CPU needs GPU capability available to it. However, since CPUs generally outnumber GPUs, the asymmetric resource distribution gives rise to overall computing resource underutilization. In this paper, we propose to efficiently share the GPU under SPMD and formally define a series of GPU sharing scenarios. We provide performance-modeling analysis for each sharing scenario with accurate experimentation validation. With the modeling basis, we further conduct experimental studies to explore potential GPU sharing efficiency improvements from multiple perspectives. Both further theoretical and experimental GPU sharing performance analysis and results are presented. Our results not only demonstrate the significant performance gain for SPMD programs with the proposed efficient GPU sharing, but also the further improved sharing efficiency with the optimization techniques based on our accurate modeling.

  17. Video indexing using a high-performance and low-computation color-based opportunistic technique

    Science.gov (United States)

    Ahmed, Mohamed; Karmouch, Ahmed

    2002-02-01

    Video information, image processing, and computer vision techniques are developing rapidly because of the availability of acquisition, processing, and editing tools that use current hardware and software systems. However, problems still remain in conveying this video data to the end users. Limiting factors are the resource capabilities in distributed architectures and the features of the users' terminals. The efficient use of image processing, video indexing, and analysis techniques can provide users with solutions or alternatives. We see the video stream as a sequence of correlated images containing in its structure temporal events such as camera editing effects and presents a new algorithm for achieving video segmentation, indexing, and key framing tasks. The algorithm is based on color histograms and uses a binary penetration technique. Although much has been done in this area, most work does not adequately consider the optimization of timing performance and processing storage. This is especially the case if the techniques are designed for use in run-time distributed environments. Our main contribution is to blend high performance and storage criteria with the need to achieve effective results. The algorithm exploits the temporal heuristic characteristic of the visual information within a video stream. It takes into consideration the issues of detecting false cuts and missing true cuts due to the movement of the camera, the optical flow of large objects, or both. We provide a discussion, together with results from experiments and from the implementation of our application, to show the merits of the new algorithm as compared to the existing one.

  18. Automatic Identification of LAMOST Spectra with Continuum Problem Based on High Performance Computing

    Science.gov (United States)

    Yu, Jingjing; Pan, Jingchang; Meng, Fanlong

    2017-10-01

    Continuum problem is a phenomenon that the continuum of spectra get off their actual continuum even break off due to interstellar extinction and flux calibration, and it will have negative impact on the subsequent process such as spectral line extraction and so on. Based on this problem and fully considering of the continuum features of the stellar spectra, a method of automatic detection and recognition of the continuum problem in the stellar spectra based on High Performance Computing (HPC) is proposed, which will improve work efficiency greatly compared with the traditional human eyes examination under the condition of maintaining high accuracy. Continuum template matching is used to identify the continuum problem spectra in this paper. The first step goes to fit continuum of the test stellar spectra and the template spectra, then the flux differences of the two continuum we fitted before at every point of wavelength in the continuum spectra will be calculated based on full spectra matching to analyze the features of its distribution. The features we count are average (called β) value and standard deviation (called δ). The percentage of points distributed in rangeβ±ɑ*δwill be detected to confirm that if there is continuum problem. Experiments was taken to identify the continuum problem spectra on HPC, which demonstrate the validity and efficiency of the method, and this has important reference value to resolve similar massive spectra data processing problems in the future.

  19. Current developments in and importance of high-performance computing in drug discovery.

    Science.gov (United States)

    Pitera, Jed W

    2009-05-01

    A number of current trends that are being adopted to reshape the field of high-performance computing exist, including multi-core systems, accelerators, and software frameworks for large-scale intrinsically parallel applications. These trends intersect with recent developments in computational chemistry to provide new capabilities for computer-aided drug discovery. Although this review focuses primarily on the application domains of molecular modeling and biomolecular simulation, these computing changes are relevant for other computationally intensive tasks, such as instrument data processing and chemoinformatics.

  20. Grand Challenges: High Performance Computing and Communications. The FY 1992 U.S. Research and Development Program.

    Science.gov (United States)

    Federal Coordinating Council for Science, Engineering and Technology, Washington, DC.

    This report presents a review of the High Performance Computing and Communications (HPCC) Program, which has as its goal the acceleration of the commercial availability and utilization of the next generation of high performance computers and networks in order to: (1) extend U.S. technological leadership in high performance computing and computer…

  1. High performance computing in science and engineering Garching/Munich 2016

    Energy Technology Data Exchange (ETDEWEB)

    Wagner, Siegfried; Bode, Arndt; Bruechle, Helmut; Brehm, Matthias (eds.)

    2016-11-01

    Computer simulations are the well-established third pillar of natural sciences along with theory and experimentation. Particularly high performance computing is growing fast and constantly demands more and more powerful machines. To keep pace with this development, in spring 2015, the Leibniz Supercomputing Centre installed the high performance computing system SuperMUC Phase 2, only three years after the inauguration of its sibling SuperMUC Phase 1. Thereby, the compute capabilities were more than doubled. This book covers the time-frame June 2014 until June 2016. Readers will find many examples of outstanding research in the more than 130 projects that are covered in this book, with each one of these projects using at least 4 million core-hours on SuperMUC. The largest scientific communities using SuperMUC in the last two years were computational fluid dynamics simulations, chemistry and material sciences, astrophysics, and life sciences.

  2. High Performance Computing Application: Solar Dynamo Model Project II, Corona and Heliosphere Component Initialization, Integration and Validation

    Science.gov (United States)

    2015-06-24

    AFRL-RD-PS- TR-2015-0028 AFRL-RD-PS- TR-2015-0028 HIGH PERFORMANCE COMPUTING APPLICATION: SOLAR DYNAMO MODEL PROJECT II; CORONA AND HELIOSPHERE...Dynamo Model Project II, Corona and Heliosphere Component Initialization, Integration and Validation 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6...Approved for public release; distribution is unlimited. 13. SUPPLEMENTARY NOTES 14. ABSTRACT This report reviews the status of current day solar corona and

  3. The Convergence of High Performance Computing and Large Scale Data Analytics

    Science.gov (United States)

    Duffy, D.; Bowen, M. K.; Thompson, J. H.; Yang, C. P.; Hu, F.; Wills, B.

    2015-12-01

    As the combinations of remote sensing observations and model outputs have grown, scientists are increasingly burdened with both the necessity and complexity of large-scale data analysis. Scientists are increasingly applying traditional high performance computing (HPC) solutions to solve their "Big Data" problems. While this approach has the benefit of limiting data movement, the HPC system is not optimized to run analytics, which can create problems that permeate throughout the HPC environment. To solve these issues and to alleviate some of the strain on the HPC environment, the NASA Center for Climate Simulation (NCCS) has created the Advanced Data Analytics Platform (ADAPT), which combines both HPC and cloud technologies to create an agile system designed for analytics. Large, commonly used data sets are stored in this system in a write once/read many file system, such as Landsat, MODIS, MERRA, and NGA. High performance virtual machines are deployed and scaled according to the individual scientist's requirements specifically for data analysis. On the software side, the NCCS and GMU are working with emerging commercial technologies and applying them to structured, binary scientific data in order to expose the data in new ways. Native NetCDF data is being stored within a Hadoop Distributed File System (HDFS) enabling storage-proximal processing through MapReduce while continuing to provide accessibility of the data to traditional applications. Once the data is stored within HDFS, an additional indexing scheme is built on top of the data and placed into a relational database. This spatiotemporal index enables extremely fast mappings of queries to data locations to dramatically speed up analytics. These are some of the first steps toward a single unified platform that optimizes for both HPC and large-scale data analysis, and this presentation will elucidate the resulting and necessary exascale architectures required for future systems.

  4. A performance model for the communication in fast multipole methods on high-performance computing platforms

    KAUST Repository

    Ibeid, Huda

    2016-03-04

    Exascale systems are predicted to have approximately 1 billion cores, assuming gigahertz cores. Limitations on affordable network topologies for distributed memory systems of such massive scale bring new challenges to the currently dominant parallel programing model. Currently, there are many efforts to evaluate the hardware and software bottlenecks of exascale designs. It is therefore of interest to model application performance and to understand what changes need to be made to ensure extrapolated scalability. The fast multipole method (FMM) was originally developed for accelerating N-body problems in astrophysics and molecular dynamics but has recently been extended to a wider range of problems. Its high arithmetic intensity combined with its linear complexity and asynchronous communication patterns make it a promising algorithm for exascale systems. In this paper, we discuss the challenges for FMM on current parallel computers and future exascale architectures, with a focus on internode communication. We focus on the communication part only; the efficiency of the computational kernels are beyond the scope of the present study. We develop a performance model that considers the communication patterns of the FMM and observe a good match between our model and the actual communication time on four high-performance computing (HPC) systems, when latency, bandwidth, network topology, and multicore penalties are all taken into account. To our knowledge, this is the first formal characterization of internode communication in FMM that validates the model against actual measurements of communication time. The ultimate communication model is predictive in an absolute sense; however, on complex systems, this objective is often out of reach or of a difficulty out of proportion to its benefit when there exists a simpler model that is inexpensive and sufficient to guide coding decisions leading to improved scaling. The current model provides such guidance.

  5. High Performance Numerical Computing for High Energy Physics: A New Challenge for Big Data Science

    Directory of Open Access Journals (Sweden)

    Florin Pop

    2014-01-01

    Full Text Available Modern physics is based on both theoretical analysis and experimental validation. Complex scenarios like subatomic dimensions, high energy, and lower absolute temperature are frontiers for many theoretical models. Simulation with stable numerical methods represents an excellent instrument for high accuracy analysis, experimental validation, and visualization. High performance computing support offers possibility to make simulations at large scale, in parallel, but the volume of data generated by these experiments creates a new challenge for Big Data Science. This paper presents existing computational methods for high energy physics (HEP analyzed from two perspectives: numerical methods and high performance computing. The computational methods presented are Monte Carlo methods and simulations of HEP processes, Markovian Monte Carlo, unfolding methods in particle physics, kernel estimation in HEP, and Random Matrix Theory used in analysis of particles spectrum. All of these methods produce data-intensive applications, which introduce new challenges and requirements for ICT systems architecture, programming paradigms, and storage capabilities.

  6. 14th annual Results and Review Workshop on High Performance Computing in Science and Engineering

    CERN Document Server

    Nagel, Wolfgang E; Resch, Michael M; Transactions of the High Performance Computing Center, Stuttgart (HLRS) 2011; High Performance Computing in Science and Engineering '11

    2012-01-01

    This book presents the state-of-the-art in simulation on supercomputers. Leading researchers present results achieved on systems of the High Performance Computing Center Stuttgart (HLRS) for the year 2011. The reports cover all fields of computational science and engineering, ranging from CFD to computational physics and chemistry, to computer science, with a special emphasis on industrially relevant applications. Presenting results for both vector systems and microprocessor-based systems, the book allows readers to compare the performance levels and usability of various architectures. As HLRS

  7. A Study of Complex Deep Learning Networks on High Performance, Neuromorphic, and Quantum Computers

    Energy Technology Data Exchange (ETDEWEB)

    Potok, Thomas E [ORNL; Schuman, Catherine D [ORNL; Young, Steven R [ORNL; Patton, Robert M [ORNL; Spedalieri, Federico [University of Southern California, Information Sciences Institute; Liu, Jeremy [University of Southern California, Information Sciences Institute; Yao, Ke-Thia [University of Southern California, Information Sciences Institute; Rose, Garrett [University of Tennessee (UT); Chakma, Gangotree [University of Tennessee (UT)

    2016-01-01

    Current Deep Learning models use highly optimized convolutional neural networks (CNN) trained on large graphical processing units (GPU)-based computers with a fairly simple layered network topology, i.e., highly connected layers, without intra-layer connections. Complex topologies have been proposed, but are intractable to train on current systems. Building the topologies of the deep learning network requires hand tuning, and implementing the network in hardware is expensive in both cost and power. In this paper, we evaluate deep learning models using three different computing architectures to address these problems: quantum computing to train complex topologies, high performance computing (HPC) to automatically determine network topology, and neuromorphic computing for a low-power hardware implementation. Due to input size limitations of current quantum computers we use the MNIST dataset for our evaluation. The results show the possibility of using the three architectures in tandem to explore complex deep learning networks that are untrainable using a von Neumann architecture. We show that a quantum computer can find high quality values of intra-layer connections and weights, while yielding a tractable time result as the complexity of the network increases; a high performance computer can find optimal layer-based topologies; and a neuromorphic computer can represent the complex topology and weights derived from the other architectures in low power memristive hardware. This represents a new capability that is not feasible with current von Neumann architecture. It potentially enables the ability to solve very complicated problems unsolvable with current computing technologies.

  8. Medical applications for high-performance computers in SKIF-GRID network.

    Science.gov (United States)

    Zhuchkov, Alexey; Tverdokhlebov, Nikolay

    2009-01-01

    The paper presents a set of software services for massive mammography image processing by using high-performance parallel computers of SKIF-family which are linked into a service-oriented grid-network. An experience of a prototype system implementation in two medical institutions is also described.

  9. Business Models of High Performance Computing Centres in Higher Education in Europe

    Science.gov (United States)

    Eurich, Markus; Calleja, Paul; Boutellier, Roman

    2013-01-01

    High performance computing (HPC) service centres are a vital part of the academic infrastructure of higher education organisations. However, despite their importance for research and the necessary high capital expenditures, business research on HPC service centres is mostly missing. From a business perspective, it is important to find an answer to…

  10. 76 FR 36986 - Export Controls for High Performance Computers: Wassenaar Arrangement Agreement Implementation...

    Science.gov (United States)

    2011-06-24

    ... understanding of the risks associated with the transfers of these items. For more information on the Wassenaar... Bureau of Industry and Security 15 CFR Parts 734, 740, 743 and 774 RIN 0694-AF15 Export Controls for High Performance Computers: Wassenaar Arrangement Agreement Implementation for ECCN 4A003 and Revisions to License...

  11. InfoMall: An Innovative Strategy for High-Performance Computing and Communications Applications Development.

    Science.gov (United States)

    Mills, Kim; Fox, Geoffrey

    1994-01-01

    Describes the InfoMall, a program led by the Northeast Parallel Architectures Center (NPAC) at Syracuse University (New York). The InfoMall features a partnership of approximately 24 organizations offering linked programs in High Performance Computing and Communications (HPCC) technology integration, software development, marketing, education and…

  12. Federal High Performance Computing and Communications Program. The Department of Energy Component.

    Science.gov (United States)

    Department of Energy, Washington, DC. Office of Energy Research.

    This report, profusely illustrated with color photographs and other graphics, elaborates on the Department of Energy (DOE) research program in High Performance Computing and Communications (HPCC). The DOE is one of seven agency programs within the Federal Research and Development Program working on HPCC. The DOE HPCC program emphasizes research in…

  13. High-Performance Computing in Neuroscience for Data-Driven Discovery, Integration, and Dissemination.

    Science.gov (United States)

    Bouchard, Kristofer E; Aimone, James B; Chun, Miyoung; Dean, Thomas; Denker, Michael; Diesmann, Markus; Donofrio, David D; Frank, Loren M; Kasthuri, Narayanan; Koch, Chirstof; Ruebel, Oliver; Simon, Horst D; Sommer, Friedrich T; Prabhat

    2016-11-02

    Opportunities offered by new neuro-technologies are threatened by lack of coherent plans to analyze, manage, and understand the data. High-performance computing will allow exploratory analysis of massive datasets stored in standardized formats, hosted in open repositories, and integrated with simulations. Published by Elsevier Inc.

  14. Power grid simulation applications developed using the GridPACK™ high performance computing framework

    Energy Technology Data Exchange (ETDEWEB)

    Jin, Shuangshuang; Chen, Yousu; Diao, Ruisheng; Huang, Zhenyu (Henry); Perkins, William; Palmer, Bruce

    2016-12-01

    This paper describes the GridPACK™ software framework for developing power grid simulations that can run on high performance computing platforms, with several example applications (dynamic simulation, static contingency analysis, and dynamic contingency analysis) that have been developed using GridPACK.

  15. Virtualization in High-Performance Computing: An Analysis of Physical and Virtual Node Performance

    OpenAIRE

    Jungels, Glendon M

    2012-01-01

    The process of virtualizing computing resources allows an organization to make more efficient use of it's resources. In addtion, this process enables flexibility that deployment on raw hardware does not. Virtualization, however, comes with a performance penalty. This study examines the performance of utilizing virtualization technology for use in high performance computing to determine the suitibility of using this technology. It makes use of a small (4 node) virtual cluster as well as a ...

  16. Challenges and Opportunities for Security in High-Performance Computing Environments

    OpenAIRE

    Peisert, S

    2017-01-01

    High-performance computing (HPC) environments have numerous distinctive elements that make securing them different than securing traditional computing systems. In some cases this is due to the way that HPC systems are implemented. In other cases, it is due to the way that HPC systems are used, or a combination of both issues. In this article, we discuss these distinctions and also discuss which security procedures and mechanisms are and are not appropriate in HPC environments, and where gaps ...

  17. STEMsalabim: A high-performance computing cluster friendly code for scanning transmission electron microscopy image simulations of thin specimens

    Energy Technology Data Exchange (ETDEWEB)

    Oelerich, Jan Oliver, E-mail: jan.oliver.oelerich@physik.uni-marburg.de; Duschek, Lennart; Belz, Jürgen; Beyer, Andreas; Baranovskii, Sergei D.; Volz, Kerstin

    2017-06-15

    Highlights: • We present STEMsalabim, a modern implementation of the multislice algorithm for simulation of STEM images. • Our package is highly parallelizable on high-performance computing clusters, combining shared and distributed memory architectures. • With STEMsalabim, computationally and memory expensive STEM image simulations can be carried out within reasonable time. - Abstract: We present a new multislice code for the computer simulation of scanning transmission electron microscope (STEM) images based on the frozen lattice approximation. Unlike existing software packages, the code is optimized to perform well on highly parallelized computing clusters, combining distributed and shared memory architectures. This enables efficient calculation of large lateral scanning areas of the specimen within the frozen lattice approximation and fine-grained sweeps of parameter space.

  18. P2P Technology for High-Performance Computing: An Overview

    Science.gov (United States)

    Follen, Gregory J. (Technical Monitor); Berry, Jason

    2003-01-01

    The transition from cluster computing to peer-to-peer (P2P) high-performance computing has recently attracted the attention of the computer science community. It has been recognized that existing local networks and dedicated clusters of headless workstations can serve as inexpensive yet powerful virtual supercomputers. It has also been recognized that the vast number of lower-end computers connected to the Internet stay idle for as long as 90% of the time. The growing speed of Internet connections and the high availability of free CPU time encourage exploration of the possibility to use the whole Internet rather than local clusters as massively parallel yet almost freely available P2P supercomputer. As a part of a larger project on P2P high-performance computing, it has been my goal to compile an overview of the 2P2 paradigm. I have studied various P2P platforms and I have compiled systematic brief descriptions of their most important characteristics. I have also experimented and obtained hands-on experience with selected P2P platforms focusing on those that seem promising with respect to P2P high-performance computing. I have also compiled relevant literature and web references. I have prepared a draft technical report and I have summarized my findings in a poster paper.

  19. Intelligent distributed computing

    CERN Document Server

    Thampi, Sabu

    2015-01-01

    This book contains a selection of refereed and revised papers of the Intelligent Distributed Computing Track originally presented at the third International Symposium on Intelligent Informatics (ISI-2014), September 24-27, 2014, Delhi, India.  The papers selected for this Track cover several Distributed Computing and related topics including Peer-to-Peer Networks, Cloud Computing, Mobile Clouds, Wireless Sensor Networks, and their applications.

  20. Distributed GIS Computing for High Performance Simulation and Visualization Project

    Data.gov (United States)

    National Aeronautics and Space Administration — Today, the ability of sensors to generate geographical data is virtually limitless. Although NASA now provides (together with other agencies such as the USGS) a...

  1. Measurements over distributed high performance computing and storage systems

    Science.gov (United States)

    Williams, Elizabeth; Myers, Tom

    1993-01-01

    A strawman proposal is given for a framework for presenting a common set of metrics for supercomputers, workstations, file servers, mass storage systems, and the networks that interconnect them. Production control and database systems are also included. Though other applications and third part software systems are not addressed, it is important to measure them as well.

  2. 8th International Workshop on Parallel Tools for High Performance Computing

    CERN Document Server

    Gracia, José; Knüpfer, Andreas; Resch, Michael; Nagel, Wolfgang

    2015-01-01

    Numerical simulation and modelling using High Performance Computing has evolved into an established technique in academic and industrial research. At the same time, the High Performance Computing infrastructure is becoming ever more complex. For instance, most of the current top systems around the world use thousands of nodes in which classical CPUs are combined with accelerator cards in order to enhance their compute power and energy efficiency. This complexity can only be mastered with adequate development and optimization tools. Key topics addressed by these tools include parallelization on heterogeneous systems, performance optimization for CPUs and accelerators, debugging of increasingly complex scientific applications, and optimization of energy usage in the spirit of green IT. This book represents the proceedings of the 8th International Parallel Tools Workshop, held October 1-2, 2014 in Stuttgart, Germany – which is a forum to discuss the latest advancements in the parallel tools.

  3. Developing on-demand secure high-performance computing services for biomedical data analytics.

    Science.gov (United States)

    Robison, Nicholas; Anderson, Nick

    2013-01-01

    We propose a technical and process model to support biomedical researchers requiring on-demand high performance computing on potentially sensitive medical datasets. Our approach describes the use of cost-effective, secure and scalable techniques for processing medical information via protected and encrypted computing clusters within a model High Performance Computing (HPC) environment. The process model supports an investigator defined data analytics platform capable of accepting secure data migration from local clinical research data silos into a dedicated analytic environment, and secure environment cleanup upon completion. We define metrics to support the evaluation of this pilot model through performance and stability tests, and describe evaluation of its suitability towards enabling rapid deployment by individual investigators.

  4. High-performance computing for structural mechanics and earthquake/tsunami engineering

    CERN Document Server

    Hori, Muneo; Ohsaki, Makoto

    2016-01-01

    Huge earthquakes and tsunamis have caused serious damage to important structures such as civil infrastructure elements, buildings and power plants around the globe.  To quantitatively evaluate such damage processes and to design effective prevention and mitigation measures, the latest high-performance computational mechanics technologies, which include telascale to petascale computers, can offer powerful tools. The phenomena covered in this book include seismic wave propagation in the crust and soil, seismic response of infrastructure elements such as tunnels considering soil-structure interactions, seismic response of high-rise buildings, seismic response of nuclear power plants, tsunami run-up over coastal towns and tsunami inundation considering fluid-structure interactions. The book provides all necessary information for addressing these phenomena, ranging from the fundamentals of high-performance computing for finite element methods, key algorithms of accurate dynamic structural analysis, fluid flows ...

  5. High performance computing and communications: Advancing the frontiers of information technology

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-12-31

    This report, which supplements the President`s Fiscal Year 1997 Budget, describes the interagency High Performance Computing and Communications (HPCC) Program. The HPCC Program will celebrate its fifth anniversary in October 1996 with an impressive array of accomplishments to its credit. Over its five-year history, the HPCC Program has focused on developing high performance computing and communications technologies that can be applied to computation-intensive applications. Major highlights for FY 1996: (1) High performance computing systems enable practical solutions to complex problems with accuracies not possible five years ago; (2) HPCC-funded research in very large scale networking techniques has been instrumental in the evolution of the Internet, which continues exponential growth in size, speed, and availability of information; (3) The combination of hardware capability measured in gigaflop/s, networking technology measured in gigabit/s, and new computational science techniques for modeling phenomena has demonstrated that very large scale accurate scientific calculations can be executed across heterogeneous parallel processing systems located thousands of miles apart; (4) Federal investments in HPCC software R and D support researchers who pioneered the development of parallel languages and compilers, high performance mathematical, engineering, and scientific libraries, and software tools--technologies that allow scientists to use powerful parallel systems to focus on Federal agency mission applications; and (5) HPCC support for virtual environments has enabled the development of immersive technologies, where researchers can explore and manipulate multi-dimensional scientific and engineering problems. Educational programs fostered by the HPCC Program have brought into classrooms new science and engineering curricula designed to teach computational science. This document contains a small sample of the significant HPCC Program accomplishments in FY 1996.

  6. High performance computing system in the framework of the Higgs boson studies

    CERN Document Server

    Belyaev, Nikita; The ATLAS collaboration

    2017-01-01

    The Higgs boson physics is one of the most important and promising fields of study in modern High Energy Physics. To perform precision measurements of the Higgs boson properties, the use of fast and efficient instruments of Monte Carlo event simulation is required. Due to the increasing amount of data and to the growing complexity of the simulation software tools, the computing resources currently available for Monte Carlo simulation on the LHC GRID are not sufficient. One of the possibilities to address this shortfall of computing resources is the usage of institutes computer clusters, commercial computing resources and supercomputers. In this paper, a brief description of the Higgs boson physics, the Monte-Carlo generation and event simulation techniques are presented. A description of modern high performance computing systems and tests of their performance are also discussed. These studies have been performed on the Worldwide LHC Computing Grid and Kurchatov Institute Data Processing Center, including Tier...

  7. Topic 14+16: High-performance and scientific applications and extreme-scale computing (Introduction)

    KAUST Repository

    Downes, Turlough P.

    2013-01-01

    As our understanding of the world around us increases it becomes more challenging to make use of what we already know, and to increase our understanding still further. Computational modeling and simulation have become critical tools in addressing this challenge. The requirements of high-resolution, accurate modeling have outstripped the ability of desktop computers and even small clusters to provide the necessary compute power. Many applications in the scientific and engineering domains now need very large amounts of compute time, while other applications, particularly in the life sciences, frequently have large data I/O requirements. There is thus a growing need for a range of high performance applications which can utilize parallel compute systems effectively, which have efficient data handling strategies and which have the capacity to utilise current and future systems. The High Performance and Scientific Applications topic aims to highlight recent progress in the use of advanced computing and algorithms to address the varied, complex and increasing challenges of modern research throughout both the "hard" and "soft" sciences. This necessitates being able to use large numbers of compute nodes, many of which are equipped with accelerators, and to deal with difficult I/O requirements. © 2013 Springer-Verlag.

  8. Thinking processes used by high-performing students in a computer programming task

    Directory of Open Access Journals (Sweden)

    Marietjie Havenga

    2011-07-01

    Full Text Available Computer programmers must be able to understand programming source code and write programs that execute complex tasks to solve real-world problems. This article is a trans- disciplinary study at the intersection of computer programming, education and psychology. It outlines the role of mental processes in the process of programming and indicates how successful thinking processes can support computer science students in writing correct and well-defined programs. A mixed methods approach was used to better understand the thinking activities and programming processes of participating students. Data collection involved both computer programs and students’ reflective thinking processes recorded in their journals. This enabled analysis of psychological dimensions of participants’ thinking processes and their problem-solving activities as they considered a programming problem. Findings indicate that the cognitive, reflective and psychological processes used by high-performing programmers contributed to their success in solving a complex programming problem. Based on the thinking processes of high performers, we propose a model of integrated thinking processes, which can support computer programming students. Keywords: Computer programming, education, mixed methods research, thinking processes.  Disciplines: Computer programming, education, psychology

  9. Investigating methods of supporting dynamically linked executables on high performance computing platforms.

    Energy Technology Data Exchange (ETDEWEB)

    Kelly, Suzanne Marie; Laros, James H., III; Pedretti, Kevin Thomas Tauke; Levenhagen, Michael J.

    2009-09-01

    Shared libraries have become ubiquitous and are used to achieve great resource efficiencies on many platforms. The same properties that enable efficiencies on time-shared computers and convenience on small clusters prove to be great obstacles to scalability on large clusters and High Performance Computing platforms. In addition, Light Weight operating systems such as Catamount have historically not supported the use of shared libraries specifically because they hinder scalability. In this report we will outline the methods of supporting shared libraries on High Performance Computing platforms using Light Weight kernels that we investigated. The considerations necessary to evaluate utility in this area are many and sometimes conflicting. While our initial path forward has been determined based on this evaluation we consider this effort ongoing and remain prepared to re-evaluate any technology that might provide a scalable solution. This report is an evaluation of a range of possible methods of supporting dynamically linked executables on capability class1 High Performance Computing platforms. Efforts are ongoing and extensive testing at scale is necessary to evaluate performance. While performance is a critical driving factor, supporting whatever method is used in a production environment is an equally important and challenging task.

  10. Visualization of Distributed Data Structures for High Performance Fortran-Like Languages

    Directory of Open Access Journals (Sweden)

    Rainer Koppler

    1997-01-01

    Full Text Available This article motivates the usage of graphics and visualization for efficient utilization of High Performance Fortran's (HPF's data distribution facilities. It proposes a graphical toolkit consisting of exploratory and estimation tools which allow the programmer to navigate through complex distributions and to obtain graphical ratings with respect to load distribution and communication. The toolkit has been implemented in a mapping design and visualization tool which is coupled with a compilation system for the HPF predecessor Vienna Fortran. Since this language covers a superset of HPF's facilities, the tool may also be used for visualization of HPF data structures.

  11. Linear static structural and vibration analysis on high-performance computers

    Science.gov (United States)

    Baddourah, M. A.; Storaasli, O. O.; Bostic, S. W.

    1993-01-01

    Parallel computers offer the oppurtunity to significantly reduce the computation time necessary to analyze large-scale aerospace structures. This paper presents algorithms developed for and implemented on massively-parallel computers hereafter referred to as Scalable High-Performance Computers (SHPC), for the most computationally intensive tasks involved in structural analysis, namely, generation and assembly of system matrices, solution of systems of equations and calculation of the eigenvalues and eigenvectors. Results on SHPC are presented for large-scale structural problems (i.e. models for High-Speed Civil Transport). The goal of this research is to develop a new, efficient technique which extends structural analysis to SHPC and makes large-scale structural analyses tractable.

  12. 10th International Workshop on Parallel Tools for High Performance Computing

    CERN Document Server

    Gracia, José; Hilbrich, Tobias; Knüpfer, Andreas; Resch, Michael; Nagel, Wolfgang

    2017-01-01

    This book presents the proceedings of the 10th International Parallel Tools Workshop, held October 4-5, 2016 in Stuttgart, Germany – a forum to discuss the latest advances in parallel tools. High-performance computing plays an increasingly important role for numerical simulation and modelling in academic and industrial research. At the same time, using large-scale parallel systems efficiently is becoming more difficult. A number of tools addressing parallel program development and analysis have emerged from the high-performance computing community over the last decade, and what may have started as collection of small helper script has now matured to production-grade frameworks. Powerful user interfaces and an extensive body of documentation allow easy usage by non-specialists.

  13. International Conference on Modern Mathematical Methods and High Performance Computing in Science and Technology

    CERN Document Server

    Srivastava, HM; Venturino, Ezio; Resch, Michael; Gupta, Vijay

    2016-01-01

    The book discusses important results in modern mathematical models and high performance computing, such as applied operations research, simulation of operations, statistical modeling and applications, invisibility regions and regular meta-materials, unmanned vehicles, modern radar techniques/SAR imaging, satellite remote sensing, coding, and robotic systems. Furthermore, it is valuable as a reference work and as a basis for further study and research. All contributing authors are respected academicians, scientists and researchers from around the globe. All the papers were presented at the international conference on Modern Mathematical Methods and High Performance Computing in Science & Technology (M3HPCST 2015), held at Raj Kumar Goel Institute of Technology, Ghaziabad, India, from 27–29 December 2015, and peer-reviewed by international experts. The conference provided an exceptional platform for leading researchers, academicians, developers, engineers and technocrats from a broad range of disciplines ...

  14. Exploring Graphics Processing Unit (GPU) Resource Sharing Efficiency for High Performance Computing

    OpenAIRE

    Li, Teng; Narayana, Vikram; El-Ghazawi, Tarek

    2013-01-01

    The increasing incorporation of Graphics Processing Units (GPUs) as accelerators has been one of the forefront High Performance Computing (HPC) trends and provides unprecedented performance; however, the prevalent adoption of the Single-Program Multiple-Data (SPMD) programming model brings with it challenges of resource underutilization. In other words, under SPMD, every CPU needs GPU capability available to it. However, since CPUs generally outnumber GPUs, the asymmetric resource distributio...

  15. Learning by Doing, High Performance Computing Education in the MOOC Era

    Science.gov (United States)

    2016-06-14

    attention to the information that is being described in the audio track . An active video with a solid soundtrack allows students to see what is being...system. Course material is presented through the lens of common HPC use cases and the strategies for parallelizing them. Using personalized paths, we... personal educational experience that integrates the researcher’s HPC system and application. 1.1. High Performance Computing Education High

  16. Astrocomp: web technologies for high performance computing on a grid of supercomputers

    Science.gov (United States)

    Costa, A.; Becciani, U.; Antonuccio, V.; Capuzzo Dolcetta, R.; Miocchi, P.; Di Matteo, P.; Rosato, V.

    Astrocomp is a project developed by the Astrophysical Observatory of Catania, University of Roma La Sapienza and ENEA. The project goal is to build a web-based user-friendly interface which allows the international community to run some parellel codes on a set of high performance computing (HPC) resources. There' s no need for specific knowledge about Unix commands and Operating Systems. Astrocomp makes some CPU times, on large parallel platform, available to the referenced user .

  17. High-Performance Computing Opportunities and Challenges for Army R&D

    Science.gov (United States)

    2006-01-01

    sec Gigabits per Second GMO Genetically Modified Organism GPS Global Positioning System HEC High-End Computing HP Hewlett-Packard HPC High Performance...missing body organs). It includes the engi- neering (design, fabrication, and testing) of genetically modified organisms ( GMOs ). Molecular Modeling...accurate modeling of the physiological processes present in organisms able to down- regulate their metabolism. Research has shown that sleep deprivation is

  18. Solving Problems in Various Domains by Hybrid Models of High Performance Computations

    Directory of Open Access Journals (Sweden)

    Yurii Rogozhin

    2014-03-01

    Full Text Available This work presents a hybrid model of high performance computations. The model is based on membrane system (P~system where some membranes may contain quantum device that is triggered by the data entering the membrane. This model is supposed to take advantages of both biomolecular and quantum paradigms and to overcome some of their inherent limitations. The proposed approach is demonstrated through two selected problems: SAT, and image retrieving.

  19. The NetLogger Methodology for High Performance Distributed Systems Performance Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Tierney, Brian; Johnston, William; Crowley, Brian; Hoo, Gary; Brooks, Chris; Gunter, Dan

    1999-12-23

    The authors describe a methodology that enables the real-time diagnosis of performance problems in complex high-performance distributed systems. The methodology includes tools for generating precision event logs that can be used to provide detailed end-to-end application and system level monitoring; a Java agent-based system for managing the large amount of logging data; and tools for visualizing the log data and real-time state of the distributed system. The authors developed these tools for analyzing a high-performance distributed system centered around the transfer of large amounts of data at high speeds from a distributed storage server to a remote visualization client. However, this methodology should be generally applicable to any distributed system. This methodology, called NetLogger, has proven invaluable for diagnosing problems in networks and in distributed systems code. This approach is novel in that it combines network, host, and application-level monitoring, providing a complete view of the entire system.

  20. The contribution of high-performance computing and modelling for industrial development

    CSIR Research Space (South Africa)

    Sithole, Happy M

    2017-10-01

    Full Text Available Performance Computing and Modelling for Industrial Development Dr Happy Sithole and Dr Onno Ubbink 2 Strategic context • High-performance computing (HPC) combined with machine Learning and artificial intelligence present opportunities to non....1 with Bright Cluster Manager and Altair PBS Pro Total Linpack Performance (Tflop/s) 783 1029 7 HPC Systems • 127 on TOP500 • Awarded the fastest supercomputer in Africa 8 Resource utilisation Nov/Dec 2016 9 Jan/Feb 2017 Resource utilisation 10...

  1. A Case Study on Neural Inspired Dynamic Memory Management Strategies for High Performance Computing.

    Energy Technology Data Exchange (ETDEWEB)

    Vineyard, Craig Michael [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Verzi, Stephen Joseph [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2017-09-01

    As high performance computing architectures pursue more computational power there is a need for increased memory capacity and bandwidth as well. A multi-level memory (MLM) architecture addresses this need by combining multiple memory types with different characteristics as varying levels of the same architecture. How to efficiently utilize this memory infrastructure is an unknown challenge, and in this research we sought to investigate whether neural inspired approaches can meaningfully help with memory management. In particular we explored neurogenesis inspired re- source allocation, and were able to show a neural inspired mixed controller policy can beneficially impact how MLM architectures utilize memory.

  2. High-performance computing on the Intel Xeon Phi how to fully exploit MIC architectures

    CERN Document Server

    Wang, Endong; Shen, Bo; Zhang, Guangyong; Lu, Xiaowei; Wu, Qing; Wang, Yajuan

    2014-01-01

    The aim of this book is to explain to high-performance computing (HPC) developers how to utilize the Intel® Xeon Phi™ series products efficiently. To that end, it introduces some computing grammar, programming technology and optimization methods for using many-integrated-core (MIC) platforms and also offers tips and tricks for actual use, based on the authors' first-hand optimization experience.The material is organized in three sections. The first section, "Basics of MIC", introduces the fundamentals of MIC architecture and programming, including the specific Intel MIC programming environment

  3. A Queue Simulation Tool for a High Performance Scientific Computing Center

    Science.gov (United States)

    Spear, Carrie; McGalliard, James

    2007-01-01

    The NASA Center for Computational Sciences (NCCS) at the Goddard Space Flight Center provides high performance highly parallel processors, mass storage, and supporting infrastructure to a community of computational Earth and space scientists. Long running (days) and highly parallel (hundreds of CPUs) jobs are common in the workload. NCCS management structures batch queues and allocates resources to optimize system use and prioritize workloads. NCCS technical staff use a locally developed discrete event simulation tool to model the impacts of evolving workloads, potential system upgrades, alternative queue structures and resource allocation policies.

  4. High Performance Computing - Power Application Programming Interface Specification Version 2.0.

    Energy Technology Data Exchange (ETDEWEB)

    Laros, James H. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Grant, Ryan [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Levenhagen, Michael J. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Olivier, Stephen Lecler [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Pedretti, Kevin [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Ward, H. Lee [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Younge, Andrew J. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2017-03-01

    Measuring and controlling the power and energy consumption of high performance computing systems by various components in the software stack is an active research area. Implementations in lower level software layers are beginning to emerge in some production systems, which is very welcome. To be most effective, a portable interface to measurement and control features would significantly facilitate participation by all levels of the software stack. We present a proposal for a standard power Application Programming Interface (API) that endeavors to cover the entire software space, from generic hardware interfaces to the input from the computer facility manager.

  5. High Performance Computing and Communications: Technology for the National Information Infrastructure. Supplement to the President's Fiscal Year 1995 Budget.

    Science.gov (United States)

    Office of Science and Technology Policy, Washington, DC. National Science and Technology Council.

    The federal High Performance Computing and Communications (HPCC) program was created to accelerate the development of future generations of high performance computers and networks and the use of these resources in the federal government and throughout the American economy. This report describes the HPCC program which is developing computing,…

  6. ATLAS Distributed Computing Automation

    CERN Document Server

    Schovancova, J; The ATLAS collaboration; Borrego, C; Campana, S; Di Girolamo, A; Elmsheuser, J; Hejbal, J; Kouba, T; Legger, F; Magradze, E; Medrano Llamas, R; Negri, G; Rinaldi, L; Sciacca, G; Serfon, C; Van Der Ster, D C

    2012-01-01

    The ATLAS Experiment benefits from computing resources distributed worldwide at more than 100 WLCG sites. The ATLAS Grid sites provide over 100k CPU job slots, over 100 PB of storage space on disk or tape. Monitoring of status of such a complex infrastructure is essential. The ATLAS Grid infrastructure is monitored 24/7 by two teams of shifters distributed world-wide, by the ATLAS Distributed Computing experts, and by site administrators. In this paper we summarize automation efforts performed within the ATLAS Distributed Computing team in order to reduce manpower costs and improve the reliability of the system. Different aspects of the automation process are described: from the ATLAS Grid site topology provided by the ATLAS Grid Information System, via automatic site testing by the HammerCloud, to automatic exclusion from production or analysis activities.

  7. Session on High Speed Civil Transport Design Capability Using MDO and High Performance Computing

    Science.gov (United States)

    Rehder, Joe

    2000-01-01

    Since the inception of CAS in 1992, NASA Langley has been conducting research into applying multidisciplinary optimization (MDO) and high performance computing toward reducing aircraft design cycle time. The focus of this research has been the development of a series of computational frameworks and associated applications that increased in capability, complexity, and performance over time. The culmination of this effort is an automated high-fidelity analysis capability for a high speed civil transport (HSCT) vehicle installed on a network of heterogeneous computers with a computational framework built using Common Object Request Broker Architecture (CORBA) and Java. The main focus of the research in the early years was the development of the Framework for Interdisciplinary Design Optimization (FIDO) and associated HSCT applications. While the FIDO effort was eventually halted, work continued on HSCT applications of ever increasing complexity. The current application, HSCT4.0, employs high fidelity CFD and FEM analysis codes. For each analysis cycle, the vehicle geometry and computational grids are updated using new values for design variables. Processes for aeroelastic trim, loads convergence, displacement transfer, stress and buckling, and performance have been developed. In all, a total of 70 processes are integrated in the analysis framework. Many of the key processes include automatic differentiation capabilities to provide sensitivity information that can be used in optimization. A software engineering process was developed to manage this large project. Defining the interactions among 70 processes turned out to be an enormous, but essential, task. A formal requirements document was prepared that defined data flow among processes and subprocesses. A design document was then developed that translated the requirements into actual software design. A validation program was defined and implemented to ensure that codes integrated into the framework produced the same

  8. RAPPORT: running scientific high-performance computing applications on the cloud.

    Science.gov (United States)

    Cohen, Jeremy; Filippis, Ioannis; Woodbridge, Mark; Bauer, Daniela; Hong, Neil Chue; Jackson, Mike; Butcher, Sarah; Colling, David; Darlington, John; Fuchs, Brian; Harvey, Matt

    2013-01-28

    Cloud computing infrastructure is now widely used in many domains, but one area where there has been more limited adoption is research computing, in particular for running scientific high-performance computing (HPC) software. The Robust Application Porting for HPC in the Cloud (RAPPORT) project took advantage of existing links between computing researchers and application scientists in the fields of bioinformatics, high-energy physics (HEP) and digital humanities, to investigate running a set of scientific HPC applications from these domains on cloud infrastructure. In this paper, we focus on the bioinformatics and HEP domains, describing the applications and target cloud platforms. We conclude that, while there are many factors that need consideration, there is no fundamental impediment to the use of cloud infrastructure for running many types of HPC applications and, in some cases, there is potential for researchers to benefit significantly from the flexibility offered by cloud platforms.

  9. Cloud object store for archive storage of high performance computing data using decoupling middleware

    Science.gov (United States)

    Bent, John M.; Faibish, Sorin; Grider, Gary

    2015-06-30

    Cloud object storage is enabled for archived data, such as checkpoints and results, of high performance computing applications using a middleware process. A plurality of archived files, such as checkpoint files and results, generated by a plurality of processes in a parallel computing system are stored by obtaining the plurality of archived files from the parallel computing system; converting the plurality of archived files to objects using a log structured file system middleware process; and providing the objects for storage in a cloud object storage system. The plurality of processes may run, for example, on a plurality of compute nodes. The log structured file system middleware process may be embodied, for example, as a Parallel Log-Structured File System (PLFS). The log structured file system middleware process optionally executes on a burst buffer node.

  10. Cloud object store for checkpoints of high performance computing applications using decoupling middleware

    Science.gov (United States)

    Bent, John M.; Faibish, Sorin; Grider, Gary

    2016-04-19

    Cloud object storage is enabled for checkpoints of high performance computing applications using a middleware process. A plurality of files, such as checkpoint files, generated by a plurality of processes in a parallel computing system are stored by obtaining said plurality of files from said parallel computing system; converting said plurality of files to objects using a log structured file system middleware process; and providing said objects for storage in a cloud object storage system. The plurality of processes may run, for example, on a plurality of compute nodes. The log structured file system middleware process may be embodied, for example, as a Parallel Log-Structured File System (PLFS). The log structured file system middleware process optionally executes on a burst buffer node.

  11. Can We Build a Truly High Performance Computer Which is Flexible and Transparent?

    KAUST Repository

    Rojas, Jhonathan Prieto

    2013-09-10

    State-of-the art computers need high performance transistors, which consume ultra-low power resulting in longer battery lifetime. Billions of transistors are integrated neatly using matured silicon fabrication process to maintain the performance per cost advantage. In that context, low-cost mono-crystalline bulk silicon (100) based high performance transistors are considered as the heart of today\\'s computers. One limitation is silicon\\'s rigidity and brittleness. Here we show a generic batch process to convert high performance silicon electronics into flexible and semi-transparent one while retaining its performance, process compatibility, integration density and cost. We demonstrate high-k/metal gate stack based p-type metal oxide semiconductor field effect transistors on 4 inch silicon fabric released from bulk silicon (100) wafers with sub-threshold swing of 80 mV dec(-1) and on/off ratio of near 10(4) within 10% device uniformity with a minimum bending radius of 5 mm and an average transmittance of similar to 7% in the visible spectrum.

  12. Can we build a truly high performance computer which is flexible and transparent?

    Science.gov (United States)

    Rojas, Jhonathan P; Torres Sevilla, Galo A; Hussain, Muhammad M

    2013-01-01

    State-of-the art computers need high performance transistors, which consume ultra-low power resulting in longer battery lifetime. Billions of transistors are integrated neatly using matured silicon fabrication process to maintain the performance per cost advantage. In that context, low-cost mono-crystalline bulk silicon (100) based high performance transistors are considered as the heart of today's computers. One limitation is silicon's rigidity and brittleness. Here we show a generic batch process to convert high performance silicon electronics into flexible and semi-transparent one while retaining its performance, process compatibility, integration density and cost. We demonstrate high-k/metal gate stack based p-type metal oxide semiconductor field effect transistors on 4 inch silicon fabric released from bulk silicon (100) wafers with sub-threshold swing of 80 mV dec(-1) and on/off ratio of near 10(4) within 10% device uniformity with a minimum bending radius of 5 mm and an average transmittance of ~7% in the visible spectrum.

  13. Developing a High Performance Software Library with MPI and CUDA for Matrix Computations

    Directory of Open Access Journals (Sweden)

    Bogdan Oancea

    2014-04-01

    Full Text Available Nowadays, the paradigm of parallel computing is changing. CUDA is now a popular programming model for general purpose computations on GPUs and a great number of applications were ported to CUDA obtaining speedups of orders of magnitude comparing to optimized CPU implementations. Hybrid approaches that combine the message passing model with the shared memory model for parallel computing are a solution for very large applications. We considered a heterogeneous cluster that combines the CPU and GPU computations using MPI and CUDA for developing a high performance linear algebra library. Our library deals with large linear systems solvers because they are a common problem in the fields of science and engineering. Direct methods for computing the solution of such systems can be very expensive due to high memory requirements and computational cost. An efficient alternative are iterative methods which computes only an approximation of the solution. In this paper we present an implementation of a library that uses a hybrid model of computation using MPI and CUDA implementing both direct and iterative linear systems solvers. Our library implements LU and Cholesky factorization based solvers and some of the non-stationary iterative methods using the MPI/CUDA combination. We compared the performance of our MPI/CUDA implementation with classic programs written to be run on a single CPU.

  14. Heads in the Cloud: A Primer on Neuroimaging Applications of High Performance Computing.

    Science.gov (United States)

    Shatil, Anwar S; Younas, Sohail; Pourreza, Hossein; Figley, Chase R

    2015-01-01

    With larger data sets and more sophisticated analyses, it is becoming increasingly common for neuroimaging researchers to push (or exceed) the limitations of standalone computer workstations. Nonetheless, although high-performance computing platforms such as clusters, grids and clouds are already in routine use by a small handful of neuroimaging researchers to increase their storage and/or computational power, the adoption of such resources by the broader neuroimaging community remains relatively uncommon. Therefore, the goal of the current manuscript is to: 1) inform prospective users about the similarities and differences between computing clusters, grids and clouds; 2) highlight their main advantages; 3) discuss when it may (and may not) be advisable to use them; 4) review some of their potential problems and barriers to access; and finally 5) give a few practical suggestions for how interested new users can start analyzing their neuroimaging data using cloud resources. Although the aim of cloud computing is to hide most of the complexity of the infrastructure management from end-users, we recognize that this can still be an intimidating area for cognitive neuroscientists, psychologists, neurologists, radiologists, and other neuroimaging researchers lacking a strong computational background. Therefore, with this in mind, we have aimed to provide a basic introduction to cloud computing in general (including some of the basic terminology, computer architectures, infrastructure and service models, etc.), a practical overview of the benefits and drawbacks, and a specific focus on how cloud resources can be used for various neuroimaging applications.

  15. On the impact of quantum computing technology on future developments in high-performance scientific computing

    NARCIS (Netherlands)

    Möller, M.; Vuik, C.

    2017-01-01

    Quantum computing technologies have become a hot topic in academia and industry receiving much attention and financial support from all sides. Building a quantum computer that can be used practically is in itself an outstanding challenge that has become the ‘new race to the moon’. Next to

  16. Automating fault tolerance in high-performance computational biological jobs using multi-agent approaches.

    Science.gov (United States)

    Varghese, Blesson; McKee, Gerard; Alexandrov, Vassil

    2014-05-01

    Large-scale biological jobs on high-performance computing systems require manual intervention if one or more computing cores on which they execute fail. This places not only a cost on the maintenance of the job, but also a cost on the time taken for reinstating the job and the risk of losing data and execution accomplished by the job before it failed. Approaches which can proactively detect computing core failures and take action to relocate the computing core׳s job onto reliable cores can make a significant step towards automating fault tolerance. This paper describes an experimental investigation into the use of multi-agent approaches for fault tolerance. Two approaches are studied, the first at the job level and the second at the core level. The approaches are investigated for single core failure scenarios that can occur in the execution of parallel reduction algorithms on computer clusters. A third approach is proposed that incorporates multi-agent technology both at the job and core level. Experiments are pursued in the context of genome searching, a popular computational biology application. The key conclusion is that the approaches proposed are feasible for automating fault tolerance in high-performance computing systems with minimal human intervention. In a typical experiment in which the fault tolerance is studied, centralised and decentralised checkpointing approaches on an average add 90% to the actual time for executing the job. On the other hand, in the same experiment the multi-agent approaches add only 10% to the overall execution time. Copyright © 2014 Elsevier Ltd. All rights reserved.

  17. A comprehensive approach to decipher biological computation to achieve next generation high-performance exascale computing.

    Energy Technology Data Exchange (ETDEWEB)

    James, Conrad D.; Schiess, Adrian B.; Howell, Jamie; Baca, Michael J.; Partridge, L. Donald; Finnegan, Patrick Sean; Wolfley, Steven L.; Dagel, Daryl James; Spahn, Olga Blum; Harper, Jason C.; Pohl, Kenneth Roy; Mickel, Patrick R.; Lohn, Andrew; Marinella, Matthew

    2013-10-01

    The human brain (volume=1200cm3) consumes 20W and is capable of performing > 10^16 operations/s. Current supercomputer technology has reached 1015 operations/s, yet it requires 1500m^3 and 3MW, giving the brain a 10^12 advantage in operations/s/W/cm^3. Thus, to reach exascale computation, two achievements are required: 1) improved understanding of computation in biological tissue, and 2) a paradigm shift towards neuromorphic computing where hardware circuits mimic properties of neural tissue. To address 1), we will interrogate corticostriatal networks in mouse brain tissue slices, specifically with regard to their frequency filtering capabilities as a function of input stimulus. To address 2), we will instantiate biological computing characteristics such as multi-bit storage into hardware devices with future computational and memory applications. Resistive memory devices will be modeled, designed, and fabricated in the MESA facility in consultation with our internal and external collaborators.

  18. DOE High Performance Computing Operational Review (HPCOR): Enabling Data-Driven Scientific Discovery at HPC Facilities

    Energy Technology Data Exchange (ETDEWEB)

    Gerber, Richard; Allcock, William; Beggio, Chris; Campbell, Stuart; Cherry, Andrew; Cholia, Shreyas; Dart, Eli; England, Clay; Fahey, Tim; Foertter, Fernanda; Goldstone, Robin; Hick, Jason; Karelitz, David; Kelly, Kaki; Monroe, Laura; Prabhat,; Skinner, David; White, Julia

    2014-10-17

    U.S. Department of Energy (DOE) High Performance Computing (HPC) facilities are on the verge of a paradigm shift in the way they deliver systems and services to science and engineering teams. Research projects are producing a wide variety of data at unprecedented scale and level of complexity, with community-specific services that are part of the data collection and analysis workflow. On June 18-19, 2014 representatives from six DOE HPC centers met in Oakland, CA at the DOE High Performance Operational Review (HPCOR) to discuss how they can best provide facilities and services to enable large-scale data-driven scientific discovery at the DOE national laboratories. The report contains findings from that review.

  19. Speeding up ecological and evolutionary computations in R; essentials of high performance computing for biologists.

    Science.gov (United States)

    Visser, Marco D; McMahon, Sean M; Merow, Cory; Dixon, Philip M; Record, Sydne; Jongejans, Eelke

    2015-03-01

    Computation has become a critical component of research in biology. A risk has emerged that computational and programming challenges may limit research scope, depth, and quality. We review various solutions to common computational efficiency problems in ecological and evolutionary research. Our review pulls together material that is currently scattered across many sources and emphasizes those techniques that are especially effective for typical ecological and environmental problems. We demonstrate how straightforward it can be to write efficient code and implement techniques such as profiling or parallel computing. We supply a newly developed R package (aprof) that helps to identify computational bottlenecks in R code and determine whether optimization can be effective. Our review is complemented by a practical set of examples and detailed Supporting Information material (S1-S3 Texts) that demonstrate large improvements in computational speed (ranging from 10.5 times to 14,000 times faster). By improving computational efficiency, biologists can feasibly solve more complex tasks, ask more ambitious questions, and include more sophisticated analyses in their research.

  20. Speeding up ecological and evolutionary computations in R; essentials of high performance computing for biologists.

    Directory of Open Access Journals (Sweden)

    Marco D Visser

    2015-03-01

    Full Text Available Computation has become a critical component of research in biology. A risk has emerged that computational and programming challenges may limit research scope, depth, and quality. We review various solutions to common computational efficiency problems in ecological and evolutionary research. Our review pulls together material that is currently scattered across many sources and emphasizes those techniques that are especially effective for typical ecological and environmental problems. We demonstrate how straightforward it can be to write efficient code and implement techniques such as profiling or parallel computing. We supply a newly developed R package (aprof that helps to identify computational bottlenecks in R code and determine whether optimization can be effective. Our review is complemented by a practical set of examples and detailed Supporting Information material (S1-S3 Texts that demonstrate large improvements in computational speed (ranging from 10.5 times to 14,000 times faster. By improving computational efficiency, biologists can feasibly solve more complex tasks, ask more ambitious questions, and include more sophisticated analyses in their research.

  1. Molecular dynamics-based virtual screening: accelerating the drug discovery process by high-performance computing.

    Science.gov (United States)

    Ge, Hu; Wang, Yu; Li, Chanjuan; Chen, Nanhao; Xie, Yufang; Xu, Mengyan; He, Yingyan; Gu, Xinchun; Wu, Ruibo; Gu, Qiong; Zeng, Liang; Xu, Jun

    2013-10-28

    High-performance computing (HPC) has become a state strategic technology in a number of countries. One hypothesis is that HPC can accelerate biopharmaceutical innovation. Our experimental data demonstrate that HPC can significantly accelerate biopharmaceutical innovation by employing molecular dynamics-based virtual screening (MDVS). Without using HPC, MDVS for a 10K compound library with tens of nanoseconds of MD simulations requires years of computer time. In contrast, a state of the art HPC can be 600 times faster than an eight-core PC server is in screening a typical drug target (which contains about 40K atoms). Also, careful design of the GPU/CPU architecture can reduce the HPC costs. However, the communication cost of parallel computing is a bottleneck that acts as the main limit of further virtual screening improvements for drug innovations.

  2. High performance photonic reservoir computer based on a coherently driven passive cavity

    CERN Document Server

    Vinckier, Quentin; Smerieri, Anteo; Vandoorne, Kristof; Bienstman, Peter; Haelterman, Marc; Massar, Serge

    2015-01-01

    Reservoir computing is a recent bio-inspired approach for processing time-dependent signals. It has enabled a breakthrough in analog information processing, with several experiments, both electronic and optical, demonstrating state-of-the-art performances for hard tasks such as speech recognition, time series prediction and nonlinear channel equalization. A proof-of-principle experiment using a linear optical circuit on a photonic chip to process digital signals was recently reported. Here we present the first implementation of a photonic reservoir computer based on a coherently driven passive fiber cavity processing analog signals. Our experiment surpasses all previous experiments on a wide variety of tasks, and also has lower power consumption. Furthermore, the analytical model describing our experiment is also of interest, as it arguably constitutes the simplest high performance reservoir computer algorithm introduced so far. The present experiment, given its remarkable performances, low energy consumption...

  3. SCEAPI: A unified Restful Web API for High-Performance Computing

    Science.gov (United States)

    Rongqiang, Cao; Haili, Xiao; Shasha, Lu; Yining, Zhao; Xiaoning, Wang; Xuebin, Chi

    2017-10-01

    The development of scientific computing is increasingly moving to collaborative web and mobile applications. All these applications need high-quality programming interface for accessing heterogeneous computing resources consisting of clusters, grid computing or cloud computing. In this paper, we introduce our high-performance computing environment that integrates computing resources from 16 HPC centers across China. Then we present a bundle of web services called SCEAPI and describe how it can be used to access HPC resources with HTTP or HTTPs protocols. We discuss SCEAPI from several aspects including architecture, implementation and security, and address specific challenges in designing compatible interfaces and protecting sensitive data. We describe the functions of SCEAPI including authentication, file transfer and job management for creating, submitting and monitoring, and how to use SCEAPI in an easy-to-use way. Finally, we discuss how to exploit more HPC resources quickly for the ATLAS experiment by implementing the custom ARC compute element based on SCEAPI, and our work shows that SCEAPI is an easy-to-use and effective solution to extend opportunistic HPC resources.

  4. Efficient use of high performance computers for integrated controls and structures design. [of large space platforms

    Science.gov (United States)

    Belvin, W. K.; Maghami, P. G.; Nguyen, D. T.

    1992-01-01

    Simply transporting design codes from sequential-scalar computers to parallel-vector computers does not fully utilize the computational benefits offered by high performance computers. By performing integrated controls and structures design on an experimental truss platform with both sequential-scalar and parallel-vector design codes, conclusive results are presented to substantiate this claim. The efficiency of a Cholesky factorization scheme in conjunction with a variable-band row data structure is presented. In addition, the Lanczos eigensolution algorithm has been incorporated in the design code for both parallel and vector computations. Comparisons of computational efficiency between the initial design code and the parallel-vector design code are presented. It is shown that the Lanczos algorithm with the Cholesky factorization scheme is far superior to the sub-space iteration method of eigensolution when substantial numbers of eigenvectors are required for control design and/or performance optimization. Integrated design results show the need for continued efficiency studies in the area of element computations and matrix assembly.

  5. Soft Computing Techniques for the Protein Folding Problem on High Performance Computing Architectures.

    Science.gov (United States)

    Llanes, Antonio; Muñoz, Andrés; Bueno-Crespo, Andrés; García-Valverde, Teresa; Sánchez, Antonia; Arcas-Túnez, Francisco; Pérez-Sánchez, Horacio; Cecilia, José M

    2016-01-01

    The protein-folding problem has been extensively studied during the last fifty years. The understanding of the dynamics of global shape of a protein and the influence on its biological function can help us to discover new and more effective drugs to deal with diseases of pharmacological relevance. Different computational approaches have been developed by different researchers in order to foresee the threedimensional arrangement of atoms of proteins from their sequences. However, the computational complexity of this problem makes mandatory the search for new models, novel algorithmic strategies and hardware platforms that provide solutions in a reasonable time frame. We present in this revision work the past and last tendencies regarding protein folding simulations from both perspectives; hardware and software. Of particular interest to us are both the use of inexact solutions to this computationally hard problem as well as which hardware platforms have been used for running this kind of Soft Computing techniques.

  6. Computational Modeling and High Performance Computing in Advanced Materials Processing, Synthesis, and Design

    Science.gov (United States)

    2014-12-07

    methods Journal of Computational Physics, Volume 200, Issue 2, (2004) 251-266. [17] Andersen C. H., Molecular dynamics simulations at constant pressure...Physics, Volume 200, Issue 2, (2004) 251-266. 47 [22] Andersen C. H., Molecular dynamics simulations at constant pressure and/or temperature...thriving game industry [1, 2, 7]. The differences in CPU and GPU computational power can be traced back to the designs upon which the two are predicated

  7. HIGH-PERFORMANCE COMPUTING FOR THE STUDY OF EARTH AND ENVIRONMENTAL SCIENCE MATERIALS USING SYNCHROTRON X-RAY COMPUTED MICROTOMOGRAPHY.

    Energy Technology Data Exchange (ETDEWEB)

    FENG,H.; JONES,K.W.; MCGUIGAN,M.; SMITH,G.J.; SPILETIC,J.

    2001-10-12

    Synchrotron x-ray computed microtomography (CMT) is a non-destructive method for examination of rock, soil, and other types of samples studied in the earth and environmental sciences. The high x-ray intensities of the synchrotron source make possible the acquisition of tomographic volumes at a high rate that requires the application of high-performance computing techniques for data reconstruction to produce the three-dimensional volumes, for their visualization, and for data analysis. These problems are exacerbated by the need to share information between collaborators at widely separated locations over both local and tide-area networks. A summary of the CMT technique and examples of applications are given here together with a discussion of the applications of high-performance computing methods to improve the experimental techniques and analysis of the data.

  8. Development of high performance casting analysis software by coupled parallel computation

    Directory of Open Access Journals (Sweden)

    Sang Hyun CHO

    2007-08-01

    Full Text Available Up to now, so much casting analysis software has been continuing to develop the new access way to real casting processes. Those include the melt flow analysis, heat transfer analysis for solidification calculation, mechanical property predictions and microstructure predictions. These trials were successful to obtain the ideal results comparing with real situations, so that CAE technologies became inevitable to design or develop new casting processes. But for manufacturing fields, CAE technologies are not so frequently being used because of their difficulties in using the software or insufficient computing performances. To introduce CAE technologies to manufacturing field, the high performance analysis is essential to shorten the gap between product designing time and prototyping time. The software code optimization can be helpful, but it is not enough, because the codes developed by software experts are already optimized enough. As an alternative proposal for high performance computations, the parallel computation technologies are eagerly being applied to CAE technologies to make the analysis time shorter. In this research, SMP (Shared Memory Processing and MPI (Message Passing Interface (1 methods for parallelization were applied to commercial software "Z-Cast" to calculate the casting processes. In the code parallelizing processes, the network stabilization, core optimization were also carried out under Microsoft Windows platform and their performances and results were compared with those of normal linear analysis codes.

  9. A high-performance reconfigurable computing solution for Peptide mass fingerprinting.

    Science.gov (United States)

    Coca, Daniel; Bogdan, Istvan; Beynon, Robert J

    2010-01-01

    High-throughput, MS-based proteomics studies are generating very large volumes of biologically relevant data. Given the central role of proteomics in emerging fields such as system/synthetic biology and biomarker discovery, the amount of proteomic data is expected to grow at unprecedented rates over the next decades. At the moment, there is pressing need for high-performance computational solutions to accelerate the analysis and interpretation of this data.Performance gains achieved by grid computing in this area are not spectacular, especially given the significant power consumption, maintenance costs and floor space required by large server farms.This paper introduces an alternative, cost-effective high-performance bioinformatics solution for peptide mass fingerprinting based on Field Programmable Gate Array (FPGA) devices. At the heart of this approach stands the concept of mapping algorithms on custom digital hardware that can be programmed to run on FPGA. Specifically in this case, the entire computational flow associated with peptide mass fingerprinting, namely raw mass spectra processing and database searching, has been mapped on custom hardware processors that are programmed to run on a multi-FPGA system coupled with a conventional PC server. The system achieves an almost 2,000-fold speed-up when compared with a conventional implementation of the algorithms in software running on a 3.06 GHz Xeon PC server.

  10. Image Processor Electronics (IPE): The High-Performance Computing System for NASA SWIFT Mission

    Science.gov (United States)

    Nguyen, Quang H.; Settles, Beverly A.

    2003-01-01

    Gamma Ray Bursts (GRBs) are believed to be the most powerful explosions that have occurred in the Universe since the Big Bang and are a mystery to the scientific community. Swift, a NASA mission that includes international participation, was designed and built in preparation for a 2003 launch to help to determine the origin of Gamma Ray Bursts. Locating the position in the sky where a burst originates requires intensive computing, because the duration of a GRB can range between a few milliseconds up to approximately a minute. The instrument data system must constantly accept multiple images representing large regions of the sky that are generated by sixteen gamma ray detectors operating in parallel. It then must process the received images very quickly in order to determine the existence of possible gamma ray bursts and their locations. The high-performance instrument data computing system that accomplishes this is called the Image Processor Electronics (IPE). The IPE was designed, built and tested by NASA Goddard Space Flight Center (GSFC) in order to meet these challenging requirements. The IPE is a small size, low power and high performing computing system for space applications. This paper addresses the system implementation and the system hardware architecture of the IPE. The paper concludes with the IPE system performance that was measured during end-to-end system testing.

  11. Kemari: A Portable High Performance Fortran System for Distributed Memory Parallel Processors

    Directory of Open Access Journals (Sweden)

    T. Kamachi

    1997-01-01

    Full Text Available We have developed a compilation system which extends High Performance Fortran (HPF in various aspects. We support the parallelization of well-structured problems with loop distribution and alignment directives similar to HPF's data distribution directives. Such directives give both additional control to the user and simplify the compilation process. For the support of unstructured problems, we provide directives for dynamic data distribution through user-defined mappings. The compiler also allows integration of message-passing interface (MPI primitives. The system is part of a complete programming environment which also comprises a parallel debugger and a performance monitor and analyzer. After an overview of the compiler, we describe the language extensions and related compilation mechanisms in detail. Performance measurements demonstrate the compiler's applicability to a variety of application classes.

  12. DOE Greenbook - Needs and Directions in High-Performance Computing for the Office of Science

    Energy Technology Data Exchange (ETDEWEB)

    Rotman, D; Harding, P

    2002-04-01

    researchers. (1) High-Performance Computing Technology; (2) Advanced Software Technology and Algorithms; (3) Energy Sciences Network; and (4) Basic Research and Human Resources. In addition to the availability from the vendor community, these components determine the implementation and direction of the development of the supercomputing resources for the OS community. In this document we will identify scientific and computational needs from across the five Office of Science organizations: High Energy and Nuclear Physics, Basic Energy Sciences, Fusion Energy Science, Biological and Environmental Research, and Advanced Scientific Computing Research. We will also delineate the current suite of NERSC computational and human resources. Finally, we will provide a set of recommendations that will guide the utilization of current and future computational resources at the DOE NERSC.

  13. High Performance Computing and Storage Requirements for Nuclear Physics: Target 2017

    Energy Technology Data Exchange (ETDEWEB)

    Gerber, Richard [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Wasserman, Harvey [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2014-04-30

    In April 2014, NERSC, ASCR, and the DOE Office of Nuclear Physics (NP) held a review to characterize high performance computing (HPC) and storage requirements for NP research through 2017. This review is the 12th in a series of reviews held by NERSC and Office of Science program offices that began in 2009. It is the second for NP, and the final in the second round of reviews that covered the six Office of Science program offices. This report is the result of that review

  14. Failure detection in high-performance clusters and computers using chaotic map computations

    Science.gov (United States)

    Rao, Nageswara S.

    2015-09-01

    A programmable media includes a processing unit capable of independent operation in a machine that is capable of executing 10.sup.18 floating point operations per second. The processing unit is in communication with a memory element and an interconnect that couples computing nodes. The programmable media includes a logical unit configured to execute arithmetic functions, comparative functions, and/or logical functions. The processing unit is configured to detect computing component failures, memory element failures and/or interconnect failures by executing programming threads that generate one or more chaotic map trajectories. The central processing unit or graphical processing unit is configured to detect a computing component failure, memory element failure and/or an interconnect failure through an automated comparison of signal trajectories generated by the chaotic maps.

  15. National High-Performance Computing and Networking Act. Report To Accompany S. 343, Senate, 102d Congess, 1st Session.

    Science.gov (United States)

    Congress of the U.S., Washington, DC. Senate Committee on Energy and Natural Resources.

    The purpose of the bill (S. 343), as reported by the Senate Committee on Energy and Natural Resources, is to establish a federal commitment to the advancement of high-performance computing, improve interagency planning and coordination of federal high-performance computing and networking activities, authorize a national high-speed computer…

  16. Parallel Backprojection: A Case Study in High-Performance Reconfigurable Computing

    Directory of Open Access Journals (Sweden)

    Cordes Ben

    2009-01-01

    Full Text Available High-performance reconfigurable computing (HPRC is a novel approach to provide large-scale computing power to modern scientific applications. Using both general-purpose processors and FPGAs allows application designers to exploit fine-grained and coarse-grained parallelism, achieving high degrees of speedup. One scientific application that benefits from this technique is backprojection, an image formation algorithm that can be used as part of a synthetic aperture radar (SAR processing system. We present an implementation of backprojection for SAR on an HPRC system. Using simulated data taken at a variety of ranges, our implementation runs over 200 times faster than a similar software program, with an overall application speedup better than 50x. The backprojection application is easily parallelizable, achieving near-linear speedup when run on multiple nodes of a clustered HPRC system. The results presented can be applied to other systems and other algorithms with similar characteristics.

  17. Parallel Backprojection: A Case Study in High-Performance Reconfigurable Computing

    Directory of Open Access Journals (Sweden)

    2009-03-01

    Full Text Available High-performance reconfigurable computing (HPRC is a novel approach to provide large-scale computing power to modern scientific applications. Using both general-purpose processors and FPGAs allows application designers to exploit fine-grained and coarse-grained parallelism, achieving high degrees of speedup. One scientific application that benefits from this technique is backprojection, an image formation algorithm that can be used as part of a synthetic aperture radar (SAR processing system. We present an implementation of backprojection for SAR on an HPRC system. Using simulated data taken at a variety of ranges, our implementation runs over 200 times faster than a similar software program, with an overall application speedup better than 50x. The backprojection application is easily parallelizable, achieving near-linear speedup when run on multiple nodes of a clustered HPRC system. The results presented can be applied to other systems and other algorithms with similar characteristics.

  18. 9th International Workshop on Parallel Tools for High Performance Computing

    CERN Document Server

    Hilbrich, Tobias; Niethammer, Christoph; Gracia, José; Nagel, Wolfgang; Resch, Michael

    2016-01-01

    High Performance Computing (HPC) remains a driver that offers huge potentials and benefits for science and society. However, a profound understanding of the computational matters and specialized software is needed to arrive at effective and efficient simulations. Dedicated software tools are important parts of the HPC software landscape, and support application developers. Even though a tool is by definition not a part of an application, but rather a supplemental piece of software, it can make a fundamental difference during the development of an application. Such tools aid application developers in the context of debugging, performance analysis, and code optimization, and therefore make a major contribution to the development of robust and efficient parallel software. This book introduces a selection of the tools presented and discussed at the 9th International Parallel Tools Workshop held in Dresden, Germany, September 2-3, 2015, which offered an established forum for discussing the latest advances in paral...

  19. A new massively parallel version of CRYSTAL for large systems on high performance computing architectures.

    Science.gov (United States)

    Orlando, Roberto; Delle Piane, Massimo; Bush, Ian J; Ugliengo, Piero; Ferrabone, Matteo; Dovesi, Roberto

    2012-10-30

    Fully ab initio treatment of complex solid systems needs computational software which is able to efficiently take advantage of the growing power of high performance computing (HPC) architectures. Recent improvements in CRYSTAL, a periodic ab initio code that uses a Gaussian basis set, allows treatment of very large unit cells for crystalline systems on HPC architectures with high parallel efficiency in terms of running time and memory requirements. The latter is a crucial point, due to the trend toward architectures relying on a very high number of cores with associated relatively low memory availability. An exhaustive performance analysis shows that density functional calculations, based on a hybrid functional, of low-symmetry systems containing up to 100,000 atomic orbitals and 8000 atoms are feasible on the most advanced HPC architectures available to European researchers today, using thousands of processors. Copyright © 2012 Wiley Periodicals, Inc.

  20. High Performance Computing - Power Application Programming Interface Specification Version 1.4

    Energy Technology Data Exchange (ETDEWEB)

    Laros III, James H. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); DeBonis, David [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Grant, Ryan [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Kelly, Suzanne M. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Levenhagen, Michael J. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Olivier, Stephen Lecler [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Pedretti, Kevin [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2016-10-01

    Measuring and controlling the power and energy consumption of high performance computing systems by various components in the software stack is an active research area [13, 3, 5, 10, 4, 21, 19, 16, 7, 17, 20, 18, 11, 1, 6, 14, 12]. Implementations in lower level software layers are beginning to emerge in some production systems, which is very welcome. To be most effective, a portable interface to measurement and control features would significantly facilitate participation by all levels of the software stack. We present a proposal for a standard power Application Programming Interface (API) that endeavors to cover the entire software space, from generic hardware interfaces to the input from the computer facility manager.

  1. High Performance Computing and Visualization Infrastructure for Simultaneous Parallel Computing and Parallel Visualization Research

    Science.gov (United States)

    2016-11-09

    NVIDIA GPGPU’s, and 14 Intel Co-Phi processors. The visualization infrastructure is a next generation tiled Mini CAVE for semi-immersive visualization...compute nodes with 340 cores, 20 NVIDIA GPGPU’s, and 14 Intel Co-Phi processors. The visualization infrastructure is a next generation tiled Mini CAVE...gas engines, long-range acoustic propagation simulations, numerical modeling of nonlinear nanophotonic devices, and molecular dynamics simulations

  2. Towards Real-Time High Performance Computing For Power Grid Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Hui, Peter SY; Lee, Barry; Chikkagoudar, Satish

    2012-11-16

    Real-time computing has traditionally been considered largely in the context of single-processor and embedded systems, and indeed, the terms real-time computing, embedded systems, and control systems are often mentioned in closely related contexts. However, real-time computing in the context of multinode systems, specifically high-performance, cluster-computing systems, remains relatively unexplored. Imposing real-time constraints on a parallel (cluster) computing environment introduces a variety of challenges with respect to the formal verification of the system's timing properties. In this paper, we give a motivating example to demonstrate the need for such a system--- an application to estimate the electromechanical states of the power grid--- and we introduce a formal method for performing verification of certain temporal properties within a system of parallel processes. We describe our work towards a full real-time implementation of the target application--- namely, our progress towards extracting a key mathematical kernel from the application, the formal process by which we analyze the intricate timing behavior of the processes on the cluster, as well as timing measurements taken on our test cluster to demonstrate use of these concepts.

  3. Acceleration of FDTD mode solver by high-performance computing techniques.

    Science.gov (United States)

    Han, Lin; Xi, Yanping; Huang, Wei-Ping

    2010-06-21

    A two-dimensional (2D) compact finite-difference time-domain (FDTD) mode solver is developed based on wave equation formalism in combination with the matrix pencil method (MPM). The method is validated for calculation of both real guided and complex leaky modes of typical optical waveguides against the bench-mark finite-difference (FD) eigen mode solver. By taking advantage of the inherent parallel nature of the FDTD algorithm, the mode solver is implemented on graphics processing units (GPUs) using the compute unified device architecture (CUDA). It is demonstrated that the high-performance computing technique leads to significant acceleration of the FDTD mode solver with more than 30 times improvement in computational efficiency in comparison with the conventional FDTD mode solver running on CPU of a standard desktop computer. The computational efficiency of the accelerated FDTD method is in the same order of magnitude of the standard finite-difference eigen mode solver and yet require much less memory (e.g., less than 10%). Therefore, the new method may serve as an efficient, accurate and robust tool for mode calculation of optical waveguides even when the conventional eigen value mode solvers are no longer applicable due to memory limitation.

  4. The Centre of High-Performance Scientific Computing, Geoverbund, ABC/J - Geosciences enabled by HPSC

    Science.gov (United States)

    Kollet, Stefan; Görgen, Klaus; Vereecken, Harry; Gasper, Fabian; Hendricks-Franssen, Harrie-Jan; Keune, Jessica; Kulkarni, Ketan; Kurtz, Wolfgang; Sharples, Wendy; Shrestha, Prabhakar; Simmer, Clemens; Sulis, Mauro; Vanderborght, Jan

    2016-04-01

    The Centre of High-Performance Scientific Computing (HPSC TerrSys) was founded 2011 to establish a centre of competence in high-performance scientific computing in terrestrial systems and the geosciences enabling fundamental and applied geoscientific research in the Geoverbund ABC/J (geoscientfic research alliance of the Universities of Aachen, Cologne, Bonn and the Research Centre Jülich, Germany). The specific goals of HPSC TerrSys are to achieve relevance at the national and international level in (i) the development and application of HPSC technologies in the geoscientific community; (ii) student education; (iii) HPSC services and support also to the wider geoscientific community; and in (iv) the industry and public sectors via e.g., useful applications and data products. A key feature of HPSC TerrSys is the Simulation Laboratory Terrestrial Systems, which is located at the Jülich Supercomputing Centre (JSC) and provides extensive capabilities with respect to porting, profiling, tuning and performance monitoring of geoscientific software in JSC's supercomputing environment. We will present a summary of success stories of HPSC applications including integrated terrestrial model development, parallel profiling and its application from watersheds to the continent; massively parallel data assimilation using physics-based models and ensemble methods; quasi-operational terrestrial water and energy monitoring; and convection permitting climate simulations over Europe. The success stories stress the need for a formalized education of students in the application of HPSC technologies in future.

  5. Measuring and tuning energy efficiency on large scale high performance computing platforms.

    Energy Technology Data Exchange (ETDEWEB)

    Laros, James H., III

    2011-08-01

    Recognition of the importance of power in the field of High Performance Computing, whether it be as an obstacle, expense or design consideration, has never been greater and more pervasive. While research has been conducted on many related aspects, there is a stark absence of work focused on large scale High Performance Computing. Part of the reason is the lack of measurement capability currently available on small or large platforms. Typically, research is conducted using coarse methods of measurement such as inserting a power meter between the power source and the platform, or fine grained measurements using custom instrumented boards (with obvious limitations in scale). To collect the measurements necessary to analyze real scientific computing applications at large scale, an in-situ measurement capability must exist on a large scale capability class platform. In response to this challenge, we exploit the unique power measurement capabilities of the Cray XT architecture to gain an understanding of power use and the effects of tuning. We apply these capabilities at the operating system level by deterministically halting cores when idle. At the application level, we gain an understanding of the power requirements of a range of important DOE/NNSA production scientific computing applications running at large scale (thousands of nodes), while simultaneously collecting current and voltage measurements on the hosting nodes. We examine the effects of both CPU and network bandwidth tuning and demonstrate energy savings opportunities of up to 39% with little or no impact on run-time performance. Capturing scale effects in our experimental results was key. Our results provide strong evidence that next generation large-scale platforms should not only approach CPU frequency scaling differently, but could also benefit from the capability to tune other platform components, such as the network, to achieve energy efficient performance.

  6. Computational visualization of unsteady flow around vehicles using high performance computing

    OpenAIRE

    Tsubokura, Makoto; Kobayashi, Toshio; Nakashima, Takuji; Nouzawa, Takahide; Nakamura, Takaki; Zhang, Huilai; Onishi, Keiji; Oshima, Nobuyuki

    2009-01-01

    One of the largest-scale unstructured Large Eddy Simulation (LES) of flow around a full-scale road vehicle is conducted on the Earth Simulator in Japan. The main objective of our study is to look into the validity of LES for the assessment of vehicle aerodynamics, especially in the context of its possibility for unsteady or transient aerodynamic forces. Firstly, the aerodynamic LES proposed is quantitatively validated on the ASMO simplified model by comparing the mean pressure distributions o...

  7. High performance computing system in the framework of the Higgs boson studies

    CERN Document Server

    Belyaev, Nikita; The ATLAS collaboration; Velikhov, Vasily; Konoplich, Rostislav

    2017-01-01

    The Higgs boson physics is one of the most important and promising fields of study in the modern high energy physics. It is important to notice, that GRID computing resources become strictly limited due to increasing amount of statistics, required for physics analyses and unprecedented LHC performance. One of the possibilities to address the shortfall of computing resources is the usage of computer institutes' clusters, commercial computing resources and supercomputers. To perform precision measurements of the Higgs boson properties in these realities, it is also highly required to have effective instruments to simulate kinematic distributions of signal events. In this talk we give a brief description of the modern distribution reconstruction method called Morphing and perform few efficiency tests to demonstrate its potential. These studies have been performed on the WLCG and Kurchatov Institute’s Data Processing Center, including Tier-1 GRID site and supercomputer as well. We also analyze the CPU efficienc...

  8. High Performance Computing Facility Operational Assessment, CY 2011 Oak Ridge Leadership Computing Facility

    Energy Technology Data Exchange (ETDEWEB)

    Baker, Ann E [ORNL; Barker, Ashley D [ORNL; Bland, Arthur S Buddy [ORNL; Boudwin, Kathlyn J. [ORNL; Hack, James J [ORNL; Kendall, Ricky A [ORNL; Messer, Bronson [ORNL; Rogers, James H [ORNL; Shipman, Galen M [ORNL; Wells, Jack C [ORNL; White, Julia C [ORNL; Hudson, Douglas L [ORNL

    2012-02-01

    Oak Ridge National Laboratory's Leadership Computing Facility (OLCF) continues to deliver the most powerful resources in the U.S. for open science. At 2.33 petaflops peak performance, the Cray XT Jaguar delivered more than 1.4 billion core hours in calendar year (CY) 2011 to researchers around the world for computational simulations relevant to national and energy security; advancing the frontiers of knowledge in physical sciences and areas of biological, medical, environmental, and computer sciences; and providing world-class research facilities for the nation's science enterprise. Users reported more than 670 publications this year arising from their use of OLCF resources. Of these we report the 300 in this review that are consistent with guidance provided. Scientific achievements by OLCF users cut across all range scales from atomic to molecular to large-scale structures. At the atomic scale, researchers discovered that the anomalously long half-life of Carbon-14 can be explained by calculating, for the first time, the very complex three-body interactions between all the neutrons and protons in the nucleus. At the molecular scale, researchers combined experimental results from LBL's light source and simulations on Jaguar to discover how DNA replication continues past a damaged site so a mutation can be repaired later. Other researchers combined experimental results from ORNL's Spallation Neutron Source and simulations on Jaguar to reveal the molecular structure of ligno-cellulosic material used in bioethanol production. This year, Jaguar has been used to do billion-cell CFD calculations to develop shock wave compression turbo machinery as a means to meet DOE goals for reducing carbon sequestration costs. General Electric used Jaguar to calculate the unsteady flow through turbo machinery to learn what efficiencies the traditional steady flow assumption is hiding from designers. Even a 1% improvement in turbine design can save the nation

  9. Enabling the ATLAS Experiment at the LHC for High Performance Computing

    CERN Document Server

    AUTHOR|(CDS)2091107; Ereditato, Antonio

    In this thesis, I studied the feasibility of running computer data analysis programs from the Worldwide LHC Computing Grid, in particular large-scale simulations of the ATLAS experiment at the CERN LHC, on current general purpose High Performance Computing (HPC) systems. An approach for integrating HPC systems into the Grid is proposed, which has been implemented and tested on the „Todi” HPC machine at the Swiss National Supercomputing Centre (CSCS). Over the course of the test, more than 500000 CPU-hours of processing time have been provided to ATLAS, which is roughly equivalent to the combined computing power of the two ATLAS clusters at the University of Bern. This showed that current HPC systems can be used to efficiently run large-scale simulations of the ATLAS detector and of the detected physics processes. As a first conclusion of my work, one can argue that, in perspective, running large-scale tasks on a few large machines might be more cost-effective than running on relatively small dedicated com...

  10. High Performance Computing of Meshless Time Domain Method on Multi-GPU Cluster

    Science.gov (United States)

    Ikuno, Soichiro; Nakata, Susumu; Hirokawa, Yuta; Itoh, Taku

    2015-01-01

    High performance computing of Meshless Time Domain Method (MTDM) on multi-GPU using the supercomputer HA-PACS (Highly Accelerated Parallel Advanced system for Computational Sciences) at University of Tsukuba is investigated. Generally, the finite difference time domain (FDTD) method is adopted for the numerical simulation of the electromagnetic wave propagation phenomena. However, the numerical domain must be divided into rectangle meshes, and it is difficult to adopt the problem in a complexed domain to the method. On the other hand, MTDM can be easily adept to the problem because MTDM does not requires meshes. In the present study, we implement MTDM on multi-GPU cluster to speedup the method, and numerically investigate the performance of the method on multi-GPU cluster. To reduce the computation time, the communication time between the decomposed domain is hided below the perfect matched layer (PML) calculation procedure. The results of computation show that speedup of MTDM on 128 GPUs is 173 times faster than that of single CPU calculation.

  11. Near Real-Time Probabilistic Damage Diagnosis Using Surrogate Modeling and High Performance Computing

    Science.gov (United States)

    Warner, James E.; Zubair, Mohammad; Ranjan, Desh

    2017-01-01

    This work investigates novel approaches to probabilistic damage diagnosis that utilize surrogate modeling and high performance computing (HPC) to achieve substantial computational speedup. Motivated by Digital Twin, a structural health management (SHM) paradigm that integrates vehicle-specific characteristics with continual in-situ damage diagnosis and prognosis, the methods studied herein yield near real-time damage assessments that could enable monitoring of a vehicle's health while it is operating (i.e. online SHM). High-fidelity modeling and uncertainty quantification (UQ), both critical to Digital Twin, are incorporated using finite element method simulations and Bayesian inference, respectively. The crux of the proposed Bayesian diagnosis methods, however, is the reformulation of the numerical sampling algorithms (e.g. Markov chain Monte Carlo) used to generate the resulting probabilistic damage estimates. To this end, three distinct methods are demonstrated for rapid sampling that utilize surrogate modeling and exploit various degrees of parallelism for leveraging HPC. The accuracy and computational efficiency of the methods are compared on the problem of strain-based crack identification in thin plates. While each approach has inherent problem-specific strengths and weaknesses, all approaches are shown to provide accurate probabilistic damage diagnoses and several orders of magnitude computational speedup relative to a baseline Bayesian diagnosis implementation.

  12. Multidisciplinary Design Optimization of a Full Vehicle with High Performance Computing

    Science.gov (United States)

    Yang, R. J.; Gu, L.; Tho, C. H.; Sobieszczanski-Sobieski, Jaroslaw

    2001-01-01

    Multidisciplinary design optimization (MDO) of a full vehicle under the constraints of crashworthiness, NVH (Noise, Vibration and Harshness), durability, and other performance attributes is one of the imperative goals for automotive industry. However, it is often infeasible due to the lack of computational resources, robust simulation capabilities, and efficient optimization methodologies. This paper intends to move closer towards that goal by using parallel computers for the intensive computation and combining different approximations for dissimilar analyses in the MDO process. The MDO process presented in this paper is an extension of the previous work reported by Sobieski et al. In addition to the roof crush, two full vehicle crash modes are added: full frontal impact and 50% frontal offset crash. Instead of using an adaptive polynomial response surface method, this paper employs a DOE/RSM method for exploring the design space and constructing highly nonlinear crash functions. Two NMO strategies are used and results are compared. This paper demonstrates that with high performance computing, a conventionally intractable real world full vehicle multidisciplinary optimization problem considering all performance attributes with large number of design variables become feasible.

  13. Sign: large-scale gene network estimation environment for high performance computing.

    Science.gov (United States)

    Tamada, Yoshinori; Shimamura, Teppei; Yamaguchi, Rui; Imoto, Seiya; Nagasaki, Masao; Miyano, Satoru

    2011-01-01

    Our research group is currently developing software for estimating large-scale gene networks from gene expression data. The software, called SiGN, is specifically designed for the Japanese flagship supercomputer "K computer" which is planned to achieve 10 petaflops in 2012, and other high performance computing environments including Human Genome Center (HGC) supercomputer system. SiGN is a collection of gene network estimation software with three different sub-programs: SiGN-BN, SiGN-SSM and SiGN-L1. In these three programs, five different models are available: static and dynamic nonparametric Bayesian networks, state space models, graphical Gaussian models, and vector autoregressive models. All these models require a huge amount of computational resources for estimating large-scale gene networks and therefore are designed to be able to exploit the speed of 10 petaflops. The software will be available freely for "K computer" and HGC supercomputer system users. The estimated networks can be viewed and analyzed by Cell Illustrator Online and SBiP (Systems Biology integrative Pipeline). The software project web site is available at http://sign.hgc.jp/ .

  14. High Performance Computing Facility Operational Assessment, FY 2011 Oak Ridge Leadership Computing Facility

    Energy Technology Data Exchange (ETDEWEB)

    Baker, Ann E [ORNL; Bland, Arthur S Buddy [ORNL; Hack, James J [ORNL; Barker, Ashley D [ORNL; Boudwin, Kathlyn J. [ORNL; Kendall, Ricky A [ORNL; Messer, Bronson [ORNL; Rogers, James H [ORNL; Shipman, Galen M [ORNL; Wells, Jack C [ORNL; White, Julia C [ORNL

    2011-08-01

    Oak Ridge National Laboratory's Leadership Computing Facility (OLCF) continues to deliver the most powerful resources in the U.S. for open science. At 2.33 petaflops peak performance, the Cray XT Jaguar delivered more than 1.5 billion core hours in calendar year (CY) 2010 to researchers around the world for computational simulations relevant to national and energy security; advancing the frontiers of knowledge in physical sciences and areas of biological, medical, environmental, and computer sciences; and providing world-class research facilities for the nation's science enterprise. Scientific achievements by OLCF users range from collaboration with university experimentalists to produce a working supercapacitor that uses atom-thick sheets of carbon materials to finely determining the resolution requirements for simulations of coal gasifiers and their components, thus laying the foundation for development of commercial-scale gasifiers. OLCF users are pushing the boundaries with software applications sustaining more than one petaflop of performance in the quest to illuminate the fundamental nature of electronic devices. Other teams of researchers are working to resolve predictive capabilities of climate models, to refine and validate genome sequencing, and to explore the most fundamental materials in nature - quarks and gluons - and their unique properties. Details of these scientific endeavors - not possible without access to leadership-class computing resources - are detailed in Section 4 of this report and in the INCITE in Review. Effective operations of the OLCF play a key role in the scientific missions and accomplishments of its users. This Operational Assessment Report (OAR) will delineate the policies, procedures, and innovations implemented by the OLCF to continue delivering a petaflop-scale resource for cutting-edge research. The 2010 operational assessment of the OLCF yielded recommendations that have been addressed (Reference Section 1) and

  15. High performance parallel computing of flows in complex geometries: II. Applications

    Energy Technology Data Exchange (ETDEWEB)

    Gourdain, N; Gicquel, L; Staffelbach, G; Vermorel, O; Duchaine, F; Boussuge, J-F [Computational Fluid Dynamics Team, CERFACS, Toulouse, 31057 (France); Poinsot, T [Institut de Mecanique des Fluides de Toulouse, Toulouse, 31400 (France)], E-mail: Nicolas.gourdain@cerfacs.fr

    2009-01-01

    Present regulations in terms of pollutant emissions, noise and economical constraints, require new approaches and designs in the fields of energy supply and transportation. It is now well established that the next breakthrough will come from a better understanding of unsteady flow effects and by considering the entire system and not only isolated components. However, these aspects are still not well taken into account by the numerical approaches or understood whatever the design stage considered. The main challenge is essentially due to the computational requirements inferred by such complex systems if it is to be simulated by use of supercomputers. This paper shows how new challenges can be addressed by using parallel computing platforms for distinct elements of a more complex systems as encountered in aeronautical applications. Based on numerical simulations performed with modern aerodynamic and reactive flow solvers, this work underlines the interest of high-performance computing for solving flow in complex industrial configurations such as aircrafts, combustion chambers and turbomachines. Performance indicators related to parallel computing efficiency are presented, showing that establishing fair criterions is a difficult task for complex industrial applications. Examples of numerical simulations performed in industrial systems are also described with a particular interest for the computational time and the potential design improvements obtained with high-fidelity and multi-physics computing methods. These simulations use either unsteady Reynolds-averaged Navier-Stokes methods or large eddy simulation and deal with turbulent unsteady flows, such as coupled flow phenomena (thermo-acoustic instabilities, buffet, etc). Some examples of the difficulties with grid generation and data analysis are also presented when dealing with these complex industrial applications.

  16. BEAGLE: an application programming interface and high-performance computing library for statistical phylogenetics.

    Science.gov (United States)

    Ayres, Daniel L; Darling, Aaron; Zwickl, Derrick J; Beerli, Peter; Holder, Mark T; Lewis, Paul O; Huelsenbeck, John P; Ronquist, Fredrik; Swofford, David L; Cummings, Michael P; Rambaut, Andrew; Suchard, Marc A

    2012-01-01

    Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software.

  17. Ultra-fast processing of gigapixel Tissue MicroArray images using high performance computing.

    Science.gov (United States)

    Wang, Yinhai; McCleary, David; Wang, Ching-Wei; Kelly, Paul; James, Jackie; Fennell, Dean A; Hamilton, Peter W

    2011-10-01

    Tissue MicroArrays (TMAs) are a valuable platform for tissue based translational research and the discovery of tissue biomarkers. The digitised TMA slides or TMA Virtual Slides, are ultra-large digital images, and can contain several hundred samples. The processing of such slides is time-consuming, bottlenecking a potentially high throughput platform. A High Performance Computing (HPC) platform for the rapid analysis of TMA virtual slides is presented in this study. Using an HP high performance cluster and a centralised dynamic load balancing approach, the simultaneous analysis of multiple tissue-cores were established. This was evaluated on Non-Small Cell Lung Cancer TMAs for complex analysis of tissue pattern and immunohistochemical positivity. The automated processing of a single TMA virtual slide containing 230 patient samples can be significantly speeded up by a factor of circa 22, bringing the analysis time to one minute. Over 90 TMAs could also be analysed simultaneously, speeding up multiplex biomarker experiments enormously. The methodologies developed in this paper provide for the first time a genuine high throughput analysis platform for TMA biomarker discovery that will significantly enhance the reliability and speed for biomarker research. This will have widespread implications in translational tissue based research.

  18. Low-cost, high-performance and efficiency computational photometer design

    Science.gov (United States)

    Siewert, Sam B.; Shihadeh, Jeries; Myers, Randall; Khandhar, Jay; Ivanov, Vitaly

    2014-05-01

    Researchers at the University of Alaska Anchorage and University of Colorado Boulder have built a low cost high performance and efficiency drop-in-place Computational Photometer (CP) to test in field applications ranging from port security and safety monitoring to environmental compliance monitoring and surveying. The CP integrates off-the-shelf visible spectrum cameras with near to long wavelength infrared detectors and high resolution digital snapshots in a single device. The proof of concept combines three or more detectors into a single multichannel imaging system that can time correlate read-out, capture, and image process all of the channels concurrently with high performance and energy efficiency. The dual-channel continuous read-out is combined with a third high definition digital snapshot capability and has been designed using an FPGA (Field Programmable Gate Array) to capture, decimate, down-convert, re-encode, and transform images from two standard definition CCD (Charge Coupled Device) cameras at 30Hz. The continuous stereo vision can be time correlated to megapixel high definition snapshots. This proof of concept has been fabricated as a fourlayer PCB (Printed Circuit Board) suitable for use in education and research for low cost high efficiency field monitoring applications that need multispectral and three dimensional imaging capabilities. Initial testing is in progress and includes field testing in ports, potential test flights in un-manned aerial systems, and future planned missions to image harsh environments in the arctic including volcanic plumes, ice formation, and arctic marine life.

  19. 7th International Workshop on Parallel Tools for High Performance Computing

    CERN Document Server

    Gracia, José; Nagel, Wolfgang; Resch, Michael

    2014-01-01

    Current advances in High Performance Computing (HPC) increasingly impact efficient software development workflows. Programmers for HPC applications need to consider trends such as increased core counts, multiple levels of parallelism, reduced memory per core, and I/O system challenges in order to derive well performing and highly scalable codes. At the same time, the increasing complexity adds further sources of program defects. While novel programming paradigms and advanced system libraries provide solutions for some of these challenges, appropriate supporting tools are indispensable. Such tools aid application developers in debugging, performance analysis, or code optimization and therefore make a major contribution to the development of robust and efficient parallel software. This book introduces a selection of the tools presented and discussed at the 7th International Parallel Tools Workshop, held in Dresden, Germany, September 3-4, 2013.  

  20. Exploring Infiniband Hardware Virtualization in OpenNebula towards Efficient High-Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Pais Pitta de Lacerda Ruivo, Tiago [IIT, Chicago; Bernabeu Altayo, Gerard [Fermilab; Garzoglio, Gabriele [Fermilab; Timm, Steven [Fermilab; Kim, Hyun-Woo [Fermilab; Noh, Seo-Young [KISTI, Daejeon; Raicu, Ioan [IIT, Chicago

    2014-11-11

    has been widely accepted that software virtualization has a big negative impact on high-performance computing (HPC) application performance. This work explores the potential use of Infiniband hardware virtualization in an OpenNebula cloud towards the efficient support of MPI-based workloads. We have implemented, deployed, and tested an Infiniband network on the FermiCloud private Infrastructure-as-a-Service (IaaS) cloud. To avoid software virtualization towards minimizing the virtualization overhead, we employed a technique called Single Root Input/Output Virtualization (SRIOV). Our solution spanned modifications to the Linux’s Hypervisor as well as the OpenNebula manager. We evaluated the performance of the hardware virtualization on up to 56 virtual machines connected by up to 8 DDR Infiniband network links, with micro-benchmarks (latency and bandwidth) as well as w a MPI-intensive application (the HPL Linpack benchmark).

  1. A C++11 implementation of arbitrary-rank tensors for high-performance computing

    Science.gov (United States)

    Aragón, Alejandro M.

    2014-11-01

    This article discusses an efficient implementation of tensors of arbitrary rank by using some of the idioms introduced by the recently published C++ ISO Standard (C++11). With the aims at providing a basic building block for high-performance computing, a single Array class template is carefully crafted, from which vectors, matrices, and even higher-order tensors can be created. An expression template facility is also built around the array class template to provide convenient mathematical syntax. As a result, by using templates, an extra high-level layer is added to the C++ language when dealing with algebraic objects and their operations, without compromising performance. The implementation is tested running on both CPU and GPU.

  2. Next generation Purex modeling by way of parallel processing with high performance computers

    Energy Technology Data Exchange (ETDEWEB)

    DeMuth, S.F.

    1993-08-01

    The Plutonium and Uranium Extraction (Purex) process is the predominant method used worldwide for solvent extraction in reprocessing spent nuclear fuels. Proper flowsheet design has a significant impact on the character of the process waste. Past Purex flowsheet modeling has been based on equilibrium conditions. It can be shown for the Purex process that optimum separation does not necessarily occur at equilibrium conditions. The next generation Purex flowsheet models should incorporate the fundamental diffusion and chemical kinetic processes required to study time-dependent behavior. Use of parallel processing with high-performance computers will permit transient multistage and multispecies design calculations based on mass transfer with simultaneous chemical reaction models. This paper presents an applicable mass transfer with chemical reaction model for the Purex system and presents a parallel processing solution methodology.

  3. Matrix multiplication operations with data pre-conditioning in a high performance computing architecture

    Science.gov (United States)

    Eichenberger, Alexandre E; Gschwind, Michael K; Gunnels, John A

    2013-11-05

    Mechanisms for performing matrix multiplication operations with data pre-conditioning in a high performance computing architecture are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation to a first target vector register. A load and splat operation is performed to load an element of a second vector operand and replicating the element to each of a plurality of elements of a second target vector register. A multiply add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product of the matrix multiplication operation is accumulated with other partial products of the matrix multiplication operation.

  4. A scalable silicon photonic chip-scale optical switch for high performance computing systems.

    Science.gov (United States)

    Yu, Runxiang; Cheung, Stanley; Li, Yuliang; Okamoto, Katsunari; Proietti, Roberto; Yin, Yawei; Yoo, S J B

    2013-12-30

    This paper discusses the architecture and provides performance studies of a silicon photonic chip-scale optical switch for scalable interconnect network in high performance computing systems. The proposed switch exploits optical wavelength parallelism and wavelength routing characteristics of an Arrayed Waveguide Grating Router (AWGR) to allow contention resolution in the wavelength domain. Simulation results from a cycle-accurate network simulator indicate that, even with only two transmitter/receiver pairs per node, the switch exhibits lower end-to-end latency and higher throughput at high (>90%) input loads compared with electronic switches. On the device integration level, we propose to integrate all the components (ring modulators, photodetectors and AWGR) on a CMOS-compatible silicon photonic platform to ensure a compact, energy efficient and cost-effective device. We successfully demonstrate proof-of-concept routing functions on an 8 × 8 prototype fabricated using foundry services provided by OpSIS-IME.

  5. Indemics: An Interactive High-Performance Computing Framework for Data Intensive Epidemic Modeling.

    Science.gov (United States)

    Bisset, Keith R; Chen, Jiangzhuo; Deodhar, Suruchi; Feng, Xizhou; Ma, Yifei; Marathe, Madhav V

    2014-01-01

    We describe the design and prototype implementation of Indemics (Interactive EpidemicSimulation)-a modeling environment utilizing high-performance computing technologies for supporting complex epidemic simulations. Indemics can support policy analysts and epidemiologists interested in planning and control of pandemics. Indemics goes beyond traditional epidemic simulations by providing a simple and powerful way to represent and analyze policy-based as well as individual-based adaptive interventions. Users can also stop the simulation at any point, assess the state of the simulated system, and add additional interventions. Indemics is available to end-users via a web-based interface. Detailed performance analysis shows that Indemics greatly enhances the capability and productivity of simulating complex intervention strategies with a marginal decrease in performance. We also demonstrate how Indemics was applied in some real case studies where complex interventions were implemented.

  6. Secure Enclaves: An Isolation-centric Approach for Creating Secure High Performance Computing Environments

    Energy Technology Data Exchange (ETDEWEB)

    Aderholdt, Ferrol [Tennessee Technological Univ., Cookeville, TN (United States); Caldwell, Blake A. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Hicks, Susan Elaine [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Koch, Scott M. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Naughton, III, Thomas J. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Pelfrey, Daniel S. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Pogge, James R [Tennessee Technological Univ., Cookeville, TN (United States); Scott, Stephen L [Tennessee Technological Univ., Cookeville, TN (United States); Shipman, Galen M. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Sorrillo, Lawrence [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2017-01-01

    High performance computing environments are often used for a wide variety of workloads ranging from simulation, data transformation and analysis, and complex workflows to name just a few. These systems may process data at various security levels but in so doing are often enclaved at the highest security posture. This approach places significant restrictions on the users of the system even when processing data at a lower security level and exposes data at higher levels of confidentiality to a much broader population than otherwise necessary. The traditional approach of isolation, while effective in establishing security enclaves poses significant challenges for the use of shared infrastructure in HPC environments. This report details current state-of-the-art in virtualization, reconfigurable network enclaving via Software Defined Networking (SDN), and storage architectures and bridging techniques for creating secure enclaves in HPC environments.

  7. Additive Manufacturing and High-Performance Computing: a Disruptive Latent Technology

    Science.gov (United States)

    Goodwin, Bruce

    2015-03-01

    This presentation will discuss the relationship between recent advances in Additive Manufacturing (AM) technology, High-Performance Computing (HPC) simulation and design capabilities, and related advances in Uncertainty Quantification (UQ), and then examines their impacts upon national and international security. The presentation surveys how AM accelerates the fabrication process, while HPC combined with UQ provides a fast track for the engineering design cycle. The combination of AM and HPC/UQ almost eliminates the engineering design and prototype iterative cycle, thereby dramatically reducing cost of production and time-to-market. These methods thereby present significant benefits for US national interests, both civilian and military, in an age of austerity. Finally, considering cyber security issues and the advent of the ``cloud,'' these disruptive, currently latent technologies may well enable proliferation and so challenge both nuclear and non-nuclear aspects of international security.

  8. Cloud CPFP: a shotgun proteomics data analysis pipeline using cloud and high performance computing.

    Science.gov (United States)

    Trudgian, David C; Mirzaei, Hamid

    2012-12-07

    We have extended the functionality of the Central Proteomics Facilities Pipeline (CPFP) to allow use of remote cloud and high performance computing (HPC) resources for shotgun proteomics data processing. CPFP has been modified to include modular local and remote scheduling for data processing jobs. The pipeline can now be run on a single PC or server, a local cluster, a remote HPC cluster, and/or the Amazon Web Services (AWS) cloud. We provide public images that allow easy deployment of CPFP in its entirety in the AWS cloud. This significantly reduces the effort necessary to use the software, and allows proteomics laboratories to pay for compute time ad hoc, rather than obtaining and maintaining expensive local server clusters. Alternatively the Amazon cloud can be used to increase the throughput of a local installation of CPFP as necessary. We demonstrate that cloud CPFP allows users to process data at higher speed than local installations but with similar cost and lower staff requirements. In addition to the computational improvements, the web interface to CPFP is simplified, and other functionalities are enhanced. The software is under active development at two leading institutions and continues to be released under an open-source license at http://cpfp.sourceforge.net.

  9. Parallelization and High-Performance Computing Enables Automated Statistical Inference of Multi-scale Models.

    Science.gov (United States)

    Jagiella, Nick; Rickert, Dennis; Theis, Fabian J; Hasenauer, Jan

    2017-02-22

    Mechanistic understanding of multi-scale biological processes, such as cell proliferation in a changing biological tissue, is readily facilitated by computational models. While tools exist to construct and simulate multi-scale models, the statistical inference of the unknown model parameters remains an open problem. Here, we present and benchmark a parallel approximate Bayesian computation sequential Monte Carlo (pABC SMC) algorithm, tailored for high-performance computing clusters. pABC SMC is fully automated and returns reliable parameter estimates and confidence intervals. By running the pABC SMC algorithm for ∼10(6) hr, we parameterize multi-scale models that accurately describe quantitative growth curves and histological data obtained in vivo from individual tumor spheroid growth in media droplets. The models capture the hybrid deterministic-stochastic behaviors of 10(5)-10(6) of cells growing in a 3D dynamically changing nutrient environment. The pABC SMC algorithm reliably converges to a consistent set of parameters. Our study demonstrates a proof of principle for robust, data-driven modeling of multi-scale biological systems and the feasibility of multi-scale model parameterization through statistical inference. Copyright © 2017 The Author(s). Published by Elsevier Inc. All rights reserved.

  10. High performance parallel computing of flows in complex geometries: I. Methods

    Energy Technology Data Exchange (ETDEWEB)

    Gourdain, N; Gicquel, L; Montagnac, M; Vermorel, O; Staffelbach, G; Garcia, M; Boussuge, J-F [Computational Fluid Dynamics Team, CERFACS, Toulouse, 31057 (France); Gazaix, M [Computational Fluid Dynamics and Aero-acoustics Department, ONERA, Chatillon, 92320 (France); Poinsot, T [Institut de Mecanique des Fluides de Toulouse, Toulouse, 31400 (France)], E-mail: Nicolas.gourdain@cerfacs.fr

    2009-01-01

    Efficient numerical tools coupled with high-performance computers, have become a key element of the design process in the fields of energy supply and transportation. However flow phenomena that occur in complex systems such as gas turbines and aircrafts are still not understood mainly because of the models that are needed. In fact, most computational fluid dynamics (CFD) predictions as found today in industry focus on a reduced or simplified version of the real system (such as a periodic sector) and are usually solved with a steady-state assumption. This paper shows how to overcome such barriers and how such a new challenge can be addressed by developing flow solvers running on high-end computing platforms, using thousands of computing cores. Parallel strategies used by modern flow solvers are discussed with particular emphases on mesh-partitioning, load balancing and communication. Two examples are used to illustrate these concepts: a multi-block structured code and an unstructured code. Parallel computing strategies used with both flow solvers are detailed and compared. This comparison indicates that mesh-partitioning and load balancing are more straightforward with unstructured grids than with multi-block structured meshes. However, the mesh-partitioning stage can be challenging for unstructured grids, mainly due to memory limitations of the newly developed massively parallel architectures. Finally, detailed investigations show that the impact of mesh-partitioning on the numerical CFD solutions, due to rounding errors and block splitting, may be of importance and should be accurately addressed before qualifying massively parallel CFD tools for a routine industrial use.

  11. DistributedFBA.jl: high-level, high-performance flux balance analysis in Julia.

    Science.gov (United States)

    Heirendt, Laurent; Thiele, Ines; Fleming, Ronan M T

    2017-05-01

    Flux balance analysis and its variants are widely used methods for predicting steady-state reaction rates in biochemical reaction networks. The exploration of high dimensional networks with such methods is currently hampered by software performance limitations. DistributedFBA.jl is a high-level, high-performance, open-source implementation of flux balance analysis in Julia. It is tailored to solve multiple flux balance analyses on a subset or all the reactions of large and huge-scale networks, on any number of threads or nodes. The code is freely available on github.com/opencobra/COBRA.jl. The documentation can be found at opencobra.github.io/COBRA.jl. ronan.mt.fleming@gmail.com. Supplementary data are available at Bioinformatics online.

  12. Cactus and Visapult: A case study of ultra-high performance distributed visualization using connectionless protocols

    Energy Technology Data Exchange (ETDEWEB)

    Shalf, John; Bethel, E. Wes

    2002-05-07

    This past decade has seen rapid growth in the size, resolution, and complexity of Grand Challenge simulation codes. Many such problems still require interactive visualization tools to make sense of multi-terabyte data stores. Visapult is a parallel volume rendering tool that employs distributed components, latency tolerant algorithms, and high performance network I/O for effective remote visualization of massive datasets. In this paper we discuss using connectionless protocols to accelerate Visapult network I/O and interfacing Visapult to the Cactus General Relativity code to enable scalable remote monitoring and steering capabilities. With these modifications, network utilization has moved from 25 percent of line-rate using tuned multi-streamed TCP to sustaining 88 percent of line rate using the new UDP-based transport protocol.

  13. DistributedFBA.jl: High-level, high-performance flux balance analysis in Julia.

    Science.gov (United States)

    Heirendt, Laurent; Thiele, Ines; Fleming, Ronan M T

    2017-01-16

    Flux balance analysis, and its variants, are widely used methods for predicting steady-state reaction rates in biochemical reaction networks. The exploration of high dimensional networks with such methods is currently hampered by software performance limitations. DistributedFBA.jl is a high-level, high-performance, open-source implementation of flux balance analysis in Julia. It is tailored to solve multiple flux balance analyses on a subset or all the reactions of large and huge-scale networks, on any number of threads or nodes. The code is freely available on github.com/opencobra/COBRA.jl. The documentation can be found at opencobra.github.io/COBRA.jl. ronan.mt.fleming@gmail.com. © The Author(s) 2017. Published by Oxford University Press.

  14. Automated extraction of natural drainage density patterns for the conterminous United States through high performance computing

    Science.gov (United States)

    Stanislawski, Larry V.; Falgout, Jeff T.; Buttenfield, Barbara P.

    2015-01-01

    Hydrographic networks form an important data foundation for cartographic base mapping and for hydrologic analysis. Drainage density patterns for these networks can be derived to characterize local landscape, bedrock and climate conditions, and further inform hydrologic and geomorphological analysis by indicating areas where too few headwater channels have been extracted. But natural drainage density patterns are not consistently available in existing hydrographic data for the United States because compilation and capture criteria historically varied, along with climate, during the period of data collection over the various terrain types throughout the country. This paper demonstrates an automated workflow that is being tested in a high-performance computing environment by the U.S. Geological Survey (USGS) to map natural drainage density patterns at the 1:24,000-scale (24K) for the conterminous United States. Hydrographic network drainage patterns may be extracted from elevation data to guide corrections for existing hydrographic network data. The paper describes three stages in this workflow including data pre-processing, natural channel extraction, and generation of drainage density patterns from extracted channels. The workflow is concurrently implemented by executing procedures on multiple subbasin watersheds within the U.S. National Hydrography Dataset (NHD). Pre-processing defines parameters that are needed for the extraction process. Extraction proceeds in standard fashion: filling sinks, developing flow direction and weighted flow accumulation rasters. Drainage channels with assigned Strahler stream order are extracted within a subbasin and simplified. Drainage density patterns are then estimated with 100-meter resolution and subsequently smoothed with a low-pass filter. The extraction process is found to be of better quality in higher slope terrains. Concurrent processing through the high performance computing environment is shown to facilitate and refine

  15. Bulk data transfer distributer: a high performance multicast model in ALMA ACS

    Science.gov (United States)

    Cirami, R.; Di Marcantonio, P.; Chiozzi, G.; Jeram, B.

    2006-06-01

    A high performance multicast model for the bulk data transfer mechanism in the ALMA (Atacama Large Millimeter Array) Common Software (ACS) is presented. The ALMA astronomical interferometer will consist of at least 50 12-m antennas operating at millimeter wavelength. The whole software infrastructure for ALMA is based on ACS, which is a set of application frameworks built on top of CORBA. To cope with the very strong requirements for the amount of data that needs to be transported by the software communication channels of the ALMA subsystems (a typical output data rate expected from the Correlator is of the order of 64 MB per second) and with the potential CORBA bottleneck due to parameter marshalling/de-marshalling, usage of IIOP protocol, etc., a transfer mechanism based on the ACE/TAO CORBA Audio/Video (A/V) Streaming Service has been developed. The ACS Bulk Data Transfer architecture bypasses the CORBA protocol with an out-of-bound connection for the data streams (transmitting data directly in TCP or UDP format), using at the same time CORBA for handshaking and leveraging the benefits of ACS middleware. Such a mechanism has proven to be capable of high performances, of the order of 800 Mbits per second on a 1Gbit Ethernet network. Besides a point-to-point communication model, the ACS Bulk Data Transfer provides a multicast model. Since the TCP protocol does not support multicasting and all the data must be correctly delivered to all ALMA subsystems, a distributer mechanism has been developed. This paper focuses on the ACS Bulk Data Distributer, which mimics a multicast behaviour managing data dispatching to all receivers willing to get data from the same sender.

  16. Performance management of high performance computing for medical image processing in Amazon Web Services

    Science.gov (United States)

    Bao, Shunxing; Damon, Stephen M.; Landman, Bennett A.; Gokhale, Aniruddha

    2016-03-01

    Adopting high performance cloud computing for medical image processing is a popular trend given the pressing needs of large studies. Amazon Web Services (AWS) provide reliable, on-demand, and inexpensive cloud computing services. Our research objective is to implement an affordable, scalable and easy-to-use AWS framework for the Java Image Science Toolkit (JIST). JIST is a plugin for Medical- Image Processing, Analysis, and Visualization (MIPAV) that provides a graphical pipeline implementation allowing users to quickly test and develop pipelines. JIST is DRMAA-compliant allowing it to run on portable batch system grids. However, as new processing methods are implemented and developed, memory may often be a bottleneck for not only lab computers, but also possibly some local grids. Integrating JIST with the AWS cloud alleviates these possible restrictions and does not require users to have deep knowledge of programming in Java. Workflow definition/management and cloud configurations are two key challenges in this research. Using a simple unified control panel, users have the ability to set the numbers of nodes and select from a variety of pre-configured AWS EC2 nodes with different numbers of processors and memory storage. Intuitively, we configured Amazon S3 storage to be mounted by pay-for- use Amazon EC2 instances. Hence, S3 storage is recognized as a shared cloud resource. The Amazon EC2 instances provide pre-installs of all necessary packages to run JIST. This work presents an implementation that facilitates the integration of JIST with AWS. We describe the theoretical cost/benefit formulae to decide between local serial execution versus cloud computing and apply this analysis to an empirical diffusion tensor imaging pipeline.

  17. Performance Management of High Performance Computing for Medical Image Processing in Amazon Web Services.

    Science.gov (United States)

    Bao, Shunxing; Damon, Stephen M; Landman, Bennett A; Gokhale, Aniruddha

    2016-02-27

    Adopting high performance cloud computing for medical image processing is a popular trend given the pressing needs of large studies. Amazon Web Services (AWS) provide reliable, on-demand, and inexpensive cloud computing services. Our research objective is to implement an affordable, scalable and easy-to-use AWS framework for the Java Image Science Toolkit (JIST). JIST is a plugin for Medical-Image Processing, Analysis, and Visualization (MIPAV) that provides a graphical pipeline implementation allowing users to quickly test and develop pipelines. JIST is DRMAA-compliant allowing it to run on portable batch system grids. However, as new processing methods are implemented and developed, memory may often be a bottleneck for not only lab computers, but also possibly some local grids. Integrating JIST with the AWS cloud alleviates these possible restrictions and does not require users to have deep knowledge of programming in Java. Workflow definition/management and cloud configurations are two key challenges in this research. Using a simple unified control panel, users have the ability to set the numbers of nodes and select from a variety of pre-configured AWS EC2 nodes with different numbers of processors and memory storage. Intuitively, we configured Amazon S3 storage to be mounted by pay-for-use Amazon EC2 instances. Hence, S3 storage is recognized as a shared cloud resource. The Amazon EC2 instances provide pre-installs of all necessary packages to run JIST. This work presents an implementation that facilitates the integration of JIST with AWS. We describe the theoretical cost/benefit formulae to decide between local serial execution versus cloud computing and apply this analysis to an empirical diffusion tensor imaging pipeline.

  18. Using High Performance Computing to Examine the Processes of Neurogenesis Underlying Pattern Separation/Completion of Episodic Information.

    Energy Technology Data Exchange (ETDEWEB)

    Aimone, James Bradley [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Betty, Rita [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2015-03-01

    Using High Performance Computing to Examine the Processes of Neurogenesis Underlying Pattern Separation/Completion of Episodic Information - Sandia researchers developed novel methods and metrics for studying the computational function of neurogenesis, thus generating substantial impact to the neuroscience and neural computing communities. This work could benefit applications in machine learning and other analysis activities.

  19. Acceleration of Image Segmentation Algorithm for (Breast) Mammogram Images Using High-Performance Reconfigurable Dataflow Computers.

    Science.gov (United States)

    Milankovic, Ivan L; Mijailovic, Nikola V; Filipovic, Nenad D; Peulic, Aleksandar S

    2017-01-01

    Image segmentation is one of the most common procedures in medical imaging applications. It is also a very important task in breast cancer detection. Breast cancer detection procedure based on mammography can be divided into several stages. The first stage is the extraction of the region of interest from a breast image, followed by the identification of suspicious mass regions, their classification, and comparison with the existing image database. It is often the case that already existing image databases have large sets of data whose processing requires a lot of time, and thus the acceleration of each of the processing stages in breast cancer detection is a very important issue. In this paper, the implementation of the already existing algorithm for region-of-interest based image segmentation for mammogram images on High-Performance Reconfigurable Dataflow Computers (HPRDCs) is proposed. As a dataflow engine (DFE) of such HPRDC, Maxeler's acceleration card is used. The experiments for examining the acceleration of that algorithm on the Reconfigurable Dataflow Computers (RDCs) are performed with two types of mammogram images with different resolutions. There were, also, several DFE configurations and each of them gave a different acceleration value of algorithm execution. Those acceleration values are presented and experimental results showed good acceleration.

  20. Improving the Eco-Efficiency of High Performance Computing Clusters Using EECluster

    Directory of Open Access Journals (Sweden)

    Alberto Cocaña-Fernández

    2016-03-01

    Full Text Available As data and supercomputing centres increase their performance to improve service quality and target more ambitious challenges every day, their carbon footprint also continues to grow, and has already reached the magnitude of the aviation industry. Also, high power consumptions are building up to a remarkable bottleneck for the expansion of these infrastructures in economic terms due to the unavailability of sufficient energy sources. A substantial part of the problem is caused by current energy consumptions of High Performance Computing (HPC clusters. To alleviate this situation, we present in this work EECluster, a tool that integrates with multiple open-source Resource Management Systems to significantly reduce the carbon footprint of clusters by improving their energy efficiency. EECluster implements a dynamic power management mechanism based on Computational Intelligence techniques by learning a set of rules through multi-criteria evolutionary algorithms. This approach enables cluster operators to find the optimal balance between a reduction in the cluster energy consumptions, service quality, and number of reconfigurations. Experimental studies using both synthetic and actual workloads from a real world cluster support the adoption of this tool to reduce the carbon footprint of HPC clusters.

  1. Astrocomp: web technologies for high performance computing on a network of supercomputers

    Science.gov (United States)

    Costa, A.; Becciani, U.; Miocchi, P.; Antonuccio, V.; Capuzzo Dolcetta, R.; Di Matteo, P.; Rosato, V.

    2005-02-01

    Astrocomp is a project developed by the INAF-Astrophysical Observatory of Catania, University of Roma La Sapienza and Enea in collaboration with Oneiros s.r.l. The project has the goal of building a web-based user-friendly interface which allows the international community to run some parallel codes on a set of high-performance computing (HPC) resources, with no need for specific knowledge about Unix and Operating Systems commands. Astrocomp provides CPU times, on parallel systems, available to the authorized user. The portal makes codes for astronomy available: FLY code, a cosmological code for studying three-dimensional collisionless self-gravitating systems with periodic boundary conditions [Becciani, Antonuccio, Comput. Phys. Comm. 136 (2001) 54]. ATD treecode, a parallel tree-code for the simulation of the dynamics of self-gravitating systems [Miocchi, Capuzzo Dolcetta, A&A 382 (2002) 758]. MARA a code for stellar light curves analysis [Rodonò et al., A&A 371 (2001) 174]. Other codes will be added to the portal in the future.

  2. Review: High-performance computing to detect epistasis in genome scale data sets.

    Science.gov (United States)

    Upton, Alex; Trelles, Oswaldo; Cornejo-García, José Antonio; Perkins, James Richard

    2016-05-01

    It is becoming clear that most human diseases have a complex etiology that cannot be explained by single nucleotide polymorphisms (SNPs) or simple additive combinations; the general consensus is that they are caused by combinations of multiple genetic variations. The limited success of some genome-wide association studies is partly a result of this focus on single genetic markers. A more promising approach is to take into account epistasis, by considering the association of multiple SNP interactions with disease. However, as genomic data continues to grow in resolution, and genome and exome sequencing become more established, the number of combinations of variants to consider increases rapidly. Two potential solutions should be considered: the use of high-performance computing, which allows us to consider a larger number of variables, and heuristics to make the solution more tractable, essential in the case of genome sequencing. In this review, we look at different computational methods to analyse epistatic interactions within disease-related genetic data sets created by microarray technology. We also review efforts to use epistatic analysis results to produce biomarkers for diagnostic tests and give our views on future directions in this field in light of advances in sequencing technology and variants in non-coding regions. © The Author 2015. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  3. Development of high performance scientific components for interoperability of computing packages

    Energy Technology Data Exchange (ETDEWEB)

    Gulabani, Teena Pratap [Iowa State Univ., Ames, IA (United States)

    2008-01-01

    Three major high performance quantum chemistry computational packages, NWChem, GAMESS and MPQC have been developed by different research efforts following different design patterns. The goal is to achieve interoperability among these packages by overcoming the challenges caused by the different communication patterns and software design of each of these packages. A chemistry algorithm is hard to develop as well as being a time consuming process; integration of large quantum chemistry packages will allow resource sharing and thus avoid reinvention of the wheel. Creating connections between these incompatible packages is the major motivation of the proposed work. This interoperability is achieved by bringing the benefits of Component Based Software Engineering through a plug-and-play component framework called Common Component Architecture (CCA). In this thesis, I present a strategy and process used for interfacing two widely used and important computational chemistry methodologies: Quantum Mechanics and Molecular Mechanics. To show the feasibility of the proposed approach the Tuning and Analysis Utility (TAU) has been coupled with NWChem code and its CCA components. Results show that the overhead is negligible when compared to the ease and potential of organizing and coping with large-scale software applications.

  4. Open source acceleration of wave optics simulations on energy efficient high-performance computing platforms

    Science.gov (United States)

    Beck, Jeffrey; Bos, Jeremy P.

    2017-05-01

    We compare several modifications to the open-source wave optics package, WavePy, intended to improve execution time. Specifically, we compare the relative performance of the Intel MKL, a CPU based OpenCV distribution, and GPU-based version. Performance is compared between distributions both on the same compute platform and between a fully-featured computing workstation and the NVIDIA Jetson TX1 platform. Comparisons are drawn in terms of both execution time and power consumption. We have found that substituting the Fast Fourier Transform operation from OpenCV provides a marked improvement on all platforms. In addition, we show that embedded platforms offer some possibility for extensive improvement in terms of efficiency compared to a fully featured workstation.

  5. A high performance computing framework for physics-based modeling and simulation of military ground vehicles

    Science.gov (United States)

    Negrut, Dan; Lamb, David; Gorsich, David

    2011-06-01

    This paper describes a software infrastructure made up of tools and libraries designed to assist developers in implementing computational dynamics applications running on heterogeneous and distributed computing environments. Together, these tools and libraries compose a so called Heterogeneous Computing Template (HCT). The heterogeneous and distributed computing hardware infrastructure is assumed herein to be made up of a combination of CPUs and Graphics Processing Units (GPUs). The computational dynamics applications targeted to execute on such a hardware topology include many-body dynamics, smoothed-particle hydrodynamics (SPH) fluid simulation, and fluid-solid interaction analysis. The underlying theme of the solution approach embraced by HCT is that of partitioning the domain of interest into a number of subdomains that are each managed by a separate core/accelerator (CPU/GPU) pair. Five components at the core of HCT enable the envisioned distributed computing approach to large-scale dynamical system simulation: (a) the ability to partition the problem according to the one-to-one mapping; i.e., spatial subdivision, discussed above (pre-processing); (b) a protocol for passing data between any two co-processors; (c) algorithms for element proximity computation; and (d) the ability to carry out post-processing in a distributed fashion. In this contribution the components (a) and (b) of the HCT are demonstrated via the example of the Discrete Element Method (DEM) for rigid body dynamics with friction and contact. The collision detection task required in frictional-contact dynamics (task (c) above), is shown to benefit on the GPU of a two order of magnitude gain in efficiency when compared to traditional sequential implementations. Note: Reference herein to any specific commercial products, process, or service by trade name, trademark, manufacturer, or otherwise, does not imply its endorsement, recommendation, or favoring by the United States Army. The views and

  6. High-Throughput Computing on High-Performance Platforms: A Case Study

    Energy Technology Data Exchange (ETDEWEB)

    Oleynik, D [University of Texas at Arlington; Panitkin, S [Brookhaven National Laboratory (BNL); Matteo, Turilli [Rutgers University; Angius, Alessio [Rutgers University; Oral, H Sarp [ORNL; De, K [University of Texas at Arlington; Klimentov, A [Brookhaven National Laboratory (BNL); Wells, Jack C. [ORNL; Jha, S [Rutgers University

    2017-10-01

    The computing systems used by LHC experiments has historically consisted of the federation of hundreds to thousands of distributed resources, ranging from small to mid-size resource. In spite of the impressive scale of the existing distributed computing solutions, the federation of small to mid-size resources will be insufficient to meet projected future demands. This paper is a case study of how the ATLAS experiment has embraced Titan -- a DOE leadership facility in conjunction with traditional distributed high- throughput computing to reach sustained production scales of approximately 52M core-hours a years. The three main contributions of this paper are: (i) a critical evaluation of design and operational considerations to support the sustained, scalable and production usage of Titan; (ii) a preliminary characterization of a next generation executor for PanDA to support new workloads and advanced execution modes; and (iii) early lessons for how current and future experimental and observational systems can be integrated with production supercomputers and other platforms in a general and extensible manner.

  7. al3c: high-performance software for parameter inference using Approximate Bayesian Computation.

    Science.gov (United States)

    Stram, Alexander H; Marjoram, Paul; Chen, Gary K

    2015-11-01

    The development of Approximate Bayesian Computation (ABC) algorithms for parameter inference which are both computationally efficient and scalable in parallel computing environments is an important area of research. Monte Carlo rejection sampling, a fundamental component of ABC algorithms, is trivial to distribute over multiple processors but is inherently inefficient. While development of algorithms such as ABC Sequential Monte Carlo (ABC-SMC) help address the inherent inefficiencies of rejection sampling, such approaches are not as easily scaled on multiple processors. As a result, current Bayesian inference software offerings that use ABC-SMC lack the ability to scale in parallel computing environments. We present al3c, a C++ framework for implementing ABC-SMC in parallel. By requiring only that users define essential functions such as the simulation model and prior distribution function, al3c abstracts the user from both the complexities of parallel programming and the details of the ABC-SMC algorithm. By using the al3c framework, the user is able to scale the ABC-SMC algorithm in parallel computing environments for his or her specific application, with minimal programming overhead. al3c is offered as a static binary for Linux and OS-X computing environments. The user completes an XML configuration file and C++ plug-in template for the specific application, which are used by al3c to obtain the desired results. Users can download the static binaries, source code, reference documentation and examples (including those in this article) by visiting https://github.com/ahstram/al3c. astram@usc.edu Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  8. Tackling some of the most intricate geophysical challenges via high-performance computing

    Science.gov (United States)

    Khosronejad, A.

    2016-12-01

    Recently, world has been witnessing significant enhancements in computing power of supercomputers. Computer clusters in conjunction with the advanced mathematical algorithms has set the stage for developing and applying powerful numerical tools to tackle some of the most intricate geophysical challenges that today`s engineers face. One such challenge is to understand how turbulent flows, in real-world settings, interact with (a) rigid and/or mobile complex bed bathymetry of waterways and sea-beds in the coastal areas; (b) objects with complex geometry that are fully or partially immersed; and (c) free-surface of waterways and water surface waves in the coastal area. This understanding is especially important because the turbulent flows in real-world environments are often bounded by geometrically complex boundaries, which dynamically deform and give rise to multi-scale and multi-physics transport phenomena, and characterized by multi-lateral interactions among various phases (e.g. air/water/sediment phases). Herein, I present some of the multi-scale and multi-physics geophysical fluid mechanics processes that I have attempted to study using an in-house high-performance computational model, the so-called VFS-Geophysics. More specifically, I will present the simulation results of turbulence/sediment/solute/turbine interactions in real-world settings. Parts of the simulations I present are performed to gain scientific insights into the processes such as sand wave formation (A. Khosronejad, and F. Sotiropoulos, (2014), Numerical simulation of sand waves in a turbulent open channel flow, Journal of Fluid Mechanics, 753:150-216), while others are carried out to predict the effects of climate change and large flood events on societal infrastructures ( A. Khosronejad, et al., (2016), Large eddy simulation of turbulence and solute transport in a forested headwater stream, Journal of Geophysical Research:, doi: 10.1002/2014JF003423).

  9. Strengthening LLNL Missions through Laboratory Directed Research and Development in High Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Willis, D. K. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2016-12-01

    High performance computing (HPC) has been a defining strength of Lawrence Livermore National Laboratory (LLNL) since its founding. Livermore scientists have designed and used some of the world’s most powerful computers to drive breakthroughs in nearly every mission area. Today, the Laboratory is recognized as a world leader in the application of HPC to complex science, technology, and engineering challenges. Most importantly, HPC has been integral to the National Nuclear Security Administration’s (NNSA’s) Stockpile Stewardship Program—designed to ensure the safety, security, and reliability of our nuclear deterrent without nuclear testing. A critical factor behind Lawrence Livermore’s preeminence in HPC is the ongoing investments made by the Laboratory Directed Research and Development (LDRD) Program in cutting-edge concepts to enable efficient utilization of these powerful machines. Congress established the LDRD Program in 1991 to maintain the technical vitality of the Department of Energy (DOE) national laboratories. Since then, LDRD has been, and continues to be, an essential tool for exploring anticipated needs that lie beyond the planning horizon of our programs and for attracting the next generation of talented visionaries. Through LDRD, Livermore researchers can examine future challenges, propose and explore innovative solutions, and deliver creative approaches to support our missions. The present scientific and technical strengths of the Laboratory are, in large part, a product of past LDRD investments in HPC. Here, we provide seven examples of LDRD projects from the past decade that have played a critical role in building LLNL’s HPC, computer science, mathematics, and data science research capabilities, and describe how they have impacted LLNL’s mission.

  10. STATE-SPACE SOLUTIONS TO THE DYNAMIC MAGNETOENCEPHALOGRAPHY INVERSE PROBLEM USING HIGH PERFORMANCE COMPUTING.

    Science.gov (United States)

    Long, Christopher J; Purdon, Patrick L; Temereanca, Simona; Desai, Neil U; Hämäläinen, Matti S; Brown, Emery N

    2011-06-01

    Determining the magnitude and location of neural sources within the brain that are responsible for generating magnetoencephalography (MEG) signals measured on the surface of the head is a challenging problem in functional neuroimaging. The number of potential sources within the brain exceeds by an order of magnitude the number of recording sites. As a consequence, the estimates for the magnitude and location of the neural sources will be ill-conditioned because of the underdetermined nature of the problem. One well-known technique designed to address this imbalance is the minimum norm estimator (MNE). This approach imposes an L(2) regularization constraint that serves to stabilize and condition the source parameter estimates. However, these classes of regularizer are static in time and do not consider the temporal constraints inherent to the biophysics of the MEG experiment. In this paper we propose a dynamic state-space model that accounts for both spatial and temporal correlations within and across candidate intra-cortical sources. In our model, the observation model is derived from the steady-state solution to Maxwell's equations while the latent model representing neural dynamics is given by a random walk process. We show that the Kalman filter (KF) and the Kalman smoother [also known as the fixed-interval smoother (FIS)] may be used to solve the ensuing high-dimensional state-estimation problem. Using a well-known relationship between Bayesian estimation and Kalman filtering, we show that the MNE estimates carry a significant zero bias. Calculating these high-dimensional state estimates is a computationally challenging task that requires High Performance Computing (HPC) resources. To this end, we employ the NSF Teragrid Supercomputing Network to compute the source estimates. We demonstrate improvement in performance of the state-space algorithm relative to MNE in analyses of simulated and actual somatosensory MEG experiments. Our findings establish the

  11. High-performance computing for flight vehicles; Proceedings of the Symposium, Washington, Dec. 7-9, 1992

    Science.gov (United States)

    Noor, Ahmed K. (Editor); Venneri, Samuel L. (Editor)

    1992-01-01

    The present conference discusses high-performance computing systems for flight vehicles, large-scale simulations on high-performance flight computers and software, multidisciplinary and design/optimization applications of computers, computational electromagnetics and acoustics, the simulation of aircraft powerplant turbomachinery and reacting flows, and flow calculations on parallel machines. Also discussed are direct flow simulation Monte Carlo methods, structural mechanics sensitivity and fracture calculations on parallel machines, grid-generation and advanced algorithms for CFD, advanced solid-mechanics and structures applications, and advancements in flow visualization technology and neural networks.

  12. A Lightweight, High-performance I/O Management Package for Data-intensive Computing

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Jun

    2011-06-22

    Our group has been working with ANL collaborators on the topic bridging the gap between parallel file system and local file system during the course of this project period. We visited Argonne National Lab -- Dr. Robert Ross's group for one week in the past summer 2007. We looked over our current project progress and planned the activities for the incoming years 2008-09. The PI met Dr. Robert Ross several times such as HEC FSIO workshop 08, SC08 and SC10. We explored the opportunities to develop a production system by leveraging our current prototype to (SOGP+PVFS) a new PVFS version. We delivered SOGP+PVFS codes to ANL PVFS2 group in 2008.We also talked about exploring a potential project on developing new parallel programming models and runtime systems for data-intensive scalable computing (DISC). The methodology is to evolve MPI towards DISC by incorporating some functions of Google MapReduce parallel programming model. More recently, we are together exploring how to leverage existing works to perform (1) coordination/aggregation of local I/O operations prior to movement over the WAN, (2) efficient bulk data movement over the WAN, (3) latency hiding techniques for latency-intensive operations. Since 2009, we start applying Hadoop/MapReduce to some HEC applications with LANL scientists John Bent and Salman Habib. Another on-going work is to improve checkpoint performance at I/O forwarding Layer for the Road Runner super computer with James Nuetz and Gary Gridder at LANL. Two senior undergraduates from our research group did summer internships about high-performance file and storage system projects in LANL since 2008 for consecutive three years. Both of them are now pursuing Ph.D. degree in our group and will be 4th year in the PhD program in Fall 2011 and go to LANL to advance two above-mentioned works during this winter break. Since 2009, we have been collaborating with several computer scientists (Gary Grider, John bent, Parks Fields, James Nunez, Hsing

  13. Subsurface multiphase flow and multicomponent reactive transport modeling using high-performance computing

    Science.gov (United States)

    Hammond, Glenn; Lichtner, Peter; Lu, Chuan

    2007-07-01

    Numerical modeling is a critical tool to the U.S. Department of Energy for evaluating the environmental impact of remediation strategies for subsurface legacy waste sites. Unfortunately, the physical and chemical complexity of many sites overwhelms the capabilities of even most state of the art groundwater models. Of particular concern is the representation of highly-heterogeneous stratified rock/soil layers in the subsurface and the biological and geochemical interactions of chemical species within multiple fluid phases. There is clearly a need for higher-resolution modeling (i.e. increased spatial and temporal resolution) and increasingly mechanistic descriptions of subsurface physicochemical processes (i.e. increased chemical degrees of freedom). We present SciDAC-funded research being performed in furthering the development of PFLOTRAN, a parallel multiphase flow and multicomponent reactive transport model. Written in Fortran90, PFLOTRAN is founded upon PETSc data structures and solvers. We are employing PFLOTRAN to simulate uranium transport at the Hanford 300 Area, a contaminated site of major concern to the Department of Energy, the State of Washington, and other government agencies. By leveraging the billions of degrees of freedom available through high-performance computation using tens of thousands of processors, we can better characterize the release of uranium into groundwater and its subsequent transport to the Columbia River, and thereby better understand and evaluate the effectiveness of various proposed remediation strategies.

  14. Analysis of Application Power and Schedule Composition in a High Performance Computing Environment

    Energy Technology Data Exchange (ETDEWEB)

    Elmore, Ryan [National Renewable Energy Lab. (NREL), Golden, CO (United States); Gruchalla, Kenny [National Renewable Energy Lab. (NREL), Golden, CO (United States); Phillips, Caleb [National Renewable Energy Lab. (NREL), Golden, CO (United States); Purkayastha, Avi [National Renewable Energy Lab. (NREL), Golden, CO (United States); Wunder, Nick [National Renewable Energy Lab. (NREL), Golden, CO (United States)

    2016-01-05

    As the capacity of high performance computing (HPC) systems continues to grow, small changes in energy management have the potential to produce significant energy savings. In this paper, we employ an extensive informatics system for aggregating and analyzing real-time performance and power use data to evaluate energy footprints of jobs running in an HPC data center. We look at the effects of algorithmic choices for a given job on the resulting energy footprints, and analyze application-specific power consumption, and summarize average power use in the aggregate. All of these views reveal meaningful power variance between classes of applications as well as chosen methods for a given job. Using these data, we discuss energy-aware cost-saving strategies based on reordering the HPC job schedule. Using historical job and power data, we present a hypothetical job schedule reordering that: (1) reduces the facility's peak power draw and (2) manages power in conjunction with a large-scale photovoltaic array. Lastly, we leverage this data to understand the practical limits on predicting key power use metrics at the time of submission.

  15. Analysis of Metabolomics Datasets with High-Performance Computing and Metabolite Atlases

    Directory of Open Access Journals (Sweden)

    Yushu Yao

    2015-07-01

    Full Text Available Even with the widespread use of liquid chromatography mass spectrometry (LC/MS based metabolomics, there are still a number of challenges facing this promising technique. Many, diverse experimental workflows exist; yet there is a lack of infrastructure and systems for tracking and sharing of information. Here, we describe the Metabolite Atlas framework and interface that provides highly-efficient, web-based access to raw mass spectrometry data in concert with assertions about chemicals detected to help address some of these challenges. This integration, by design, enables experimentalists to explore their raw data, specify and refine features annotations such that they can be leveraged for future experiments. Fast queries of the data through the web using SciDB, a parallelized database for high performance computing, make this process operate quickly. By using scripting containers, such as IPython or Jupyter, to analyze the data, scientists can utilize a wide variety of freely available graphing, statistics, and information management resources. In addition, the interfaces facilitate integration with systems biology tools to ultimately link metabolomics data with biological models.

  16. National cyber defense high performance computing and analysis : concepts, planning and roadmap.

    Energy Technology Data Exchange (ETDEWEB)

    Hamlet, Jason R.; Keliiaa, Curtis M.

    2010-09-01

    There is a national cyber dilemma that threatens the very fabric of government, commercial and private use operations worldwide. Much is written about 'what' the problem is, and though the basis for this paper is an assessment of the problem space, we target the 'how' solution space of the wide-area national information infrastructure through the advancement of science, technology, evaluation and analysis with actionable results intended to produce a more secure national information infrastructure and a comprehensive national cyber defense capability. This cybersecurity High Performance Computing (HPC) analysis concepts, planning and roadmap activity was conducted as an assessment of cybersecurity analysis as a fertile area of research and investment for high value cybersecurity wide-area solutions. This report and a related SAND2010-4765 Assessment of Current Cybersecurity Practices in the Public Domain: Cyber Indications and Warnings Domain report are intended to provoke discussion throughout a broad audience about developing a cohesive HPC centric solution to wide-area cybersecurity problems.

  17. Implementing the Data Center Energy Productivity Metric in a High Performance Computing Data Center

    Energy Technology Data Exchange (ETDEWEB)

    Sego, Landon H.; Marquez, Andres; Rawson, Andrew; Cader, Tahir; Fox, Kevin M.; Gustafson, William I.; Mundy, Christopher J.

    2013-06-30

    As data centers proliferate in size and number, the improvement of their energy efficiency and productivity has become an economic and environmental imperative. Making these improvements requires metrics that are robust, interpretable, and practical. We discuss the properties of a number of the proposed metrics of energy efficiency and productivity. In particular, we focus on the Data Center Energy Productivity (DCeP) metric, which is the ratio of useful work produced by the data center to the energy consumed performing that work. We describe our approach for using DCeP as the principal outcome of a designed experiment using a highly instrumented, high-performance computing data center. We found that DCeP was successful in clearly distinguishing different operational states in the data center, thereby validating its utility as a metric for identifying configurations of hardware and software that would improve energy productivity. We also discuss some of the challenges and benefits associated with implementing the DCeP metric, and we examine the efficacy of the metric in making comparisons within a data center and between data centers.

  18. High-Performance Computer Modeling of the Cosmos-Iridium Collision

    Energy Technology Data Exchange (ETDEWEB)

    Olivier, S; Cook, K; Fasenfest, B; Jefferson, D; Jiang, M; Leek, J; Levatin, J; Nikolaev, S; Pertica, A; Phillion, D; Springer, K; De Vries, W

    2009-08-28

    This paper describes the application of a new, integrated modeling and simulation framework, encompassing the space situational awareness (SSA) enterprise, to the recent Cosmos-Iridium collision. This framework is based on a flexible, scalable architecture to enable efficient simulation of the current SSA enterprise, and to accommodate future advancements in SSA systems. In particular, the code is designed to take advantage of massively parallel, high-performance computer systems available, for example, at Lawrence Livermore National Laboratory. We will describe the application of this framework to the recent collision of the Cosmos and Iridium satellites, including (1) detailed hydrodynamic modeling of the satellite collision and resulting debris generation, (2) orbital propagation of the simulated debris and analysis of the increased risk to other satellites (3) calculation of the radar and optical signatures of the simulated debris and modeling of debris detection with space surveillance radar and optical systems (4) determination of simulated debris orbits from modeled space surveillance observations and analysis of the resulting orbital accuracy, (5) comparison of these modeling and simulation results with Space Surveillance Network observations. We will also discuss the use of this integrated modeling and simulation framework to analyze the risks and consequences of future satellite collisions and to assess strategies for mitigating or avoiding future incidents, including the addition of new sensor systems, used in conjunction with the Space Surveillance Network, for improving space situational awareness.

  19. Reconfigurable and adaptive photonic networks for high-performance computing systems.

    Science.gov (United States)

    Kodi, Avinash; Louri, Ahmed

    2009-08-01

    As feature sizes decrease to the submicrometer regime and clock rates increase to the multigigahertz range, the limited bandwidth at higher bit rates and longer communication distances in electrical interconnects will create a major bandwidth imbalance in future high-performance computing (HPC) systems. We explore the application of an optoelectronic interconnect for the design of flexible, high-bandwidth, reconfigurable and adaptive interconnection architectures for chip-to-chip and board-to-board HPC systems. Reconfigurability is realized by interconnecting arrays of optical transmitters, and adaptivity is implemented by a dynamic bandwidth reallocation (DBR) technique that balances the load on each communication channel. We evaluate a DBR technique, the lockstep (LS) protocol, that monitors traffic intensities, reallocates bandwidth, and adapts to changes in communication patterns. We incorporate this DBR technique into a detailed discrete-event network simulator to evaluate the performance for uniform, nonuniform, and permutation communication patterns. Simulation results indicate that, without reconfiguration techniques being applied, optical based system architecture shows better performance than electrical interconnects for uniform and nonuniform patterns; with reconfiguration techniques being applied, the dynamically reconfigurable optoelectronic interconnect provides much better performance for all communication patterns. Based on the performance study, the reconfigured architecture shows 30%-50% increased throughput and 50%-75% reduced network latency compared with HPC electrical networks.

  20. Analysis of Metabolomics Datasets with High-Performance Computing and Metabolite Atlases.

    Science.gov (United States)

    Yao, Yushu; Sun, Terence; Wang, Tony; Ruebel, Oliver; Northen, Trent; Bowen, Benjamin P

    2015-07-20

    Even with the widespread use of liquid chromatography mass spectrometry (LC/MS) based metabolomics, there are still a number of challenges facing this promising technique. Many, diverse experimental workflows exist; yet there is a lack of infrastructure and systems for tracking and sharing of information. Here, we describe the Metabolite Atlas framework and interface that provides highly-efficient, web-based access to raw mass spectrometry data in concert with assertions about chemicals detected to help address some of these challenges. This integration, by design, enables experimentalists to explore their raw data, specify and refine features annotations such that they can be leveraged for future experiments. Fast queries of the data through the web using SciDB, a parallelized database for high performance computing, make this process operate quickly. By using scripting containers, such as IPython or Jupyter, to analyze the data, scientists can utilize a wide variety of freely available graphing, statistics, and information management resources. In addition, the interfaces facilitate integration with systems biology tools to ultimately link metabolomics data with biological models.

  1. High performance near-ultraviolet flip-chip light-emitting diodes with distributed Bragg reflector

    Science.gov (United States)

    Choi, Il-Gyun; Jin, Geun-Mo; Park, Jun-Cheon; Jeon, Soo-Kun; Park, Eun-Hyun

    2015-09-01

    We have fabricated the near-ultraviolet (NUV) flip-chip (FC) light-emitting diodes (LEDs) with the high external quantum efficiency (EQE) using distributed Bragg reflectors (DBRs) and compared with conventional FC-LED using silver (Ag) reflector. Reflectance of Ag is very high (90 ~ 95 %) at visible spectrum region, but sharply decrease at NUV region. Therefore we used DBR composed of two different materials which have high-index contrast, such as TiO2 and SiO2. However, to achieve high-performance NUV flip-chip LEDs, we used Ta2O5 instead of TiO2 that absorbs lights of NUV region. Thus, we have designed a DBR composed of twenty pairs of Ta2O5 and SiO2 using optical coating design software. The DBR designed by our group achieves a reflectance of ~99 % in the NUV region (350 ~ 500 nm), which is much better than Ag reflector. Optical power is higher than the Ag-LED up to 22 % @ 390 nm.

  2. High-performance gravitational N-body simulations on a planet-wide-distributed supercomputer

    Energy Technology Data Exchange (ETDEWEB)

    Groen, Derek; Zwart, Simon Portegies [Leiden Observatory, Leiden University, PO Box 9513, 2300 RA Leiden (Netherlands); Ishiyama, Tomoaki; Makino, Jun, E-mail: djgroen@strw.leidenuniv.nl [National Astronomical Observatory, Mitaka, Tokyo 181-8588 (Japan)

    2011-01-15

    We report on the performance of our cold dark matter cosmological N-body simulation that was carried out concurrently using supercomputers across the globe. We ran simulations on 60-750 cores distributed over a variety of supercomputers in Amsterdam (The Netherlands, Europe), in Tokyo (Japan, Asia), Edinburgh (UK, Europe) and Espoo (Finland, Europe). Regardless of the network latency of 0.32 s and the communication over 30 000 km of optical network cable, we are able to achieve {approx}87% of the performance compared to an equal number of cores on a single supercomputer. We argue that using widely distributed supercomputers in order to acquire more compute power is technically feasible and that the largest obstacle is introduced by local scheduling and reservation policies.

  3. A High-performance temporal-spatial discretization method for the parallel computing of river basins

    Science.gov (United States)

    Wang, Hao; Fu, Xudong; Wang, Yuanjian; Wang, Guangqian

    2013-08-01

    The distributed basin model (DBM) has become one of the most effective tools in river basin studies. In order to overcome the efficiency bottleneck of DBM, an effective parallel-computing method, named temporal-spatial discretization method (TSDM), is proposed. In space, TSDM adopts the sub-basin partitioning manner to the river basin. Compared to the existing sub-basin-based parallel methods, more computable units can be supplied, organized and dispatched using TSDM. Through the characteristic of the temporal-spatial dual discretization, TSDM is capable of exploiting the river-basin parallelization degree to the maximum extent and obtaining higher computing performance. A mathematical formula assessing the maximum speedup ratio (MSR) of TSDM is provided as well. TSDM is independent of the implementation of any physical models and is preliminarily tested in the Lhasa River basin with 1-year rainfall-runoff process simulated. The MSR acquired in the existing traditional way is 7.98. Comparatively, the MSR using TSDM equals to 15.04 under the present limited computing resources, which appears to still have potential to keep increasing. The final results demonstrate the effectiveness and applicability of TSDM.

  4. Π4U: A high performance computing framework for Bayesian uncertainty quantification of complex models

    Science.gov (United States)

    Hadjidoukas, P. E.; Angelikopoulos, P.; Papadimitriou, C.; Koumoutsakos, P.

    2015-03-01

    We present Π4U, an extensible framework, for non-intrusive Bayesian Uncertainty Quantification and Propagation (UQ+P) of complex and computationally demanding physical models, that can exploit massively parallel computer architectures. The framework incorporates Laplace asymptotic approximations as well as stochastic algorithms, along with distributed numerical differentiation and task-based parallelism for heterogeneous clusters. Sampling is based on the Transitional Markov Chain Monte Carlo (TMCMC) algorithm and its variants. The optimization tasks associated with the asymptotic approximations are treated via the Covariance Matrix Adaptation Evolution Strategy (CMA-ES). A modified subset simulation method is used for posterior reliability measurements of rare events. The framework accommodates scheduling of multiple physical model evaluations based on an adaptive load balancing library and shows excellent scalability. In addition to the software framework, we also provide guidelines as to the applicability and efficiency of Bayesian tools when applied to computationally demanding physical models. Theoretical and computational developments are demonstrated with applications drawn from molecular dynamics, structural dynamics and granular flow.

  5. Heterogeneous Gpu&Cpu Cluster For High Performance Computing In Cryptography

    Directory of Open Access Journals (Sweden)

    Michał Marks

    2012-01-01

    Full Text Available This paper addresses issues associated with distributed computing systems andthe application of mixed GPU&CPU technology to data encryption and decryptionalgorithms. We describe a heterogenous cluster HGCC formed by twotypes of nodes: Intel processor with NVIDIA graphics processing unit and AMDprocessor with AMD graphics processing unit (formerly ATI, and a novel softwareframework that hides the heterogeneity of our cluster and provides toolsfor solving complex scientific and engineering problems. Finally, we present theresults of numerical experiments. The considered case study is concerned withparallel implementations of selected cryptanalysis algorithms. The main goal ofthe paper is to show the wide applicability of the GPU&CPU technology tolarge scale computation and data processing.

  6. High Performance Computing: Advanced Research Projects Agency Should Do More To Foster Program Goals. Report to the Chairman, Committee on Armed Services, House of Representatives.

    Science.gov (United States)

    General Accounting Office, Washington, DC. Information Management and Technology Div.

    High-performance computing refers to the use of advanced computing technologies to solve highly complex problems in the shortest possible time. The federal High Performance Computing and Communications Initiative of the Advanced Research Project Agency (ARPA) attempts to accelerate availability and use of high performance computers and networks.…

  7. Development of a Computational Steering Framework for High Performance Computing Environments on Blue Gene/P Systems

    KAUST Repository

    Danani, Bob K.

    2012-07-01

    Computational steering has revolutionized the traditional workflow in high performance computing (HPC) applications. The standard workflow that consists of preparation of an application’s input, running of a simulation, and visualization of simulation results in a post-processing step is now transformed into a real-time interactive workflow that significantly reduces development and testing time. Computational steering provides the capability to direct or re-direct the progress of a simulation application at run-time. It allows modification of application-defined control parameters at run-time using various user-steering applications. In this project, we propose a computational steering framework for HPC environments that provides an innovative solution and easy-to-use platform, which allows users to connect and interact with running application(s) in real-time. This framework uses RealityGrid as the underlying steering library and adds several enhancements to the library to enable steering support for Blue Gene systems. Included in the scope of this project is the development of a scalable and efficient steering relay server that supports many-to-many connectivity between multiple steered applications and multiple steering clients. Steered applications can range from intermediate simulation and physical modeling applications to complex computational fluid dynamics (CFD) applications or advanced visualization applications. The Blue Gene supercomputer presents special challenges for remote access because the compute nodes reside on private networks. This thesis presents an implemented solution and demonstrates it on representative applications. Thorough implementation details and application enablement steps are also presented in this thesis to encourage direct usage of this framework.

  8. High-Performance Computing Act of 1990: Report of the Senate Committee on Commerce, Science, and Transportation on S. 1067.

    Science.gov (United States)

    Congress of the U.S., Washington, DC. Senate Committee on Commerce, Science, and Transportation.

    This committee report is intended to accompany S. 1067, a bill designed to provide for a coordinated federal research program in high-performance computing (HPC). The primary objective of the legislation is given as the acceleration of research, development, and application of the most advanced computing technology in research, education, and…

  9. The DoD's High Performance Computing Modernization Program - Ensuing the National Earth Systems Prediction Capability Becomes Operational

    Science.gov (United States)

    Burnett, W.

    2016-12-01

    The Department of Defense's (DoD) High Performance Computing Modernization Program (HPCMP) provides high performance computing to address the most significant challenges in computational resources, software application support and nationwide research and engineering networks. Today, the HPCMP has a critical role in ensuring the National Earth System Prediction Capability (N-ESPC) achieves initial operational status in 2019. A 2015 study commissioned by the HPCMP found that N-ESPC computational requirements will exceed interconnect bandwidth capacity due to the additional load from data assimilation and passing connecting data between ensemble codes. Memory bandwidth and I/O bandwidth will continue to be significant bottlenecks for the Navy's Hybrid Coordinate Ocean Model (HYCOM) scalability - by far the major driver of computing resource requirements in the N-ESPC. The study also found that few of the N-ESPC model developers have detailed plans to ensure their respective codes scale through 2024. Three HPCMP initiatives are designed to directly address and support these issues: Productivity Enhancement, Technology, Transfer and Training (PETTT), the HPCMP Applications Software Initiative (HASI), and Frontier Projects. PETTT supports code conversion by providing assistance, expertise and training in scalable and high-end computing architectures. HASI addresses the continuing need for modern application software that executes effectively and efficiently on next-generation high-performance computers. Frontier Projects enable research and development that could not be achieved using typical HPCMP resources by providing multi-disciplinary teams access to exceptional amounts of high performance computing resources. Finally, the Navy's DoD Supercomputing Resource Center (DSRC) currently operates a 6 Petabyte system, of which Naval Oceanography receives 15% of operational computational system use, or approximately 1 Petabyte of the processing capability. The DSRC will

  10. High performance communication by people with paralysis using an intracortical brain-computer interface

    Science.gov (United States)

    Pandarinath, Chethan; Nuyujukian, Paul; Blabe, Christine H; Sorice, Brittany L; Saab, Jad; Willett, Francis R; Hochberg, Leigh R

    2017-01-01

    Brain-computer interfaces (BCIs) have the potential to restore communication for people with tetraplegia and anarthria by translating neural activity into control signals for assistive communication devices. While previous pre-clinical and clinical studies have demonstrated promising proofs-of-concept (Serruya et al., 2002; Simeral et al., 2011; Bacher et al., 2015; Nuyujukian et al., 2015; Aflalo et al., 2015; Gilja et al., 2015; Jarosiewicz et al., 2015; Wolpaw et al., 1998; Hwang et al., 2012; Spüler et al., 2012; Leuthardt et al., 2004; Taylor et al., 2002; Schalk et al., 2008; Moran, 2010; Brunner et al., 2011; Wang et al., 2013; Townsend and Platsko, 2016; Vansteensel et al., 2016; Nuyujukian et al., 2016; Carmena et al., 2003; Musallam et al., 2004; Santhanam et al., 2006; Hochberg et al., 2006; Ganguly et al., 2011; O’Doherty et al., 2011; Gilja et al., 2012), the performance of human clinical BCI systems is not yet high enough to support widespread adoption by people with physical limitations of speech. Here we report a high-performance intracortical BCI (iBCI) for communication, which was tested by three clinical trial participants with paralysis. The system leveraged advances in decoder design developed in prior pre-clinical and clinical studies (Gilja et al., 2015; Kao et al., 2016; Gilja et al., 2012). For all three participants, performance exceeded previous iBCIs (Bacher et al., 2015; Jarosiewicz et al., 2015) as measured by typing rate (by a factor of 1.4–4.2) and information throughput (by a factor of 2.2–4.0). This high level of performance demonstrates the potential utility of iBCIs as powerful assistive communication devices for people with limited motor function. Clinical Trial No: NCT00912041 DOI: http://dx.doi.org/10.7554/eLife.18554.001 PMID:28220753

  11. High performance communication by people with paralysis using an intracortical brain-computer interface.

    Science.gov (United States)

    Pandarinath, Chethan; Nuyujukian, Paul; Blabe, Christine H; Sorice, Brittany L; Saab, Jad; Willett, Francis R; Hochberg, Leigh R; Shenoy, Krishna V; Henderson, Jaimie M

    2017-02-21

    Brain-computer interfaces (BCIs) have the potential to restore communication for people with tetraplegia and anarthria by translating neural activity into control signals for assistive communication devices. While previous pre-clinical and clinical studies have demonstrated promising proofs-of-concept (Serruya et al., 2002; Simeral et al., 2011; Bacher et al., 2015; Nuyujukian et al., 2015; Aflalo et al., 2015; Gilja et al., 2015; Jarosiewicz et al., 2015; Wolpaw et al., 1998; Hwang et al., 2012; Spüler et al., 2012; Leuthardt et al., 2004; Taylor et al., 2002; Schalk et al., 2008; Moran, 2010; Brunner et al., 2011; Wang et al., 2013; Townsend and Platsko, 2016; Vansteensel et al., 2016; Nuyujukian et al., 2016; Carmena et al., 2003; Musallam et al., 2004; Santhanam et al., 2006; Hochberg et al., 2006; Ganguly et al., 2011; O'Doherty et al., 2011; Gilja et al., 2012), the performance of human clinical BCI systems is not yet high enough to support widespread adoption by people with physical limitations of speech. Here we report a high-performance intracortical BCI (iBCI) for communication, which was tested by three clinical trial participants with paralysis. The system leveraged advances in decoder design developed in prior pre-clinical and clinical studies (Gilja et al., 2015; Kao et al., 2016; Gilja et al., 2012). For all three participants, performance exceeded previous iBCIs (Bacher et al., 2015; Jarosiewicz et al., 2015) as measured by typing rate (by a factor of 1.4-4.2) and information throughput (by a factor of 2.2-4.0). This high level of performance demonstrates the potential utility of iBCIs as powerful assistive communication devices for people with limited motor function.Clinical Trial No: NCT00912041.

  12. Leveraging High Performance Computing for Managing Large and Evolving Data Collections

    Directory of Open Access Journals (Sweden)

    Ritu Arora

    2014-10-01

    Full Text Available The process of developing a digital collection in the context of a research project often involves a pipeline pattern during which data growth, data types, and data authenticity need to be assessed iteratively in relation to the different research steps and in the interest of archiving. Throughout a project’s lifecycle curators organize newly generated data while cleaning and integrating legacy data when it exists, and deciding what data will be preserved for the long term. Although these actions should be part of a well-oiled data management workflow, there are practical challenges in doing so if the collection is very large and heterogeneous, or is accessed by several researchers contemporaneously. There is a need for data management solutions that can help curators with efficient and on-demand analyses of their collection so that they remain well-informed about its evolving characteristics. In this paper, we describe our efforts towards developing a workflow to leverage open science High Performance Computing (HPC resources for routinely and efficiently conducting data management tasks on large collections. We demonstrate that HPC resources and techniques can significantly reduce the time for accomplishing critical data management tasks, and enable a dynamic archiving throughout the research process. We use a large archaeological data collection with a long and complex formation history as our test case. We share our experiences in adopting open science HPC resources for large-scale data management, which entails understanding usage of the open source HPC environment and training users. These experiences can be generalized to meet the needs of other data curators working with large collections.

  13. LIAR -- A computer program for the modeling and simulation of high performance linacs

    Energy Technology Data Exchange (ETDEWEB)

    Assmann, R.; Adolphsen, C.; Bane, K.; Emma, P.; Raubenheimer, T.; Siemann, R.; Thompson, K.; Zimmermann, F.

    1997-04-01

    The computer program LIAR (LInear Accelerator Research Code) is a numerical modeling and simulation tool for high performance linacs. Amongst others, it addresses the needs of state-of-the-art linear colliders where low emittance, high-intensity beams must be accelerated to energies in the 0.05-1 TeV range. LIAR is designed to be used for a variety of different projects. LIAR allows the study of single- and multi-particle beam dynamics in linear accelerators. It calculates emittance dilutions due to wakefield deflections, linear and non-linear dispersion and chromatic effects in the presence of multiple accelerator imperfections. Both single-bunch and multi-bunch beams can be simulated. Several basic and advanced optimization schemes are implemented. Present limitations arise from the incomplete treatment of bending magnets and sextupoles. A major objective of the LIAR project is to provide an open programming platform for the accelerator physics community. Due to its design, LIAR allows straight-forward access to its internal FORTRAN data structures. The program can easily be extended and its interactive command language ensures maximum ease of use. Presently, versions of LIAR are compiled for UNIX and MS Windows operating systems. An interface for the graphical visualization of results is provided. Scientific graphs can be saved in the PS and EPS file formats. In addition a Mathematica interface has been developed. LIAR now contains more than 40,000 lines of source code in more than 130 subroutines. This report describes the theoretical basis of the program, provides a reference for existing features and explains how to add further commands. The LIAR home page and the ONLINE version of this manual can be accessed under: http://www.slac.stanford.edu/grp/arb/rwa/liar.htm.

  14. DAIRRy-BLUP: a high-performance computing approach to genomic prediction.

    Science.gov (United States)

    De Coninck, Arne; Fostier, Jan; Maenhout, Steven; De Baets, Bernard

    2014-07-01

    In genomic prediction, common analysis methods rely on a linear mixed-model framework to estimate SNP marker effects and breeding values of animals or plants. Ridge regression-best linear unbiased prediction (RR-BLUP) is based on the assumptions that SNP marker effects are normally distributed, are uncorrelated, and have equal variances. We propose DAIRRy-BLUP, a parallel, Distributed-memory RR-BLUP implementation, based on single-trait observations ( Y: ), that uses the Average Information algorithm for restricted maximum-likelihood estimation of the variance components. The goal of DAIRRy-BLUP is to enable the analysis of large-scale data sets to provide more accurate estimates of marker effects and breeding values. A distributed-memory framework is required since the dimensionality of the problem, determined by the number of SNP markers, can become too large to be analyzed by a single computing node. Initial results show that DAIRRy-BLUP enables the analysis of very large-scale data sets (up to 1,000,000 individuals and 360,000 SNPs) and indicate that increasing the number of phenotypic and genotypic records has a more significant effect on the prediction accuracy than increasing the density of SNP arrays. Copyright © 2014 by the Genetics Society of America.

  15. iSSH v. Auditd: Intrusion Detection in High Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Karns, David M. [Los Alamos National Laboratory; Protin, Kathryn S. [Los Alamos National Laboratory; Wolf, Justin G. [Los Alamos National Laboratory

    2012-07-30

    The goal is to provide insight into intrusions in high performance computing, focusing on tracking intruders motions through the system. The current tools, such as pattern matching, do not provide sufficient tracking capabilities. We tested two tools: an instrumented version of SSH (iSSH) and Linux Auditing Framework (Auditd). First discussed is Instrumented Secure Shell (iSSH): a version of SSH developed at Lawrence Berkeley National Laboratory. The goal is to audit user activity within a computer system to increase security. Capabilities are: Keystroke logging, Records user names and authentication information, and Catching suspicious remote and local commands. Strengths for iSSH are: (1) Good for keystroke logging, making it easier to track malicious users by catching suspicious commands; (2) Works with Bro to send alerts; could be configured to send pages to systems administrators; and (3) Creates visibility into SSH sessions. Weaknesses are: (1) Relatively new, so not very well documented; and (2) No capabilities to see if files have been edited, moved, or copied within the system. Second we discuss Auditd, the user component of the Linux Auditing System. It creates logs of user behavior, and monitors systems calls and file accesses. Its goal is to improve system security by keeping track of users actions within the system. Strenghts of Auditd are: (1) Very thorough logs; (2) Wider variety of tracking abilities than iSSH; and (3) Older, so better documented. Weaknesses are: (1) Logs record everything, not just malicious behavior; (2) The size of the logs can lead to overflowing directories; and (3) This level of logging leads to a lot of false alarms. Auditd is better documented than iSSH, which would help administrators during set up and troubleshooting. iSSH has a cleaner notification system, but the logs are not as detailed as Auditd. From our performance testing: (1) File transfer speed using SCP is increased when using iSSH; and (2) Network benchmarks

  16. SAME4HPC: A Promising Approach in Building a Scalable and Mobile Environment for High-Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Karthik, Rajasekar [ORNL

    2014-01-01

    In this paper, an architecture for building Scalable And Mobile Environment For High-Performance Computing with spatial capabilities called SAME4HPC is described using cutting-edge technologies and standards such as Node.js, HTML5, ECMAScript 6, and PostgreSQL 9.4. Mobile devices are increasingly becoming powerful enough to run high-performance apps. At the same time, there exist a significant number of low-end and older devices that rely heavily on the server or the cloud infrastructure to do the heavy lifting. Our architecture aims to support both of these types of devices to provide high-performance and rich user experience. A cloud infrastructure consisting of OpenStack with Ubuntu, GeoServer, and high-performance JavaScript frameworks are some of the key open-source and industry standard practices that has been adopted in this architecture.

  17. Infrastructure for Multiphysics Software Integration in High Performance Computing-Aided Science and Engineering

    Energy Technology Data Exchange (ETDEWEB)

    Campbell, Michael T. [Illinois Rocstar LLC, Champaign, IL (United States); Safdari, Masoud [Illinois Rocstar LLC, Champaign, IL (United States); Kress, Jessica E. [Illinois Rocstar LLC, Champaign, IL (United States); Anderson, Michael J. [Illinois Rocstar LLC, Champaign, IL (United States); Horvath, Samantha [Illinois Rocstar LLC, Champaign, IL (United States); Brandyberry, Mark D. [Illinois Rocstar LLC, Champaign, IL (United States); Kim, Woohyun [Illinois Rocstar LLC, Champaign, IL (United States); Sarwal, Neil [Illinois Rocstar LLC, Champaign, IL (United States); Weisberg, Brian [Illinois Rocstar LLC, Champaign, IL (United States)

    2016-10-15

    The project described in this report constructed and exercised an innovative multiphysics coupling toolkit called the Illinois Rocstar MultiPhysics Application Coupling Toolkit (IMPACT). IMPACT is an open source, flexible, natively parallel infrastructure for coupling multiple uniphysics simulation codes into multiphysics computational systems. IMPACT works with codes written in several high-performance-computing (HPC) programming languages, and is designed from the beginning for HPC multiphysics code development. It is designed to be minimally invasive to the individual physics codes being integrated, and has few requirements on those physics codes for integration. The goal of IMPACT is to provide the support needed to enable coupling existing tools together in unique and innovative ways to produce powerful new multiphysics technologies without extensive modification and rewrite of the physics packages being integrated. There are three major outcomes from this project: 1) construction, testing, application, and open-source release of the IMPACT infrastructure, 2) production of example open-source multiphysics tools using IMPACT, and 3) identification and engagement of interested organizations in the tools and applications resulting from the project. This last outcome represents the incipient development of a user community and application echosystem being built using IMPACT. Multiphysics coupling standardization can only come from organizations working together to define needs and processes that span the space of necessary multiphysics outcomes, which Illinois Rocstar plans to continue driving toward. The IMPACT system, including source code, documentation, and test problems are all now available through the public gitHUB.org system to anyone interested in multiphysics code coupling. Many of the basic documents explaining use and architecture of IMPACT are also attached as appendices to this document. Online HTML documentation is available through the gitHUB site

  18. HyFlow: A High Performance Distributed Software Transactional Memory Framework

    OpenAIRE

    Saad Ibrahim, Mohamed Mohamed

    2011-01-01

    We present HyFlow - a distributed software transactional memory (D-STM) framework for distributed concurrency control. Lock-based concurrency control suffers from drawbacks including deadlocks, livelocks, and scalability and composability challenges. These problems are exacerbated in distributed systems due to their distributed versions which are more complex to cope with (e.g., distributed deadlocks). STM and D-STM are promising alternatives to lock-based and distributed lock-based concurren...

  19. High performance bioinformatics and computational biology on general-purpose graphics processing units

    OpenAIRE

    Ling, Cheng

    2012-01-01

    Bioinformatics and Computational Biology (BCB) is a relatively new multidisciplinary field which brings together many aspects of the fields of biology, computer science, statistics, and engineering. Bioinformatics extracts useful information from biological data and makes these more intuitive and understandable by applying principles of information sciences, while computational biology harnesses computational approaches and technologies to answer biological questions conveni...

  20. Yabi: An online research environment for grid, high performance and cloud computing

    Directory of Open Access Journals (Sweden)

    Hunter Adam A

    2012-02-01

    Full Text Available Abstract Background There is a significant demand for creating pipelines or workflows in the life science discipline that chain a number of discrete compute and data intensive analysis tasks into sophisticated analysis procedures. This need has led to the development of general as well as domain-specific workflow environments that are either complex desktop applications or Internet-based applications. Complexities can arise when configuring these applications in heterogeneous compute and storage environments if the execution and data access models are not designed appropriately. These complexities manifest themselves through limited access to available HPC resources, significant overhead required to configure tools and inability for users to simply manage files across heterogenous HPC storage infrastructure. Results In this paper, we describe the architecture of a software system that is adaptable to a range of both pluggable execution and data backends in an open source implementation called Yabi. Enabling seamless and transparent access to heterogenous HPC environments at its core, Yabi then provides an analysis workflow environment that can create and reuse workflows as well as manage large amounts of both raw and processed data in a secure and flexible way across geographically distributed compute resources. Yabi can be used via a web-based environment to drag-and-drop tools to create sophisticated workflows. Yabi can also be accessed through the Yabi command line which is designed for users that are more comfortable with writing scripts or for enabling external workflow environments to leverage the features in Yabi. Configuring tools can be a significant overhead in workflow environments. Yabi greatly simplifies this task by enabling system administrators to configure as well as manage running tools via a web-based environment and without the need to write or edit software programs or scripts. In this paper, we highlight Yabi's capabilities

  1. Yabi: An online research environment for grid, high performance and cloud computing.

    Science.gov (United States)

    Hunter, Adam A; Macgregor, Andrew B; Szabo, Tamas O; Wellington, Crispin A; Bellgard, Matthew I

    2012-02-15

    There is a significant demand for creating pipelines or workflows in the life science discipline that chain a number of discrete compute and data intensive analysis tasks into sophisticated analysis procedures. This need has led to the development of general as well as domain-specific workflow environments that are either complex desktop applications or Internet-based applications. Complexities can arise when configuring these applications in heterogeneous compute and storage environments if the execution and data access models are not designed appropriately. These complexities manifest themselves through limited access to available HPC resources, significant overhead required to configure tools and inability for users to simply manage files across heterogenous HPC storage infrastructure. In this paper, we describe the architecture of a software system that is adaptable to a range of both pluggable execution and data backends in an open source implementation called Yabi. Enabling seamless and transparent access to heterogenous HPC environments at its core, Yabi then provides an analysis workflow environment that can create and reuse workflows as well as manage large amounts of both raw and processed data in a secure and flexible way across geographically distributed compute resources. Yabi can be used via a web-based environment to drag-and-drop tools to create sophisticated workflows. Yabi can also be accessed through the Yabi command line which is designed for users that are more comfortable with writing scripts or for enabling external workflow environments to leverage the features in Yabi. Configuring tools can be a significant overhead in workflow environments. Yabi greatly simplifies this task by enabling system administrators to configure as well as manage running tools via a web-based environment and without the need to write or edit software programs or scripts. In this paper, we highlight Yabi's capabilities through a range of bioinformatics use cases

  2. An Investigation on the Aggregation and Rheodynamics of Human Red Blood Cells Using High Performance Computations

    Directory of Open Access Journals (Sweden)

    Dong Xu

    2017-01-01

    Full Text Available Studies on the haemodynamics of human circulation are clinically and scientifically important. In order to investigate the effect of deformation and aggregation of red blood cells (RBCs in blood flow, a computational technique has been developed by coupling the interaction between the fluid and the deformable RBCs. Parallelization was carried out for the coupled code and a high speedup was achieved based on a spatial decomposition. In order to verify the code’s capability of simulating RBC deformation and transport, simulations were carried out for a spherical capsule in a microchannel and multiple RBC transport in a Poiseuille flow. RBC transport in a confined tube was also carried out to simulate the peristaltic effects of microvessels. Relatively large-scale simulations were carried out of the motion of 49,512 RBCs in shear flows, which yielded a hematocrit of 45%. The large-scale feature of the simulation has enabled a macroscale verification and investigation of the overall characteristics of RBC aggregations to be carried out. The results are in excellent agreement with experimental studies and, more specifically, both the experimental and simulation results show uniform RBC distributions under high shear rates (60–100/s whereas large aggregations were observed under a lower shear rate of 10/s.

  3. Computing compound distributions faster

    OpenAIRE

    Iseger, P.; Smith, M.A.J.; Dekker, Rommert

    1997-01-01

    textabstractThe use of Panjer's algorithm has meanwhile become a widespread standard technique for actuaries (Kuon et al., 1955). Panjer's recursion formula is used for the evaluation of compound distributions and can be applied to life and general insurance problems. The discrete version of Panjer's recursion formula is often applied to continuous distributions by discretizing the

  4. Implementing a High Performance Work Place in the Distribution and Logistics Industry: Recommendations for Leadership & Team Member Development

    Science.gov (United States)

    McCann, Laura Harding

    2012-01-01

    Leadership development and employee engagement are two elements critical to the success of organizations. In response to growth opportunities, our Distribution and Logistics company set on a course to implement High Performance Work Place to meet the leadership and employee engagement needs, and to find methods for improving work processes. This…

  5. Hydra: a self regenerating high performance computing grid for drug discovery.

    Science.gov (United States)

    Bullard, Drew; Gobbi, Alberto; Lardy, Matthew A; Perkins, Charles; Little, Zach

    2008-04-01

    Computer aided drug design is progressing and playing an increasingly important role in drug discovery. Computational methods are being used to evaluate larger and larger numbers of real and virtual compounds. New methods based on molecular simulations that take protein and ligand flexibility into account also contribute to an ever increasing need for compute time. Computational grids are therefore becoming a critically important tool for modern drug discovery, but can be expensive to deploy and maintain. Here, we describe the low cost implementation of a 165 node, computational grid at Anadys Pharmaceuticals. The grid makes use of the excess computing capacity of desktop computers deployed throughout the company and of outdated desktop computers which populate a central computing grid. The performance of the grid grows automatically with the size of the company and with advances in computer technology. To ensure the uniformity of the nodes in the grid, all computers are running the Linux operating system. The desktop computers run Linux inside MS Windows using coLinux as virtualization software. HYDRA has been used to optimize computational models, for virtual screening and for lead optimization.

  6. A high-performance model for shallow-water simulations in distributed and heterogeneous architectures

    Science.gov (United States)

    Conde, Daniel; Canelas, Ricardo B.; Ferreira, Rui M. L.

    2017-04-01

    One of the most common challenges in hydrodynamic modelling is the trade off one must make between highly resolved simulations and the time required for their computation. In the particular case of urban floods, modelers are often forced to simplify the complex geometries of the problem, or to implicitly include some of its hydrodynamic effects, due to the typically very large spatial scales involved and limited computational resources. At CEris - Instituto Superior Técnico, Universidade de Lisboa - the STAV-2D shallow-water model, particularly suited for strong transient flows in complex and dynamic geometries, has been under development for the past recent years (Canelas et al., 2013 & Conde et al., 2013). The model is based on an explicit, first-order 2DH finite-volume discretization scheme for unstructured triangular meshes, in which a flux-splitting technique is paired with a reviewed Roe-Riemann solver, yielding a model applicable to discontinuous flows over time-evolving geometries. STAV-2D features solid transport in both Euleran and Lagrangian forms, with the first aiming at describing the transport of fine natural sediments and the latter aimed at large individual debris. The model has been validated with theoretical solutions and laboratory experiments (Canelas et al., 2013 & Conde et al., 2015). This work presents our most recent effort in STAV-2D: the re-design of the code in a modern Object-Oriented parallel framework for heterogeneous computations in CPUs and GPUs. The programming language of choice for this re-design was C++, due to its wide support of established and emerging parallel programming interfaces. The current implementation of STAV-2D provides two different levels of parallel granularity: inter-node and intra-node. Inter-node parallelism is achieved by distributing a simulation across a set of worker nodes, with communication between nodes being explicitly managed through MPI. At this level, the main difficulty is associated with the

  7. Distributed computing for global health

    CERN Document Server

    CERN. Geneva; Schwede, Torsten; Moore, Celia; Smith, Thomas E; Williams, Brian; Grey, François

    2005-01-01

    Distributed computing harnesses the power of thousands of computers within organisations or over the Internet. In order to tackle global health problems, several groups of researchers have begun to use this approach to exceed by far the computing power of a single lab. This event illustrates how companies, research institutes and the general public are contributing their computing power to these efforts, and what impact this may have on a range of world health issues. Grids for neglected diseases Vincent Breton, CNRS/EGEE This talk introduces the topic of distributed computing, explaining the similarities and differences between Grid computing, volunteer computing and supercomputing, and outlines the potential of Grid computing for tackling neglected diseases where there is little economic incentive for private R&D efforts. Recent results on malaria drug design using the Grid infrastructure of the EU-funded EGEE project, which is coordinated by CERN and involves 70 partners in Europe, the US and Russi...

  8. High-Performance Computing Act of 1991. Report of the Senate Committee on Commerce, Science, and Transportation on S. 272. Senate, 102d Congress, 1st Session.

    Science.gov (United States)

    Congress of the U.S., Washington, DC. Senate Committee on Commerce, Science, and Transportation.

    This report discusses Senate Bill no. 272, which provides for a coordinated federal research and development program to ensure continued U.S. leadership in high-performance computing. High performance computing is defined as representing the leading edge of technological advancement in computing, i.e., the most sophisticated computer chips, the…

  9. Computing Battery Lifetime Distributions

    NARCIS (Netherlands)

    Cloth, L.; Haverkort, Boudewijn R.H.M.; Jongerden, M.R.

    The usage of mobile devices like cell phones, navigation systems, or laptop computers, is limited by the lifetime of the included batteries. This lifetime depends naturally on the rate at which energy is consumed, however, it also depends on the usage pattern of the battery. Continuous drawing of a

  10. Unified, Cross-Platform, Open-Source Library Package for High-Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Kozacik, Stephen [EM Photonics, Inc., Newark, DE (United States)

    2017-05-15

    Compute power is continually increasing, but this increased performance is largely found in sophisticated computing devices and supercomputer resources that are difficult to use, resulting in under-utilization. We developed a unified set of programming tools that will allow users to take full advantage of the new technology by allowing them to work at a level abstracted away from the platform specifics, encouraging the use of modern computing systems, including government-funded supercomputer facilities.

  11. High Performance Computing and Storage Requirements for Biological and Environmental Research Target 2017

    Energy Technology Data Exchange (ETDEWEB)

    Gerber, Richard [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC); Wasserman, Harvey [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)

    2013-05-01

    The National Energy Research Scientific Computing Center (NERSC) is the primary computing center for the DOE Office of Science, serving approximately 4,500 users working on some 650 projects that involve nearly 600 codes in a wide variety of scientific disciplines. In addition to large-­scale computing and storage resources NERSC provides support and expertise that help scientists make efficient use of its systems. The latest review revealed several key requirements, in addition to achieving its goal of characterizing BER computing and storage needs.

  12. Computing compound distributions faster

    NARCIS (Netherlands)

    P. den Iseger; M.A.J. Smith; R. Dekker (Rommert)

    1997-01-01

    textabstractThe use of Panjer's algorithm has meanwhile become a widespread standard technique for actuaries (Kuon et al., 1955). Panjer's recursion formula is used for the evaluation of compound distributions and can be applied to life and general insurance problems. The discrete version of

  13. A Distributed Computational Infrastructure for Science and Education

    Directory of Open Access Journals (Sweden)

    Rustam K. Bazarov

    2014-06-01

    Full Text Available Researchers have lately been paying increasingly more attention to parallel and distributed algorithms for solving high-dimensionality problems. In this regard, the issue of acquiring or renting computational resources becomes a topical one for employees of scientific and educational institutions. This article examines technology and methods for organizing a distributed computational infrastructure. The author addresses the experience of creating a high-performance system powered by existing clusterization and grid computing technology. The approach examined in the article helps minimize financial costs, aggregate territorially distributed computational resources and ensures a more rational use of available computer equipment, eliminating its downtimes.

  14. Parallel-META: efficient metagenomic data analysis based on high-performance computation.

    Science.gov (United States)

    Su, Xiaoquan; Xu, Jian; Ning, Kang

    2012-01-01

    Metagenomics method directly sequences and analyses genome information from microbial communities. There are usually more than hundreds of genomes from different microbial species in the same community, and the main computational tasks for metagenomic data analyses include taxonomical and functional component examination of all genomes in the microbial community. Metagenomic data analysis is both data- and computation- intensive, which requires extensive computational power. Most of the current metagenomic data analysis softwares were designed to be used on a single computer or single computer clusters, which could not match with the fast increasing number of large metagenomic projects' computational requirements. Therefore, advanced computational methods and pipelines have to be developed to cope with such need for efficient analyses. In this paper, we proposed Parallel-META, a GPU- and multi-core-CPU-based open-source pipeline for metagenomic data analysis, which enabled the efficient and parallel analysis of multiple metagenomic datasets and the visualization of the results for multiple samples. In Parallel-META, the similarity-based database search was parallelized based on GPU computing and multi-core CPU computing optimization. Experiments have shown that Parallel-META has at least 15 times speed-up compared to traditional metagenomic data analysis method, with the same accuracy of the results http://www.computationalbioenergy.org/parallel-meta.html. The parallel processing of current metagenomic data would be very promising: with current speed up of 15 times and above, binning would not be a very time-consuming process any more. Therefore, some deeper analysis of the metagenomic data, such as the comparison of different samples, would be feasible in the pipeline, and some of these functionalities have been included into the Parallel-META pipeline.

  15. High-Performance Computing for the Electromagnetic Modeling and Simulation of Interconnects

    Science.gov (United States)

    Schutt-Aine, Jose E.

    1996-01-01

    The electromagnetic modeling of packages and interconnects plays a very important role in the design of high-speed digital circuits, and is most efficiently performed by using computer-aided design algorithms. In recent years, packaging has become a critical area in the design of high-speed communication systems and fast computers, and the importance of the software support for their development has increased accordingly. Throughout this project, our efforts have focused on the development of modeling and simulation techniques and algorithms that permit the fast computation of the electrical parameters of interconnects and the efficient simulation of their electrical performance.

  16. A GPU-based real time high performance computing service in a fast plant system controller prototype for ITER

    Energy Technology Data Exchange (ETDEWEB)

    Nieto, J., E-mail: jnieto@sec.upm.es [Grupo de Investigacion en Instrumentacion y Acustica Aplicada. Universidad Politecnica de Madrid, Crta. Valencia Km-7, Madrid 28031 Spain (Spain); Arcas, G. de; Ruiz, M. [Grupo de Investigacion en Instrumentacion y Acustica Aplicada. Universidad Politecnica de Madrid, Crta. Valencia Km-7, Madrid 28031 Spain (Spain); Vega, J. [Asociacion EURATOM/CIEMAT para Fusion, Madrid (Spain); Lopez, J.M.; Barrera, E. [Grupo de Investigacion en Instrumentacion y Acustica Aplicada. Universidad Politecnica de Madrid, Crta. Valencia Km-7, Madrid 28031 Spain (Spain); Castro, R. [Asociacion EURATOM/CIEMAT para Fusion, Madrid (Spain); Sanz, D. [Grupo de Investigacion en Instrumentacion y Acustica Aplicada. Universidad Politecnica de Madrid, Crta. Valencia Km-7, Madrid 28031 Spain (Spain); Utzel, N.; Makijarvi, P.; Zabeo, L. [ITER Organization, CS 90 046, 13067 St. Paul lez Durance Cedex (France)

    2012-12-15

    Highlights: Black-Right-Pointing-Pointer Implementation of fast plant system controller (FPSC) for ITER CODAC. Black-Right-Pointing-Pointer GPU-based real time high performance computing service. Black-Right-Pointing-Pointer Performance evaluation with respect to other solutions based in multi-core processors. - Abstract: EURATOM/CIEMAT and the Technical University of Madrid UPM are involved in the development of a FPSC (fast plant system control) prototype for ITER based on PXIe form factor. The FPSC architecture includes a GPU-based real time high performance computing service which has been integrated under EPICS (experimental physics and industrial control system). In this work we present the design of this service and its performance evaluation with respect to other solutions based in multi-core processors. Plasma pre-processing algorithms, illustrative of the type of tasks that could be required for both control and diagnostics, are used during the performance evaluation.

  17. FY 1994 Blue Book: High Performance Computing and Communications: Toward a National Information Infrastructure

    Data.gov (United States)

    Networking and Information Technology Research and Development, Executive Office of the President — government and industry that advanced computer and telecommunications technologies could provide huge benefits throughout the research community and the entire U.S....

  18. FY 2000 Blue Book: High Performance Computing and Communications: Information Technology Frontiers for a New Millennium

    Data.gov (United States)

    Networking and Information Technology Research and Development, Executive Office of the President — As we near the dawn of a new millennium, advances made possible by computing, information, and communications research and development R and D ? once barely...

  19. JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing.

    Science.gov (United States)

    Brown, David K; Penkler, David L; Musyoka, Thommas M; Bishop, Özlem Tastan

    2015-01-01

    Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS.

  20. JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing.

    Directory of Open Access Journals (Sweden)

    David K Brown

    Full Text Available Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS, a workflow management system and web interface for high performance computing (HPC. JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS.

  1. JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing

    Science.gov (United States)

    Brown, David K.; Penkler, David L.; Musyoka, Thommas M.; Bishop, Özlem Tastan

    2015-01-01

    Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS. PMID:26280450

  2. High performance computing for deformable image registration: towards a new paradigm in adaptive radiotherapy.

    Science.gov (United States)

    Samant, Sanjiv S; Xia, Junyi; Muyan-Ozcelik, Pinar; Owens, John D

    2008-08-01

    The advent of readily available temporal imaging or time series volumetric (4D) imaging has become an indispensable component of treatment planning and adaptive radiotherapy (ART) at many radiotherapy centers. Deformable image registration (DIR) is also used in other areas of medical imaging, including motion corrected image reconstruction. Due to long computation time, clinical applications of DIR in radiation therapy and elsewhere have been limited and consequently relegated to offline analysis. With the recent advances in hardware and software, graphics processing unit (GPU) based computing is an emerging technology for general purpose computation, including DIR, and is suitable for highly parallelized computing. However, traditional general purpose computation on the GPU is limited because the constraints of the available programming platforms. As well, compared to CPU programming, the GPU currently has reduced dedicated processor memory, which can limit the useful working data set for parallelized processing. We present an implementation of the demons algorithm using the NVIDIA 8800 GTX GPU and the new CUDA programming language. The GPU performance will be compared with single threading and multithreading CPU implementations on an Intel dual core 2.4 GHz CPU using the C programming language. CUDA provides a C-like language programming interface, and allows for direct access to the highly parallel compute units in the GPU. Comparisons for volumetric clinical lung images acquired using 4DCT were carried out. Computation time for 100 iterations in the range of 1.8-13.5 s was observed for the GPU with image size ranging from 2.0 x 10(6) to 14.2 x 10(6) pixels. The GPU registration was 55-61 times faster than the CPU for the single threading implementation, and 34-39 times faster for the multithreading implementation. For CPU based computing, the computational time generally has a linear dependence on image size for medical imaging data. Computational efficiency is

  3. Human and Robotic Space Mission Use Cases for High-Performance Spaceflight Computing

    Science.gov (United States)

    Doyle, Richard; Bergman, Larry; Some, Raphael; Whitaker, William; Powell, Wesley; Johnson, Michael; Goforth, Montgomery; Lowry, Michael

    2013-01-01

    Spaceflight computing is a key resource in NASA space missions and a core determining factor of spacecraft capability, with ripple effects throughout the spacecraft, end-to-end system, and the mission; it can be aptly viewed as a "technology multiplier" in that advances in onboard computing provide dramatic improvements in flight functions and capabilities across the NASA mission classes, and will enable new flight capabilities and mission scenarios, increasing science and exploration return per mission-dollar.

  4. Acquisition of a High Performance Computing Instrument for Big Data Research and Education

    Science.gov (United States)

    2015-12-03

    for big data research and education at North Carolina Agricultural and Technical State University (NCA&T). This instrument will promote innovation and...translational big data research at NCA&T. It will make NCA&T (a land grant HBCU) a strong player in big data research and education. The research...Science, Computer Systems Technology, and Electrical and Computer Engineering at 1. REPORT DATE (DD-MM-YYYY) 4. TITLE AND SUBTITLE 13. SUPPLEMENTARY

  5. Application of High Performance Computing for Simulations of N-Dodecane Jet Spray with Evaporation

    Science.gov (United States)

    2016-11-01

    multicomponent fluids such as n-dodecane. The long-term goal of this research is to incorporate these models into future simulations of turbulent jet...performance computing, fuel, spray, large eddy simulation, computational fluid dynamics 16. SECURITY CLASSIFICATION OF: 17. LIMITATION OF ABSTRACT SAR...Research Experience Prior to this summer’s internship, I held 2 Master’s degrees in Mechanical and Nuclear Engineering. I had focused on plasma physics at

  6. Money for Research, Not for Energy Bills: Finding Energy and Cost Savings in High Performance Computer Facility Designs

    Energy Technology Data Exchange (ETDEWEB)

    Drewmark Communications; Sartor, Dale; Wilson, Mark

    2010-07-01

    High-performance computing facilities in the United States consume an enormous amount of electricity, cutting into research budgets and challenging public- and private-sector efforts to reduce energy consumption and meet environmental goals. However, these facilities can greatly reduce their energy demand through energy-efficient design of the facility itself. Using a case study of a facility under design, this article discusses strategies and technologies that can be used to help achieve energy reductions.

  7. High-Performance Computing and Four-Dimensional Data Assimilation: The Impact on Future and Current Problems

    Science.gov (United States)

    Makivic, Miloje S.

    1996-01-01

    This is the final technical report for the project entitled: "High-Performance Computing and Four-Dimensional Data Assimilation: The Impact on Future and Current Problems", funded at NPAC by the DAO at NASA/GSFC. First, the motivation for the project is given in the introductory section, followed by the executive summary of major accomplishments and the list of project-related publications. Detailed analysis and description of research results is given in subsequent chapters and in the Appendix.

  8. Application of high-performance computing to numerical simulation of human movement

    Science.gov (United States)

    Anderson, F. C.; Ziegler, J. M.; Pandy, M. G.; Whalen, R. T.

    1995-01-01

    We have examined the feasibility of using massively-parallel and vector-processing supercomputers to solve large-scale optimization problems for human movement. Specifically, we compared the computational expense of determining the optimal controls for the single support phase of gait using a conventional serial machine (SGI Iris 4D25), a MIMD parallel machine (Intel iPSC/860), and a parallel-vector-processing machine (Cray Y-MP 8/864). With the human body modeled as a 14 degree-of-freedom linkage actuated by 46 musculotendinous units, computation of the optimal controls for gait could take up to 3 months of CPU time on the Iris. Both the Cray and the Intel are able to reduce this time to practical levels. The optimal solution for gait can be found with about 77 hours of CPU on the Cray and with about 88 hours of CPU on the Intel. Although the overall speeds of the Cray and the Intel were found to be similar, the unique capabilities of each machine are better suited to different portions of the computational algorithm used. The Intel was best suited to computing the derivatives of the performance criterion and the constraints whereas the Cray was best suited to parameter optimization of the controls. These results suggest that the ideal computer architecture for solving very large-scale optimal control problems is a hybrid system in which a vector-processing machine is integrated into the communication network of a MIMD parallel machine.

  9. Implementation of the Two-Point Angular Correlation Function on a High-Performance Reconfigurable Computer

    Directory of Open Access Journals (Sweden)

    Volodymyr V. Kindratenko

    2009-01-01

    Full Text Available We present a parallel implementation of an algorithm for calculating the two-point angular correlation function as applied in the field of computational cosmology. The algorithm has been specifically developed for a reconfigurable computer. Our implementation utilizes a microprocessor and two reconfigurable processors on a dual-MAP SRC-6 system. The two reconfigurable processors are used as two application-specific co-processors. Two independent computational kernels are simultaneously executed on the reconfigurable processors while data pre-fetching from disk and initial data pre-processing are executed on the microprocessor. The overall end-to-end algorithm execution speedup achieved by this implementation is over 90× as compared to a sequential implementation of the algorithm executed on a single 2.8 GHz Intel Xeon microprocessor.

  10. A high-performance computing toolset for relatedness and principal component analysis of SNP data.

    Science.gov (United States)

    Zheng, Xiuwen; Levine, David; Shen, Jess; Gogarten, Stephanie M; Laurie, Cathy; Weir, Bruce S

    2012-12-15

    Genome-wide association studies are widely used to investigate the genetic basis of diseases and traits, but they pose many computational challenges. We developed gdsfmt and SNPRelate (R packages for multi-core symmetric multiprocessing computer architectures) to accelerate two key computations on SNP data: principal component analysis (PCA) and relatedness analysis using identity-by-descent measures. The kernels of our algorithms are written in C/C++ and highly optimized. Benchmarks show the uniprocessor implementations of PCA and identity-by-descent are ∼8-50 times faster than the implementations provided in the popular EIGENSTRAT (v3.0) and PLINK (v1.07) programs, respectively, and can be sped up to 30-300-fold by using eight cores. SNPRelate can analyse tens of thousands of samples with millions of SNPs. For example, our package was used to perform PCA on 55 324 subjects from the 'Gene-Environment Association Studies' consortium studies.

  11. Distributed GPU Computing in GIScience

    Science.gov (United States)

    Jiang, Y.; Yang, C.; Huang, Q.; Li, J.; Sun, M.

    2013-12-01

    Geoscientists strived to discover potential principles and patterns hidden inside ever-growing Big Data for scientific discoveries. To better achieve this objective, more capable computing resources are required to process, analyze and visualize Big Data (Ferreira et al., 2003; Li et al., 2013). Current CPU-based computing techniques cannot promptly meet the computing challenges caused by increasing amount of datasets from different domains, such as social media, earth observation, environmental sensing (Li et al., 2013). Meanwhile CPU-based computing resources structured as cluster or supercomputer is costly. In the past several years with GPU-based technology matured in both the capability and performance, GPU-based computing has emerged as a new computing paradigm. Compare to traditional computing microprocessor, the modern GPU, as a compelling alternative microprocessor, has outstanding high parallel processing capability with cost-effectiveness and efficiency(Owens et al., 2008), although it is initially designed for graphical rendering in visualization pipe. This presentation reports a distributed GPU computing framework for integrating GPU-based computing within distributed environment. Within this framework, 1) for each single computer, computing resources of both GPU-based and CPU-based can be fully utilized to improve the performance of visualizing and processing Big Data; 2) within a network environment, a variety of computers can be used to build up a virtual super computer to support CPU-based and GPU-based computing in distributed computing environment; 3) GPUs, as a specific graphic targeted device, are used to greatly improve the rendering efficiency in distributed geo-visualization, especially for 3D/4D visualization. Key words: Geovisualization, GIScience, Spatiotemporal Studies Reference : 1. Ferreira de Oliveira, M. C., & Levkowitz, H. (2003). From visual data exploration to visual data mining: A survey. Visualization and Computer Graphics, IEEE

  12. High-performance secure multi-party computation for data mining applications

    DEFF Research Database (Denmark)

    Bogdanov, Dan; Niitsoo, Margus; Toft, Tomas

    2012-01-01

    Secure multi-party computation (MPC) is a technique well suited for privacy-preserving data mining. Even with the recent progress in two-party computation techniques such as fully homomorphic encryption, general MPC remains relevant as it has shown promising performance metrics in real...... operations such as multiplication and comparison. Secondly, the confidential processing of financial data requires the use of more complex primitives, including a secure division operation. This paper describes new protocols in the Sharemind model for secure multiplication, share conversion, equality, bit...

  13. High performance computing for a 3-D optical diffraction tomographic application in fluid velocimetry.

    Science.gov (United States)

    Lobera, Julia; Ortega, Gloria; García, Inmaculada; Arroyo, María del Pilar; Garzón, Ester M

    2015-02-23

    Optical Diffraction Tomography has been recently introduced in fluid velocimetry to provide three dimensional information of seeding particle locations. In general, image reconstruction methods at visible wavelengths have to account for diffraction. Linear approximation has been used for three-dimensional image reconstruction, but a non-linear and iterative reconstruction method is required when multiple scattering is not negligible. Non-linear methods require the solution of the Helmholtz equation, computationally highly demanding due to the size of the problem. The present work shows the results of a non-linear method customized for spherical particle location using GPU computing and a made-to-measure storing format.

  14. AVES: A high performance computer cluster array for the INTEGRAL satellite scientific data analysis

    Science.gov (United States)

    Federici, Memmo; Martino, Bruno Luigi; Ubertini, Pietro

    2012-07-01

    In this paper we describe a new computing system array, designed, built and now used at the Space Astrophysics and Planetary Institute (IAPS) in Rome, Italy, for the INTEGRAL Space Observatory scientific data analysis. This new system has become necessary in order to reduce the processing time of the INTEGRAL data accumulated during the more than 9 years of in-orbit operation. In order to fulfill the scientific data analysis requirements with a moderately limited investment the starting approach has been to use a `cluster' array of commercial quad-CPU computers, featuring the extremely large scientific and calibration data archive on line.

  15. A Protocol for Provably Secure Authentication of a Tiny Entity to a High Performance Computing One

    Directory of Open Access Journals (Sweden)

    Siniša Tomović

    2016-01-01

    Full Text Available The problem of developing authentication protocols dedicated to a specific scenario where an entity with limited computational capabilities should prove the identity to a computationally powerful Verifier is addressed. An authentication protocol suitable for the considered scenario which jointly employs the learning parity with noise (LPN problem and a paradigm of random selection is proposed. It is shown that the proposed protocol is secure against active attacking scenarios and so called GRS man-in-the-middle (MIM attacking scenarios. In comparison with the related previously reported authentication protocols the proposed one provides reduction of the implementation complexity and at least the same level of the cryptographic security.

  16. Energy efficient distributed computing systems

    CERN Document Server

    Lee, Young-Choon

    2012-01-01

    The energy consumption issue in distributed computing systems raises various monetary, environmental and system performance concerns. Electricity consumption in the US doubled from 2000 to 2005.  From a financial and environmental standpoint, reducing the consumption of electricity is important, yet these reforms must not lead to performance degradation of the computing systems.  These contradicting constraints create a suite of complex problems that need to be resolved in order to lead to 'greener' distributed computing systems.  This book brings together a group of outsta

  17. Three-Dimensional Nanobiocomputing Architectures with aleph-Hypercells: Revolutionary Super-High-Performance Computing Platform

    Science.gov (United States)

    2006-05-01

    the switching power is not only a function of devices/gates/switches (bipolar junction transistors and field-effect transistors , BJTs and FETs, for...Acronyms 2D - Two-dimensional 3D - Three-dimensional ALU - Arithmetic logic unit BJT - Bipolar junction transistor CAD - Computer...14 3. 4. DNA Derivative Transistor for ℵ-Hypercells

  18. Achieving high performance in numerical computations on RISC workstations and parallel systems

    Energy Technology Data Exchange (ETDEWEB)

    Goedecker, S. [Max-Planck Inst. for Solid State Research, Stuttgart (Germany); Hoisie, A. [Los Alamos National Lab., NM (United States)

    1997-08-20

    The nominal peak speeds of both serial and parallel computers is raising rapidly. At the same time however it is becoming increasingly difficult to get out a significant fraction of this high peak speed from modern computer architectures. In this tutorial the authors give the scientists and engineers involved in numerically demanding calculations and simulations the necessary basic knowledge to write reasonably efficient programs. The basic principles are rather simple and the possible rewards large. Writing a program by taking into account optimization techniques related to the computer architecture can significantly speedup your program, often by factors of 10--100. As such, optimizing a program can for instance be a much better solution than buying a faster computer. If a few basic optimization principles are applied during program development, the additional time needed for obtaining an efficient program is practically negligible. In-depth optimization is usually only needed for a few subroutines or kernels and the effort involved is therefore also acceptable.

  19. High Fidelity Simulation of Liquid Jet in Cross-flow Using High Performance Computing

    Science.gov (United States)

    Soteriou, Marios; Li, Xiaoyi

    2011-11-01

    High fidelity, first principles simulation of atomization of a liquid jet by a fast cross-flowing gas can help reveal the controlling physics of this complicated two-phase flow of engineering interest. The turn-around execution time of such a simulation is prohibitively long using typically available computational resources today (i.e. parallel systems with ~O(100) CPUs). This is due to multiscale nature of the problem which requires the use of fine grids and time steps. In this work we present results from such a simulation performed on a state of the art massively parallel system available at Oakridge Leadership Computing Facility (OLCF). Scalability of the computational algorithm to ~2000 CPUs is demonstrated on grids of up to 200 million nodes. As a result, a simulation at intermediate Weber number becomes possible on this system. Results are in agreement with detailed experiment measurements of liquid column trajectory, breakup location, surface wavelength, onset of surface stripping as well as droplet size and velocity after primary breakup. Moreover, this uniform grid simulation is used as a base case for further code enhancement by evaluating the feasibility of employing Adaptive Mesh Refinement (AMR) near the liquid-gas interface as a means of mitigating computational cost.

  20. Full scale strain monitoring of a suspension bridge using high performance distributed fiber optic sensors

    Science.gov (United States)

    Xu, Jinlong; Dong, Yongkang; Zhang, Zhaohui; Li, Shunlong; He, Shaoyang; Li, Hui

    2016-12-01

    This paper investigated field monitoring of a 1108 m suspension bridge during an assessment load test, using integrated distributed fibre-optic sensors (DFOSs). In addition to the conventional Brillouin time domain analysis system, a high spatial resolution Brillouin system using the differential pulse-width pair (DPP) technique was adopted. Temperature compensation was achieved using a Raman distributed temperature sensing system. This is the first full scale field application of DFOSs using the Brillouin time domain analysis technique in a thousand-meter-scale suspension bridge. Measured strain distributions along the whole length of the bridge were presented. The interaction between the main cables and the steel-box-girder was highlighted. The Brillouin fibre-optic monitoring systems exhibited great facility for the purposes of long distance distributed strain monitoring, with up to 0.05 m spatial resolution, and 0.01 m/point sampling interval. The performance of the Brillouin system using DPP technique was discussed. The measured data was also employed for assessing bridge design and for the assessment of structural condition. The results show that the symmetrical design assumptions were consistent with the actual bridge, and that the strain values along the whole bridge were within the safety range. This trial field study serves as an example, demonstrating the feasibility of highly dense strain and temperature measurement for large scale civil infrastructures using integrated DFOSs.

  1. High-performance parallel computing in the classroom using the public goods game as an example

    Science.gov (United States)

    Perc, Matjaž

    2017-07-01

    The use of computers in statistical physics is common because the sheer number of equations that describe the behaviour of an entire system particle by particle often makes it impossible to solve them exactly. Monte Carlo methods form a particularly important class of numerical methods for solving problems in statistical physics. Although these methods are simple in principle, their proper use requires a good command of statistical mechanics, as well as considerable computational resources. The aim of this paper is to demonstrate how the usage of widely accessible graphics cards on personal computers can elevate the computing power in Monte Carlo simulations by orders of magnitude, thus allowing live classroom demonstration of phenomena that would otherwise be out of reach. As an example, we use the public goods game on a square lattice where two strategies compete for common resources in a social dilemma situation. We show that the second-order phase transition to an absorbing phase in the system belongs to the directed percolation universality class, and we compare the time needed to arrive at this result by means of the main processor and by means of a suitable graphics card. Parallel computing on graphics processing units has been developed actively during the last decade, to the point where today the learning curve for entry is anything but steep for those familiar with programming. The subject is thus ripe for inclusion in graduate and advanced undergraduate curricula, and we hope that this paper will facilitate this process in the realm of physics education. To that end, we provide a documented source code for an easy reproduction of presented results and for further development of Monte Carlo simulations of similar systems.

  2. BESIII production with distributed computing

    Science.gov (United States)

    Zhang, X. M.; Yan, T.; Zhao, X. H.; Ma, Z. T.; Yan, X. F.; Lin, T.; Deng, Z. Y.; Li, W. D.; Belov, S.; Pelevanyuk, I.; Zhemchugov, A.; Cai, H.

    2015-12-01

    Distributed computing is necessary nowadays for high energy physics experiments to organize heterogeneous computing resources all over the world to process enormous amounts of data. The BESIII experiment in China, has established its own distributed computing system, based on DIRAC, as a supplement to local clusters, collecting cluster, grid, desktop and cloud resources from collaborating member institutes around the world. The system consists of workload management and data management to deal with the BESIII Monte Carlo production workflow in a distributed environment. A dataset-based data transfer system has been developed to support data movements among sites. File and metadata management tools and a job submission frontend have been developed to provide a virtual layer for BESIII physicists to use distributed resources. Moreover, the paper shows the experience to cope with lack of grid experience and low manpower among the BESIII community.

  3. Understanding Strongly Correlated Materials thru Theory Algorithms and High Performance Computers

    Science.gov (United States)

    Kotliar, Gabriel

    A long standing challenge in condensed matter physics is the prediction of physical properties of materials starting from first principles. In the past two decades, substantial advances have taken place in this area. The combination of modern implementations of electronic structure methods in conjunction with Dynamical Mean Field Theory (DMFT), in combination with advanced impurity solvers, modern computer codes and massively parallel computers, are giving new system specific insights into the properties of strongly correlated electron systems enable the calculations of experimentally measurable correlation functions. The predictions of this ''theoretical spectroscopy'' can be directly compared with experimental results. In this talk I will briefly outline the state of the art of the methodology, and illustrate it with an example the origin of the solid state anomalies of elemental Plutonium.

  4. Investigating the Mobility of Light Autonomous Tracked Vehicles using a High Performance Computing Simulation Capability

    Science.gov (United States)

    2012-08-01

    by funding provided by the Na- tional Science Foundation under NSF Project CMMI - 0840442 and through TARDEC grant W911NF-11- D-0001-0048. M...Hall, Englewood Cliffs, New Jersey, 1989. [13] HEYN, T. Simulation of Tracked Vehicles on Granular Terrain Leveraging GPU Comput- ing. M.S. thesis ...Dy- namics on Graphics Processing Unit (GPU) Cards. M.S. thesis , Department of Me- chanical Engineering, University of Wisconsin– Madison, http

  5. High Performance Applications for the Single-Chip Message-Passing Parallel Computer

    OpenAIRE

    Dickenson, William Wesley

    2004-01-01

    Computer architects continue to push the limits of modern microprocessors. By using techniques such as out-of-order execution, branch prediction, and dynamic scheduling, designers have found ways to speed execution. However, growing architectural complexity has led to unsustained development and testing times. Shrinking feature sizes are causing increased wire resistances and signal propagation, thereby limiting a designâ s scalability. Indeed, the method of exploiting instruction-level para...

  6. AHPCRC (Army High Performance Computing Rsearch Center) Bulletin. Volume 1, Issue 4

    Science.gov (United States)

    2011-01-01

    polymers or proteins ) in fluids, drawing on a large body of existing literature and computational resources. Their model is now fully functional, and it...structures (human hemoglobin shown here, protein subunits in red and blue). Starting from microscopy 6) Electron cryo-micrograph of GroEL (suspended...Nanofluidics 5 Antimicrobial Peptides 8 Virus Protein Structures 10 Metal Nanomechanics 13 The Researchers 17 Education and Outreach 18 Researcher

  7. AHPCRC (Army High Performance Computing Research Center) Bulletin. Volume 2, Issue 2, 2011

    Science.gov (United States)

    2011-01-01

    properties and the information required to model the hinges. Extensive research of different kinds of plastics and their common usage determined the...below). The direct heat flux method for computing the thermal conductivity is analogous to experimental measurements. It was successfully applied to...drive down a road and easily locate buried IED’s! My project goals this summer were to learn how to use and implement Verilog HDL and ModelSim

  8. High performance graphics processor based computed tomography reconstruction algorithms for nuclear and other large scale applications.

    Energy Technology Data Exchange (ETDEWEB)

    Jimenez, Edward S. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Orr, Laurel J. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Thompson, Kyle R. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2013-09-01

    The goal of this work is to develop a fast computed tomography (CT) reconstruction algorithm based on graphics processing units (GPU) that achieves significant improvement over traditional central processing unit (CPU) based implementations. The main challenge in developing a CT algorithm that is capable of handling very large datasets is parallelizing the algorithm in such a way that data transfer does not hinder performance of the reconstruction algorithm. General Purpose Graphics Processing (GPGPU) is a new technology that the Science and Technology (S&T) community is starting to adopt in many fields where CPU-based computing is the norm. GPGPU programming requires a new approach to algorithm development that utilizes massively multi-threaded environments. Multi-threaded algorithms in general are difficult to optimize since performance bottlenecks occur that are non-existent in single-threaded algorithms such as memory latencies. If an efficient GPU-based CT reconstruction algorithm can be developed; computational times could be improved by a factor of 20. Additionally, cost benefits will be realized as commodity graphics hardware could potentially replace expensive supercomputers and high-end workstations. This project will take advantage of the CUDA programming environment and attempt to parallelize the task in such a way that multiple slices of the reconstruction volume are computed simultaneously. This work will also take advantage of the GPU memory by utilizing asynchronous memory transfers, GPU texture memory, and (when possible) pinned host memory so that the memory transfer bottleneck inherent to GPGPU is amortized. Additionally, this work will take advantage of GPU-specific hardware (i.e. fast texture memory, pixel-pipelines, hardware interpolators, and varying memory hierarchy) that will allow for additional performance improvements.

  9. Fully coupled multiscale modeling of cohesive failure in heterogeneous interfaces using high performance computing

    OpenAIRE

    Mosby, Matthew; Matous, Karel

    2014-01-01

    Multiscale interfaces are prevalent throughout engineering design and application, often in the form of layered composites and adhesive joints. Most modern adhesives are highly heterogeneous, containing a wide range of sizes, shapes, and material properties of reinforcing constituents. Predicting how the size, shape, orientation, and distribution of the reinforcing particles change the failure response of the bonded structure is important for design and safety assessment. Using Direct Numeric...

  10. Simulation of cardiac electrophysiology on next-generation high-performance computers.

    Science.gov (United States)

    Bordas, Rafel; Carpentieri, Bruno; Fotia, Giorgio; Maggio, Fabio; Nobes, Ross; Pitt-Francis, Joe; Southern, James

    2009-05-28

    Models of cardiac electrophysiology consist of a system of partial differential equations (PDEs) coupled with a system of ordinary differential equations representing cell membrane dynamics. Current software to solve such models does not provide the required computational speed for practical applications. One reason for this is that little use is made of recent developments in adaptive numerical algorithms for solving systems of PDEs. Studies have suggested that a speedup of up to two orders of magnitude is possible by using adaptive methods. The challenge lies in the efficient implementation of adaptive algorithms on massively parallel computers. The finite-element (FE) method is often used in heart simulators as it can encapsulate the complex geometry and small-scale details of the human heart. An alternative is the spectral element (SE) method, a high-order technique that provides the flexibility and accuracy of FE, but with a reduced number of degrees of freedom. The feasibility of implementing a parallel SE algorithm based on fully unstructured all-hexahedra meshes is discussed. A major computational task is solution of the large algebraic system resulting from FE or SE discretization. Choice of linear solver and preconditioner has a substantial effect on efficiency. A fully parallel implementation based on dynamic partitioning that accounts for load balance, communication and data movement costs is required. Each of these methods must be implemented on next-generation supercomputers in order to realize the necessary speedup. The problems that this may cause, and some of the techniques that are beginning to be developed to overcome these issues, are described.

  11. High Performance Computing Based Parallel HIearchical Modal Association Clustering (HPAR HMAC)

    Energy Technology Data Exchange (ETDEWEB)

    2017-01-12

    For many applications, clustering is a crucial step in order to gain insight into the makeup of a dataset. The best approach to a given problem often depends on a variety of factors, such as the size of the dataset, time restrictions, and soft clustering requirements. The HMAC algorithm seeks to combine the strengths of 2 particular clustering approaches: model-based and linkage-based clustering. One particular weakness of HMAC is its computational complexity. HMAC is not practical for mega-scale data clustering. For high-definition imagery, a user would have to wait months or years for a result; for a 16-megapixel image, the estimated runtime skyrockets to over a decade! To improve the execution time of HMAC, it is reasonable to consider an multi-core implementation that utilizes available system resources. An existing imple-mentation (Ray and Cheng 2014) divides the dataset into N partitions - one for each thread prior to executing the HMAC algorithm. This implementation benefits from 2 types of optimization: parallelization and divide-and-conquer. By running each partition in parallel, the program is able to accelerate computation by utilizing more system resources. Although the parallel implementation provides considerable improvement over the serial HMAC, it still suffers from poor computational complexity, O(N2). Once the maximum number of cores on a system is exhausted, the program exhibits slower behavior. We now consider a modification to HMAC that involves a recursive partitioning scheme. Our modification aims to exploit divide-and-conquer benefits seen by the parallel HMAC implementation. At each level in the recursion tree, partitions are divided into 2 sub-partitions until a threshold size is reached. When the partition can no longer be divided without falling below threshold size, the base HMAC algorithm is applied. This results in a significant speedup over the parallel HMAC.

  12. A high performance cloud computing platform for mRNA analysis.

    Science.gov (United States)

    Lin, Feng-Seng; Shen, Chia-Ping; Sung, Hsiao-Ya; Lam, Yan-Yu; Lin, Jeng-Wei; Lai, Feipei

    2013-01-01

    Multiclass classification is an important technique to many complex bioinformatics problems. However, their performance is limited by the computation power. Based on the Apache Hadoop design framework, this study proposes a two layer architecture that exploits the inherent parallelism of GA-SVM classification to speed up the work. The performance evaluations on an mRNA benchmark cancer dataset have reduced 86.55% features and raised accuracy from 97.53% to 98.03%. With a user-friendly web interface, the system provides researchers an easy way to investigate the unrevealed secrets in the fast-growing repository of bioinformatics data.

  13. Application of high performance asynchronous socket communication in power distribution automation

    Science.gov (United States)

    Wang, Ziyu

    2017-05-01

    With the development of information technology and Internet technology, and the growing demand for electricity, the stability and the reliable operation of power system have been the goal of power grid workers. With the advent of the era of big data, the power data will gradually become an important breakthrough to guarantee the safe and reliable operation of the power grid. So, in the electric power industry, how to efficiently and robustly receive the data transmitted by the data acquisition device, make the power distribution automation system be able to execute scientific decision quickly, which is the pursuit direction in power grid. In this paper, some existing problems in the power system communication are analysed, and with the help of the network technology, a set of solutions called Asynchronous Socket Technology to the problem in network communication which meets the high concurrency and the high throughput is proposed. Besides, the paper also looks forward to the development direction of power distribution automation in the era of big data and artificial intelligence.

  14. ArrayBridge: Interweaving declarative array processing with high-performance computing

    Energy Technology Data Exchange (ETDEWEB)

    Xing, Haoyuan [The Ohio State Univ., Columbus, OH (United States); Floratos, Sofoklis [The Ohio State Univ., Columbus, OH (United States); Blanas, Spyros [The Ohio State Univ., Columbus, OH (United States); Byna, Suren [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Prabhat, Prabhat [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Wu, Kesheng [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Brown, Paul [Paradigm4, Inc., Waltham, MA (United States)

    2017-05-04

    Scientists are increasingly turning to datacenter-scale computers to produce and analyze massive arrays. Despite decades of database research that extols the virtues of declarative query processing, scientists still write, debug and parallelize imperative HPC kernels even for the most mundane queries. This impedance mismatch has been partly attributed to the cumbersome data loading process; in response, the database community has proposed in situ mechanisms to access data in scientific file formats. Scientists, however, desire more than a passive access method that reads arrays from files. This paper describes ArrayBridge, a bi-directional array view mechanism for scientific file formats, that aims to make declarative array manipulations interoperable with imperative file-centric analyses. Our prototype implementation of ArrayBridge uses HDF5 as the underlying array storage library and seamlessly integrates into the SciDB open-source array database system. In addition to fast querying over external array objects, ArrayBridge produces arrays in the HDF5 file format just as easily as it can read from it. ArrayBridge also supports time travel queries from imperative kernels through the unmodified HDF5 API, and automatically deduplicates between array versions for space efficiency. Our extensive performance evaluation in NERSC, a large-scale scientific computing facility, shows that ArrayBridge exhibits statistically indistinguishable performance and I/O scalability to the native SciDB storage engine.

  15. High-performance computing-based exploration of flow control with micro devices.

    Science.gov (United States)

    Fujii, Kozo

    2014-08-13

    The dielectric barrier discharge (DBD) plasma actuator that controls flow separation is one of the promising technologies to realize energy savings and noise reduction of fluid dynamic systems. However, the mechanism for controlling flow separation is not clearly defined, and this lack of knowledge prevents practical use of this technology. Therefore, large-scale computations for the study of the DBD plasma actuator have been conducted using the Japanese Petaflops supercomputer 'K' for three different Reynolds numbers. Numbers of new findings on the control of flow separation by the DBD plasma actuator have been obtained from the simulations, and some of them are presented in this study. Knowledge of suitable device parameters is also obtained. The DBD plasma actuator is clearly shown to be very effective for controlling flow separation at a Reynolds number of around 10(5), and several times larger lift-to-drag ratio can be achieved at higher angles of attack after stall. For higher Reynolds numbers, separated flow is partially controlled. Flow analysis shows key features towards better control. DBD plasma actuators are a promising technology, which could reduce fuel consumption and contribute to a green environment by achieving high aerodynamic performance. The knowledge described above can be obtained only with high-end computers such as the supercomputer 'K'. © 2014 The Author(s) Published by the Royal Society. All rights reserved.

  16. Constructing complex 3D biological environments from medical imaging using high performance computing.

    Science.gov (United States)

    Burkitt, Mark; Walker, Dawn; Romano, Daniela M; Fazeli, Alireza

    2012-01-01

    Extracting information about the structure of biological tissue from static image data is a complex task requiring computationally intensive operations. Here, we present how multicore CPUs and GPUs have been utilized to extract information about the shape, size, and path followed by the mammalian oviduct, called the fallopian tube in humans, from histology images, to create a unique but realistic 3D virtual organ. Histology images were processed to identify the individual cross sections and determine the 3D path that the tube follows through the tissue. This information was then related back to the histology images, linking the 2D cross sections with their corresponding 3D position along the oviduct. A series of linear 2D spline cross sections, which were computationally generated for the length of the oviduct, were bound to the 3D path of the tube using a novel particle system technique that provides smooth resolution of self-intersections. This results in a unique 3D model of the oviduct, which is grounded in reality. The GPU is used for the processor intensive operations of image processing and particle physics based simulations, significantly reducing the time required to generate a complete model.

  17. High-performance computer simulation of wave processes in geological media in seismic exploration

    Science.gov (United States)

    Kvasov, I. E.; Petrov, I. B.

    2012-02-01

    A class of problems arising in seismic exploration are investigated, namely, seismic signal propagation in multilayered geological rock and near-surface disturbance propagation in massive rock with heterogeneities, such as empty or filled fractures and cavities. Numerical solutions are obtained for wave propagation in such highly heterogeneous media, including those taking into account the plastic properties of the rock, which can be manifested near a seismic gap or a wellbore. All types of explosion-generated elastic and elastoplastic waves and waves reflected from fractures and the boundaries of the integration domain are analyzed. The identification of waves in seismograms recorded with near-surface receivers is addressed. The grid-characteristic method is used on triangular, parallelepipedal, and tetrahedral meshes with boundary conditions set on the rock-fracture interface and on free surfaces in explicit form. The numerical method proposed is suitable for the study of the interaction between seismic waves and heterogeneous inclusions, since it ensures the most correct design of computational algorithms on the boundaries of the integration domain and at media interfaces. A parallel software code implemented with the help of OpenMP and MPI was used to execute computations on parallelepipedal and tetrahedral grids.

  18. Power-aware applications for scientific cluster and distributed computing

    CERN Document Server

    Abdurachmanov, David; Eulisse, Giulio; Grosso, Paola; Hillegas, Curtis; Holzman, Burt; Klous, Sander; Knight, Robert; Muzaffar, Shahzad

    2014-01-01

    The aggregate power use of computing hardware is an important cost factor in scientific cluster and distributed computing systems. The Worldwide LHC Computing Grid (WLCG) is a major example of such a distributed computing system, used primarily for high throughput computing (HTC) applications. It has a computing capacity and power consumption rivaling that of the largest supercomputers. The computing capacity required from this system is also expected to grow over the next decade. Optimizing the power utilization and cost of such systems is thus of great interest. A number of trends currently underway will provide new opportunities for power-aware optimizations. We discuss how power-aware software applications and scheduling might be used to reduce power consumption, both as autonomous entities and as part of a (globally) distributed system. As concrete examples of computing centers we provide information on the large HEP-focused Tier-1 at FNAL, and the Tigress High Performance Computing Center at Princeton U...

  19. High performance reconciliation for continuous-variable quantum key distribution with LDPC code

    Science.gov (United States)

    Lin, Dakai; Huang, Duan; Huang, Peng; Peng, Jinye; Zeng, Guihua

    2015-03-01

    Reconciliation is a significant procedure in a continuous-variable quantum key distribution (CV-QKD) system. It is employed to extract secure secret key from the resulted string through quantum channel between two users. However, the efficiency and the speed of previous reconciliation algorithms are low. These problems limit the secure communication distance and the secure key rate of CV-QKD systems. In this paper, we proposed a high-speed reconciliation algorithm through employing a well-structured decoding scheme based on low density parity-check (LDPC) code. The complexity of the proposed algorithm is reduced obviously. By using a graphics processing unit (GPU) device, our method may reach a reconciliation speed of 25 Mb/s for a CV-QKD system, which is currently the highest level and paves the way to high-speed CV-QKD.

  20. Distributed control software of high-performance control-loop algorithm

    CERN Document Server

    Blanc, D

    1999-01-01

    The majority of industrial cooling and ventilation plants require the control of complex processes. All these processes are highly important for the operation of the machines. The stability and reliability of these processes are leading factors identifying the quality of the service provided. The control system architecture and software structure, as well, are required to have high dynamical performance and robust behaviour. The intelligent systems based on PID or RST controllers are used for their high level of stability and accuracy. The design and tuning of these complex controllers require the dynamic model of the plant to be known (generally obtained by identification) and the desired performance of the various control loops to be specified for achieving good performances. The concept of having a distributed control algorithm software provides full automation facilities with well-adapted functionality and good performances, giving methodology, means and tools to master the dynamic process optimization an...

  1. Application of high performance computing for studying cyclic variability in dilute internal combustion engines

    Energy Technology Data Exchange (ETDEWEB)

    FINNEY, Charles E A [ORNL; Edwards, Kevin Dean [ORNL; Stoyanov, Miroslav K [ORNL; Wagner, Robert M [ORNL

    2015-01-01

    Combustion instabilities in dilute internal combustion engines are manifest in cyclic variability (CV) in engine performance measures such as integrated heat release or shaft work. Understanding the factors leading to CV is important in model-based control, especially with high dilution where experimental studies have demonstrated that deterministic effects can become more prominent. Observation of enough consecutive engine cycles for significant statistical analysis is standard in experimental studies but is largely wanting in numerical simulations because of the computational time required to compute hundreds or thousands of consecutive cycles. We have proposed and begun implementation of an alternative approach to allow rapid simulation of long series of engine dynamics based on a low-dimensional mapping of ensembles of single-cycle simulations which map input parameters to output engine performance. This paper details the use Titan at the Oak Ridge Leadership Computing Facility to investigate CV in a gasoline direct-injected spark-ignited engine with a moderately high rate of dilution achieved through external exhaust gas recirculation. The CONVERGE CFD software was used to perform single-cycle simulations with imposed variations of operating parameters and boundary conditions selected according to a sparse grid sampling of the parameter space. Using an uncertainty quantification technique, the sampling scheme is chosen similar to a design of experiments grid but uses functions designed to minimize the number of samples required to achieve a desired degree of accuracy. The simulations map input parameters to output metrics of engine performance for a single cycle, and by mapping over a large parameter space, results can be interpolated from within that space. This interpolation scheme forms the basis for a low-dimensional metamodel which can be used to mimic the dynamical behavior of corresponding high-dimensional simulations. Simulations of high-EGR spark

  2. A High Performance Computing Study of a Scalable FISST-Based Approach to Multi-Target, Multi-Sensor Tracking

    Science.gov (United States)

    Hussein, I.; Wilkins, M.; Roscoe, C.; Faber, W.; Chakravorty, S.; Schumacher, P.

    2016-09-01

    Finite Set Statistics (FISST) is a rigorous Bayesian multi-hypothesis management tool for the joint detection, classification and tracking of multi-sensor, multi-object systems. Implicit within the approach are solutions to the data association and target label-tracking problems. The full FISST filtering equations, however, are intractable. While FISST-based methods such as the PHD and CPHD filters are tractable, they require heavy moment approximations to the full FISST equations that result in a significant loss of information contained in the collected data. In this paper, we review Smart Sampling Markov Chain Monte Carlo (SSMCMC) that enables FISST to be tractable while avoiding moment approximations. We study the effect of tuning key SSMCMC parameters on tracking quality and computation time. The study is performed on a representative space object catalog with varying numbers of RSOs. The solution is implemented in the Scala computing language at the Maui High Performance Computing Center (MHPCC) facility.

  3. Computational Modeling of Electroslag Remelting (ESR) Process Used for the Production of High-Performance Alloys

    Science.gov (United States)

    Kelkar, Kanchan M.; Patankar, Suhas V.; Srivatsa, Shesh K.; Minisandram, Ramesh S.; Evans, David G.; deBarbadillo, John J.; Smith, Richard H.; Helmink, Randolph C.; Mitchell, Alec; Sizek, Howard A.

    This paper presents a comprehensive computational model for the prediction of the transient Electroslag Remelting (ESR) process for cylindrical ingots based on axisymmetric two-dimensional analysis. The model analyzes the behavior of the slag and growing ingot during the entire ESR process involving a hot-slag start with an initial transient, near-steady melting, hot-topping and subsequent solidification of the slag and ingot after melting ends. The results of model application for an illustrative ESR process for Alloy 718 and its validation using results from an industrial trial are presented. They demonstrate the comprehensive capabilities of the model in predicting the behavior of the ingot and slag during the entire process and properties of the final ingot produced. Such analysis can benefit the optimization of existing process schedules and design of new processes for different alloys and different ingot sizes.

  4. The feasibility of an efficient drug design method with high-performance computers.

    Science.gov (United States)

    Yamashita, Takefumi; Ueda, Akihiko; Mitsui, Takashi; Tomonaga, Atsushi; Matsumoto, Shunji; Kodama, Tatsuhiko; Fujitani, Hideaki

    2015-01-01

    In this study, we propose a supercomputer-assisted drug design approach involving all-atom molecular dynamics (MD)-based binding free energy prediction after the traditional design/selection step. Because this prediction is more accurate than the empirical binding affinity scoring of the traditional approach, the compounds selected by the MD-based prediction should be better drug candidates. In this study, we discuss the applicability of the new approach using two examples. Although the MD-based binding free energy prediction has a huge computational cost, it is feasible with the latest 10 petaflop-scale computer. The supercomputer-assisted drug design approach also involves two important feedback procedures: The first feedback is generated from the MD-based binding free energy prediction step to the drug design step. While the experimental feedback usually provides binding affinities of tens of compounds at one time, the supercomputer allows us to simultaneously obtain the binding free energies of hundreds of compounds. Because the number of calculated binding free energies is sufficiently large, the compounds can be classified into different categories whose properties will aid in the design of the next generation of drug candidates. The second feedback, which occurs from the experiments to the MD simulations, is important to validate the simulation parameters. To demonstrate this, we compare the binding free energies calculated with various force fields to the experimental ones. The results indicate that the prediction will not be very successful, if we use an inaccurate force field. By improving/validating such simulation parameters, the next prediction can be made more accurate.

  5. Scalable analysis of Big pathology image data cohorts using efficient methods and high-performance computing strategies.

    Science.gov (United States)

    Kurc, Tahsin; Qi, Xin; Wang, Daihou; Wang, Fusheng; Teodoro, George; Cooper, Lee; Nalisnik, Michael; Yang, Lin; Saltz, Joel; Foran, David J

    2015-12-01

    We describe a suite of tools and methods that form a core set of capabilities for researchers and clinical investigators to evaluate multiple analytical pipelines and quantify sensitivity and variability of the results while conducting large-scale studies in investigative pathology and oncology. The overarching objective of the current investigation is to address the challenges of large data sizes and high computational demands. The proposed tools and methods take advantage of state-of-the-art parallel machines and efficient content-based image searching strategies. The content based image retrieval (CBIR) algorithms can quickly detect and retrieve image patches similar to a query patch using a hierarchical analysis approach. The analysis component based on high performance computing can carry out consensus clustering on 500,000 data points using a large shared memory system. Our work demonstrates efficient CBIR algorithms and high performance computing can be leveraged for efficient analysis of large microscopy images to meet the challenges of clinically salient applications in pathology. These technologies enable researchers and clinical investigators to make more effective use of the rich informational content contained within digitized microscopy specimens.

  6. Exploiting Data Intensive Applications on High Performance Computers to Unlock Australia's Landsat Archive

    Science.gov (United States)

    Purss, Matthew; Lewis, Adam; Edberg, Roger; Ip, Alex; Sixsmith, Joshua; Frankish, Glenn; Chan, Tai; Evans, Ben; Hurst, Lachlan

    2013-04-01

    Australia's Earth Observation Program has downlinked and archived satellite data acquired under the NASA Landsat mission for the Australian Government since the establishment of the Australian Landsat Station in 1979. Geoscience Australia maintains this archive and produces image products to aid the delivery of government policy objectives. Due to the labor intensive nature of processing of this data there have been few national-scale datasets created to date. To compile any Earth Observation product the historical approach has been to select the required subset of data and process "scene by scene" on an as-needed basis. As data volumes have increased over time, and the demand for the processed data has also grown, it has become increasingly difficult to rapidly produce these products and achieve satisfactory policy outcomes using these historic processing methods. The result is that we have been "drowning in a sea of uncalibrated data" and scientists, policy makers and the public have not been able to realize the full potential of the Australian Landsat Archive and its value is therefore significantly diminished. To overcome this critical issue, the Australian Space Research Program has funded the "Unlocking the Landsat Archive" (ULA) Project from April 2011 to June 2013 to improve the access and utilization of Australia's archive of Landsat data. The ULA Project is a public-private consortium led by Lockheed Martin Australia (LMA) and involving Geoscience Australia (GA), the Victorian Partnership for Advanced Computing (VPAC), the National Computational Infrastructure (NCI) at the Australian National University (ANU) and the Cooperative Research Centre for Spatial Information (CRC-SI). The outputs from the ULA project will become a fundamental component of Australia's eResearch infrastructure, with the Australian Landsat Archive hosted on the NCI and made openly available under a creative commons license. NCI provides access to researchers through significant HPC

  7. A high performance, low power computational platform for complex sensing operations in smart cities

    Directory of Open Access Journals (Sweden)

    Jiming Jiang

    2017-04-01

    Full Text Available This paper presents a new wireless platform designed for an integrated traffic/flash flood monitoring system. The sensor platform is built around a 32-bit ARM Cortex M4 microcontroller and a 2.4 GHz 802.15.4 ISM compliant radio module. It can be interfaced with fixed traffic sensors, or receive data from vehicle transponders. This platform is specifically designed for solar-powered, low bandwidth, high computational performance wireless sensor network applications. A self-recovering unit is designed to increase reliability and allow periodic hard resets, an essential requirement for sensor networks. A radio monitoring circuitry is proposed to monitor incoming and outgoing transmissions, simplifying software debugging. We illustrate the performance of this wireless sensor platform on complex problems arising in smart cities, such as traffic flow monitoring, machine-learning-based flash flood monitoring or Kalman-filter based vehicle trajectory estimation. All design files have been uploaded and shared in an open science framework, and can be accessed from https://osf.io/fuyqd/. The hardware design is under CERN Open Hardware License v1.2.

  8. A high performance, low power computational platform for complex sensing operations in smart cities

    KAUST Repository

    Jiang, Jiming

    2017-02-02

    This paper presents a new wireless platform designed for an integrated traffic/flash flood monitoring system. The sensor platform is built around a 32-bit ARM Cortex M4 microcontroller and a 2.4GHz 802.15.4802.15.4 ISM compliant radio module. It can be interfaced with fixed traffic sensors, or receive data from vehicle transponders. This platform is specifically designed for solar-powered, low bandwidth, high computational performance wireless sensor network applications. A self-recovering unit is designed to increase reliability and allow periodic hard resets, an essential requirement for sensor networks. A radio monitoring circuitry is proposed to monitor incoming and outgoing transmissions, simplifying software debugging. We illustrate the performance of this wireless sensor platform on complex problems arising in smart cities, such as traffic flow monitoring, machine-learning-based flash flood monitoring or Kalman-filter based vehicle trajectory estimation. All design files have been uploaded and shared in an open science framework, and can be accessed from [1]. The hardware design is under CERN Open Hardware License v1.2.

  9. Sputnik: ad hoc distributed computation.

    Science.gov (United States)

    Völkel, Gunnar; Lausser, Ludwig; Schmid, Florian; Kraus, Johann M; Kestler, Hans A

    2015-04-15

    In bioinformatic applications, computationally demanding algorithms are often parallelized to speed up computation. Nevertheless, setting up computational environments for distributed computation is often tedious. Aim of this project were the lightweight ad hoc set up and fault-tolerant computation requiring only a Java runtime, no administrator rights, while utilizing all CPU cores most effectively. The Sputnik framework provides ad hoc distributed computation on the Java Virtual Machine which uses all supplied CPU cores fully. It provides a graphical user interface for deployment setup and a web user interface displaying the current status of current computation jobs. Neither a permanent setup nor administrator privileges are required. We demonstrate the utility of our approach on feature selection of microarray data. The Sputnik framework is available on Github http://github.com/sysbio-bioinf/sputnik under the Eclipse Public License. hkestler@fli-leibniz.de or hans.kestler@uni-ulm.de Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  10. High Performance Computation of a Jet in Crossflow by Lattice Boltzmann Based Parallel Direct Numerical Simulation

    Directory of Open Access Journals (Sweden)

    Jiang Lei

    2015-01-01

    Full Text Available Direct numerical simulation (DNS of a round jet in crossflow based on lattice Boltzmann method (LBM is carried out on multi-GPU cluster. Data parallel SIMT (single instruction multiple thread characteristic of GPU matches the parallelism of LBM well, which leads to the high efficiency of GPU on the LBM solver. With present GPU settings (6 Nvidia Tesla K20M, the present DNS simulation can be completed in several hours. A grid system of 1.5 × 108 is adopted and largest jet Reynolds number reaches 3000. The jet-to-free-stream velocity ratio is set as 3.3. The jet is orthogonal to the mainstream flow direction. The validated code shows good agreement with experiments. Vortical structures of CRVP, shear-layer vortices and horseshoe vortices, are presented and analyzed based on velocity fields and vorticity distributions. Turbulent statistical quantities of Reynolds stress are also displayed. Coherent structures are revealed in a very fine resolution based on the second invariant of the velocity gradients.

  11. Run-Time Dynamically-Adaptable FPGA-Based Architecture for High-Performance Autonomous Distributed Systems

    OpenAIRE

    Valverde Alcalá, Juan

    2015-01-01

    Esta tesis doctoral se enmarca dentro del campo de los sistemas embebidos reconfigurables, redes de sensores inalámbricas para aplicaciones de altas prestaciones, y computación distribuida. El documento se centra en el estudio de alternativas de procesamiento para sistemas embebidos autónomos distribuidos de altas prestaciones (por sus siglas en inglés, High-Performance Autonomous Distributed Systems (HPADS)), así como su evolución hacia el procesamiento de alta resolución. El estudio se ha ...

  12. Coordinated Fault-Tolerance for High-Performance Computing Final Project Report

    Energy Technology Data Exchange (ETDEWEB)

    Panda, Dhabaleswar Kumar [The Ohio State University; Beckman, Pete

    2011-07-28

    existing publish-subscribe tools. We enhanced the intrinsic fault tolerance capabilities representative implementations of a variety of key HPC software subsystems and integrated them with the FTB. Targeting software subsystems included: MPI communication libraries, checkpoint/restart libraries, resource managers and job schedulers, and system monitoring tools. Leveraging the aforementioned infrastructure, as well as developing and utilizing additional tools, we have examined issues associated with expanded, end-to-end fault response from both system and application viewpoints. From the standpoint of system operations, we have investigated log and root cause analysis, anomaly detection and fault prediction, and generalized notification mechanisms. Our applications work has included libraries for fault-tolerance linear algebra, application frameworks for coupled multiphysics applications, and external frameworks to support the monitoring and response for general applications. Our final goal was to engage the high-end computing community to increase awareness of tools and issues around coordinated end-to-end fault management.

  13. Parallel Markov chain Monte Carlo - bridging the gap to high-performance Bayesian computation in animal breeding and genetics.

    Science.gov (United States)

    Wu, Xiao-Lin; Sun, Chuanyu; Beissinger, Timothy M; Rosa, Guilherme Jm; Weigel, Kent A; Gatti, Natalia de Leon; Gianola, Daniel

    2012-09-25

    Most Bayesian models for the analysis of complex traits are not analytically tractable and inferences are based on computationally intensive techniques. This is true of Bayesian models for genome-enabled selection, which uses whole-genome molecular data to predict the genetic merit of candidate animals for breeding purposes. In this regard, parallel computing can overcome the bottlenecks that can arise from series computing. Hence, a major goal of the present study is to bridge the gap to high-performance Bayesian computation in the context of animal breeding and genetics. Parallel Monte Carlo Markov chain algorithms and strategies are described in the context of animal breeding and genetics. Parallel Monte Carlo algorithms are introduced as a starting point including their applications to computing single-parameter and certain multiple-parameter models. Then, two basic approaches for parallel Markov chain Monte Carlo are described: one aims at parallelization within a single chain; the other is based on running multiple chains, yet some variants are discussed as well. Features and strategies of the parallel Markov chain Monte Carlo are illustrated using real data, including a large beef cattle dataset with 50K SNP genotypes. Parallel Markov chain Monte Carlo algorithms are useful for computing complex Bayesian models, which does not only lead to a dramatic speedup in computing but can also be used to optimize model parameters in complex Bayesian models. Hence, we anticipate that use of parallel Markov chain Monte Carlo will have a profound impact on revolutionizing the computational tools for genomic selection programs.

  14. Distributed Computing Beyond The Grid

    CERN Document Server

    Klimentov, A; The ATLAS collaboration

    2012-01-01

    This note will summarize the Software development and operational experience and improvements of the ATLAS Distributed Computing in the past years. Grid model was successfully deployed for all HEP experiments and after the first two years of very successful LHC data-taking and processing on the Grid we need to assess our experience and to find a good balance between stability and innovation. Several Research and Development (RnD) pilot projects were launched by ATLAS (and HEP) computing communities, namely 'cloud computing', data storage federation. HEP experiments are also adopted data popularity model and it allows to migrate from planned data placement to the dynamic model. This talk will also present an overview of HEP experiments computing model evolution and increasing role of networking as one of major resources (in addition to storage and CPU) which should be taken into account by workload management and data management systems.

  15. STEMsalabim: A high-performance computing cluster friendly code for scanning transmission electron microscopy image simulations of thin specimens.

    Science.gov (United States)

    Oelerich, Jan Oliver; Duschek, Lennart; Belz, Jürgen; Beyer, Andreas; Baranovskii, Sergei D; Volz, Kerstin

    2017-06-01

    We present a new multislice code for the computer simulation of scanning transmission electron microscope (STEM) images based on the frozen lattice approximation. Unlike existing software packages, the code is optimized to perform well on highly parallelized computing clusters, combining distributed and shared memory architectures. This enables efficient calculation of large lateral scanning areas of the specimen within the frozen lattice approximation and fine-grained sweeps of parameter space. Copyright © 2017 Elsevier B.V. All rights reserved.

  16. High performance in software development

    CERN Multimedia

    CERN. Geneva; Haapio, Petri; Liukkonen, Juha-Matti

    2015-01-01

    What are the ingredients of high-performing software? Software development, especially for large high-performance systems, is one the most complex tasks mankind has ever tried. Technological change leads to huge opportunities but challenges our old ways of working. Processing large data sets, possibly in real time or with other tight computational constraints, requires an efficient solution architecture. Efficiency requirements span from the distributed storage and large-scale organization of computation and data onto the lowest level of processor and data bus behavior. Integrating performance behavior over these levels is especially important when the computation is resource-bounded, as it is in numerics: physical simulation, machine learning, estimation of statistical models, etc. For example, memory locality and utilization of vector processing are essential for harnessing the computing power of modern processor architectures due to the deep memory hierarchies of modern general-purpose computers. As a r...

  17. Towards using direct methods in seismic tomography: computation of the full resolution matrix using high-performance computing and sparse QR factorization

    Science.gov (United States)

    Bogiatzis, Petros; Ishii, Miaki; Davis, Timothy A.

    2016-05-01

    For more than two decades, the number of data and model parameters in seismic tomography problems has exceeded the available computational resources required for application of direct computational methods, leaving iterative solvers the only option. One disadvantage of the iterative techniques is that the inverse of the matrix that defines the system is not explicitly formed, and as a consequence, the model resolution and covariance matrices cannot be computed. Despite the significant effort in finding computationally affordable approximations of these matrices, challenges remain, and methods such as the checkerboard resolution tests continue to be used. Based upon recent developments in sparse algorithms and high-performance computing resources, we show that direct methods are becoming feasible for large seismic tomography problems. We demonstrate the application of QR factorization in solving the regional P-wave structure and computing the full resolution matrix with 267 520 model parameters.

  18. Using the Eclipse Parallel Tools Platform to Assist Earth Science Model Development and Optimization on High Performance Computers

    Science.gov (United States)

    Alameda, J. C.

    2011-12-01

    Development and optimization of computational science models, particularly on high performance computers, and with the advent of ubiquitous multicore processor systems, practically on every system, has been accomplished with basic software tools, typically, command-line based compilers, debuggers, performance tools that have not changed substantially from the days of serial and early vector computers. However, model complexity, including the complexity added by modern message passing libraries such as MPI, and the need for hybrid code models (such as openMP and MPI) to be able to take full advantage of high performance computers with an increasing core count per shared memory node, has made development and optimization of such codes an increasingly arduous task. Additional architectural developments, such as many-core processors, only complicate the situation further. In this paper, we describe how our NSF-funded project, "SI2-SSI: A Productive and Accessible Development Workbench for HPC Applications Using the Eclipse Parallel Tools Platform" (WHPC) seeks to improve the Eclipse Parallel Tools Platform, an environment designed to support scientific code development targeted at a diverse set of high performance computing systems. Our WHPC project to improve Eclipse PTP takes an application-centric view to improve PTP. We are using a set of scientific applications, each with a variety of challenges, and using PTP to drive further improvements to both the scientific application, as well as to understand shortcomings in Eclipse PTP from an application developer perspective, to drive our list of improvements we seek to make. We are also partnering with performance tool providers, to drive higher quality performance tool integration. We have partnered with the Cactus group at Louisiana State University to improve Eclipse's ability to work with computational frameworks and extremely complex build systems, as well as to develop educational materials to incorporate into

  19. Parallel distributed computing using Python

    Science.gov (United States)

    Dalcin, Lisandro D.; Paz, Rodrigo R.; Kler, Pablo A.; Cosimo, Alejandro

    2011-09-01

    This work presents two software components aimed to relieve the costs of accessing high-performance parallel computing resources within a Python programming environment: MPI for Python and PETSc for Python. MPI for Python is a general-purpose Python package that provides bindings for the Message Passing Interface (MPI) standard using any back-end MPI implementation. Its facilities allow parallel Python programs to easily exploit multiple processors using the message passing paradigm. PETSc for Python provides access to the Portable, Extensible Toolkit for Scientific Computation (PETSc) libraries. Its facilities allow sequential and parallel Python applications to exploit state of the art algorithms and data structures readily available in PETSc for the solution of large-scale problems in science and engineering. MPI for Python and PETSc for Python are fully integrated to PETSc-FEM, an MPI and PETSc based parallel, multiphysics, finite elements code developed at CIMEC laboratory. This software infrastructure supports research activities related to simulation of fluid flows with applications ranging from the design of microfluidic devices for biochemical analysis to modeling of large-scale stream/aquifer interactions.

  20. Supercomputing '91; Proceedings of the 4th Annual Conference on High Performance Computing, Albuquerque, NM, Nov. 18-22, 1991

    Science.gov (United States)

    1991-01-01

    Various papers on supercomputing are presented. The general topics addressed include: program analysis/data dependence, memory access, distributed memory code generation, numerical algorithms, supercomputer benchmarks, latency tolerance, parallel programming, applications, processor design, networks, performance tools, mapping and scheduling, characterization affecting performance, parallelism packaging, computing climate change, combinatorial algorithms, hardware and software performance issues, system issues. (No individual items are abstracted in this volume)

  1. High performance computing and communications Grand Challenges program: Computational structural biology. Final report, August 15, 1992--January 14, 1997

    Energy Technology Data Exchange (ETDEWEB)

    Solomon, J.E.

    1997-10-02

    The Grand Challenge project consists of two elements: (1) a hierarchical methodology for 3D protein structure prediction; and (2) development of a parallel computing environment, the Protein Folding Workbench, for carrying out a variety of protein structure prediction/modeling computations. During the first three years of this project the author focused on the use of selected proteins from the Brookhaven Protein Data Base (PDB) of known structures to provide validation of the prediction algorithms and their software implementation, both serial and parallel. Two proteins in particular have been selected to provide the project with direct interaction with experimental molecular biology. A variety of site-specific mutagenesis experiments are performed on these two proteins to explore the many-to-one mapping characteristics of sequence to structure.

  2. Harnessing the Department of Energy’s High-Performance Computing Expertise to Strengthen the U.S. Chemical Enterprise

    Energy Technology Data Exchange (ETDEWEB)

    Dixon, David A.; Dupuis, Michel; Garrett, Bruce C.; Neaton, Jeffrey B.; Plata, Charity; Tarr, Matthew A.; Tomb, Jean-Francois; Golab, Joseph T.

    2012-01-17

    High-performance computing (HPC) is one area where the DOE has developed extensive expertise and capability. However, this expertise currently is not properly shared with or used by the private sector to speed product development, enable industry to move rapidly into new areas, and improve product quality. Such use would lead to substantial competitive advantages in global markets and yield important economic returns for the United States. To stimulate the dissemination of DOE's HPC expertise, the Council for Chemical Research (CCR) and the DOE jointly held a workshop on this topic. Four important energy topic areas were chosen as the focus of the meeting: Biomass/Bioenergy, Catalytic Materials, Energy Storage, and Photovoltaics. Academic, industrial, and government experts in these topic areas participated in the workshop to identify industry needs, evaluate the current state of expertise, offer proposed actions and strategies, and forecast the expected benefits of implementing those strategies.

  3. A High Performance Computing Approach to the Simulation of Fluid-Solid interaction Problems with Rigid and Flexible Components

    Directory of Open Access Journals (Sweden)

    Pazouki Arman

    2014-08-01

    Full Text Available This work outlines a unified multi-threaded, multi-scale High Performance Computing (HPC approach for the direct numerical simulation of Fluid-Solid Interaction (FSI problems. The simulation algorithm relies on the extended Smoothed Particle Hydrodynamics (XSPH method, which approaches the fluid flow in a La-grangian framework consistent with the Lagrangian tracking of the solid phase. A general 3D rigid body dynamics and an Absolute Nodal Coordinate Formulation (ANCF are implemented to model rigid and flexible multibody dynamics. The two-way coupling of the fluid and solid phases is supported through use of Boundary Condition Enforcing (BCE markers that capture the fluid-solid coupling forces by enforcing a no-slip boundary condition. The solid-solid short range interaction, which has a crucial impact on the small-scale behavior of fluid-solid mixtures, is resolved via a lubrication force model. The collective system states are integrated in time using an explicit, multi-rate scheme. To alleviate the heavy computational load, the overall algorithm leverages parallel computing on Graphics Processing Unit (GPU cards. Performance and scaling analysis are provided for simulations scenarios involving one or multiple phases with up to tens of thousands of solid objects. The software implementation of the approach, called Chrono:Fluid, is part of the Chrono project and available as an open-source software.

  4. SCinet Architecture: Featured at the International Conference for High Performance Computing,Networking, Storage and Analysis 2016

    Energy Technology Data Exchange (ETDEWEB)

    Lyonnais, Marc; Smith, Matt; Mace, Kate P.

    2017-02-06

    SCinet is the purpose-built network that operates during the International Conference for High Performance Computing,Networking, Storage and Analysis (Super Computing or SC). Created each year for the conference, SCinet brings to life a high-capacity network that supports applications and experiments that are a hallmark of the SC conference. The network links the convention center to research and commercial networks around the world. This resource serves as a platform for exhibitors to demonstrate the advanced computing resources of their home institutions and elsewhere by supporting a wide variety of applications. Volunteers from academia, government and industry work together to design and deliver the SCinet infrastructure. Industry vendors and carriers donate millions of dollars in equipment and services needed to build and support the local and wide area networks. Planning begins more than a year in advance of each SC conference and culminates in a high intensity installation in the days leading up to the conference. The SCinet architecture for SC16 illustrates a dramatic increase in participation from the vendor community, particularly those that focus on network equipment. Software-Defined Networking (SDN) and Data Center Networking (DCN) are present in nearly all aspects of the design.

  5. Pharmacokinetics and Tissue Distribution Study of Praeruptorin D from Radix Peucedani in Rats by High-Performance Liquid Chromatography (HPLC

    Directory of Open Access Journals (Sweden)

    Xue Du

    2012-07-01

    Full Text Available Praeruptorin D (PD, a major pyranocoumarin isolated from Radix Peucedani, exhibited antitumor and anti-inflammatory activities. The aim of this study was to investigate the pharmacokinetics and tissue distribution of PD in rats following intravenous (i.v. administration. The levels of PD in plasma and tissues were measured by a simple and sensitive reversed-phase high-performance liquid chromatography (HPLC method. The biosamples were treated by liquid-liquid extraction (LLE with methyl tert-butyl ether (MTBE and osthole was used as the internal standard (IS. The chromatographic separation was accomplished on a reversed-phase C18 column using methanol-water (75:25, v/v as mobile phase at a flow rate of 0.8 mL/min and ultraviolet detection wave length was set at 323 nm. The results demonstrate that this method has excellent specificity, linearity, precision, accuracy and recovery. The pharmacokinetic study found that PD fitted well into a two-compartment model with a fast distribution phase and a relative slow elimination phase. Tissue distribution showed that the highest concentration was observed in the lung, followed by heart, liver and kidney. Furthermore, PD can also be detected in the brain, which indicated that PD could cross the blood-brain barrier after i.v. administration.

  6. Experimental Evaluation of Indoor Air Distribution in High-Performance Residential Buildings: Part I. General Descriptions and Qualification Tests

    Energy Technology Data Exchange (ETDEWEB)

    Jalalzadeh, A. A.; Hancock, E.; Powell, D.

    2007-12-01

    The main objective of this project is to experimentally characterize an air distribution system in heating mode during a period of recovery from setback. The specific air distribution system under evaluation incorporates a high sidewall supply-air register/diffuser and a near-floor wall return air grille directly below. With this arrangement, the highest temperature difference between the supply air and the room can occur during the recovery period and create a favorable condition for stratification. The experimental approach will provide realistic input data and results for verification of computational fluid dynamics modeling.

  7. The use of high-performance computing to solve participating media radiative heat transfer problems-results of an NSF workshop

    Energy Technology Data Exchange (ETDEWEB)

    Gritzo, L.A.; Skocypec, R.D. [Sandia National Labs., Albuquerque, NM (United States); Tong, T.W. [Arizona State Univ., Tempe, AZ (United States). Dept. of Mechanical and Aerospace Engineering

    1995-01-11

    Radiation in participating media is an important transport mechanism in many physical systems. The simulation of complex radiative transfer has not effectively exploited high-performance computing capabilities. In response to this need, a workshop attended by members active in the high-performance computing community, members active in the radiative transfer community, and members from closely related fields was held to identify how high-performance computing can be used effectively to solve the transport equation and advance the state-of-the-art in simulating radiative heat transfer. This workshop was held on March 29-30, 1994 in Albuquerque, New Mexico and was conducted by Sandia National Laboratories. The objectives of this workshop were to provide a vehicle to stimulate interest and new research directions within the two communities to exploit the advantages of high-performance computing for solving complex radiative heat transfer problems that are otherwise intractable.

  8. High-performance simulation-based algorithms for an alpine ski racer’s trajectory optimization in heterogeneous computer systems

    Directory of Open Access Journals (Sweden)

    Dębski Roman

    2014-09-01

    Full Text Available Effective, simulation-based trajectory optimization algorithms adapted to heterogeneous computers are studied with reference to the problem taken from alpine ski racing (the presented solution is probably the most general one published so far. The key idea behind these algorithms is to use a grid-based discretization scheme to transform the continuous optimization problem into a search problem over a specially constructed finite graph, and then to apply dynamic programming to find an approximation of the global solution. In the analyzed example it is the minimum-time ski line, represented as a piecewise-linear function (a method of elimination of unfeasible solutions is proposed. Serial and parallel versions of the basic optimization algorithm are presented in detail (pseudo-code, time and memory complexity. Possible extensions of the basic algorithm are also described. The implementation of these algorithms is based on OpenCL. The included experimental results show that contemporary heterogeneous computers can be treated as μ-HPC platforms-they offer high performance (the best speedup was equal to 128 while remaining energy and cost efficient (which is crucial in embedded systems, e.g., trajectory planners of autonomous robots. The presented algorithms can be applied to many trajectory optimization problems, including those having a black-box represented performance measure

  9. An open, parallel I/O computer as the platform for high-performance, high-capacity mass storage systems

    Science.gov (United States)

    Abineri, Adrian; Chen, Y. P.

    1992-01-01

    APTEC Computer Systems is a Portland, Oregon based manufacturer of I/O computers. APTEC's work in the context of high density storage media is on programs requiring real-time data capture with low latency processing and storage requirements. An example of APTEC's work in this area is the Loral/Space Telescope-Data Archival and Distribution System. This is an existing Loral AeroSys designed system, which utilizes an APTEC I/O computer. The key attributes of a system architecture that is suitable for this environment are as follows: (1) data acquisition alternatives; (2) a wide range of supported mass storage devices; (3) data processing options; (4) data availability through standard network connections; and (5) an overall system architecture (hardware and software designed for high bandwidth and low latency). APTEC's approach is outlined in this document.

  10. A high performance sensorimotor beta rhythm-based brain computer interface associated with human natural motor behavior

    Science.gov (United States)

    Bai, Ou; Lin, Peter; Vorbach, Sherry; Floeter, Mary Kay; Hattori, Noriaki; Hallett, Mark

    2008-03-01

    To explore the reliability of a high performance brain-computer interface (BCI) using non-invasive EEG signals associated with human natural motor behavior does not require extensive training. We propose a new BCI method, where users perform either sustaining or stopping a motor task with time locking to a predefined time window. Nine healthy volunteers, one stroke survivor with right-sided hemiparesis and one patient with amyotrophic lateral sclerosis (ALS) participated in this study. Subjects did not receive BCI training before participating in this study. We investigated tasks of both physical movement and motor imagery. The surface Laplacian derivation was used for enhancing EEG spatial resolution. A model-free threshold setting method was used for the classification of motor intentions. The performance of the proposed BCI was validated by an online sequential binary-cursor-control game for two-dimensional cursor movement. Event-related desynchronization and synchronization were observed when subjects sustained or stopped either motor execution or motor imagery. Feature analysis showed that EEG beta band activity over sensorimotor area provided the largest discrimination. With simple model-free classification of beta band EEG activity from a single electrode (with surface Laplacian derivation), the online classifications of the EEG activity with motor execution/motor imagery were: >90%/~80% for six healthy volunteers, >80%/~80% for the stroke patient and ~90%/~80% for the ALS patient. The EEG activities of the other three healthy volunteers were not classifiable. The sensorimotor beta rhythm of EEG associated with human natural motor behavior can be used for a reliable and high performance BCI for both healthy subjects and patients with neurological disorders. Significance: The proposed new non-invasive BCI method highlights a practical BCI for clinical applications, where the user does not require extensive training.

  11. Overlapping clusters for distributed computation.

    Energy Technology Data Exchange (ETDEWEB)

    Mirrokni, Vahab (Google Research, New York, NY); Andersen, Reid (Microsoft Corporation, Redmond, WA); Gleich, David F.

    2010-11-01

    Scalable, distributed algorithms must address communication problems. We investigate overlapping clusters, or vertex partitions that intersect, for graph computations. This setup stores more of the graph than required but then affords the ease of implementation of vertex partitioned algorithms. Our hope is that this technique allows us to reduce communication in a computation on a distributed graph. The motivation above draws on recent work in communication avoiding algorithms. Mohiyuddin et al. (SC09) design a matrix-powers kernel that gives rise to an overlapping partition. Fritzsche et al. (CSC2009) develop an overlapping clustering for a Schwarz method. Both techniques extend an initial partitioning with overlap. Our procedure generates overlap directly. Indeed, Schwarz methods are commonly used to capitalize on overlap. Elsewhere, overlapping communities (Ahn et al, Nature 2009; Mishra et al. WAW2007) are now a popular model of structure in social networks. These have long been studied in statistics (Cole and Wishart, CompJ 1970). We present two types of results: (i) an estimated swapping probability {rho}{infinity}; and (ii) the communication volume of a parallel PageRank solution (link-following {alpha} = 0.85) using an additive Schwarz method. The volume ratio is the amount of extra storage for the overlap (2 means we store the graph twice). Below, as the ratio increases, the swapping probability and PageRank communication volume decreases.

  12. Towards distributed multiscale computing for the VPH

    NARCIS (Netherlands)

    Hoekstra, A.G.; Coveney, P.

    2010-01-01

    Multiscale modeling is fundamental to the Virtual Physiological Human (VPH) initiative. Most detailed three-dimensional multiscale models lead to prohibitive computational demands. As a possible solution we present MAPPER, a computational science infrastructure for Distributed Multiscale Computing

  13. Computational Approach for Securing Radiology-Diagnostic Data in Connected Health Network using High-Performance GPU-Accelerated AES.

    Science.gov (United States)

    Adeshina, A M; Hashim, R

    2017-03-01

    Diagnostic radiology is a core and integral part of modern medicine, paving ways for the primary care physicians in the disease diagnoses, treatments and therapy managements. Obviously, all recent standard healthcare procedures have immensely benefitted from the contemporary information technology revolutions, apparently revolutionizing those approaches to acquiring, storing and sharing of diagnostic data for efficient and timely diagnosis of diseases. Connected health network was introduced as an alternative to the ageing traditional concept in healthcare system, improving hospital-physician connectivity and clinical collaborations. Undoubtedly, the modern medicinal approach has drastically improved healthcare but at the expense of high computational cost and possible breach of diagnosis privacy. Consequently, a number of cryptographical techniques are recently being applied to clinical applications, but the challenges of not being able to successfully encrypt both the image and the textual data persist. Furthermore, processing time of encryption-decryption of medical datasets, within a considerable lower computational cost without jeopardizing the required security strength of the encryption algorithm, still remains as an outstanding issue. This study proposes a secured radiology-diagnostic data framework for connected health network using high-performance GPU-accelerated Advanced Encryption Standard. The study was evaluated with radiology image datasets consisting of brain MR and CT datasets obtained from the department of Surgery, University of North Carolina, USA, and the Swedish National Infrastructure for Computing. Sample patients' notes from the University of North Carolina, School of medicine at Chapel Hill were also used to evaluate the framework for its strength in encrypting-decrypting textual data in the form of medical report. Significantly, the framework is not only able to accurately encrypt and decrypt medical image datasets, but it also

  14. The Analysis and Processing of Spectra with Splicing Abnormality in LAMOST Based on High Performance Computing Platform

    Science.gov (United States)

    Meng, Fanlong; Pan, Jingchang; Yu, Jingjing

    2017-10-01

    Since the completion of LAMOST the observation had produced a large number of data, and officially released data. There are some poor quality spectra in the observation data and even the data released to the outside world. Splicing abnormal spectra is one kind of them. Splicing abnormality is poor continuity spectrum showed in the splicing wavelengths of the red and blue end. This problem can be caused by several factors, such as stability of instrument, observation condition, the response function, and so on. It has important effect on the research and analysis of spectra. It is of great significance to study the automatic identification of the splicing abnormal spectra. The method in this paper we first do the spectrum pretreatment, then the spectral was intercepted into red and blue in order to fit curves. For the two fitting curves, a series of wavelength points are selected to calculate the difference of these points. For the difference set, the statistical characteristics of the difference are analyzed, such as mean and standard deviation. Then we calculate the area of the two curves. Based on these values, we proposed an evaluation function. At the same time, we provided a method for detecting the splicing abnormal spectra based on the high performance computing platform. Through a lot of experiments, the efficiency was greatly improved compared to stand-alone.

  15. Exploring the transcriptome of non-model oleaginous microalga Dunaliella tertiolecta through high-throughput sequencing and high performance computing.

    Science.gov (United States)

    Yao, Lina; Tan, Kenneth Wei Min; Tan, Tin Wee; Lee, Yuan Kun

    2017-02-22

    RNA-Seq technology has received a lot of attention in recent years for microalgal global transcriptomic profiling. It is widely used in transcriptome-wide analysis of gene expression., particularly for microalgal strains with potential as biofuel sources. However, insufficient genomic or transcriptomic information of non-model microalgae has limited the understanding of their regulatory mechanisms and hampered genetic manipulation to enhance biofuel production. As such, an optimal microalgal transcriptomic database construction is a subject of urgent investigation. Dunaliella tertiolecta, a non-model oleaginous microalgal species, was sequenced via Illumina MISEQ and HISEQ 4000 in RNA-Seq studies. The high quality high-throughout sequencing data were explored using high performance computing (HPC) in a petascale data center and subjected to de novo assembly and parallelized mpiBLASTX search with multiple species. As a result, a transcriptome database of 17,845 was constructed (~95% completeness). This enlarged database constructed fueled the RNA-Seq data analysis, which was validated by a nitrogen deprivation (ND) study that induces triacylglycerol (TAG) production. The new paralleled assembly and annotation method under HPC presented here allows the solution of large-scale data processing problems in acceptable computation time. There is significant increase in the number of transcriptomic data achieved and observable heterogeneity in the performance to identify differentially expressed genes in the ND treatment paradigm. The results provide new insights as to how response to ND treatment in microalgae is regulated. ND analyses highlight the advantages of this database generated in this study that could also serve as a useful resource for future gene manipulation and transcriptome-wide analysis. We thus demonstrate the usefulness of exploring the transcriptome as an informative platform for functional studies and genetic manipulations in similar species.

  16. Final report for %22High performance computing for advanced national electric power grid modeling and integration of solar generation resources%22, LDRD Project No. 149016.

    Energy Technology Data Exchange (ETDEWEB)

    Reno, Matthew J.; Riehm, Andrew Charles; Hoekstra, Robert John; Munoz-Ramirez, Karina; Stamp, Jason Edwin; Phillips, Laurence R.; Adams, Brian M.; Russo, Thomas V.; Oldfield, Ron A.; McLendon, William Clarence, III; Nelson, Jeffrey Scott; Hansen, Clifford W.; Richardson, Bryan T.; Stein, Joshua S.; Schoenwald, David Alan; Wolfenbarger, Paul R.

    2011-02-01

    Design and operation of the electric power grid (EPG) relies heavily on computational models. High-fidelity, full-order models are used to study transient phenomena on only a small part of the network. Reduced-order dynamic and power flow models are used when analysis involving thousands of nodes are required due to the computational demands when simulating large numbers of nodes. The level of complexity of the future EPG will dramatically increase due to large-scale deployment of variable renewable generation, active load and distributed generation resources, adaptive protection and control systems, and price-responsive demand. High-fidelity modeling of this future grid will require significant advances in coupled, multi-scale tools and their use on high performance computing (HPC) platforms. This LDRD report demonstrates SNL's capability to apply HPC resources to these 3 tasks: (1) High-fidelity, large-scale modeling of power system dynamics; (2) Statistical assessment of grid security via Monte-Carlo simulations of cyber attacks; and (3) Development of models to predict variability of solar resources at locations where little or no ground-based measurements are available.

  17. Pair distribution function computed tomography.

    Science.gov (United States)

    Jacques, Simon D M; Di Michiel, Marco; Kimber, Simon A J; Yang, Xiaohao; Cernik, Robert J; Beale, Andrew M; Billinge, Simon J L

    2013-01-01

    An emerging theme of modern composites and devices is the coupling of nanostructural properties of materials with their targeted arrangement at the microscale. Of the imaging techniques developed that provide insight into such designer materials and devices, those based on diffraction are particularly useful. However, to date, these have been heavily restrictive, providing information only on materials that exhibit high crystallographic ordering. Here we describe a method that uses a combination of X-ray atomic pair distribution function analysis and computed tomography to overcome this limitation. It allows the structure of nanocrystalline and amorphous materials to be identified, quantified and mapped. We demonstrate the method with a phantom object and subsequently apply it to resolving, in situ, the physicochemical states of a heterogeneous catalyst system. The method may have potential impact across a range of disciplines from materials science, biomaterials, geology, environmental science, palaeontology and cultural heritage to health.

  18. Cooperative Fault Tolerant Distributed Computing

    Energy Technology Data Exchange (ETDEWEB)

    Fagg, Graham E.

    2006-03-15

    HARNESS was proposed as a system that combined the best of emerging technologies found in current distributed computing research and commercial products into a very flexible, dynamically adaptable framework that could be used by applications to allow them to evolve and better handle their execution environment. The HARNESS system was designed using the considerable experience from previous projects such as PVM, MPI, IceT and Cumulvs. As such, the system was designed to avoid any of the common problems found with using these current systems, such as no single point of failure, ability to survive machine, node and software failures. Additional features included improved inter-component connectivity, with full support for dynamic down loading of addition components at run-time thus reducing the stress on application developers to build in all the libraries they need in advance.

  19. Adoption of High Performance Computational (HPC) Modeling Software for Widespread Use in the Manufacture of Welded Structures

    Energy Technology Data Exchange (ETDEWEB)

    Brust, Frederick W. [Engineering Mechanics Corporation of Columbus, Columbus, OH (United States); Punch, Edward F. [Engineering Mechanics Corporation of Columbus, Columbus, OH (United States); Twombly, Elizabeth Kurth [Engineering Mechanics Corporation of Columbus, Columbus, OH (United States); Kalyanam, Suresh [Engineering Mechanics Corporation of Columbus, Columbus, OH (United States); Kennedy, James [Engineering Mechanics Corporation of Columbus, Columbus, OH (United States); Hattery, Garty R. [Engineering Mechanics Corporation of Columbus, Columbus, OH (United States); Dodds, Robert H. [Professional Consulting Services, Inc., Lisle, IL (United States); Mach, Justin C [Caterpillar, Peoria, IL (United States); Chalker, Alan [Ohio Supercomputer Center (OSC), Columbus, OH (United States); Nicklas, Jeremy [Ohio Supercomputer Center (OSC), Columbus, OH (United States); Gohar, Basil M [Ohio Supercomputer Center (OSC), Columbus, OH (United States); Hudak, David [Ohio Supercomputer Center (OSC), Columbus, OH (United States)

    2016-12-30

    This report summarizes the final product developed for the US DOE Small Business Innovation Research (SBIR) Phase II grant made to Engineering Mechanics Corporation of Columbus (Emc2) between April 16, 2014 and August 31, 2016 titled ‘Adoption of High Performance Computational (HPC) Modeling Software for Widespread Use in the Manufacture of Welded Structures’. Many US companies have moved fabrication and production facilities off shore because of cheaper labor costs. A key aspect in bringing these jobs back to the US is the use of technology to render US-made fabrications more cost-efficient overall with higher quality. One significant advantage that has emerged in the US over the last two decades is the use of virtual design for fabrication of small and large structures in weld fabrication industries. Industries that use virtual design and analysis tools have reduced material part size, developed environmentally-friendly fabrication processes, improved product quality and performance, and reduced manufacturing costs. Indeed, Caterpillar Inc. (CAT), one of the partners in this effort, continues to have a large fabrication presence in the US because of the use of weld fabrication modeling to optimize fabrications by controlling weld residual stresses and distortions and improving fatigue, corrosion, and fracture performance. This report describes Emc2’s DOE SBIR Phase II final results to extend an existing, state-of-the-art software code, Virtual Fabrication Technology (VFT®), currently used to design and model large welded structures prior to fabrication - to a broader range of products with widespread applications for small and medium-sized enterprises (SMEs). VFT® helps control distortion, can minimize and/or control residual stresses, control welding microstructure, and pre-determine welding parameters such as weld-sequencing, pre-bending, thermal-tensioning, etc. VFT® uses material properties, consumable properties, etc. as inputs

  20. Distributed and cloud computing from parallel processing to the Internet of Things

    CERN Document Server

    Hwang, Kai; Fox, Geoffrey C

    2012-01-01

    Distributed and Cloud Computing, named a 2012 Outstanding Academic Title by the American Library Association's Choice publication, explains how to create high-performance, scalable, reliable systems, exposing the design principles, architecture, and innovative applications of parallel, distributed, and cloud computing systems. Starting with an overview of modern distributed models, the book provides comprehensive coverage of distributed and cloud computing, including: Facilitating management, debugging, migration, and disaster recovery through virtualization Clustered systems for resear

  1. High performance systems

    Energy Technology Data Exchange (ETDEWEB)

    Vigil, M.B. [comp.

    1995-03-01

    This document provides a written compilation of the presentations and viewgraphs from the 1994 Conference on High Speed Computing given at the High Speed Computing Conference, {open_quotes}High Performance Systems,{close_quotes} held at Gleneden Beach, Oregon, on April 18 through 21, 1994.

  2. A High Performance Computing Approach to Tree Cover Delineation in 1-m NAIP Imagery Using a Probabilistic Learning Framework

    Science.gov (United States)

    Basu, Saikat; Ganguly, Sangram; Michaelis, Andrew; Votava, Petr; Roy, Anshuman; Mukhopadhyay, Supratik; Nemani, Ramakrishna

    2015-01-01

    Tree cover delineation is a useful instrument in deriving Above Ground Biomass (AGB) density estimates from Very High Resolution (VHR) airborne imagery data. Numerous algorithms have been designed to address this problem, but most of them do not scale to these datasets, which are of the order of terabytes. In this paper, we present a semi-automated probabilistic framework for the segmentation and classification of 1-m National Agriculture Imagery Program (NAIP) for tree-cover delineation for the whole of Continental United States, using a High Performance Computing Architecture. Classification is performed using a multi-layer Feedforward Backpropagation Neural Network and segmentation is performed using a Statistical Region Merging algorithm. The results from the classification and segmentation algorithms are then consolidated into a structured prediction framework using a discriminative undirected probabilistic graphical model based on Conditional Random Field, which helps in capturing the higher order contextual dependencies between neighboring pixels. Once the final probability maps are generated, the framework is updated and re-trained by relabeling misclassified image patches. This leads to a significant improvement in the true positive rates and reduction in false positive rates. The tree cover maps were generated for the whole state of California, spanning a total of 11,095 NAIP tiles covering a total geographical area of 163,696 sq. miles. The framework produced true positive rates of around 88% for fragmented forests and 74% for urban tree cover areas, with false positive rates lower than 2% for both landscapes. Comparative studies with the National Land Cover Data (NLCD) algorithm and the LiDAR canopy height model (CHM) showed the effectiveness of our framework for generating accurate high-resolution tree-cover maps.

  3. A High Performance Computing Approach to Tree Cover Delineation in 1-m NAIP Imagery using a Probabilistic Learning Framework

    Science.gov (United States)

    Basu, S.; Ganguly, S.; Michaelis, A.; Votava, P.; Roy, A.; Mukhopadhyay, S.; Nemani, R. R.

    2015-12-01

    Tree cover delineation is a useful instrument in deriving Above Ground Biomass (AGB) density estimates from Very High Resolution (VHR) airborne imagery data. Numerous algorithms have been designed to address this problem, but most of them do not scale to these datasets which are of the order of terabytes. In this paper, we present a semi-automated probabilistic framework for the segmentation and classification of 1-m National Agriculture Imagery Program (NAIP) for tree-cover delineation for the whole of Continental United States, using a High Performance Computing Architecture. Classification is performed using a multi-layer Feedforward Backpropagation Neural Network and segmentation is performed using a Statistical Region Merging algorithm. The results from the classification and segmentation algorithms are then consolidated into a structured prediction framework using a discriminative undirected probabilistic graphical model based on Conditional Random Field, which helps in capturing the higher order contextual dependencies between neighboring pixels. Once the final probability maps are generated, the framework is updated and re-trained by relabeling misclassified image patches. This leads to a significant improvement in the true positive rates and reduction in false positive rates. The tree cover maps were generated for the whole state of California, spanning a total of 11,095 NAIP tiles covering a total geographical area of 163,696 sq. miles. The framework produced true positive rates of around 88% for fragmented forests and 74% for urban tree cover areas, with false positive rates lower than 2% for both landscapes. Comparative studies with the National Land Cover Data (NLCD) algorithm and the LiDAR canopy height model (CHM) showed the effectiveness of our framework for generating accurate high-resolution tree-cover maps.

  4. Message passing interface and multithreading hybrid for parallel molecular docking of large databases on petascale high performance computing machines.

    Science.gov (United States)

    Zhang, Xiaohua; Wong, Sergio E; Lightstone, Felice C

    2013-04-30

    A mixed parallel scheme that combines message passing interface (MPI) and multithreading was implemented in the AutoDock Vina molecular docking program. The resulting program, named VinaLC, was tested on the petascale high performance computing (HPC) machines at Lawrence Livermore National Laboratory. To exploit the typical cluster-type supercomputers, thousands of docking calculations were dispatched by the master process to run simultaneously on thousands of slave processes, where each docking calculation takes one slave process on one node, and within the node each docking calculation runs via multithreading on multiple CPU cores and shared memory. Input and output of the program and the data handling within the program were carefully designed to deal with large databases and ultimately achieve HPC on a large number of CPU cores. Parallel performance analysis of the VinaLC program shows that the code scales up to more than 15K CPUs with a very low overhead cost of 3.94%. One million flexible compound docking calculations took only 1.4 h to finish on about 15K CPUs. The docking accuracy of VinaLC has been validated against the DUD data set by the re-docking of X-ray ligands and an enrichment study, 64.4% of the top scoring poses have RMSD values under 2.0 Å. The program has been demonstrated to have good enrichment performance on 70% of the targets in the DUD data set. An analysis of the enrichment factors calculated at various percentages of the screening database indicates VinaLC has very good early recovery of actives. Copyright © 2013 Wiley Periodicals, Inc.

  5. Structural integrity and damage assessment of high performance arresting cable systems using an embedded distributed fiber optic sensor (EDIFOS) system

    Science.gov (United States)

    Mendoza, Edgar A.; Kempen, Cornelia; Sun, Sunjian; Esterkin, Yan; Prohaska, John; Bentley, Doug; Glasgow, Andy; Campbell, Richard

    2010-04-01

    Redondo Optics in collaboration with the Cortland Cable Company, TMT Laboratories, and Applied Fiber under a US Navy SBIR project is developing an embedded distributed fiber optic sensor (EDIFOSTM) system for the real-time, structural health monitoring, damage assessment, and lifetime prediction of next generation synthetic material arresting gear cables. The EDIFOSTM system represents a new, highly robust and reliable, technology that can be use for the structural damage assessment of critical cable infrastructures. The Navy is currently investigating the use of new, all-synthetic- material arresting cables. The arresting cable is one of the most stressed components in the entire arresting gear landing system. Synthetic rope materials offer higher performance in terms of the strength-to-weight characteristics, which improves the arresting gear engine's performance resulting in reduced wind-over-deck requirements, higher aircraft bring-back-weight capability, simplified operation, maintenance, supportability, and reduced life cycle costs. While employing synthetic cables offers many advantages for the Navy's future needs, the unknown failure modes of these cables remains a high technical risk. For these reasons, Redondo Optics is investigating the use of embedded fiber optic sensors within the synthetic arresting cables to provide real-time structural assessment of the cable state, and to inform the operator when a particular cable has suffered impact damage, is near failure, or is approaching the limit of its service lifetime. To date, ROI and its collaborators have developed a technique for embedding multiple sensor fibers within the strands of high performance synthetic material cables and use the embedded fiber sensors to monitor the structural integrity of the cable structures during tensile and compressive loads exceeding over 175,000-lbsf without any damage to the cable structure or the embedded fiber sensors.

  6. Simulation model of load balancing in distributed computing systems

    Science.gov (United States)

    Botygin, I. A.; Popov, V. N.; Frolov, S. G.

    2017-02-01

    The availability of high-performance computing, high speed data transfer over the network and widespread of software for the design and pre-production in mechanical engineering have led to the fact that at the present time the large industrial enterprises and small engineering companies implement complex computer systems for efficient solutions of production and management tasks. Such computer systems are generally built on the basis of distributed heterogeneous computer systems. The analytical problems solved by such systems are the key models of research, but the system-wide problems of efficient distribution (balancing) of the computational load and accommodation input, intermediate and output databases are no less important. The main tasks of this balancing system are load and condition monitoring of compute nodes, and the selection of a node for transition of the user’s request in accordance with a predetermined algorithm. The load balancing is one of the most used methods of increasing productivity of distributed computing systems through the optimal allocation of tasks between the computer system nodes. Therefore, the development of methods and algorithms for computing optimal scheduling in a distributed system, dynamically changing its infrastructure, is an important task.

  7. Cambridge-Cranfield High Performance Computing Facility (HPCF) purchases ten Sun Fire(TM) 15K servers to dramatically increase power of eScience research

    CERN Document Server

    2002-01-01

    "The Cambridge-Cranfield High Performance Computing Facility (HPCF), a collaborative environment for data and numerical intensive computing privately run by the University of Cambridge and Cranfield University, has purchased 10 Sun Fire(TM) 15K servers from Sun Microsystems, Inc.. The total investment, which includes more than $40 million in Sun technology, will dramatically increase the computing power, reliability, availability and scalability of the HPCF" (1 page).

  8. VLSI Design, Parallel Computation and Distributed Computing

    Science.gov (United States)

    1991-09-30

    I U1 TA 3 Daniel Mleitman U. :C..( -_. .. .s .. . . . . Tom Leighton David Shmoys . ........A ,~i ;.t , 77 Michael Sipser , Di.,t a-., Eva Tardos...Programming. 53. R. Boppania. MI. Sipser , --The Complexity of Finite Functions," Handbook of Theoretical Computer Science, 1990, 7.59-800. .54. T...Property with Application," IPL. Vol. 35, Sept. 1990, pp. 281- 285. 139. M. Sipser , L. Fortnow, ’Interactive Proof Systems in Log-Space," submit- ted

  9. A High-Performance Communication Service for Parallel Computing on Distributed DSP Systems

    National Research Council Canada - National Science Library

    2002-01-01

    Rapid increases in the complexity of algorithms for real-time signal processing applications have led to performance requirements exceeding the capabilities of conventional digital signal processor (DSP) architectures...

  10. Towards a Scalable and Adaptive Application Support Platform for Large-Scale Distributed E-Sciences in High-Performance Network Environments

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Chase Qishi [New Jersey Inst. of Technology, Newark, NJ (United States); Univ. of Memphis, TN (United States); Zhu, Michelle Mengxia [Southern Illinois Univ., Carbondale, IL (United States)

    2016-06-06

    The advent of large-scale collaborative scientific applications has demonstrated the potential for broad scientific communities to pool globally distributed resources to produce unprecedented data acquisition, movement, and analysis. System resources including supercomputers, data repositories, computing facilities, network infrastructures, storage systems, and display devices have been increasingly deployed at national laboratories and academic institutes. These resources are typically shared by large communities of users over Internet or dedicated networks and hence exhibit an inherent dynamic nature in their availability, accessibility, capacity, and stability. Scientific applications using either experimental facilities or computation-based simulations with various physical, chemical, climatic, and biological models feature diverse scientific workflows as simple as linear pipelines or as complex as a directed acyclic graphs, which must be executed and supported over wide-area networks with massively distributed resources. Application users oftentimes need to manually configure their computing tasks over networks in an ad hoc manner, hence significantly limiting the productivity of scientists and constraining the utilization of resources. The success of these large-scale distributed applications requires a highly adaptive and massively scalable workflow platform that provides automated and optimized computing and networking services. This project is to design and develop a generic Scientific Workflow Automation and Management Platform (SWAMP), which contains a web-based user interface specially tailored for a target application, a set of user libraries, and several easy-to-use computing and networking toolkits for application scientists to conveniently assemble, execute, monitor, and control complex computing workflows in heterogeneous high-performance network environments. SWAMP will enable the automation and management of the entire process of scientific

  11. Playable Serious Games for Studying and Programming Computational STEM and Informatics Applications of Distributed and Parallel Computer Architectures

    Science.gov (United States)

    Amenyo, John-Thones

    2012-01-01

    Carefully engineered playable games can serve as vehicles for students and practitioners to learn and explore the programming of advanced computer architectures to execute applications, such as high performance computing (HPC) and complex, inter-networked, distributed systems. The article presents families of playable games that are grounded in…

  12. A DFN-based High Performance Computing Approach to the Simulation of Radionuclide Transport in Mineralogically Heterogeneous Fractured Rocks

    Science.gov (United States)

    Gylling, B.; Trinchero, P.; Molinero, J.; Deissmann, G.; Svensson, U.; Ebrahimi, H.; Hammond, G. E.; Bosbach, D.; Puigdomenech, I.

    2016-12-01

    estimates compared to the grain-scale model. The breakthrough computed by the two models converge at late times. These results suggest that spatially averaged parameters of mineral distribution are adequate to predict the late time behavior of breakthrough curves but could lead to severe overestimation of the radionuclide first-arrival.

  13. High-Performance Computing Act of 1991. Report To Accompany H.R. 656. House of Representatives, 102d Congress, 1st Session.

    Science.gov (United States)

    Congress of the U.S., Washington, DC. House.

    This report, submitted by Representative Ford, of Michigan, from the House Committee on Education and Labor, is intended to accompany H.R. 656. The bill provides for a focused and coordinated federal research program to ensure continued U.S. leadership in high-performance computing through the establishment of a national network of…

  14. High Performance Computing and Communications: New Program Direction Would Benefit from a More Focused Effort. Report to the Chairman, Committee on Armed Services, House of Representatives.

    Science.gov (United States)

    General Accounting Office, Washington, DC. Accounting and Information Management Div.

    The House Armed Services Committee asked the GAO (General Accounting Office) to examine the HPCC (High Performance Computing and Communications) program in terms of: (1) the effectiveness of the program's management structure in setting goals and measuring progress, and (2) how extensively private industry has been involved in the planning and…

  15. HiTempo: a platform for time-series analysis of remote-sensing satellite data in a high-performance computing environment

    CSIR Research Space (South Africa)

    Van den Bergh, F

    2012-08-01

    Full Text Available methods. To research the application of these time-series analysis methods to large data sets, it is necessary to turn to high-performance computing (HPC) resources and software designs. This article presents an overview of the development of the Hi...

  16. Distributed-memory matrix computations

    DEFF Research Database (Denmark)

    Balle, Susanne Mølleskov

    1995-01-01

    The main goal of this project is to investigate, develop, and implement algorithms for numerical linear algebra on parallel computers in order to acquire expertise in methods for parallel computations. An important motivation for analyzaing and investigating the potential for parallelism in these......The main goal of this project is to investigate, develop, and implement algorithms for numerical linear algebra on parallel computers in order to acquire expertise in methods for parallel computations. An important motivation for analyzaing and investigating the potential for parallelism....... Several areas in the numerical linear algebra field are investigated and they illustrate the problems that arise as well as the techniques that are related to the use of massively parallel computers: 1.Study of Strassen's matrix-matrix multiplication on the Connection Machine model CM-200. What...... performance can we expect to achieve? Why? 2.Solving systems of linear equations using a Strassen-type matrix-inversion algorithm. A good way to solve systems of linear equations on massively parallel computers? 3.Aspects of computing the singular value decomposition on the Connec-tion Machine CM-5/CM-5E...

  17. Dynamo: a flexible, user-friendly development tool for subtomogram averaging of cryo-EM data in high-performance computing environments.

    Science.gov (United States)

    Castaño-Díez, Daniel; Kudryashev, Mikhail; Arheit, Marcel; Stahlberg, Henning

    2012-05-01

    Dynamo is a new software package for subtomogram averaging of cryo Electron Tomography (cryo-ET) data with three main goals: first, Dynamo allows user-transparent adaptation to a variety of high-performance computing platforms such as GPUs or CPU clusters. Second, Dynamo implements user-friendliness through GUI interfaces and scripting resources. Third, Dynamo offers user-flexibility through a plugin API. Besides the alignment and averaging procedures, Dynamo includes native tools for visualization and analysis of results and data, as well as support for third party visualization software, such as Chimera UCSF or EMAN2. As a demonstration of these functionalities, we studied bacterial flagellar motors and showed automatically detected classes with absent and present C-rings. Subtomogram averaging is a common task in current cryo-ET pipelines, which requires extensive computational resources and follows a well-established workflow. However, due to the data diversity, many existing packages offer slight variations of the same algorithm to improve results. One of the main purposes behind Dynamo is to provide explicit tools to allow the user the insertion of custom designed procedures - or plugins - to replace or complement the native algorithms in the different steps of the processing pipeline for subtomogram averaging without the burden of handling parallelization. Custom scripts that implement new approaches devised by the user are integrated into the Dynamo data management system, so that they can be controlled by the GUI or the scripting capacities. Dynamo executables do not require licenses for third party commercial software. Sources, executables and documentation are freely distributed on http://www.dynamo-em.org. Copyright © 2012 Elsevier Inc. All rights reserved.

  18. An Introduction to High Performance Fortran

    Directory of Open Access Journals (Sweden)

    John Merlin

    1995-01-01

    Full Text Available High Performance Fortran (HPF is an informal standard for extensions to Fortran 90 to assist its implementation on parallel architectures, particularly for data-parallel computation. Among other things, it includes directives for specifying data distribution across multiple memories, and concurrent execution features. This article provides a tutorial introduction to the main features of HPF.

  19. Fel simulations using distributed computing

    NARCIS (Netherlands)

    Einstein, J.; Biedron, S.G.; Freund, H.P.; Milton, S.V.; Van Der Slot, P. J M; Bernabeu, G.

    2016-01-01

    While simulation tools are available and have been used regularly for simulating light sources, including Free-Electron Lasers, the increasing availability and lower cost of accelerated computing opens up new opportunities. This paper highlights a method of how accelerating and parallelizing code

  20. High Performance Computing and Communications Act of 1991. Hearing Before the Subcommittee on Science, Technology, and Space of the Committee on Commerce, Science, and Transportation. One Hundred Second Congress, First Session on S. 272 To Provide for a Coordinated Federal Research Program To Ensure Continued United States Leadership in High-Performance Computing.

    Science.gov (United States)

    Congress of the U.S., Washington, DC. Senate Committee on Commerce, Science, and Transportation.

    This hearing before the Senate Subcommittee on Science, Technology, and Space focuses on S. 272, the High-Performance Computing and Communications Act of 1991, a bill that provides for a coordinated federal research and development program to ensure continued U.S. leadership in this area. Performance computing is defined as representing the…

  1. High performance computing applied to simulation of the flow in pipes; Computacao de alto desempenho aplicada a simulacao de escoamento em dutos

    Energy Technology Data Exchange (ETDEWEB)

    Cozin, Cristiane; Lueders, Ricardo; Morales, Rigoberto E.M. [Universidade Tecnologica Federal do Parana (UTFPR), Curitiba, PR (Brazil). Dept. de Engenharia Mecanica

    2008-07-01

    In recent years, computer cluster has emerged as a real alternative to solution of problems which require high performance computing. Consequently, the development of new applications has been driven. Among them, flow simulation represents a real computational burden specially for large systems. This work presents a study of using parallel computing for numerical fluid flow simulation in pipelines. A mathematical flow model is numerically solved. In general, this procedure leads to a tridiagonal system of equations suitable to be solved by a parallel algorithm. In this work, this is accomplished by a parallel odd-oven reduction method found in the literature which is implemented on Fortran programming language. A computational platform composed by twelve processors was used. Many measures of CPU times for different tridiagonal system sizes and number of processors were obtained, highlighting the communication time between processors as an important issue to be considered when evaluating the performance of parallel applications. (author)

  2. Next generation distributed computing for cancer research.

    Science.gov (United States)

    Agarwal, Pankaj; Owzar, Kouros

    2014-01-01

    Advances in next generation sequencing (NGS) and mass spectrometry (MS) technologies have provided many new opportunities and angles for extending the scope of translational cancer research while creating tremendous challenges in data management and analysis. The resulting informatics challenge is invariably not amenable to the use of traditional computing models. Recent advances in scalable computing and associated infrastructure, particularly distributed computing for Big Data, can provide solutions for addressing these challenges. In this review, the next generation of distributed computing technologies that can address these informatics problems is described from the perspective of three key components of a computational platform, namely computing, data storage and management, and networking. A broad overview of scalable computing is provided to set the context for a detailed description of Hadoop, a technology that is being rapidly adopted for large-scale distributed computing. A proof-of-concept Hadoop cluster, set up for performance benchmarking of NGS read alignment, is described as an example of how to work with Hadoop. Finally, Hadoop is compared with a number of other current technologies for distributed computing.

  3. Next Generation Distributed Computing for Cancer Research

    Science.gov (United States)

    Agarwal, Pankaj; Owzar, Kouros

    2014-01-01

    Advances in next generation sequencing (NGS) and mass spectrometry (MS) technologies have provided many new opportunities and angles for extending the scope of translational cancer research while creating tremendous challenges in data management and analysis. The resulting informatics challenge is invariably not amenable to the use of traditional computing models. Recent advances in scalable computing and associated infrastructure, particularly distributed computing for Big Data, can provide solutions for addressing these challenges. In this review, the next generation of distributed computing technologies that can address these informatics problems is described from the perspective of three key components of a computational platform, namely computing, data storage and management, and networking. A broad overview of scalable computing is provided to set the context for a detailed description of Hadoop, a technology that is being rapidly adopted for large-scale distributed computing. A proof-of-concept Hadoop cluster, set up for performance benchmarking of NGS read alignment, is described as an example of how to work with Hadoop. Finally, Hadoop is compared with a number of other current technologies for distributed computing. PMID:25983539

  4. High performance computing equipment for environmental remediation modeling and first principles simulation of materials properties. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Glimm, J.; Lindquist, W.B.

    1994-08-01

    A 56-node Intel Paragon parallel computer was purchased with major support provided by this grant, and installed in July, 1993, in the Center for Scientific Computing, Department of Applied Mathematics and Statistics, SUNY - Stony Brook. The targeted research funded by this proposal consists of work to support the Stony Brook and Brookhaven National Laboratory contributions to the Partnership in Computational Science (PICS) program; namely environmental remediation modeling of ground water transport, Car-Parrinello first principles molecular dynamics calculations, and the supporting development of the parallelized VolVis graphics package. Research accomplishments to date for this targeted research is discussed in {section}2. This computer has also enabled or enhanced many other projects conducted both by the Center for Scientific Computing and by the Department of Applied Mathematics and Statistics. These other projects include two- and three-dimensional gas dynamics using front tracking, other molecular dynamics applications, kidney modeling, and global optimization techniques applied to DNA-protein interactions. Technical summaries of these additional projects are presented in {section}3. The targeted research includes users from the Departments of Applied Mathematics and Computer Science at SUNY - Stony Brook, as well as staff scientists from the Departments of Physics and Applied Sciences at Brookhaven National Laboratory. The additional projects involve university faculty from the above departments as well as the Departments of Physics and Chemistry. Regular users of this machine currently include 10 faculty members, 8 postdoctoral fellows, more that 12 PhD students and approximately 8 staff members from BNL.

  5. Parallel-META 2.0: enhanced metagenomic data analysis with functional annotation, high performance computing and advanced visualization.

    Science.gov (United States)

    Su, Xiaoquan; Pan, Weihua; Song, Baoxing; Xu, Jian; Ning, Kang

    2014-01-01

    The metagenomic method directly sequences and analyses genome information from microbial communities. The main computational tasks for metagenomic analyses include taxonomical and functional structure analysis for all genomes in a microbial community (also referred to as a metagenomic sample). With the advancement of Next Generation Sequencing (NGS) techniques, the number of metagenomic samples and the data size for each sample are increasing rapidly. Current metagenomic analysis is both data- and computation- intensive, especially when there are many species in a metagenomic sample, and each has a large number of sequences. As such, metagenomic analyses require extensive computational power. The increasing analytical requirements further augment the challenges for computation analysis. In this work, we have proposed Parallel-META 2.0, a metagenomic analysis software package, to cope with such needs for efficient and fast analyses of taxonomical and functional structures for microbial communities. Parallel-META 2.0 is an extended and improved version of Parallel-META 1.0, which enhances the taxonomical analysis using multiple databases, improves computation efficiency by optimized parallel computing, and supports interactive visualization of results in multiple views. Furthermore, it enables functional analysis for metagenomic samples including short-reads assembly, gene prediction and functional annotation. Therefore, it could provide accurate taxonomical and functional analyses of the metagenomic samples in high-throughput manner and on large scale.

  6. A High Performance Computing Approach to the Simulation of Fluid Solid Interaction Problems with Rigid and Flexible Components (Open Access Publisher’s Version)

    Science.gov (United States)

    2014-08-01

    HIGH PERFORMANCE COMPUTING APPROACH TO THE SIMULATION OF FLUID-SOLID INTERACTION PROBLEMS WITH RIGID AND FLEXIBLE COMPONENTS This work outlines a unified ...speed and architecture, memory layout and capacity, and power efficiency have motivated a trend of re-evaluating simulation algorithms with an eye...leverage any multi-threaded architecture, the CUDA library [26] was employed for the execution of all solution components on the GPU, with negligible

  7. Distributed computing by oblivious mobile robots

    CERN Document Server

    Flocchini, Paola; Santoro, Nicola

    2012-01-01

    The study of what can be computed by a team of autonomous mobile robots, originally started in robotics and AI, has become increasingly popular in theoretical computer science (especially in distributed computing), where it is now an integral part of the investigations on computability by mobile entities. The robots are identical computational entities located and able to move in a spatial universe; they operate without explicit communication and are usually unable to remember the past; they are extremely simple, with limited resources, and individually quite weak. However, collectively the ro

  8. Distributed computer systems theory and practice

    CERN Document Server

    Zedan, H S M

    2014-01-01

    Distributed Computer Systems: Theory and Practice is a collection of papers dealing with the design and implementation of operating systems, including distributed systems, such as the amoeba system, argus, Andrew, and grapevine. One paper discusses the concepts and notations for concurrent programming, particularly language notation used in computer programming, synchronization methods, and also compares three classes of languages. Another paper explains load balancing or load redistribution to improve system performance, namely, static balancing and adaptive load balancing. For program effici

  9. Impossibility results for distributed computing

    CERN Document Server

    Attiya, Hagit

    2014-01-01

    To understand the power of distributed systems, it is necessary to understand their inherent limitations: what problems cannot be solved in particular systems, or without sufficient resources (such as time or space). This book presents key techniques for proving such impossibility results and applies them to a variety of different problems in a variety of different system models. Insights gained from these results are highlighted, aspects of a problem that make it difficult are isolated, features of an architecture that make it inadequate for solving certain problems efficiently are identified

  10. High-Performance Networking

    CERN Document Server

    CERN. Geneva

    2003-01-01

    The series will start with an historical introduction about what people saw as high performance message communication in their time and how that developed to the now to day known "standard computer network communication". It will be followed by a far more technical part that uses the High Performance Computer Network standards of the 90's, with 1 Gbit/sec systems as introduction for an in depth explanation of the three new 10 Gbit/s network and interconnect technology standards that exist already or emerge. If necessary for a good understanding some sidesteps will be included to explain important protocols as well as some necessary details of concerned Wide Area Network (WAN) standards details including some basics of wavelength multiplexing (DWDM). Some remarks will be made concerning the rapid expanding applications of networked storage.

  11. ATLAS Distributed Computing in LHC Run2

    CERN Document Server

    Campana, Simone; The ATLAS collaboration

    2015-01-01

    The ATLAS Distributed Computing infrastructure has evolved after the first period of LHC data taking in order to cope with the challenges of the upcoming LHC Run2. An increased data rate and computing demands of the Monte-Carlo simulation, as well as new approaches to ATLAS analysis, dictated a more dynamic workload management system (ProdSys2) and data management system (Rucio), overcoming the boundaries imposed by the design of the old computing model. In particular, the commissioning of new central computing system components was the core part of the migration toward the flexible computing model. The flexible computing utilization exploring the opportunistic resources such as HPC, cloud, and volunteer computing is embedded in the new computing model, the data access mechanisms have been enhanced with the remote access, and the network topology and performance is deeply integrated into the core of the system. Moreover a new data management strategy, based on defined lifetime for each dataset, has been defin...

  12. LHCb: LHCb Distributed Computing Operations

    CERN Multimedia

    Stagni, F

    2011-01-01

    The proliferation of tools for monitoring both activities and infrastructure, together with the pressing need for prompt reaction in case of problems impacting data taking, data reconstruction, data reprocessing and user analysis brought to the need of better organizing the huge amount of information available. The monitoring system for the LHCb Grid Computing relies on many heterogeneous and independent sources of information offering different views for a better understanding of problems while an operations team and defined procedures have been put in place to handle them. This work summarizes the state-of-the-art of LHCb Grid operations emphasizing the reasons that brought to various choices and what are the tools currently in use to run our daily activities. We highlight the most common problems experienced across years of activities on the WLCG infrastructure, the services with their criticality, the procedures in place, the relevant metrics and the tools available and the ones still missing.

  13. Mobile Agents in Networking and Distributed Computing

    CERN Document Server

    Cao, Jiannong

    2012-01-01

    The book focuses on mobile agents, which are computer programs that can autonomously migrate between network sites. This text introduces the concepts and principles of mobile agents, provides an overview of mobile agent technology, and focuses on applications in networking and distributed computing.

  14. Inulin in Medicinal Plants (IV) : Reversed-Phase High-Performance Liquid Chromatography of Inulin after Acetylation : Molecular-Weight Distribution of Inulin in Medicinal Plants

    OpenAIRE

    三野, 芳紀; 筒井, 聡美; 太田, 長世; YOSHIKI, MINO; SATOMI, TSUTSUI; NAGAYO, OTA; 大阪薬科大学; Osaka College of Pharmacy

    1985-01-01

    Reversed-phase high-performance liquid chromatography coupled with pre-acetylation enabled acculate molecular-weight assay of inulin in medicinal plants to be conducted. The results clearly showed that the molecular-weight distribution of inulin varied depending on the stage of growth: Small molecular weight inulin polymers were detected in large quantity in the earlier growth stage whereas large molecular weight inulin polymers at the flowering and post flowering period.

  15. Lawrence Livermore National Laboratories Perspective on Code Development and High Performance Computing Resources in Support of the National HED/ICF Effort

    Energy Technology Data Exchange (ETDEWEB)

    Clouse, C. J. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Edwards, M. J. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); McCoy, M. G. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Marinak, M. M. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Verdon, C. P. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2015-07-07

    Through its Advanced Scientific Computing (ASC) and Inertial Confinement Fusion (ICF) code development efforts, Lawrence Livermore National Laboratory (LLNL) provides a world leading numerical simulation capability for the National HED/ICF program in support of the Stockpile Stewardship Program (SSP). In addition the ASC effort provides high performance computing platform capabilities upon which these codes are run. LLNL remains committed to, and will work with, the national HED/ICF program community to help insure numerical simulation needs are met and to make those capabilities available, consistent with programmatic priorities and available resources.

  16. Study on High Performance of MPI-Based Parallel FDTD from WorkStation to Super Computer Platform

    Directory of Open Access Journals (Sweden)

    Z. L. He

    2012-01-01

    Full Text Available Parallel FDTD method is applied to analyze the electromagnetic problems of the electrically large targets on super computer. It is well known that the more the number of processors the less computing time consumed. Nevertheless, with the same number of processors, computing efficiency is affected by the scheme of the MPI virtual topology. Then, the influence of different virtual topology schemes on parallel performance of parallel FDTD is studied in detail. The general rules are presented on how to obtain the highest efficiency of parallel FDTD algorithm by optimizing MPI virtual topology. To show the validity of the presented method, several numerical results are given in the later part. Various comparisons are made and some useful conclusions are summarized.

  17. Masterless Distributed Computing Over Mobile Devices

    Science.gov (United States)

    2012-09-01

    SCIENCE IN COMPUTER SCIENCE and MASTER OF SCIENCE IN APPLIED MATHEMATICS from the NAVAL POSTGRADUATE SCHOOL September 2012 Author...of Computer Science Carlos Borges Chair, Department of Applied Mathematics iv THIS PAGE INTENTIONALLY LEFT BLANK v ABSTRACT It is obvious...vol. 51, no. 1, pp. 107–113, 2008. [9] C. Shi, V. Lakafosis, and M. Ammar, “ Serendipity : A distributed computing platform for disruption tolerant

  18. Grand Challenges 1993: High Performance Computing and Communications. A Report by the Committee on Physical, Mathematical, and Engineering Sciences. The FY 1993 U.S. Research and Development Program.

    Science.gov (United States)

    Office of Science and Technology Policy, Washington, DC.

    This report presents the United States research and development program for 1993 for high performance computing and computer communications (HPCC) networks. The first of four chapters presents the program goals and an overview of the federal government's emphasis on high performance computing as an important factor in the nation's scientific and…

  19. Fitting adsorption isotherms to the distribution data determined using packed micro-columns for high-performance liquid chromatography.

    Science.gov (United States)

    Jandera, P; Bunceková, S; Mihlbachler, K; Guiochon, G; Backovská, V; Planeta, J

    2001-08-03

    Knowing the adsorption isotherms of the components of a mixture on the chromatographic system used to separate them is necessary for a better understanding of the separation process and for the optimization of the production rate and costs in preparative high-performance liquid chromatography (HPLC). Currently, adsorption isotherms are usually measured by frontal analysis, using conventional analytical columns. Unfortunately, this approach requires relatively large quantities of pure compounds, and hence is expensive, especially in the case of pure enantiomers. In this work, we investigated the possible use of packed micro-bore and capillary HPLC columns for the determination of adsorption isotherms of benzophenone, o-cresol and phenol in reversed-phase systems and of the enantiomers of mandelic acid on a Teicoplanin chiral stationary phase. We found a reasonable agreement between the isotherm coefficients of the model compounds determined on micro-columns and on conventional analytical columns packed with the same material. Both frontal analysis and perturbation techniques could be used for this determination. The consumption of pure compounds needed to determine the isotherms decreases proportionally to the second power of the decrease in the column inner diameter, i.e. 10 times for a micro-bore column (1 mm I.D.) and 100 times for capillary columns (0.32 mm I.D.) with respect to 3.3 mm I.D. conventional columns.

  20. Distributed fiber optic sensor-enhanced detection and prediction of shrinkage-induced delamination of ultra-high-performance concrete overlay

    Science.gov (United States)

    Bao, Yi; Valipour, Mahdi; Meng, Weina; Khayat, Kamal H.; Chen, Genda

    2017-08-01

    This study develops a delamination detection system for smart ultra-high-performance concrete (UHPC) overlays using a fully distributed fiber optic sensor. Three 450 mm (length) × 200 mm (width) × 25 mm (thickness) UHPC overlays were cast over an existing 200 mm thick concrete substrate. The initiation and propagation of delamination due to early-age shrinkage of the UHPC overlay were detected as sudden increases and their extension in spatial distribution of shrinkage-induced strains measured from the sensor based on pulse pre-pump Brillouin optical time domain analysis. The distributed sensor is demonstrated effective in detecting delamination openings from microns to hundreds of microns. A three-dimensional finite element model with experimental material properties is proposed to understand the complete delamination process measured from the distributed sensor. The model is validated using the distributed sensor data. The finite element model with cohesive elements for the overlay-substrate interface can predict the complete delamination process.