high-performance computing systems: Topics by WorldWideScience.org

Sample records for high-performance computing systems

Quantum Accelerators for High-performance Computing Systems

Energy Technology Data Exchange (ETDEWEB)

Humble, Travis S. [ORNL]; Britt, Keith A. [ORNL]; Mohiyaddin, Fahd A. [ORNL]

2017-11-01

We define some of the programming and system-level challenges facing the application of quantum processing to high-performance computing. Alongside barriers to physical integration, prominent differences in the execution of quantum and conventional programs challenge the intersection of these computational models. Following a brief overview of the state of the art, we discuss recent advances in programming and execution models for hybrid quantum-classical computing. We discuss a novel quantum-accelerator framework that uses specialized kernels to offload select workloads while integrating with existing computing infrastructure. We elaborate on the role of the host operating system in managing these unique accelerator resources, the prospects for deploying quantum modules, and the requirements placed on the language hierarchy connecting these different system components. We draw on recent advances in the modeling and simulation of quantum computing systems with the development of architectures for hybrid high-performance computing systems and the realization of software stacks for controlling quantum devices. Finally, we present simulation results that describe the expected system-level behavior of high-performance computing systems composed from compute nodes with quantum processing units. We describe performance for these hybrid systems in terms of time-to-solution, accuracy, and energy consumption, and we use simple application examples to estimate the performance advantage of quantum acceleration.
Optical interconnection networks for high-performance computing systems

International Nuclear Information System (INIS)

Biberman, Aleksandr; Bergman, Keren

2012-01-01

Enabled by silicon photonic technology, optical interconnection networks have the potential to be a key disruptive technology in computing and communication industries. The enduring pursuit of performance gains in computing, combined with stringent power constraints, has fostered the ever-growing computational parallelism associated with chip multiprocessors, memory systems, high-performance computing systems and data centers. Sustaining this growth in parallelism introduces unique challenges for on- and off-chip communications, shifting the focus toward novel and fundamentally different communication approaches. Chip-scale photonic interconnection networks, enabled by high-performance silicon photonic devices, offer unprecedented bandwidth scalability with reduced power consumption. We demonstrate that the silicon photonic platforms have already produced all the high-performance photonic devices required to realize these types of networks. Through extensive empirical characterization in much of our work, we demonstrate the feasibility of waveguides, modulators, switches and photodetectors. We also demonstrate systems that simultaneously combine many functionalities to achieve more complex building blocks. We propose novel silicon photonic devices, subsystems, network topologies and architectures to enable unprecedented performance of these photonic interconnection networks. Furthermore, the advantages of photonic interconnection networks extend far beyond the chip, offering advanced communication environments for memory systems, high-performance computing systems, and data centers. (review article)
Embedded High Performance Scalable Computing Systems

National Research Council Canada - National Science Library

Ngo, David

2003-01-01

The Embedded High Performance Scalable Computing Systems (EHPSCS) program is a cooperative agreement between Sanders, A Lockheed Martin Company and DARPA that ran for three years, from Apr 1995 - Apr 1998...
Monitoring SLAC High Performance UNIX Computing Systems

International Nuclear Information System (INIS)

Lettsome, Annette K.

2005-01-01

Knowledge of the effectiveness and efficiency of computers is important when working with high performance systems. Monitoring such systems makes it possible to foresee problems or system failures. Ganglia is a software system designed for high performance computing systems to retrieve specific monitoring information. An alternative storage facility for Ganglia's collected data is needed since its default storage system, the round-robin database (RRD), struggles with data integrity. The creation of a script-driven MySQL database solves this dilemma. This paper describes the process taken in the creation and implementation of the MySQL database for use by Ganglia. Comparisons between data storage by both databases are made using gnuplot and Ganglia's real-time graphical user interface.
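
As a rough illustration of the storage idea in this abstract, the sketch below keeps every raw monitoring sample in a relational table rather than in a fixed-size round-robin archive. It is not the paper's implementation: Python's built-in sqlite3 module stands in for MySQL so that the example is self-contained, and the table name, column names, host name and helper functions are all hypothetical.

```python
import sqlite3

# Hypothetical schema: one row per (host, metric, timestamp) sample.
# sqlite3 is used here only as a stand-in for the MySQL database in the abstract.
SCHEMA = """
CREATE TABLE IF NOT EXISTS metric_samples (
    host   TEXT    NOT NULL,
    metric TEXT    NOT NULL,
    ts     INTEGER NOT NULL,
    value  REAL    NOT NULL,
    PRIMARY KEY (host, metric, ts)
)
"""


def open_store(path=":memory:"):
    """Open (or create) the sample store and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record_sample(conn, host, metric, ts, value):
    """Store one sample; a repeated (host, metric, ts) key overwrites the old value."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO metric_samples (host, metric, ts, value) "
            "VALUES (?, ?, ?, ?)",
            (host, metric, ts, value),
        )


def samples_between(conn, host, metric, start_ts, end_ts):
    """Return (ts, value) pairs for one host and metric, ordered by time."""
    cur = conn.execute(
        "SELECT ts, value FROM metric_samples "
        "WHERE host = ? AND metric = ? AND ts BETWEEN ? AND ? ORDER BY ts",
        (host, metric, start_ts, end_ts),
    )
    return cur.fetchall()
```

A script that polls the monitoring daemon would call `record_sample` once per reading; the resulting rows can then be exported to a plotting tool such as gnuplot for the kind of comparison the abstract describes.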
Software Systems for High-performance Quantum Computing

Energy Technology Data Exchange (ETDEWEB)

Humble, Travis S [ORNL]; Britt, Keith A [ORNL]

2016-01-01

Quantum computing promises new opportunities for solving hard computational problems, but harnessing this novelty requires breakthrough concepts in the design, operation, and application of computing systems. We define some of the challenges facing the development of quantum computing systems as well as software-based approaches that can be used to overcome these challenges. Following a brief overview of the state of the art, we present quantum programming and execution models, the development of architectures for hybrid high-performance computing systems, and the realization of software stacks for quantum networking. This leads to a discussion of the role that conventional computing plays in the quantum paradigm and how some of the current challenges for exascale computing overlap with those facing quantum computing.
High performance computing on vector systems

CERN Document Server

Roller, Sabine

2008-01-01

Presents the developments in high-performance computing and simulation on modern supercomputer architectures. This book covers trends in hardware and software development in general and specifically the vector-based systems and heterogeneous architectures. It presents innovative fields like coupled multi-physics or multi-scale simulations.
Quantum Accelerators for High-Performance Computing Systems

OpenAIRE

Britt, Keith A.; Mohiyaddin, Fahd A.; Humble, Travis S.

2017-01-01

We define some of the programming and system-level challenges facing the application of quantum processing to high-performance computing. Alongside barriers to physical integration, prominent differences in the execution of quantum and conventional programs challenge the intersection of these computational models. Following a brief overview of the state of the art, we discuss recent advances in programming and execution models for hybrid quantum-classical computing. We discuss a novel quantu...
High Performance Computing in Science and Engineering '15 : Transactions of the High Performance Computing Center

CERN Document Server

Kröner, Dietmar; Resch, Michael

2016-01-01

This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS) in 2015. The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.
High Performance Computing in Science and Engineering '17 : Transactions of the High Performance Computing Center

CERN Document Server

Kröner, Dietmar; Resch, Michael; HLRS 2017

2018-01-01

This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS) in 2017. The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.
DOE research in utilization of high-performance computers

International Nuclear Information System (INIS)

Buzbee, B.L.; Worlton, W.J.; Michael, G.; Rodrigue, G.

1980-12-01

Department of Energy (DOE) and other Government research laboratories depend on high-performance computer systems to accomplish their programmatic goals. As the most powerful computer systems become available, they are acquired by these laboratories so that advances can be made in their disciplines. These advances are often the result of added sophistication to numerical models whose execution is made possible by high-performance computer systems. However, high-performance computer systems have become increasingly complex; consequently, it has become increasingly difficult to realize their potential performance. The result is a need for research on issues related to the utilization of these systems. This report gives a brief description of high-performance computers, and then addresses the use of and future needs for high-performance computers within DOE, the growing complexity of applications within DOE, and areas of high-performance computer systems warranting research. 1 figure
A Heterogeneous High-Performance System for Computational and Computer Science

Science.gov (United States)

2016-11-15

expand the research infrastructure at the institution but also to enhance the high-performance computing training provided to both undergraduate and... cloud computing, supercomputing, and the availability of cheap memory and storage led to enormous amounts of data to be sifted through in forensic... High-Performance Computing (HPC) tools that can be integrated with existing curricula and support our research to modernize and dramatically advance
High Performance Computing in Science and Engineering '16 : Transactions of the High Performance Computing Center, Stuttgart (HLRS) 2016

CERN Document Server

Kröner, Dietmar; Resch, Michael

2016-01-01

This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS) in 2016. The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.
NCI's High Performance Computing (HPC) and High Performance Data (HPD) Computing Platform for Environmental and Earth System Data Science

Science.gov (United States)

Evans, Ben; Allen, Chris; Antony, Joseph; Bastrakova, Irina; Gohar, Kashif; Porter, David; Pugh, Tim; Santana, Fabiana; Smillie, Jon; Trenham, Claire; Wang, Jingbo; Wyborn, Lesley

2015-04-01

The National Computational Infrastructure (NCI) has established a powerful and flexible in-situ petascale computational environment to enable both high performance computing and Data-intensive Science across a wide spectrum of national environmental and earth science data collections - in particular climate, observational data and geoscientific assets. This paper examines 1) the computational environments that support the modelling and data processing pipelines, 2) the analysis environments and methods to support data analysis, and 3) the progress so far to harmonise the underlying data collections for future interdisciplinary research across these large volume data collections. NCI has established 10+ PBytes of major national and international data collections from both the government and research sectors based on six themes: 1) weather, climate, and earth system science model simulations, 2) marine and earth observations, 3) geosciences, 4) terrestrial ecosystems, 5) water and hydrology, and 6) astronomy, social and biosciences. Collectively they span the lithosphere, crust, biosphere, hydrosphere, troposphere, and stratosphere. The data is largely sourced from NCI's partners (which include the custodians of many of the major Australian national-scale scientific collections), leading research communities, and collaborating overseas organisations. New infrastructures created at NCI mean the data collections are now accessible within an integrated High Performance Computing and Data (HPC-HPD) environment - a 1.2 PFlop supercomputer (Raijin), an HPC-class 3000-core OpenStack cloud system and several highly connected large-scale high-bandwidth Lustre filesystems. The hardware was designed at inception to ensure that it would allow the layered software environment to flexibly accommodate the advancement of future data science. New approaches to software technology and data models have also had to be developed to enable access to these large and exponentially
High Performance Computing in Science and Engineering '02 : Transactions of the High Performance Computing Center

CERN Document Server

Jäger, Willi

2003-01-01

This book presents the state-of-the-art in modeling and simulation on supercomputers. Leading German research groups present their results achieved on high-end systems of the High Performance Computing Center Stuttgart (HLRS) for the year 2002. Reports cover all fields of supercomputing simulation ranging from computational fluid dynamics to computer science. Special emphasis is given to industrially relevant applications. Moreover, by presenting results for both vector systems and micro-processor based systems, the book allows the performance levels and usability of a variety of supercomputer architectures to be compared. It therefore becomes an indispensable guidebook to assess the impact of the Japanese Earth Simulator project on supercomputing in the years to come.
Department of Energy research in utilization of high-performance computers

International Nuclear Information System (INIS)

Buzbee, B.L.; Worlton, W.J.; Michael, G.; Rodrigue, G.

1980-08-01

Department of Energy (DOE) and other Government research laboratories depend on high-performance computer systems to accomplish their programmatic goals. As the most powerful computer systems become available, they are acquired by these laboratories so that advances can be made in their disciplines. These advances are often the result of added sophistication to numerical models, the execution of which is made possible by high-performance computer systems. However, high-performance computer systems have become increasingly complex, and consequently it has become increasingly difficult to realize their potential performance. The result is a need for research on issues related to the utilization of these systems. This report gives a brief description of high-performance computers, and then addresses the use of and future needs for high-performance computers within DOE, the growing complexity of applications within DOE, and areas of high-performance computer systems warranting research. 1 figure
High performance computing system in the framework of the Higgs boson studies

CERN Document Server

Belyaev, Nikita; The ATLAS collaboration

2017-01-01

Higgs boson physics is one of the most important and promising fields of study in modern High Energy Physics. To perform precision measurements of the Higgs boson properties, fast and efficient instruments of Monte Carlo event simulation are required. Due to the increasing amount of data and the growing complexity of the simulation software tools, the computing resources currently available for Monte Carlo simulation on the LHC GRID are not sufficient. One possibility for addressing this shortfall is the use of institutes' computer clusters, commercial computing resources and supercomputers. In this paper, a brief description of Higgs boson physics and of the Monte Carlo generation and event simulation techniques is presented. A description of modern high performance computing systems and tests of their performance are also discussed. These studies have been performed on the Worldwide LHC Computing Grid and Kurchatov Institute Data Processing Center, including Tier...
The high performance cluster computing system for BES offline data analysis

International Nuclear Information System (INIS)

Sun Yongzhao; Xu Dong; Zhang Shaoqiang; Yang Ting

2004-01-01

A high performance cluster computing system (EPCfarm), which is used for BES offline data analysis, is introduced. The setup and the characteristics of the hardware and software of EPCfarm are described. PBS, a queue management package, and the performance of EPCfarm are also presented. (authors)
High performance systems

Energy Technology Data Exchange (ETDEWEB)

Vigil, M.B. [comp.

1995-03-01

This document provides a written compilation of the presentations and viewgraphs from the 1994 Conference on High Speed Computing, "High Performance Systems," held at Gleneden Beach, Oregon, on April 18 through 21, 1994.
High Performance Computing in Science and Engineering '14

CERN Document Server

Kröner, Dietmar; Resch, Michael

2015-01-01

This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS). The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.
High performance parallel computers for science

International Nuclear Information System (INIS)

Nash, T.; Areti, H.; Atac, R.; Biel, J.; Cook, A.; Deppe, J.; Edel, M.; Fischler, M.; Gaines, I.; Hance, R.

1989-01-01

This paper reports that Fermilab's Advanced Computer Program (ACP) has been developing cost effective, yet practical, parallel computers for high energy physics since 1984. The ACP's latest developments are proceeding in two directions. A Second Generation ACP Multiprocessor System for experiments will include $3500 RISC processors each with performance over 15 VAX MIPS. To support such high performance, the new system allows parallel I/O, parallel interprocess communication, and parallel host processes. The ACP Multi-Array Processor has been developed for theoretical physics. Each $4000 node is a FORTRAN or C programmable pipelined 20 Mflops (peak), 10 MByte single board computer. These are plugged into a 16 port crossbar switch crate which handles both inter and intra crate communication. The crates are connected in a hypercube. Site oriented applications like lattice gauge theory are supported by system software called CANOPY, which makes the hardware virtually transparent to users. A 256 node, 5 GFlop system is under construction.
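
The quoted "256 node, 5 GFlop" figure can be checked against the per-node numbers in the abstract. The short calculation below is only a consistency check, assuming that the aggregate peak is the plain sum of node peaks and that only node boards are counted in the cost (crates, switches and host systems are left out because the abstract gives no prices for them).

```python
# Per-node figures quoted in the abstract for the ACP Multi-Array Processor.
NODES = 256
PEAK_MFLOPS_PER_NODE = 20
MEMORY_MBYTE_PER_NODE = 10
COST_PER_NODE_USD = 4000

# Aggregate peak, assuming node peaks simply add up.
peak_gflops = NODES * PEAK_MFLOPS_PER_NODE / 1000

# Total memory and node-only hardware cost (assumption: nodes only).
total_memory_mbyte = NODES * MEMORY_MBYTE_PER_NODE
node_cost_usd = NODES * COST_PER_NODE_USD

# Price per unit of peak performance at the node level.
usd_per_peak_mflops = COST_PER_NODE_USD / PEAK_MFLOPS_PER_NODE

print(f"peak: {peak_gflops:.2f} GFlop")
print(f"memory: {total_memory_mbyte} MByte")
print(f"node cost: ${node_cost_usd:,}")
print(f"${usd_per_peak_mflops:.0f} per peak Mflops")
```

Under these assumptions the peak comes to 5.12 GFlop, in line with the rounded 5 GFlop in the abstract, with 2560 MByte of memory and about $1.02 million in node boards.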

Contemporary high performance computing from petascale toward exascale

CERN Document Server

Vetter, Jeffrey S

2013-01-01

Contemporary High Performance Computing: From Petascale toward Exascale focuses on the ecosystems surrounding the world's leading centers for high performance computing (HPC). It covers many of the important factors involved in each ecosystem: computer architectures, software, applications, facilities, and sponsors. The first part of the book examines significant trends in HPC systems, including computer architectures, applications, performance, and software. It discusses the growth from terascale to petascale computing and the influence of the TOP500 and Green500 lists. The second part of the
A checkpoint compression study for high-performance computing systems

Energy Technology Data Exchange (ETDEWEB)

Ibtesham, Dewan [Univ. of New Mexico, Albuquerque, NM (United States). Dept. of Computer Science; Ferreira, Kurt B. [Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States). Scalable System Software Dept.; Arnold, Dorian [Univ. of New Mexico, Albuquerque, NM (United States). Dept. of Computer Science

2015-02-17

As high-performance computing systems continue to increase in size and complexity, higher failure rates and increased overheads for checkpoint/restart (CR) protocols have raised concerns about the practical viability of CR protocols for future systems. Previously, compression has proven to be a viable approach for reducing checkpoint data volumes and, thereby, reducing CR protocol overhead leading to improved application performance. In this article, we further explore compression-based CR optimization by exploring its baseline performance and scaling properties, evaluating whether improved compression algorithms might lead to even better application performance and comparing checkpoint compression against and alongside other software- and hardware-based optimizations. The highlights of our results are: (1) compression is a very viable CR optimization; (2) generic, text-based compression algorithms appear to perform near optimally for checkpoint data compression and faster compression algorithms will not lead to better application performance; (3) compression-based optimizations fare well against and alongside other software-based optimizations; and (4) while hardware-based optimizations outperform software-based ones, they are not as cost effective.
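
The trade-off behind checkpoint compression can be sketched with a simple break-even test: compression helps when the time spent compressing plus the time to write the smaller checkpoint is less than the time to write the raw checkpoint. The model and function names below are illustrative assumptions, not the article's own viability model, and zlib is used only because it ships with Python.

```python
import time
import zlib


def measure_compression(data: bytes, level: int = 6):
    """Compress data with zlib and return (compressed_size, seconds_taken)."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(compressed), elapsed


def commit_time(size_bytes: float, bandwidth_bytes_per_s: float,
                compress_seconds: float = 0.0) -> float:
    """Time to commit a checkpoint: optional compression time plus write time."""
    return compress_seconds + size_bytes / bandwidth_bytes_per_s


def compression_pays_off(raw_size: float, compressed_size: float,
                         compress_seconds: float,
                         bandwidth_bytes_per_s: float) -> bool:
    """True when compress-then-write beats writing the raw checkpoint.

    Assumed model: compression and writing do not overlap, and storage
    bandwidth is the same for both cases.
    """
    plain = commit_time(raw_size, bandwidth_bytes_per_s)
    packed = commit_time(compressed_size, bandwidth_bytes_per_s, compress_seconds)
    return packed < plain


if __name__ == "__main__":
    # Synthetic, highly compressible buffer standing in for checkpoint data.
    checkpoint = bytes(1_000_000)
    size, seconds = measure_compression(checkpoint)
    # Hypothetical 100 MB/s share of storage bandwidth for this process.
    print(compression_pays_off(len(checkpoint), size, seconds, 100e6))
```

Real checkpoint data compresses far less well than the zero-filled buffer used here, so the measured sizes and timings should come from actual application checkpoints before drawing conclusions.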
High performance computing in power and energy systems

Energy Technology Data Exchange (ETDEWEB)

Khaitan, Siddhartha Kumar [Iowa State Univ., Ames, IA (United States)]; Gupta, Anshul (eds.) [IBM Watson Research Center, Yorktown Heights, NY (United States)]

2013-07-01

The twin challenge of meeting global energy demands in the face of growing economies and populations and restricting greenhouse gas emissions is one of the most daunting ones that humanity has ever faced. Smart electrical generation and distribution infrastructure will play a crucial role in meeting these challenges. We would need to develop capabilities to handle large volumes of data generated by power system components like PMUs, DFRs and other data acquisition devices, as well as the capacity to process these data at high resolution via multi-scale and multi-period simulations, cascading and security analysis, interaction between hybrid systems (electric, transport, gas, oil, coal, etc.) and so on, to get meaningful information in real time to ensure a secure, reliable and stable power system grid. Advanced research on development and implementation of market-ready leading-edge high-speed enabling technologies and algorithms for solving real-time, dynamic, resource-critical problems will be required for dynamic security analysis targeted towards successful implementation of Smart Grid initiatives. This book aims to bring together some of the latest research developments as well as thoughts on the future research directions of high performance computing applications in electric power systems planning, operations, security, markets, and grid integration of alternate sources of energy, etc.
High-Performance Operating Systems

DEFF Research Database (Denmark)

Sharp, Robin

1999-01-01

Notes prepared for the DTU course 49421 "High Performance Operating Systems". The notes deal with quantitative and qualitative techniques for use in the design and evaluation of operating systems in computer systems for which performance is an important parameter, such as real-time applications, communication systems and multimedia systems.
High-performance computing using FPGAs

CERN Document Server

Benkrid, Khaled

2013-01-01

This book is concerned with the emerging field of High Performance Reconfigurable Computing (HPRC), which aims to harness the high performance and relatively low power of reconfigurable hardware, in the form of Field Programmable Gate Arrays (FPGAs), in High Performance Computing (HPC) applications. It presents the latest developments in this field from applications, architecture, and tools and methodologies points of view. We hope that this work will form a reference for existing researchers in the field, and entice new researchers and developers to join the HPRC community. The book includes: Thirteen application chapters which present the most important application areas tackled by high performance reconfigurable computers, namely: financial computing, bioinformatics and computational biology, data search and processing, stencil computation e.g. computational fluid dynamics and seismic modeling, cryptanalysis, astronomical N-body simulation, and circuit simulation. Seven architecture chapters which...
Implementing an Affordable High-Performance Computing for Teaching-Oriented Computer Science Curriculum

Science.gov (United States)

Abuzaghleh, Omar; Goldschmidt, Kathleen; Elleithy, Yasser; Lee, Jeongkyu

2013-01-01

With the advances in computing power, high-performance computing (HPC) platforms have had an impact on not only scientific research in advanced organizations but also computer science curriculum in the educational community. For example, multicore programming and parallel systems are highly desired courses in the computer science major. However,…
High performance computing in linear control

International Nuclear Information System (INIS)

Datta, B.N.

1993-01-01

Remarkable progress has been made in both theory and applications of all important areas of control. The theory is rich and very sophisticated. Some beautiful applications of control theory are presently being made in aerospace, biomedical engineering, industrial engineering, robotics, economics, power systems, etc. Unfortunately, the same assessment of progress does not hold in general for computations in control theory. Control Theory is lagging behind other areas of science and engineering in this respect. Nowadays there is a revolution going on in the world of high performance scientific computing. Many powerful computers with vector and parallel processing have been built and have been available in recent years. These supercomputers offer very high speed in computations. Highly efficient software, based on powerful algorithms, has been developed to use on these advanced computers, and has also contributed to increased performance. While workers in many areas of science and engineering have taken great advantage of these hardware and software developments, control scientists and engineers, unfortunately, have not been able to take much advantage of these developments
Micromagnetics on high-performance workstation and mobile computational platforms

Science.gov (United States)

Fu, S.; Chang, R.; Couture, S.; Menarini, M.; Escobar, M. A.; Kuteifan, M.; Lubarda, M.; Gabay, D.; Lomakin, V.

2015-05-01

The feasibility of using high-performance desktop and embedded mobile computational platforms is presented, including multi-core Intel central processing units, Nvidia desktop graphics processing units, and the Nvidia Jetson TK1 platform. The FastMag finite element method-based micromagnetic simulator is used as a testbed, showing high efficiency on all the platforms. Optimization aspects of improving the performance of the mobile systems are discussed. The high performance, low cost, low power consumption, and rapid performance increase of the embedded mobile systems make them a promising candidate for micromagnetic simulations. Such architectures can be used as standalone systems or can be built as low-power computing clusters.
High Performance Computing in Science and Engineering '99 : Transactions of the High Performance Computing Center

CERN Document Server

Jäger, Willi

2000-01-01

The book contains reports about the most significant projects from science and engineering of the Federal High Performance Computing Center Stuttgart (HLRS). They were carefully selected in a peer-review process and are showcases of an innovative combination of state-of-the-art modeling, novel algorithms and the use of leading-edge parallel computer technology. The projects of HLRS are using supercomputer systems operated jointly by university and industry and therefore a special emphasis has been put on the industrial relevance of results and methods.
Software Applications on the Peregrine System | High-Performance Computing

Science.gov (United States)

Algebraic Modeling System (GAMS) Statistics and analysis High-level modeling system for mathematical reactivity. Gurobi Optimizer Statistics and analysis Solver for mathematical programming LAMMPS Chemistry and , reactivities, and vibrational, electronic and NMR spectra. R Statistical Computing Environment Statistics and
Computational Biology and High Performance Computing 2000

Energy Technology Data Exchange (ETDEWEB)

Simon, Horst D.; Zorn, Manfred D.; Spengler, Sylvia J.; Shoichet, Brian K.; Stewart, Craig; Dubchak, Inna L.; Arkin, Adam P.

2000-10-19

The pace of extraordinary advances in molecular biology has accelerated in the past decade due in large part to discoveries coming from genome projects on human and model organisms. The advances in the genome project so far, happening well ahead of schedule and under budget, have exceeded any dreams by its protagonists, let alone formal expectations. Biologists expect the next phase of the genome project to be even more startling in terms of dramatic breakthroughs in our understanding of human biology, the biology of health and of disease. Only today can biologists begin to envision the necessary experimental, computational and theoretical steps necessary to exploit genome sequence information for its medical impact, its contribution to biotechnology and economic competitiveness, and its ultimate contribution to environmental quality. High performance computing has become one of the critical enabling technologies, which will help to translate this vision of future advances in biology into reality. Biologists are increasingly becoming aware of the potential of high performance computing. The goal of this tutorial is to introduce the exciting new developments in computational biology and genomics to the high performance computing community.
Contemporary high performance computing from petascale toward exascale

CERN Document Server

Vetter, Jeffrey S

2015-01-01

A continuation of Contemporary High Performance Computing: From Petascale toward Exascale, this second volume continues the discussion of HPC flagship systems, major application workloads, facilities, and sponsors. The book includes figures and pictures that capture the state of existing systems: pictures of buildings, systems in production, floorplans, and many block diagrams and charts to illustrate system design and performance.
High Performance Networks From Supercomputing to Cloud Computing

CERN Document Server

Abts, Dennis

2011-01-01

Datacenter networks provide the communication substrate for large parallel computer systems that form the ecosystem for high performance computing (HPC) systems and modern Internet applications. The design of new datacenter networks is motivated by an array of applications ranging from communication intensive climatology, complex material simulations and molecular dynamics to such Internet applications as Web search, language translation, collaborative Internet applications, streaming video and voice-over-IP. For both Supercomputing and Cloud Computing the network enables distributed applicati
Enabling High-Performance Computing as a Service

KAUST Repository

AbdelBaky, Moustafa; Parashar, Manish; Kim, Hyunjoo; Jordan, Kirk E.; Sachdeva, Vipin; Sexton, James; Jamjoom, Hani; Shae, Zon-Yin; Pencheva, Gergina; Tavakoli, Reza; Wheeler, Mary F.

2012-01-01

With the right software infrastructure, clouds can provide scientists with as-a-service access to high-performance computing resources. An award-winning prototype framework transforms the Blue Gene/P system into an elastic cloud to run a
Peregrine System | High-Performance Computing | NREL

Science.gov (United States)

classes of nodes that users access: Login Nodes Peregrine has four login nodes, each of which has Intel E5 /scratch file systems, the /mss file system is mounted on all login nodes. Compute Nodes Peregrine has 2592
COMPUTERS: Teraflops for Europe; EEC Working Group on High Performance Computing

Energy Technology Data Exchange (ETDEWEB)

Anon.

1991-03-15

In little more than a decade, simulation on high performance computers has become an essential tool for theoretical physics, capable of solving a vast range of crucial problems inaccessible to conventional analytic mathematics. In many ways, computer simulation has become the calculus for interacting many-body systems, a key to the study of transitions from isolated to collective behaviour.
COMPUTERS: Teraflops for Europe; EEC Working Group on High Performance Computing

International Nuclear Information System (INIS)

Anon.

1991-01-01

In little more than a decade, simulation on high performance computers has become an essential tool for theoretical physics, capable of solving a vast range of crucial problems inaccessible to conventional analytic mathematics. In many ways, computer simulation has become the calculus for interacting many-body systems, a key to the study of transitions from isolated to collective behaviour
High Performance Computing Operations Review Report

Energy Technology Data Exchange (ETDEWEB)

Cupps, Kimberly C. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

2013-12-19

The High Performance Computing Operations Review (HPCOR) meeting—requested by the ASC and ASCR program headquarters at DOE—was held November 5 and 6, 2013, at the Marriott Hotel in San Francisco, CA. The purpose of the review was to discuss the processes and practices for HPC integration and its related software and facilities. Experiences and lessons learned from the most recent systems deployed were covered in order to benefit the deployment of new systems.
Enabling High-Performance Computing as a Service

KAUST Repository

AbdelBaky, Moustafa

2012-10-01

With the right software infrastructure, clouds can provide scientists with as-a-service access to high-performance computing resources. An award-winning prototype framework transforms the Blue Gene/P system into an elastic cloud to run a representative HPC application. © 2012 IEEE.
High-performance computing — an overview

Science.gov (United States)

Marksteiner, Peter

1996-08-01

An overview of high-performance computing (HPC) is given. Different types of computer architectures used in HPC are discussed: vector supercomputers, high-performance RISC processors, various parallel computers like symmetric multiprocessors, workstation clusters, massively parallel processors. Software tools and programming techniques used in HPC are reviewed: vectorizing compilers, optimization and vector tuning, optimization for RISC processors; parallel programming techniques like shared-memory parallelism, message passing and data parallelism; and numerical libraries.

High performance computing system in the framework of the Higgs boson studies

CERN Document Server

Belyaev, Nikita; The ATLAS collaboration; Velikhov, Vasily; Konoplich, Rostislav

2017-01-01

Higgs boson physics is one of the most important and promising fields of study in modern high energy physics. It is important to note that GRID computing resources are becoming strictly limited because of the increasing amount of statistics required for physics analyses and the unprecedented LHC performance. One way to address the shortfall of computing resources is to use institute computing clusters, commercial computing resources and supercomputers. To perform precision measurements of the Higgs boson properties under these conditions, effective instruments for simulating the kinematic distributions of signal events are also required. In this talk we give a brief description of the modern distribution reconstruction method called Morphing and perform a few efficiency tests to demonstrate its potential. These studies have been performed on the WLCG and the Kurchatov Institute’s Data Processing Center, including a Tier-1 GRID site and a supercomputer. We also analyze the CPU efficienc...
High-performance computing in seismology

Energy Technology Data Exchange (ETDEWEB)

NONE

1996-09-01

The scientific, technical, and economic importance of the issues discussed here presents a clear agenda for future research in computational seismology. In this way these problems will drive advances in high-performance computing in the field of seismology. There is a broad community that will benefit from this work, including the petroleum industry, research geophysicists, engineers concerned with seismic hazard mitigation, and governments charged with enforcing a comprehensive test ban treaty. These advances may also lead to new applications for seismological research. The recent application of high-resolution seismic imaging of the shallow subsurface for the environmental remediation industry is an example of this activity. This report makes the following recommendations: (1) focused efforts to develop validated documented software for seismological computations should be supported, with special emphasis on scalable algorithms for parallel processors; (2) the education of seismologists in high-performance computing technologies and methodologies should be improved; (3) collaborations between seismologists and computational scientists and engineers should be increased; (4) the infrastructure for archiving, disseminating, and processing large volumes of seismological data should be improved.
High performance parallel computers for science: New developments at the Fermilab advanced computer program

International Nuclear Information System (INIS)

Nash, T.; Areti, H.; Atac, R.

1988-08-01

Fermilab's Advanced Computer Program (ACP) has been developing highly cost-effective, yet practical, parallel computers for high energy physics since 1984. The ACP's latest developments are proceeding in two directions. A Second Generation ACP Multiprocessor System for experiments will include $3500 RISC processors, each with performance over 15 VAX MIPS. To support such high performance, the new system allows parallel I/O, parallel interprocess communication, and parallel host processes. The ACP Multi-Array Processor has been developed for theoretical physics. Each $4000 node is a FORTRAN- or C-programmable, pipelined, 20 MFlops (peak), 10 MByte single-board computer. These are plugged into a 16-port crossbar switch crate which handles both inter- and intra-crate communication. The crates are connected in a hypercube. Site-oriented applications like lattice gauge theory are supported by system software called CANOPY, which makes the hardware virtually transparent to users. A 256-node, 5 GFlop system is under construction. 10 refs., 7 figs
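
The quoted 5 GFlop figure can be checked against the per-node numbers in the abstract. The sketch below is only an arithmetic illustration: the function name is hypothetical, and it assumes the aggregate peak is the plain sum of node peaks, which the abstract does not state explicitly.

```python
def aggregate_peak_gflops(nodes, peak_mflops_per_node):
    """Sum of per-node peak rates, converted from MFlops to GFlops."""
    return nodes * peak_mflops_per_node / 1000.0


# Figures taken from the abstract: 256 nodes, 20 MFlops (peak), $4000 each.
peak = aggregate_peak_gflops(nodes=256, peak_mflops_per_node=20)
node_cost_dollars = 256 * 4000  # node hardware only; crates and switches excluded

print(f"aggregate peak: {peak:.2f} GFlops")
print(f"node cost: ${node_cost_dollars:,}")
```

Under that assumption the aggregate peak is 5.12 GFlops, consistent with the rounded "5 GFlop" in the record.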
High performance computing in Windows Azure cloud

OpenAIRE

Ambruš, Dejan

2013-01-01

High performance, security, availability, scalability, flexibility and lower maintenance costs have contributed substantially to the growing popularity of cloud computing in all spheres of life, especially in business. In fact, cloud computing offers even more than this. With the use of virtual computing clusters, a runtime environment for high performance computing can also be implemented efficiently in a cloud. There are many advantages but also some disadvantages of cloud computing, some ...
Evaluation of high-performance computing software

Energy Technology Data Exchange (ETDEWEB)

Browne, S.; Dongarra, J. [Univ. of Tennessee, Knoxville, TN (United States); Rowan, T. [Oak Ridge National Lab., TN (United States)

1996-12-31

The absence of unbiased and up-to-date comparative evaluations of high-performance computing software complicates a user's search for the appropriate software package. The National HPCC Software Exchange (NHSE) is attacking this problem using an approach that includes independent evaluations of software, incorporation of author and user feedback into the evaluations, and Web access to the evaluations. We are applying this approach to the Parallel Tools Library (PTLIB), a new software repository for parallel systems software and tools, and HPC-Netlib, a high performance branch of the Netlib mathematical software repository. Updating the evaluations with feedback and making them available via the Web helps ensure accuracy and timeliness, and using independent reviewers produces unbiased comparative evaluations that are difficult to find elsewhere.
A high performance scientific cloud computing environment for materials simulations

OpenAIRE

Jorissen, Kevin; Vila, Fernando D.; Rehr, John J.

2011-01-01

We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including...
Computer Simulation Performed for Columbia Project Cooling System

Science.gov (United States)

Ahmad, Jasim

2005-01-01

This demo shows a high-fidelity simulation of the air flow in the main computer room housing the Columbia (10,024 Intel Itanium processors) system. The simulation assessed the performance of the cooling system, identified deficiencies, and recommended modifications to eliminate them. It used two in-house software packages on NAS supercomputers: Chimera Grid Tools to generate a geometric model of the computer room, and the OVERFLOW-2 code for fluid and thermal simulation. This state-of-the-art technology can be easily extended to provide a general capability for air flow analyses of any modern computer room.
High performance computing and communications: Advancing the frontiers of information technology

Energy Technology Data Exchange (ETDEWEB)

NONE

1997-12-31

This report, which supplements the President's Fiscal Year 1997 Budget, describes the interagency High Performance Computing and Communications (HPCC) Program. The HPCC Program will celebrate its fifth anniversary in October 1996 with an impressive array of accomplishments to its credit. Over its five-year history, the HPCC Program has focused on developing high performance computing and communications technologies that can be applied to computation-intensive applications. Major highlights for FY 1996: (1) High performance computing systems enable practical solutions to complex problems with accuracies not possible five years ago; (2) HPCC-funded research in very large scale networking techniques has been instrumental in the evolution of the Internet, which continues exponential growth in size, speed, and availability of information; (3) The combination of hardware capability measured in gigaflop/s, networking technology measured in gigabit/s, and new computational science techniques for modeling phenomena has demonstrated that very large scale accurate scientific calculations can be executed across heterogeneous parallel processing systems located thousands of miles apart; (4) Federal investments in HPCC software R&D support researchers who pioneered the development of parallel languages and compilers, high performance mathematical, engineering, and scientific libraries, and software tools--technologies that allow scientists to use powerful parallel systems to focus on Federal agency mission applications; and (5) HPCC support for virtual environments has enabled the development of immersive technologies, where researchers can explore and manipulate multi-dimensional scientific and engineering problems. Educational programs fostered by the HPCC Program have brought into classrooms new science and engineering curricula designed to teach computational science. This document contains a small sample of the significant HPCC Program accomplishments in FY 1996.
A High Performance VLSI Computer Architecture For Computer Graphics

Science.gov (United States)

Chin, Chi-Yuan; Lin, Wen-Tai

1988-10-01

A VLSI computer architecture consisting of multiple processors is presented in this paper to satisfy modern computer graphics demands, e.g. high resolution, realistic animation, real-time display, etc. All processors share a global memory which is partitioned into multiple banks. Through a crossbar network, data from one memory bank can be broadcast to many processors. Processors are physically interconnected through a hyper-crossbar network (a crossbar-like network). By programming the network, the topology of communication links among processors can be reconfigured to satisfy the specific dataflows of different applications. Each processor consists of a controller, arithmetic operators, local memory, a local crossbar network, and I/O ports to communicate with other processors, memory banks, and a system controller. Operations in each processor are divided into two modes, i.e. object domain and space domain, to fully utilize the data-independency characteristics of graphics processing. Special graphics features such as 3D-to-2D conversion, shadow generation, texturing, and reflection can be easily handled. With the current high density interconnection (MI) technology, it is feasible to implement a 64-processor system to achieve 2.5 billion operations per second, a performance needed in most advanced graphics applications.
Scalability of DL_POLY on High Performance Computing Platform

Directory of Open Access Journals (Sweden)

Mabule Samuel Mabakane

2017-12-01

This paper presents a case study on the scalability of several versions of the molecular dynamics code (DL_POLY) performed on South Africa's Centre for High Performance Computing e1350 IBM Linux cluster, Sun system and Lengau supercomputers. Within this study, different problem sizes were designed and the same chosen systems were employed in order to test the performance of DL_POLY using weak and strong scalability. It was found that the speed-up results for the small systems were better than those for large systems on both Ethernet and InfiniBand networks. However, simulations of large systems in DL_POLY performed well using the InfiniBand network on the Lengau cluster as compared to the e1350 and Sun supercomputers.
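
The strong and weak scalability measures mentioned in this abstract are conventionally computed from wall-clock times. The following is a minimal sketch using the standard textbook definitions; the function names and all timings are hypothetical and are not taken from the paper.

```python
def strong_scaling(t_base, t_n, n_base, n):
    """Speed-up and parallel efficiency for a fixed total problem size."""
    speedup = t_base / t_n
    efficiency = speedup / (n / n_base)
    return speedup, efficiency


def weak_scaling_efficiency(t_base, t_n):
    """Efficiency when the problem size grows in proportion to the core count."""
    return t_base / t_n


# Hypothetical wall-clock times in seconds, not measurements from the paper.
speedup, efficiency = strong_scaling(t_base=100.0, t_n=25.0, n_base=1, n=8)
print(f"strong scaling: speed-up {speedup:.1f}, efficiency {efficiency:.2f}")
print(f"weak scaling efficiency: {weak_scaling_efficiency(50.0, 62.5):.2f}")
```

With these made-up numbers, eight cores give a speed-up of 4 (50% efficiency) in the strong-scaling case, and a run time that grows from 50 s to 62.5 s under weak scaling corresponds to 80% efficiency.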
Computational Environments and Analysis methods available on the NCI High Performance Computing (HPC) and High Performance Data (HPD) Platform

Science.gov (United States)

Evans, B. J. K.; Foster, C.; Minchin, S. A.; Pugh, T.; Lewis, A.; Wyborn, L. A.; Evans, B. J.; Uhlherr, A.

2014-12-01

The National Computational Infrastructure (NCI) has established a powerful in-situ computational environment to enable both high performance computing and data-intensive science across a wide spectrum of national environmental data collections - in particular climate, observational data and geoscientific assets. This paper examines 1) the computational environments that support the modelling and data processing pipelines, 2) the analysis environments and methods to support data analysis, and 3) the progress in addressing harmonisation of the underlying data collections for future transdisciplinary research that enables accurate climate projections. NCI makes available 10+ PB of major data collections from both the government and research sectors based on six themes: 1) weather, climate, and earth system science model simulations, 2) marine and earth observations, 3) geosciences, 4) terrestrial ecosystems, 5) water and hydrology, and 6) astronomy, social and biosciences. Collectively they span the lithosphere, crust, biosphere, hydrosphere, troposphere, and stratosphere. The data is largely sourced from NCI's partners (which include the custodians of many of the national scientific records), major research communities, and collaborating overseas organisations. The data is accessible within an integrated HPC-HPD environment - a 1.2 PFlop supercomputer (Raijin), an HPC-class 3000-core OpenStack cloud system and several highly connected large-scale and high-bandwidth Lustre filesystems. This computational environment supports a catalogue of integrated reusable software and workflows from earth system and ecosystem modelling, weather research, satellite and other observed data processing and analysis. To enable transdisciplinary research on this scale, data needs to be harmonised so that researchers can readily apply techniques and software across the corpus of data available and not be constrained to work within artificial disciplinary boundaries. Future challenges will
High-performance computing for airborne applications

International Nuclear Information System (INIS)

Quinn, Heather M.; Manuzatto, Andrea; Fairbanks, Tom; Dallmann, Nicholas; Desgeorges, Rose

2010-01-01

Recently, there have been attempts to move common satellite tasks to unmanned aerial vehicles (UAVs). UAVs are significantly cheaper to buy than satellites and easier to deploy on an as-needed basis. The more benign radiation environment also allows for an aggressive adoption of state-of-the-art commercial computational devices, which increases the amount of data that can be collected. There are a number of commercial computing devices currently available that are well-suited to high-performance computing. These devices range from specialized computational devices, such as field-programmable gate arrays (FPGAs) and digital signal processors (DSPs), to traditional computing platforms, such as microprocessors. Even though the radiation environment is relatively benign, these devices could be susceptible to single-event effects. In this paper, we will present radiation data for high-performance computing devices in an accelerated neutron environment. These devices include a multi-core digital signal processor, two field-programmable gate arrays, and a microprocessor. From these results, we found that all of these devices are suitable for many airplane environments without reliability problems.
Benchmarking high performance computing architectures with CMS’ skeleton framework

Science.gov (United States)

Sexton-Kennedy, E.; Gartung, P.; Jones, C. D.

2017-10-01

In 2012 CMS evaluated which underlying concurrency technology would be the best to use for its multi-threaded framework. The available technologies were evaluated on the high throughput computing systems dominating the resources in use at that time. A skeleton framework benchmarking suite that emulates the tasks performed within a CMSSW application was used to select Intel’s Threading Building Blocks library, based on the measured overheads in both memory and CPU on the different technologies benchmarked. In 2016 CMS will get access to high performance computing resources that use new many-core architectures; machines such as Cori Phase 1&2, Theta, Mira. Because of this we have revived the 2012 benchmark to test its performance and conclusions on these new architectures. This talk will discuss the results of this exercise.
8th International Workshop on Parallel Tools for High Performance Computing

CERN Document Server

Gracia, José; Knüpfer, Andreas; Resch, Michael; Nagel, Wolfgang

2015-01-01

Numerical simulation and modelling using High Performance Computing has evolved into an established technique in academic and industrial research. At the same time, the High Performance Computing infrastructure is becoming ever more complex. For instance, most of the current top systems around the world use thousands of nodes in which classical CPUs are combined with accelerator cards in order to enhance their compute power and energy efficiency. This complexity can only be mastered with adequate development and optimization tools. Key topics addressed by these tools include parallelization on heterogeneous systems, performance optimization for CPUs and accelerators, debugging of increasingly complex scientific applications, and optimization of energy usage in the spirit of green IT. This book represents the proceedings of the 8th International Parallel Tools Workshop, held October 1-2, 2014 in Stuttgart, Germany – which is a forum to discuss the latest advancements in the parallel tools.
High performance computing in science and engineering Garching/Munich 2016

Energy Technology Data Exchange (ETDEWEB)

Wagner, Siegfried; Bode, Arndt; Bruechle, Helmut; Brehm, Matthias (eds.)

2016-11-01

Computer simulations are the well-established third pillar of natural sciences along with theory and experimentation. Particularly high performance computing is growing fast and constantly demands more and more powerful machines. To keep pace with this development, in spring 2015, the Leibniz Supercomputing Centre installed the high performance computing system SuperMUC Phase 2, only three years after the inauguration of its sibling SuperMUC Phase 1. Thereby, the compute capabilities were more than doubled. This book covers the time-frame June 2014 until June 2016. Readers will find many examples of outstanding research in the more than 130 projects that are covered in this book, with each one of these projects using at least 4 million core-hours on SuperMUC. The largest scientific communities using SuperMUC in the last two years were computational fluid dynamics simulations, chemistry and material sciences, astrophysics, and life sciences.
High Performance Computing - Power Application Programming Interface Specification.

Energy Technology Data Exchange (ETDEWEB)

Laros, James H.; Kelly, Suzanne M.; Pedretti, Kevin; Grant, Ryan; Olivier, Stephen Lecler; Levenhagen, Michael J.; DeBonis, David

2014-08-01

Measuring and controlling the power and energy consumption of high performance computing systems by various components in the software stack is an active research area [13, 3, 5, 10, 4, 21, 19, 16, 7, 17, 20, 18, 11, 1, 6, 14, 12]. Implementations in lower level software layers are beginning to emerge in some production systems, which is very welcome. To be most effective, a portable interface to measurement and control features would significantly facilitate participation by all levels of the software stack. We present a proposal for a standard power Application Programming Interface (API) that endeavors to cover the entire software space, from generic hardware interfaces to the input from the computer facility manager.
High performance parallel computing of flows in complex geometries: II. Applications

International Nuclear Information System (INIS)

Gourdain, N; Gicquel, L; Staffelbach, G; Vermorel, O; Duchaine, F; Boussuge, J-F; Poinsot, T

2009-01-01

Present regulations on pollutant emissions and noise, together with economic constraints, require new approaches and designs in the fields of energy supply and transportation. It is now well established that the next breakthrough will come from a better understanding of unsteady flow effects and from considering the entire system and not only isolated components. However, these aspects are still not well taken into account by the numerical approaches, or well understood, whatever the design stage considered. The main challenge lies in the computational requirements imposed by such complex systems if they are to be simulated on supercomputers. This paper shows how new challenges can be addressed by using parallel computing platforms for distinct elements of a more complex system as encountered in aeronautical applications. Based on numerical simulations performed with modern aerodynamic and reactive flow solvers, this work underlines the interest of high-performance computing for solving flow in complex industrial configurations such as aircraft, combustion chambers and turbomachines. Performance indicators related to parallel computing efficiency are presented, showing that establishing fair criteria is a difficult task for complex industrial applications. Examples of numerical simulations performed in industrial systems are also described, with a particular interest in the computational time and the potential design improvements obtained with high-fidelity and multi-physics computing methods. These simulations use either unsteady Reynolds-averaged Navier-Stokes methods or large eddy simulation and deal with turbulent unsteady flows, such as coupled flow phenomena (thermo-acoustic instabilities, buffet, etc.). Some examples of the difficulties with grid generation and data analysis are also presented when dealing with these complex industrial applications.
A high performance scientific cloud computing environment for materials simulations

Science.gov (United States)

Jorissen, K.; Vila, F. D.; Rehr, J. J.

2012-09-01

We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including tools for execution and monitoring performance, as well as efficient I/O utilities that enable seamless connections to and from the cloud. Our SCC platform is optimized for the Amazon Elastic Compute Cloud (EC2). We present benchmarks for prototypical scientific applications and demonstrate performance comparable to local compute clusters. To facilitate code execution and provide user-friendly access, we have also integrated cloud computing capability in a JAVA-based GUI. Our SCC platform may be an alternative to traditional HPC resources for materials science or quantum chemistry applications.
High-performance mass storage system for workstations

Science.gov (United States)

Chiang, T.; Tang, Y.; Gupta, L.; Cooperman, S.

1993-01-01

Reduced Instruction Set Computer (RISC) workstations and Personal Computers (PCs) are very popular tools for office automation, command and control, scientific analysis, database management, and many other applications. However, when running Input/Output (I/O) intensive applications, the RISC workstations and PCs are often overburdened with the tasks of collecting, staging, storing, and distributing data. Even with standard high-performance peripherals and storage devices, the I/O function can still be a common bottleneck. Therefore, the high-performance mass storage system, developed by Loral AeroSys' Independent Research and Development (IR&D) engineers, can offload I/O related functions from a RISC workstation and provide high-performance I/O functions and external interfaces. The high-performance mass storage system has the capabilities to ingest high-speed real-time data, perform signal or image processing, and stage, archive, and distribute the data. This mass storage system uses a hierarchical storage structure, thus reducing the total data storage cost while maintaining high I/O performance. The high-performance mass storage system is a network of low-cost parallel processors and storage devices. The nodes in the network have special I/O functions such as: SCSI controller, Ethernet controller, gateway controller, RS232 controller, IEEE488 controller, and digital/analog converter. The nodes are interconnected through high-speed direct memory access links to form a network. The topology of the network is easily reconfigurable to maximize system throughput for various applications. This high-performance mass storage system takes advantage of a 'busless' architecture for maximum expandability. The mass storage system consists of magnetic disks, a WORM optical disk jukebox, and an 8mm helical scan tape to form a hierarchical storage structure. Commonly used files are kept on the magnetic disk for fast retrieval. The optical disks are used as archive
Debugging a high performance computing program

Science.gov (United States)

Gooding, Thomas M.

2013-08-20

Methods, apparatus, and computer program products are disclosed for debugging a high performance computing program by gathering lists of addresses of calling instructions for a plurality of threads of execution of the program, assigning the threads to groups in dependence upon the addresses, and displaying the groups to identify defective threads.
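
The grouping step described in this abstract can be illustrated with a small sketch. This is not the patented method: it only assumes that each thread's list of calling-instruction addresses is already available, groups threads whose lists are identical, and shows the largest groups first so that outliers stand out. The function name and all addresses are hypothetical.

```python
from collections import defaultdict


def group_threads(call_addresses):
    """Group thread IDs by identical lists of calling-instruction addresses."""
    groups = defaultdict(list)
    for thread_id, addresses in call_addresses.items():
        groups[tuple(addresses)].append(thread_id)
    # Largest groups first; small trailing groups are candidate defective threads.
    return sorted(groups.items(), key=lambda item: len(item[1]), reverse=True)


# Hypothetical call addresses for four threads; thread 3 diverges from the rest.
stacks = {
    0: [0x4005D0, 0x400A10],
    1: [0x4005D0, 0x400A10],
    2: [0x4005D0, 0x400A10],
    3: [0x4005D0, 0x400B44],
}

for addresses, thread_ids in group_threads(stacks):
    print([hex(a) for a in addresses], "->", thread_ids)
```

In this made-up example, threads 0 to 2 share one call path and thread 3 sits alone in its own group, which is the kind of display that would direct attention to it.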

High-performance scientific computing in the cloud

Science.gov (United States)

Jorissen, Kevin; Vila, Fernando; Rehr, John

2011-03-01

Cloud computing has the potential to open up high-performance computational science to a much broader class of researchers, owing to its ability to provide on-demand, virtualized computational resources. However, before such approaches can become commonplace, user-friendly tools must be developed that hide the unfamiliar cloud environment and streamline the management of cloud resources for many scientific applications. We have recently shown that high-performance cloud computing is feasible for parallelized x-ray spectroscopy calculations. We now present benchmark results for a wider selection of scientific applications focusing on electronic structure and spectroscopic simulation software in condensed matter physics. These applications are driven by an improved portable interface that can manage virtual clusters and run various applications in the cloud. We also describe a next generation of cluster tools, aimed at improved performance and a more robust cluster deployment. Supported by NSF grant OCI-1048052.
Performance of particle in cell methods on highly concurrent computational architectures

International Nuclear Information System (INIS)

Adams, M.F.; Ethier, S.; Wichmann, N.

2009-01-01

Particle in cell (PIC) methods are effective in computing the Vlasov-Poisson system of equations used in simulations of magnetic fusion plasmas. PIC methods use grid-based computations, for solving Poisson's equation or more generally Maxwell's equations, as well as Monte-Carlo type methods to sample the Vlasov equation. The presence of two types of discretizations, deterministic field solves and Monte-Carlo methods for the Vlasov equation, poses challenges in understanding and optimizing performance on today's large-scale computers, which require high levels of concurrency. These challenges arise from the need to optimize two very different types of processes and the interactions between them. Modern cache-based high-end computers have very deep memory hierarchies and high degrees of concurrency which must be utilized effectively to achieve good performance. The effective use of these machines requires maximizing concurrency by eliminating serial or redundant work and minimizing global communication. A related issue is minimizing the memory traffic between levels of the memory hierarchy because performance is often limited by the bandwidths and latencies of the memory system. This paper discusses some of the performance issues, particularly in regard to parallelism, of PIC methods. The gyrokinetic toroidal code (GTC) is used for these studies and a new radial grid decomposition is presented and evaluated. Scaling of the code is demonstrated on ITER-sized plasmas with up to 16K Cray XT3/4 cores.
Performance of particle in cell methods on highly concurrent computational architectures

International Nuclear Information System (INIS)

Adams, M F; Ethier, S; Wichmann, N

2007-01-01

Particle in cell (PIC) methods are effective for solving the Vlasov-Poisson system of equations used in simulations of magnetic fusion plasmas. PIC methods use grid-based computations, for solving Poisson's equation or more generally Maxwell's equations, as well as Monte-Carlo type methods to sample the Vlasov equation. The presence of two types of discretizations, deterministic field solves and Monte-Carlo methods for the Vlasov equation, poses challenges in understanding and optimizing performance on today's large-scale computers, which require high levels of concurrency. These challenges arise from the need to optimize two very different types of processes and the interactions between them. Modern cache-based high-end computers have very deep memory hierarchies and high degrees of concurrency, which must be utilized effectively to achieve good performance. The effective use of these machines requires maximizing concurrency by eliminating serial or redundant work and minimizing global communication. A related issue is minimizing the memory traffic between levels of the memory hierarchy because performance is often limited by the bandwidths and latencies of the memory system. This paper discusses some of the performance issues, particularly in regard to parallelism, of PIC methods. The gyrokinetic toroidal code (GTC) is used for these studies and a new radial grid decomposition is presented and evaluated. Scaling of the code is demonstrated on ITER-sized plasmas with up to 16K Cray XT3/4 cores.
Accelerated Synchrotron X-ray Diffraction Data Analysis on a Heterogeneous High Performance Computing System

Energy Technology Data Exchange (ETDEWEB)

Qin, J; Bauer, M A, E-mail: qin.jinhui@gmail.com, E-mail: bauer@uwo.ca [Computer Science Department, University of Western Ontario, London, ON N6A 5B7 (Canada)

2010-11-01

The analysis of synchrotron X-ray Diffraction (XRD) data has been used by scientists and engineers to understand and predict properties of materials. However, the large volume of XRD image data and the intensive computations involved in the data analysis make it hard for researchers to quickly reach any conclusions about the images from an experiment when using conventional XRD data analysis software. Synchrotron time is valuable, and delays in XRD data analysis can impact decisions about subsequent experiments or about the materials being investigated. In order to improve the data analysis performance, ideally to achieve near-real-time data analysis during an XRD experiment, we designed and implemented software for accelerated XRD data analysis. The software has been developed for a heterogeneous high performance computing (HPC) system comprising IBM PowerXCell 8i processors and Intel quad-core Xeon processors. This paper describes the software and reports on the improved performance. The results indicate that it is possible for XRD data to be analyzed at the rate it is being produced.
Accelerated Synchrotron X-ray Diffraction Data Analysis on a Heterogeneous High Performance Computing System

International Nuclear Information System (INIS)

Qin, J; Bauer, M A

2010-01-01

The analysis of synchrotron X-ray Diffraction (XRD) data has been used by scientists and engineers to understand and predict properties of materials. However, the large volume of XRD image data and the intensive computations involved in the data analysis make it hard for researchers to quickly reach any conclusions about the images from an experiment when using conventional XRD data analysis software. Synchrotron time is valuable, and delays in XRD data analysis can impact decisions about subsequent experiments or about the materials being investigated. In order to improve the data analysis performance, ideally to achieve near-real-time data analysis during an XRD experiment, we designed and implemented software for accelerated XRD data analysis. The software has been developed for a heterogeneous high performance computing (HPC) system comprising IBM PowerXCell 8i processors and Intel quad-core Xeon processors. This paper describes the software and reports on the improved performance. The results indicate that it is possible for XRD data to be analyzed at the rate it is being produced.
Development of a Computational Steering Framework for High Performance Computing Environments on Blue Gene/P Systems

KAUST Repository

Danani, Bob K.

2012-07-01

Computational steering has revolutionized the traditional workflow in high performance computing (HPC) applications. The standard workflow that consists of preparation of an application’s input, running of a simulation, and visualization of simulation results in a post-processing step is now transformed into a real-time interactive workflow that significantly reduces development and testing time. Computational steering provides the capability to direct or re-direct the progress of a simulation application at run-time. It allows modification of application-defined control parameters at run-time using various user-steering applications. In this project, we propose a computational steering framework for HPC environments that provides an innovative solution and easy-to-use platform, which allows users to connect and interact with running application(s) in real-time. This framework uses RealityGrid as the underlying steering library and adds several enhancements to the library to enable steering support for Blue Gene systems. Included in the scope of this project is the development of a scalable and efficient steering relay server that supports many-to-many connectivity between multiple steered applications and multiple steering clients. Steered applications can range from intermediate simulation and physical modeling applications to complex computational fluid dynamics (CFD) applications or advanced visualization applications. The Blue Gene supercomputer presents special challenges for remote access because the compute nodes reside on private networks. This thesis presents an implemented solution and demonstrates it on representative applications. Thorough implementation details and application enablement steps are also presented in this thesis to encourage direct usage of this framework.
Research Activity in Computational Physics utilizing High Performance Computing: Co-authorship Network Analysis

Science.gov (United States)

Ahn, Sul-Ah; Jung, Youngim

2016-10-01

The research activities of computational physicists utilizing high performance computing are analyzed by bibliometric approaches. This study aims at providing computational physicists utilizing high-performance computing and policy planners with useful bibliometric results for an assessment of research activities. In order to achieve this purpose, we carried out a co-authorship network analysis of journal articles to assess the research activities of researchers in high-performance computational physics as a case study. For this study, we used journal articles from Elsevier's Scopus database covering the period 2004-2013. We extracted the author rank in the physics field utilizing high-performance computing by the number of papers published during the ten years from 2004. Finally, we drew the co-authorship network for the 45 top authors and their coauthors, and described some features of the co-authorship network in relation to the author rank. Suggestions for further studies are discussed.
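The two steps named in this abstract, ranking authors by paper count and drawing a co-authorship network, can be sketched in a few lines. This is an illustrative sketch only: the input format (one list of author names per paper) and the function names `coauthorship_network` and `top_authors` are assumptions, not the study's actual pipeline or the Scopus data format.

```python
from collections import Counter
from itertools import combinations


def coauthorship_network(papers):
    """Build paper counts and weighted co-authorship edges.

    papers: iterable of author-name lists, one list per paper (assumed format).
    Returns (paper_counts, edge_weights), where edge_weights maps a sorted
    author pair to the number of papers the two authors share.
    """
    paper_counts = Counter()
    edge_weights = Counter()
    for authors in papers:
        unique = sorted(set(authors))          # ignore repeated names on one paper
        paper_counts.update(unique)
        for a, b in combinations(unique, 2):   # every pair on the paper is an edge
            edge_weights[(a, b)] += 1
    return paper_counts, edge_weights


def top_authors(paper_counts, n):
    """Return the n authors with the most papers (ties broken by name)."""
    ranked = sorted(paper_counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:n]]


if __name__ == "__main__":
    papers = [["A", "B"], ["A", "C"], ["A", "B", "C"], ["D"]]
    counts, edges = coauthorship_network(papers)
    print(top_authors(counts, 2), dict(edges))
```

Restricting the edge list to pairs that include one of the top-ranked authors would give a network of the kind described, the top authors together with their coauthors.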
High-performance computing on GPUs for resistivity logging of oil and gas wells

Science.gov (United States)

Glinskikh, V.; Dudaev, A.; Nechaev, O.; Surodina, I.

2017-10-01

We developed and implemented in software an algorithm for high-performance simulation of electrical logs from oil and gas wells using high-performance heterogeneous computing. The numerical solution of the 2D forward problem is based on the finite-element method and the Cholesky decomposition for solving a system of linear algebraic equations (SLAE). Software implementations of the algorithm were made using NVIDIA CUDA technology and computing libraries, allowing us to perform the decomposition of the SLAE and find its solution on the central processing unit (CPU) and the graphics processing unit (GPU). The calculation time is analyzed depending on the matrix size and the number of its non-zero elements. We estimated the computing speed on CPU and GPU, including high-performance heterogeneous CPU-GPU computing. Using the developed algorithm, we simulated resistivity data in realistic models.
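The solution step mentioned in this abstract, a Cholesky decomposition followed by triangular solves, can be sketched as follows. This is a plain-Python, dense-matrix illustration under the assumption of a symmetric positive definite system; it does not reproduce the authors' CUDA implementation or any sparse storage they may use, and the function names are hypothetical.

```python
import math


def cholesky(a):
    """Return the lower-triangular factor L with L * L^T == a.

    a must be a symmetric positive definite matrix given as a list of rows.
    """
    n = len(a)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (a[i][j] - s) / L[j][j]
    return L


def solve_cholesky(a, b):
    """Solve a x = b by factoring a = L L^T, then two triangular solves."""
    L = cholesky(a)
    n = len(b)
    y = [0.0] * n
    for i in range(n):                       # forward substitution: L y = b
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):             # back substitution: L^T x = y
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


if __name__ == "__main__":
    A = [[4.0, 12.0, -16.0], [12.0, 37.0, -43.0], [-16.0, -43.0, 98.0]]
    print(cholesky(A))
    print(solve_cholesky(A, [-20.0, -43.0, 192.0]))
```

The factorization is done once per matrix; once L is available, each additional right-hand side costs only the two triangular solves.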
High Performance Numerical Computing for High Energy Physics: A New Challenge for Big Data Science

International Nuclear Information System (INIS)

Pop, Florin

2014-01-01

Modern physics is based on both theoretical analysis and experimental validation. Complex scenarios like subatomic dimensions, high energy, and low absolute temperature are frontiers for many theoretical models. Simulation with stable numerical methods represents an excellent instrument for high accuracy analysis, experimental validation, and visualization. High performance computing support offers the possibility of running simulations at large scale, in parallel, but the volume of data generated by these experiments creates a new challenge for Big Data Science. This paper presents existing computational methods for high energy physics (HEP) analyzed from two perspectives: numerical methods and high performance computing. The computational methods presented are Monte Carlo methods and simulations of HEP processes, Markovian Monte Carlo, unfolding methods in particle physics, kernel estimation in HEP, and Random Matrix Theory used in the analysis of particle spectra. All of these methods produce data-intensive applications, which introduce new challenges and requirements for ICT systems architecture, programming paradigms, and storage capabilities.
Resilient and Robust High Performance Computing Platforms for Scientific Computing Integrity

Energy Technology Data Exchange (ETDEWEB)

Jin, Yier [Univ. of Central Florida, Orlando, FL (United States)

2017-07-14

As technology advances, computer systems are subject to increasingly sophisticated cyber-attacks that compromise both their security and integrity. High performance computing platforms used in commercial and scientific applications involving sensitive, or even classified data, are frequently targeted by powerful adversaries. This situation is made worse by a lack of fundamental security solutions that both perform efficiently and are effective at preventing threats. Current security solutions fail to address the threat landscape and ensure the integrity of sensitive data. As challenges rise, both private and public sectors will require robust technologies to protect their computing infrastructure. The research outcomes from this project try to address all these challenges. For example, we present LAZARUS, a novel technique to harden kernel Address Space Layout Randomization (KASLR) against paging-based side-channel attacks. In particular, our scheme allows for fine-grained protection of the virtual memory mappings that implement the randomization. We demonstrate the effectiveness of our approach by hardening a recent Linux kernel with LAZARUS, mitigating all of the previously presented side-channel attacks on KASLR. Our extensive evaluation shows that LAZARUS incurs only 0.943% overhead for standard benchmarks, and is therefore highly practical. We also introduce HA2lloc, a hardware-assisted allocator that is capable of leveraging an extended memory management unit to detect memory errors in the heap. We test HA2lloc in a simulation environment and find that the approach is capable of preventing common memory vulnerabilities.
JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing.

Science.gov (United States)

Brown, David K; Penkler, David L; Musyoka, Thommas M; Bishop, Özlem Tastan

2015-01-01

Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS.
JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing.

Directory of Open Access Journals (Sweden)

David K Brown

Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS.
JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing

Science.gov (United States)

Brown, David K.; Penkler, David L.; Musyoka, Thommas M.; Bishop, Özlem Tastan

2015-01-01

Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS. PMID:26280450
Unravelling the structure of matter on high-performance computers

International Nuclear Information System (INIS)

Kieu, T.D.; McKellar, B.H.J.

1992-11-01

The various phenomena and the different forms of matter in nature are believed to be the manifestation of only a handful of fundamental building blocks, the elementary particles, which interact through the four fundamental forces. In the study of the structure of matter at this level one has to consider forces which are not sufficiently weak to be treated as small perturbations to the system, an example of which is the strong force that binds the nucleons together. High-performance computers, both vector and parallel machines, have facilitated the necessary non-perturbative treatments. The principles and the techniques of computer simulations applied to Quantum Chromodynamics are explained; examples include the strong interactions, the calculation of the mass of nucleons and their decay rates. Some commercial and special-purpose high-performance machines for such calculations are also mentioned. 3 refs., 2 tabs
Analytical performance modeling for computer systems

CERN Document Server

Tay, Y C

2013-01-01

This book is an introduction to analytical performance modeling for computer systems, i.e., writing equations to describe their performance behavior. It is accessible to readers who have taken college-level courses in calculus and probability, networking and operating systems. This is not a training manual for becoming an expert performance analyst. Rather, the objective is to help the reader construct simple models for analyzing and understanding the systems that they are interested in. Describing a complicated system abstractly with mathematical equations requires a careful choice of assumpti
A performance model for the communication in fast multipole methods on high-performance computing platforms

KAUST Repository

Ibeid, Huda; Yokota, Rio; Keyes, David E.

2016-01-01

model and the actual communication time on four high-performance computing (HPC) systems, when latency, bandwidth, network topology, and multicore penalties are all taken into account. To our knowledge, this is the first formal characterization
GPU-based high-performance computing for radiation therapy

International Nuclear Information System (INIS)

Jia, Xun; Jiang, Steve B; Ziegenhein, Peter

2014-01-01

Recent developments in radiotherapy demand high computational power to solve challenging problems in a timely fashion in a clinical environment. The graphics processing unit (GPU), as an emerging high-performance computing platform, has been introduced to radiotherapy. It is particularly attractive due to its high computational power, small size, and low cost for facility deployment and maintenance. Over the past few years, GPU-based high-performance computing in radiotherapy has experienced rapid developments. A tremendous number of studies have been conducted, in which large acceleration factors compared with the conventional CPU platform have been observed. In this paper, we will first give a brief introduction to the GPU hardware structure and programming model. We will then review the current applications of GPU in major imaging-related and therapy-related problems encountered in radiotherapy. A comparison of GPU with other platforms will also be presented. (topical review)
14th annual Results and Review Workshop on High Performance Computing in Science and Engineering

CERN Document Server

Nagel, Wolfgang E; Resch, Michael M; Transactions of the High Performance Computing Center, Stuttgart (HLRS) 2011; High Performance Computing in Science and Engineering '11

2012-01-01

This book presents the state-of-the-art in simulation on supercomputers. Leading researchers present results achieved on systems of the High Performance Computing Center Stuttgart (HLRS) for the year 2011. The reports cover all fields of computational science and engineering, ranging from CFD to computational physics and chemistry, to computer science, with a special emphasis on industrially relevant applications. Presenting results for both vector systems and microprocessor-based systems, the book allows readers to compare the performance levels and usability of various architectures. As HLRS
Parallel Backprojection: A Case Study in High-Performance Reconfigurable Computing

Directory of Open Access Journals (Sweden)

Cordes Ben

2009-01-01

High-performance reconfigurable computing (HPRC) is a novel approach to provide large-scale computing power to modern scientific applications. Using both general-purpose processors and FPGAs allows application designers to exploit fine-grained and coarse-grained parallelism, achieving high degrees of speedup. One scientific application that benefits from this technique is backprojection, an image formation algorithm that can be used as part of a synthetic aperture radar (SAR) processing system. We present an implementation of backprojection for SAR on an HPRC system. Using simulated data taken at a variety of ranges, our implementation runs over 200 times faster than a similar software program, with an overall application speedup better than 50x. The backprojection application is easily parallelizable, achieving near-linear speedup when run on multiple nodes of a clustered HPRC system. The results presented can be applied to other systems and other algorithms with similar characteristics.
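The gap between the two speedup figures quoted in this abstract (a kernel more than 200 times faster, an overall application speedup better than 50x) can be related through Amdahl's law. The sketch below is a worked illustration only: the abstract does not say that Amdahl's law was used, and treating the two lower bounds as exact values of 200 and 50 is an assumption made here for the arithmetic.

```python
def overall_speedup(p, s):
    """Amdahl's law: overall speedup when a fraction p of runtime is sped up s times."""
    return 1.0 / ((1.0 - p) + p / s)


def accelerated_fraction(overall, kernel):
    """Invert Amdahl's law: the fraction p implied by a given overall and kernel speedup.

    Solves 1/overall = (1 - p) + p/kernel for p.
    """
    return (1.0 - 1.0 / overall) / (1.0 - 1.0 / kernel)


if __name__ == "__main__":
    # Assumed exact values; the abstract only gives "over 200" and "better than 50x".
    p = accelerated_fraction(overall=50.0, kernel=200.0)
    print(f"implied accelerated fraction: {p:.4f}")
    print(f"check, overall speedup: {overall_speedup(p, 200.0):.1f}")
```

Under these assumptions roughly 98.5% of the original runtime would have been spent in the accelerated kernel, which shows how a small unaccelerated remainder can limit the overall gain.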
Parallel Backprojection: A Case Study in High-Performance Reconfigurable Computing

Directory of Open Access Journals (Sweden)

2009-03-01

High-performance reconfigurable computing (HPRC) is a novel approach to provide large-scale computing power to modern scientific applications. Using both general-purpose processors and FPGAs allows application designers to exploit fine-grained and coarse-grained parallelism, achieving high degrees of speedup. One scientific application that benefits from this technique is backprojection, an image formation algorithm that can be used as part of a synthetic aperture radar (SAR) processing system. We present an implementation of backprojection for SAR on an HPRC system. Using simulated data taken at a variety of ranges, our implementation runs over 200 times faster than a similar software program, with an overall application speedup better than 50x. The backprojection application is easily parallelizable, achieving near-linear speedup when run on multiple nodes of a clustered HPRC system. The results presented can be applied to other systems and other algorithms with similar characteristics.

Parameters that affect parallel processing for computational electromagnetic simulation codes on high performance computing clusters

Science.gov (United States)

Moon, Hongsik

What is the impact of multicore and associated advanced technologies on computational software for science? Most researchers and students have multicore laptops or desktops for their research and they need computing power to run computational software packages. Computing power was initially derived from Central Processing Unit (CPU) clock speed. That changed when increases in clock speed became constrained by power requirements. Chip manufacturers turned to multicore CPU architectures and associated technological advancements to create the CPUs for the future. Most software applications benefited from the increased computing power the same way that increases in clock speed helped applications run faster. However, for Computational ElectroMagnetics (CEM) software developers, this change was not an obvious benefit - it appeared to be a detriment. Developers were challenged to find a way to correctly utilize the advancements in hardware so that their codes could benefit. The solution was parallelization and this dissertation details the investigation to address these challenges. Prior to multicore CPUs, advanced computer technologies were compared by their performance on benchmark software, and the metric was FLoating-point Operations Per Second (FLOPS), which indicates system performance for scientific applications that make heavy use of floating-point calculations. Is FLOPS an effective metric for parallelized CEM simulation tools on new multicore systems? Parallel CEM software needs to be benchmarked not only by FLOPS but also by the performance of other parameters related to type and utilization of the hardware, such as CPU, Random Access Memory (RAM), hard disk, network, etc. The codes need to be optimized for more than just FLOPS and new parameters must be included in benchmarking. In this dissertation, the parallel CEM software named High Order Basis Based Integral Equation Solver (HOBBIES) is introduced. This code was developed to address the needs of the
High Performance Computing in Science and Engineering '98 : Transactions of the High Performance Computing Center

CERN Document Server

Jäger, Willi

1999-01-01

The book contains reports about the most significant projects from science and industry that are using the supercomputers of the Federal High Performance Computing Center Stuttgart (HLRS). These projects are from different scientific disciplines, with a focus on engineering, physics and chemistry. They were carefully selected in a peer-review process and are showcases for an innovative combination of state-of-the-art physical modeling, novel algorithms and the use of leading-edge parallel computer technology. As HLRS is in close cooperation with industrial companies, special emphasis has been put on the industrial relevance of results and methods.
High-Performance Java Codes for Computational Fluid Dynamics

Science.gov (United States)

Riley, Christopher; Chatterjee, Siddhartha; Biswas, Rupak; Biegel, Bryan (Technical Monitor)

2001-01-01

The computational science community is reluctant to write large-scale computationally intensive applications in Java due to concerns over Java's poor performance, despite the claimed software engineering advantages of its object-oriented features. Naive Java implementations of numerical algorithms can perform poorly compared to corresponding Fortran or C implementations. To achieve high performance, Java applications must be designed with good performance as a primary goal. This paper presents the object-oriented design and implementation of two real-world applications from the field of Computational Fluid Dynamics (CFD): a finite-volume fluid flow solver (LAURA, from NASA Langley Research Center), and an unstructured mesh adaptation algorithm (2D_TAG, from NASA Ames Research Center). This work builds on our previous experience with the design of high-performance numerical libraries in Java. We examine the performance of the applications using the currently available Java infrastructure and show that the Java version of the flow solver LAURA performs almost within a factor of 2 of the original procedural version. Our Java version of the mesh adaptation algorithm 2D_TAG performs within a factor of 1.5 of its original procedural version on certain platforms. Our results demonstrate that object-oriented software design principles are not necessarily inimical to high performance.
A High Performance COTS Based Computer Architecture

Science.gov (United States)

Patte, Mathieu; Grimoldi, Raoul; Trautner, Roland

2014-08-01

Using Commercial Off The Shelf (COTS) electronic components for space applications is a long-standing idea. Indeed, the difference in processing performance and energy efficiency between radiation-hardened components and COTS components is so large that COTS components are very attractive for use in mass- and power-constrained systems. However, using COTS components in space is not straightforward, as one must account for the effects of the space environment on the COTS components' behavior. In the frame of the ESA-funded activity called High Performance COTS Based Computer, Airbus Defense and Space and its subcontractor OHB CGS have developed and prototyped a versatile COTS-based architecture for high performance processing. The rest of the paper is organized as follows: in the first section we recapitulate the interests and constraints of using COTS components for space applications; then we briefly describe existing fault mitigation architectures and present our solution for fault mitigation based on a component called the SmartIO; in the last part of the paper we describe the prototyping activities executed during the HiP CBC project.
High Performance Computing Facility Operational Assessment, FY 2010 Oak Ridge Leadership Computing Facility

Energy Technology Data Exchange (ETDEWEB)

Bland, Arthur S Buddy [ORNL; Hack, James J [ORNL; Baker, Ann E [ORNL; Barker, Ashley D [ORNL; Boudwin, Kathlyn J. [ORNL; Kendall, Ricky A [ORNL; Messer, Bronson [ORNL; Rogers, James H [ORNL; Shipman, Galen M [ORNL; White, Julia C [ORNL

2010-08-01

Oak Ridge National Laboratory's (ORNL's) Cray XT5 supercomputer, Jaguar, kicked off the era of petascale scientific computing in 2008 with applications that sustained more than a thousand trillion floating point calculations per second - or 1 petaflop. Jaguar continues to grow even more powerful as it helps researchers broaden the boundaries of knowledge in virtually every domain of computational science, including weather and climate, nuclear energy, geosciences, combustion, bioenergy, fusion, and materials science. Their insights promise to broaden our knowledge in areas that are vitally important to the Department of Energy (DOE) and the nation as a whole, particularly energy assurance and climate change. The science of the 21st century, however, will demand further revolutions in computing, supercomputers capable of a million trillion calculations a second - 1 exaflop - and beyond. These systems will allow investigators to continue attacking global challenges through modeling and simulation and to unravel longstanding scientific questions. Creating such systems will also require new approaches to daunting challenges. High-performance systems of the future will need to be codesigned for scientific and engineering applications with best-in-class communications networks and data-management infrastructures and teams of skilled researchers able to take full advantage of these new resources. The Oak Ridge Leadership Computing Facility (OLCF) provides the nation's most powerful open resource for capability computing, with a sustainable path that will maintain and extend national leadership for DOE's Office of Science (SC). The OLCF has engaged a world-class team to support petascale science and to take a dramatic step forward, fielding new capabilities for high-end science. This report highlights the successful delivery and operation of a petascale system and shows how the OLCF fosters application development teams, developing cutting-edge tools
Hot Chips and Hot Interconnects for High End Computing Systems

Science.gov (United States)

Saini, Subhash

2005-01-01

I will discuss several processors: 1. The Cray proprietary processor used in the Cray X1; 2. The IBM Power 3 and Power 4 used in the IBM SP 3 and IBM SP 4 systems; 3. The Intel Itanium and Xeon, used in the SGI Altix systems and clusters respectively; 4. IBM System-on-a-Chip used in IBM BlueGene/L; 5. HP Alpha EV68 processor used in DOE ASCI Q cluster; 6. SPARC64 V processor, which is used in the Fujitsu PRIMEPOWER HPC2500; 7. An NEC proprietary processor, which is used in NEC SX-6/7; 8. Power 4+ processor, which is used in Hitachi SR11000; 9. NEC proprietary processor, which is used in Earth Simulator. The IBM POWER5 and Red Storm Computing Systems will also be discussed. The architectures of these processors will first be presented, followed by interconnection networks and a description of high-end computer systems based on these processors and networks. The performance of various hardware/programming model combinations will then be compared, based on the latest NAS Parallel Benchmark results (MPI, OpenMP/HPF, and hybrid MPI + OpenMP). The tutorial will conclude with a discussion of general trends in the field of high performance computing (quantum computing, DNA computing, cellular engineering, and neural networks).
Visualization and Data Analysis for High-Performance Computing

Energy Technology Data Exchange (ETDEWEB)

Sewell, Christopher Meyer [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

2016-09-27

This is a set of slides from a guest lecture for a class at the University of Texas, El Paso on visualization and data analysis for high-performance computing. The topics covered are the following: trends in high-performance computing; scientific visualization, such as OpenGL, ray tracing and volume rendering, VTK, and ParaView; data science at scale, such as in-situ visualization, image databases, distributed memory parallelism, shared memory parallelism, VTK-m, "big data", and then an analysis example.
Topic 14+16: High-performance and scientific applications and extreme-scale computing (Introduction)

KAUST Repository

Downes, Turlough P.

2013-01-01

As our understanding of the world around us increases it becomes more challenging to make use of what we already know, and to increase our understanding still further. Computational modeling and simulation have become critical tools in addressing this challenge. The requirements of high-resolution, accurate modeling have outstripped the ability of desktop computers and even small clusters to provide the necessary compute power. Many applications in the scientific and engineering domains now need very large amounts of compute time, while other applications, particularly in the life sciences, frequently have large data I/O requirements. There is thus a growing need for a range of high performance applications which can utilize parallel compute systems effectively, which have efficient data handling strategies and which have the capacity to utilise current and future systems. The High Performance and Scientific Applications topic aims to highlight recent progress in the use of advanced computing and algorithms to address the varied, complex and increasing challenges of modern research throughout both the "hard" and "soft" sciences. This necessitates being able to use large numbers of compute nodes, many of which are equipped with accelerators, and to deal with difficult I/O requirements. © 2013 Springer-Verlag.
VLab: A Science Gateway for Distributed First Principles Calculations in Heterogeneous High Performance Computing Systems

Science.gov (United States)

da Silveira, Pedro Rodrigo Castro

2014-01-01

This thesis describes the development and deployment of a cyberinfrastructure for distributed high-throughput computations of materials properties at high pressures and/or temperatures--the Virtual Laboratory for Earth and Planetary Materials--VLab. VLab was developed to leverage the aggregated computational power of grid systems to solve…
High Performance Computing Modernization Program Kerberos Throughput Test Report

Science.gov (United States)

2017-10-26

Naval Research Laboratory, Washington, DC 20375-5320. NRL/MR/5524--17-9751. High Performance Computing Modernization Program Kerberos Throughput Test Report. Daniel G. Gdula* and ...
Cloud object store for checkpoints of high performance computing applications using decoupling middleware

Science.gov (United States)

Bent, John M.; Faibish, Sorin; Grider, Gary

2016-04-19

Cloud object storage is enabled for checkpoints of high performance computing applications using a middleware process. A plurality of files, such as checkpoint files, generated by a plurality of processes in a parallel computing system are stored by obtaining said plurality of files from said parallel computing system; converting said plurality of files to objects using a log structured file system middleware process; and providing said objects for storage in a cloud object storage system. The plurality of processes may run, for example, on a plurality of compute nodes. The log structured file system middleware process may be embodied, for example, as a Parallel Log-Structured File System (PLFS). The log structured file system middleware process optionally executes on a burst buffer node.
CUDA/GPU Technology : Parallel Programming For High Performance Scientific Computing

OpenAIRE

YUHENDRA; KUZE, Hiroaki; JOSAPHAT, Tetuko Sri Sumantyo

2009-01-01

Graphics processing units (GPUs), originally designed for computer video cards, have emerged as the most powerful chip in a high-performance workstation. In high-performance computation, GPUs deliver much higher performance than conventional CPUs by means of parallel processing. In 2007, the birth of Compute Unified Device Architecture (CUDA) and CUDA-enabled GPUs by NVIDIA Corporation brought a revolution in the general purpose GPU a...
Computer performance optimization systems, applications, processes

CERN Document Server

Osterhage, Wolfgang W

2013-01-01

Computing power performance was important at times when hardware was still expensive, because hardware had to be put to the best use. Later on this criterion was no longer critical, since hardware had become inexpensive. Meanwhile, however, people have realized that performance again plays a significant role, because of the major drain on system resources involved in developing complex applications. This book distinguishes between three levels of performance optimization: the system level, application level and business processes level. On each, optimizations can be achieved and cost-cutting p
Computed radiography systems performance evaluation

International Nuclear Information System (INIS)

Xavier, Clarice C.; Nersissian, Denise Y.; Furquim, Tania A.C.

2009-01-01

The performance of a computed radiography system was evaluated according to AAPM Report No. 93. Evaluation tests proposed by the publication were performed, and the following nonconformities were found: imaging plate (IP) dark noise, which compromises the clinical image acquired using the IP; an uncalibrated exposure indicator, which can cause underexposure of the IP; nonlinearity of the system response, which causes overexposure; a resolution limit below that declared by the manufacturer and uncalibrated erasure thoroughness, impairing the visualization of structures; a Moiré pattern visible in the grid response; and IP throughput over that specified by the manufacturer. These nonconformities indicate that a lack of calibration in digital imaging systems can cause an increase in dose in order to solve image problems. (author)
FPGA cluster for high-performance AO real-time control system

Science.gov (United States)

Geng, Deli; Goodsell, Stephen J.; Basden, Alastair G.; Dipper, Nigel A.; Myers, Richard M.; Saunter, Chris D.

2006-06-01

Whilst the high throughput and low latency requirements for the next generation AO real-time control systems have posed a significant challenge to von Neumann architecture processor systems, the Field Programmable Gate Array (FPGA) has emerged as a long term solution with high performance on throughput and excellent predictability on latency. Moreover, FPGA devices have highly capable programmable interfacing, which leads to a more highly integrated system. Nevertheless, a single FPGA is still not enough: multiple FPGA devices need to be clustered to perform the required subaperture processing and the reconstruction computation. In an AO real-time control system, the memory bandwidth is often the bottleneck of the system, simply because a vast amount of supporting data, e.g. pixel calibration maps and the reconstruction matrix, needs to be accessed within a short period. The cluster, as a general computing architecture, has excellent scalability in processing throughput, memory bandwidth, memory capacity, and communication bandwidth. Problems such as task distribution, node communication, and system verification are discussed.
High Performance Computing Software Applications for Space Situational Awareness

Science.gov (United States)

Giuliano, C.; Schumacher, P.; Matson, C.; Chun, F.; Duncan, B.; Borelli, K.; Desonia, R.; Gusciora, G.; Roe, K.

The High Performance Computing Software Applications Institute for Space Situational Awareness (HSAI-SSA) has completed its first full year of applications development. The emphasis of our work in this first year was on improving space surveillance sensor models and image enhancement software. These applications are the Space Surveillance Network Analysis Model (SSNAM), the Air Force Space Fence simulation (SimFence), and the physically constrained iterative de-convolution (PCID) image enhancement software tool. Specifically, we have demonstrated order of magnitude speed-up in those codes running on the latest Cray XD-1 Linux supercomputer (Hoku) at the Maui High Performance Computing Center. The software application improvements that HSAI-SSA has made have had a significant impact on the warfighter and have fundamentally changed the role of high performance computing in SSA.
Performance characteristics of a Kodak computed radiography system.

Science.gov (United States)

Bradford, C D; Peppler, W W; Dobbins, J T

1999-01-01

The performance characteristics of a photostimulable phosphor based computed radiographic (CR) system were studied. The modulation transfer function (MTF), noise power spectra (NPS), and detective quantum efficiency (DQE) of the Kodak Digital Science computed radiography (CR) system (Eastman Kodak Co.-model 400) were measured and compared to previously published results of a Fuji based CR system (Philips Medical Systems-PCR model 7000). To maximize comparability, the same measurement techniques and analysis methods were used. The DQE at four exposure levels (30, 3, 0.3, 0.03 mR) and two plate types (standard and high resolution) were calculated from the NPS and MTF measurements. The NPS was determined from two-dimensional Fourier analysis of uniformly exposed plates. The presampling MTF was determined from the Fourier transform (FT) of the system's finely sampled line spread function (LSF) as produced by a narrow slit. A comparison of the slit type ("beveled edge" versus "straight edge") and its effect on the resulting MTF measurements was also performed. The results show that both systems are comparable in resolution performance. The noise power studies indicated a higher level of noise for the Kodak images (approximately 20% at the low exposure levels and 40%-70% at higher exposure levels). Within the clinically relevant exposure range (0.3-3 mR), the resulting DQE for the Kodak plates ranged between 20%-50% lower than for the corresponding Fuji plates. Measurements of the presampling MTF with the two slit types have shown that a correction factor can be applied to compensate for transmission through the relief edges.
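The abstract above states that the DQE was calculated from the NPS and MTF measurements but does not give the formula. A minimal sketch, assuming the commonly used relation DQE(f) = MTF(f)^2 / (q * NNPS(f)), where NNPS is the noise power spectrum normalized by the squared mean signal and q is the incident photon fluence. The function name and all numeric values are hypothetical and are not taken from the paper.

```python
def dqe(mtf, nnps, fluence):
    """Return DQE at each spatial frequency.

    Assumed relation (not quoted from the paper):
        DQE(f) = MTF(f)**2 / (fluence * NNPS(f))
    mtf     -- presampling MTF values (dimensionless)
    nnps    -- normalized noise power spectrum values (mm^2)
    fluence -- incident photon fluence q (photons / mm^2)
    """
    if len(mtf) != len(nnps):
        raise ValueError("mtf and nnps must have the same length")
    return [m * m / (fluence * n) for m, n in zip(mtf, nnps)]


# Hypothetical illustrative inputs, one entry per spatial frequency.
example_mtf = [1.0, 0.5, 0.2]
example_nnps = [4e-5, 2e-5, 1e-5]   # mm^2
example_fluence = 1e5               # photons / mm^2

example_dqe = dqe(example_mtf, example_nnps, example_fluence)
```

Because the NPS appears in the denominator, a higher noise level at the same MTF lowers the DQE directly, which is consistent with the direction of the Kodak versus Fuji comparison reported above.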
High Performance Computing - Power Application Programming Interface Specification Version 2.0.

Energy Technology Data Exchange (ETDEWEB)

Laros, James H. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Grant, Ryan [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Levenhagen, Michael J. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Olivier, Stephen Lecler [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Pedretti, Kevin [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Ward, H. Lee [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Younge, Andrew J. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

2017-03-01

Measuring and controlling the power and energy consumption of high performance computing systems by various components in the software stack is an active research area. Implementations in lower level software layers are beginning to emerge in some production systems, which is very welcome. To be most effective, a portable interface to measurement and control features would significantly facilitate participation by all levels of the software stack. We present a proposal for a standard power Application Programming Interface (API) that endeavors to cover the entire software space, from generic hardware interfaces to the input from the computer facility manager.
DURIP: High Performance Computing in Biomathematics Applications

Science.gov (United States)

2017-05-10

The goal of this award was to enhance the capabilities of the Department of Applied Mathematics and Statistics (AMS) at the University of California, Santa Cruz (UCSC) to conduct research and research-related education in areas of...
AHPCRC - Army High Performance Computing Research Center

Science.gov (United States)

2010-01-01

computing. Of particular interest is the ability of a distributed jamming network (DJN) to jam signals in all or part of a sensor or communications net...and reasoning, assistive technologies. FRIEDRICH (FRITZ) PRINZ Finmeccanica Professor of Engineering, Robert Bosch Chair, Department of Engineering...High Performance Computing Research Center www.ahpcrc.org BARBARA BRYAN AHPCRC Research and Outreach Manager, HPTi (650) 604-3732 bbryan@hpti.com Ms

Enabling high performance computational science through combinatorial algorithms

International Nuclear Information System (INIS)

Boman, Erik G; Bozdag, Doruk; Catalyurek, Umit V; Devine, Karen D; Gebremedhin, Assefaw H; Hovland, Paul D; Pothen, Alex; Strout, Michelle Mills

2007-01-01

The Combinatorial Scientific Computing and Petascale Simulations (CSCAPES) Institute is developing algorithms and software for combinatorial problems that play an enabling role in scientific and engineering computations. Discrete algorithms will be increasingly critical for achieving high performance for irregular problems on petascale architectures. This paper describes recent contributions by researchers at the CSCAPES Institute in the areas of load balancing, parallel graph coloring, performance improvement, and parallel automatic differentiation
Enabling high performance computational science through combinatorial algorithms

Energy Technology Data Exchange (ETDEWEB)

Boman, Erik G [Discrete Algorithms and Math Department, Sandia National Laboratories (United States); Bozdag, Doruk [Biomedical Informatics, and Electrical and Computer Engineering, Ohio State University (United States); Catalyurek, Umit V [Biomedical Informatics, and Electrical and Computer Engineering, Ohio State University (United States); Devine, Karen D [Discrete Algorithms and Math Department, Sandia National Laboratories (United States); Gebremedhin, Assefaw H [Computer Science and Center for Computational Science, Old Dominion University (United States); Hovland, Paul D [Mathematics and Computer Science Division, Argonne National Laboratory (United States); Pothen, Alex [Computer Science and Center for Computational Science, Old Dominion University (United States); Strout, Michelle Mills [Computer Science, Colorado State University (United States)

2007-07-15

The Combinatorial Scientific Computing and Petascale Simulations (CSCAPES) Institute is developing algorithms and software for combinatorial problems that play an enabling role in scientific and engineering computations. Discrete algorithms will be increasingly critical for achieving high performance for irregular problems on petascale architectures. This paper describes recent contributions by researchers at the CSCAPES Institute in the areas of load balancing, parallel graph coloring, performance improvement, and parallel automatic differentiation.
HiGIS: An Open Framework for High Performance Geographic Information System

Directory of Open Access Journals (Sweden)

XIONG, W.

2015-08-01

The big data era exposes many challenges to geospatial data management, geocomputation, and cartography, and the geographic information systems (GIS) community is no exception. Technologies and facilities for high performance computing (HPC) are becoming more and more accessible to researchers, while mobile computing, ubiquitous computing, and cloud computing are emerging. Traditional GIS, however, need to be improved to take advantage of all these developments. We proposed and implemented a GIS married with high performance computing, called HiGIS. The goal of HiGIS is to promote the performance of geocomputation by leveraging the power of HPC, and to build an open framework for geospatial data storing, processing, displaying and sharing. In this paper the architecture, data model and modules of the HiGIS system are introduced. A geocomputation scheduling engine based on communicating sequential processes was designed to exploit spatial analysis and processing. A parallel I/O strategy using file views was proposed to improve the performance of geospatial raster data access. In order to support web-based online mapping, an interactive cartographic script was provided to represent a map. A demonstration of locating a house was used to illustrate the characteristics of HiGIS. Parallel and concurrency performance experiments show the feasibility of this system.
BurstMem: A High-Performance Burst Buffer System for Scientific Applications

Energy Technology Data Exchange (ETDEWEB)

Wang, Teng [Auburn University, Auburn, Alabama; Oral, H Sarp [ORNL; Wang, Yandong [Auburn University, Auburn, Alabama; Settlemyer, Bradley W [ORNL; Atchley, Scott [ORNL; Yu, Weikuan [Auburn University, Auburn, Alabama

2014-01-01

The growth of computing power on large-scale systems requires commensurate high-bandwidth I/O systems. Many parallel file systems are designed to provide fast sustainable I/O in response to applications' soaring requirements. To meet this need, a novel system is imperative to temporarily buffer the bursty I/O and gradually flush datasets to long-term parallel file systems. In this paper, we introduce the design of BurstMem, a high-performance burst buffer system. BurstMem provides a storage framework with efficient storage and communication management strategies. Our experiments demonstrate that BurstMem is able to speed up the I/O performance of scientific applications by up to 8.5 on leadership computer systems.
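The idea of temporarily buffering bursty I/O and gradually flushing it to a long-term file system can be illustrated with a toy model. This is a sketch of the general burst-buffer pattern only, not of BurstMem's actual design; the class name, the `flush_chunk_bytes` parameter, and the in-memory list that stands in for the parallel file system are all hypothetical.

```python
from collections import deque


class ToyBurstBuffer:
    """Absorb writes immediately, then drain them in bounded steps."""

    def __init__(self, flush_chunk_bytes):
        self.flush_chunk_bytes = flush_chunk_bytes
        self.pending = deque()    # blocks held in the fast buffer tier
        self.backing_store = []   # stand-in for the parallel file system

    def write(self, block):
        # The application returns as soon as the block is buffered.
        self.pending.append(block)

    def flush_step(self):
        """Drain up to flush_chunk_bytes; always drain at least one block.

        Returns the number of bytes moved to the backing store.
        """
        flushed = 0
        while self.pending:
            size = len(self.pending[0])
            if flushed > 0 and flushed + size > self.flush_chunk_bytes:
                break
            self.backing_store.append(self.pending.popleft())
            flushed += size
        return flushed


# A burst of three 4-byte checkpoint blocks, drained 8 bytes per step.
buffer = ToyBurstBuffer(flush_chunk_bytes=8)
for payload in (b"aaaa", b"bbbb", b"cccc"):
    buffer.write(payload)

first_step = buffer.flush_step()
second_step = buffer.flush_step()
```

The application-facing `write` never waits on the slow tier; only `flush_step`, which a background service would call between bursts, touches the backing store.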
WinHPC System Configuration | High-Performance Computing | NREL

Science.gov (United States)

), login node (WinHPC02) and worker/compute nodes. The head node acts as the file, DNS, and license server. The login node is where the users connect to access the cluster. Node 03 has dual Intel Xeon E5530 2008 R2 HPC Edition. The login node, WinHPC02, is where users login to access the system. This is where
Management issues for high performance storage systems

Energy Technology Data Exchange (ETDEWEB)

Louis, S. [Lawrence Livermore National Lab., CA (United States); Burris, R. [Oak Ridge National Lab., TN (United States)

1995-03-01

Managing distributed high-performance storage systems is complex and, although sharing common ground with traditional network and systems management, presents unique storage-related issues. Integration technologies and frameworks exist to help manage distributed network and system environments. Industry-driven consortia provide open forums where vendors and users cooperate to leverage solutions. But these new approaches to open management fall short of addressing the needs of scalable, distributed storage. We discuss the motivation and requirements for storage system management (SSM) capabilities and describe how SSM manages distributed servers and storage resource objects in the High Performance Storage System (HPSS), a new storage facility for data-intensive applications and large-scale computing. Modern storage systems, such as HPSS, require many SSM capabilities, including server and resource configuration control, performance monitoring, quality of service, flexible policies, file migration, file repacking, accounting, and quotas. We present results of initial HPSS SSM development including design decisions and implementation trade-offs. We conclude with plans for follow-on work and provide storage-related recommendations for vendors and standards groups seeking enterprise-wide management solutions.
RISC Processors and High Performance Computing

Science.gov (United States)

Bailey, David H.; Saini, Subhash; Craw, James M. (Technical Monitor)

1995-01-01

This tutorial will discuss the top five RISC microprocessors and the parallel systems in which they are used. It will provide a unique cross-machine comparison not available elsewhere. The effective performance of these processors will be compared by citing standard benchmarks in the context of real applications. The latest NAS Parallel Benchmarks, both absolute performance and performance per dollar, will be listed. The next generation of the NPB will be described. The tutorial will conclude with a discussion of future directions in the field. Technology Transfer Considerations: All of these computer systems are commercially available internationally. Information about these processors is available in the public domain, mostly from the vendors themselves. The NAS Parallel Benchmarks and their results have been previously approved numerous times for public release, beginning back in 1991.
Computer controlled high voltage system

Energy Technology Data Exchange (ETDEWEB)

Kunov, B; Georgiev, G; Dimitrov, L [and others

1996-12-31

A multichannel computer controlled high-voltage power supply system is developed. The basic technical parameters of the system are: output voltage: 100-3000 V; output current: 0-3 mA; maximum number of channels in one crate: 78. 3 refs.
Using high performance interconnects in a distributed computing and mass storage environment

International Nuclear Information System (INIS)

Ernst, M.

1994-01-01

Detector Collaborations of the HERA Experiments typically involve more than 500 physicists from a few dozen institutes. These physicists require access to large amounts of data in a fully transparent manner. Important issues include Distributed Mass Storage Management Systems in a Distributed and Heterogeneous Computing Environment. At the very center of a distributed system, including tens of CPUs and network attached mass storage peripherals, are the communication links. Today scientists are witnessing an integration of computing and communication technology with the 'network' becoming the computer. This contribution reports on a centrally operated computing facility for the HERA Experiments at DESY, including Symmetric Multiprocessor Machines (84 Processors), presently more than 400 GByte of magnetic disk and 40 TB of automated tape storage, tied together by a HIPPI 'network'. Focussing on the High Performance Interconnect technology, details will be provided about the HIPPI based 'Backplane' configured around a 20 Gigabit/s Multi Media Router and the performance and efficiency of the related computer interfaces.
Lightweight Provenance Service for High-Performance Computing

Energy Technology Data Exchange (ETDEWEB)

Dai, Dong; Chen, Yong; Carns, Philip; Jenkins, John; Ross, Robert

2017-09-09

Provenance describes detailed information about the history of a piece of data, containing the relationships among elements such as users, processes, jobs, and workflows that contribute to the existence of data. Provenance is key to supporting many data management functionalities that are increasingly important in operations such as identifying data sources, parameters, or assumptions behind a given result; auditing data usage; or understanding details about how inputs are transformed into outputs. Despite its importance, however, provenance support is largely underdeveloped in highly parallel architectures and systems. One major challenge is the demanding requirements of providing provenance service in situ. The need to remain lightweight and to be always on often conflicts with the need to be transparent and offer an accurate catalog of details regarding the applications and systems. To tackle this challenge, we introduce a lightweight provenance service, called LPS, for high-performance computing (HPC) systems. LPS leverages a kernel instrument mechanism to achieve transparency and introduces representative execution and flexible granularity to capture comprehensive provenance with controllable overhead. Extensive evaluations and use cases have confirmed its efficiency and usability. We believe that LPS can be integrated into current and future HPC systems to support a variety of data management needs.
A performance model for the communication in fast multipole methods on high-performance computing platforms

KAUST Repository

Ibeid, Huda

2016-03-04

Exascale systems are predicted to have approximately 1 billion cores, assuming gigahertz cores. Limitations on affordable network topologies for distributed memory systems of such massive scale bring new challenges to the currently dominant parallel programming model. Currently, there are many efforts to evaluate the hardware and software bottlenecks of exascale designs. It is therefore of interest to model application performance and to understand what changes need to be made to ensure extrapolated scalability. The fast multipole method (FMM) was originally developed for accelerating N-body problems in astrophysics and molecular dynamics but has recently been extended to a wider range of problems. Its high arithmetic intensity combined with its linear complexity and asynchronous communication patterns make it a promising algorithm for exascale systems. In this paper, we discuss the challenges for FMM on current parallel computers and future exascale architectures, with a focus on internode communication. We focus on the communication part only; the efficiency of the computational kernels is beyond the scope of the present study. We develop a performance model that considers the communication patterns of the FMM and observe a good match between our model and the actual communication time on four high-performance computing (HPC) systems, when latency, bandwidth, network topology, and multicore penalties are all taken into account. To our knowledge, this is the first formal characterization of internode communication in FMM that validates the model against actual measurements of communication time. The ultimate communication model is predictive in an absolute sense; however, on complex systems, this objective is often out of reach or of a difficulty out of proportion to its benefit when there exists a simpler model that is inexpensive and sufficient to guide coding decisions leading to improved scaling. The current model provides such guidance.
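The abstract names latency and bandwidth as two of the terms in its communication model but does not state the model itself. As a minimal sketch of those two terms only, the generic latency-bandwidth (often called alpha-beta) model charges each message a fixed startup cost plus a size-dependent transfer cost. This is the textbook model, not the paper's; it omits the topology and multicore penalties mentioned above, and the function names and machine parameters are hypothetical.

```python
def message_time(n_bytes, latency_s, bandwidth_bytes_per_s):
    """Time for one point-to-point message: startup cost + transfer cost."""
    return latency_s + n_bytes / bandwidth_bytes_per_s


def communication_time(message_sizes, latency_s, bandwidth_bytes_per_s):
    """Total time for a sequence of messages sent one after another."""
    return sum(
        message_time(n, latency_s, bandwidth_bytes_per_s)
        for n in message_sizes
    )


# Hypothetical machine: 1 microsecond latency, 10 GB/s link bandwidth.
LATENCY_S = 1e-6
BANDWIDTH = 1e10

small = message_time(1e4, LATENCY_S, BANDWIDTH)   # latency and transfer equal
large = message_time(1e6, LATENCY_S, BANDWIDTH)   # transfer dominates
total = communication_time([1e4, 1e6], LATENCY_S, BANDWIDTH)
```

With these made-up parameters the 10 kB message spends as long on startup as on transfer, while the 1 MB message is almost entirely bandwidth-bound, which is why the number of messages and their sizes have to be modeled separately.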
Integrated State Estimation and Contingency Analysis Software Implementation using High Performance Computing Techniques

Energy Technology Data Exchange (ETDEWEB)

Chen, Yousu; Glaesemann, Kurt R.; Rice, Mark J.; Huang, Zhenyu

2015-12-31

Power system simulation tools are traditionally developed in sequential mode and codes are optimized for single core computing only. However, the increasing complexity in the power grid models requires more intensive computation. The traditional simulation tools will soon not be able to meet the grid operation requirements. Therefore, power system simulation tools need to evolve accordingly to provide faster and better results for grid operations. This paper presents an integrated state estimation and contingency analysis software implementation using high performance computing techniques. The software is able to solve large size state estimation problems within one second and achieve a near-linear speedup of 9,800 with 10,000 cores for contingency analysis application. The performance evaluation is presented to show its effectiveness.
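The phrase "near-linear speedup of 9,800 with 10,000 cores" can be made concrete with two standard metrics: parallel efficiency (speedup divided by core count) and the Karp-Flatt experimentally determined serial fraction. The sketch below applies both to the figures quoted in the abstract; the function names are illustrative, and the metrics are general-purpose rather than taken from the paper.

```python
def parallel_efficiency(speedup, cores):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / cores


def karp_flatt_serial_fraction(speedup, cores):
    """Karp-Flatt metric: e = (1/S - 1/p) / (1 - 1/p)."""
    return (1.0 / speedup - 1.0 / cores) / (1.0 - 1.0 / cores)


# Figures quoted in the abstract above.
SPEEDUP = 9800
CORES = 10000

efficiency = parallel_efficiency(SPEEDUP, CORES)
serial_fraction = karp_flatt_serial_fraction(SPEEDUP, CORES)
```

An efficiency of 0.98 corresponds to an implied serial fraction of roughly two parts per million, which is plausible for contingency analysis if each contingency case can be evaluated independently.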
Trends in high-performance computing for engineering calculations.

Science.gov (United States)

Giles, M B; Reguly, I

2014-08-13

High-performance computing has evolved remarkably over the past 20 years, and that progress is likely to continue. However, in recent years, this progress has been achieved through greatly increased hardware complexity with the rise of multicore and manycore processors, and this is affecting the ability of application developers to achieve the full potential of these systems. This article outlines the key developments on the hardware side, both in the recent past and in the near future, with a focus on two key issues: energy efficiency and the cost of moving data. It then discusses the much slower evolution of system software, and the implications of all of this for application developers. © 2014 The Author(s) Published by the Royal Society. All rights reserved.
US QCD computational performance studies with PERI

International Nuclear Information System (INIS)

Zhang, Y; Fowler, R; Huck, K; Malony, A; Porterfield, A; Reed, D; Shende, S; Taylor, V; Wu, X

2007-01-01

We report on some of the interactions between two SciDAC projects: The National Computational Infrastructure for Lattice Gauge Theory (USQCD), and the Performance Engineering Research Institute (PERI). Many modern scientific programs consistently report the need for faster computational resources to maintain global competitiveness. However, as the size and complexity of emerging high end computing (HEC) systems continue to rise, achieving good performance on such systems is becoming ever more challenging. In order to take full advantage of the resources, it is crucial to understand the characteristics of relevant scientific applications and the systems these applications are running on. Using tools developed under PERI and by other performance measurement researchers, we studied the performance of two applications, MILC and Chroma, on several high performance computing systems at DOE laboratories. In the case of Chroma, we discuss how the use of C++ and modern software engineering and programming methods is driving the evolution of performance tools.
10th International Workshop on Parallel Tools for High Performance Computing

CERN Document Server

Gracia, José; Hilbrich, Tobias; Knüpfer, Andreas; Resch, Michael; Nagel, Wolfgang

2017-01-01

This book presents the proceedings of the 10th International Parallel Tools Workshop, held October 4-5, 2016 in Stuttgart, Germany – a forum to discuss the latest advances in parallel tools. High-performance computing plays an increasingly important role for numerical simulation and modelling in academic and industrial research. At the same time, using large-scale parallel systems efficiently is becoming more difficult. A number of tools addressing parallel program development and analysis have emerged from the high-performance computing community over the last decade, and what may have started as a collection of small helper scripts has now matured into production-grade frameworks. Powerful user interfaces and an extensive body of documentation allow easy usage by non-specialists.
High Performance Computing Facility Operational Assessment 2015: Oak Ridge Leadership Computing Facility

Energy Technology Data Exchange (ETDEWEB)

Barker, Ashley D. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Bernholdt, David E. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Bland, Arthur S. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Gary, Jeff D. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Hack, James J. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; McNally, Stephen T. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Rogers, James H. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Smith, Brian E. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Straatsma, T. P. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Sukumar, Sreenivas Rangan [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Thach, Kevin G. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Tichenor, Suzy [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Vazhkudai, Sudharshan S. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Wells, Jack C. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility

2016-03-01

Oak Ridge National Laboratory’s (ORNL’s) Leadership Computing Facility (OLCF) continues to surpass its operational target goals: supporting users; delivering fast, reliable systems; creating innovative solutions for high-performance computing (HPC) needs; and managing risks, safety, and security aspects associated with operating one of the most powerful computers in the world. The results can be seen in the cutting-edge science delivered by users and the praise from the research community. Calendar year (CY) 2015 was filled with outstanding operational results and accomplishments: a very high rating from users on overall satisfaction that ties the highest-ever mark set in CY 2014; the greatest number of core-hours delivered to research projects; the largest percentage of capability usage since the OLCF began tracking the metric in 2009; and success in delivering on the allocation of 60, 30, and 10% of core hours offered for the INCITE (Innovative and Novel Computational Impact on Theory and Experiment), ALCC (Advanced Scientific Computing Research Leadership Computing Challenge), and Director’s Discretionary programs, respectively. These accomplishments, coupled with the extremely high utilization rate, represent the fulfillment of the promise of Titan: maximum use by maximum-size simulations. The impact of all of these successes and more is reflected in the accomplishments of OLCF users, with publications this year in notable journals Nature, Nature Materials, Nature Chemistry, Nature Physics, Nature Climate Change, ACS Nano, Journal of the American Chemical Society, and Physical Review Letters, as well as many others. The achievements included in the 2015 OLCF Operational Assessment Report reflect first-ever or largest simulations in their communities; for example Titan enabled engineers in Los Angeles and the surrounding region to design and begin building improved critical infrastructure by enabling the highest-resolution Cybershake map for Southern
High-performance computing in accelerating structure design and analysis

International Nuclear Information System (INIS)

Li Zenghai; Folwell, Nathan; Ge Lixin; Guetz, Adam; Ivanov, Valentin; Kowalski, Marc; Lee, Lie-Quan; Ng, Cho-Kuen; Schussman, Greg; Stingelin, Lukas; Uplenchwar, Ravindra; Wolf, Michael; Xiao, Liling; Ko, Kwok

2006-01-01

Future high-energy accelerators such as the Next Linear Collider (NLC) will accelerate multi-bunch beams of high current and low emittance to obtain high luminosity, which places stringent requirements on the accelerating structures for efficiency and beam stability. While numerical modeling has been quite standard in accelerator R&D, designing the NLC accelerating structure required a new simulation capability because of the geometric complexity and level of accuracy involved. Under the US DOE Advanced Computing initiatives (first the Grand Challenge and now SciDAC), SLAC has developed a suite of electromagnetic codes based on unstructured grids and utilizing high-performance computing to provide an advanced tool for modeling structures at accuracies and scales previously not possible. This paper will discuss the code development and computational science research (e.g. domain decomposition, scalable eigensolvers, adaptive mesh refinement) that have enabled the large-scale simulations needed for meeting the computational challenges posed by the NLC as well as projects such as the PEP-II and RIA. Numerical results will be presented to show how high-performance computing has made a qualitative improvement in accelerator structure modeling for these accelerators, either at the component level (single cell optimization), or on the scale of an entire structure (beam heating and long-range wakefields).
Cloud object store for archive storage of high performance computing data using decoupling middleware

Science.gov (United States)

Bent, John M.; Faibish, Sorin; Grider, Gary

2015-06-30

Cloud object storage is enabled for archived data, such as checkpoints and results, of high performance computing applications using a middleware process. A plurality of archived files, such as checkpoint files and results, generated by a plurality of processes in a parallel computing system are stored by obtaining the plurality of archived files from the parallel computing system; converting the plurality of archived files to objects using a log structured file system middleware process; and providing the objects for storage in a cloud object storage system. The plurality of processes may run, for example, on a plurality of compute nodes. The log structured file system middleware process may be embodied, for example, as a Parallel Log-Structured File System (PLFS). The log structured file system middleware process optionally executes on a burst buffer node.
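The flow described in this abstract (collect archived files, convert them to objects through a log-structured layer, hand the objects to an object store) can be illustrated with a toy sketch. This is not PLFS and not the patented method: the function names, the key layout, the `run42` key, and the dictionary standing in for a cloud object store are all assumptions made for illustration.

```python
import io
import json


def pack_log_structured(files):
    """Append each file's bytes to one log and record (offset, length) per name."""
    log = io.BytesIO()
    index = {}
    for name, data in files.items():
        index[name] = (log.tell(), len(data))
        log.write(data)
    return log.getvalue(), index


def archive_to_object_store(files, store, key):
    """Store the packed log and its index as two objects in a dict-like store."""
    log, index = pack_log_structured(files)
    store[key + "/data"] = log
    store[key + "/index"] = json.dumps(index).encode("utf-8")


def read_back(store, key, name):
    """Recover one archived file from the stored objects."""
    index = json.loads(store[key + "/index"].decode("utf-8"))
    offset, length = index[name]
    return store[key + "/data"][offset:offset + length]


# Hypothetical checkpoint files written by two processes.
checkpoints = {"rank0.ckpt": b"state-of-rank-0", "rank1.ckpt": b"state-of-rank-1"}
object_store = {}  # stand-in for a cloud object storage system
archive_to_object_store(checkpoints, object_store, "run42")
```

Packing many small checkpoint files into one appended log plus an index turns many writes into a single object upload, which is the general motivation for a log-structured layer in front of an object store.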
Enabling the ATLAS Experiment at the LHC for High Performance Computing

CERN Document Server

AUTHOR|(CDS)2091107; Ereditato, Antonio

In this thesis, I studied the feasibility of running computer data analysis programs from the Worldwide LHC Computing Grid, in particular large-scale simulations of the ATLAS experiment at the CERN LHC, on current general purpose High Performance Computing (HPC) systems. An approach for integrating HPC systems into the Grid is proposed, which has been implemented and tested on the „Todi” HPC machine at the Swiss National Supercomputing Centre (CSCS). Over the course of the test, more than 500000 CPU-hours of processing time have been provided to ATLAS, which is roughly equivalent to the combined computing power of the two ATLAS clusters at the University of Bern. This showed that current HPC systems can be used to efficiently run large-scale simulations of the ATLAS detector and of the detected physics processes. As a first conclusion of my work, one can argue that, in perspective, running large-scale tasks on a few large machines might be more cost-effective than running on relatively small dedicated com...
7th International Workshop on Parallel Tools for High Performance Computing

CERN Document Server

Gracia, José; Nagel, Wolfgang; Resch, Michael

2014-01-01

Current advances in High Performance Computing (HPC) increasingly impact efficient software development workflows. Programmers for HPC applications need to consider trends such as increased core counts, multiple levels of parallelism, reduced memory per core, and I/O system challenges in order to derive well performing and highly scalable codes. At the same time, the increasing complexity adds further sources of program defects. While novel programming paradigms and advanced system libraries provide solutions for some of these challenges, appropriate supporting tools are indispensable. Such tools aid application developers in debugging, performance analysis, or code optimization and therefore make a major contribution to the development of robust and efficient parallel software. This book introduces a selection of the tools presented and discussed at the 7th International Parallel Tools Workshop, held in Dresden, Germany, September 3-4, 2013.

High Performance Computing - Power Application Programming Interface Specification Version 1.4

Energy Technology Data Exchange (ETDEWEB)

Laros III, James H. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); DeBonis, David [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Grant, Ryan [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Kelly, Suzanne M. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Levenhagen, Michael J. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Olivier, Stephen Lecler [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Pedretti, Kevin [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

2016-10-01

Measuring and controlling the power and energy consumption of high performance computing systems by various components in the software stack is an active research area [13, 3, 5, 10, 4, 21, 19, 16, 7, 17, 20, 18, 11, 1, 6, 14, 12]. Implementations in lower level software layers are beginning to emerge in some production systems, which is very welcome. To be most effective, a portable interface to measurement and control features would significantly facilitate participation by all levels of the software stack. We present a proposal for a standard power Application Programming Interface (API) that endeavors to cover the entire software space, from generic hardware interfaces to the input from the computer facility manager.
Scalable domain decomposition solvers for stochastic PDEs in high performance computing

International Nuclear Information System (INIS)

Desai, Ajit; Pettit, Chris; Poirel, Dominique; Sarkar, Abhijit

2017-01-01

Stochastic spectral finite element models of practical engineering systems may involve solutions of linear systems or linearized systems for non-linear problems with billions of unknowns. For stochastic modeling, it is therefore essential to design robust, parallel and scalable algorithms that can efficiently utilize high-performance computing to tackle such large-scale systems. Domain decomposition based iterative solvers can handle such systems. Although these algorithms exhibit excellent scalability, significant algorithmic and implementation challenges remain in extending them to solve extreme-scale stochastic systems on emerging computing platforms. Intrusive polynomial chaos expansion based domain decomposition algorithms are extended here to concurrently handle high resolution in both spatial and stochastic domains using an in-house implementation. Sparse iterative solvers with efficient preconditioners are employed to solve the resulting global and subdomain level local systems through multi-level iterative solvers. We also use parallel sparse matrix–vector operations to reduce the floating-point operations and memory requirements. Numerical and parallel scalabilities of these algorithms are presented for the diffusion equation having spatially varying diffusion coefficient modeled by a non-Gaussian stochastic process. Scalability of the solvers with respect to the number of random variables is also investigated.
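As a rough illustration of the "sparse iterative solver with a preconditioner" idea in this abstract, the sketch below solves a small one-dimensional diffusion system with a spatially varying coefficient using preconditioned conjugate gradients. It deliberately swaps in a plain Jacobi (diagonal) preconditioner and a deterministic coefficient in place of the paper's domain decomposition preconditioners and stochastic model; the grid, coefficient values, and function names are assumptions.

```python
def assemble_diffusion(kappa):
    """Tridiagonal matrix for -(kappa u')' on a uniform grid with unit spacing.

    kappa holds one value per cell interface, so the system size is len(kappa) - 1.
    Returns the main diagonal and the (symmetric) off-diagonal.
    """
    n = len(kappa) - 1
    diag = [kappa[i] + kappa[i + 1] for i in range(n)]
    off = [-kappa[i + 1] for i in range(n - 1)]
    return diag, off


def matvec(diag, off, x):
    """Sparse matrix-vector product using only the stored diagonals."""
    n = len(x)
    y = [diag[i] * x[i] for i in range(n)]
    for i in range(n - 1):
        y[i] += off[i] * x[i + 1]
        y[i + 1] += off[i] * x[i]
    return y


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def jacobi_pcg(diag, off, b, tol=1e-10, max_iter=1000):
    """Conjugate gradients preconditioned by the inverse of the main diagonal."""
    x = [0.0] * len(b)
    r = list(b)
    z = [ri / di for ri, di in zip(r, diag)]
    p = list(z)
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:
            break
        ap = matvec(diag, off, p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Hypothetical coefficient that grows linearly across the domain.
kappa = [1.0 + 0.5 * i for i in range(9)]
diag, off = assemble_diffusion(kappa)
rhs = [1.0] * 8
solution = jacobi_pcg(diag, off, rhs)
```

Only matrix-vector products and diagonal scalings appear in the loop, which is why solvers of this family parallelize well once the matrix is partitioned across subdomains.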
A high-performance data acquisition system for computer-based multichannel analyzer

International Nuclear Information System (INIS)

Zhou Xinzhi; Bai Rongsheng; Wen Liangbi; Huang Yanwen

1996-01-01

A high-performance data acquisition system for a multichannel analyzer is designed around a single-chip microcomputer system. The paper presents the principle and method for realizing simultaneous data acquisition, data pre-processing, and fast bidirectional data transfer by means of direct memory access based on dual-port RAM. The system can also efficiently measure the dead time or live time of the ADC system.
High-Performance Compute Infrastructure in Astronomy: 2020 Is Only Months Away

Science.gov (United States)

Berriman, B.; Deelman, E.; Juve, G.; Rynge, M.; Vöckler, J. S.

2012-09-01

, and so the costs of running applications vary widely according to how they use resources. The cloud is well suited to processing CPU-bound (and memory bound) workflows such as the periodogram code, given the relatively low cost of processing in comparison with I/O operations. I/O-bound applications such as Montage perform best on high-performance clusters with fast networks and parallel file-systems. Science-driven Cyberinfrastructure: Montage has been widely used as a driver application to develop workflow management services, such as task scheduling in distributed environments, designing fault tolerance techniques for job schedulers, and developing workflow orchestration techniques. Running Parallel Applications Across Distributed Cloud Environments: Data processing will eventually take place in parallel distributed across cyber infrastructure environments having different architectures. We have used the Pegasus Workflow Management System (WMS) to successfully run applications across three very different environments: TeraGrid, OSG (Open Science Grid), and FutureGrid. Provisioning resources across different grids and clouds (also referred to as Sky Computing), involves establishing a distributed environment, where issues of, e.g., remote job submission, data management, and security need to be addressed. This environment also requires building virtual machine images that can run in different environments. Usually, each cloud provides basic images that can be customized with additional software and services. In most of our work, we provisioned compute resources using a custom application, called Wrangler. Pegasus WMS abstracts the architectures of the compute environments away from the end-user, and can be considered a first-generation tool suitable for scientists to run their applications on disparate environments.
Contributing to the design of run-time systems dedicated to high performance computing

International Nuclear Information System (INIS)

Perache, M.

2006-10-01

In the field of intensive scientific computing, the quest for performance has to face the increasing complexity of parallel architectures. Nowadays, these machines exhibit a deep memory hierarchy which complicates the design of efficient parallel applications. This thesis proposes a programming environment for designing efficient parallel programs on top of clusters of multi-processors. It features a programming model centered around collective communications and synchronizations, and provides load balancing facilities. The programming interface, named MPC, provides high level paradigms which are optimized according to the underlying architecture. The environment is fully functional and used within the CEA/DAM (TERANOVA) computing center. The evaluations presented in this document confirm the relevance of our approach. (author)
High Performance Computing in Science and Engineering '08 : Transactions of the High Performance Computing Center

CERN Document Server

Kröner, Dietmar; Resch, Michael

2009-01-01

The discussions and plans on all scientific, advisory, and political levels to realize an even larger “European Supercomputer” in Germany, where the hardware costs alone will be hundreds of millions of euros – much more than in the past – are getting closer to realization. As part of the strategy, the three national supercomputing centres HLRS (Stuttgart), NIC/JSC (Jülich) and LRZ (Munich) have formed the Gauss Centre for Supercomputing (GCS) as a new virtual organization enabled by an agreement between the Federal Ministry of Education and Research (BMBF) and the state ministries for research of Baden-Württemberg, Bayern, and Nordrhein-Westfalen. Already today, the GCS provides the most powerful high-performance computing infrastructure in Europe. Through GCS, HLRS participates in the European project PRACE (Partnership for Advanced Computing in Europe) and extends its reach to all European member countries. These activities align well with the activities of HLRS in the European HPC infrastructur...
High performance computing in science and engineering '09: transactions of the High Performance Computing Center, Stuttgart (HLRS) 2009

National Research Council Canada - National Science Library

Nagel, Wolfgang E; Kröner, Dietmar; Resch, Michael

2010-01-01

...), NIC/JSC (Jülich), and LRZ (Munich). As part of that strategic initiative, in May 2009 NIC/JSC already installed the first phase of the GCS HPC Tier-0 resources, an IBM Blue Gene/P with roughly 300,000 cores, this time in Jülich. With that, the GCS provides the most powerful high-performance computing infrastructure in Europe alread...
The contribution of high-performance computing and modelling for industrial development

CSIR Research Space (South Africa)

Sithole, Happy

2017-10-01

Performance Computing and Modelling for Industrial Development. Dr Happy Sithole and Dr Onno Ubbink. Strategic context: High-performance computing (HPC) combined with machine learning and artificial intelligence present opportunities to non...
The path toward HEP High Performance Computing

International Nuclear Information System (INIS)

Apostolakis, John; Brun, René; Gheata, Andrei; Wenzel, Sandro; Carminati, Federico

2014-01-01

High Energy Physics code has been known for making poor use of high performance computing architectures. Efforts in optimising HEP code on vector and RISC architectures have yielded limited results and recent studies have shown that, on modern architectures, it achieves a performance between 10% and 50% of the peak one. Although several successful attempts have been made to port selected codes on GPUs, no major HEP code suite has a 'High Performance' implementation. With LHC undergoing a major upgrade and a number of challenging experiments on the drawing board, HEP cannot any longer neglect the less-than-optimal performance of its code and it has to try making the best usage of the hardware. This activity is one of the foci of the SFT group at CERN, which hosts, among others, the Root and Geant4 projects. The activity of the experiments is shared and coordinated via a Concurrency Forum, where the experience in optimising HEP code is presented and discussed. Another activity is the Geant-V project, centred on the development of a high-performance prototype for particle transport. Achieving a good concurrency level on the emerging parallel architectures without a complete redesign of the framework can only be done by parallelizing at event level, or with a much larger effort at track level. Apart from the shareable data structures, this typically implies a multiplication factor in terms of memory consumption compared to the single threaded version, together with sub-optimal handling of event processing tails. Besides this, the low level instruction pipelining of modern processors cannot be used efficiently to speed up the program. We have implemented a framework that allows scheduling vectors of particles to an arbitrary number of computing resources in a fine grain parallel approach. The talk will review the current optimisation activities within the SFT group with a particular emphasis on the development perspectives towards a simulation framework able to profit
High performance computing and communications: FY 1997 implementation plan

Energy Technology Data Exchange (ETDEWEB)

NONE

1996-12-01

The High Performance Computing and Communications (HPCC) Program was formally authorized by passage, with bipartisan support, of the High-Performance Computing Act of 1991, signed on December 9, 1991. The original Program, in which eight Federal agencies participated, has now grown to twelve agencies. This Plan provides a detailed description of the agencies' FY 1996 HPCC accomplishments and FY 1997 HPCC plans. Section 3 of this Plan provides an overview of the HPCC Program. Section 4 contains more detailed definitions of the Program Component Areas, with an emphasis on the overall directions and milestones planned for each PCA. Appendix A provides a detailed look at HPCC Program activities within each agency.
High performance computing and communications: FY 1996 implementation plan

Energy Technology Data Exchange (ETDEWEB)

NONE

1995-05-16

The High Performance Computing and Communications (HPCC) Program was formally authorized by passage of the High Performance Computing Act of 1991, signed on December 9, 1991. Twelve federal agencies, in collaboration with scientists and managers from US industry, universities, and research laboratories, have developed the Program to meet the challenges of advancing computing and associated communications technologies and practices. This plan provides a detailed description of the agencies' HPCC implementation plans for FY 1995 and FY 1996. This Implementation Plan contains three additional sections. Section 3 provides an overview of the HPCC Program definition and organization. Section 4 contains a breakdown of the five major components of the HPCC Program, with an emphasis on the overall directions and milestones planned for each one. Section 5 provides a detailed look at HPCC Program activities within each agency.
Low-cost, high-performance and efficiency computational photometer design

Science.gov (United States)

Siewert, Sam B.; Shihadeh, Jeries; Myers, Randall; Khandhar, Jay; Ivanov, Vitaly

2014-05-01

Researchers at the University of Alaska Anchorage and University of Colorado Boulder have built a low cost high performance and efficiency drop-in-place Computational Photometer (CP) to test in field applications ranging from port security and safety monitoring to environmental compliance monitoring and surveying. The CP integrates off-the-shelf visible spectrum cameras with near to long wavelength infrared detectors and high resolution digital snapshots in a single device. The proof of concept combines three or more detectors into a single multichannel imaging system that can time correlate read-out, capture, and image process all of the channels concurrently with high performance and energy efficiency. The dual-channel continuous read-out is combined with a third high definition digital snapshot capability and has been designed using an FPGA (Field Programmable Gate Array) to capture, decimate, down-convert, re-encode, and transform images from two standard definition CCD (Charge Coupled Device) cameras at 30 Hz. The continuous stereo vision can be time correlated to megapixel high definition snapshots. This proof of concept has been fabricated as a four-layer PCB (Printed Circuit Board) suitable for use in education and research for low cost high efficiency field monitoring applications that need multispectral and three dimensional imaging capabilities. Initial testing is in progress and includes field testing in ports, potential test flights in unmanned aerial systems, and future planned missions to image harsh environments in the arctic including volcanic plumes, ice formation, and arctic marine life.
High Performance Spaceflight Computing (HPSC)

Data.gov (United States)

National Aeronautics and Space Administration — Space-based computing has not kept up with the needs of current and future NASA missions. We are developing a next-generation flight computing system that addresses...
Solving Problems in Various Domains by Hybrid Models of High Performance Computations

Directory of Open Access Journals (Sweden)

Yurii Rogozhin

2014-03-01

This work presents a hybrid model of high performance computations. The model is based on a membrane system (P system) in which some membranes may contain a quantum device that is triggered by the data entering the membrane. This model is supposed to take advantage of both biomolecular and quantum paradigms and to overcome some of their inherent limitations. The proposed approach is demonstrated through two selected problems: SAT and image retrieval.
Polymer waveguides for electro-optical integration in data centers and high-performance computers.

Science.gov (United States)

Dangel, Roger; Hofrichter, Jens; Horst, Folkert; Jubin, Daniel; La Porta, Antonio; Meier, Norbert; Soganci, Ibrahim Murat; Weiss, Jonas; Offrein, Bert Jan

2015-02-23

To satisfy the intra- and inter-system bandwidth requirements of future data centers and high-performance computers, low-cost low-power high-throughput optical interconnects will become a key enabling technology. To tightly integrate optics with the computing hardware, particularly in the context of CMOS-compatible silicon photonics, optical printed circuit boards using polymer waveguides are considered as a formidable platform. IBM Research has already demonstrated the essential silicon photonics and interconnection building blocks. A remaining challenge is electro-optical packaging, i.e., the connection of the silicon photonics chips with the system. In this paper, we present a new single-mode polymer waveguide technology and a scalable method for building the optical interface between silicon photonics chips and single-mode polymer waveguides.
Use of high performance computing to examine the effectiveness of aquifer remediation

International Nuclear Information System (INIS)

Tompson, A.F.B.; Ashby, S.F.; Falgout, R.D.; Smith, S.G.; Fogwell, T.W.; Loosmore, G.A.

1994-06-01

Large-scale simulation of fluid flow and chemical migration is being used to study the effectiveness of pump-and-treat restoration of a contaminated, saturated aquifer. A three-element approach focusing on geostatistical representations of heterogeneous aquifers, high-performance computing strategies for simulating flow, migration, and reaction processes in large three-dimensional systems, and highly-resolved simulations of flow and chemical migration in porous formations will be discussed. Results from a preliminary application of this approach to examine pumping behavior at a real, heterogeneous field site will be presented. Future activities will emphasize parallel computations in larger, dynamic, and nonlinear (two-phase) flow problems as well as improved interpretive methods for defining detailed material property distributions
New Developments in Modeling MHD Systems on High Performance Computing Architectures

Science.gov (United States)

Germaschewski, K.; Raeder, J.; Larson, D. J.; Bhattacharjee, A.

2009-04-01

Modeling the wide range of time and length scales present even in fluid models of plasmas like MHD and X-MHD (Extended MHD including two fluid effects like Hall term, electron inertia, electron pressure gradient) is challenging even on state-of-the-art supercomputers. In recent years, HPC capacity has continued to grow exponentially, but at the expense of making the computer systems more and more difficult to program in order to get maximum performance. In this paper, we will present a new approach to managing the complexity caused by the need to write efficient codes: separating the numerical description of the problem, in our case a discretized right hand side (r.h.s.), from the actual implementation of efficiently evaluating it. An automatic code generator is used to describe the r.h.s. in a quasi-symbolic form while leaving the translation into efficient and parallelized code to a computer program itself. We implemented this approach for OpenGGCM (Open General Geospace Circulation Model), a model of the Earth's magnetosphere, which was accelerated by a factor of three on regular x86 architecture and a factor of 25 on the Cell BE architecture (commonly known for its deployment in Sony's PlayStation 3).
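The separation this abstract describes, a quasi-symbolic description of the r.h.s. on one side and generated evaluation code on the other, can be sketched in a few lines. This is only a toy under stated assumptions: the (coefficient, offset) term format, the `generate_rhs` helper, and the one-dimensional stencil are hypothetical and are not the OpenGGCM generator.

```python
def generate_rhs(name, terms):
    """Build Python source for a 1-D stencil r.h.s. from (coefficient, offset) pairs."""
    expr = " + ".join(f"({c!r}) * u[i + ({k})]" for c, k in terms)
    reach = max(abs(k) for _, k in terms)
    return (
        f"def {name}(u):\n"
        f"    out = [0.0] * len(u)\n"
        f"    for i in range({reach}, len(u) - {reach}):\n"
        f"        out[i] = {expr}\n"
        f"    return out\n"
    )


# Quasi-symbolic description: second difference u[i-1] - 2 u[i] + u[i+1].
laplacian_terms = [(1.0, -1), (-2.0, 0), (1.0, 1)]
source = generate_rhs("rhs_laplacian", laplacian_terms)

# Compile the generated source into a callable function.
namespace = {}
exec(source, namespace)
rhs_laplacian = namespace["rhs_laplacian"]
```

Because the numerical description lives in `laplacian_terms` alone, a different backend (for example one emitting vectorized or accelerator-specific code) could be substituted for `generate_rhs` without touching the description.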
What Physicists Should Know About High Performance Computing - Circa 2002

Science.gov (United States)

Frederick, Donald

2002-08-01

High Performance Computing (HPC) is a dynamic, cross-disciplinary field that traditionally has involved applied mathematicians, computer scientists, and others primarily from the various disciplines that have been major users of HPC resources - physics, chemistry, engineering, with increasing use by those in the life sciences. There is a technological dynamic that is powered by economic as well as by technical innovations and developments. This talk will discuss practical ideas to be considered when developing numerical applications for research purposes. Even with the rapid pace of development in the field, the author believes that these concepts will not become obsolete for a while, and will be of use to scientists who either are considering, or who have already started down the HPC path. These principles will be applied in particular to current parallel HPC systems, but there will also be references of value to desktop users. The talk will cover such topics as: computing hardware basics, single-cpu optimization, compilers, timing, numerical libraries, debugging and profiling tools and the emergence of Computational Grids.
Inclusive vision for high performance computing at the CSIR

CSIR Research Space (South Africa)

Gazendam, A

2006-02-01

...and computationally intensive applications. A number of different technologies and standards were identified as core to the open and distributed high-performance infrastructure envisaged...
High Performance Computing Multicast

Science.gov (United States)

2012-02-01

A History of the Virtual Synchrony Replication Model,” in Replication: Theory and Practice, Charron-Bost, B., Pedone, F., and Schiper, A. (Eds...Performance Computing IP / IPv4 Internet Protocol (version 4.0) IPMC Internet Protocol MultiCast LAN Local Area Network MCMD Dr. Multicast MPI

A Framework for Debugging Geoscience Projects in a High Performance Computing Environment

Science.gov (United States)

Baxter, C.; Matott, L.

2012-12-01

High performance computing (HPC) infrastructure has become ubiquitous in today's world with the emergence of commercial cloud computing and academic supercomputing centers. Teams of geoscientists, hydrologists and engineers can take advantage of this infrastructure to undertake large research projects - for example, linking one or more site-specific environmental models with soft computing algorithms, such as heuristic global search procedures, to perform parameter estimation and predictive uncertainty analysis, and/or design least-cost remediation systems. However, the size, complexity and distributed nature of these projects can make identifying failures in the associated numerical experiments using conventional ad-hoc approaches both time-consuming and ineffective. To address these problems a multi-tiered debugging framework has been developed. The framework allows for quickly isolating and remedying a number of potential experimental failures, including: failures in the HPC scheduler; bugs in the soft computing code; bugs in the modeling code; and permissions and access control errors. The utility of the framework is demonstrated via application to a series of over 200,000 numerical experiments involving a suite of 5 heuristic global search algorithms and 15 mathematical test functions serving as cheap analogues for the simulation-based optimization of pump-and-treat subsurface remediation systems.
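One way to picture a multi-tiered triage of failed experiments is a log classifier that checks each failure category in turn. The sketch below is an assumption-laden illustration, not the framework from this abstract: the tier order, the marker strings, and the function names are all hypothetical.

```python
# Tiers are checked in order; the first matching tier is reported.
# All marker strings are hypothetical examples, not real scheduler or model output.
TIERS = [
    ("scheduler", ("job killed", "walltime exceeded", "node failure")),
    ("permissions", ("permission denied", "access denied")),
    ("soft computing code", ("optimizer error", "search diverged")),
    ("modeling code", ("model crashed", "nan in output")),
]


def classify_failure(log_text):
    """Return the first tier whose marker appears in the log, or 'unknown'."""
    lowered = log_text.lower()
    for tier, markers in TIERS:
        if any(marker in lowered for marker in markers):
            return tier
    return "unknown"


def triage(logs):
    """Group experiment ids by the tier their log is attributed to."""
    report = {}
    for experiment_id, text in logs.items():
        report.setdefault(classify_failure(text), []).append(experiment_id)
    return report
```

Applied to a large batch of experiment logs, `triage` yields one list of experiment ids per tier, so infrastructure failures can be resubmitted while code bugs are routed to the people who own that code.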
The path toward HEP High Performance Computing

CERN Document Server

Apostolakis, John; Carminati, Federico; Gheata, Andrei; Wenzel, Sandro

2014-01-01

High Energy Physics (HEP) code has been known for making poor use of high performance computing architectures. Efforts in optimising HEP code on vector and RISC architectures have yielded limited results, and recent studies have shown that, on modern architectures, it achieves between 10% and 50% of peak performance. Although several successful attempts have been made to port selected codes to GPUs, no major HEP code suite has a 'High Performance' implementation. With the LHC undergoing a major upgrade and a number of challenging experiments on the drawing board, HEP can no longer neglect the less-than-optimal performance of its code and must try to make the best use of the hardware. This activity is one of the foci of the SFT group at CERN, which hosts, among others, the ROOT and Geant4 projects. The activity of the experiments is shared and coordinated via a Concurrency Forum, where the experience in optimising HEP code is presented and discussed. Another activity is the Geant-V project, centred on th...
High-performance computing for structural mechanics and earthquake/tsunami engineering

CERN Document Server

Hori, Muneo; Ohsaki, Makoto

2016-01-01

Huge earthquakes and tsunamis have caused serious damage to important structures such as civil infrastructure elements, buildings and power plants around the globe. To quantitatively evaluate such damage processes and to design effective prevention and mitigation measures, the latest high-performance computational mechanics technologies, which include terascale to petascale computers, can offer powerful tools. The phenomena covered in this book include seismic wave propagation in the crust and soil, seismic response of infrastructure elements such as tunnels considering soil-structure interactions, seismic response of high-rise buildings, seismic response of nuclear power plants, tsunami run-up over coastal towns and tsunami inundation considering fluid-structure interactions. The book provides all necessary information for addressing these phenomena, ranging from the fundamentals of high-performance computing for finite element methods, key algorithms of accurate dynamic structural analysis, fluid flows ...
Power/energy use cases for high performance computing

Energy Technology Data Exchange (ETDEWEB)

Laros, James H. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)]; Kelly, Suzanne M. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)]; Hammond, Steven [National Renewable Energy Lab. (NREL), Golden, CO (United States)]; Elmore, Ryan [National Renewable Energy Lab. (NREL), Golden, CO (United States)]; Munch, Kristin [National Renewable Energy Lab. (NREL), Golden, CO (United States)]

2013-12-01

Power and energy have been identified as a first-order challenge for future extreme-scale high performance computing (HPC) systems. In practice the breakthroughs will need to be provided by the hardware vendors. But making the best use of these solutions in an HPC environment will likely require periodic tuning by facility operators and software components. This document describes the actions and interactions needed to maximize power resources. It strives to cover the entire operational space that an HPC system occupies. The descriptions are presented as formal use cases, as documented in the Unified Modeling Language Specification [1]. The document is intended to provide a common understanding to the HPC community of the necessary management and control capabilities. Assuming a common understanding can be achieved, the next step will be to develop a set of Application Programming Interfaces (APIs) that hardware vendors and software developers could use to steer power consumption.
Performance Aspects of Synthesizable Computing Systems

DEFF Research Database (Denmark)

Schleuniger, Pascal

Embedded systems are used in a broad range of applications that demand high performance within severely constrained mechanical, power, and cost requirements. Embedded systems implemented in ASIC technology tend to provide the highest performance, lowest power consumption and lowest unit cost. How...
High performance parallel computing of flows in complex geometries: I. Methods

International Nuclear Information System (INIS)

Gourdain, N; Gicquel, L; Montagnac, M; Vermorel, O; Staffelbach, G; Garcia, M; Boussuge, J-F; Gazaix, M; Poinsot, T

2009-01-01

Efficient numerical tools coupled with high-performance computers have become a key element of the design process in the fields of energy supply and transportation. However, flow phenomena that occur in complex systems such as gas turbines and aircraft are still not understood, mainly because of the models that are needed. In fact, most computational fluid dynamics (CFD) predictions as found today in industry focus on a reduced or simplified version of the real system (such as a periodic sector) and are usually solved with a steady-state assumption. This paper shows how to overcome such barriers and how such a new challenge can be addressed by developing flow solvers running on high-end computing platforms, using thousands of computing cores. Parallel strategies used by modern flow solvers are discussed with particular emphasis on mesh-partitioning, load balancing and communication. Two examples are used to illustrate these concepts: a multi-block structured code and an unstructured code. Parallel computing strategies used with both flow solvers are detailed and compared. This comparison indicates that mesh-partitioning and load balancing are more straightforward with unstructured grids than with multi-block structured meshes. However, the mesh-partitioning stage can be challenging for unstructured grids, mainly due to memory limitations of the newly developed massively parallel architectures. Finally, detailed investigations show that the impact of mesh-partitioning on the numerical CFD solutions, due to rounding errors and block splitting, may be of importance and should be accurately addressed before qualifying massively parallel CFD tools for routine industrial use.
Scientific Grand Challenges: Forefront Questions in Nuclear Science and the Role of High Performance Computing

International Nuclear Information System (INIS)

Khaleel, Mohammad A.

2009-01-01

This report is an account of the deliberations and conclusions of the workshop on 'Forefront Questions in Nuclear Science and the Role of High Performance Computing' held January 26-28, 2009, co-sponsored by the U.S. Department of Energy (DOE) Office of Nuclear Physics (ONP) and the DOE Office of Advanced Scientific Computing Research (ASCR). Representatives from the national and international nuclear physics communities, as well as from the high performance computing community, participated. The purpose of this workshop was to (1) identify forefront scientific challenges in nuclear physics and then determine which, if any, of these could be aided by high performance computing at the extreme scale; (2) establish how and why new high performance computing capabilities could address issues at the frontiers of nuclear science; (3) provide nuclear physicists the opportunity to influence the development of high performance computing; and (4) provide the nuclear physics community with plans for development of future high performance computing capability by DOE ASCR.
Scientific Grand Challenges: Forefront Questions in Nuclear Science and the Role of High Performance Computing

Energy Technology Data Exchange (ETDEWEB)

Khaleel, Mohammad A.

2009-10-01

This report is an account of the deliberations and conclusions of the workshop on "Forefront Questions in Nuclear Science and the Role of High Performance Computing" held January 26-28, 2009, co-sponsored by the U.S. Department of Energy (DOE) Office of Nuclear Physics (ONP) and the DOE Office of Advanced Scientific Computing Research (ASCR). Representatives from the national and international nuclear physics communities, as well as from the high performance computing community, participated. The purpose of this workshop was to 1) identify forefront scientific challenges in nuclear physics and then determine which, if any, of these could be aided by high performance computing at the extreme scale; 2) establish how and why new high performance computing capabilities could address issues at the frontiers of nuclear science; 3) provide nuclear physicists the opportunity to influence the development of high performance computing; and 4) provide the nuclear physics community with plans for development of future high performance computing capability by DOE ASCR.
Simple, parallel, high-performance virtual machines for extreme computations

International Nuclear Information System (INIS)

Chokoufe Nejad, Bijan; Ohl, Thorsten; Reuter, Jurgen

2014-11-01

We introduce a high-performance virtual machine (VM) written in a numerically fast language like Fortran or C to evaluate very large expressions. We discuss the general concept of how to perform computations in terms of a VM and present specifically a VM that is able to compute tree-level cross sections for any number of external legs, given the corresponding byte code from the optimal matrix element generator, O'Mega. Furthermore, this approach allows one to formulate the parallel computation of a single phase space point in a simple and obvious way. We analyze the scaling behaviour with multiple threads as well as the benefits and drawbacks that are introduced with this method. Our implementation of a VM can run faster than the corresponding native, compiled code for certain processes and compilers, especially for very high multiplicities, and in general has runtimes of the same order of magnitude. By avoiding the tedious compile and link steps, which may fail for source code files of gigabyte sizes, new processes or complex higher order corrections that are currently out of reach could be evaluated with a VM given enough computing power.
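The general idea of evaluating an expression by interpreting byte code, rather than compiling generated source, can be illustrated with a minimal stack machine. This is only a sketch under stated assumptions: the opcodes `PUSH`, `ADD` and `MUL` and the function `run` are hypothetical names chosen for illustration, and they do not reflect the O'Mega byte code format or the VM described in the record above (which is written in Fortran or C).

```python
from typing import List, Tuple

# Hypothetical opcodes for illustration only; NOT the O'Mega byte code.
PUSH, ADD, MUL = 0, 1, 2

Instruction = Tuple[int, float]  # (opcode, operand); operand is ignored for ADD/MUL


def run(bytecode: List[Instruction]) -> float:
    """Interpret a list of instructions on an operand stack and return the result."""
    stack: List[float] = []
    for opcode, operand in bytecode:
        if opcode == PUSH:
            stack.append(operand)
        elif opcode == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif opcode == MUL:
            right = stack.pop()
            left = stack.pop()
            stack.append(left * right)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    if len(stack) != 1:
        raise ValueError("malformed program: stack must hold exactly one value")
    return stack[0]


# Byte code for the expression (2 + 3) * 4.
program = [(PUSH, 2.0), (PUSH, 3.0), (ADD, 0.0), (PUSH, 4.0), (MUL, 0.0)]
result = run(program)
```

Because the program is plain data, a new expression needs no compile or link step; only the instruction list changes, which is the property the abstract highlights for very large expressions.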
High-performance dual-speed CCD camera system for scientific imaging

Science.gov (United States)

Simpson, Raymond W.

1996-03-01

Traditionally, scientific camera systems were partitioned with a 'camera head' containing the CCD and its support circuitry and a camera controller, which provided analog to digital conversion, timing, control, computer interfacing, and power. A new, unitized high performance scientific CCD camera with dual speed readout at 1 × 10^6 or 5 × 10^6 pixels per second, 12 bit digital gray scale, high performance thermoelectric cooling, and built in composite video output is described. This camera provides all digital, analog, and cooling functions in a single compact unit. The new system incorporates the A/D converter, timing, control and computer interfacing in the camera, with the power supply remaining a separate remote unit. A 100 Mbyte/second serial link transfers data over copper or fiber media to a variety of host computers, including Sun, SGI, SCSI, PCI, EISA, and Apple Macintosh. Having all the digital and analog functions in the camera made it possible to modify this system for the Woods Hole Oceanographic Institution for use on a remote controlled submersible vehicle. The oceanographic version achieves 16 bit dynamic range at 1.5 × 10^5 pixels/second, can be operated at depths of 3 kilometers, and transfers data to the surface via a real time fiber optic link.
FY 1992 Blue Book: Grand Challenges: High Performance Computing and Communications

Data.gov (United States)

Networking and Information Technology Research and Development, Executive Office of the President — High performance computing and computer communications networks are becoming increasingly important to scientific advancement, economic competition, and national...
Computational Fluid Dynamics (CFD) Computations With Zonal Navier-Stokes Flow Solver (ZNSFLOW) Common High Performance Computing Scalable Software Initiative (CHSSI) Software

National Research Council Canada - National Science Library

Edge, Harris

1999-01-01

...), computational fluid dynamics (CFD) 6 project. Under the project, a proven zonal Navier-Stokes solver was rewritten for scalable parallel performance on both shared memory and distributed memory high performance computers...
Component-based software for high-performance scientific computing

Energy Technology Data Exchange (ETDEWEB)

Alexeev, Yuri; Allan, Benjamin A; Armstrong, Robert C; Bernholdt, David E; Dahlgren, Tamara L; Gannon, Dennis; Janssen, Curtis L; Kenny, Joseph P; Krishnan, Manojkumar; Kohl, James A; Kumfert, Gary; McInnes, Lois Curfman; Nieplocha, Jarek; Parker, Steven G; Rasmussen, Craig; Windus, Theresa L

2005-01-01

Recent advances in both computational hardware and multidisciplinary science have given rise to an unprecedented level of complexity in scientific simulation software. This paper describes an ongoing grassroots effort aimed at addressing complexity in high-performance computing through the use of Component-Based Software Engineering (CBSE). Highlights of the benefits and accomplishments of the Common Component Architecture (CCA) Forum and SciDAC ISIC are given, followed by an illustrative example of how the CCA has been applied to drive scientific discovery in quantum chemistry. Thrusts for future research are also described briefly.
Component-based software for high-performance scientific computing

International Nuclear Information System (INIS)

Alexeev, Yuri; Allan, Benjamin A; Armstrong, Robert C; Bernholdt, David E; Dahlgren, Tamara L; Gannon, Dennis; Janssen, Curtis L; Kenny, Joseph P; Krishnan, Manojkumar; Kohl, James A; Kumfert, Gary; McInnes, Lois Curfman; Nieplocha, Jarek; Parker, Steven G; Rasmussen, Craig; Windus, Theresa L

2005-01-01

Recent advances in both computational hardware and multidisciplinary science have given rise to an unprecedented level of complexity in scientific simulation software. This paper describes an ongoing grassroots effort aimed at addressing complexity in high-performance computing through the use of Component-Based Software Engineering (CBSE). Highlights of the benefits and accomplishments of the Common Component Architecture (CCA) Forum and SciDAC ISIC are given, followed by an illustrative example of how the CCA has been applied to drive scientific discovery in quantum chemistry. Thrusts for future research are also described briefly.
System Software and Tools for High Performance Computing Environments: A report on the findings of the Pasadena Workshop, April 14--16, 1992

Energy Technology Data Exchange (ETDEWEB)

Sterling, T. [Universities Space Research Association, Washington, DC (United States)]; Messina, P. [Jet Propulsion Lab., Pasadena, CA (United States)]; Chen, M. [Yale Univ., New Haven, CT (United States)] [and others]

1993-04-01

The Pasadena Workshop on System Software and Tools for High Performance Computing Environments was held at the Jet Propulsion Laboratory from April 14 through April 16, 1992. The workshop was sponsored by a number of Federal agencies committed to the advancement of high performance computing (HPC) both as a means to advance their respective missions and as a national resource to enhance American productivity and competitiveness. Over a hundred experts in related fields from industry, academia, and government were invited to participate in this effort to assess the current status of software technology in support of HPC systems. The overall objectives of the workshop were to understand the requirements and current limitations of HPC software technology and to contribute to a basis for establishing new directions in research and development for software technology in HPC environments. This report includes reports written by the participants of the workshop's seven working groups. Materials presented at the workshop are reproduced in appendices. Additional chapters summarize the findings and analyze their implications for future directions in HPC software technology development.
High performance computing and communications: FY 1995 implementation plan

Energy Technology Data Exchange (ETDEWEB)

NONE

1994-04-01

The High Performance Computing and Communications (HPCC) Program was formally established following passage of the High Performance Computing Act of 1991, signed on December 9, 1991. Ten federal agencies in collaboration with scientists and managers from US industry, universities, and laboratories have developed the HPCC Program to meet the challenges of advancing computing and associated communications technologies and practices. This plan provides a detailed description of the agencies' HPCC implementation plans for FY 1994 and FY 1995. This Implementation Plan contains three additional sections. Section 3 provides an overview of the HPCC Program definition and organization. Section 4 contains a breakdown of the five major components of the HPCC Program, with an emphasis on the overall directions and milestones planned for each one. Section 5 provides a detailed look at HPCC Program activities within each agency. Although the Department of Education is an official HPCC agency, its current funding and reporting of crosscut activities goes through the Committee on Education and Health Resources, not the HPCC Program. For this reason the Implementation Plan covers nine HPCC agencies.
Development of a high performance eigensolver on the peta-scale next generation supercomputer system

International Nuclear Information System (INIS)

Imamura, Toshiyuki; Yamada, Susumu; Machida, Masahiko

2010-01-01

For present supercomputer systems, multicore and multisocket processors are necessary to build a system, and the choice of interconnection is essential. In addition, for effective development of a new code, high-performance, scalable, and reliable numerical software is one of the key items. ScaLAPACK and PETSc are well-known software packages for distributed-memory parallel computer systems. Needless to say, highly tuned software for new architectures such as many-core processors must be chosen for real computation. In this study, we present a high-performance and highly scalable eigenvalue solver for the next-generation supercomputer system, the so-called 'K-computer' system. We have developed two versions, the standard version (eigen_s) and the enhanced performance version (eigen_sx), which are developed on the T2K cluster system housed at the University of Tokyo. eigen_s employs the conventional algorithms: Householder tridiagonalization, the divide and conquer (DC) algorithm, and Householder back-transformation. They are carefully implemented with a blocking technique and a flexible two-dimensional data distribution to reduce the overhead of memory traffic and data transfer, respectively. eigen_s performs excellently on the T2K system with 4096 cores (theoretical peak is 37.6 TFLOPS), achieving 3.0 TFLOPS with a matrix of dimension two hundred thousand. The enhanced version, eigen_sx, uses more advanced algorithms: the narrow-band reduction algorithm, DC for band matrices, and the block Householder back-transformation with WY-representation. Even though this version is still at a test stage, it achieves 4.7 TFLOPS with a matrix of the same dimension as used for eigen_s. (author)
Improving the Eco-Efficiency of High Performance Computing Clusters Using EECluster

Directory of Open Access Journals (Sweden)

Alberto Cocaña-Fernández

2016-03-01

As data and supercomputing centres increase their performance to improve service quality and target more ambitious challenges every day, their carbon footprint also continues to grow, and has already reached the magnitude of that of the aviation industry. Also, high power consumption is becoming a remarkable bottleneck for the expansion of these infrastructures in economic terms due to the unavailability of sufficient energy sources. A substantial part of the problem is caused by the current energy consumption of High Performance Computing (HPC) clusters. To alleviate this situation, we present in this work EECluster, a tool that integrates with multiple open-source Resource Management Systems to significantly reduce the carbon footprint of clusters by improving their energy efficiency. EECluster implements a dynamic power management mechanism based on Computational Intelligence techniques by learning a set of rules through multi-criteria evolutionary algorithms. This approach enables cluster operators to find the optimal balance between a reduction in the cluster energy consumption, service quality, and number of reconfigurations. Experimental studies using both synthetic and actual workloads from a real world cluster support the adoption of this tool to reduce the carbon footprint of HPC clusters.
High-performance control system for a heavy-ion medical accelerator

Energy Technology Data Exchange (ETDEWEB)

Lancaster, H.D.; Magyary, S.B.; Sah, R.C.

1983-03-01

A high performance control system is being designed as part of a heavy ion medical accelerator. The accelerator will be a synchrotron dedicated to clinical and other biomedical uses of heavy ions, and it will deliver fully stripped ions at energies up to 800 MeV/nucleon. A key element in the design of an accelerator which will operate in a hospital environment is to provide a high performance control system. This control system will provide accelerator modeling to facilitate changes in operating mode, provide automatic beam tuning to simplify accelerator operations, and provide diagnostics to enhance reliability. The control system being designed utilizes many microcomputers operating in parallel to collect and transmit data; complex numerical computations are performed by a powerful minicomputer. In order to provide the maximum operational flexibility, the Medical Accelerator control system will be capable of dealing with pulse-to-pulse changes in beam energy and ion species.
High-performance control system for a heavy-ion medical accelerator

International Nuclear Information System (INIS)

Lancaster, H.D.; Magyary, S.B.; Sah, R.C.

1983-03-01

A high performance control system is being designed as part of a heavy ion medical accelerator. The accelerator will be a synchrotron dedicated to clinical and other biomedical uses of heavy ions, and it will deliver fully stripped ions at energies up to 800 MeV/nucleon. A key element in the design of an accelerator which will operate in a hospital environment is to provide a high performance control system. This control system will provide accelerator modeling to facilitate changes in operating mode, provide automatic beam tuning to simplify accelerator operations, and provide diagnostics to enhance reliability. The control system being designed utilizes many microcomputers operating in parallel to collect and transmit data; complex numerical computations are performed by a powerful minicomputer. In order to provide the maximum operational flexibility, the Medical Accelerator control system will be capable of dealing with pulse-to-pulse changes in beam energy and ion species.
High-Bandwidth Tactical-Network Data Analysis in a High-Performance-Computing (HPC) Environment: Packet-Level Analysis

Science.gov (United States)

2015-09-01

individual fragments using the hash-based method. In general, fragments appear in order and relatively close to each other in the file. A fragment...data product derived from the data model is shown in Fig. 5, a Google Earth Keyhole Markup Language (KML) file. This product includes aggregate...System BLOb binary large object FPGA field-programmable gate array HPC high-performance computing IP Internet Protocol KML Keyhole Markup Language
Towards Portable Large-Scale Image Processing with High-Performance Computing.

Science.gov (United States)

Huo, Yuankai; Blaber, Justin; Damon, Stephen M; Boyd, Brian D; Bao, Shunxing; Parvathaneni, Prasanna; Noguera, Camilo Bermudez; Chaganti, Shikha; Nath, Vishwesh; Greer, Jasmine M; Lyu, Ilwoo; French, William R; Newton, Allen T; Rogers, Baxter P; Landman, Bennett A

2018-05-03

High-throughput, large-scale medical image computing demands tight integration of high-performance computing (HPC) infrastructure for data storage, job distribution, and image processing. The Vanderbilt University Institute for Imaging Science (VUIIS) Center for Computational Imaging (CCI) has constructed a large-scale image storage and processing infrastructure that is composed of (1) a large-scale image database using the eXtensible Neuroimaging Archive Toolkit (XNAT), (2) a content-aware job scheduling platform using the Distributed Automation for XNAT pipeline automation tool (DAX), and (3) a wide variety of encapsulated image processing pipelines called "spiders." The VUIIS CCI medical image data storage and processing infrastructure has housed and processed nearly half a million medical image volumes with the Vanderbilt Advanced Computing Center for Research and Education (ACCRE), which is the HPC facility at Vanderbilt University. The initial deployment was native (i.e., direct installations on a bare-metal server) within the ACCRE hardware and software environments, which led to issues of portability and sustainability. First, it could be laborious to deploy the entire VUIIS CCI medical image data storage and processing infrastructure to another HPC center with varying hardware infrastructure, library availability, and software permission policies. Second, the spiders were not developed in an isolated manner, which has led to software dependency issues during system upgrades or remote software installation. To address such issues, herein, we describe recent innovations using containerization techniques with XNAT/DAX which are used to isolate the VUIIS CCI medical image data storage and processing infrastructure from the underlying hardware and software environments. The newly presented XNAT/DAX solution has the following new features: (1) multi-level portability from system level to the application level, (2) flexible and dynamic software
High-Throughput Computing on High-Performance Platforms: A Case Study

Energy Technology Data Exchange (ETDEWEB)

Oleynik, D [University of Texas at Arlington]; Panitkin, S [Brookhaven National Laboratory (BNL)]; Matteo, Turilli [Rutgers University]; Angius, Alessio [Rutgers University]; Oral, H Sarp [ORNL]; De, K [University of Texas at Arlington]; Klimentov, A [Brookhaven National Laboratory (BNL)]; Wells, Jack C. [ORNL]; Jha, S [Rutgers University]

2017-10-01

The computing systems used by LHC experiments have historically consisted of the federation of hundreds to thousands of distributed resources, ranging from small to mid-size resources. In spite of the impressive scale of the existing distributed computing solutions, the federation of small to mid-size resources will be insufficient to meet projected future demands. This paper is a case study of how the ATLAS experiment has embraced Titan -- a DOE leadership facility -- in conjunction with traditional distributed high-throughput computing to reach sustained production scales of approximately 52M core-hours a year. The three main contributions of this paper are: (i) a critical evaluation of design and operational considerations to support the sustained, scalable and production usage of Titan; (ii) a preliminary characterization of a next generation executor for PanDA to support new workloads and advanced execution modes; and (iii) early lessons for how current and future experimental and observational systems can be integrated with production supercomputers and other platforms in a general and extensible manner.
Computational Performance Analysis of Nonlinear Dynamic Systems using Semi-infinite Programming

Directory of Open Access Journals (Sweden)

Tor A. Johansen

2001-01-01

For nonlinear systems that satisfy certain regularity conditions it is shown that upper and lower bounds on the performance (cost function) can be computed using linear or quadratic programming. The performance conditions derived from Hamilton-Jacobi inequalities are formulated as linear inequalities defined pointwise by discretizing the state-space when assuming a linearly parameterized class of functions representing the candidate performance bounds. Uncertainty with respect to some system parameters can be incorporated by also gridding the parameter set. In addition to performance analysis, the method can also be used to compute Lyapunov functions that guarantee uniform exponential stability.
High-performance simulation-based algorithms for an alpine ski racer’s trajectory optimization in heterogeneous computer systems

Directory of Open Access Journals (Sweden)

Dębski Roman

2014-09-01

Effective, simulation-based trajectory optimization algorithms adapted to heterogeneous computers are studied with reference to a problem taken from alpine ski racing (the presented solution is probably the most general one published so far). The key idea behind these algorithms is to use a grid-based discretization scheme to transform the continuous optimization problem into a search problem over a specially constructed finite graph, and then to apply dynamic programming to find an approximation of the global solution. In the analyzed example it is the minimum-time ski line, represented as a piecewise-linear function (a method of elimination of unfeasible solutions is proposed). Serial and parallel versions of the basic optimization algorithm are presented in detail (pseudo-code, time and memory complexity). Possible extensions of the basic algorithm are also described. The implementation of these algorithms is based on OpenCL. The included experimental results show that contemporary heterogeneous computers can be treated as μ-HPC platforms: they offer high performance (the best speedup was equal to 128) while remaining energy and cost efficient (which is crucial in embedded systems, e.g., trajectory planners of autonomous robots). The presented algorithms can be applied to many trajectory optimization problems, including those having a black-box represented performance measure.
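The key idea named in this abstract, discretizing the problem onto a layered grid and then running dynamic programming over the resulting graph, can be sketched as follows. This is a minimal serial illustration under stated assumptions, not the authors' OpenCL implementation: the function names `min_cost_path` and `straight_line_length` are hypothetical, and the edge cost used here is plain segment length, a toy stand-in for the simulated travel time of a skier.

```python
import math
from typing import Callable, List, Tuple


def min_cost_path(
    columns: List[List[float]],
    step: float,
    edge_cost: Callable[[float, float, float], float],
) -> Tuple[float, List[float]]:
    """Dynamic programming over a layered grid graph.

    columns[c] lists the candidate lateral positions in layer c; every node
    in layer c is connected to every node in layer c + 1. Returns the minimal
    total cost and the piecewise-linear path (one position per layer).
    """
    best = [0.0] * len(columns[0])   # best[j]: cheapest cost to reach node j
    back: List[List[int]] = []       # back[c][j]: predecessor of node j in layer c + 1
    for prev, cur in zip(columns, columns[1:]):
        new_best: List[float] = []
        choice: List[int] = []
        for y in cur:
            costs = [best[i] + edge_cost(prev[i], y, step) for i in range(len(prev))]
            k = min(range(len(costs)), key=costs.__getitem__)
            new_best.append(costs[k])
            choice.append(k)
        best = new_best
        back.append(choice)

    # Trace the optimal path back from the cheapest node in the last layer.
    j = min(range(len(best)), key=best.__getitem__)
    total = best[j]
    path = [columns[-1][j]]
    for c in range(len(back) - 1, -1, -1):
        j = back[c][j]
        path.append(columns[c][j])
    path.reverse()
    return total, path


def straight_line_length(y0: float, y1: float, dx: float) -> float:
    """Toy edge cost (assumption): Euclidean length of the segment."""
    return math.hypot(dx, y1 - y0)


# Fixed start and finish, three candidate positions in each intermediate layer.
columns = [[0.0], [-1.0, 0.0, 1.0], [-1.0, 0.0, 1.0], [0.0]]
total, path = min_cost_path(columns, 1.0, straight_line_length)
```

The inner loop over the nodes of one layer has no dependencies between nodes, which is what makes this scheme a natural fit for the data-parallel devices the abstract targets; infeasible edges could be excluded by letting `edge_cost` return `math.inf`.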
5th International Conference on High Performance Scientific Computing

CERN Document Server

Hoang, Xuan; Rannacher, Rolf; Schlöder, Johannes

2014-01-01

This proceedings volume gathers a selection of papers presented at the Fifth International Conference on High Performance Scientific Computing, which took place in Hanoi on March 5-9, 2012. The conference was organized by the Institute of Mathematics of the Vietnam Academy of Science and Technology (VAST), the Interdisciplinary Center for Scientific Computing (IWR) of Heidelberg University, Ho Chi Minh City University of Technology, and the Vietnam Institute for Advanced Study in Mathematics. The contributions cover the broad interdisciplinary spectrum of scientific computing and present recent advances in theory, development of methods, and practical applications. Subjects covered include mathematical modeling; numerical simulation; methods for optimization and control; parallel computing; software development; and applications of scientific computing in physics, mechanics and biomechanics, material science, hydrology, chemistry, biology, biotechnology, medicine, sports, psychology, transport, logistics, com...
3rd International Conference on High Performance Scientific Computing

CERN Document Server

Kostina, Ekaterina; Phu, Hoang; Rannacher, Rolf

2008-01-01

This proceedings volume contains a selection of papers presented at the Third International Conference on High Performance Scientific Computing held at the Hanoi Institute of Mathematics, Vietnamese Academy of Science and Technology (VAST), March 6-10, 2006. The conference has been organized by the Hanoi Institute of Mathematics, Interdisciplinary Center for Scientific Computing (IWR), Heidelberg, and its International PhD Program "Complex Processes: Modeling, Simulation and Optimization", and Ho Chi Minh City University of Technology. The contributions cover the broad interdisciplinary spectrum of scientific computing and present recent advances in theory, development of methods, and applications in practice. Subjects covered are mathematical modelling, numerical simulation, methods for optimization and control, parallel computing, software development, applications of scientific computing in physics, chemistry, biology and mechanics, environmental and hydrology problems, transport, logistics and site loca...
6th International Conference on High Performance Scientific Computing

CERN Document Server

Phu, Hoang; Rannacher, Rolf; Schlöder, Johannes

2017-01-01

This proceedings volume highlights a selection of papers presented at the Sixth International Conference on High Performance Scientific Computing, which took place in Hanoi, Vietnam on March 16-20, 2015. The conference was jointly organized by the Heidelberg Institute of Theoretical Studies (HITS), the Institute of Mathematics of the Vietnam Academy of Science and Technology (VAST), the Interdisciplinary Center for Scientific Computing (IWR) at Heidelberg University, and the Vietnam Institute for Advanced Study in Mathematics, Ministry of Education. The contributions cover a broad, interdisciplinary spectrum of scientific computing and showcase recent advances in theory, methods, and practical applications. Subjects covered include numerical simulation, methods for optimization and control, parallel computing, and software development, as well as the applications of scientific computing in physics, mechanics, biomechanics and robotics, material science, hydrology, biotechnology, medicine, transport, scheduling, and in...
Monitoring performance of a highly distributed and complex computing infrastructure in LHCb

Science.gov (United States)

Mathe, Z.; Haen, C.; Stagni, F.

2017-10-01

To ensure optimal performance of LHCb Distributed Computing, which is based on LHCbDIRAC, it is necessary to inspect the behavior over time of many components: first the agents and services on which the infrastructure is built, but also all the computing tasks and data transfers that are managed by this infrastructure. This consists of recording and then analyzing time series of a large number of observables, for which SQL relational databases are far from optimal. Within DIRAC we have therefore been studying novel possibilities based on NoSQL databases (ElasticSearch, OpenTSDB and InfluxDB). As a result of this study, we developed a new monitoring system based on ElasticSearch. It has been deployed on the LHCb Distributed Computing infrastructure, for which it collects data from all the components (agents, services, jobs) and allows creating reports through Kibana and a web user interface based on the DIRAC web framework. In this paper we describe this new implementation of the DIRAC monitoring system. We give details on the ElasticSearch implementation within the DIRAC general framework, as well as an overview of the advantages of the pipeline aggregation used for creating a dynamic bucketing of the time series. We present the advantages of using the ElasticSearch DSL high-level library for creating and running queries. Finally, we present the performance of that system.
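The abstract does not say how the dynamic bucketing is implemented, so the following is only a minimal sketch of one way a time-series query with a bucket width chosen from the requested time span could be assembled as an ElasticSearch request body. The field names (`timestamp`, `cpu_load`), the list of candidate intervals, the bucket limit, and both function names are assumptions made for illustration; they are not taken from DIRAC. The body uses a `date_histogram` aggregation with a nested `avg`, plus an `avg_bucket` sibling pipeline aggregation over the per-bucket means.

```python
import json

# Candidate bucket widths in seconds, smallest first (assumed for this sketch).
INTERVALS = [60, 300, 900, 3600, 21600, 86400]


def pick_interval(span_seconds, max_buckets=200):
    """Return the smallest candidate width that yields at most max_buckets buckets."""
    for seconds in INTERVALS:
        if span_seconds / seconds <= max_buckets:
            return seconds
    return INTERVALS[-1]


def build_query(start_ms, end_ms, field="cpu_load", time_field="timestamp"):
    """Build a request body; field and time_field are hypothetical field names."""
    interval = pick_interval((end_ms - start_ms) / 1000)
    return {
        "size": 0,  # only aggregations are wanted, not individual documents
        "query": {
            "range": {
                time_field: {"gte": start_ms, "lte": end_ms, "format": "epoch_millis"}
            }
        },
        "aggs": {
            # One bucket per time interval, each holding the mean of the observable.
            "per_bucket": {
                "date_histogram": {"field": time_field, "fixed_interval": f"{interval}s"},
                "aggs": {"mean_value": {"avg": {"field": field}}},
            },
            # Sibling pipeline aggregation: the mean of the per-bucket means.
            "overall_mean": {"avg_bucket": {"buckets_path": "per_bucket>mean_value"}},
        },
    }


if __name__ == "__main__":
    one_week_ms = 7 * 24 * 3600 * 1000
    print(json.dumps(build_query(0, one_week_ms), indent=2))
```

Choosing the width from the span keeps the number of returned buckets bounded regardless of how long a period a report covers, which is the usual motivation for this kind of bucketing.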
Threshold-based queuing system for performance analysis of cloud computing system with dynamic scaling

Energy Technology Data Exchange (ETDEWEB)

Shorgin, Sergey Ya.; Pechinkin, Alexander V. [Institute of Informatics Problems, Russian Academy of Sciences (Russian Federation); Samouylov, Konstantin E.; Gaidamaka, Yuliya V.; Gudkova, Irina A.; Sopin, Eduard S. [Telecommunication Systems Department, Peoples’ Friendship University of Russia (Russian Federation)

2015-03-10

Cloud computing is a promising technology for managing and improving the utilization of computing center resources to deliver various computing and IT services. To save energy, there is no need to operate many servers under light loads, so they are switched off. On the other hand, some servers should be switched on under heavy load to prevent very long delays. Thus, waiting times and system operating cost can be kept at an acceptable level by dynamically adding or removing servers. One more fact that should be taken into account is the significant server setup costs and activation times. For better energy efficiency, a cloud computing system should not react to an instantaneous increase or decrease of load. That is the main motivation for using queuing systems with hysteresis for cloud computing system modelling. In the paper, we provide a model of a cloud computing system in terms of a multiple-server, threshold-based, infinite-capacity queuing system with hysteresis and non-instantaneous server activation. For the proposed model, we develop a method for computing steady-state probabilities that allows a number of performance measures to be estimated.
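As an illustration of the hysteresis idea only, and not of the authors' queuing model or their steady-state method, the following minimal sketch shows a threshold rule in which the queue length that triggers adding a server is higher than the queue length that triggers removing it. The function name, the threshold values, and the sample queue lengths are assumptions chosen for the example, and server activation is treated as instantaneous, unlike in the paper.

```python
def next_server_count(active, queue_len, up, down):
    """Return the number of active servers after applying the hysteresis rule.

    up[i]   : with i + 1 servers active, add one when queue_len >= up[i]
    down[i] : with i + 2 servers active, remove one when queue_len <= down[i]
    Hysteresis requires down[i] < up[i], so a load that hovers around a
    single threshold does not switch servers on and off repeatedly.
    """
    max_servers = len(up) + 1
    if active < max_servers and queue_len >= up[active - 1]:
        return active + 1
    if active > 1 and queue_len <= down[active - 2]:
        return active - 1
    return active


def simulate(queue_lengths, up, down, active=1):
    """Apply the rule to a sequence of observed queue lengths."""
    history = []
    for q in queue_lengths:
        active = next_server_count(active, q, up, down)
        history.append(active)
    return history


if __name__ == "__main__":
    # Illustrative thresholds: scale up at 10 and 20 jobs, down at 4 and 12.
    print(simulate([3, 10, 8, 5, 4, 11, 20, 15, 12, 4], up=[10, 20], down=[4, 12]))
    # prints [1, 2, 2, 2, 1, 2, 3, 3, 2, 1]
```

In the sample run the second server stays on while the queue falls from 10 to 5 and is only released at 4, which is the damping effect the abstract motivates.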
Threshold-based queuing system for performance analysis of cloud computing system with dynamic scaling

International Nuclear Information System (INIS)

Shorgin, Sergey Ya.; Pechinkin, Alexander V.; Samouylov, Konstantin E.; Gaidamaka, Yuliya V.; Gudkova, Irina A.; Sopin, Eduard S.

2015-01-01

Cloud computing is a promising technology for managing and improving the utilization of computing center resources to deliver various computing and IT services. To save energy, there is no need to operate many servers under light loads, so they are switched off. On the other hand, some servers should be switched on under heavy load to prevent very long delays. Thus, waiting times and system operating cost can be kept at an acceptable level by dynamically adding or removing servers. One more fact that should be taken into account is the significant server setup costs and activation times. For better energy efficiency, a cloud computing system should not react to an instantaneous increase or decrease of load. That is the main motivation for using queuing systems with hysteresis for cloud computing system modelling. In the paper, we provide a model of a cloud computing system in terms of a multiple-server, threshold-based, infinite-capacity queuing system with hysteresis and non-instantaneous server activation. For the proposed model, we develop a method for computing steady-state probabilities that allows a number of performance measures to be estimated.
High-Level Performance Modeling of SAR Systems

Science.gov (United States)

Chen, Curtis

2006-01-01

SAUSAGE (Still Another Utility for SAR Analysis that's General and Extensible) is a computer program for modeling (see figure) the performance of synthetic-aperture radar (SAR) or interferometric synthetic-aperture radar (InSAR or IFSAR) systems. The user is assumed to be familiar with the basic principles of SAR imaging and interferometry. Given design parameters (e.g., altitude, power, and bandwidth) that characterize a radar system, the software predicts various performance metrics (e.g., signal-to-noise ratio and resolution). SAUSAGE is intended to be a general software tool for quick, high-level evaluation of radar designs; it is not meant to capture all the subtleties, nuances, and particulars of specific systems. SAUSAGE was written to facilitate the exploration of engineering tradeoffs within the multidimensional space of design parameters. Typically, this space is examined through an iterative process of adjusting the values of the design parameters and examining the effects of the adjustments on the overall performance of the system at each iteration. The software is designed to be modular and extensible to enable consideration of a variety of operating modes and antenna beam patterns, including, for example, strip-map and spotlight SAR acquisitions, polarimetry, burst modes, and squinted geometries.
High-performance computational fluid dynamics: a custom-code approach

International Nuclear Information System (INIS)

Fannon, James; Náraigh, Lennon Ó; Loiseau, Jean-Christophe; Valluri, Prashant; Bethune, Iain

2016-01-01

We introduce a modified and simplified version of the pre-existing fully parallelized three-dimensional Navier–Stokes flow solver known as TPLS. We demonstrate how the simplified version can be used as a pedagogical tool for the study of computational fluid dynamics (CFDs) and parallel computing. TPLS is at its heart a two-phase flow solver, and uses calls to a range of external libraries to accelerate its performance. However, in the present context we narrow the focus of the study to basic hydrodynamics and parallel computing techniques, and the code is therefore simplified and modified to simulate pressure-driven single-phase flow in a channel, using only relatively simple Fortran 90 code with MPI parallelization, but no calls to any other external libraries. The modified code is analysed in order to both validate its accuracy and investigate its scalability up to 1000 CPU cores. Simulations are performed for several benchmark cases in pressure-driven channel flow, including a turbulent simulation, wherein the turbulence is incorporated via the large-eddy simulation technique. The work may be of use to advanced undergraduate and graduate students as an introductory study in CFDs, while also providing insight for those interested in more general aspects of high-performance computing. (paper)
High-performance computational fluid dynamics: a custom-code approach

Science.gov (United States)

Fannon, James; Loiseau, Jean-Christophe; Valluri, Prashant; Bethune, Iain; Náraigh, Lennon Ó.

2016-07-01

We introduce a modified and simplified version of the pre-existing fully parallelized three-dimensional Navier-Stokes flow solver known as TPLS. We demonstrate how the simplified version can be used as a pedagogical tool for the study of computational fluid dynamics (CFDs) and parallel computing. TPLS is at its heart a two-phase flow solver, and uses calls to a range of external libraries to accelerate its performance. However, in the present context we narrow the focus of the study to basic hydrodynamics and parallel computing techniques, and the code is therefore simplified and modified to simulate pressure-driven single-phase flow in a channel, using only relatively simple Fortran 90 code with MPI parallelization, but no calls to any other external libraries. The modified code is analysed in order to both validate its accuracy and investigate its scalability up to 1000 CPU cores. Simulations are performed for several benchmark cases in pressure-driven channel flow, including a turbulent simulation, wherein the turbulence is incorporated via the large-eddy simulation technique. The work may be of use to advanced undergraduate and graduate students as an introductory study in CFDs, while also providing insight for those interested in more general aspects of high-performance computing.
International Conference on Modern Mathematical Methods and High Performance Computing in Science and Technology

CERN Document Server

Srivastava, HM; Venturino, Ezio; Resch, Michael; Gupta, Vijay

2016-01-01

The book discusses important results in modern mathematical models and high performance computing, such as applied operations research, simulation of operations, statistical modeling and applications, invisibility regions and regular meta-materials, unmanned vehicles, modern radar techniques/SAR imaging, satellite remote sensing, coding, and robotic systems. Furthermore, it is valuable as a reference work and as a basis for further study and research. All contributing authors are respected academicians, scientists and researchers from around the globe. All the papers were presented at the international conference on Modern Mathematical Methods and High Performance Computing in Science & Technology (M3HPCST 2015), held at Raj Kumar Goel Institute of Technology, Ghaziabad, India, from 27–29 December 2015, and peer-reviewed by international experts. The conference provided an exceptional platform for leading researchers, academicians, developers, engineers and technocrats from a broad range of disciplines ...
Human and Robotic Space Mission Use Cases for High-Performance Spaceflight Computing

Science.gov (United States)

Some, Raphael; Doyle, Richard; Bergman, Larry; Whitaker, William; Powell, Wesley; Johnson, Michael; Goforth, Montgomery; Lowry, Michael

2013-01-01

Spaceflight computing is a key resource in NASA space missions and a core determining factor of spacecraft capability, with ripple effects throughout the spacecraft, end-to-end system, and mission. Onboard computing can be aptly viewed as a "technology multiplier" in that advances provide direct dramatic improvements in flight functions and capabilities across the NASA mission classes, and enable new flight capabilities and mission scenarios, increasing science and exploration return. Space-qualified computing technology, however, has not advanced significantly in well over ten years and the current state of the practice fails to meet the near- to mid-term needs of NASA missions. Recognizing this gap, the NASA Game Changing Development Program (GCDP), under the auspices of the NASA Space Technology Mission Directorate, commissioned a study on space-based computing needs, looking out 15-20 years. The study resulted in a recommendation to pursue high-performance spaceflight computing (HPSC) for next-generation missions, and a decision to partner with the Air Force Research Lab (AFRL) in this development.
SCEAPI: A unified Restful Web API for High-Performance Computing

Science.gov (United States)

Rongqiang, Cao; Haili, Xiao; Shasha, Lu; Yining, Zhao; Xiaoning, Wang; Xuebin, Chi

2017-10-01

The development of scientific computing is increasingly moving to collaborative web and mobile applications. All these applications need a high-quality programming interface for accessing heterogeneous computing resources consisting of clusters, grid computing or cloud computing. In this paper, we introduce our high-performance computing environment, which integrates computing resources from 16 HPC centers across China. We then present a bundle of web services called SCEAPI and describe how it can be used to access HPC resources over HTTP or HTTPS. We discuss SCEAPI from several aspects, including architecture, implementation and security, and address specific challenges in designing compatible interfaces and protecting sensitive data. We describe the functions of SCEAPI, including authentication, file transfer, and job management (creating, submitting and monitoring jobs), and how to use SCEAPI in an easy-to-use way. Finally, we discuss how to exploit more HPC resources quickly for the ATLAS experiment by implementing a custom ARC compute element based on SCEAPI. Our work shows that SCEAPI is an easy-to-use and effective solution for extending opportunistic HPC resources.
Computer simulations of high pressure systems

International Nuclear Information System (INIS)

Wilkins, M.L.

1977-01-01

Numerical methods are capable of solving very difficult problems in solid mechanics and gas dynamics. In the design of engineering structures, critical decisions are possible if the behavior of materials is correctly described in the calculation. Problems of current interest require accurate analysis of stress-strain fields that range from very small elastic displacement to very large plastic deformation. A finite difference program is described that solves problems over this range and in two and three space-dimensions and time. A series of experiments and calculations serve to establish confidence in the plasticity formulation. The program can be used to design high pressure systems where plastic flow occurs. The purpose is to identify material properties, strength and elongation, that meet the operating requirements. An objective is to be able to perform destructive testing on a computer rather than on the engineering structure. Examples of topical interest are given
Multi-Language Programming Environments for High Performance Java Computing

OpenAIRE

Vladimir Getov; Paul Gray; Sava Mintchev; Vaidy Sunderam

1999-01-01

Recent developments in processor capabilities, software tools, programming languages and programming paradigms have brought about new approaches to high performance computing. A steadfast component of this dynamic evolution has been the scientific community’s reliance on established scientific packages. As a consequence, programmers of high‐performance applications are reluctant to embrace evolving languages such as Java. This paper describes the Java‐to‐C Interface (JCI) tool which provides ...
FY 1993 Blue Book: Grand Challenges 1993: High Performance Computing and Communications

Data.gov (United States)

Networking and Information Technology Research and Development, Executive Office of the President — High performance computing and computer communications networks are becoming increasingly important to scientific advancement, economic competition, and national...

Performance Measurements in a High Throughput Computing Environment

CERN Document Server

AUTHOR|(CDS)2145966; Gribaudo, Marco

The IT infrastructures of companies and research centres are implementing new technologies to satisfy the increasing need for computing resources for big data analysis. In this context, resource profiling plays a crucial role in identifying areas where utilisation efficiency needs to be improved. To deal with the profiling and optimisation of computing resources, two complementary approaches can be adopted: the measurement-based approach and the model-based approach. The measurement-based approach gathers and analyses performance metrics by executing benchmark applications on computing resources. The model-based approach, instead, involves the design and implementation of a model as an abstraction of the real system, selecting only those aspects relevant to the study. This thesis originates from a project carried out by the author within the CERN IT department. CERN is an international scientific laboratory that conducts fundamental research in the domain of elementary particle physics. The p...
Computer performance evaluation of FACOM 230-75 computer system, (2)

International Nuclear Information System (INIS)

Fujii, Minoru; Asai, Kiyoshi

1980-08-01

This report describes computer performance evaluations for the FACOM 230-75 computers at JAERI. The evaluations cover the following items: (1) Cost/benefit analysis of timesharing terminals, (2) Analysis of the response time of timesharing terminals, (3) Analysis of throughput time for batch job processing, (4) Estimation of current potential demands for computer time, (5) Determination of the appropriate number of card readers and line printers. These evaluations are done mainly from the standpoint of reducing the cost of computing facilities. The techniques adopted are very practical ones. This report will be useful for those concerned with the management of computing installations. (author)
High-Performance Computing Paradigm and Infrastructure

CERN Document Server

Yang, Laurence T

2006-01-01

With hyperthreading in Intel processors, hypertransport links in next generation AMD processors, multi-core silicon in today's high-end microprocessors from IBM and emerging grid computing, parallel and distributed computers have moved into the mainstream
Towards the development of run times leveraging virtualization for high performance computing

International Nuclear Information System (INIS)

Diakhate, F.

2010-12-01

In recent years, there has been a growing interest in using virtualization to improve the efficiency of data centers. This success is rooted in virtualization's excellent fault tolerance and isolation properties, in the overall flexibility it brings, and in its ability to exploit multi-core architectures efficiently. These characteristics also make virtualization an ideal candidate to tackle issues found in new compute cluster architectures. However, in spite of recent improvements in virtualization technology, overheads in the execution of parallel applications remain, which prevent its use in the field of high performance computing. In this thesis, we propose a virtual device dedicated to message passing between virtual machines, so as to improve the performance of parallel applications executed in a cluster of virtual machines. We also introduce a set of techniques facilitating the deployment of virtualized parallel applications. These functionalities have been implemented as part of a runtime system that makes it possible to benefit from virtualization's properties in a way that is as transparent as possible to the user while minimizing performance overheads. (author)
Electro-optical system for the high speed reconstruction of computed tomography images

International Nuclear Information System (INIS)

Tresp, V.

1989-01-01

An electro-optical system for the high-speed reconstruction of computed tomography (CT) images has been built and studied. The system is capable of reconstructing high-contrast and high-resolution images at video rate (30 images per second), which is more than two orders of magnitude faster than the reconstruction rate achieved by special purpose digital computers used in commercial CT systems. The filtered back-projection algorithm which was implemented in the reconstruction system requires the filtering of all projections with a prescribed filter function. A space-integrating acousto-optical convolver, a surface acoustic wave filter and a digital finite-impulse response filter were used for this purpose and their performances were compared. The second part of the reconstruction, the back projection of the filtered projections, is computationally very expensive. An optical back projector has been built which maps the filtered projections onto the two-dimensional image space using an anamorphic lens system and a prism image rotator. The reconstructed image is viewed by a video camera, routed through a real-time image-enhancement system, and displayed on a TV monitor. The system reconstructs parallel-beam projection data, and in a modified version, is also capable of reconstructing fan-beam projection data. This extension is important since the latter are the kind of projection data actually acquired in high-speed X-ray CT scanners. The reconstruction system was tested by reconstructing precomputed projection data of phantom images. These were stored in a special purpose projection memory and transmitted to the reconstruction system as an electronic signal. In this way, a projection measurement system that acquires projections sequentially was simulated
Unified, Cross-Platform, Open-Source Library Package for High-Performance Computing

Energy Technology Data Exchange (ETDEWEB)

Kozacik, Stephen [EM Photonics, Inc., Newark, DE (United States)

2017-05-15

Compute power is continually increasing, but this increased performance is largely found in sophisticated computing devices and supercomputer resources that are difficult to use, resulting in under-utilization. We developed a unified set of programming tools that will allow users to take full advantage of the new technology by allowing them to work at a level abstracted away from the platform specifics, encouraging the use of modern computing systems, including government-funded supercomputer facilities.
Leveraging the Power of High Performance Computing for Next Generation Sequencing Data Analysis: Tricks and Twists from a High Throughput Exome Workflow

Science.gov (United States)

Wonczak, Stephan; Thiele, Holger; Nieroda, Lech; Jabbari, Kamel; Borowski, Stefan; Sinha, Vishal; Gunia, Wilfried; Lang, Ulrich; Achter, Viktor; Nürnberg, Peter

2015-01-01

Next generation sequencing (NGS) has been a great success and is now a standard method of research in the life sciences. With this technology, dozens of whole genomes or hundreds of exomes can be sequenced in rather short time, producing huge amounts of data. Complex bioinformatics analyses are required to turn these data into scientific findings. In order to run these analyses fast, automated workflows implemented on high performance computers are state of the art. While providing sufficient compute power and storage to meet the NGS data challenge, high performance computing (HPC) systems require special care when utilized for high throughput processing. This is especially true if the HPC system is shared by different users. Here, stability, robustness and maintainability are as important for automated workflows as speed and throughput. To achieve all of these aims, dedicated solutions have to be developed. In this paper, we present the tricks and twists that we utilized in the implementation of our exome data processing workflow. It may serve as a guideline for other high throughput data analysis projects using a similar infrastructure. The code implementing our solutions is provided in the supporting information files. PMID:25942438
Performance evaluation of a computed radiography system

Energy Technology Data Exchange (ETDEWEB)

Roussilhe, J.; Fallet, E. [Carestream Health France, 71 - Chalon/Saone (France); Mango, St.A. [Carestream Health, Inc. Rochester, New York (United States)

2007-07-01

Computed radiography (CR) standards have been formalized and published in Europe and in the US. The CR system classification is defined in those standards by the minimum normalized signal-to-noise ratio (SNRN) and the maximum basic spatial resolution (SRb). Both the signal-to-noise ratio (SNR) and the contrast sensitivity of a CR system depend on the dose (exposure time and conditions) at the detector. Because of their wide dynamic range, the same storage phosphor imaging plate can qualify for all six CR system classes. The exposure characteristics from 30 to 450 kV, the contrast sensitivity, and the spatial resolution of the KODAK INDUSTREX CR Digital System have been thoroughly evaluated. This paper will present some of the factors that determine the system's spatial resolution performance. (authors)
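The abstract does not give the normalization formula, so the following sketch relies on the convention used in the industrial CR standards as we understand them, in which the measured SNR is scaled by 88.6 µm divided by the basic spatial resolution. The function names and the numeric inputs are illustrative assumptions, and the class limits are deliberately left as caller-supplied parameters because the abstract does not list them.

```python
REFERENCE_APERTURE_UM = 88.6  # normalization constant in micrometres (assumed from the CR standards)


def normalized_snr(snr_measured, srb_um):
    """Scale a measured SNR to the reference aperture: SNRN = SNR * 88.6 / SRb."""
    if srb_um <= 0:
        raise ValueError("basic spatial resolution must be positive")
    return snr_measured * REFERENCE_APERTURE_UM / srb_um


def meets_class(snr_measured, srb_um, min_snrn, max_srb_um):
    """Check a system against one class, given that class's two limits.

    min_snrn and max_srb_um must come from the applicable standard;
    no class limits are hard-coded in this sketch.
    """
    return normalized_snr(snr_measured, srb_um) >= min_snrn and srb_um <= max_srb_um


if __name__ == "__main__":
    # Illustrative numbers only, not measurements from the paper.
    print(normalized_snr(120.0, 100.0))
```

The two-part check mirrors the abstract's statement that a class is defined jointly by a minimum SNRN and a maximum SRb.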
Nuclear forces and high-performance computing: The perfect match

International Nuclear Information System (INIS)

Luu, T; Walker-Loud, A

2009-01-01

High-performance computing is now enabling the calculation of certain hadronic interaction parameters directly from Quantum Chromodynamics, the quantum field theory that governs the behavior of quarks and gluons and is ultimately responsible for the nuclear strong force. In this paper we briefly describe the state of the field and show how other aspects of hadronic interactions will be ascertained in the near future. We give estimates of computational requirements needed to obtain these goals, and outline a procedure for incorporating these results into the broader nuclear physics community.
Using the Eclipse Parallel Tools Platform to Assist Earth Science Model Development and Optimization on High Performance Computers

Science.gov (United States)

Alameda, J. C.

2011-12-01

Development and optimization of computational science models, particularly on high performance computers and, with the advent of ubiquitous multicore processors, on practically every system, has been accomplished with basic software tools: typically command-line compilers, debuggers, and performance tools that have not changed substantially since the days of serial and early vector computers. However, model complexity, including the complexity added by modern message passing libraries such as MPI, and the need for hybrid code models (such as OpenMP and MPI) to take full advantage of high performance computers with an increasing core count per shared memory node, has made development and optimization of such codes an increasingly arduous task. Additional architectural developments, such as many-core processors, only complicate the situation further. In this paper, we describe how our NSF-funded project, "SI2-SSI: A Productive and Accessible Development Workbench for HPC Applications Using the Eclipse Parallel Tools Platform" (WHPC) seeks to improve the Eclipse Parallel Tools Platform (PTP), an environment designed to support scientific code development targeted at a diverse set of high performance computing systems. The WHPC project takes an application-centric view of improving PTP. We are using a set of scientific applications, each with a variety of challenges, and using PTP both to drive further improvements to the scientific applications and to understand shortcomings in Eclipse PTP from an application developer's perspective, which in turn drives the list of improvements we seek to make. We are also partnering with performance tool providers to drive higher quality performance tool integration. We have partnered with the Cactus group at Louisiana State University to improve Eclipse's ability to work with computational frameworks and extremely complex build systems, as well as to develop educational materials to incorporate into
Top scientific research center deploys Zambeel Aztera (TM) network storage system in high performance environment

CERN Multimedia

2002-01-01

"The National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory has implemented a Zambeel Aztera storage system and software to accelerate the productivity of scientists running high performance scientific simulations and computations" (1 page).
NINJA: Java for High Performance Numerical Computing

Directory of Open Access Journals (Sweden)

José E. Moreira

2002-01-01

When Java was first introduced, there was a perception that its many benefits came at a significant performance cost. In the particularly performance-sensitive field of numerical computing, initial measurements indicated a hundred-fold performance disadvantage between Java and more established languages such as Fortran and C. Although much progress has been made, and Java now can be competitive with C/C++ in many important situations, significant performance challenges remain. Existing Java virtual machines are not yet capable of performing the advanced loop transformations and automatic parallelization that are now common in state-of-the-art Fortran compilers. Java also has difficulties in implementing complex arithmetic efficiently. These performance deficiencies can be attacked with a combination of class libraries (packages, in Java) that implement truly multidimensional arrays and complex numbers, and new compiler techniques that exploit the properties of these class libraries to enable other, more conventional, optimizations. Two compiler techniques, versioning and semantic expansion, can be leveraged to allow fully automatic optimization and parallelization of Java code. Our measurements with the NINJA prototype Java environment show that Java can be competitive in performance with highly optimized and tuned Fortran code.
RAPPORT: running scientific high-performance computing applications on the cloud.

Science.gov (United States)

Cohen, Jeremy; Filippis, Ioannis; Woodbridge, Mark; Bauer, Daniela; Hong, Neil Chue; Jackson, Mike; Butcher, Sarah; Colling, David; Darlington, John; Fuchs, Brian; Harvey, Matt

2013-01-28

Cloud computing infrastructure is now widely used in many domains, but one area where there has been more limited adoption is research computing, in particular for running scientific high-performance computing (HPC) software. The Robust Application Porting for HPC in the Cloud (RAPPORT) project took advantage of existing links between computing researchers and application scientists in the fields of bioinformatics, high-energy physics (HEP) and digital humanities, to investigate running a set of scientific HPC applications from these domains on cloud infrastructure. In this paper, we focus on the bioinformatics and HEP domains, describing the applications and target cloud platforms. We conclude that, while there are many factors that need consideration, there is no fundamental impediment to the use of cloud infrastructure for running many types of HPC applications and, in some cases, there is potential for researchers to benefit significantly from the flexibility offered by cloud platforms.
Analysis and Modeling of Social Influence in High Performance Computing Workloads

KAUST Repository

Zheng, Shuai

2011-06-01

High Performance Computing (HPC) is becoming a common tool in many research areas. Social influence (e.g., project collaboration) among the growing number of users of HPC systems creates bursty behavior in underlying workloads. This bursty behavior is increasingly common with the advent of grid computing and cloud computing. Mining bursty user behavior is important for HPC workload prediction and scheduling, which has a direct impact on overall HPC computing performance. A representative work in this area is the Mixed User Group Model (MUGM), which clusters users according to the resource demand features of their submissions, such as duration time and parallelism. However, MUGM has some difficulties when implemented in a real-world system. First, representing user behaviors by the features of their resource demand is usually difficult. Second, these features are not always available. Third, measuring the similarities among users is not a well-defined problem. In this work, we propose a Social Influence Model (SIM) to identify, analyze, and quantify the level of social influence across HPC users. The advantage of SIM is that it finds HPC communities by analyzing user job submission times, thereby avoiding the difficulties of MUGM. We propose an offline algorithm and a fast-converging, computationally efficient online learning algorithm for identifying social groups. Both algorithms are applied to several HPC and grid workloads, including Grid 5000, EGEE 2005 and 2007, and KAUST Supercomputing Lab (KSL) BGP data. From the experimental results, we show the existence of a social graph, which is characterized by a pattern of dominant users and followers. To evaluate the effectiveness of the identified user groups, we show that the pattern discovered by the offline algorithm follows a power-law distribution, which is consistent with those observed in mainstream social networks. We finally conclude the thesis and discuss future directions of our work.
Real-time Tsunami Inundation Prediction Using High Performance Computers

Science.gov (United States)

Oishi, Y.; Imamura, F.; Sugawara, D.

2014-12-01

Recently, offshore tsunami observation stations based on cabled ocean-bottom pressure gauges have been actively deployed, especially in Japan. These cabled systems are designed to provide real-time tsunami data before tsunamis reach coastlines, for disaster mitigation purposes. To obtain real benefit from these observations, real-time analysis techniques that make effective use of the data are necessary. A representative study by Tsushima et al. (2009) proposed a method for instant tsunami source prediction based on acquired tsunami waveform data. As time passes, the prediction is improved by using updated waveform data. After a tsunami source is predicted, tsunami waveforms are synthesized from pre-computed tsunami Green functions of the linear long-wave equations. Tsushima et al. (2014) updated the method by combining the tsunami waveform inversion with an instant inversion of coseismic crustal deformation, and improved the prediction accuracy and speed in the early stages. For disaster mitigation purposes, real-time predictions of tsunami inundation are also important. In this study, we discuss the possibility of real-time tsunami inundation predictions, which require faster-than-real-time tsunami inundation simulation in addition to instant tsunami source analysis. Although solving the non-linear shallow water equations for inundation predictions is computationally expensive, it has become feasible through recent developments in high performance computing technologies. We conducted parallel computations of tsunami inundation and achieved 6.0 TFLOPS using 19,000 CPU cores. We employed a leap-frog finite difference method with nested staggered grids whose resolutions range from 405 m to 5 m. The resolution ratio between successive nested domains was 1/3. The total number of grid points was 13 million, and the time step was 0.1 seconds. Tsunami sources of the 2011 Tohoku-oki earthquake were tested. The inundation prediction up to 2 hours after the
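The grid hierarchy and time stepping described in this record can be illustrated with a small sketch. It is not the authors' code: it assumes a one-dimensional, linear long-wave problem with closed walls (the study itself solves non-linear shallow water equations in two dimensions), and the names nested_resolutions and leapfrog_step are hypothetical. The update alternates between discharge at cell faces and surface elevation at cell centres, which is the usual staggered leap-frog arrangement.

```python
import numpy as np

def nested_resolutions(coarsest=405.0, finest=5.0, ratio=3.0):
    """List grid spacings from coarsest to finest, refining by `ratio` per level."""
    levels = [coarsest]
    while levels[-1] / ratio >= finest:
        levels.append(levels[-1] / ratio)
    return levels

def leapfrog_step(eta, flux, depth, dx, dt, g=9.81):
    """Advance 1-D linear long-wave equations by one staggered leap-frog step.

    eta   : surface elevation at cell centres, shape (n,)
    flux  : discharge at cell faces, shape (n + 1,); end faces are closed walls
    depth : still-water depth at cell faces, shape (n + 1,)
    """
    flux = flux.copy()
    # Momentum equation on interior faces, driven by the elevation gradient.
    flux[1:-1] -= g * depth[1:-1] * dt / dx * (eta[1:] - eta[:-1])
    # Continuity equation at cell centres, using the freshly updated discharge.
    eta = eta - dt / dx * (flux[1:] - flux[:-1])
    return eta, flux

print(nested_resolutions())  # [405.0, 135.0, 45.0, 15.0, 5.0]
```

With the spacings and ratio quoted in the abstract, the hierarchy has five levels. On the finest level (dx = 5 m, dt = 0.1 s) the explicit scheme requires the long-wave speed sqrt(g * depth) to stay below dx / dt = 50 m/s, which is an observation derived from the quoted numbers rather than a statement from the abstract.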
FY 1995 Blue Book: High Performance Computing and Communications: Technology for the National Information Infrastructure

Data.gov (United States)

Networking and Information Technology Research and Development, Executive Office of the President — The Federal High Performance Computing and Communications HPCC Program was created to accelerate the development of future generations of high performance computers...
Human Computer Music Performance

OpenAIRE

Dannenberg, Roger B.

2012-01-01

Human Computer Music Performance (HCMP) is the study of music performance by live human performers and real-time computer-based performers. One goal of HCMP is to create a highly autonomous artificial performer that can fill the role of a human, especially in a popular music setting. This will require advances in automated music listening and understanding, new representations for music, techniques for music synchronization, real-time human-computer communication, music generation, sound synt...
A parallel calibration utility for WRF-Hydro on high performance computers

Science.gov (United States)

Wang, J.; Wang, C.; Kotamarthi, V. R.

2017-12-01

Successful modeling of complex hydrological processes comprises establishing an integrated hydrological model that simulates the hydrological processes in each water regime, calibrating and validating the model performance based on observation data, and estimating the uncertainties from different sources, especially those associated with parameters. Such a model system requires large computing resources and often has to be run on High Performance Computers (HPC). The recently developed WRF-Hydro modeling system provides a significant advancement in the capability to simulate regional water cycles more completely. The WRF-Hydro model has a large range of parameters, such as those in the input table files (GENPARM.TBL, SOILPARM.TBL and CHANPARM.TBL), and several distributed scaling factors such as OVROUGHRTFAC. These parameters affect the behavior and outputs of the model and thus may need to be calibrated against observations in order to obtain good modeling performance. A parameter calibration tool designed specifically for automated calibration and uncertainty estimation of the WRF-Hydro model can provide significant convenience for the modeling community. In this study, we developed a customized tool using the parallel version of the model-independent parameter estimation and uncertainty analysis tool, PEST, to enable it to run on HPC systems with the PBS and SLURM workload managers and job schedulers. We also developed a series of PEST input file templates specifically for WRF-Hydro model calibration and uncertainty analysis. Here we present a case study of a flood that occurred in April 2013 over the Midwest. The sensitivity and uncertainties are analyzed using the customized PEST tool we developed.
HIGH-PERFORMANCE COMPUTING FOR THE STUDY OF EARTH AND ENVIRONMENTAL SCIENCE MATERIALS USING SYNCHROTRON X-RAY COMPUTED MICROTOMOGRAPHY

International Nuclear Information System (INIS)

FENG, H.; JONES, K.W.; MCGUIGAN, M.; SMITH, G.J.; SPILETIC, J.

2001-01-01

Synchrotron x-ray computed microtomography (CMT) is a non-destructive method for examination of rock, soil, and other types of samples studied in the earth and environmental sciences. The high x-ray intensities of the synchrotron source make possible the acquisition of tomographic volumes at a high rate that requires the application of high-performance computing techniques for data reconstruction to produce the three-dimensional volumes, for their visualization, and for data analysis. These problems are exacerbated by the need to share information between collaborators at widely separated locations over both local and wide-area networks. A summary of the CMT technique and examples of applications are given here together with a discussion of the applications of high-performance computing methods to improve the experimental techniques and analysis of the data.
HIGH-PERFORMANCE COMPUTING FOR THE STUDY OF EARTH AND ENVIRONMENTAL SCIENCE MATERIALS USING SYNCHROTRON X-RAY COMPUTED MICROTOMOGRAPHY.

Energy Technology Data Exchange (ETDEWEB)

FENG,H.; JONES,K.W.; MCGUIGAN,M.; SMITH,G.J.; SPILETIC,J.

2001-10-12

Synchrotron x-ray computed microtomography (CMT) is a non-destructive method for examination of rock, soil, and other types of samples studied in the earth and environmental sciences. The high x-ray intensities of the synchrotron source make possible the acquisition of tomographic volumes at a high rate that requires the application of high-performance computing techniques for data reconstruction to produce the three-dimensional volumes, for their visualization, and for data analysis. These problems are exacerbated by the need to share information between collaborators at widely separated locations over both local and wide-area networks. A summary of the CMT technique and examples of applications are given here together with a discussion of the applications of high-performance computing methods to improve the experimental techniques and analysis of the data.

Thinking processes used by high-performing students in a computer programming task

Directory of Open Access Journals (Sweden)

Marietjie Havenga

2011-07-01

Computer programmers must be able to understand programming source code and write programs that execute complex tasks to solve real-world problems. This article is a transdisciplinary study at the intersection of computer programming, education and psychology. It outlines the role of mental processes in the process of programming and indicates how successful thinking processes can support computer science students in writing correct and well-defined programs. A mixed methods approach was used to better understand the thinking activities and programming processes of participating students. Data collection involved both computer programs and students’ reflective thinking processes recorded in their journals. This enabled analysis of psychological dimensions of participants’ thinking processes and their problem-solving activities as they considered a programming problem. Findings indicate that the cognitive, reflective and psychological processes used by high-performing programmers contributed to their success in solving a complex programming problem. Based on the thinking processes of high performers, we propose a model of integrated thinking processes, which can support computer programming students. Keywords: Computer programming, education, mixed methods research, thinking processes. Disciplines: Computer programming, education, psychology
Spatial Processing of Urban Acoustic Wave Fields from High-Performance Computations

National Research Council Canada - National Science Library

Ketcham, Stephen A; Wilson, D. K; Cudney, Harley H; Parker, Michael W

2007-01-01

.... The objective of this work is to develop spatial processing techniques for acoustic wave propagation data from three-dimensional high-performance computations to quantify scattering due to urban...
DEISA2: supporting and developing a European high-performance computing ecosystem

International Nuclear Information System (INIS)

Lederer, H

2008-01-01

The DEISA Consortium has deployed and operated the Distributed European Infrastructure for Supercomputing Applications. Through the EU FP7 DEISA2 project (funded for three years as of May 2008), the consortium is continuing to support and enhance the distributed high-performance computing infrastructure and its activities and services relevant for applications enabling, operation, and technologies, as these are indispensable for the effective support of computational sciences for high-performance computing (HPC). The service-provisioning model will be extended from one that supports single projects to one supporting virtual European communities. Collaborative activities will also be carried out with new European and other international initiatives. Of strategic importance is cooperation with the PRACE project, which is preparing for the installation of a limited number of leadership-class Tier-0 supercomputers in Europe. The key role and aim of DEISA will be to deliver a turnkey operational solution for a persistent European HPC ecosystem that will integrate national Tier-1 centers and the new Tier-0 centers
Application of High Performance Computing to Earthquake Hazard and Disaster Estimation in Urban Area

Directory of Open Access Journals (Sweden)

Muneo Hori

2018-02-01

Integrated earthquake simulation (IES) is a seamless simulation that analyzes all processes of earthquake hazard and disaster. There are two difficulties in carrying out IES, namely, the requirement of large-scale computation and the requirement of numerous analysis models for structures in an urban area. They are solved by taking advantage of high performance computing (HPC) and by developing a system of automated model construction. HPC is a key element in developing IES, as it needs to analyze wave propagation and amplification processes in an underground structure; a high-fidelity model of the underground structure has more than 100 billion degrees of freedom. Examples of IES for the Tokyo Metropolis are presented; the numerical computation is carried out on the K computer, a supercomputer in Japan. The estimation of earthquake hazard and disaster for a given earthquake scenario is made by the ground motion simulation and the urban area seismic response simulation, respectively, for a target area of 10,000 m × 10,000 m.
High performance simulation for the Silva project using the tera computer

International Nuclear Information System (INIS)

Bergeaud, V.; La Hargue, J.P.; Mougery, F.; Boulet, M.; Scheurer, B.; Le Fur, J.F.; Comte, M.; Benisti, D.; Lamare, J. de; Petit, A.

2003-01-01

In the context of the SILVA Project (Atomic Vapor Laser Isotope Separation), numerical simulation of the plant-scale propagation of laser beams through uranium vapour was a great challenge. The PRODIGE code has been developed to achieve this goal. Here we focus on the task of achieving high performance simulation on the TERA computer. We describe the main issues in optimizing the parallelization of the PRODIGE code on TERA, and we discuss the advantages and drawbacks of the implemented diagonal parallelization scheme. As a consequence, it proved fruitful to tune the code in three respects: memory allocation, MPI communications and interconnection network bandwidth usage. We stress the value of MPI/IO in this context and the benefit obtained for production computations on TERA. Finally, we illustrate our developments with performance measurements that reflect the good parallelization properties of PRODIGE on the TERA computer. The code is currently used for demonstrating the feasibility of the laser propagation at a plant enrichment level and for preparing the 2003 Menphis experiment. We conclude by emphasizing the contribution of high performance TERA simulation to the project. (authors)
High performance simulation for the Silva project using the tera computer

Energy Technology Data Exchange (ETDEWEB)

Bergeaud, V.; La Hargue, J.P.; Mougery, F. [CS Communication and Systemes, 92 - Clamart (France); Boulet, M.; Scheurer, B. [CEA Bruyeres-le-Chatel, 91 - Bruyeres-le-Chatel (France); Le Fur, J.F.; Comte, M.; Benisti, D.; Lamare, J. de; Petit, A. [CEA Saclay, 91 - Gif sur Yvette (France)

2003-07-01

In the context of the SILVA Project (Atomic Vapor Laser Isotope Separation), numerical simulation of the plant-scale propagation of laser beams through uranium vapour was a great challenge. The PRODIGE code has been developed to achieve this goal. Here we focus on the task of achieving high performance simulation on the TERA computer. We describe the main issues in optimizing the parallelization of the PRODIGE code on TERA, and we discuss the advantages and drawbacks of the implemented diagonal parallelization scheme. As a consequence, it proved fruitful to tune the code in three respects: memory allocation, MPI communications and interconnection network bandwidth usage. We stress the value of MPI/IO in this context and the benefit obtained for production computations on TERA. Finally, we illustrate our developments with performance measurements that reflect the good parallelization properties of PRODIGE on the TERA computer. The code is currently used for demonstrating the feasibility of the laser propagation at a plant enrichment level and for preparing the 2003 Menphis experiment. We conclude by emphasizing the contribution of high performance TERA simulation to the project. (authors)
Achieving high performance in numerical computations on RISC workstations and parallel systems

Energy Technology Data Exchange (ETDEWEB)

Goedecker, S. [Max-Planck Inst. for Solid State Research, Stuttgart (Germany); Hoisie, A. [Los Alamos National Lab., NM (United States)

1997-08-20

The nominal peak speeds of both serial and parallel computers are rising rapidly. At the same time, however, it is becoming increasingly difficult to extract a significant fraction of this high peak speed from modern computer architectures. In this tutorial the authors give scientists and engineers involved in numerically demanding calculations and simulations the basic knowledge necessary to write reasonably efficient programs. The basic principles are rather simple and the possible rewards large. Writing a program with optimization techniques related to the computer architecture in mind can significantly speed up the program, often by factors of 10 to 100. As such, optimizing a program can, for instance, be a much better solution than buying a faster computer. If a few basic optimization principles are applied during program development, the additional time needed to obtain an efficient program is practically negligible. In-depth optimization is usually only needed for a few subroutines or kernels, and the effort involved is therefore also acceptable.
A Computer Controlled Precision High Pressure Measuring System

Science.gov (United States)

Sadana, S.; Yadav, S.; Jha, N.; Gupta, V. K.; Agarwal, R.; Bandyopadhyay, A. K.; Saxena, T. K.

2011-01-01

Microcontroller (AT89C51) based electronics have been designed and developed for a high-precision calibrator based on a Digiquartz pressure transducer (DQPT) for the measurement of high hydrostatic pressure up to 275 MPa. The input signal from the DQPT is converted into a square wave, and its frequency is multiplied by a factor of ten by a frequency multiplier circuit that uses a phase-locked loop. An octal buffer is used to store the calculated frequency, which in turn is fed to the AT89C51 microcontroller, interfaced with a liquid crystal display that shows the frequency as well as the corresponding pressure in user-friendly units. The electronics are interfaced with a computer over RS232 for automatic data acquisition, computation and storage, making this a computer controlled system. The data acquisition software is written in Visual Basic 6.0. The system is capable of measuring frequency up to 4 MHz with a resolution of 0.01 Hz and pressure up to 275 MPa with a resolution of 0.001 MPa, within a measurement uncertainty of 0.025%. The details of the hardware of the pressure measuring system, associated electronics, software and calibration are discussed in this paper.
Distributed metadata in a high performance computing environment

Science.gov (United States)

Bent, John M.; Faibish, Sorin; Zhang, Zhenhua; Liu, Xuezhao; Tang, Haiying

2017-07-11

A computer-executable method, system, and computer program product for managing metadata in a distributed storage system, wherein the distributed storage system includes one or more burst buffers enabled to operate with a distributed key-value store, the computer-executable method, system, and computer program product comprising receiving a request for metadata associated with a block of data stored in a first burst buffer of the one or more burst buffers in the distributed storage system, wherein the metadata is associated with a key-value, determining which of the one or more burst buffers stores the requested metadata, and upon determination that a first burst buffer of the one or more burst buffers stores the requested metadata, locating the key-value in a portion of the distributed key-value store accessible from the first burst buffer.
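The lookup flow in this record (receive a request, determine which burst buffer holds the metadata, then read the key-value from the portion of the store reachable from that buffer) can be sketched as follows. The record does not say how the owning buffer is determined, so hashing the key is purely an assumption made for illustration, and BurstBuffer, owner_index, put_metadata and get_metadata are hypothetical names rather than anything taken from the patent.

```python
import hashlib

class BurstBuffer:
    """A burst buffer together with the key-value shard reachable from it."""
    def __init__(self, name):
        self.name = name
        self.kv_shard = {}  # this buffer's portion of the distributed key-value store

def owner_index(key, n_buffers):
    """Assumed placement rule: hash the key and map it onto one of the buffers."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_buffers

def put_metadata(buffers, key, metadata):
    """Store metadata for a data block in the shard of the owning burst buffer."""
    buffers[owner_index(key, len(buffers))].kv_shard[key] = metadata

def get_metadata(buffers, key):
    """Determine the owning burst buffer, then locate the key-value in its shard."""
    owner = buffers[owner_index(key, len(buffers))]
    return owner.name, owner.kv_shard.get(key)

buffers = [BurstBuffer(f"bb{i}") for i in range(4)]
put_metadata(buffers, "block-0042", {"offset": 0, "length": 1048576})
owner_name, meta = get_metadata(buffers, "block-0042")
```

A deterministic placement rule such as this one lets any node find the owning buffer without a central directory; a real system could equally use a lookup table or a consistent-hashing ring.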
High-performance floating-point image computing workstation for medical applications

Science.gov (United States)

Mills, Karl S.; Wong, Gilman K.; Kim, Yongmin

1990-07-01

The medical imaging field relies increasingly on imaging and graphics techniques in diverse applications with needs similar to (or more stringent than) those of the military, industrial and scientific communities. However, most image processing and graphics systems available for use in medical imaging today are either expensive, specialized, or in most cases both. High performance imaging and graphics workstations which can provide real-time results for a number of applications, while maintaining affordability and flexibility, can facilitate the application of digital image computing techniques in many different areas. This paper describes the hardware and software architecture of a medium-cost floating-point image processing and display subsystem for the NeXT computer, and its applications as a medical imaging workstation. Medical imaging applications of the workstation include use in a Picture Archiving and Communications System (PACS), as a multimodal image processing and 3-D graphics workstation for a broad range of imaging modalities, and as an electronic alternator utilizing its multiple monitor display capability and large and fast frame buffer. The subsystem provides a 2048 x 2048 x 32-bit frame buffer (16 Mbytes of image storage) and supports both 8-bit gray scale and 32-bit true color images. When used to display 8-bit gray scale images, up to four different 256-color palettes may be used for each of four 2K x 2K x 8-bit image frames. Three of these image frames can be used simultaneously to provide pixel selectable region of interest display. A 1280 x 1024 pixel screen with 1:1 aspect ratio can be windowed into the frame buffer for display of any portion of the processed image or images. In addition, the system provides hardware support for integer zoom and an 82-color cursor. This subsystem is implemented on an add-in board occupying a single slot in the NeXT computer. Up to three boards may be added to the NeXT for multiple display capability (e
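The frame buffer figures quoted in this record can be checked with a few lines of arithmetic. The helper frame_bytes is a hypothetical name, and the sketch assumes that "Mbytes" means 2^20 bytes and that the frame buffer is packed with no padding.

```python
def frame_bytes(width, height, bits_per_pixel):
    """Storage needed for one packed image frame, in bytes."""
    return width * height * bits_per_pixel // 8

MIB = 1024 * 1024

full_color = frame_bytes(2048, 2048, 32)  # the 32-bit true color frame buffer
gray_frame = frame_bytes(2048, 2048, 8)   # one 2K x 2K x 8-bit gray scale frame

print(full_color // MIB)         # 16
print(full_color // gray_frame)  # 4
```

Under those assumptions the 32-bit buffer holds 16 Mbytes, and the same memory divides into exactly four 8-bit frames, which matches the four gray scale frames described in the abstract.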
Can We Build a Truly High Performance Computer Which is Flexible and Transparent?

KAUST Repository

Rojas, Jhonathan Prieto

2013-09-10

State-of-the-art computers need high performance transistors, which consume ultra-low power resulting in longer battery lifetime. Billions of transistors are integrated neatly using matured silicon fabrication process to maintain the performance per cost advantage. In that context, low-cost mono-crystalline bulk silicon (100) based high performance transistors are considered as the heart of today's computers. One limitation is silicon's rigidity and brittleness. Here we show a generic batch process to convert high performance silicon electronics into flexible and semi-transparent one while retaining its performance, process compatibility, integration density and cost. We demonstrate high-k/metal gate stack based p-type metal oxide semiconductor field effect transistors on 4 inch silicon fabric released from bulk silicon (100) wafers with sub-threshold swing of 80 mV/dec and on/off ratio of near 10^4 within 10% device uniformity with a minimum bending radius of 5 mm and an average transmittance of ~7% in the visible spectrum.
Analysis and modeling of social influence in high performance computing workloads

KAUST Repository

Zheng, Shuai; Shae, Zon Yin; Zhang, Xiangliang; Jamjoom, Hani T.; Fong, Liana

2011-01-01

Social influence among users (e.g., collaboration on a project) creates bursty behavior in the underlying high performance computing (HPC) workloads. Using representative HPC and cluster workload logs, this paper identifies, analyzes, and quantifies
Using a Computer-based Messaging System at a High School To Increase School/Home Communication.

Science.gov (United States)

Burden, Mitzi K.

Minimal communication between school and home was found to contribute to low performance by students at McDuffie High School (South Carolina). This report describes the experience of establishing a computer-based telephone messaging system in the high school and involving parents, teachers, and students in its use. Additional strategies employed…
High-performance OPCPA laser system

International Nuclear Information System (INIS)

Zuegel, J.D.; Bagnoud, V.; Bromage, J.; Begishev, I.A.; Puth, J.

2006-01-01

Optical parametric chirped-pulse amplification (OPCPA) is ideally suited for amplifying ultra-fast laser pulses since it provides broadband gain across a wide range of wavelengths without many of the disadvantages of regenerative amplification. A high-performance OPCPA system has been demonstrated as a prototype for the front end of the OMEGA Extended Performance (EP) Laser System. (authors)
High-performance OPCPA laser system

Energy Technology Data Exchange (ETDEWEB)

Zuegel, J.D.; Bagnoud, V.; Bromage, J.; Begishev, I.A.; Puth, J. [Rochester Univ., Lab. for Laser Energetics, NY (United States)

2006-06-15

Optical parametric chirped-pulse amplification (OPCPA) is ideally suited for amplifying ultra-fast laser pulses since it provides broadband gain across a wide range of wavelengths without many of the disadvantages of regenerative amplification. A high-performance OPCPA system has been demonstrated as a prototype for the front end of the OMEGA Extended Performance (EP) Laser System. (authors)
Framework for generating expert systems to perform computer security risk analysis

International Nuclear Information System (INIS)

Smith, S.T.; Lim, J.J.

1985-01-01

At Los Alamos we are developing a framework to generate knowledge-based expert systems for performing automated risk analyses upon a subject system. The expert system is a computer program that models experts' knowledge about a topic, including facts, assumptions, insights, and decision rationale. The subject system, defined as the collection of information, procedures, devices, and real property upon which the risk analysis is to be performed, is a member of the class of systems that have three identifying characteristics: a set of desirable assets (or targets), a set of adversaries (or threats) desiring to obtain or to do harm to the assets, and a set of protective mechanisms to safeguard the assets from the adversaries. Risk analysis evaluates both vulnerability to and the impact of successful threats against the targets by determining the overall effectiveness of the subject system safeguards, identifying vulnerabilities in that set of safeguards, and determining cost-effective improvements to the safeguards. As a testbed, we evaluate the inherent vulnerabilities and risks in a system of computer security safeguards. The method considers safeguards protecting four generic targets (physical plant of the computer installation, its hardware, its software, and its documents and displays) against three generic threats (natural hazards, direct human actions requiring the presence of the adversary, and indirect human actions wherein the adversary is not on the premises, perhaps using such access tools as wiretaps, dialup lines, and so forth). Our automated procedure to assess the effectiveness of computer security safeguards differs from traditional risk analysis methods
Bringing high-performance computing to the biologist's workbench: approaches, applications, and challenges

International Nuclear Information System (INIS)

Oehmen, C S; Cannon, W R

2008-01-01

Data-intensive and high-performance computing are poised to significantly impact the future of biological research, which is increasingly driven by the prevalence of high-throughput experimental methodologies for genome sequencing, transcriptomics, proteomics, and other areas. Large centers (such as NIH's National Center for Biotechnology Information, The Institute for Genomic Research, and the DOE's Joint Genome Institute) have made extensive use of multiprocessor architectures to deal with some of the challenges of processing, storing and curating exponentially growing genomic and proteomic datasets, thus enabling users to rapidly access a growing public data source, as well as use analysis tools transparently on high-performance computing resources. Applying this computational power to single-investigator analysis, however, often relies on users to provide their own computational resources, forcing them to endure the learning curve of porting, building, and running software on multiprocessor architectures. Solving the next generation of large-scale biology challenges using multiprocessor machines (from small clusters to emerging petascale machines) can most practically be realized if this learning curve can be minimized through a combination of workflow management, data management and resource allocation as well as intuitive interfaces and compatibility with existing common data formats
High-Performance Computer Modeling of the Cosmos-Iridium Collision

Energy Technology Data Exchange (ETDEWEB)

Olivier, S; Cook, K; Fasenfest, B; Jefferson, D; Jiang, M; Leek, J; Levatin, J; Nikolaev, S; Pertica, A; Phillion, D; Springer, K; De Vries, W

2009-08-28

This paper describes the application of a new, integrated modeling and simulation framework, encompassing the space situational awareness (SSA) enterprise, to the recent Cosmos-Iridium collision. This framework is based on a flexible, scalable architecture to enable efficient simulation of the current SSA enterprise, and to accommodate future advancements in SSA systems. In particular, the code is designed to take advantage of massively parallel, high-performance computer systems available, for example, at Lawrence Livermore National Laboratory. We will describe the application of this framework to the recent collision of the Cosmos and Iridium satellites, including (1) detailed hydrodynamic modeling of the satellite collision and resulting debris generation, (2) orbital propagation of the simulated debris and analysis of the increased risk to other satellites, (3) calculation of the radar and optical signatures of the simulated debris and modeling of debris detection with space surveillance radar and optical systems, (4) determination of simulated debris orbits from modeled space surveillance observations and analysis of the resulting orbital accuracy, and (5) comparison of these modeling and simulation results with Space Surveillance Network observations. We will also discuss the use of this integrated modeling and simulation framework to analyze the risks and consequences of future satellite collisions and to assess strategies for mitigating or avoiding future incidents, including the addition of new sensor systems, used in conjunction with the Space Surveillance Network, for improving space situational awareness.
High performance in software development

CERN Multimedia

CERN. Geneva; Haapio, Petri; Liukkonen, Juha-Matti

2015-01-01

What are the ingredients of high-performing software? Software development, especially for large high-performance systems, is one of the most complex tasks mankind has ever tried. Technological change leads to huge opportunities but challenges our old ways of working. Processing large data sets, possibly in real time or with other tight computational constraints, requires an efficient solution architecture. Efficiency requirements span from the distributed storage and large-scale organization of computation and data down to the lowest level of processor and data bus behavior. Integrating performance behavior over these levels is especially important when the computation is resource-bounded, as it is in numerics: physical simulation, machine learning, estimation of statistical models, etc. For example, memory locality and utilization of vector processing are essential for harnessing the computing power of modern processor architectures due to the deep memory hierarchies of modern general-purpose computers. As a r...
Computer-Related Task Performance

DEFF Research Database (Denmark)

Longstreet, Phil; Xiao, Xiao; Sarker, Saonee

2016-01-01

The existing information system (IS) literature has acknowledged computer self-efficacy (CSE) as an important factor contributing to enhancements in computer-related task performance. However, the empirical results of CSE on performance have not always been consistent, and increasing an individual's CSE is often a cumbersome process. Thus, we introduce the theoretical concept of self-prophecy (SP) and examine how this social influence strategy can be used to improve computer-related task performance. Two experiments are conducted to examine the influence of SP on task performance. Results show that SP and CSE interact to influence performance. Implications are then discussed in terms of organizations’ ability to increase performance.

Systems, methods and computer-readable media to model kinetic performance of rechargeable electrochemical devices

Science.gov (United States)

Gering, Kevin L.

2013-01-01

A system includes an electrochemical cell, monitoring hardware, and a computing system. The monitoring hardware samples performance characteristics of the electrochemical cell. The computing system determines cell information from the performance characteristics. The computing system also analyzes the cell information of the electrochemical cell with a Butler-Volmer (BV) expression modified to determine exchange current density of the electrochemical cell by including kinetic performance information related to pulse-time dependence, electrode surface availability, or a combination thereof. A set of sigmoid-based expressions may be included with the modified-BV expression to determine kinetic performance as a function of pulse time. The determined exchange current density may be used with the modified-BV expression, with or without the sigmoid expressions, to analyze other characteristics of the electrochemical cell. Model parameters can be defined in terms of cell aging, making the overall kinetics model amenable to predictive estimates of cell kinetic performance along the aging timeline.
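For orientation, the textbook Butler-Volmer relation that this record builds on can be written as a short function. This is only the standard form, i = i0 * (exp(alpha_a * F * eta / (R * T)) - exp(-alpha_c * F * eta / (R * T))); the modified expression described in the record, with pulse-time dependence and electrode surface availability, is not reproduced here. The function names and the default transfer coefficients of 0.5 are assumptions chosen for illustration.

```python
import math

FARADAY = 96485.33212    # Faraday constant, C/mol
GAS_CONST = 8.314462618  # molar gas constant, J/(mol K)

def _bv_factor(eta, temperature, alpha_a, alpha_c):
    """Dimensionless anodic-minus-cathodic exponential term of Butler-Volmer."""
    f = FARADAY / (GAS_CONST * temperature)
    return math.exp(alpha_a * f * eta) - math.exp(-alpha_c * f * eta)

def butler_volmer(i0, eta, temperature=298.15, alpha_a=0.5, alpha_c=0.5):
    """Current density for exchange current density i0 and overpotential eta (V)."""
    return i0 * _bv_factor(eta, temperature, alpha_a, alpha_c)

def exchange_current_density(i, eta, temperature=298.15, alpha_a=0.5, alpha_c=0.5):
    """Invert the standard relation for i0 from a measured (i, eta) pair, eta != 0."""
    return i / _bv_factor(eta, temperature, alpha_a, alpha_c)

i_measured = butler_volmer(i0=2.0, eta=0.05)
i0_recovered = exchange_current_density(i_measured, eta=0.05)
```

The second function shows, in the simplest possible setting, what determining exchange current density from sampled performance data means: given a current density and overpotential, the relation is solved for i0.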
A Lightweight, High-performance I/O Management Package for Data-intensive Computing

Energy Technology Data Exchange (ETDEWEB)

Wang, Jun

2011-06-22

During this project period, our group has been working with ANL collaborators on bridging the gap between parallel file systems and local file systems. We visited Dr. Robert Ross's group at Argonne National Lab for one week in summer 2007. We reviewed our current project progress and planned the activities for the coming years 2008-09. The PI met Dr. Robert Ross several times, for example at the HEC FSIO workshop 08, SC08 and SC10. We explored opportunities to develop a production system by leveraging our current prototype (SOGP+PVFS) for a new PVFS version. We delivered the SOGP+PVFS code to the ANL PVFS2 group in 2008. We also discussed a potential project on developing new parallel programming models and runtime systems for data-intensive scalable computing (DISC). The methodology is to evolve MPI towards DISC by incorporating some functions of the Google MapReduce parallel programming model. More recently, we have been jointly exploring how to leverage existing work to perform (1) coordination/aggregation of local I/O operations prior to movement over the WAN, (2) efficient bulk data movement over the WAN, and (3) latency hiding techniques for latency-intensive operations. Since 2009, we have been applying Hadoop/MapReduce to some HEC applications with LANL scientists John Bent and Salman Habib. Another ongoing effort is to improve checkpoint performance at the I/O forwarding layer for the Roadrunner supercomputer with James Nunez and Gary Grider at LANL. Two senior undergraduates from our research group did summer internships on high-performance file and storage system projects at LANL for three consecutive years starting in 2008. Both are now pursuing Ph.D. degrees in our group, will be in the fourth year of the Ph.D. program in Fall 2011, and will go to LANL to advance the two above-mentioned efforts during this winter break. Since 2009, we have been collaborating with several computer scientists (Gary Grider, John Bent, Parks Fields, James Nunez, Hsing
The Centre of High-Performance Scientific Computing, Geoverbund, ABC/J - Geosciences enabled by HPSC

Science.gov (United States)

Kollet, Stefan; Görgen, Klaus; Vereecken, Harry; Gasper, Fabian; Hendricks-Franssen, Harrie-Jan; Keune, Jessica; Kulkarni, Ketan; Kurtz, Wolfgang; Sharples, Wendy; Shrestha, Prabhakar; Simmer, Clemens; Sulis, Mauro; Vanderborght, Jan

2016-04-01

The Centre of High-Performance Scientific Computing (HPSC TerrSys) was founded in 2011 to establish a centre of competence in high-performance scientific computing in terrestrial systems and the geosciences enabling fundamental and applied geoscientific research in the Geoverbund ABC/J (geoscientific research alliance of the Universities of Aachen, Cologne, Bonn and the Research Centre Jülich, Germany). The specific goals of HPSC TerrSys are to achieve relevance at the national and international level in (i) the development and application of HPSC technologies in the geoscientific community; (ii) student education; (iii) HPSC services and support also to the wider geoscientific community; and in (iv) the industry and public sectors via e.g., useful applications and data products. A key feature of HPSC TerrSys is the Simulation Laboratory Terrestrial Systems, which is located at the Jülich Supercomputing Centre (JSC) and provides extensive capabilities with respect to porting, profiling, tuning and performance monitoring of geoscientific software in JSC's supercomputing environment. We will present a summary of success stories of HPSC applications including integrated terrestrial model development, parallel profiling and its application from watersheds to the continent; massively parallel data assimilation using physics-based models and ensemble methods; quasi-operational terrestrial water and energy monitoring; and convection permitting climate simulations over Europe. The success stories stress the need for a formalized education of students in the application of HPSC technologies in the future.
Resilient computer system design

CERN Document Server

Castano, Victor

2015-01-01

This book presents a paradigm for designing new generation resilient and evolving computer systems, including their key concepts, elements of supportive theory, methods of analysis and synthesis of ICT with new properties of evolving functioning, as well as implementation schemes and their prototyping. The book explains why new ICT applications require a complete redesign of computer systems to address challenges of extreme reliability, high performance, and power efficiency. The authors present a comprehensive treatment for designing the next generation of computers, especially addressing safety-critical, autonomous, real time, military, banking, and wearable health care systems. § Describes design solutions for new computer system - evolving reconfigurable architecture (ERA) that is free from drawbacks inherent in current ICT and related engineering models § Pursues simplicity, reliability, scalability principles of design implemented through redundancy and re-configurability; targeted for energy-,...
Overview of Parallel Platforms for Common High Performance Computing

Directory of Open Access Journals (Sweden)

T. Fryza

2012-04-01

The paper deals with various parallel platforms used for high performance computing in the signal processing domain. More precisely, methods that exploit multicore central processing units, such as the message passing interface and OpenMP, are taken into account. The properties of the programming methods are experimentally verified in the application of a fast Fourier transform and a discrete cosine transform, and they are compared with the possibilities of MATLAB's built-in functions and Texas Instruments digital signal processors with very long instruction word architectures. New FFT and DCT implementations were proposed and tested. The implementation phase was compared with CPU based computing methods and with the possibilities of the Texas Instruments digital signal processing library on C6747 floating-point DSPs. The optimal combination of computing methods in the signal processing domain and the implementation of new, fast routines is proposed as well.
Features of the Synthesis of Performance Security Information in Computer Systems

Directory of Open Access Journals (Sweden)

V. K. Dzhogan

2011-12-01

Synthesis of a scorecard is a gradual process of composition: it starts from the set of elements that reflect the initial, systematized state of the system and, through a series of intermediate elements that link them into a single structure, ends with one element that reflects the purpose of the system. The hierarchical structure of the system of information security performance indicators in computer systems is a structure with a “one to many” relationship. The article reflects the extent to which the capabilities of information security tools influence the security of information resources of computer systems (from indirect, Class 1, to direct, Class 4).
Fine tuning of work practices of common radiological investigations performed using computed radiography system

International Nuclear Information System (INIS)

Livingstone, Roshan S.; Timothy Peace, B.S.; Sunny, S.; Victor Raj, D.

2007-01-01

Introduction: The advent of computed radiography (CR) has brought about remarkable changes in the field of diagnostic radiology. A relatively large cross-section of the human population is exposed to ionizing radiation on account of common radiological investigations. This study is intended to audit radiation doses imparted to patients during common radiological investigations involving the use of CR systems. Method: The entrance surface doses (ESD) were measured using thermoluminescent dosimeters (TLD) for various radiological investigations performed using CR systems. Optimization of radiographic techniques and radiation doses was done by fine tuning the work practices. Results and conclusion: Reduction of radiation doses as high as 47% was achieved during certain investigations with the use of optimized exposure factors and fine-tuned work practices.
HIGH PERFORMANCE PHOTOGRAMMETRIC PROCESSING ON COMPUTER CLUSTERS

Directory of Open Access Journals (Sweden)

V. N. Adrov

2012-07-01

Most CPU-intensive tasks in photogrammetric processing can be done in parallel. The algorithms take independent pieces of data as input and produce independent pieces as output. This independence comes from the nature of such algorithms, since images, stereopairs or small parts of image blocks can be processed independently. Many photogrammetric algorithms are fully automatic and do not require human intervention. Photogrammetric workstations can perform tie point measurements, DTM calculations, orthophoto construction, mosaicing and many other service operations in parallel using distributed calculations. Distributed calculations save time, reducing calculations that take several days to several hours. Modern trends in computer technology show an increase in the number of CPU cores in workstations, a speed increase in local networks, and, as a result, a drop in the price of supercomputers or computer clusters that can contain hundreds or even thousands of computing nodes. Common distributed processing in DPW is usually targeted at interactive work with a limited number of CPU cores and is not optimized for centralized administration. The bottleneck of common distributed computing in photogrammetry can be limited LAN throughput and storage performance, since huge amounts of large raster images must be processed.
The Convergence of High Performance Computing and Large Scale Data Analytics

Science.gov (United States)

Duffy, D.; Bowen, M. K.; Thompson, J. H.; Yang, C. P.; Hu, F.; Wills, B.

2015-12-01

As the combinations of remote sensing observations and model outputs have grown, scientists are increasingly burdened with both the necessity and complexity of large-scale data analysis. Scientists are increasingly applying traditional high performance computing (HPC) solutions to solve their "Big Data" problems. While this approach has the benefit of limiting data movement, the HPC system is not optimized to run analytics, which can create problems that permeate throughout the HPC environment. To solve these issues and to alleviate some of the strain on the HPC environment, the NASA Center for Climate Simulation (NCCS) has created the Advanced Data Analytics Platform (ADAPT), which combines both HPC and cloud technologies to create an agile system designed for analytics. Large, commonly used data sets, such as Landsat, MODIS, MERRA, and NGA, are stored in this system in a write once/read many file system. High performance virtual machines are deployed and scaled according to the individual scientist's requirements specifically for data analysis. On the software side, the NCCS and GMU are working with emerging commercial technologies and applying them to structured, binary scientific data in order to expose the data in new ways. Native NetCDF data is being stored within a Hadoop Distributed File System (HDFS), enabling storage-proximal processing through MapReduce while continuing to provide accessibility of the data to traditional applications. Once the data is stored within HDFS, an additional indexing scheme is built on top of the data and placed into a relational database. This spatiotemporal index enables extremely fast mappings of queries to data locations to dramatically speed up analytics. These are some of the first steps toward a single unified platform that optimizes for both HPC and large-scale data analysis, and this presentation will elucidate the resulting and necessary exascale architectures required for future systems.
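The idea of a spatiotemporal index held in a relational database can be sketched as follows. This is a minimal illustration using an in-memory SQLite table, not the NCCS implementation: the table name `chunk_index`, its columns, the integer time units and the example paths are all assumptions made for the sketch.

```python
import sqlite3


def build_index(chunks):
    """Create an in-memory index of data chunks.

    Each chunk is a tuple:
    (path, byte_offset, t_start, t_end, lat_min, lat_max, lon_min, lon_max).
    """
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE chunk_index ("
        " path TEXT, byte_offset INTEGER,"
        " t_start INTEGER, t_end INTEGER,"
        " lat_min REAL, lat_max REAL, lon_min REAL, lon_max REAL)"
    )
    db.executemany("INSERT INTO chunk_index VALUES (?,?,?,?,?,?,?,?)", chunks)
    db.execute("CREATE INDEX idx_time ON chunk_index (t_start, t_end)")
    db.commit()
    return db


def locate(db, t0, t1, lat0, lat1, lon0, lon1):
    """Return (path, byte_offset) for every chunk overlapping the query window."""
    rows = db.execute(
        "SELECT path, byte_offset FROM chunk_index"
        " WHERE t_end >= ? AND t_start <= ?"
        " AND lat_max >= ? AND lat_min <= ?"
        " AND lon_max >= ? AND lon_min <= ?"
        " ORDER BY path, byte_offset",
        (t0, t1, lat0, lat1, lon0, lon1),
    )
    return rows.fetchall()


# Hypothetical chunk records; paths and extents are invented for the example.
EXAMPLE_CHUNKS = [
    ("/example/year1.nc", 0, 0, 30, -90.0, 0.0, -180.0, 0.0),
    ("/example/year1.nc", 4096, 0, 30, 0.0, 90.0, 0.0, 180.0),
    ("/example/year2.nc", 0, 366, 396, 0.0, 90.0, 0.0, 180.0),
]
```

A query such as `locate(build_index(EXAMPLE_CHUNKS), 10, 20, 10.0, 20.0, 10.0, 20.0)` returns only the chunks whose time range and bounding box overlap the request, so an analytics job can read just those byte ranges instead of scanning every file.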
Multicore Challenges and Benefits for High Performance Scientific Computing

Directory of Open Access Journals (Sweden)

Ida M.B. Nielsen

2008-01-01

Until recently, performance gains in processors were achieved largely by improvements in clock speeds and instruction level parallelism. Thus, applications could obtain performance increases with relatively minor changes by upgrading to the latest generation of computing hardware. Currently, however, processor performance improvements are realized by using multicore technology and hardware support for multiple threads within each core, and taking full advantage of this technology to improve the performance of applications requires exposure of extreme levels of software parallelism. We will here discuss the architecture of parallel computers constructed from many multicore chips as well as techniques for managing the complexity of programming such computers, including the hybrid message-passing/multi-threading programming model. We will illustrate these ideas with a hybrid distributed memory matrix multiply and a quantum chemistry algorithm for energy computation using Møller–Plesset perturbation theory.
High performance MRI simulations of motion on multi-GPU systems.

Science.gov (United States)

Xanthis, Christos G; Venetis, Ioannis E; Aletras, Anthony H

2014-07-04

MRI physics simulators have been developed in the past for optimizing imaging protocols and for training purposes. However, these simulators have only addressed motion within a limited scope. The purpose of this study was the incorporation of realistic motion, such as cardiac motion, respiratory motion and flow, within MRI simulations in a high performance multi-GPU environment. Three different motion models were introduced in the Magnetic Resonance Imaging SIMULator (MRISIMUL) of this study: cardiac motion, respiratory motion and flow. Simulation of a simple Gradient Echo pulse sequence and a CINE pulse sequence on the corresponding anatomical model was performed. Myocardial tagging was also investigated. In pulse sequence design, software crushers were introduced to accommodate the long execution times in order to avoid spurious echoes formation. The displacement of the anatomical model isochromats was calculated within the Graphics Processing Unit (GPU) kernel for every timestep of the pulse sequence. Experiments that would allow simulation of custom anatomical and motion models were also performed. Last, simulations of motion with MRISIMUL on single-node and multi-node multi-GPU systems were examined. Gradient Echo and CINE images of the three motion models were produced and motion-related artifacts were demonstrated. The temporal evolution of the contractility of the heart was presented through the application of myocardial tagging. Better simulation performance and image quality were presented through the introduction of software crushers without the need to further increase the computational load and GPU resources. Last, MRISIMUL demonstrated an almost linear scalable performance with the increasing number of available GPU cards, in both single-node and multi-node multi-GPU computer systems. MRISIMUL is the first MR physics simulator to have implemented motion with a 3D large computational load on a single computer multi-GPU configuration. The incorporation
Definition, modeling and simulation of a grid computing system for high throughput computing

CERN Document Server

Caron, E; Tsaregorodtsev, A Yu

2006-01-01

In this paper, we study and compare grid and global computing systems and outline the benefits of having a hybrid system called DIRAC. To evaluate the DIRAC scheduling for high throughput computing, a new model is presented and a simulator was developed for many clusters of heterogeneous nodes belonging to a local network. These clusters are assumed to be connected to each other through a global network and each cluster is managed via a local scheduler which is shared by many users. We validate our simulator by comparing the experimental and analytical results of an M/M/4 queuing system. Next, we do the comparison with a real batch system and we obtain an average error of 10.5% for the response time and 12% for the makespan. We conclude that the simulator is realistic and well describes the behaviour of a large-scale system. Thus we can study the scheduling of our system called DIRAC in a high throughput context. We justify our decentralized, adaptive and opportunistic approach in comparison to a centralize...
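The analytical side of such a validation can be sketched with the textbook Erlang C result for an M/M/c queue. The function below computes the mean response time (waiting plus service) that a simulator's measured value could be compared against. It is the standard formula, not code from the paper, and the function and parameter names are chosen here for illustration.

```python
import math


def mmc_mean_response_time(arrival_rate, service_rate, servers=4):
    """Mean time in system for an M/M/c queue (Erlang C), default c = 4.

    arrival_rate: mean job arrivals per unit time (lambda)
    service_rate: mean completions per unit time for one server (mu)
    """
    offered_load = arrival_rate / service_rate      # a = lambda / mu
    utilisation = offered_load / servers            # rho = a / c
    if utilisation >= 1.0:
        raise ValueError("unstable queue: arrival rate exceeds total capacity")
    # Terms of the Erlang C probability that an arriving job has to wait.
    tail = (offered_load ** servers / math.factorial(servers)) / (1.0 - utilisation)
    head = sum(offered_load ** k / math.factorial(k) for k in range(servers))
    p_wait = tail / (head + tail)
    mean_wait = p_wait / (servers * service_rate - arrival_rate)
    return mean_wait + 1.0 / service_rate
```

For example, with an arrival rate of 2 jobs per hour and a service rate of 1 job per hour per server, four servers give a mean response time of 25/23 hours, only slightly above the bare service time of one hour.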
The computational challenges of Earth-system science.

Science.gov (United States)

O'Neill, Alan; Steenman-Clark, Lois

2002-06-15

The Earth system--comprising atmosphere, ocean, land, cryosphere and biosphere--is an immensely complex system, involving processes and interactions on a wide range of space- and time-scales. To understand and predict the evolution of the Earth system is one of the greatest challenges of modern science, with success likely to bring enormous societal benefits. High-performance computing, along with the wealth of new observational data, is revolutionizing our ability to simulate the Earth system with computer models that link the different components of the system together. There are, however, considerable scientific and technical challenges to be overcome. This paper will consider four of them: complexity, spatial resolution, inherent uncertainty and time-scales. Meeting these challenges requires a significant increase in the power of high-performance computers. The benefits of being able to make reliable predictions about the evolution of the Earth system should, on their own, amply repay this investment.
Validation of the solar heating and cooling high speed performance (HISPER) computer code

Science.gov (United States)

Wallace, D. B.

1980-01-01

Developed to give quick and accurate predictions, HISPER, a simplification of the TRNSYS program, achieves its computational speed by not simulating detailed system operations or performing detailed load computations. In order to validate the HISPER computer code for air systems, the simulation was compared to the actual performance of an operational test site. Solar insolation, ambient temperature, water usage rate, and water main temperatures from the data tapes for an office building in Huntsville, Alabama were used as input. The HISPER program was found to predict the heating loads and solar fraction of the loads with errors of less than ten percent. Good correlation was found on both a seasonal basis and a monthly basis. Several parameters (such as infiltration rate and the outside ambient temperature above which heating is not required) were found to require careful selection for accurate simulation.
High-Performance Networking

CERN Multimedia

CERN. Geneva

2003-01-01

The series will start with a historical introduction to what people saw as high performance message communication in their time and how that developed into what is known today as "standard computer network communication". It will be followed by a far more technical part that uses the High Performance Computer Network standards of the 90's, with 1 Gbit/s systems, as an introduction to an in-depth explanation of the three new 10 Gbit/s network and interconnect technology standards that already exist or are emerging. Where necessary for a good understanding, some sidesteps will be included to explain important protocols as well as some necessary details of the relevant Wide Area Network (WAN) standards, including some basics of wavelength multiplexing (DWDM). Some remarks will be made concerning the rapidly expanding applications of networked storage.
Optimized Architectural Approaches in Hardware and Software Enabling Very High Performance Shared Storage Systems

CERN Multimedia

CERN. Geneva

2004-01-01

There are issues encountered in high performance storage systems that normally lead to compromises in architecture. Compute clusters tend to have compute phases followed by an I/O phase that must move data from the entire cluster in one operation. That data may then be shared by a large number of clients creating unpredictable read and write patterns. In some cases the aggregate performance of a server cluster must exceed 100 GB/s to minimize the time required for the I/O cycle thus maximizing compute availability. Accessing the same content from multiple points in a shared file system leads to the classical problems of data "hot spots" on the disk drive side and access collisions on the data connectivity side. The traditional method for increasing apparent bandwidth usually includes data replication which is costly in both storage and management. Scaling a model that includes replicated data presents additional management challenges as capacity and bandwidth expand asymmetrically while the system is scaled. ...
Hadoop-GIS: A High Performance Spatial Data Warehousing System over MapReduce.

Science.gov (United States)

Aji, Ablimit; Wang, Fusheng; Vo, Hoang; Lee, Rubao; Liu, Qiaoling; Zhang, Xiaodong; Saltz, Joel

2013-08-01

Support of high performance queries on large volumes of spatial data becomes increasingly important in many application domains, including geospatial problems in numerous fields, location based services, and emerging scientific applications that are increasingly data- and compute-intensive. The emergence of massive scale spatial data is due to the proliferation of cost effective and ubiquitous positioning technologies, development of high resolution imaging technologies, and contribution from a large number of community users. There are two major challenges for managing and querying massive spatial data to support spatial queries: the explosion of spatial data, and the high computational complexity of spatial queries. In this paper, we present Hadoop-GIS - a scalable and high performance spatial data warehousing system for running large scale spatial queries on Hadoop. Hadoop-GIS supports multiple types of spatial queries on MapReduce through spatial partitioning, customizable spatial query engine RESQUE, implicit parallel spatial query execution on MapReduce, and effective methods for amending query results through handling boundary objects. Hadoop-GIS utilizes global partition indexing and customizable on demand local spatial indexing to achieve efficient query processing. Hadoop-GIS is integrated into Hive to support declarative spatial queries with an integrated architecture. Our experiments have demonstrated the high efficiency of Hadoop-GIS on query response and high scalability to run on commodity clusters. Our comparative experiments have shown that performance of Hadoop-GIS is on par with parallel SDBMS and outperforms SDBMS for compute-intensive queries. Hadoop-GIS is available as a set of libraries for processing spatial queries, and as an integrated software package in Hive.
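Spatial partitioning with boundary-object handling can be illustrated with a small single-process sketch. It assumes a fixed square grid and axis-aligned bounding boxes, replicates each object into every tile it touches, joins tile by tile, and then removes the duplicate pairs that replication creates. This is not the Hadoop-GIS or RESQUE code; all names are invented for the example, and a real deployment would run the per-tile step as parallel MapReduce tasks.

```python
from collections import defaultdict
from itertools import product


def tiles_for(box, tile_size):
    """Return the (col, row) of every grid tile touched by box = (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    cols = range(int(xmin // tile_size), int(xmax // tile_size) + 1)
    rows = range(int(ymin // tile_size), int(ymax // tile_size) + 1)
    return product(cols, rows)


def intersects(a, b):
    """True when two axis-aligned boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def partitioned_join(left, right, tile_size):
    """Spatial join of two {id: box} dictionaries using grid partitioning."""
    # Partition step: objects crossing a tile boundary are replicated to each tile.
    buckets = defaultdict(lambda: ([], []))
    for side, objects in enumerate((left, right)):
        for obj_id, box in objects.items():
            for tile in tiles_for(box, tile_size):
                buckets[tile][side].append(obj_id)
    # Per-tile join; each tile is independent, so this loop could run in parallel.
    pairs = set()
    for left_ids, right_ids in buckets.values():
        for l_id in left_ids:
            for r_id in right_ids:
                if intersects(left[l_id], right[r_id]):
                    # The set removes duplicates produced by replicated boundary objects.
                    pairs.add((l_id, r_id))
    return sorted(pairs)
```

In this sketch a pair whose members both straddle the same tile boundary is found once per shared tile, which is why the final result is collected in a set before being returned.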
DOE High Performance Computing Operational Review (HPCOR): Enabling Data-Driven Scientific Discovery at HPC Facilities

Energy Technology Data Exchange (ETDEWEB)

Gerber, Richard; Allcock, William; Beggio, Chris; Campbell, Stuart; Cherry, Andrew; Cholia, Shreyas; Dart, Eli; England, Clay; Fahey, Tim; Foertter, Fernanda; Goldstone, Robin; Hick, Jason; Karelitz, David; Kelly, Kaki; Monroe, Laura; Prabhat,; Skinner, David; White, Julia

2014-10-17

U.S. Department of Energy (DOE) High Performance Computing (HPC) facilities are on the verge of a paradigm shift in the way they deliver systems and services to science and engineering teams. Research projects are producing a wide variety of data at unprecedented scale and level of complexity, with community-specific services that are part of the data collection and analysis workflow. On June 18-19, 2014 representatives from six DOE HPC centers met in Oakland, CA at the DOE High Performance Operational Review (HPCOR) to discuss how they can best provide facilities and services to enable large-scale data-driven scientific discovery at the DOE national laboratories. The report contains findings from that review.
Role of information systems in controlling costs: the electronic medical record (EMR) and the high-performance computing and communications (HPCC) efforts

Science.gov (United States)

Kun, Luis G.

1994-12-01

On October 18, 1991, the IEEE-USA produced an entity statement which endorsed the vital importance of the High Performance Computer and Communications Act of 1991 (HPCC) and called for the rapid implementation of all its elements. Efforts are now underway to develop a Computer Based Patient Record (CBPR), the National Information Infrastructure (NII) as part of the HPCC, and the so-called `Patient Card'. Multiple legislative initiatives which address these and related information technology issues are pending in Congress. Clearly, a national information system will greatly affect the way health care delivery is provided to the United States public. Timely and reliable information represents a critical element in any initiative to reform the health care system as well as to protect and improve the health of every person. Appropriately used, information technologies offer a vital means of improving the quality of patient care, increasing access to universal care and lowering overall costs within a national health care program. Health care reform legislation should reflect increased budgetary support and a legal mandate for the creation of a national health care information system by: (1) constructing a National Information Infrastructure; (2) building a Computer Based Patient Record System; (3) bringing the collective resources of our National Laboratories to bear in developing and implementing the NII and CBPR, as well as a security system with which to safeguard the privacy rights of patients and the physician-patient privilege; and (4) utilizing Government (e.g. DOD, DOE) capabilities (technology and human resources) to maximize resource utilization, create new jobs and accelerate technology transfer to address health care issues.
The tracking performance of distributed recoverable flight control systems subject to high intensity radiated fields

Science.gov (United States)

Wang, Rui

It is known that high intensity radiated fields (HIRF) can produce upsets in digital electronics, and thereby degrade the performance of digital flight control systems. Such upsets, either from natural or man-made sources, can change data values on digital buses and memory and affect CPU instruction execution. HIRF environments are also known to trigger common-mode faults, affecting nearly-simultaneously multiple fault containment regions, and hence reducing the benefits of n-modular redundancy and other fault-tolerant computing techniques. Thus, it is important to develop models which describe the integration of the embedded digital system, where the control law is implemented, as well as the dynamics of the closed-loop system. In this dissertation, theoretical tools are presented to analyze the relationship between the design choices for a class of distributed recoverable computing platforms and the tracking performance degradation of a digital flight control system implemented on such a platform while operating in a HIRF environment. Specifically, a tractable hybrid performance model is developed for a digital flight control system implemented on a computing platform inspired largely by the NASA family of fault-tolerant, reconfigurable computer architectures known as SPIDER (scalable processor-independent design for enhanced reliability). The focus will be on the SPIDER implementation, which uses the computer communication system known as ROBUS-2 (reliable optical bus). A physical HIRF experiment was conducted at the NASA Langley Research Center in order to validate the theoretical tracking performance degradation predictions for a distributed Boeing 747 flight control system subject to a HIRF environment. An extrapolation of these results for scenarios that could not be physically tested is also presented.

Export Controls: Implementation of the 1998 Legislative Mandate for High Performance Computers

National Research Council Canada - National Science Library

1999-01-01

We found that most of the 938 proposed exports of high performance computers to civilian end users in countries of concern from February 3, 1998, when procedures implementing the 1998 authorization...
Towards High Performance Processing In Modern Java Based Control Systems

CERN Document Server

Misiowiec, M; Buttner, M

2011-01-01

CERN controls software is often developed on a Java foundation. Some systems carry out a combination of data, network and processor intensive tasks within strict time limits. Hence, there is a demand for high performing, quasi real time solutions. Extensive prototyping of the new CERN monitoring and alarm software required us to address such expectations. The system must handle tens of thousands of data samples every second, along its three tiers, applying complex computations throughout. To accomplish the goal, a deep understanding of multithreading, memory management and interprocess communication was required. There are unexpected traps hidden behind an excessive use of 64 bit memory or the severe impact of modern garbage collectors on the processing flow. Tuning JVM configuration significantly affects the execution of the code. Even more important is the number of threads and the data structures used between them. Accurately dividing work into independent tasks might boost system performance. Thorough profili...
Performance Management of High Performance Computing for Medical Image Processing in Amazon Web Services.

Science.gov (United States)

Bao, Shunxing; Damon, Stephen M; Landman, Bennett A; Gokhale, Aniruddha

2016-02-27

Adopting high performance cloud computing for medical image processing is a popular trend given the pressing needs of large studies. Amazon Web Services (AWS) provide reliable, on-demand, and inexpensive cloud computing services. Our research objective is to implement an affordable, scalable and easy-to-use AWS framework for the Java Image Science Toolkit (JIST). JIST is a plugin for Medical-Image Processing, Analysis, and Visualization (MIPAV) that provides a graphical pipeline implementation allowing users to quickly test and develop pipelines. JIST is DRMAA-compliant allowing it to run on portable batch system grids. However, as new processing methods are implemented and developed, memory may often be a bottleneck for not only lab computers, but also possibly some local grids. Integrating JIST with the AWS cloud alleviates these possible restrictions and does not require users to have deep knowledge of programming in Java. Workflow definition/management and cloud configurations are two key challenges in this research. Using a simple unified control panel, users have the ability to set the numbers of nodes and select from a variety of pre-configured AWS EC2 nodes with different numbers of processors and memory storage. Intuitively, we configured Amazon S3 storage to be mounted by pay-for-use Amazon EC2 instances. Hence, S3 storage is recognized as a shared cloud resource. The Amazon EC2 instances provide pre-installs of all necessary packages to run JIST. This work presents an implementation that facilitates the integration of JIST with AWS. We describe the theoretical cost/benefit formulae to decide between local serial execution versus cloud computing and apply this analysis to an empirical diffusion tensor imaging pipeline.
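The abstract does not state its cost/benefit formulae, so the following is only a hypothetical sketch of how such a comparison could be set up. It assumes identical, independent jobs, cloud nodes that each run one job at a time, a fixed data transfer overhead, and linear per-node-hour pricing; every name and parameter here is an assumption and none of it is taken from the paper.

```python
import math


def local_serial_hours(n_jobs, hours_per_job):
    """Wall-clock hours to run all jobs one after another on a single machine."""
    return n_jobs * hours_per_job


def cloud_estimate(n_jobs, hours_per_job, n_nodes, price_per_node_hour,
                   transfer_hours=0.0):
    """Return (wall_hours, cost) for running the jobs on n_nodes cloud nodes.

    Jobs run in waves of n_nodes at a time; transfer_hours is a fixed overhead
    for moving data to and from cloud storage (an assumed simplification).
    """
    waves = math.ceil(n_jobs / n_nodes)
    wall_hours = waves * hours_per_job + transfer_hours
    cost = n_nodes * wall_hours * price_per_node_hour
    return wall_hours, cost


def hours_saved_per_unit_cost(n_jobs, hours_per_job, n_nodes,
                              price_per_node_hour, transfer_hours=0.0):
    """Hours saved relative to local serial execution, per unit of cloud spend."""
    wall_hours, cost = cloud_estimate(
        n_jobs, hours_per_job, n_nodes, price_per_node_hour, transfer_hours
    )
    return (local_serial_hours(n_jobs, hours_per_job) - wall_hours) / cost
```

Under these assumptions, 100 half-hour jobs take 50 hours serially, while 10 nodes with one hour of transfer overhead finish in 6 hours; whether that is worthwhile depends on the node price and on how the time saved is valued.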
Performance management of high performance computing for medical image processing in Amazon Web Services

Science.gov (United States)

Bao, Shunxing; Damon, Stephen M.; Landman, Bennett A.; Gokhale, Aniruddha

2016-03-01

Adopting high performance cloud computing for medical image processing is a popular trend given the pressing needs of large studies. Amazon Web Services (AWS) provide reliable, on-demand, and inexpensive cloud computing services. Our research objective is to implement an affordable, scalable and easy-to-use AWS framework for the Java Image Science Toolkit (JIST). JIST is a plugin for Medical-Image Processing, Analysis, and Visualization (MIPAV) that provides a graphical pipeline implementation allowing users to quickly test and develop pipelines. JIST is DRMAA-compliant allowing it to run on portable batch system grids. However, as new processing methods are implemented and developed, memory may often be a bottleneck for not only lab computers, but also possibly some local grids. Integrating JIST with the AWS cloud alleviates these possible restrictions and does not require users to have deep knowledge of programming in Java. Workflow definition/management and cloud configurations are two key challenges in this research. Using a simple unified control panel, users have the ability to set the numbers of nodes and select from a variety of pre-configured AWS EC2 nodes with different numbers of processors and memory storage. Intuitively, we configured Amazon S3 storage to be mounted by pay-for-use Amazon EC2 instances. Hence, S3 storage is recognized as a shared cloud resource. The Amazon EC2 instances provide pre-installs of all necessary packages to run JIST. This work presents an implementation that facilitates the integration of JIST with AWS. We describe the theoretical cost/benefit formulae to decide between local serial execution versus cloud computing and apply this analysis to an empirical diffusion tensor imaging pipeline.
Computer science of the high performance; Informatica del alto rendimiento

Energy Technology Data Exchange (ETDEWEB)

Moraleda, A.

2008-07-01

High performance computing is taking shape as a powerful accelerator of innovation, drastically reducing the waiting time for results and findings in a growing number of processes and activities as complex and important as medicine, genetics, pharmacology, the environment, natural resource management and the simulation of complex processes in a wide variety of industries. (Author)
Biocellion: accelerating computer simulation of multicellular biological system models.

Science.gov (United States)

Kang, Seunghwa; Kahan, Simon; McDermott, Jason; Flann, Nicholas; Shmulevich, Ilya

2014-11-01

Biological system behaviors are often the outcome of complex interactions among a large number of cells and their biotic and abiotic environment. Computational biologists attempt to understand, predict and manipulate biological system behavior through mathematical modeling and computer simulation. Discrete agent-based modeling (in combination with high-resolution grids to model the extracellular environment) is a popular approach for building biological system models. However, the computational complexity of this approach forces computational biologists to resort to coarser resolution approaches to simulate large biological systems. High-performance parallel computers have the potential to address the computing challenge, but writing efficient software for parallel computers is difficult and time-consuming. We have developed Biocellion, a high-performance software framework, to solve this computing challenge using parallel computers. To support a wide range of multicellular biological system models, Biocellion asks users to provide their model specifics by filling the function body of pre-defined model routines. Using Biocellion, modelers without parallel computing expertise can efficiently exploit parallel computers with less effort than writing sequential programs from scratch. We simulate cell sorting, microbial patterning and a bacterial system in soil aggregate as case studies. Biocellion runs on x86 compatible systems with the 64 bit Linux operating system and is freely available for academic use. Visit http://biocellion.com for additional information. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Development of High-speed Visualization System of Hypocenter Data Using CUDA-based GPU computing

Science.gov (United States)

Kumagai, T.; Okubo, K.; Uchida, N.; Matsuzawa, T.; Kawada, N.; Takeuchi, N.

2014-12-01

Since the Great East Japan Earthquake of March 11, 2011, intelligent visualization of seismic information has become increasingly important for understanding earthquake phenomena. At the same time, the volume of seismic data has become enormous as high-accuracy observation networks have expanded, and many parameters (e.g., positional information, origin time, magnitude, etc.) must be handled to display seismic information efficiently. High-speed processing of data and image information is therefore necessary to handle these enormous amounts of seismic data. Recently, the GPU (Graphics Processing Unit) has come into use as an acceleration tool for data processing and calculation in various fields of study, an approach called GPGPU (General-Purpose computing on GPUs). Over the last few years GPU performance has improved rapidly, and GPU computing provides a high-performance computing environment at lower cost than before. Moreover, GPUs offer an advantage for visualizing processed data, because the GPU was originally an architecture for graphics processing. In GPU computing, the processed data is always stored in video memory, so drawing information can be written directly to the VRAM on the video card by combining CUDA with a graphics API. In this study, we employ CUDA together with OpenGL and/or DirectX to realize a full-GPU implementation. This method makes it possible to write drawing information to the VRAM on the video card without transferring data over the PCIe bus, which enables high-speed processing of seismic data. The present study examines GPU-computing-based high-speed visualization and the feasibility of a high-speed visualization system for hypocenter data.
Distributed simulation of large computer systems

International Nuclear Information System (INIS)

Marzolla, M.

2001-01-01

Sequential simulation of large complex physical systems is often regarded as a computationally expensive task. To speed up complex discrete-event simulations, the paradigm of Parallel and Distributed Discrete Event Simulation (PDES) was introduced in the late 1970s. The authors analyze the applicability of PDES to the modeling and analysis of large computer systems; such systems are increasingly common in the area of High Energy and Nuclear Physics, because many modern experiments make use of large 'compute farms'. Some feasibility tests have been performed on a prototype distributed simulator.
The NCI High Performance Computing (HPC) and High Performance Data (HPD) Platform to Support the Analysis of Petascale Environmental Data Collections

Science.gov (United States)

Evans, B. J. K.; Pugh, T.; Wyborn, L. A.; Porter, D.; Allen, C.; Smillie, J.; Antony, J.; Trenham, C.; Evans, B. J.; Beckett, D.; Erwin, T.; King, E.; Hodge, J.; Woodcock, R.; Fraser, R.; Lescinsky, D. T.

2014-12-01

The National Computational Infrastructure (NCI) has co-located a priority set of national data assets within a HPC research platform. This powerful in-situ computational platform has been created to help serve and analyse the massive amounts of data across the spectrum of environmental collections - in particular the climate, observational data and geoscientific domains. This paper examines the infrastructure, innovation and opportunity for this significant research platform. NCI currently manages nationally significant data collections (10+ PB) categorised as 1) earth system sciences, climate and weather model data assets and products, 2) earth and marine observations and products, 3) geosciences, 4) terrestrial ecosystem, 5) water management and hydrology, and 6) astronomy, social science and biosciences. The data is largely sourced from the NCI partners (who include the custodians of many of the national scientific records), major research communities, and collaborating overseas organisations. By co-locating these large valuable data assets, new opportunities have arisen by harmonising the data collections, making a powerful transdisciplinary research platform. The data is accessible within an integrated HPC-HPD environment - a 1.2 PFlop supercomputer (Raijin), a HPC class 3000 core OpenStack cloud system and several highly connected large scale and high-bandwidth Lustre filesystems. New scientific software, cloud-scale techniques, server-side visualisation and data services have been harnessed and integrated into the platform, so that analysis is performed seamlessly across the traditional boundaries of the underlying data domains. Characterisation of the techniques along with performance profiling ensures scalability of each software component, all of which can either be enhanced or replaced through future improvements. A Development-to-Operations (DevOps) framework has also been implemented to manage the scale of the software complexity alone. This ensures that
Usage of super high speed computer for clarification of complex phenomena

International Nuclear Information System (INIS)

Sekiguchi, Tomotsugu; Sato, Mitsuhisa; Nakata, Hideki; Tatebe, Osami; Takagi, Hiromitsu

1999-01-01

This study aims to build an efficient application environment for super-high-speed computer systems that suits parallel distributed systems and ports easily to different computer systems and system sizes. To that end, it conducts research and development on the super-high-speed computing technology required to elucidate complex phenomena in the nuclear power field by computational science methods. To realize such an environment, the Electrotechnical Laboratory has developed Ninf, a network numerical information library. The Ninf system can supply a global network infrastructure for high-performance worldwide computing over wide-area distributed networks. (G.K.)
Systems, methods and computer-readable media for modeling cell performance fade of rechargeable electrochemical devices

Science.gov (United States)

Gering, Kevin L

2013-08-27

A system includes an electrochemical cell, monitoring hardware, and a computing system. The monitoring hardware periodically samples performance characteristics of the electrochemical cell. The computing system determines cell information from the performance characteristics of the electrochemical cell. The computing system also develops a mechanistic level model of the electrochemical cell to determine performance fade characteristics of the electrochemical cell and analyzes the mechanistic level model to estimate performance fade characteristics over aging of a similar electrochemical cell. The mechanistic level model uses first constant-current pulses applied to the electrochemical cell at a first aging period and at three or more current values bracketing a first exchange current density. The mechanistic level model also is based on second constant-current pulses applied to the electrochemical cell at a second aging period and at three or more current values bracketing a second exchange current density.
Business Models of High Performance Computing Centres in Higher Education in Europe

Science.gov (United States)

Eurich, Markus; Calleja, Paul; Boutellier, Roman

2013-01-01

High performance computing (HPC) service centres are a vital part of the academic infrastructure of higher education organisations. However, despite their importance for research and the necessary high capital expenditures, business research on HPC service centres is mostly missing. From a business perspective, it is important to find an answer to…
High Performance Commercial Fenestration Framing Systems

Energy Technology Data Exchange (ETDEWEB)

Mike Manteghi; Sneh Kumar; Joshua Early; Bhaskar Adusumalli

2010-01-31

A major objective of the U.S. Department of Energy is to have a zero energy commercial building by the year 2025. Windows have a major influence on the energy performance of the building envelope as they control over 55% of building energy load, and represent one important area where technologies can be developed to save energy. Aluminum framing systems are used in over 80% of commercial fenestration products (i.e. windows, curtain walls, store fronts, etc.). Aluminum framing systems are often required in commercial buildings because of their inherent good structural properties and long service life, which is required from commercial and architectural frames. At the same time, they are lightweight and durable, requiring very little maintenance, and offer design flexibility. An additional benefit of aluminum framing systems is their relatively low cost and easy manufacturability. Aluminum, being an easily recyclable material, also offers sustainable features. However, from an energy efficiency point of view, aluminum frames have lower thermal performance due to the very high thermal conductivity of aluminum. Fenestration systems constructed of aluminum alloys therefore have lower performance in terms of being an effective barrier to energy transfer (heat loss or gain). Despite the lower energy performance, aluminum is the material of choice for commercial framing systems and dominates the commercial/architectural fenestration market because of the reasons mentioned above. In addition, there is no other cost effective and energy efficient replacement material available to take the place of aluminum in the commercial/architectural market. Hence it is imperative to improve the performance of aluminum framing systems to improve the energy performance of commercial fenestration systems and in turn reduce the energy consumption of commercial buildings and achieve zero energy buildings by 2025. The objective of this project was to develop high performance, energy efficient commercial
The Benefits and Complexities of Operating Geographic Information Systems (GIS) in a High Performance Computing (HPC) Environment

Science.gov (United States)

Shute, J.; Carriere, L.; Duffy, D.; Hoy, E.; Peters, J.; Shen, Y.; Kirschbaum, D.

2017-12-01

The NASA Center for Climate Simulation (NCCS) at the Goddard Space Flight Center is building and maintaining an Enterprise GIS capability for its stakeholders, to include NASA scientists, industry partners, and the public. This platform is powered by three GIS subsystems operating in a highly-available, virtualized environment: 1) the Spatial Analytics Platform is the primary NCCS GIS and provides users discoverability of the vast DigitalGlobe/NGA raster assets within the NCCS environment; 2) the Disaster Mapping Platform provides mapping and analytics services to NASA's Disaster Response Group; and 3) the internal (Advanced Data Analytics Platform/ADAPT) enterprise GIS provides users with the full suite of Esri and open source GIS software applications and services. All systems benefit from NCCS's cutting edge infrastructure, to include an InfiniBand network for high speed data transfers; a mixed/heterogeneous environment featuring seamless sharing of information between Linux and Windows subsystems; and in-depth system monitoring and warning systems. Due to its co-location with the NCCS Discover High Performance Computing (HPC) environment and the Advanced Data Analytics Platform (ADAPT), the GIS platform has direct access to several large NCCS datasets including DigitalGlobe/NGA, Landsat, MERRA, and MERRA2. Additionally, the NCCS ArcGIS Desktop Windows virtual machines utilize existing NetCDF and OPeNDAP assets for visualization, modelling, and analysis - thus eliminating the need for data duplication. With the advent of this platform, Earth scientists have full access to vast data repositories and the industry-leading tools required for successful management and analysis of these multi-petabyte, global datasets. The full system architecture and integration with scientific datasets will be presented. Additionally, key applications and scientific analyses will be explained, to include the NASA Global Landslide Catalog (GLC) Reporter crowdsourcing application, the
Construction of Blaze at the University of Illinois at Chicago: A Shared, High-Performance, Visual Computer for Next-Generation Cyberinfrastructure-Accelerated Scientific, Engineering, Medical and Public Policy Research

Energy Technology Data Exchange (ETDEWEB)

Brown, Maxine D. [Acting Director, EVL; Leigh, Jason [PI

2014-02-17

The Blaze high-performance visual computing system serves the high-performance computing research and education needs of the University of Illinois at Chicago (UIC). Blaze consists of a state-of-the-art networked computer cluster and an ultra-high-resolution visualization system called CAVE2(TM) that is currently not available anywhere in Illinois. This system is connected via a high-speed 100-Gigabit network to the State of Illinois' I-WIRE optical network, as well as to national and international high speed networks, such as Internet2 and the Global Lambda Integrated Facility. This enables Blaze to serve as an on-ramp to national cyberinfrastructure, such as the National Science Foundation’s Blue Waters petascale computer at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign and the Department of Energy’s Argonne Leadership Computing Facility (ALCF) at Argonne National Laboratory. DOE award # DE-SC005067, leveraged with NSF award #CNS-0959053 for “Development of the Next-Generation CAVE Virtual Environment (NG-CAVE),” enabled us to create a first-of-its-kind high-performance visual computing system. The UIC Electronic Visualization Laboratory (EVL) worked with two U.S. companies to advance their commercial products and maintain U.S. leadership in the global information technology economy. New applications are being enabled with the CAVE2/Blaze visual computing system that are advancing scientific research and education in the U.S. and globally, and helping to train the next-generation workforce.
Dynamic Performance Optimization for Cloud Computing Using M/M/m Queueing System

Directory of Open Access Journals (Sweden)

Lizheng Guo

2014-01-01

Successful development of cloud computing has attracted more and more people and enterprises to use it. On one hand, using cloud computing reduces the cost; on the other hand, using cloud computing improves the efficiency. As users are largely concerned about Quality of Service (QoS), performance optimization of cloud computing has become critical to its successful application. In order to optimize the performance of multiple requesters and services in cloud computing, we use queueing theory to analyze and derive the equations for each parameter of the services in the data center. Then, by analyzing the performance parameters of the queueing system, we propose a synthesis optimization mode, function, and strategy. Lastly, we set up a simulation based on the synthesis optimization mode and compare the simulation results with those of classical optimization methods (shortest service time first and first in, first out); the comparison shows that the proposed model can optimize the average wait time, average queue length, and the number of customers.
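The abstract above does not reproduce its equations, so the following is only a minimal sketch of the standard textbook M/M/m (Erlang C) results for the quantities it mentions, average wait time and average queue length. The function name `mmm_metrics` and its parameters (`lam` for the arrival rate, `mu` for the per-server service rate, `m` for the number of servers) are illustrative assumptions, not the authors' notation, and the sketch does not implement their synthesis optimization strategy.

```python
from math import factorial


def mmm_metrics(lam, mu, m):
    """Steady-state metrics of an M/M/m queue (textbook Erlang C formulas).

    lam: arrival rate, mu: service rate of one server, m: number of servers.
    All three names are hypothetical and chosen for this illustration.
    """
    a = lam / mu   # offered load in Erlangs
    rho = a / m    # utilization of each server
    if rho >= 1:
        raise ValueError("unstable queue: need lam < m * mu")
    # Probability that the system is empty.
    tail = a**m / (factorial(m) * (1 - rho))
    p0 = 1.0 / (sum(a**k / factorial(k) for k in range(m)) + tail)
    # Erlang C: probability that an arriving request has to wait.
    p_wait = tail * p0
    lq = p_wait * rho / (1 - rho)  # average number of waiting requests
    wq = lq / lam                  # average wait, by Little's law
    return {
        "utilization": rho,
        "p_wait": p_wait,
        "avg_queue_length": lq,
        "avg_wait": wq,
        "avg_response": wq + 1 / mu,
    }


if __name__ == "__main__":
    # Compare two hypothetical data-center sizes under the same load.
    for servers in (3, 4):
        print(servers, mmm_metrics(lam=5.0, mu=2.0, m=servers))
```

Evaluating the function for several values of `m` is one simple way to see how adding servers trades cost against waiting time, which is the kind of trade-off a queueing-based optimization has to weigh.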
Confabulation Based Real-time Anomaly Detection for Wide-area Surveillance Using Heterogeneous High Performance Computing Architecture

Science.gov (United States)

2015-06-01

CONFABULATION BASED REAL-TIME ANOMALY DETECTION FOR WIDE-AREA SURVEILLANCE USING HETEROGENEOUS HIGH PERFORMANCE COMPUTING ARCHITECTURE SYRACUSE...DETECTION FOR WIDE-AREA SURVEILLANCE USING HETEROGENEOUS HIGH PERFORMANCE COMPUTING ARCHITECTURE 5a. CONTRACT NUMBER FA8750-12-1-0251 5b. GRANT...processors including graphic processor units (GPUs) and Intel Xeon Phi processors. Experimental results showed significant speedups, which can enable
The application of AFS in the high energy physics computing system

International Nuclear Information System (INIS)

Xu Dong; Yan Xiaofei; Chen Yaodong; Chen Gang; Yu Chuansong

2010-01-01

With the development of high energy physics, physics experiments are producing large amounts of data. The data analysis workload is very large, and the analysis needs to be carried out by many scientists working together. The computing system must therefore provide more secure user management and a higher level of data sharing. The article introduces an AFS-based solution for the high energy physics computing system, which not only makes user management safer but also makes data sharing easier. (authors)
Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

CERN Document Server

Abdurachmanov, David; Elmer, Peter; Eulisse, Giulio; Knight, Robert; Muzaffar, Shahzad

2014-01-01

Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost-efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG).
Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

Science.gov (United States)

Abdurachmanov, David; Bockelman, Brian; Elmer, Peter; Eulisse, Giulio; Knight, Robert; Muzaffar, Shahzad

2015-05-01

Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost-efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG).

Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

International Nuclear Information System (INIS)

Abdurachmanov, David; Bockelman, Brian; Elmer, Peter; Eulisse, Giulio; Muzaffar, Shahzad; Knight, Robert

2015-01-01

Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost-efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG). (paper)
High performance computing environment for multidimensional image analysis.

Science.gov (United States)

Rao, A Ravishankar; Cecchi, Guillermo A; Magnasco, Marcelo

2007-07-10

The processing of images acquired through microscopy is a challenging task due to the large size of datasets (several gigabytes) and the fast turnaround time required. If the throughput of the image processing stage is significantly increased, it can have a major impact in microscopy applications. We present a high performance computing (HPC) solution to this problem. This involves decomposing the spatial 3D image into segments that are assigned to unique processors, and matched to the 3D torus architecture of the IBM Blue Gene/L machine. Communication between segments is restricted to the nearest neighbors. When running on a 2 GHz Intel CPU, the task of 3D median filtering on a typical 256 megabyte dataset takes two and a half hours, whereas by using 1024 nodes of Blue Gene, this task can be performed in 18.8 seconds, a 478x speedup. Our parallel solution dramatically improves the performance of image processing, feature extraction and 3D reconstruction tasks. This increased throughput permits biologists to conduct unprecedented large scale experiments with massive datasets.
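The speedup quoted in the abstract above can be checked from the two timings it reports. The short sketch below redoes that arithmetic. The helper names are illustrative, and the parallel efficiency figure is an added, indicative quantity only: the serial run used a 2 GHz Intel CPU rather than a single Blue Gene node, so the ratio mixes architectures.

```python
def speedup(t_serial_s, t_parallel_s):
    """Ratio of serial to parallel wall-clock time."""
    return t_serial_s / t_parallel_s


def parallel_efficiency(s, n_nodes):
    """Speedup per node; 1.0 would mean perfect linear scaling."""
    return s / n_nodes


t_serial = 2.5 * 3600  # two and a half hours, in seconds
t_parallel = 18.8      # seconds on 1024 Blue Gene nodes
s = speedup(t_serial, t_parallel)
e = parallel_efficiency(s, 1024)
print(f"speedup ~ {s:.1f}x, efficiency ~ {e:.2f}")
```

The computed ratio is about 478.7, consistent with the reported 478x.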
Mixed-Language High-Performance Computing for Plasma Simulations

Directory of Open Access Journals (Sweden)

Quanming Lu

2003-01-01

Java is receiving increasing attention as the most popular platform for distributed computing. However, programmers are still reluctant to embrace Java as a tool for writing scientific and engineering applications due to its still noticeable performance drawbacks compared with other programming languages such as Fortran or C. In this paper, we present a hybrid Java/Fortran implementation of a parallel particle-in-cell (PIC) algorithm for plasma simulations. In our approach, the time-consuming components of this application are designed and implemented as Fortran subroutines, while less calculation-intensive components usually involved in building the user interface are written in Java. The two types of software modules have been glued together using the Java native interface (JNI). Our mixed-language PIC code was tested and its performance compared with pure Java and Fortran versions of the same algorithm on a Sun E6500 SMP system and a Linux cluster of Pentium III machines.
Big Data and High-Performance Computing in Global Seismology

Science.gov (United States)

Bozdag, Ebru; Lefebvre, Matthieu; Lei, Wenjie; Peter, Daniel; Smith, James; Komatitsch, Dimitri; Tromp, Jeroen

2014-05-01

Much of our knowledge of Earth's interior is based on seismic observations and measurements. Adjoint methods provide an efficient way of incorporating 3D full wave propagation in iterative seismic inversions to enhance tomographic images and thus our understanding of processes taking place inside the Earth. Our aim is to take adjoint tomography, which has been successfully applied to regional and continental scale problems, further to image the entire planet. This is one of the extreme imaging challenges in seismology, mainly due to the intense computational requirements and vast amount of high-quality seismic data that can potentially be assimilated. We have started low-resolution inversions (T > 30 s and T > 60 s for body and surface waves, respectively) with a limited data set (253 carefully selected earthquakes and seismic data from permanent and temporary networks) on Oak Ridge National Laboratory's Cray XK7 "Titan" system. Recent improvements in our 3D global wave propagation solvers, such as a GPU version of the SPECFEM3D_GLOBE package, will enable us to perform higher-resolution (T > 9 s) and longer duration (~180 m) simulations to take advantage of high-frequency body waves and major-arc surface waves, thereby improving imbalanced ray coverage as a result of the uneven global distribution of sources and receivers. Our ultimate goal is to use all earthquakes in the global CMT catalogue within the magnitude range of our interest and data from all available seismic networks. To take full advantage of computational resources, we need a solid framework to manage big data sets during numerical simulations, pre-processing (i.e., data requests and quality checks, processing data, window selection, etc.) and post-processing (i.e., pre-conditioning and smoothing kernels, etc.). We address the bottlenecks in our global seismic workflow, which are mainly coming from heavy I/O traffic during simulations and the pre- and post-processing stages, by defining new data
Acceleration of FDTD mode solver by high-performance computing techniques.

Science.gov (United States)

Han, Lin; Xi, Yanping; Huang, Wei-Ping

2010-06-21

A two-dimensional (2D) compact finite-difference time-domain (FDTD) mode solver is developed based on wave equation formalism in combination with the matrix pencil method (MPM). The method is validated for calculation of both real guided and complex leaky modes of typical optical waveguides against the benchmark finite-difference (FD) eigenmode solver. By taking advantage of the inherent parallel nature of the FDTD algorithm, the mode solver is implemented on graphics processing units (GPUs) using the compute unified device architecture (CUDA). It is demonstrated that the high-performance computing technique leads to significant acceleration of the FDTD mode solver, with more than a 30-fold improvement in computational efficiency in comparison with the conventional FDTD mode solver running on the CPU of a standard desktop computer. The computational efficiency of the accelerated FDTD method is of the same order of magnitude as that of the standard finite-difference eigenmode solver, yet it requires much less memory (e.g., less than 10%). Therefore, the new method may serve as an efficient, accurate and robust tool for mode calculation of optical waveguides even when conventional eigenvalue mode solvers are no longer applicable due to memory limitations.
High performance computing applied to simulation of the flow in pipes; Computacao de alto desempenho aplicada a simulacao de escoamento em dutos

Energy Technology Data Exchange (ETDEWEB)

Cozin, Cristiane; Lueders, Ricardo; Morales, Rigoberto E.M. [Universidade Tecnologica Federal do Parana (UTFPR), Curitiba, PR (Brazil). Dept. de Engenharia Mecanica

2008-07-01

In recent years, computer clusters have emerged as a real alternative for solving problems that require high performance computing, which in turn has driven the development of new applications. Among them, flow simulation represents a real computational burden, especially for large systems. This work presents a study of the use of parallel computing for numerical fluid flow simulation in pipelines. A mathematical flow model is solved numerically. In general, this procedure leads to a tridiagonal system of equations suitable for solution by a parallel algorithm. In this work, this is accomplished by a parallel odd-even reduction method found in the literature, which is implemented in the Fortran programming language. A computational platform composed of twelve processors was used. Many measurements of CPU time for different tridiagonal system sizes and numbers of processors were obtained, highlighting the communication time between processors as an important issue to be considered when evaluating the performance of parallel applications. (author)
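As an illustration of the odd-even reduction named in the abstract above, the sketch below solves a tridiagonal system in plain, serial Python. It is not the authors' Fortran code, and the function name `odd_even_reduction` and the argument layout are assumptions made here. Each level eliminates the even-numbered unknowns, halving the system. The updates of the remaining rows are independent of one another, which is what a parallel implementation would distribute across processors. The method uses no pivoting, so the sketch assumes a diagonally dominant matrix.

```python
def odd_even_reduction(a, b, c, d):
    """Solve a[i]*x[i-1] + b[i]*x[i] + c[i]*x[i+1] = d[i] for x.

    a, b, c are the sub-, main and super-diagonals (all of length n);
    a[0] and c[n-1] are ignored. Assumes diagonal dominance (no pivoting).
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]
    a, b, c, d = list(a), list(b), list(c), list(d)
    a[0] = 0.0   # first row has no left neighbour
    c[-1] = 0.0  # last row has no right neighbour
    if n % 2 == 0:
        # Pad with a dummy row "x[n] = 0" so every odd row has a right neighbour.
        a.append(0.0)
        b.append(1.0)
        c.append(0.0)
        d.append(0.0)
    ra, rb, rc, rd = [], [], [], []
    for i in range(1, n, 2):
        # Eliminate x[i-1] and x[i+1] from odd row i using rows i-1 and i+1.
        # Iterations are independent: this loop is the parallelizable step.
        alpha = a[i] / b[i - 1]
        gamma = c[i] / b[i + 1]
        ra.append(-alpha * a[i - 1])
        rb.append(b[i] - alpha * c[i - 1] - gamma * a[i + 1])
        rc.append(-gamma * c[i + 1])
        rd.append(d[i] - alpha * d[i - 1] - gamma * d[i + 1])
    x = [0.0] * (n + 1)  # x[n] stays 0 and also serves as x[-1] for row 0
    x[1:n:2] = odd_even_reduction(ra, rb, rc, rd)  # solve the half-size system
    for i in range(0, n, 2):
        # Back-substitute the even unknowns; also independent of each other.
        x[i] = (d[i] - a[i] * x[i - 1] - c[i] * x[i + 1]) / b[i]
    return x[:n]


if __name__ == "__main__":
    # 2*x0 - x1 = 1, -x0 + 2*x1 - x2 = 0, -x1 + 2*x2 = 1
    print(odd_even_reduction([0, -1, -1], [2, 2, 2], [-1, -1, 0], [1, 0, 1]))
```

In a distributed version, each reduction level requires neighbouring rows to be exchanged between processors, which is consistent with the abstract's observation that communication time matters when evaluating parallel performance.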
Multi-Language Programming Environments for High Performance Java Computing

Directory of Open Access Journals (Sweden)

Vladimir Getov

1999-01-01

Recent developments in processor capabilities, software tools, programming languages and programming paradigms have brought about new approaches to high performance computing. A steadfast component of this dynamic evolution has been the scientific community’s reliance on established scientific packages. As a consequence, programmers of high‐performance applications are reluctant to embrace evolving languages such as Java. This paper describes the Java‐to‐C Interface (JCI) tool, which provides application programmers wishing to use Java with immediate accessibility to existing scientific packages. The JCI tool also facilitates rapid development and reuse of existing code. These benefits are provided at minimal cost to the programmer. While beneficial to the programmer, the additional advantages of mixed‐language programming in terms of application performance and portability are addressed in detail within the context of this paper. In addition, we discuss how the JCI tool is complementing other ongoing projects such as IBM’s High‐Performance Compiler for Java (HPCJ) and IceT’s metacomputing environment.
PERC 2 High-End Computer System Performance: Scalable Science and Engineering

Energy Technology Data Exchange (ETDEWEB)

Daniel Reed

2006-10-15

During the two years of SciDAC PERC-2, our activities centered largely on the development of new performance analysis techniques to enable efficient use of systems containing thousands or tens of thousands of processors. In addition, we continued our application engagement efforts and used our tools to study the performance of various SciDAC applications on a variety of HPC platforms.
Gender Differences in Attitudes toward Computers and Performance in the Accounting Information Systems Class

Science.gov (United States)

Lenard, Mary Jane; Wessels, Susan; Khanlarian, Cindi

2010-01-01

Using a model developed by Young (2000), this paper explores the relationship between performance in the Accounting Information Systems course, self-assessed computer skills, and attitudes toward computers. Results show that after taking the AIS course, students experience a change in perception about their use of computers. Females'…
Current state and future direction of computer systems at NASA Langley Research Center

Science.gov (United States)

Rogers, James L. (Editor); Tucker, Jerry H. (Editor)

1992-01-01

Computer systems have advanced at a rate unmatched by any other area of technology. As performance has dramatically increased there has been an equally dramatic reduction in cost. This constant cost performance improvement has precipitated the pervasiveness of computer systems into virtually all areas of technology. This improvement is due primarily to advances in microelectronics. Most people are now convinced that the new generation of supercomputers will be built using a large number (possibly thousands) of high performance microprocessors. Although the spectacular improvements in computer systems have come about because of these hardware advances, there has also been a steady improvement in software techniques. In an effort to understand how these hardware and software advances will affect research at NASA LaRC, the Computer Systems Technical Committee drafted this white paper to examine the current state and possible future directions of computer systems at the Center. This paper discusses selected important areas of computer systems including real-time systems, embedded systems, high performance computing, distributed computing networks, data acquisition systems, artificial intelligence, and visualization.
Comprehensive Simulation Lifecycle Management for High Performance Computing Modeling and Simulation, Phase I

Data.gov (United States)

National Aeronautics and Space Administration — There are significant logistical barriers to entry-level high performance computing (HPC) modeling and simulation (M IllinoisRocstar) sets up the infrastructure for...
How to build a high-performance compute cluster for the Grid

CERN Document Server

Reinefeld, A

2001-01-01

The success of large-scale multi-national projects like the forthcoming analysis of the LHC particle collision data at CERN relies to a great extent on the ability to efficiently utilize computing and data-storage resources at geographically distributed sites. Currently, much effort is spent on the design of Grid management software (Datagrid, Globus, etc.), while the effective integration of computing nodes has been largely neglected up to now. This is the focus of our work. We present a framework for a high-performance cluster that can be used as a reliable computing node in the Grid. We outline the cluster architecture, the management of distributed data and the seamless integration of the cluster into the Grid environment. (11 refs).
TOWARD HIGHLY SECURE AND AUTONOMIC COMPUTING SYSTEMS: A HIERARCHICAL APPROACH

Energy Technology Data Exchange (ETDEWEB)

Lee, Hsien-Hsin S

2010-05-11

The overall objective of this research project is to develop novel architectural techniques as well as system software to achieve a highly secure and intrusion-tolerant computing system. Such a system will be autonomous, self-adapting, and introspective, with self-healing capability under improper operations, abnormal workloads, and malicious attacks. The scope of this research includes: (1) System-wide, unified introspection techniques for autonomic systems, (2) Secure information-flow microarchitecture, (3) Memory-centric security architecture, (4) Authentication control and its implication to security, (5) Digital rights management, and (6) Microarchitectural denial-of-service attacks on shared resources. During the period of the project, we developed several architectural techniques and system software for achieving a robust, secure, and reliable computing system toward our goal.
High-Performance Computing in Neuroscience for Data-Driven Discovery, Integration, and Dissemination

International Nuclear Information System (INIS)

Bouchard, Kristofer E.

2016-01-01

A lack of coherent plans to analyze, manage, and understand data threatens the various opportunities offered by new neuro-technologies. High-performance computing will allow exploratory analysis of massive datasets stored in standardized formats, hosted in open repositories, and integrated with simulations.
Data processing system with a micro-computer for high magnetic field tokamak, TRIAM-1

International Nuclear Information System (INIS)

Kawasaki, Shoji; Nakamura, Kazuo; Nakamura, Yukio; Hiraki, Naoharu; Toi, Kazuo

1981-01-01

A data processing system was designed and constructed for the purpose of analyzing the data of the high magnetic field tokamak TRIAM-1. The system consists of a 10-channel A-D converter, a 20 K byte memory (RAM), an address bus control circuit, a data bus control circuit, a timing pulse and control signal generator, a D-A converter, a micro-computer, and a power source. The memory can be used as a CPU memory except at the time of sampling and data output. The output devices of the system are an X-Y recorder and an oscilloscope. The computer is composed of a CPU, a memory and an I/O part. The memory size can be extended. A cassette tape recorder is provided to keep the programs of the computer. An interface circuit between the computer and the tape recorder was designed and constructed. An electric discharge printer can be connected as an I/O device. From TRIAM-1, the signals of magnetic probes, plasma current, vertical field coil current, and one-turn loop voltage are fed into the processing system. The plasma displacement calculated from these signals is shown by one of the I/O devices. The test run showed good performance. (Kato, T.)
Data processing system with a micro-computer for high magnetic field tokamak, TRIAM-1

Energy Technology Data Exchange (ETDEWEB)

Kawasaki, S; Nakamura, K; Nakamura, Y; Hiraki, N; Toi, K [Kyushu Univ., Fukuoka (Japan). Research Inst. for Applied Mechanics

1981-02-01

A data processing system was designed and constructed for the purpose of analyzing the data of the high magnetic field tokamak TRIAM-1. The system consists of a 10-channel A-D converter, a 20 K byte memory (RAM), an address bus control circuit, a data bus control circuit, a timing pulse and control signal generator, a D-A converter, a micro-computer, and a power source. The memory can be used as a CPU memory except at the time of sampling and data output. The output devices of the system are an X-Y recorder and an oscilloscope. The computer is composed of a CPU, a memory and an I/O part. The memory size can be extended. A cassette tape recorder is provided to keep the programs of the computer. An interface circuit between the computer and the tape recorder was designed and constructed. An electric discharge printer can be connected as an I/O device. From TRIAM-1, the signals of magnetic probes, plasma current, vertical field coil current, and one-turn loop voltage are fed into the processing system. The plasma displacement calculated from these signals is shown by one of the I/O devices. The test run showed good performance.
BEAGLE: an application programming interface and high-performance computing library for statistical phylogenetics.

Science.gov (United States)

Ayres, Daniel L; Darling, Aaron; Zwickl, Derrick J; Beerli, Peter; Holder, Mark T; Lewis, Paul O; Huelsenbeck, John P; Ronquist, Fredrik; Swofford, David L; Cummings, Michael P; Rambaut, Andrew; Suchard, Marc A

2012-01-01

Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software.
Enabling Efficient Climate Science Workflows in High Performance Computing Environments

Science.gov (United States)

Krishnan, H.; Byna, S.; Wehner, M. F.; Gu, J.; O'Brien, T. A.; Loring, B.; Stone, D. A.; Collins, W.; Prabhat, M.; Liu, Y.; Johnson, J. N.; Paciorek, C. J.

2015-12-01

A typical climate science workflow often involves a combination of acquisition of data, modeling, simulation, analysis, visualization, publishing, and storage of results. Each of these tasks presents a myriad of challenges when running on a high performance computing environment such as Hopper or Edison at NERSC. Hurdles such as data transfer and management, job scheduling, parallel analysis routines, and publication require a lot of forethought and planning to ensure that proper quality control mechanisms are in place. These steps require effectively utilizing a combination of well tested and newly developed functionality to move data, perform analysis, apply statistical routines, and finally, serve results and tools to the greater scientific community. As part of the CAlibrated and Systematic Characterization, Attribution and Detection of Extremes (CASCADE) project we highlight a stack of tools our team utilizes and has developed to ensure that large scale simulation and analysis work are commonplace and provide operations that assist in everything from generation/procurement of data (HTAR/Globus) to automating publication of results to portals like the Earth Systems Grid Federation (ESGF), all while executing everything in between in a scalable environment in a task parallel way (MPI). We highlight the use and benefit of these tools by showing several climate science analysis use cases they have been applied to.
New computing systems, future computing environment, and their implications on structural analysis and design

Science.gov (United States)

Noor, Ahmed K.; Housner, Jerrold M.

1993-01-01

Recent advances in computer technology that are likely to impact structural analysis and design of flight vehicles are reviewed. A brief summary is given of the advances in microelectronics, networking technologies, and in the user-interface hardware and software. The major features of new and projected computing systems, including high performance computers, parallel processing machines, and small systems, are described. Advances in programming environments, numerical algorithms, and computational strategies for new computing systems are reviewed. The impact of the advances in computer technology on structural analysis and the design of flight vehicles is described. A scenario for future computing paradigms is presented, and the near-term needs in the computational structures area are outlined.
Analysis of Application Power and Schedule Composition in a High Performance Computing Environment

Energy Technology Data Exchange (ETDEWEB)

Elmore, Ryan [National Renewable Energy Lab. (NREL), Golden, CO (United States); Gruchalla, Kenny [National Renewable Energy Lab. (NREL), Golden, CO (United States); Phillips, Caleb [National Renewable Energy Lab. (NREL), Golden, CO (United States); Purkayastha, Avi [National Renewable Energy Lab. (NREL), Golden, CO (United States); Wunder, Nick [National Renewable Energy Lab. (NREL), Golden, CO (United States)

2016-01-05

As the capacity of high performance computing (HPC) systems continues to grow, small changes in energy management have the potential to produce significant energy savings. In this paper, we employ an extensive informatics system for aggregating and analyzing real-time performance and power use data to evaluate energy footprints of jobs running in an HPC data center. We look at the effects of algorithmic choices for a given job on the resulting energy footprints, and analyze application-specific power consumption, and summarize average power use in the aggregate. All of these views reveal meaningful power variance between classes of applications as well as chosen methods for a given job. Using these data, we discuss energy-aware cost-saving strategies based on reordering the HPC job schedule. Using historical job and power data, we present a hypothetical job schedule reordering that: (1) reduces the facility's peak power draw and (2) manages power in conjunction with a large-scale photovoltaic array. Lastly, we leverage this data to understand the practical limits on predicting key power use metrics at the time of submission.
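The idea of reordering a job schedule to reduce peak power draw can be illustrated with a minimal sketch. This is not the method used in the paper; it is a simple greedy heuristic under stated assumptions: time is divided into equal slots, each job draws a constant average power, and any job may start in any slot that fits within the horizon. The names `Job` and `schedule_min_peak` and all job figures are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical job record: (name, duration in time slots, average power in kW).
Job = Tuple[str, int, float]


def schedule_min_peak(jobs: List[Job], horizon: int) -> Tuple[Dict[str, int], List[float]]:
    """Greedily place jobs so that the facility power profile stays flat.

    Jobs are placed in order of decreasing power. Each job goes to the
    start slot where the highest power inside its window, after adding
    the job, is smallest. Returns the chosen start slots and the
    resulting per-slot power profile.
    """
    profile = [0.0] * horizon
    starts: Dict[str, int] = {}
    for name, duration, power in sorted(jobs, key=lambda j: j[2], reverse=True):
        if duration > horizon:
            raise ValueError(f"job {name} does not fit in the horizon")
        best_start, best_peak = 0, float("inf")
        for start in range(horizon - duration + 1):
            window_peak = max(profile[start:start + duration]) + power
            if window_peak < best_peak:
                best_start, best_peak = start, window_peak
        for t in range(best_start, best_start + duration):
            profile[t] += power
        starts[name] = best_start
    return starts, profile


# Illustrative (made-up) jobs: three two-slot jobs in a four-slot horizon.
example_jobs = [("a", 2, 100.0), ("b", 2, 80.0), ("c", 2, 50.0)]
example_starts, example_profile = schedule_min_peak(example_jobs, horizon=4)
print(example_starts, max(example_profile))
```

Starting all three example jobs in slot 0 would draw 230 kW at once, whereas the greedy placement spreads them across the horizon. A realistic scheduler would also have to respect node availability, job dependencies, and queue priorities, none of which are modeled here.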

Nested Interrupt Analysis of Low Cost and High Performance Embedded Systems Using GSPN Framework

Science.gov (United States)

Lin, Cheng-Min

Interrupt service routines are a key technology for embedded systems. In this paper, we introduce the standard approach for using Generalized Stochastic Petri Nets (GSPNs) as a high-level model for generating Continuous-Time Markov Chains (CTMCs) and then use Markov Reward Models (MRMs) to compute the performance of embedded systems. This framework is employed to analyze two low-cost, high-performance embedded controllers, ARM7 and Cortex-M3. Cortex-M3 is designed with a tail-chaining mechanism that improves on the performance of ARM7 when a nested interrupt occurs on an embedded controller. The Platform Independent Petri net Editor 2 (PIPE2) tool is used to model and evaluate the controllers in terms of power consumption and interrupt overhead performance. The numerical results show that, whether measured by power consumption or by interrupt overhead, Cortex-M3 performs better than ARM7.
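The last step of such a pipeline, computing a reward measure from a CTMC, can be sketched as follows. This is a minimal illustration and not the PIPE2 workflow or the paper's model: the three states (idle, handling an interrupt, handling a nested interrupt), the transition rates, and the per-state power rewards are all hypothetical. The code solves the steady-state equations πQ = 0 with Σπ = 1 by Gaussian elimination and then forms the expected reward Σ π_i r_i.

```python
from typing import List


def steady_state(Q: List[List[float]]) -> List[float]:
    """Solve pi * Q = 0 with sum(pi) = 1 for an irreducible CTMC generator Q."""
    n = len(Q)
    # Transpose Q so that the unknowns form a column vector, then replace
    # the last balance equation with the normalization condition.
    A = [[Q[j][i] for j in range(n)] for i in range(n)]
    A[-1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]
    # Forward elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    pi = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(A[i][c] * pi[c] for c in range(i + 1, n))
        pi[i] = (b[i] - tail) / A[i][i]
    return pi


def expected_reward(pi: List[float], rewards: List[float]) -> float:
    """Steady-state expected reward of a Markov reward model."""
    return sum(p * r for p, r in zip(pi, rewards))


# Hypothetical generator matrix; states: 0 = idle, 1 = ISR, 2 = nested ISR.
Q_EXAMPLE = [
    [-2.0, 2.0, 0.0],   # idle -> ISR at rate 2
    [5.0, -6.0, 1.0],   # ISR -> idle at rate 5, ISR -> nested ISR at rate 1
    [0.0, 4.0, -4.0],   # nested ISR -> ISR at rate 4
]
POWER_MW = [10.0, 40.0, 55.0]  # made-up power draw per state

pi_example = steady_state(Q_EXAMPLE)
print(pi_example, expected_reward(pi_example, POWER_MW))
```

In a GSPN tool the generator matrix would be derived from the reachability graph of the net rather than written by hand; only the reward computation is shown here.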
Secure Enclaves: An Isolation-centric Approach for Creating Secure High Performance Computing Environments

Energy Technology Data Exchange (ETDEWEB)

Aderholdt, Ferrol [Tennessee Technological Univ., Cookeville, TN (United States); Caldwell, Blake A. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Hicks, Susan Elaine [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Koch, Scott M. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Naughton, III, Thomas J. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Pelfrey, Daniel S. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Pogge, James R [Tennessee Technological Univ., Cookeville, TN (United States); Scott, Stephen L [Tennessee Technological Univ., Cookeville, TN (United States); Shipman, Galen M. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Sorrillo, Lawrence [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

2017-01-01

High performance computing environments are often used for a wide variety of workloads, including simulation, data transformation and analysis, and complex workflows, to name just a few. These systems may process data at various security levels but in so doing are often enclaved at the highest security posture. This approach places significant restrictions on the users of the system even when processing data at a lower security level and exposes data at higher levels of confidentiality to a much broader population than otherwise necessary. The traditional approach of isolation, while effective in establishing security enclaves, poses significant challenges for the use of shared infrastructure in HPC environments. This report details the current state of the art in virtualization, reconfigurable network enclaving via Software Defined Networking (SDN), and storage architectures and bridging techniques for creating secure enclaves in HPC environments.
LHC@Home: A Volunteer computing system for Massive Numerical Simulations of Beam Dynamics and High Energy Physics Events

CERN Document Server

Giovannozzi, M; Høimyr, N; Jones, PL; Karneyeu, A; Marquina, MA; McIntosh, E; Segal, B; Skands, P; Grey, F; Lombraña González, D; Rivkin, L; Zacharov, I

2012-01-01

Recently, the LHC@home system has been revived at CERN. It is a volunteer computing system based on BOINC which boosts the available CPU-power in institutional computer centres with the help of individuals who donate the CPU-time of their PCs. Currently two projects are hosted on the system, namely SixTrack and Test4Theory. The first is aimed at performing beam dynamics simulations, while the latter deals with the simulation of high-energy events. In this paper the details of the global system, as well as a discussion of the capabilities of each project, will be presented.
Current configuration and performance of the TFTR computer system

International Nuclear Information System (INIS)

Sauthoff, N.R.; Barnes, D.J.; Daniels, R.; Davis, S.; Reid, A.; Snyder, T.; Oliaro, G.; Stark, W.; Thompson, J.R. Jr.

1986-01-01

Developments in the TFTR (Tokamak Fusion Test Reactor) computer support system since its startup phases are described. Early emphasis on tokamak process control has been augmented by improved physics data handling, both on-line and off-line. Data acquisition volume and rate have been increased, and data is transmitted automatically to a new VAX-based off-line data reduction system. The number of interface points has increased dramatically, as has the number of man-machine interfaces. The graphics system performance has been accelerated by the introduction of parallelism, and new features such as shadowing and device independence have been added. To support multicycle operation for neutral beam conditioning and independence, the program control system has been generalized. A status and alarm system, including calculated variables, is in the installation phase. System reliability has been enhanced by both the redesign of weaker components and installation of a system status monitor. Development productivity has been enhanced by the addition of tools
Information Power Grid: Distributed High-Performance Computing and Large-Scale Data Management for Science and Engineering

Science.gov (United States)

Johnston, William E.; Gannon, Dennis; Nitzberg, Bill

2000-01-01

We use the term "Grid" to refer to distributed, high performance computing and data handling infrastructure that incorporates geographically and organizationally dispersed, heterogeneous resources that are persistent and supported. This infrastructure includes: (1) Tools for constructing collaborative, application oriented Problem Solving Environments / Frameworks (the primary user interfaces for Grids); (2) Programming environments, tools, and services providing various approaches for building applications that use aggregated computing and storage resources, and federated data sources; (3) Comprehensive and consistent set of location independent tools and services for accessing and managing dynamic collections of widely distributed resources: heterogeneous computing systems, storage systems, real-time data sources and instruments, human collaborators, and communications systems; (4) Operational infrastructure including management tools for distributed systems and distributed resources, user services, accounting and auditing, strong and location independent user authentication and authorization, and overall system security services. The vision for NASA's Information Power Grid - a computing and data Grid - is that it will provide significant new capabilities to scientists and engineers by facilitating routine construction of information based problem solving environments / frameworks. Such Grids will knit together widely distributed computing, data, instrument, and human resources into just-in-time systems that can address complex and large-scale computing and data analysis problems. Examples of these problems include: (1) Coupled, multidisciplinary simulations too large for single systems (e.g., multi-component NPSS turbomachine simulation); (2) Use of widely distributed, federated data archives (e.g., simultaneous access to metrological, topological, aircraft performance, and flight path scheduling databases supporting a National Air Space Simulation systems); (3
High-level language computer architecture

CERN Document Server

Chu, Yaohan

1975-01-01

High-Level Language Computer Architecture offers a tutorial on high-level language computer architecture, including von Neumann architecture and syntax-oriented architecture as well as direct and indirect execution architecture. Design concepts of Japanese-language data processing systems are discussed, along with the architecture of stack machines and the SYMBOL computer system. The conceptual design of a direct high-level language processor is also described. Comprised of seven chapters, this book first presents a classification of high-level language computer architecture according to the pr
Computer control system for ⁶⁰Co industrial DR nondestructive testing system

CERN Document Server

Chen Hai Jun

2002-01-01

The author presents the application of a ⁶⁰Co industrial DR nondestructive testing system, which includes step-motor control, electrical protection, and a computer monitoring program. The computer control system has good performance, high reliability and low cost
Optical high-performance computing: introduction to the JOSA A and Applied Optics feature.

Science.gov (United States)

Caulfield, H John; Dolev, Shlomi; Green, William M J

2009-08-01

The feature issues in both Applied Optics and the Journal of the Optical Society of America A focus on topics of immediate relevance to the community working in the area of optical high-performance computing.
Performance Analysis of Ivshmem for High-Performance Computing in Virtual Machines

Science.gov (United States)

Ivanovic, Pavle; Richter, Harald

2018-01-01

High-Performance computing (HPC) is rarely accomplished via virtual machines (VMs). In this paper, we present a remake of ivshmem which can change this. Ivshmem was a shared memory (SHM) between virtual machines on the same server, with SHM-access synchronization included, until about 5 years ago when newer versions of Linux and its virtualization library libvirt evolved. We restored that SHM-access synchronization feature because it is indispensable for HPC and made ivshmem runnable with contemporary versions of Linux, libvirt, KVM, QEMU and especially MPICH, which is an implementation of MPI - the standard HPC communication library. Additionally, we transparently modified MPICH to include ivshmem, resulting in a three to ten times performance improvement compared to TCP/IP. Furthermore, we have transparently replaced MPI_PUT, a one-sided MPICH communication mechanism, with our own MPI_PUT wrapper. As a result, our ivshmem even surpasses non-virtualized SHM data transfers for block lengths greater than 512 KBytes, showing the benefits of virtualization. All improvements were possible without using SR-IOV.
Peregrine System Configuration | High-Performance Computing | NREL

Science.gov (United States)

.hpc.nrel.gov Login Nodes There are four login nodes on the system, HP Proliant DL380 G8 servers with Intel E5 , /projects, /mss and /nopt file systems are mounted on all login nodes. Users may connect to peregrine.hpc.nrel.gov. This will connect to one of the 4 login nodes. Users also have the option of connecting directly
High performance communication by people with paralysis using an intracortical brain-computer interface

Science.gov (United States)

Pandarinath, Chethan; Nuyujukian, Paul; Blabe, Christine H; Sorice, Brittany L; Saab, Jad; Willett, Francis R; Hochberg, Leigh R

2017-01-01

Brain-computer interfaces (BCIs) have the potential to restore communication for people with tetraplegia and anarthria by translating neural activity into control signals for assistive communication devices. While previous pre-clinical and clinical studies have demonstrated promising proofs-of-concept (Serruya et al., 2002; Simeral et al., 2011; Bacher et al., 2015; Nuyujukian et al., 2015; Aflalo et al., 2015; Gilja et al., 2015; Jarosiewicz et al., 2015; Wolpaw et al., 1998; Hwang et al., 2012; Spüler et al., 2012; Leuthardt et al., 2004; Taylor et al., 2002; Schalk et al., 2008; Moran, 2010; Brunner et al., 2011; Wang et al., 2013; Townsend and Platsko, 2016; Vansteensel et al., 2016; Nuyujukian et al., 2016; Carmena et al., 2003; Musallam et al., 2004; Santhanam et al., 2006; Hochberg et al., 2006; Ganguly et al., 2011; O’Doherty et al., 2011; Gilja et al., 2012), the performance of human clinical BCI systems is not yet high enough to support widespread adoption by people with physical limitations of speech. Here we report a high-performance intracortical BCI (iBCI) for communication, which was tested by three clinical trial participants with paralysis. The system leveraged advances in decoder design developed in prior pre-clinical and clinical studies (Gilja et al., 2015; Kao et al., 2016; Gilja et al., 2012). For all three participants, performance exceeded previous iBCIs (Bacher et al., 2015; Jarosiewicz et al., 2015) as measured by typing rate (by a factor of 1.4–4.2) and information throughput (by a factor of 2.2–4.0). This high level of performance demonstrates the potential utility of iBCIs as powerful assistive communication devices for people with limited motor function. Clinical Trial No: NCT00912041 DOI: http://dx.doi.org/10.7554/eLife.18554.001 PMID:28220753
A CAD Open Platform for High Performance Reconfigurable Systems in the EXTRA Project

NARCIS (Netherlands)

Rabozzi, M.; Brondolin, R.; Natale, G.; Del Sozzo, E.; Huebner, M.; Brokalakis, A.; Ciobanu, C.; Stroobandt, D.; Santambrogio, M.D.; Hübner, M.; Reis, R.; Stan, M.; Voros, N.

2017-01-01

As the power wall has become one of the main limiting factors for the performance of general purpose processors, the trend in High Performance Computing (HPC) is moving towards application-specific accelerators in order to meet the stringent performance requirements for exascale computing while
Can We Build a Truly High Performance Computer Which is Flexible and Transparent?

KAUST Repository

Rojas, Jhonathan Prieto; Sevilla, Galo T.; Hussain, Muhammad Mustafa

2013-01-01

cost advantage. In that context, low-cost mono-crystalline bulk silicon (100) based high performance transistors are considered as the heart of today's computers. One limitation is silicon's rigidity and brittleness. Here we show a generic batch process
Department of Energy Mathematical, Information, and Computational Sciences Division: High Performance Computing and Communications Program

Energy Technology Data Exchange (ETDEWEB)

NONE

1996-11-01

This document is intended to serve two purposes. Its first purpose is that of a program status report of the considerable progress that the Department of Energy (DOE) has made since 1993, the time of the last such report (DOE/ER-0536, The DOE Program in HPCC), toward achieving the goals of the High Performance Computing and Communications (HPCC) Program. The second purpose is that of a summary report of the many research programs administered by the Mathematical, Information, and Computational Sciences (MICS) Division of the Office of Energy Research under the auspices of the HPCC Program and to provide, wherever relevant, easy access to pertinent information about MICS-Division activities via universal resource locators (URLs) on the World Wide Web (WWW).
Simulation model of load balancing in distributed computing systems

Science.gov (United States)

Botygin, I. A.; Popov, V. N.; Frolov, S. G.

2017-02-01

The availability of high-performance computing, high-speed data transfer over the network and the widespread availability of software for design and pre-production in mechanical engineering mean that both large industrial enterprises and small engineering companies now implement complex computer systems to solve production and management tasks efficiently. Such computer systems are generally built on the basis of distributed heterogeneous computer systems. The analytical problems solved by such systems are the key models of research, but the system-wide problems of efficient distribution (balancing) of the computational load and the placement of input, intermediate and output databases are no less important. The main tasks of this balancing system are load and condition monitoring of compute nodes, and the selection of a node to which the user’s request is forwarded in accordance with a predetermined algorithm. Load balancing is one of the most widely used methods of increasing the productivity of distributed computing systems through the optimal allocation of tasks between the computer system nodes. Therefore, the development of methods and algorithms for computing optimal scheduling in a distributed system whose infrastructure changes dynamically is an important task.
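The two balancing tasks named in the abstract, monitoring node load and condition and then selecting a node by a predetermined algorithm, can be sketched as follows. The abstract does not say which algorithm the authors simulate; this sketch assumes a least-loaded policy over healthy nodes, and the names `Node`, `select_node`, and `dispatch` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Monitoring snapshot of one compute node (hypothetical fields)."""
    name: str
    load: float           # current utilization reported by monitoring
    healthy: bool = True  # condition flag reported by monitoring


def select_node(nodes: List[Node]) -> Node:
    """Assumed policy: pick the healthy node with the lowest load."""
    candidates = [n for n in nodes if n.healthy]
    if not candidates:
        raise RuntimeError("no healthy compute node available")
    return min(candidates, key=lambda n: n.load)


def dispatch(nodes: List[Node], request_cost: float) -> str:
    """Forward a request to the selected node and account for its cost."""
    node = select_node(nodes)
    node.load += request_cost
    return node.name


# Illustrative cluster: n3 has the lowest load but is marked unhealthy.
cluster = [Node("n1", 0.5), Node("n2", 0.2), Node("n3", 0.1, healthy=False)]
print(dispatch(cluster, 0.4))  # goes to n2, the least-loaded healthy node
print(dispatch(cluster, 0.1))  # n2 is now busier than n1, so n1 is chosen
```

Other predetermined policies, such as round-robin or weighted selection by node capacity, would only require a different `select_node`.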
Commercialization issues and funding opportunities for high-performance optoelectronic computing modules

Science.gov (United States)

Hessenbruch, John M.; Guilfoyle, Peter S.

1997-01-01

Low power, optoelectronic integrated circuits are being developed for high speed switching and data processing applications. These high performance optoelectronic computing modules consist of three primary components: vertical cavity surface emitting lasers, diffractive optical interconnect elements, and detector/amplifier/laser driver arrays. Following the design and fabrication of an HPOC module prototype, selected commercial funding sources will be evaluated to support a product development stage. These include the formation of a strategic alliance with one or more microprocessor or telecommunications vendors, and/or equity investment from one or more venture capital firms.
Integrated modeling tool for performance engineering of complex computer systems

Science.gov (United States)

Wright, Gary; Ball, Duane; Hoyt, Susan; Steele, Oscar

1989-01-01

This report summarizes Advanced System Technologies' accomplishments on the Phase 2 SBIR contract NAS7-995. The technical objectives of the report are: (1) to develop an evaluation version of a graphical, integrated modeling language according to the specification resulting from the Phase 2 research; and (2) to determine the degree to which the language meets its objectives by evaluating ease of use, utility of two sets of performance predictions, and the power of the language constructs. The technical approach followed to meet these objectives was to design, develop, and test an evaluation prototype of a graphical, performance prediction tool. The utility of the prototype was then evaluated by applying it to a variety of test cases found in the literature and in AST case histories. Numerous models were constructed and successfully tested. The major conclusion of this Phase 2 SBIR research and development effort is that complex, real-time computer systems can be specified in a non-procedural manner using combinations of icons, windows, menus, and dialogs. Such a specification technique provides an interface that system designers and architects find natural and easy to use. In addition, PEDESTAL's multiview approach provides system engineers with the capability to perform the trade-offs necessary to produce a design that meets timing performance requirements. Sample system designs analyzed during the development effort showed that models could be constructed in a fraction of the time required by non-visual system design capture tools.
High-Performance Monitoring Architecture for Large-Scale Distributed Systems Using Event Filtering

Science.gov (United States)

Maly, K.

1998-01-01

Monitoring is an essential process to observe and improve the reliability and the performance of large-scale distributed (LSD) systems. In an LSD environment, a large number of events is generated by the system components during its execution or interaction with external objects (e.g. users or processes). Monitoring such events is necessary for observing the run-time behavior of LSD systems and providing status information required for debugging, tuning and managing such applications. However, correlated events are generated concurrently and could be distributed in various locations in the application's environment, which complicates the management decision process and thereby makes monitoring LSD systems an intricate task. We propose a scalable high-performance monitoring architecture for LSD systems to detect and classify interesting local and global events and disseminate the monitoring information to the corresponding end-point management applications, such as debugging and reactive control tools, to improve the application performance and reliability. A large volume of events may be generated due to the extensive demands of the monitoring applications and the high interaction of LSD systems. The monitoring architecture employs a high-performance event filtering mechanism to efficiently process the large volume of event traffic generated by LSD systems and minimize the intrusiveness of the monitoring process by reducing the event traffic flow in the system and distributing the monitoring computation. Our architecture also supports dynamic and flexible reconfiguration of the monitoring mechanism via its instrumentation and subscription components. As a case study, we show how our monitoring architecture can be utilized to improve the reliability and the performance of the Interactive Remote Instruction (IRI) system, which is a large-scale distributed system for collaborative distance learning. The filtering mechanism represents an intrinsic component integrated
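The subscription-based event filtering described in this record can be illustrated with a small sketch. It is not the authors' architecture: the `Event` fields, the `EventFilter` class, and the consumer names are hypothetical, chosen only to show how predicates registered by management tools can cut the event traffic that is forwarded.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Event:
    source: str   # component that emitted the event (hypothetical field)
    kind: str     # event type, e.g. "latency" or "error" (hypothetical field)
    value: float  # measurement carried by the event


class EventFilter:
    """Forward an event only to consumers whose subscription matches it."""

    def __init__(self) -> None:
        self._subscriptions: Dict[str, List[Callable[[Event], bool]]] = {}

    def subscribe(self, consumer: str, predicate: Callable[[Event], bool]) -> None:
        # A consumer may register several predicates; any match forwards the event.
        self._subscriptions.setdefault(consumer, []).append(predicate)

    def dispatch(self, events: List[Event]) -> Dict[str, List[Event]]:
        delivered: Dict[str, List[Event]] = {}
        for event in events:
            for consumer, predicates in self._subscriptions.items():
                if any(predicate(event) for predicate in predicates):
                    delivered.setdefault(consumer, []).append(event)
        return delivered


if __name__ == "__main__":
    flt = EventFilter()
    flt.subscribe("debugger", lambda e: e.kind == "error")
    flt.subscribe("tuner", lambda e: e.kind == "latency" and e.value > 100.0)
    stream = [
        Event("node1", "heartbeat", 1.0),
        Event("node2", "latency", 250.0),
        Event("node1", "error", 1.0),
    ]
    for consumer, forwarded in flt.dispatch(stream).items():
        print(consumer, [e.kind for e in forwarded])
```

Events that match no subscription, such as the heartbeat above, are never forwarded, which is the traffic reduction the abstract refers to.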
Quo vadis: Hydrologic inverse analyses using high-performance computing and a D-Wave quantum annealer

Science.gov (United States)

O'Malley, D.; Vesselinov, V. V.

2017-12-01

Classical microprocessors have had a dramatic impact on hydrology for decades, due largely to the exponential growth in computing power predicted by Moore's law. However, this growth is not expected to continue indefinitely and has already begun to slow. Quantum computing is an emerging alternative to classical microprocessors. Here, we demonstrated cutting edge inverse model analyses utilizing some of the best available resources in both worlds: high-performance classical computing and a D-Wave quantum annealer. The classical high-performance computing resources are utilized to build an advanced numerical model that assimilates data from O(10^5) observations, including water levels, drawdowns, and contaminant concentrations. The developed model accurately reproduces the hydrologic conditions at a Los Alamos National Laboratory contamination site, and can be leveraged to inform decision-making about site remediation. We demonstrate the use of a D-Wave 2X quantum annealer to solve hydrologic inverse problems. This work can be seen as an early step in quantum-computational hydrology. We compare and contrast our results with an early inverse approach in classical-computational hydrology that is comparable to the approach we use with quantum annealing. Our results show that quantum annealing can be useful for identifying regions of high and low permeability within an aquifer. While the problems we consider are small-scale compared to the problems that can be solved with modern classical computers, they are large compared to the problems that could be solved with early classical CPUs. Further, the binary nature of the high/low permeability problem makes it well-suited to quantum annealing, but challenging for classical computers.
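Quantum annealers of the D-Wave type minimize quadratic functions of binary variables (QUBO or, equivalently, Ising problems). The abstract does not give the formulation used, so the sketch below only illustrates how a binary inverse problem can be cast in that form. It assumes a linear forward model A x = b with x_i in {0, 1} (say, low or high permeability per cell), which is a simplification, and it replaces the annealer with an exhaustive search. All function names are hypothetical.

```python
from itertools import product


def least_squares_qubo(A, b):
    """Build Q and a constant so that |A x - b|^2 == x^T Q x + offset for binary x.

    Uses x_i * x_i == x_i, which folds the linear terms into the diagonal of Q.
    """
    n = len(A[0])
    Q = {}
    for i in range(n):
        Q[(i, i)] = sum(row[i] * (row[i] - 2.0 * bk) for row, bk in zip(A, b))
        for j in range(i + 1, n):
            Q[(i, j)] = 2.0 * sum(row[i] * row[j] for row in A)
    offset = sum(bk * bk for bk in b)
    return Q, offset


def qubo_energy(Q, x):
    """Evaluate the upper-triangular quadratic form for a binary tuple x."""
    return sum(coeff * x[i] * x[j] for (i, j), coeff in Q.items())


def brute_force_minimum(Q, n):
    """Classical stand-in for the annealer: try all 2**n binary assignments."""
    return min(product((0, 1), repeat=n), key=lambda x: qubo_energy(Q, x))


if __name__ == "__main__":
    A = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]   # toy forward model (assumed)
    true_x = (1, 0, 1)                      # hidden high/low pattern
    b = [sum(a * x for a, x in zip(row, true_x)) for row in A]
    Q, offset = least_squares_qubo(A, b)
    best = brute_force_minimum(Q, len(true_x))
    print(best, qubo_energy(Q, best) + offset)
```

The exhaustive search scales as 2^n and is only usable for toy sizes; it stands in for the annealing step so that the example runs anywhere.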
9th International Workshop on Parallel Tools for High Performance Computing

CERN Document Server

Hilbrich, Tobias; Niethammer, Christoph; Gracia, José; Nagel, Wolfgang; Resch, Michael

2016-01-01

High Performance Computing (HPC) remains a driver that offers huge potentials and benefits for science and society. However, a profound understanding of the computational matters and specialized software is needed to arrive at effective and efficient simulations. Dedicated software tools are important parts of the HPC software landscape, and support application developers. Even though a tool is by definition not a part of an application, but rather a supplemental piece of software, it can make a fundamental difference during the development of an application. Such tools aid application developers in the context of debugging, performance analysis, and code optimization, and therefore make a major contribution to the development of robust and efficient parallel software. This book introduces a selection of the tools presented and discussed at the 9th International Parallel Tools Workshop held in Dresden, Germany, September 2-3, 2015, which offered an established forum for discussing the latest advances in paral...

High performance computing and quantum trajectory method in CPU and GPU systems

International Nuclear Information System (INIS)

Wiśniewska, Joanna; Sawerwain, Marek; Leoński, Wiesław

2015-01-01

Nowadays, dynamic progress in computational techniques allows for the development of various methods that offer significant speed-up of computations, especially those related to problems in quantum optics and quantum computing. In this work, we propose computational solutions that re-implement the quantum trajectory method (QTM) algorithm in modern parallel computation environments in which multi-core CPUs and modern many-core GPUs can be used. As a consequence, new computational routines are developed in a more effective way than those applied in other commonly used packages, such as Quantum Optics Toolbox (QOT) for Matlab or QuTiP for Python.
Scientific Data Services -- A High-Performance I/O System with Array Semantics

Energy Technology Data Exchange (ETDEWEB)

Wu, Kesheng; Byna, Surendra; Rotem, Doron; Shoshani, Arie

2011-09-21

As high-performance computing approaches exascale, the existing I/O system design is having trouble keeping pace in both performance and scalability. We propose to address this challenge by adopting database principles and techniques in parallel I/O systems. First, we propose to adopt an array data model because many scientific applications represent their data in arrays. This strategy follows a cardinal principle from database research, which separates the logical view from the physical layout of data. This high-level data model gives the underlying implementation more freedom to optimize the physical layout and to choose the most effective way of accessing the data. For example, knowing that a set of write operations is working on a single multi-dimensional array makes it possible to keep the subarrays in a log structure during the write operations and reassemble them later into another physical layout as resources permit. While maintaining the high-level view, the storage system could compress the user data to reduce the physical storage requirement, collocate data records that are frequently used together, or replicate data to increase availability and fault-tolerance. Additionally, the system could generate secondary data structures such as database indexes and summary statistics. We expect the proposed Scientific Data Services approach to create a “live” storage system that dynamically adjusts to user demands and evolves with the massively parallel storage hardware.
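The log-structured write path mentioned in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the Scientific Data Services implementation: the class name `LogStructuredArray2D` and its methods are hypothetical, the array is two-dimensional, and everything is held in memory. Writes are appended to a log in arrival order, and the logical array is reassembled into a row-major layout only when requested.

```python
class LogStructuredArray2D:
    """Logical 2-D array whose writes are logged first and laid out later."""

    def __init__(self, rows, cols, fill=0.0):
        self.rows = rows
        self.cols = cols
        self.fill = fill
        self._log = []  # append-only list of (row0, col0, block)

    def write(self, row0, col0, block):
        """Record a rectangular subarray without touching the final layout."""
        height = len(block)
        width = len(block[0]) if height else 0
        if any(len(row) != width for row in block):
            raise ValueError("block rows must have equal length")
        if row0 < 0 or col0 < 0 or row0 + height > self.rows or col0 + width > self.cols:
            raise ValueError("block lies outside the logical array")
        self._log.append((row0, col0, [list(row) for row in block]))

    def materialize(self):
        """Replay the log in order into a row-major list of lists."""
        out = [[self.fill] * self.cols for _ in range(self.rows)]
        for row0, col0, block in self._log:
            for i, row in enumerate(block):
                out[row0 + i][col0:col0 + len(row)] = row
        return out


if __name__ == "__main__":
    arr = LogStructuredArray2D(2, 4)
    arr.write(0, 2, [[5, 6], [7, 8]])   # writers may arrive in any order
    arr.write(0, 0, [[1, 2], [3, 4]])
    print(arr.materialize())
```

Because callers address the data only through logical coordinates, the reassembly step is free to pick a different physical layout, which is the separation of logical view and physical layout that the abstract describes.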
Introducing remarks upon the analysis of computer systems performance

International Nuclear Information System (INIS)

Baum, D.

1980-05-01

Some of the basic ideas of analytical techniques to study the behaviour of computer systems are presented. Single systems as well as networks of computers are viewed as stochastic dynamical systems which may be modelled by queueing networks. Therefore this report primarily serves as an introduction to probabilistic methods for qualitative analysis of systems. It is supplemented by an application example of Chandy's collapsing method. (orig.)
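As a reminder of what the simplest queueing model yields, the standard steady-state results for a single M/M/1 station are given below. The choice of M/M/1 is an assumption made for illustration, since the abstract does not say which models the report covers. Here \lambda is the arrival rate, \mu the service rate, \rho the utilization, L the mean number of jobs in the system, and W the mean time a job spends in the system.

```latex
\rho = \frac{\lambda}{\mu}, \qquad
L = \frac{\rho}{1-\rho}, \qquad
W = \frac{1}{\mu-\lambda}, \qquad
L = \lambda W \quad (\rho < 1)
```

For example, with \lambda = 8 jobs/s and \mu = 10 jobs/s, the utilization is \rho = 0.8, the mean population is L = 0.8/0.2 = 4 jobs, and the mean response time is W = 1/(10-8) = 0.5 s, consistent with Little's law since \lambda W = 8 \times 0.5 = 4.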
Towards Building a High Performance Spatial Query System for Large Scale Medical Imaging Data.

Science.gov (United States)

Aji, Ablimit; Wang, Fusheng; Saltz, Joel H

2012-11-06

Support of high performance queries on large volumes of scientific spatial data is becoming increasingly important in many applications. This growth is driven by not only geospatial problems in numerous fields, but also emerging scientific applications that are increasingly data- and compute-intensive. For example, digital pathology imaging has become an emerging field during the past decade, where examination of high resolution images of human tissue specimens enables more effective diagnosis, prediction and treatment of diseases. Systematic analysis of large-scale pathology images generates tremendous amounts of spatially derived quantifications of micro-anatomic objects, such as nuclei, blood vessels, and tissue regions. Analytical pathology imaging provides high potential to support image based computer aided diagnosis. One major requirement for this is effective querying of such enormous amount of data with fast response, which is faced with two major challenges: the "big data" challenge and the high computation complexity. In this paper, we present our work towards building a high performance spatial query system for querying massive spatial data on MapReduce. Our framework takes an on demand index building approach for processing spatial queries and a partition-merge approach for building parallel spatial query pipelines, which fits nicely with the computing model of MapReduce. We demonstrate our framework on supporting multi-way spatial joins for algorithm evaluation and nearest neighbor queries for microanatomic objects. To reduce query response time, we propose cost based query optimization to mitigate the effect of data skew. Our experiments show that the framework can efficiently support complex analytical spatial queries on MapReduce.
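The partition-merge idea can be sketched on a single machine. The code below is not the authors' framework and uses no MapReduce runtime: it assumes axis-aligned rectangles given as `(x0, y0, x1, y1)` tuples, a fixed square tile size, and hypothetical function names. The partition step plays the role of the map phase, the per-tile pairwise test is the local join, and the final set union is the merge step that removes pairs found in more than one tile.

```python
from collections import defaultdict
from itertools import product


def tiles_for(rect, tile):
    """Yield every grid tile that a rectangle (x0, y0, x1, y1) overlaps."""
    x0, y0, x1, y1 = rect
    xs = range(int(x0 // tile), int(x1 // tile) + 1)
    ys = range(int(y0 // tile), int(y1 // tile) + 1)
    return product(xs, ys)


def intersects(a, b):
    """Closed-interval overlap test for two axis-aligned rectangles."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def partition(rects, tile):
    """Map step: assign each rectangle index to all tiles it touches."""
    buckets = defaultdict(list)
    for idx, rect in enumerate(rects):
        for key in tiles_for(rect, tile):
            buckets[key].append(idx)
    return buckets


def spatial_join(left, right, tile=10.0):
    """Join per tile, then merge; the set removes cross-tile duplicates."""
    left_parts = partition(left, tile)
    right_parts = partition(right, tile)
    matches = set()
    for key, left_ids in left_parts.items():
        for i in left_ids:
            for j in right_parts.get(key, ()):
                if intersects(left[i], right[j]):
                    matches.add((i, j))
    return sorted(matches)


if __name__ == "__main__":
    nuclei = [(0, 0, 5, 5), (20, 20, 25, 25)]
    regions = [(4, 4, 12, 12), (30, 30, 31, 31), (8, 8, 22, 22)]
    print(spatial_join(nuclei, regions))
```

Two intersecting rectangles always share the tile that contains a common point, so replicating each rectangle into every tile it overlaps loses no matches. A skewed data set would leave some tiles far more crowded than others, which is the effect the abstract's cost-based optimization is meant to mitigate.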
THE RELATION OF HIGH-PERFORMANCE WORK SYSTEMS WITH EMPLOYEE INVOLVEMENT

Directory of Open Access Journals (Sweden)

Bilal AFSAR

2010-01-01

The basic aim of high performance work systems is to enable employees to exercise decision making, leading to flexibility, innovation, improvement and skill sharing. By facilitating the development of high performance work systems we help organizations make continuous improvement a way of life. The notion of a high-performance work system (HPWS) constitutes a claim that there exists a system of work practices for core workers in an organisation that leads in some way to superior performance. This article will discuss the relation that HPWS has with the improvement of firms’ performance and high involvement of the employees.
High performance statistical computing with parallel R: applications to biology and climate modelling

International Nuclear Information System (INIS)

Samatova, Nagiza F; Branstetter, Marcia; Ganguly, Auroop R; Hettich, Robert; Khan, Shiraj; Kora, Guruprasad; Li, Jiangtian; Ma, Xiaosong; Pan, Chongle; Shoshani, Arie; Yoginath, Srikanth

2006-01-01

Ultrascale computing and high-throughput experimental technologies have enabled the production of scientific data about complex natural phenomena. With this opportunity, comes a new problem - the massive quantities of data so produced. Answers to fundamental questions about the nature of those phenomena remain largely hidden in the produced data. The goal of this work is to provide a scalable high performance statistical data analysis framework to help scientists perform interactive analyses of these raw data to extract knowledge. Towards this goal we have been developing an open source parallel statistical analysis package, called Parallel R, that lets scientists employ a wide range of statistical analysis routines on high performance shared and distributed memory architectures without having to deal with the intricacies of parallelizing these routines
Energy efficient distributed computing systems

CERN Document Server

Lee, Young-Choon

2012-01-01

The energy consumption issue in distributed computing systems raises various monetary, environmental and system performance concerns. Electricity consumption in the US doubled from 2000 to 2005. From a financial and environmental standpoint, reducing the consumption of electricity is important, yet these reforms must not lead to performance degradation of the computing systems. These contradicting constraints create a suite of complex problems that need to be resolved in order to lead to 'greener' distributed computing systems. This book brings together a group of outsta
The Use of High Performance Computing (HPC) to Strengthen the Development of Army Systems

Science.gov (United States)

2011-11-01

changes in what the warfighter wants – in the middle of an acquisition cycle such changes create havoc in terms of delays, recycling of the research...A little bit later the first personal computers (PCs) came on the market, mostly as curiosities. The operating systems were either MS-DOS or CP/M
Towards high performance processing in modern Java-based control systems

International Nuclear Information System (INIS)

Misiowiec, M.; Buczak, W.; Buttner, M.

2012-01-01

CERN controls software is often developed on a Java foundation. Some systems carry out a combination of data, network and processor intensive tasks within strict time limits. Hence, there is a demand for high performing, quasi real time solutions. The system must handle tens of thousands of data samples every second, along its three tiers, applying complex computations throughout. To accomplish the goal, a deep understanding of multi-threading, memory management and inter-process communication was required. There are unexpected traps hidden behind an excessive use of 64 bit memory or the severe impact of modern garbage collectors on the processing flow. Tuning the JVM configuration significantly affects the execution of the code. Even more important is the number of threads and the data structures used between them. Accurately dividing work into independent tasks might boost system performance. Thorough profiling with dedicated tools helped understand the bottlenecks and choose algorithmically optimal solutions. Different virtual machines were tested, in a variety of setups and garbage collection options. The overall work made it possible to discover the actual hard limits of the whole setup. We present this process of designing a challenging system in view of the characteristics and limitations of the contemporary Java run-time environment. (authors)
A data acquisition computer for high energy physics applications DAFNE:- hardware manual

International Nuclear Information System (INIS)

Barlow, J.; Seller, P.; De-An, W.

1983-07-01

A high performance stand alone computer system based on the Motorola 68000 micro processor has been built at the Rutherford Appleton Laboratory. Although the design was strongly influenced by the requirement to provide a compact data acquisition computer for the high energy physics environment, the system is sufficiently general to find applications in a wider area. It provides colour graphics and tape and disc storage together with access to CAMAC systems. This report is the hardware manual of the data acquisition computer, DAFNE (Data Acquisition For Nuclear Experiments), and as such contains a full description of the hardware structure of the computer system. (author)
High performance computer code for molecular dynamics simulations

International Nuclear Information System (INIS)

Levay, I.; Toekesi, K.

2007-01-01

Complete text of publication follows. Molecular Dynamics (MD) simulation is a widely used technique for modeling complicated physical phenomena. Since 2005 we have been developing an MD simulation code for PCs. The code is written in the object-oriented C++ programming language. The aim of our work is twofold: a) to develop a fast computer code for the study of the random walk of guest atoms in a Be crystal, and b) 3-dimensional (3D) visualization of the particles' motion. In this case we mimic the motion of the guest atoms in the crystal (diffusion-type motion) and the motion of atoms in the crystal lattice (crystal deformation). Nowadays, it is common to use graphics devices for computationally intensive problems. There are several ways to use this extreme processing performance, but programming these devices has never been as easy as it is now. The CUDA (Compute Unified Device Architecture) platform introduced by nVidia Corporation in 2007 is very useful for every processor-hungry application. A unified-architecture GPU includes 96-128 or more stream processors, so the raw calculation performance is 576(!) GFLOPS. It is ten times faster than the fastest dual-core CPU [Fig.1]. Our improved MD simulation software uses this new technology, which speeds up our software: the code runs 10 times faster in the critical calculation code segment. Although the GPU is a very powerful tool, it has a strongly parallel structure. This means that we have to create an algorithm that works on several processors without deadlock. Our code currently uses 256 threads and shared and constant on-chip memory instead of global memory, which is 100 times slower than the others. It is possible to implement the whole algorithm on the GPU, so we do not need to download and upload the data in every iteration. For maximal throughput, every thread runs the same instructions
Investigating the effectiveness of many-core network processors for high performance cyber protection systems. Part I, FY2011.

Energy Technology Data Exchange (ETDEWEB)

Wheeler, Kyle Bruce; Naegle, John Hunt; Wright, Brian J.; Benner, Robert E., Jr.; Shelburg, Jeffrey Scott; Pearson, David Benjamin; Johnson, Joshua Alan; Onunkwo, Uzoma A.; Zage, David John; Patel, Jay S.

2011-09-01

This report documents our first-year efforts to address the use of many-core processors for high performance cyber protection. As the demands grow for higher bandwidth (beyond 1 Gbit/s) on network connections, the need to provide faster and more efficient solutions to cyber security grows. Fortunately, in recent years, the development of many-core network processors has seen increased interest. Prior working experience with many-core processors has led us to investigate their effectiveness for cyber protection tools, with particular emphasis on high performance firewalls. Although advanced algorithms for smarter cyber protection of high-speed network traffic are being developed, these advanced analysis techniques require significantly more computational capabilities than static techniques. Moreover, many locations where cyber protections are deployed have limited power, space and cooling resources. This makes the use of traditionally large computing systems impractical for the front-end systems that process large network streams; hence the drive for this study, which could potentially yield a highly reconfigurable and rapidly scalable solution.
An Interactive, Web-based High Performance Modeling Environment for Computational Epidemiology.

Science.gov (United States)

Deodhar, Suruchi; Bisset, Keith R; Chen, Jiangzhuo; Ma, Yifei; Marathe, Madhav V

2014-07-01

We present an integrated interactive modeling environment to support public health epidemiology. The environment combines a high resolution individual-based model with a user-friendly web-based interface that allows analysts to access the models and the analytics back-end remotely from a desktop or a mobile device. The environment is based on a loosely-coupled service-oriented architecture that allows analysts to explore various counterfactual scenarios. As the modeling tools for public health epidemiology are getting more sophisticated, it is becoming increasingly hard for non-computational scientists to effectively use the systems that incorporate such models. Thus an important design consideration for an integrated modeling environment is to improve ease of use such that experimental simulations can be driven by the users. This is achieved by designing intuitive and user-friendly interfaces that allow users to design and analyze a computational experiment and steer the experiment based on the state of the system. A key feature of a system that supports this design goal is the ability to start, stop, pause and roll-back the disease propagation and intervention application process interactively. An analyst can access the state of the system at any point in time and formulate dynamic interventions based on additional information obtained through state assessment. In addition, the environment provides automated services for experiment set-up and management, thus reducing the overall time for conducting end-to-end experimental studies. We illustrate the applicability of the system by describing computational experiments based on realistic pandemic planning scenarios. The experiments are designed to demonstrate the system's capability and enhanced user productivity.
InfoMall: An Innovative Strategy for High-Performance Computing and Communications Applications Development.

Science.gov (United States)

Mills, Kim; Fox, Geoffrey

1994-01-01

Describes the InfoMall, a program led by the Northeast Parallel Architectures Center (NPAC) at Syracuse University (New York). The InfoMall features a partnership of approximately 24 organizations offering linked programs in High Performance Computing and Communications (HPCC) technology integration, software development, marketing, education and…
A C++11 implementation of arbitrary-rank tensors for high-performance computing

Science.gov (United States)

Aragón, Alejandro M.

2014-11-01

This article discusses an efficient implementation of tensors of arbitrary rank by using some of the idioms introduced by the recently published C++ ISO Standard (C++11). With the aim of providing a basic building block for high-performance computing, a single Array class template is carefully crafted, from which vectors, matrices, and even higher-order tensors can be created. An expression template facility is also built around the array class template to provide convenient mathematical syntax. As a result, by using templates, an extra high-level layer is added to the C++ language when dealing with algebraic objects and their operations, without compromising performance. The implementation is tested running on both CPU and GPU.
High performance parallel I/O

CERN Document Server

Prabhat

2014-01-01

Gain Critical Insight into the Parallel I/O Ecosystem. Parallel I/O is an integral component of modern high performance computing (HPC), especially in storing and processing very large datasets to facilitate scientific discovery. Revealing the state of the art in this field, High Performance Parallel I/O draws on insights from leading practitioners, researchers, software architects, developers, and scientists who shed light on the parallel I/O ecosystem. The first part of the book explains how large-scale HPC facilities scope, configure, and operate systems, with an emphasis on choices of I/O har
High performance data transfer

Science.gov (United States)

Cottrell, R.; Fang, C.; Hanushevsky, A.; Kreuger, W.; Yang, W.

2017-10-01

The exponentially increasing need for high speed data transfer is driven by big data and cloud computing, together with the needs of data intensive science, High Performance Computing (HPC), defense, the oil and gas industry, etc. We report on the Zettar ZX software. This has been developed since 2013 to meet these growing needs by providing high performance data transfer and encryption in a scalable, balanced, easy to deploy and use way while minimizing power and space utilization. In collaboration with several commercial vendors, Proofs of Concept (PoC) consisting of clusters have been put together using off-the-shelf components to test the ZX scalability and ability to balance services using multiple cores and links. The PoCs are based on SSD flash storage that is managed by a parallel file system. Each cluster occupies 4 rack units. Using the PoCs, between clusters we have achieved almost 200Gbps memory to memory over two 100Gbps links, and 70Gbps parallel file to parallel file with encryption over a 5000 mile 100Gbps link.
FPGAs in High Performance Computing: Results from Two LDRD Projects.

Energy Technology Data Exchange (ETDEWEB)

Underwood, Keith D; Ulmer, Craig D.; Thompson, David; Hemmert, Karl Scott

2006-11-01

Field programmable gate arrays (FPGAs) have been used as alternative computational devices for over a decade; however, they have not been used for traditional scientific computing due to their perceived lack of floating-point performance. In recent years, there has been a surge of interest in alternatives to traditional microprocessors for high performance computing. Sandia National Labs began two projects to determine whether FPGAs would be a suitable alternative to microprocessors for high performance scientific computing and, if so, how they should be integrated into the system. We present results that indicate that FPGAs could have a significant impact on future systems. FPGAs have the potential to have order of magnitude levels of performance wins on several key algorithms; however, there are serious questions as to whether the system integration challenge can be met. Furthermore, there remain challenges in FPGA programming and system level reliability when using FPGA devices. Acknowledgment: Arun Rodrigues provided valuable support and assistance in the use of the Structural Simulation Toolkit within an FPGA context. Curtis Janssen and Steve Plimpton provided valuable insights into the workings of two Sandia applications (MPQC and LAMMPS, respectively).
STEMsalabim: A high-performance computing cluster friendly code for scanning transmission electron microscopy image simulations of thin specimens

International Nuclear Information System (INIS)

Oelerich, Jan Oliver; Duschek, Lennart; Belz, Jürgen; Beyer, Andreas; Baranovskii, Sergei D.; Volz, Kerstin

2017-01-01

Highlights: • We present STEMsalabim, a modern implementation of the multislice algorithm for simulation of STEM images. • Our package is highly parallelizable on high-performance computing clusters, combining shared and distributed memory architectures. • With STEMsalabim, computationally and memory expensive STEM image simulations can be carried out within reasonable time. - Abstract: We present a new multislice code for the computer simulation of scanning transmission electron microscope (STEM) images based on the frozen lattice approximation. Unlike existing software packages, the code is optimized to perform well on highly parallelized computing clusters, combining distributed and shared memory architectures. This enables efficient calculation of large lateral scanning areas of the specimen within the frozen lattice approximation and fine-grained sweeps of parameter space.
ClustalXeed: a GUI-based grid computation version for high performance and terabyte size multiple sequence alignment

Directory of Open Access Journals (Sweden)

Kim Taeho

2010-09-01

Abstract. Background: There is an increasing demand to assemble and align large-scale biological sequence data sets. The commonly used multiple sequence alignment programs are still limited in their ability to handle very large amounts of sequences because the system lacks a scalable high-performance computing (HPC) environment with a greatly extended data storage capacity. Results: We designed ClustalXeed, a software system for multiple sequence alignment with incremental improvements over previous versions of the ClustalX and ClustalW-MPI software. The primary advantage of ClustalXeed over other multiple sequence alignment software is its ability to align a large family of protein or nucleic acid sequences. To solve the conventional memory-dependency problem, ClustalXeed uses both physical random access memory (RAM) and a distributed file-allocation system for distance matrix construction and pair-align computation. The computation efficiency of the disk-storage system was markedly improved by implementing an efficient load-balancing algorithm, called "idle node-seeking task algorithm" (INSTA). The new editing option and the graphical user interface (GUI) provide ready access to a parallel-computing environment for users who seek fast and easy alignment of large DNA and protein sequence sets. Conclusions: ClustalXeed can now compute a large volume of biological sequence data sets, which were not tractable in any other parallel or single MSA program. The main developments include: 1) the ability to tackle larger sequence alignment problems than possible with previous systems through markedly improved storage-handling capabilities; 2) an efficient task load-balancing algorithm, INSTA, which improves overall processing times for multiple sequence alignment with input sequences of non-uniform length; 3) support for both single PC and distributed cluster systems.
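The abstract names the INSTA load balancer but does not describe how it works. The sketch below therefore shows only the general idea its name suggests, under explicit assumptions: tasks of non-uniform cost are handed to whichever worker becomes idle first, and this is compared with a fixed round-robin split. The function names and the cost model are hypothetical, and the code is a timing simulation, not the ClustalXeed implementation.

```python
import heapq


def simulate_idle_pull(task_costs, n_workers):
    """Give each task to the worker that becomes idle first.

    Returns (makespan, assignment), where assignment maps worker -> task ids.
    """
    if n_workers < 1:
        raise ValueError("need at least one worker")
    idle_at = [(0.0, worker) for worker in range(n_workers)]
    heapq.heapify(idle_at)
    assignment = {worker: [] for worker in range(n_workers)}
    for task_id, cost in enumerate(task_costs):
        free_time, worker = heapq.heappop(idle_at)  # earliest idle worker
        assignment[worker].append(task_id)
        heapq.heappush(idle_at, (free_time + cost, worker))
    makespan = max(finish for finish, _ in idle_at)
    return makespan, assignment


def simulate_static(task_costs, n_workers):
    """Baseline: round-robin assignment fixed before any task runs."""
    loads = [0.0] * n_workers
    for task_id, cost in enumerate(task_costs):
        loads[task_id % n_workers] += cost
    return max(loads)


if __name__ == "__main__":
    # One long alignment job followed by several short ones (assumed costs).
    costs = [9, 1, 1, 1, 1, 1, 1, 1]
    dynamic_makespan, plan = simulate_idle_pull(costs, 2)
    print("idle-pull:", dynamic_makespan, plan)
    print("static:   ", simulate_static(costs, 2))
```

With these costs the idle-pull schedule finishes when the long task does, while the round-robin split keeps queuing short tasks behind it, which is the kind of imbalance that inputs of non-uniform length produce.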
SCALE: A modular code system for performing standardized computer analyses for licensing evaluation

International Nuclear Information System (INIS)

1997-03-01

This Manual represents Revision 5 of the user documentation for the modular code system referred to as SCALE. The history of the SCALE code system dates back to 1969 when the current Computational Physics and Engineering Division at Oak Ridge National Laboratory (ORNL) began providing the transportation package certification staff at the U.S. Atomic Energy Commission with computational support in the use of the new KENO code for performing criticality safety assessments with the statistical Monte Carlo method. From 1969 to 1976 the certification staff relied on the ORNL staff to assist them in the correct use of codes and data for criticality, shielding, and heat transfer analyses of transportation packages. However, the certification staff learned that, with only occasional use of the codes, it was difficult to become proficient in performing the calculations often needed for an independent safety review. Thus, shortly after the move of the certification staff to the U.S. Nuclear Regulatory Commission (NRC), the NRC staff proposed the development of an easy-to-use analysis system that provided the technical capabilities of the individual modules with which they were familiar. With this proposal, the concept of the Standardized Computer Analyses for Licensing Evaluation (SCALE) code system was born. This manual covers an array of modules written for the SCALE package, consisting of drivers, system libraries, cross section and materials properties libraries, input/output routines, storage modules, and help files
SCALE: A modular code system for performing standardized computer analyses for licensing evaluation

Energy Technology Data Exchange (ETDEWEB)

NONE

1997-03-01

This Manual represents Revision 5 of the user documentation for the modular code system referred to as SCALE. The history of the SCALE code system dates back to 1969 when the current Computational Physics and Engineering Division at Oak Ridge National Laboratory (ORNL) began providing the transportation package certification staff at the U.S. Atomic Energy Commission with computational support in the use of the new KENO code for performing criticality safety assessments with the statistical Monte Carlo method. From 1969 to 1976 the certification staff relied on the ORNL staff to assist them in the correct use of codes and data for criticality, shielding, and heat transfer analyses of transportation packages. However, the certification staff learned that, with only occasional use of the codes, it was difficult to become proficient in performing the calculations often needed for an independent safety review. Thus, shortly after the move of the certification staff to the U.S. Nuclear Regulatory Commission (NRC), the NRC staff proposed the development of an easy-to-use analysis system that provided the technical capabilities of the individual modules with which they were familiar. With this proposal, the concept of the Standardized Computer Analyses for Licensing Evaluation (SCALE) code system was born. This manual covers an array of modules written for the SCALE package, consisting of drivers, system libraries, cross section and materials properties libraries, input/output routines, storage modules, and help files.
High energy physics computing in Japan

International Nuclear Information System (INIS)

Watase, Yoshiyuki

1989-01-01

A brief overview of the computing provision for high energy physics in Japan is presented. Most of the computing power for high energy physics is concentrated in KEK. Here there are two large scale systems: one providing a general computing service including vector processing and the other dedicated to TRISTAN experiments. Each university group has a smaller sized mainframe or VAX system to facilitate both their local computing needs and the remote use of the KEK computers through a network. The large computer system for the TRISTAN experiments is described. An overview of a prospective future large facility is also given. (orig.)
Opportunity for Realizing Ideal Computing System using Cloud Computing Model

OpenAIRE

Sreeramana Aithal; Vaikunth Pai T

2017-01-01

An ideal computing system is a computing system with ideal characteristics. The major components and their performance characteristics of such hypothetical system can be studied as a model with predicted input, output, system and environmental characteristics using the identified objectives of computing which can be used in any platform, any type of computing system, and for application automation, without making modifications in the form of structure, hardware, and software coding by an exte...
Reliable computer systems.

Science.gov (United States)

Wear, L L; Pinkert, J R

1993-11-01

In this article, we looked at some decisions that apply to the design of reliable computer systems. We began with a discussion of several terms such as testability, then described some systems that call for highly reliable hardware and software. The article concluded with a discussion of methods that can be used to achieve higher reliability in computer systems. Reliability and fault tolerance in computers probably will continue to grow in importance. As more and more systems are computerized, people will want assurances about the reliability of these systems, and their ability to work properly even when sub-systems fail.
Applying Machine Learning and High Performance Computing to Water Quality Assessment and Prediction

Directory of Open Access Journals (Sweden)

Ruijian Zhang

2017-12-01

Water quality assessment and prediction is an increasingly important issue. Traditional approaches either take a long time or can only perform assessment. In this research, by applying a machine learning algorithm to water attribute data collected over a long period, we generate a decision tree that can predict a future day's water quality easily and efficiently. The idea is to combine the traditional approaches with computer algorithms. Using machine learning algorithms, the assessment of water quality will be far more efficient, and by generating the decision tree, the prediction will be quite accurate. The drawback of machine learning modeling is that execution takes a long time, especially when we employ a more accurate but more time-consuming clustering algorithm. Therefore, we applied a high performance computing (HPC) system to deal with this problem. Up to now, the pilot experiments have achieved very promising preliminary results. The visualized water quality assessment and prediction obtained from this project will be published on an interactive website so that the public and environmental managers can use the information for their decision making.
An Application-Based Performance Evaluation of NASA's Nebula Cloud Computing Platform

Science.gov (United States)

Saini, Subhash; Heistand, Steve; Jin, Haoqiang; Chang, Johnny; Hood, Robert T.; Mehrotra, Piyush; Biswas, Rupak

2012-01-01

The high performance computing (HPC) community has shown tremendous interest in exploring cloud computing as it promises high potential. In this paper, we examine the feasibility, performance, and scalability of production quality scientific and engineering applications of interest to NASA on NASA's cloud computing platform, called Nebula, hosted at Ames Research Center. This work represents the comprehensive evaluation of Nebula using NUTTCP, HPCC, NPB, I/O, and MPI function benchmarks as well as four applications representative of the NASA HPC workload. Specifically, we compare Nebula performance on some of these benchmarks and applications to that of NASA's Pleiades supercomputer, a traditional HPC system. We also investigate the impact of virtIO and jumbo frames on interconnect performance. Overall results indicate that on Nebula (i) virtIO and jumbo frames improve network bandwidth by a factor of 5x, (ii) there is a significant virtualization layer overhead of about 10% to 25%, (iii) write performance is lower by a factor of 25x, (iv) latency for short MPI messages is very high, and (v) overall performance is 15% to 48% lower than that on Pleiades for NASA HPC applications. We also comment on the usability of the cloud platform.
Toward High Performance in Industrial Refrigeration Systems

DEFF Research Database (Denmark)

Thybo, C.; Izadi-Zamanabadi, Roozbeh; Niemann, H.

2002-01-01

Achieving high performance in complex industrial systems requires information manipulation at different system levels. The paper shows how different models of same subsystems, but using different quality of information/data, are used for fault diagnosis as well as robust control design...
Towards high performance in industrial refrigeration systems

DEFF Research Database (Denmark)

Thybo, C.; Izadi-Zamanabadi, R.; Niemann, Hans Henrik

2002-01-01

Achieving high performance in complex industrial systems requires information manipulation at different system levels. The paper shows how different models of same subsystems, but using different quality of information/data, are used for fault diagnosis as well as robust control design...
SURE: a system of computer codes for performing sensitivity/uncertainty analyses with the RELAP code

International Nuclear Information System (INIS)

Bjerke, M.A.

1983-02-01

A package of computer codes has been developed to perform a nonlinear uncertainty analysis on transient thermal-hydraulic systems which are modeled with the RELAP computer code. Using an uncertainty around the analyses of experiments in the PWR-BDHT Separate Effects Program at Oak Ridge National Laboratory. The use of FORTRAN programs running interactively on the PDP-10 computer has made the system very easy to use and provided great flexibility in the choice of processing paths. Several experiments simulating a loss-of-coolant accident in a nuclear reactor have been successfully analyzed. It has been shown that the system can be automated easily to further simplify its use and that the conversion of the entire system to a base code other than RELAP is possible
Computer-controlled detection system for high-precision isotope ratio measurements

International Nuclear Information System (INIS)

McCord, B.R.; Taylor, J.W.

1986-01-01

In this paper the authors describe a detection system for high-precision isotope ratio measurements. In this new system, the requirement for a ratioing digital voltmeter has been eliminated, and a standard digital voltmeter interfaced to a computer is employed. Instead of measuring the ratio of the two steadily increasing output voltages simultaneously, the digital voltmeter alternately samples the outputs at a precise rate over a certain period of time. The data are sent to the computer which calculates the rate of charge of each amplifier and divides the two rates to obtain the isotopic ratio. These results simulate a coincident measurement of the output of both integrators. The charge rate is calculated by using a linear regression method, and the standard error of the slope gives a measure of the stability of the system at the time the measurement was taken
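The calculation described above can be sketched in a few lines. This is a minimal illustration, assuming each integrator output is available as a list of (time, voltage) samples taken alternately; the function names and the synthetic, noise-free data are hypothetical and are not taken from the paper.

```python
def slope_with_stderr(samples):
    """Least-squares slope of voltage against time, with the standard error of the slope."""
    n = len(samples)
    t_mean = sum(t for t, _ in samples) / n
    v_mean = sum(v for _, v in samples) / n
    sxx = sum((t - t_mean) ** 2 for t, _ in samples)
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in samples)
    slope = sxy / sxx
    intercept = v_mean - slope * t_mean
    residual_ss = sum((v - (intercept + slope * t)) ** 2 for t, v in samples)
    stderr = (residual_ss / (n - 2) / sxx) ** 0.5
    return slope, stderr

def isotope_ratio(samples_a, samples_b):
    """Ratio of the two charge rates; also returns each slope's standard error."""
    rate_a, err_a = slope_with_stderr(samples_a)
    rate_b, err_b = slope_with_stderr(samples_b)
    return rate_a / rate_b, err_a, err_b

# Synthetic example: the voltmeter reads integrator A at even seconds and
# integrator B at odd seconds, so the two outputs are never sampled together.
samples_a = [(t, 0.10 + 2.0 * t) for t in range(0, 10, 2)]
samples_b = [(t, 0.05 + 0.5 * t) for t in range(1, 10, 2)]
ratio, err_a, err_b = isotope_ratio(samples_a, samples_b)
print(ratio)
```

Because each rate is fitted over the whole sampling period, the ratio of slopes does not require the two outputs to be read at the same instant, which is why alternate sampling can stand in for a coincident measurement.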
ArrayBridge: Interweaving declarative array processing with high-performance computing

Energy Technology Data Exchange (ETDEWEB)

Xing, Haoyuan [The Ohio State Univ., Columbus, OH (United States); Floratos, Sofoklis [The Ohio State Univ., Columbus, OH (United States); Blanas, Spyros [The Ohio State Univ., Columbus, OH (United States); Byna, Suren [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Prabhat, Prabhat [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Wu, Kesheng [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Brown, Paul [Paradigm4, Inc., Waltham, MA (United States)

2017-05-04

Scientists are increasingly turning to datacenter-scale computers to produce and analyze massive arrays. Despite decades of database research that extols the virtues of declarative query processing, scientists still write, debug and parallelize imperative HPC kernels even for the most mundane queries. This impedance mismatch has been partly attributed to the cumbersome data loading process; in response, the database community has proposed in situ mechanisms to access data in scientific file formats. Scientists, however, desire more than a passive access method that reads arrays from files. This paper describes ArrayBridge, a bi-directional array view mechanism for scientific file formats, that aims to make declarative array manipulations interoperable with imperative file-centric analyses. Our prototype implementation of ArrayBridge uses HDF5 as the underlying array storage library and seamlessly integrates into the SciDB open-source array database system. In addition to fast querying over external array objects, ArrayBridge produces arrays in the HDF5 file format just as easily as it can read from it. ArrayBridge also supports time travel queries from imperative kernels through the unmodified HDF5 API, and automatically deduplicates between array versions for space efficiency. Our extensive performance evaluation in NERSC, a large-scale scientific computing facility, shows that ArrayBridge exhibits statistically indistinguishable performance and I/O scalability to the native SciDB storage engine.
Design and Implementation of High-Performance GIS Dynamic Objects Rendering Engine

Science.gov (United States)

Zhong, Y.; Wang, S.; Li, R.; Yun, W.; Song, G.

2017-12-01

Spatio-temporal dynamic visualization is more vivid than static visualization. It is important to use dynamic visualization techniques to reveal the variation process and trend of geographical phenomena vividly and comprehensively. Dealing with the challenges of dynamic visualization of both 2D and 3D spatial dynamic targets, especially across different spatial data types, requires a high-performance GIS dynamic objects rendering engine. The main approach for improving the rendering engine with vast dynamic targets relies on key technologies of high-performance GIS, including memory computing, parallel computing, GPU computing and high-performance algorithms. In this study, a high-performance GIS dynamic objects rendering engine is designed and implemented to solve the problem based on hybrid acceleration techniques. The high-performance GIS rendering engine combines GPU computing, OpenGL technology, and high-performance algorithms with the advantage of 64-bit memory computing. It processes 2D and 3D dynamic target data efficiently and runs smoothly with vast dynamic target data. The prototype system of the high-performance GIS dynamic objects rendering engine is developed based on SuperMap GIS iObjects. The experiments are designed for large-scale spatial data visualization, and the results show that the high-performance GIS dynamic objects rendering engine has the advantage of high performance. Rendering two-dimensional and three-dimensional dynamic objects is 20 times faster on the GPU than on the CPU.
Real-time image reconstruction and display system for MRI using a high-speed personal computer.

Science.gov (United States)

Haishi, T; Kose, K

1998-09-01

A real-time NMR image reconstruction and display system was developed using a high-speed personal computer and optimized for the 32-bit multitasking Microsoft Windows 95 operating system. The system was operated at various CPU clock frequencies by changing the motherboard clock frequency and the processor/bus frequency ratio. When the Pentium CPU was used at the 200 MHz clock frequency, the reconstruction time for one 128 x 128 pixel image was 48 ms and that for the image display on the enlarged 256 x 256 pixel window was about 8 ms. NMR imaging experiments were performed with three fast imaging sequences (FLASH, multishot EPI, and one-shot EPI) to demonstrate the ability of the real-time system. It was concluded that in most cases, high-speed PC would be the best choice for the image reconstruction and display system for real-time MRI. Copyright 1998 Academic Press.
Implementing Molecular Dynamics for Hybrid High Performance Computers - 1. Short Range Forces

International Nuclear Information System (INIS)

Brown, W. Michael; Wang, Peng; Plimpton, Steven J.; Tharrington, Arnold N.

2011-01-01

The use of accelerators such as general-purpose graphics processing units (GPGPUs) has become popular in scientific computing applications due to their low cost, impressive floating-point capabilities, high memory bandwidth, and low electrical power requirements. Hybrid high performance computers, machines with more than one type of floating-point processor, are now becoming more prevalent due to these advantages. In this work, we discuss several important issues in porting a large molecular dynamics code for use on parallel hybrid machines: (1) choosing a hybrid parallel decomposition that works on central processing units (CPUs) with distributed memory and accelerator cores with shared memory, (2) minimizing the amount of code that must be ported for efficient acceleration, (3) utilizing the available processing power from both many-core CPUs and accelerators, and (4) choosing a programming model for acceleration. We present our solution to each of these issues for short-range force calculation in the molecular dynamics package LAMMPS. We describe algorithms for efficient short-range force calculation on hybrid high performance machines. We describe a new approach for dynamic load balancing of work between CPU and accelerator cores. We describe the Geryon library that allows a single code to compile with both CUDA and OpenCL for use on a variety of accelerators. Finally, we present results on a parallel test cluster containing 32 Fermi GPGPUs and 180 CPU cores.
A Modular Environment for Geophysical Inversion and Run-time Autotuning using Heterogeneous Computing Systems

Science.gov (United States)

Myre, Joseph M.

Heterogeneous computing systems have recently come to the forefront of the High-Performance Computing (HPC) community's interest. HPC computer systems that incorporate special purpose accelerators, such as Graphics Processing Units (GPUs), are said to be heterogeneous. Large scale heterogeneous computing systems have consistently ranked highly on the Top500 list since the beginning of the heterogeneous computing trend. By using heterogeneous computing systems that consist of both general purpose processors and special- purpose accelerators, the speed and problem size of many simulations could be dramatically increased. Ultimately this results in enhanced simulation capabilities that allows, in some cases for the first time, the execution of parameter space and uncertainty analyses, model optimizations, and other inverse modeling techniques that are critical for scientific discovery and engineering analysis. However, simplifying the usage and optimization of codes for heterogeneous computing systems remains a challenge. This is particularly true for scientists and engineers for whom understanding HPC architectures and undertaking performance analysis may not be primary research objectives. To enable scientists and engineers to remain focused on their primary research objectives, a modular environment for geophysical inversion and run-time autotuning on heterogeneous computing systems is presented. This environment is composed of three major components: 1) CUSH---a framework for reducing the complexity of programming heterogeneous computer systems, 2) geophysical inversion routines which can be used to characterize physical systems, and 3) run-time autotuning routines designed to determine configurations of heterogeneous computing systems in an attempt to maximize the performance of scientific and engineering codes. Using three case studies, a lattice-Boltzmann method, a non-negative least squares inversion, and a finite-difference fluid flow method, it is shown that
Expansion of the TFTR neutral beam computer system

International Nuclear Information System (INIS)

McEnerney, J.; Chu, J.; Davis, S.; Fitzwater, J.; Fleming, G.; Funk, P.; Hirsch, J.; Lagin, L.; Locasak, V.; Randerson, L.; Schechtman, N.; Silber, K.; Skelly, G.; Stark, W.

1992-01-01

Previous TFTR Neutral Beam computing support was based primarily on an Encore Concept 32/8750 computer within the TFTR Central Instrumentation, Control and Data Acquisition System (CICADA). The resources of this machine were 90% utilized during a 2.5 minute duty cycle. Both interactive and automatic processes were supported, with interactive response suffering at lower priority. Further, there were additional computing requirements and no cost effective path for expansion within the Encore framework. Two elements provided a solution to these problems: improved price performance for computing and a high speed bus link to the SELBUS. The purchase of a Sun SPARCstation and a VME/SELBUS bus link, allowed offloading the automatic processing to the workstation. This paper describes the details of the system including the performance of the bus link and Sun SPARCstation, raw data acquisition and data server functions, application software conversion issues, and experiences with the UNIX operating system in the mixed platform environment
High-performance implementation of Chebyshev filter diagonalization for interior eigenvalue computations

Energy Technology Data Exchange (ETDEWEB)

Pieper, Andreas [Ernst-Moritz-Arndt-Universität Greifswald (Germany); Kreutzer, Moritz [Friedrich-Alexander-Universität Erlangen-Nürnberg (Germany); Alvermann, Andreas, E-mail: alvermann@physik.uni-greifswald.de [Ernst-Moritz-Arndt-Universität Greifswald (Germany); Galgon, Martin [Bergische Universität Wuppertal (Germany); Fehske, Holger [Ernst-Moritz-Arndt-Universität Greifswald (Germany); Hager, Georg [Friedrich-Alexander-Universität Erlangen-Nürnberg (Germany); Lang, Bruno [Bergische Universität Wuppertal (Germany); Wellein, Gerhard [Friedrich-Alexander-Universität Erlangen-Nürnberg (Germany)

2016-11-15

We study Chebyshev filter diagonalization as a tool for the computation of many interior eigenvalues of very large sparse symmetric matrices. In this technique the subspace projection onto the target space of wanted eigenvectors is approximated with filter polynomials obtained from Chebyshev expansions of window functions. After the discussion of the conceptual foundations of Chebyshev filter diagonalization we analyze the impact of the choice of the damping kernel, search space size, and filter polynomial degree on the computational accuracy and effort, before we describe the necessary steps towards a parallel high-performance implementation. Because Chebyshev filter diagonalization avoids the need for matrix inversion it can deal with matrices and problem sizes that are presently not accessible with rational function methods based on direct or iterative linear solvers. To demonstrate the potential of Chebyshev filter diagonalization for large-scale problems of this kind we include as an example the computation of the 10^2 innermost eigenpairs of a topological insulator matrix with dimension 10^9 derived from quantum physics applications.
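A filter polynomial of the kind described above can be sketched for a scalar argument. This is a minimal illustration, assuming the spectrum has already been scaled into [-1, 1], the window is the indicator function of an interval [a, b], and the damping kernel is the Jackson kernel; the abstract does not say which kernel the authors prefer, and all function names are hypothetical.

```python
import math

def window_coefficients(a, b, degree):
    """Chebyshev coefficients of the indicator function of [a, b] inside [-1, 1]."""
    ta, tb = math.acos(a), math.acos(b)
    coeffs = [(ta - tb) / math.pi]
    for k in range(1, degree + 1):
        coeffs.append(2.0 * (math.sin(k * ta) - math.sin(k * tb)) / (math.pi * k))
    return coeffs

def jackson_kernel(degree):
    """Jackson damping factors (an assumed choice) that suppress Gibbs oscillations."""
    n = degree + 1  # number of expansion terms
    q = math.pi / (n + 1)
    return [((n - k + 1) * math.cos(k * q) + math.sin(k * q) / math.tan(q)) / (n + 1)
            for k in range(n)]

def filter_value(x, coeffs, damping):
    """Evaluate the damped expansion at x with the three-term Chebyshev recurrence."""
    t_prev, t_curr = 1.0, x
    total = coeffs[0] * damping[0] + coeffs[1] * damping[1] * x
    for k in range(2, len(coeffs)):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
        total += coeffs[k] * damping[k] * t_curr
    return total

coeffs = window_coefficients(-0.2, 0.2, 200)
damping = jackson_kernel(200)
inside = filter_value(0.0, coeffs, damping)
outside = filter_value(0.8, coeffs, damping)
print(inside, outside)
```

In the matrix setting the same recurrence is applied with x replaced by the scaled matrix acting on a block of vectors, so only matrix-vector products are needed and no linear systems are solved.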
High-Performance Energy Applications and Systems

Energy Technology Data Exchange (ETDEWEB)

Miller, Barton [Univ. of Wisconsin, Madison, WI (United States)

2014-01-01

The Paradyn project has a history of developing algorithms, techniques, and software that push the cutting edge of tool technology for high-end computing systems. Under this funding, we are working on a three-year agenda to make substantial new advances in support of new and emerging Petascale systems. The overall goal for this work is to address the steady increase in complexity of these petascale systems. Our work covers two key areas: (1) The analysis, instrumentation and control of binary programs. Work in this area falls under the general framework of the Dyninst API tool kits. (2) Infrastructure for building tools and applications at extreme scale. Work in this area falls under the general framework of the MRNet scalability framework. Note that work done under this funding is closely related to work done under a contemporaneous grant, “Foundational Tools for Petascale Computing”, SC0003922/FG02-10ER25940, UW PRJ27NU.

High Performance Proactive Digital Forensics

International Nuclear Information System (INIS)

Alharbi, Soltan; Traore, Issa; Moa, Belaid; Weber-Jahnke, Jens

2012-01-01

With the increase in the number of digital crimes and in their sophistication, High Performance Computing (HPC) is becoming a must in Digital Forensics (DF). According to the FBI annual report, the size of data processed during the 2010 fiscal year reached 3,086 TB (compared to 2,334 TB in 2009) and the number of agencies that requested Regional Computer Forensics Laboratory assistance increased from 689 in 2009 to 722 in 2010. Since most investigation tools are both I/O and CPU bound, the next-generation DF tools are required to be distributed and offer HPC capabilities. The need for HPC is even more evident in investigating crimes on clouds or when proactive DF analysis and on-site investigation, requiring semi-real time processing, are performed. Although overcoming the performance challenge is a major goal in DF, as far as we know, there is almost no research on HPC-DF except for a few papers. As such, in this work, we extend our work on the need for a proactive system and present a high performance automated proactive digital forensic system. The most expensive phase of the system, namely proactive analysis and detection, uses a parallel extension of the iterative z algorithm. It also implements new parallel information-based outlier detection algorithms to proactively and forensically handle suspicious activities. To analyse a large number of targets and events and continuously do so (to capture the dynamics of the system), we rely on a multi-resolution approach to explore the digital forensic space. A data set from the Honeynet Forensic Challenge in 2001 is used to evaluate the system from DF and HPC perspectives.
Computational Design and Experimental Validation of New Thermal Barrier Systems

Energy Technology Data Exchange (ETDEWEB)

Guo, Shengmin; Yang, Shizhong; Khosravi, Ebrahim

2011-12-31

This project (10/01/2010-9/30/2013), “Computational Design and Experimental Validation of New Thermal Barrier Systems”, originates from Louisiana State University (LSU) Mechanical Engineering Department and Southern University (SU) Department of Computer Science. This proposal will directly support the technical goals specified in DE-FOA-0000248, Topic Area 3: Turbine Materials, by addressing key technologies needed to enable the development of advanced turbines and turbine-based systems that will operate safely and efficiently using coal-derived synthesis gases. We will develop novel molecular dynamics method to improve the efficiency of simulation on novel TBC materials; we will perform high performance computing (HPC) on complex TBC structures to screen the most promising TBC compositions; we will perform material characterizations and oxidation/corrosion tests; and we will demonstrate our new Thermal barrier coating (TBC) systems experimentally under Integrated gasification combined cycle (IGCC) environments. The durability of the coating will be examined using the proposed High Temperature/High Pressure Durability Test Rig under real syngas product compositions.
Implementation of the Principal Component Analysis onto High-Performance Computer Facilities for Hyperspectral Dimensionality Reduction: Results and Comparisons

Directory of Open Access Journals (Sweden)

Ernestina Martel

2018-06-01

Dimensionality reduction represents a critical preprocessing step in order to increase the efficiency and the performance of many hyperspectral imaging algorithms. However, dimensionality reduction algorithms, such as Principal Component Analysis (PCA), are computationally demanding, which makes it advisable to implement them on high-performance computer architectures for applications under strict latency constraints. This work presents the implementation of the PCA algorithm on two different high-performance devices, namely, an NVIDIA Graphics Processing Unit (GPU) and a Kalray manycore, uncovering a highly valuable set of tips and tricks in order to take full advantage of the inherent parallelism of these high-performance computing platforms, and hence, reducing the time that is required to process a given hyperspectral image. Moreover, the results obtained with different hyperspectral images have been compared with those obtained with a recently published field programmable gate array (FPGA)-based implementation of the PCA algorithm, providing, for the first time in the literature, a comprehensive analysis in order to highlight the pros and cons of each option.
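The core of PCA on a hyperspectral image can be sketched as follows. This is a minimal, sequential illustration in which each pixel is a list of band values; it uses power iteration to find only the leading component, which is a simplification chosen for brevity and not the method of the GPU, manycore, or FPGA implementations discussed above. The function name and the sample data are hypothetical.

```python
def leading_component(pixels, iterations=100):
    """Return the first principal component and the per-pixel scores along it."""
    n, bands = len(pixels), len(pixels[0])
    # Step 1: remove the mean of every spectral band.
    means = [sum(p[j] for p in pixels) / n for j in range(bands)]
    centered = [[p[j] - means[j] for j in range(bands)] for p in pixels]
    # Step 2: band-by-band covariance matrix.
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(bands)]
           for i in range(bands)]
    # Step 3: power iteration converges to the eigenvector of the largest eigenvalue.
    vec = [1.0 / bands ** 0.5] * bands
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(bands)) for i in range(bands)]
        norm = sum(v * v for v in nxt) ** 0.5
        vec = [v / norm for v in nxt]
    # Step 4: project every centered pixel onto the component.
    scores = [sum(r[j] * vec[j] for j in range(bands)) for r in centered]
    return vec, scores

# Four hypothetical pixels with three bands; all variance lies along (1, 2, 0).
pixels = [[1.0, 2.0, 0.0], [2.0, 4.0, 0.0], [3.0, 6.0, 0.0], [4.0, 8.0, 0.0]]
component, scores = leading_component(pixels)
print(component)
```

The covariance and projection steps are sums over independent pixels, which is the kind of inherent parallelism a parallel device can exploit.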
High Performance Computing and Storage Requirements for Nuclear Physics: Target 2017

Energy Technology Data Exchange (ETDEWEB)

Gerber, Richard [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Wasserman, Harvey [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

2014-04-30

In April 2014, NERSC, ASCR, and the DOE Office of Nuclear Physics (NP) held a review to characterize high performance computing (HPC) and storage requirements for NP research through 2017. This review is the 12th in a series of reviews held by NERSC and Office of Science program offices that began in 2009. It is the second for NP, and the final in the second round of reviews that covered the six Office of Science program offices. This report is the result of that review
Preparing systems engineering and computing science students in disciplined methods, quantitative, and advanced statistical techniques to improve process performance

Science.gov (United States)

McCray, Wilmon Wil L., Jr.

The research was prompted by a need to conduct a study that assesses process improvement, quality management and analytical techniques taught to students in U.S. colleges and universities undergraduate and graduate systems engineering and the computing science discipline (e.g., software engineering, computer science, and information technology) degree programs during their academic training that can be applied to quantitatively manage processes for performance. Everyone involved in executing repeatable processes in the software and systems development lifecycle processes needs to become familiar with the concepts of quantitative management, statistical thinking, process improvement methods and how they relate to process-performance. Organizations are starting to embrace the de facto Software Engineering Institute (SEI) Capability Maturity Model Integration (CMMI RTM) Models as process improvement frameworks to improve business processes performance. High maturity process areas in the CMMI model imply the use of analytical, statistical, quantitative management techniques, and process performance modeling to identify and eliminate sources of variation, continually improve process-performance, reduce cost and predict future outcomes. The research study identifies and provides a detailed discussion of the gap analysis findings of process improvement and quantitative analysis techniques taught in U.S. universities systems engineering and computing science degree programs, gaps that exist in the literature, and a comparison analysis which identifies the gaps that exist between the SEI's "healthy ingredients" of a process performance model and courses taught in U.S. universities degree programs. The research also heightens awareness that academicians have conducted little research on applicable statistics and quantitative techniques that can be used to demonstrate high maturity as implied in the CMMI models. The research also includes a Monte Carlo simulation optimization
Effectiveness of an Electronic Performance Support System on Computer Ethics and Ethical Decision-Making Education

Science.gov (United States)

Kert, Serhat Bahadir; Uz, Cigdem; Gecu, Zeynep

2014-01-01

This study examined the effectiveness of an electronic performance support system (EPSS) on computer ethics education and the ethical decision-making processes. There were five different phases to this ten month study: (1) Writing computer ethics scenarios, (2) Designing a decision-making framework (3) Developing EPSS software (4) Using EPSS in a…
Analysis and modeling of social influence in high performance computing workloads

KAUST Repository

Zheng, Shuai

2011-01-01

Social influence among users (e.g., collaboration on a project) creates bursty behavior in the underlying high performance computing (HPC) workloads. Using representative HPC and cluster workload logs, this paper identifies, analyzes, and quantifies the level of social influence across HPC users. We show the existence of a social graph that is characterized by a pattern of dominant users and followers. This pattern also follows a power-law distribution, which is consistent with those observed in mainstream social networks. Given its potential impact on HPC workloads prediction and scheduling, we propose a fast-converging, computationally-efficient online learning algorithm for identifying social groups. Extensive evaluation shows that our online algorithm can (1) quickly identify the social relationships by using a small portion of incoming jobs and (2) can efficiently track group evolution over time. © 2011 Springer-Verlag.
Peer-to-peer computing for secure high performance data copying

International Nuclear Information System (INIS)

Hanushevsky, A.; Trunov, A.; Cottrell, L.

2001-01-01

The BaBar Copy Program (bbcp) is an excellent representative of peer-to-peer (P2P) computing. It is also a pioneering application of its type in the P2P arena. Built upon the foundation of its predecessor, Secure Fast Copy (sfcp), bbcp incorporates significant improvements in performance and usability. As with sfcp, bbcp uses ssh for authentication, providing an elegant and simple working model: if you can ssh to a location, you can copy files to or from that location. To fully support this notion, bbcp transparently supports third-party copy operations. The program also incorporates several mechanisms to deal with firewall security, the bane of P2P computing. To achieve high performance in a wide area network, bbcp allows a user to independently specify the number of parallel network streams, the TCP window size, and the file I/O blocking factor. Using these parameters, data is pipelined from source to target to provide a uniform traffic pattern that maximizes router efficiency. For improved recoverability, bbcp also keeps track of copy operations so that an operation can be restarted from the point of failure at a later time, minimizing the amount of network traffic in the event of a copy failure. Here, the authors present the bbcp architecture, its various features, and the reasons for their inclusion
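
The stream count and TCP window size mentioned in this abstract are commonly related through the bandwidth-delay product: to keep a long path full, the bytes in flight should be roughly bandwidth times round-trip time, and that total can be shared among parallel streams. The Python sketch below only illustrates that arithmetic. The function name and the example link figures are assumptions for illustration, not values from the bbcp paper, and the sketch does not invoke bbcp.

```python
def per_stream_window_bytes(bandwidth_bits_per_s, rtt_s, streams):
    """Return the TCP window (in bytes) each stream needs to fill the path.

    Bytes in flight = bandwidth * RTT / 8 (the bandwidth-delay product),
    divided evenly across the parallel streams.
    """
    if streams < 1:
        raise ValueError("streams must be at least 1")
    bdp_bytes = bandwidth_bits_per_s * rtt_s / 8
    return bdp_bytes / streams


# Hypothetical wide-area link: 1 Gbit/s with an 80 ms round-trip time.
window = per_stream_window_bytes(1e9, 0.080, streams=4)
print(f"{window / 1024:.0f} KiB per stream")
```

Splitting the product across streams shows why several modest windows can stand in for one very large window when the operating system caps the per-socket buffer size.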
Peer-to-Peer Computing for Secure High Performance Data Copying

International Nuclear Information System (INIS)

2002-01-01

The BaBar Copy Program (bbcp) is an excellent representative of peer-to-peer (P2P) computing. It is also a pioneering application of its type in the P2P arena. Built upon the foundation of its predecessor, Secure Fast Copy (sfcp), bbcp incorporates significant improvements in performance and usability. As with sfcp, bbcp uses ssh for authentication, providing an elegant and simple working model: if you can ssh to a location, you can copy files to or from that location. To fully support this notion, bbcp transparently supports third-party copy operations. The program also incorporates several mechanisms to deal with firewall security, the bane of P2P computing. To achieve high performance in a wide area network, bbcp allows a user to independently specify the number of parallel network streams, the TCP window size, and the file I/O blocking factor. Using these parameters, data is pipelined from source to target to provide a uniform traffic pattern that maximizes router efficiency. For improved recoverability, bbcp also keeps track of copy operations so that an operation can be restarted from the point of failure at a later time, minimizing the amount of network traffic in the event of a copy failure. Here, we present the bbcp architecture, its various features, and the reasons for their inclusion
Using High Performance Computing to Examine the Processes of Neurogenesis Underlying Pattern Separation/Completion of Episodic Information.

Energy Technology Data Exchange (ETDEWEB)

Aimone, James Bradley [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Betty, Rita [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

2015-03-01

Sandia researchers developed novel methods and metrics for studying the computational function of neurogenesis, generating substantial impact on the neuroscience and neural computing communities. This work could benefit applications in machine learning and other analysis activities.
High-End Computing Challenges in Aerospace Design and Engineering

Science.gov (United States)

Bailey, F. Ronald

2004-01-01

High-End Computing (HEC) has had significant impact on aerospace design and engineering and is poised to have even more in the future. In this paper we describe four aerospace design and engineering challenges: Digital Flight, Launch Simulation, Rocket Fuel System and Digital Astronaut. The paper discusses the modeling capabilities needed for each challenge and presents projections of near- and far-term HEC requirements. NASA's HEC Project Columbia is described, and programming strategies necessary to achieve high real performance are presented.
Advanced topics in security computer system design

International Nuclear Information System (INIS)

Stachniak, D.E.; Lamb, W.R.

1989-01-01

The capability, performance, and speed of contemporary computer processors, plus the associated performance capability of the operating systems accommodating the processors, have enormously expanded the scope of possibilities for designers of nuclear power plant security computer systems. This paper addresses the choices that could be made by a designer of security computer systems working with contemporary computers and describes the improvement in functionality of contemporary security computer systems based on an optimally chosen design. Primary initial considerations concern the selection of (a) the computer hardware and (b) the operating system. Considerations for hardware selection concern processor and memory word length, memory capacity, and numerous processor features
Computer System Design System-on-Chip

CERN Document Server

Flynn, Michael J

2011-01-01

The next generation of computer system designers will be less concerned about details of processors and memories, and more concerned about the elements of a system tailored to particular applications. These designers will have a fundamental knowledge of processors and other elements in the system, but the success of their design will depend on the skills in making system-level tradeoffs that optimize the cost, performance and other attributes to meet application requirements. This book provides a new treatment of computer system design, particularly for System-on-Chip (SOC), which addresses th
SISYPHUS: A high performance seismic inversion factory

Science.gov (United States)

Gokhberg, Alexey; Simutė, Saulė; Boehm, Christian; Fichtner, Andreas

2016-04-01

In recent years, massively parallel high performance computers have become the standard instruments for solving forward and inverse problems in seismology. The software packages for forward and inverse waveform modelling designed specially for such computers (SPECFEM3D, SES3D) have become mature and widely available. These packages achieve significant computational performance and provide researchers with an opportunity to solve problems of bigger size at higher resolution within a shorter time. However, a typical seismic inversion process contains various activities that are beyond the common solver functionality. They include management of information on seismic events and stations, 3D models, observed and synthetic seismograms, pre-processing of the observed signals, computation of misfits and adjoint sources, minimization of misfits, and process workflow management. These activities are time consuming, seldom sufficiently automated, and therefore represent a bottleneck that can substantially offset performance benefits provided by even the most powerful modern supercomputers. Furthermore, a typical system architecture of modern supercomputing platforms is oriented towards the maximum computational performance and provides limited standard facilities for automation of the supporting activities. We present a prototype solution that automates all aspects of the seismic inversion process and is tuned for the modern massively parallel high performance computing systems. We address several major aspects of the solution architecture, which include (1) design of an inversion state database for tracing all relevant aspects of the entire solution process, (2) design of an extensible workflow management framework, (3) integration with wave propagation solvers, (4) integration with optimization packages, (5) computation of misfits and adjoint sources, and (6) process monitoring. The inversion state database represents a hierarchical structure with
Fast Performance Computing Model for Smart Distributed Power Systems

Directory of Open Access Journals (Sweden)

Umair Younas

2017-06-01

Plug-in Electric Vehicles (PEVs) are becoming a more prominent alternative to fossil-fuel cars because of their significant role in Greenhouse Gas (GHG) reduction, flexible storage, and ancillary service provision as a Distributed Generation (DG) resource in Vehicle-to-Grid (V2G) regulation mode. However, large-scale penetration of PEVs and the growing demand of energy-intensive Data Centers (DCs) bring undesirably high load peaks in electricity demand, and hence impose a supply-demand imbalance and threaten the reliability of the wholesale and retail power markets. To overcome these challenges, the proposed research considers a smart Distributed Power System (DPS) comprising conventional sources, renewable energy, V2G regulation, and flexible storage energy resources. Moreover, price- and incentive-based Demand Response (DR) programs are implemented to sustain the balance between net demand and available generating resources in the DPS. In addition, we adopted a novel strategy that runs the computationally intensive jobs of the proposed DPS model, including incoming load profiles, V2G regulation, battery State of Charge (SOC) indication, and fast computation in a decision-based automated DR algorithm, on the Fast Performance Computing resources of DCs. In return, the DPS provides economical and stable power to DCs under strict power quality constraints. Finally, the improved results are verified using a case study of ISO California integrated with hybrid generation.
Scalability of DL_POLY on High Performance Computing Platform

CSIR Research Space (South Africa)

Mabakane, Mabule S

2017-12-01

SACJ 29(3) December... when using many processors within the compute nodes of the supercomputer. The type of processors in the compute nodes and their memory also play an important role in the overall performance of a parallel application running on a supercomputer. DL...
ABINIT: Plane-Wave-Based Density-Functional Theory on High Performance Computers

Science.gov (United States)

Torrent, Marc

2014-03-01

For several years, a continuous effort has been made to adapt electronic structure codes based on Density-Functional Theory (DFT) to future computing architectures. Among these codes, ABINIT is based on a plane-wave description of the wave functions, which allows systems of any kind to be treated. Porting such a code to petascale architectures poses difficulties related to the many-body nature of the DFT equations. To improve the performance of ABINIT, especially for standard LDA/GGA ground-state and response-function calculations, several strategies have been followed. A full multi-level MPI parallelisation scheme has been implemented, exploiting all possible levels and distributing both computation and memory. It allows the number of distributed processes to be increased and could not have been achieved without a strong restructuring of the code. The core algorithm used to solve the eigenproblem ("Locally Optimal Blocked Conjugate Gradient"), a Blocked-Davidson-like algorithm, is based on a distribution of processes combining plane-waves and bands. In addition to the distributed memory parallelization, a full hybrid scheme has been implemented, using standard shared-memory directives (OpenMP/OpenACC) or porting some time-consuming code sections to Graphics Processing Units (GPU). As no simple performance model exists, the complexity of use has increased; the code efficiency strongly depends on the distribution of processes among the numerous levels. ABINIT is able to predict the performance of several process distributions and automatically choose the most favourable one. In addition, a big effort has been made to analyse the performance of the code on petascale architectures, showing which sections of code have to be improved; they are all related to matrix algebra (diagonalisation, orthogonalisation). The different strategies employed to improve the code scalability will be described. They are based on an exploration of new diagonalization
Study of application technology of ultra-high speed computer to the elucidation of complex phenomena

International Nuclear Information System (INIS)

Sekiguchi, Tomotsugu

1996-01-01

As the first step in developing technology for applying ultra-high speed computers to the elucidation of complex phenomena, the basic design of a numerical information library for a decentralized computer network is explained. Establishing the system makes it possible to construct an efficient application environment for ultra-high speed computer systems that scales across different computing systems. We named the system Ninf (Network Information Library for High Performance Computing). The application technology of the library is summarized under the following topics: application of the library in a distributed environment, numeric constants, retrieval of values, library of special functions, computing library, Ninf library interface, Ninf remote library and registration. With the system, users are able to use programs that concentrate numerical analysis technology with high precision, reliability and speed. (S.Y.)
Human performance models for computer-aided engineering

Science.gov (United States)

Elkind, Jerome I. (Editor); Card, Stuart K. (Editor); Hochberg, Julian (Editor); Huey, Beverly Messick (Editor)

1989-01-01

This report discusses a topic important to the field of computational human factors: models of human performance and their use in computer-based engineering facilities for the design of complex systems. It focuses on a particular human factors design problem -- the design of cockpit systems for advanced helicopters -- and on a particular aspect of human performance -- vision and related cognitive functions. By focusing in this way, the authors were able to address the selected topics in some depth and develop findings and recommendations that they believe have application to many other aspects of human performance and to other design domains.
Analysis of parallel computing performance of the code MCNP

International Nuclear Information System (INIS)

Wang Lei; Wang Kan; Yu Ganglin

2006-01-01

Parallel computing can effectively reduce the running time of the code MCNP. With MPI message-passing software, MCNP5 can run in parallel on a PC cluster with the Windows operating system. The parallel computing performance of MCNP is influenced by factors such as the type, complexity and parameter configuration of the computing problem. This paper analyzes the parallel computing performance of MCNP with regard to these factors and gives measures to improve it. (authors)
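
Parallel performance of this kind is usually summarized with two standard quantities: speedup S(p) = T(1) / T(p) and efficiency E(p) = S(p) / p, where T(p) is the wall-clock time on p processes. The Python sketch below shows the calculation. The timings are made-up placeholders for illustration and are not measurements from the paper.

```python
def speedup_and_efficiency(t_serial, t_parallel, processes):
    """Return (speedup, efficiency) for one parallel run.

    speedup    = serial time / parallel time
    efficiency = speedup / number of processes
    """
    speedup = t_serial / t_parallel
    return speedup, speedup / processes


# Hypothetical wall-clock times in seconds, keyed by process count.
hypothetical_runs = {1: 1200.0, 2: 630.0, 4: 340.0, 8: 200.0}
t1 = hypothetical_runs[1]
for p, t in sorted(hypothetical_runs.items()):
    s, e = speedup_and_efficiency(t1, t, p)
    print(f"p={p}: speedup={s:.2f}, efficiency={e:.2f}")
```

Efficiency falling as the process count grows is the usual sign that communication or serial sections are starting to dominate the run.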

Intelligent computational systems for space applications

Science.gov (United States)

Lum, Henry; Lau, Sonie

Intelligent computational systems can be described as adaptive computational systems integrating both traditional computational approaches and artificial intelligence (AI) methodologies to meet the science and engineering data processing requirements imposed by specific mission objectives. These systems will be capable of integrating, interpreting, and understanding sensor input information; correlating that information to the "world model" stored within their database and understanding the differences, if any; defining, verifying, and validating a command sequence to merge the "external world" with the "internal world model"; and controlling the vehicle and/or platform to meet the scientific and engineering mission objectives. Performance and simulation data obtained to date indicate that the current flight processors baselined for many missions such as Space Station Freedom do not have the computational power to meet the challenges of advanced automation and robotics systems envisioned for the year 2000 era. Research issues which must be addressed to achieve greater than giga-flop performance for on-board intelligent computational systems have been identified, and a technology development program has been initiated to achieve the desired long-term system performance objectives.
Optimal dynamic performance for high-precision actuators/stages

International Nuclear Information System (INIS)

Preissner, C.; Lee, S.-H.; Royston, T. J.; Shu, D.

2002-01-01

System dynamic performance of actuator/stage groups, such as those found in optical instrument positioning systems and other high-precision applications, is dependent upon both individual component behavior and the system configuration. Experimental modal analysis techniques were implemented to determine the six-degree-of-freedom stiffnesses and damping for individual actuator components. These experimental data were then used in a multibody dynamic computer model to investigate the effect of stage group configuration. Running the computer model through the possible stage configurations and observing the predicted vibratory response determined the optimal stage group configuration. Configuration optimization can be performed for any group of stages, provided stiffness and damping data are available for the constituent pieces
FY 1996 Blue Book: High Performance Computing and Communications: Foundations for America's Information Future

Data.gov (United States)

Networking and Information Technology Research and Development, Executive Office of the President — The Federal High Performance Computing and Communications HPCC Program will celebrate its fifth anniversary in October 1996 with an impressive array of...
FY 1997 Blue Book: High Performance Computing and Communications: Advancing the Frontiers of Information Technology

Data.gov (United States)

Networking and Information Technology Research and Development, Executive Office of the President — The Federal High Performance Computing and Communications HPCC Program will celebrate its fifth anniversary in October 1996 with an impressive array of...
SUMO, System performance assessment for a high-level nuclear waste repository: Mathematical models

International Nuclear Information System (INIS)

Eslinger, P.W.; Miley, T.B.; Engel, D.W.; Chamberlain, P.J. II.

1992-09-01

Following completion of the preliminary risk assessment of the potential Yucca Mountain Site by Pacific Northwest Laboratory (PNL) in 1988, the Office of Civilian Radioactive Waste Management (OCRWM) of the US Department of Energy (DOE) requested the Performance Assessment Scientific Support (PASS) Program at PNL to develop an integrated system model and computer code that provides performance and risk assessment analysis capabilities for a potential high-level nuclear waste repository. The system model that has been developed addresses the cumulative radionuclide release criteria established by the US Environmental Protection Agency (EPA) and estimates population risks in terms of dose to humans. The system model embodied in the SUMO (System Unsaturated Model) code will also allow benchmarking of other models being developed for the Yucca Mountain Project. The system model has three natural divisions: (1) source term, (2) far-field transport, and (3) dose to humans. This document gives a detailed description of the mathematics of each of these three divisions. Each of the governing equations employed is based on modeling assumptions that are widely accepted within the scientific community
Research on high-performance mass storage system

International Nuclear Information System (INIS)

Cheng Yaodong; Wang Lu; Huang Qiulan; Zheng Wei

2010-01-01

As scientific experiments grow, more and more data are produced, which poses a great challenge to storage systems. Large storage capacity and high data access performance are both important to a mass storage system. This paper first reviews several popular kinds of storage systems, including network storage systems, SAN-based sharing systems, WAN file systems, object-based parallel file systems, hierarchical storage systems and cloud storage systems. Then some key technologies are presented. Finally, this paper takes the BES storage system as an example and introduces its requirements, architecture and operation results. (authors)
System performance optimization

International Nuclear Information System (INIS)

Bednarz, R.J.

1978-01-01

System performance optimization has become an important and difficult field for large scientific computer centres. It is important because the centres must satisfy increasing user demands at the lowest possible cost. It is difficult because it requires a deep understanding of hardware, software and workload. The optimization is a dynamic process depending on changes in the hardware configuration, the current level of the operating system and the user-generated workload. With the increasing complexity of computer systems and software, the scope for optimization manoeuvres broadens. The hardware of two manufacturers, IBM and CDC, is discussed. Four IBM and two CDC operating systems are described. The description concentrates on the organization of the operating systems, the job scheduling and I/O handling. The performance definitions, workload specification and tools for the system stimulation are given. The measurement tools for system performance optimization are described. The results of the measurement and various methods used for the operating system tuning are discussed. (Auth.)
Ultra-high performance, solid-state, autoradiographic image digitization and analysis system

International Nuclear Information System (INIS)

Lear, J.L.; Pratt, J.P.; Ackermann, R.F.; Plotnick, J.; Rumley, S.

1990-01-01

We developed a Macintosh II-based, charge-coupled device (CCD) image digitization and analysis system for high-speed, high-resolution quantification of autoradiographic image data. A linear CCD array with 3,500 elements was attached to a precision drive assembly and mounted behind a high-uniformity lens. The drive assembly was used to sweep the array perpendicularly to its axis so that an entire 20 x 25-cm film containing autoradiographic images could be digitized into 256 gray levels at 50-micron resolution in less than 30 sec. The scanner was interfaced to a Macintosh II computer through a specially constructed NuBus circuit board, and software was developed for autoradiographic data analysis. The system was evaluated by scanning individual films multiple times, then measuring the variability of the digital data between the different scans. Image data were found to be virtually noise free. The coefficient of variation averaged less than 1%, a value significantly exceeding the accuracy of both high-speed, low-resolution video camera (VC) systems and low-speed, high-resolution rotating drum densitometers (RDD). Thus, compared with VC systems, the CCD scanner-Macintosh computer analysis system offers the ability to digitize entire films containing many autoradiograms, with much greater speed and accuracy than achievable with RDD scanners
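
The repeatability figure quoted in this abstract is a coefficient of variation (CV), the standard deviation divided by the mean. The abstract does not say exactly how the CV was aggregated, so the Python sketch below assumes one plausible reading: compute the CV of each pixel across repeated scans, then average over pixels. The function names and the gray levels are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean, as a fraction."""
    m = mean(values)
    if m == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    return stdev(values) / m


def mean_cv_across_scans(scans):
    """Average the per-pixel CV over repeated scans of the same film.

    `scans` is a list of equally long lists of gray levels, one list per scan.
    """
    per_pixel = [coefficient_of_variation(pixel) for pixel in zip(*scans)]
    return mean(per_pixel)


# Hypothetical gray levels (0-255) for three pixels over four repeated scans.
scans = [
    [100, 150, 200],
    [101, 149, 200],
    [100, 151, 201],
    [99, 150, 199],
]
print(f"mean CV = {mean_cv_across_scans(scans):.4f}")
```

With these placeholder values the mean CV comes out below 0.01, which is the kind of result a statement such as "less than 1%" describes.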
Performance Analysis and Application of Three Different Computational Methods for Solar Heating System with Seasonal Water Tank Heat Storage

Directory of Open Access Journals (Sweden)

Dongliang Sun

2013-01-01

We analyze and compare three different computational methods for a solar heating system with seasonal water tank heat storage (SHS-SWTHS). These methods are the accurate numerical method, the temperature stratification method, and the uniform temperature method. The accurate numerical method can accurately predict the performance of the system, but it takes about 4 to 5 weeks, which is too long for practical performance analysis of this system. The temperature stratification method obtains relatively accurate computation results and takes a relatively short computation time, about 2 to 3 hours. Therefore, this method is the most suitable for the performance analysis of this system. The deviation of the computational results of the uniform temperature method is large, and the time consumed is similar to that of the temperature stratification method. Therefore, this method is not recommended. Based on the above analyses, the temperature stratification method is applied to analyze the influence of the embedded depth of the water tank, the thickness of the thermal insulation material, and the collection area on the performance of this system. The results will provide a design basis for the related demonstration projects.
Challenges and opportunities of modeling plasma–surface interactions in tungsten using high-performance computing

Energy Technology Data Exchange (ETDEWEB)

Wirth, Brian D., E-mail: bdwirth@utk.edu [Department of Nuclear Engineering, University of Tennessee, Knoxville, TN 37996 (United States); Nuclear Science and Engineering Directorate, Oak Ridge National Laboratory, Oak Ridge, TN (United States); Hammond, K.D. [Department of Nuclear Engineering, University of Tennessee, Knoxville, TN 37996 (United States); Krasheninnikov, S.I. [University of California, San Diego, La Jolla, CA (United States); Maroudas, D. [University of Massachusetts, Amherst, Amherst, MA 01003 (United States)

2015-08-15

The performance of plasma facing components (PFCs) is critical for ITER and future magnetic fusion reactors. The ITER divertor will be tungsten, which is the primary candidate material for future reactors. Recent experiments involving tungsten exposure to low-energy helium plasmas reveal significant surface modification, including the growth of nanometer-scale tendrils of “fuzz” and formation of nanometer-sized bubbles in the near-surface region. The large span of spatial and temporal scales governing plasma surface interactions is among the challenges to modeling divertor performance. Fortunately, recent innovations in computational modeling, increasingly powerful high-performance computers, and improved experimental characterization tools provide a path toward self-consistent, experimentally validated models of PFC and divertor performance. Recent advances in understanding tungsten–helium interactions are reviewed, including such processes as helium clustering, in which clusters serve as nuclei for gas bubbles, and trap mutation, dislocation loop punching and bubble bursting, which together initiate surface morphological modification.
Challenges and opportunities of modeling plasma–surface interactions in tungsten using high-performance computing

International Nuclear Information System (INIS)

Wirth, Brian D.; Hammond, K.D.; Krasheninnikov, S.I.; Maroudas, D.

2015-01-01

The performance of plasma facing components (PFCs) is critical for ITER and future magnetic fusion reactors. The ITER divertor will be tungsten, which is the primary candidate material for future reactors. Recent experiments involving tungsten exposure to low-energy helium plasmas reveal significant surface modification, including the growth of nanometer-scale tendrils of “fuzz” and formation of nanometer-sized bubbles in the near-surface region. The large span of spatial and temporal scales governing plasma surface interactions is among the challenges to modeling divertor performance. Fortunately, recent innovations in computational modeling, increasingly powerful high-performance computers, and improved experimental characterization tools provide a path toward self-consistent, experimentally validated models of PFC and divertor performance. Recent advances in understanding tungsten–helium interactions are reviewed, including such processes as helium clustering, in which clusters serve as nuclei for gas bubbles, and trap mutation, dislocation loop punching and bubble bursting, which together initiate surface morphological modification
High performance computations using dynamical nucleation theory

International Nuclear Information System (INIS)

Windus, T L; Crosby, L D; Kathmann, S M

2008-01-01

Chemists continue to explore the use of very large computations to perform simulations that describe the molecular level physics of critical challenges in science. In this paper, we describe the Dynamical Nucleation Theory Monte Carlo (DNTMC) model - a model for determining molecular scale nucleation rate constants - and its parallel capabilities. The potential for bottlenecks and the challenges to running on future petascale or larger resources are delineated. A 'master-slave' solution is proposed to scale to the petascale and will be developed in the NWChem software. In addition, mathematical and data analysis challenges are described
Computing in high energy physics

International Nuclear Information System (INIS)

Hertzberger, L.O.; Hoogland, W.

1986-01-01

This book deals with advanced computing applications in physics, and in particular in high energy physics environments. The main subjects covered are networking; vector and parallel processing; and embedded systems. Also examined are topics such as operating systems, future computer architectures and commercial computer products. The book presents solutions that are foreseen as coping, in the future, with computing problems in experimental and theoretical High Energy Physics. In the experimental environment the large amounts of data to be processed offer special problems on-line as well as off-line. For on-line data reduction, embedded special purpose computers, which are often used for trigger applications are applied. For off-line processing, parallel computers such as emulator farms and the cosmic cube may be employed. The analysis of these topics is therefore a main feature of this volume
High performance optical encryption based on computational ghost imaging with QR code and compressive sensing technique

Science.gov (United States)

Zhao, Shengmei; Wang, Le; Liang, Wenqiang; Cheng, Weiwen; Gong, Longyan

2015-10-01

In this paper, we propose a high performance optical encryption (OE) scheme based on computational ghost imaging (GI) with QR code and the compressive sensing (CS) technique, named the QR-CGI-OE scheme. N random phase screens, generated by Alice, form the secret key and are shared with the authorized user, Bob. The information is first encoded by Alice with a QR code, and the QR-coded image is then encrypted with the aid of a computational ghost imaging optical system. Here, the measurement results from the GI optical system's bucket detector are the encrypted information and are transmitted to Bob. With the key, Bob decrypts the encrypted information to obtain the QR-coded image with GI and CS techniques, and further recovers the information by QR decoding. The experimental and numerically simulated results show that authorized users can completely recover the original image, whereas eavesdroppers cannot acquire any information about the image even when the eavesdropping ratio (ER) is up to 60% at the given number of measurements. For the proposed scheme, the number of bits sent from Alice to Bob is reduced considerably and the robustness is enhanced significantly. Meanwhile, the number of measurements in the GI system is reduced and the quality of the reconstructed QR-coded image is improved.
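
The core idea, that the illumination patterns act as a key and the bucket-detector readings act as ciphertext, can be illustrated with a much simpler stand-in than the scheme in the abstract. The Python sketch below is a toy under stated assumptions: it uses random binary intensity patterns instead of phase screens, plain correlation reconstruction G(x) = <B P(x)> - <B><P(x)> instead of compressive sensing, and it skips QR coding entirely. The object, the pattern count and all names are hypothetical.

```python
import random


def ghost_image(patterns, bucket):
    """Correlation reconstruction: G(x) = <B * P(x)> - <B> * <P(x)>."""
    m = len(patterns)
    n = len(patterns[0])
    mean_b = sum(bucket) / m
    recon = []
    for x in range(n):
        mean_p = sum(p[x] for p in patterns) / m
        mean_bp = sum(b * p[x] for b, p in zip(bucket, patterns)) / m
        recon.append(mean_bp - mean_b * mean_p)
    return recon


rng = random.Random(0)

# Hypothetical 4x4 binary object, flattened row by row.
obj = [0, 1, 1, 0,
       1, 0, 0, 1,
       1, 0, 0, 1,
       0, 1, 1, 0]

# Shared key (assumption): random binary illumination patterns.
patterns = [[rng.randint(0, 1) for _ in obj] for _ in range(2000)]

# Ciphertext: one bucket-detector value (total transmitted light) per pattern.
bucket = [sum(o * p for o, p in zip(obj, pat)) for pat in patterns]

# Receiver side: correlate bucket values with the known patterns, then threshold.
recon = ghost_image(patterns, bucket)
threshold = (max(recon) + min(recon)) / 2
recovered = [1 if g > threshold else 0 for g in recon]
print("object recovered:", recovered == obj)
```

Without the patterns, the bucket values are just a list of sums, which is the intuition behind treating the pattern set as the key. Plain correlation needs many more measurements than pixels, which is the overhead compressive sensing is typically used to reduce.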
Matrix multiplication operations with data pre-conditioning in a high performance computing architecture

Science.gov (United States)

Eichenberger, Alexandre E; Gschwind, Michael K; Gunnels, John A

2013-11-05

Mechanisms for performing matrix multiplication operations with data pre-conditioning in a high performance computing architecture are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation to a first target vector register. A load and splat operation is performed to load an element of a second vector operand and replicating the element to each of a plurality of elements of a second target vector register. A multiply add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product of the matrix multiplication operation is accumulated with other partial products of the matrix multiplication operation.
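
The sequence of operations in this abstract (vector load, load and splat, multiply-add, accumulate) can be mimicked in plain Python, with lists standing in for vector registers. This is only a sketch of the general pattern. Which matrix supplies the loaded vector and which supplies the splatted element is an assumption here, and the function name is hypothetical.

```python
def matmul_splat(a, b):
    """Compute C = A x B one column of C at a time.

    a is an m x k matrix and b is a k x n matrix, both given as lists of rows.
    """
    m, k, n = len(a), len(b), len(b[0])
    if any(len(row) != k for row in a):
        raise ValueError("inner dimensions do not match")
    c = [[0.0] * n for _ in range(m)]
    for j in range(n):
        acc = [0.0] * m                        # accumulator "register"
        for p in range(k):
            vec = [a[i][p] for i in range(m)]  # vector load: column p of A
            splat = [b[p][j]] * m              # load and splat: B[p][j] replicated
            # Multiply-add: add this partial product to the accumulator.
            acc = [acc_i + v * s for acc_i, v, s in zip(acc, vec, splat)]
        for i in range(m):
            c[i][j] = acc[i]
    return c


print(matmul_splat([[1, 2], [3, 4]], [[5, 6], [7, 8]]))
```

Each pass of the inner loop produces one partial product of a column of C, and the running sum over p plays the role of the accumulation step described in the abstract.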
High performance computing, supercomputing, náročné počítání

Czech Academy of Sciences Publication Activity Database

Okrouhlík, Miloslav

2003-01-01

Vol. 10, No. 5 (2003), pp. 429-438 ISSN 1210-2717 R&D Projects: GA ČR GA101/02/0072 Institutional research plan: CEZ:AV0Z2076919 Keywords: high performance computing * vector and parallel computers * programming tools for parallelization Subject RIV: BI - Acoustics
Automated System Tests High-Power MOSFET's

Science.gov (United States)

Huston, Steven W.; Wendt, Isabel O.

1994-01-01

Computer-controlled system tests metal-oxide/semiconductor field-effect transistors (MOSFET's) at high voltages and currents. Measures seven parameters characterizing performance of MOSFET, with view toward obtaining early indication MOSFET defective. Use of test system prior to installation of power MOSFET in high-power circuit saves time and money.
High Performance Work Systems for Online Education

Science.gov (United States)

Contacos-Sawyer, Jonna; Revels, Mark; Ciampa, Mark

2010-01-01

The purpose of this paper is to identify the key elements of a High Performance Work System (HPWS) and explore the possibility of implementation in an online institution of higher learning. With the projected rapid growth of the demand for online education and its importance in post-secondary education, providing high quality curriculum, excellent…
Requirements for high performance computing for lattice QCD. Report of the ECFA working panel

International Nuclear Information System (INIS)

Jegerlehner, F.; Kenway, R.D.; Martinelli, G.; Michael, C.; Pene, O.; Petersson, B.; Petronzio, R.; Sachrajda, C.T.; Schilling, K.

2000-01-01

This report, prepared at the request of the European Committee for Future Accelerators (ECFA), contains an assessment of the High Performance Computing resources which will be required in coming years by European physicists working in Lattice Field Theory and a review of the scientific opportunities which these resources would open. (orig.)
Photons, photosynthesis, and high-performance computing: challenges, progress, and promise of modeling metabolism in green algae

International Nuclear Information System (INIS)

Chang, C H; Graf, P; Alber, D M; Kim, K; Murray, G; Posewitz, M; Seibert, M

2008-01-01

The complexity associated with biological metabolism considered at a kinetic level presents a challenge to quantitative modeling. In particular, the relatively sparse knowledge of parameters for enzymes with known kinetic responses is problematic. The possible space of these parameters is of high-dimension, and sampling of such a space typifies a combinatorial explosion of possible dynamic states. However, with sufficient quantitative transcriptomics, proteomics, and metabolomics data at hand, these challenges could be met by high-performance software with sampling, fitting, and optimization capabilities. With this in mind, we present the High-Performance Systems Biology Toolkit HiPer SBTK, an evolving software package to simulate, fit, and optimize metabolite concentrations and fluxes within the space of rate and binding parameters associated with detailed enzyme kinetic models. We present our chosen modeling paradigm for the formulation of metabolic pathway models, the means to address the challenge of representing such models in a precise and persistent fashion using the standardized Systems Biology Markup Language, and our second-generation model of H2-associated Chlamydomonas metabolism. Processing of such models for hierarchically parallelized simulation and optimization, job specification by the user through a GUI interface, software capabilities and initial scaling data, and the mapping of the computation to biological questions is also discussed. Moreover, we present near-term future software and model development goals

High-performance computing on the Intel Xeon Phi how to fully exploit MIC architectures

CERN Document Server

Wang, Endong; Shen, Bo; Zhang, Guangyong; Lu, Xiaowei; Wu, Qing; Wang, Yajuan

2014-01-01

The aim of this book is to explain to high-performance computing (HPC) developers how to utilize the Intel® Xeon Phi™ series products efficiently. To that end, it introduces some computing grammar, programming technology and optimization methods for using many-integrated-core (MIC) platforms and also offers tips and tricks for actual use, based on the authors' first-hand optimization experience. The material is organized in three sections. The first section, "Basics of MIC", introduces the fundamentals of MIC architecture and programming, including the specific Intel MIC programming environment
Brain inspired high performance electronics on flexible silicon

KAUST Repository

Sevilla, Galo T.; Rojas, Jhonathan Prieto; Hussain, Muhammad Mustafa

2014-01-01

The brain's stunning speed, energy efficiency and massive parallelism make it the role model for upcoming high performance computation systems. Although human brain components are a million times slower than state of the art silicon industry components
Computer systems a programmer's perspective

CERN Document Server

Bryant, Randal E

2016-01-01

Computer Systems: A Programmer’s Perspective explains the underlying elements common among all computer systems and how they affect general application performance. Written from the programmer’s perspective, this book strives to teach readers how understanding basic elements of computer systems and executing real practice can lead them to create better programs. Spanning computer science themes such as hardware architecture, the operating system, and systems software, the Third Edition serves as a comprehensive introduction to programming. This book strives to create programmers who understand all elements of computer systems and will be able to engage in any application of the field, from fixing faulty software, to writing more capable programs, to avoiding common flaws. It lays the groundwork for readers to delve into more intensive topics such as computer architecture, embedded systems, and cybersecurity. This book focuses on systems that execute x86-64 machine code, and recommends th...
Department of Energy: MICS (Mathematical, Information, and Computational Sciences Division). High performance computing and communications program

Energy Technology Data Exchange (ETDEWEB)

NONE

1996-06-01

This document is intended to serve two purposes. Its first purpose is that of a program status report of the considerable progress that the Department of Energy (DOE) has made since 1993, the time of the last such report (DOE/ER-0536, "The DOE Program in HPCC"), toward achieving the goals of the High Performance Computing and Communications (HPCC) Program. The second purpose is that of a summary report of the many research programs administered by the Mathematical, Information, and Computational Sciences (MICS) Division of the Office of Energy Research under the auspices of the HPCC Program and to provide, wherever relevant, easy access to pertinent information about MICS-Division activities via universal resource locators (URLs) on the World Wide Web (WWW). The information pointed to by the URL is updated frequently, and the interested reader is urged to access the WWW for the latest information.
High performance stream computing for particle beam transport simulations

International Nuclear Information System (INIS)

Appleby, R; Bailey, D; Higham, J; Salt, M

2008-01-01

Understanding modern particle accelerators requires simulating charged particle transport through the machine elements. These simulations can be very time consuming due to the large number of particles and the need to consider many turns of a circular machine. Stream computing offers an attractive way to dramatically improve the performance of such simulations by calculating the simultaneous transport of many particles using dedicated hardware. Modern Graphics Processing Units (GPUs) are powerful and affordable stream computing devices. The results of simulations of particle transport through the booster-to-storage-ring transfer line of the DIAMOND synchrotron light source using an NVidia GeForce 7900 GPU are compared to the standard transport code MAD. It is found that particle transport calculations are suitable for stream processing and large performance increases are possible. The accuracy and potential speed gains are compared and the prospects for future work in the area are discussed
Large Scale Document Inversion using a Multi-threaded Computing System.

Science.gov (United States)

Jung, Sungbo; Chang, Dar-Jen; Park, Juw Won

2017-06-01

Current microprocessor architecture is moving towards multi-core/multi-threaded systems. This trend has led to a surge of interest in using multi-threaded computing devices, such as the Graphics Processing Unit (GPU), for general purpose computing. We can utilize the GPU in computation as a massively parallel coprocessor because the GPU consists of multiple cores. The GPU is also an affordable, attractive, and user-programmable commodity. Nowadays, a vast amount of information is flooding into the digital domain around the world. Huge volumes of data, such as digital libraries, social networking services, e-commerce product data, and reviews, are produced or collected every moment with dramatic growth in size. Although the inverted index is a useful data structure for full text searches or document retrieval, building the index for a large number of documents requires a tremendous amount of time. The performance of document inversion can be improved by a multi-threaded or multi-core GPU. Our approach is to implement a linear-time, hash-based, single program multiple data (SPMD) document inversion algorithm on the NVIDIA GPU/CUDA programming platform, utilizing the huge computational power of the GPU to develop high performance solutions for document indexing. Our proposed parallel document inversion system shows 2-3 times faster performance than a sequential system on two different test datasets from PubMed abstracts and e-commerce product reviews. • Information systems ➝ Information retrieval • Computing methodologies ➝ Massively parallel and high-performance simulations.
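For readers unfamiliar with document inversion, the following is a minimal sequential sketch of a hash-based inverted index in Python. It is not the authors' SPMD GPU/CUDA implementation; the whitespace tokenizer and the function names `build_inverted_index` and `search` are assumptions made purely for illustration.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it.

    docs: dict of {doc_id: text}. A hash map (dict) keyed by term gives
    expected linear time in the total number of tokens.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():  # naive whitespace tokenizer
            index[token].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, *terms):
    """Return the ids of documents containing every query term."""
    postings = [set(index.get(term.lower(), [])) for term in terms]
    return sorted(set.intersection(*postings)) if postings else []


docs = {1: "GPU computing", 2: "gpu index", 3: "document index"}
index = build_inverted_index(docs)
hits = search(index, "gpu", "index")
```

A parallel version would split the documents across threads and merge the per-thread postings, which is where the hashing and synchronization costs discussed in the abstract arise.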
Operational mesoscale atmospheric dispersion prediction using high performance parallel computing cluster for emergency response

International Nuclear Information System (INIS)

Srinivas, C.V.; Venkatesan, R.; Muralidharan, N.V.; Das, Someshwar; Dass, Hari; Eswara Kumar, P.

2005-08-01

An operational atmospheric dispersion prediction system is implemented on a cluster supercomputer for 'Online Emergency Response' at the Kalpakkam nuclear site. The numerical system consists of a parallel version of the nested-grid mesoscale meteorological model MM5 coupled to the random-walk particle dispersion model FLEXPART. The system provides a 48-hour forecast of the local weather and of radioactive plume dispersion due to hypothetical airborne releases within a range of 100 km around the site. The parallel code was implemented on different cluster configurations, such as distributed and shared memory systems. Results of MM5 run-time performance for a 1-day prediction are reported for all the machines available for testing. A fivefold reduction in run time is achieved using 9 dual Xeon nodes (18 physical/36 logical processors) compared to a single-node sequential run. Based on these run-time results, a 9-node dual Xeon cluster computer facility was commissioned at IGCAR for model operation. The run time of a triple-nested-domain MM5 is about 4 h for a 24 h forecast. The system has been operated continuously for a few months and results were ported on the IMSc home page. Initial and periodic boundary condition data for MM5 are provided by NCMRWF, New Delhi. An alternative source is found to be NCEP, USA. These two sources provide the input data to the operational models at different spatial and temporal resolutions and using different assimilation methods. A comparative study of the forecast results is presented using these two data sources for present operational use. A slight improvement is noticed in rainfall, winds, geopotential heights and the vertical atmospheric structure when using NCEP data, probably because of its high spatial and temporal resolution. (author)
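The reported fivefold run-time reduction on 9 dual Xeon nodes can be turned into a parallel efficiency figure. The short Python sketch below does that arithmetic; the function names are hypothetical, and the choice of normalizing by nodes or by physical processors is an assumption, since the record does not say which baseline the authors prefer.

```python
def speedup(t_serial, t_parallel):
    """Classical speedup: serial run time divided by parallel run time."""
    return t_serial / t_parallel


def parallel_efficiency(speedup_factor, n_units):
    """Fraction of ideal linear scaling achieved on n_units workers."""
    return speedup_factor / n_units


reported_speedup = 5.0  # "reduction of 5 times in runtime" from the abstract
eff_per_node = parallel_efficiency(reported_speedup, 9)        # 9 nodes
eff_per_processor = parallel_efficiency(reported_speedup, 18)  # 18 physical CPUs

print(f"efficiency per node: {eff_per_node:.2f}")
print(f"efficiency per physical processor: {eff_per_processor:.2f}")
```

Under these assumptions the efficiency is roughly 0.56 per node and 0.28 per physical processor, which is typical of communication-bound nested-grid models on commodity clusters.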
Computed radiography systems performance evaluation;Avaliacao de desempenho de sistemas de radiografia computadorizada

Energy Technology Data Exchange (ETDEWEB)

Xavier, Clarice C.; Nersissian, Denise Y.; Furquim, Tania A.C. [Universidade de Sao Paulo (IEE/USP), SP (Brazil). Inst. de Eletrotecnica e Energia

2009-07-01

The performance of a computed radiography system was evaluated according to AAPM Report No. 93. Evaluation tests proposed by the publication were performed, and the following nonconformities were found: imaging plate (IP) dark noise, which compromises the clinical image acquired using the IP; exposure indicator uncalibrated, which can cause underexposure to the IP; nonlinearity of the system response, which causes overexposure; resolution limit under that declared by the manufacturer and erasure thoroughness uncalibrated, impairing structure visualization; Moire pattern visualized at the grid response; and IP throughput over that specified by the manufacturer. These nonconformities indicate that digital imaging systems' lack of calibration can cause an increase in dose in order that image problems can be solved. (author)
The off-line computation system for supervising performance of JOYO: JOYPAC system, 1

International Nuclear Information System (INIS)

Katsuragi, Satoru; Inoue, Teruji; Shimizu, Akinao; Yoshino, Fujio; Suzuki, Masao.

1976-10-01

A code system JOYPAC for monitoring the operation of the fast experimental reactor JOYO has been developed. This is an off-line code system designed for use in making calculation of the nuclear and thermohydraulic characteristics of the reactor core and also to make computation of the history of core irradiation after reactor operation. The use of the code system makes it possible to calculate the various core characteristics with a high degree of accuracy by simplified procedure for the diverse operation patterns of JOYO to confirm its safety. It also enables the details of the history of irradiation of the core to be obtained quickly and accurately after reactor operation. The above include all the operation data and in-pile characteristics that are required for the irradiation test. Furthermore, it is also possible to provide the data for the on-line computer system of JOYO and the data for nuclear material accountability. The code system consists of the detailed subsystem and the simplified subsystem. The former is used for obtaining the nuclear and thermohydraulic characteristics of the core by use of a detailed calculation model such as three-dimensional hexagonal lattice, for instance, in order to back up the simplified subsystem. On the other hand, the latter is designed to obtain the various core characteristics by use of simple extrapolation and interpolation methods, whose conception is based on the great deal of information obtained by the design calculation of JOYO and the many parameter surveys. The system is used for the normal cycle operation. (J.P.N.)
Contributing to the design of run-time systems dedicated to high performance computing; Contribution a l'elaboration d'environnements de programmation dedies au calcul scientifique hautes performances

Energy Technology Data Exchange (ETDEWEB)

Perache, M

2006-10-15

In the field of intensive scientific computing, the quest for performance has to face the increasing complexity of parallel architectures. Nowadays, these machines exhibit a deep memory hierarchy which complicates the design of efficient parallel applications. This thesis proposes a programming environment for designing efficient parallel programs on top of clusters of multi-processors. It features a programming model centered around collective communications and synchronizations, and provides load balancing facilities. The programming interface, named MPC, provides high level paradigms which are optimized according to the underlying architecture. The environment is fully functional and used within the CEA/DAM (TERANOVA) computing center. The evaluations presented in this document confirm the relevance of our approach. (author)
A high performance, low power computational platform for complex sensing operations in smart cities

KAUST Repository

Jiang, Jiming; Claudel, Christian

2017-01-01

This paper presents a new wireless platform designed for an integrated traffic/flash flood monitoring system. The sensor platform is built around a 32-bit ARM Cortex M4 microcontroller and a 2.4GHz 802.15.4 ISM compliant radio module. It can be interfaced with fixed traffic sensors, or receive data from vehicle transponders. This platform is specifically designed for solar-powered, low bandwidth, high computational performance wireless sensor network applications. A self-recovering unit is designed to increase reliability and allow periodic hard resets, an essential requirement for sensor networks. Radio monitoring circuitry is proposed to monitor incoming and outgoing transmissions, simplifying software debugging. We illustrate the performance of this wireless sensor platform on complex problems arising in smart cities, such as traffic flow monitoring, machine-learning-based flash flood monitoring or Kalman-filter based vehicle trajectory estimation. All design files have been uploaded and shared in an open science framework, and can be accessed from [1]. The hardware design is under CERN Open Hardware License v1.2.
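To make the Kalman-filter based vehicle trajectory estimation mentioned above concrete, here is a minimal constant-velocity filter in plain Python. It is a sketch under assumptions, not the authors' firmware: the state is (position, velocity), only position is measured, the process noise is simplified to q times the identity, and the names `kalman_step` and `track` as well as the noise values are hypothetical.

```python
def kalman_step(state, cov, z, dt=1.0, q=1e-3, r=1.0):
    """One predict/update cycle of a constant-velocity Kalman filter.

    state: (position, velocity); cov: 2x2 covariance as nested lists;
    z: noisy position measurement; q, r: illustrative noise variances.
    """
    pos, vel = state
    (p00, p01), (p10, p11) = cov

    # Predict: x' = F x and P' = F P F^T + q*I, with F = [[1, dt], [0, 1]].
    pos_p = pos + dt * vel
    vel_p = vel
    a00 = p00 + dt * (p10 + p01) + dt * dt * p11 + q
    a01 = p01 + dt * p11
    a10 = p10 + dt * p11
    a11 = p11 + q

    # Update with H = [1, 0]: only the position is observed.
    innovation = z - pos_p
    s = a00 + r                    # innovation variance
    k0, k1 = a00 / s, a10 / s      # Kalman gain
    new_state = (pos_p + k0 * innovation, vel_p + k1 * innovation)
    new_cov = [
        [(1 - k0) * a00, (1 - k0) * a01],
        [a10 - k1 * a00, a11 - k1 * a01],
    ]
    return new_state, new_cov


def track(measurements, dt=1.0):
    """Run the filter over a sequence of position measurements."""
    state, cov = (0.0, 0.0), [[100.0, 0.0], [0.0, 100.0]]  # vague prior
    for z in measurements:
        state, cov = kalman_step(state, cov, z, dt)
    return state


# A vehicle moving at 2 units per step, observed without noise.
final_pos, final_vel = track([2.0 * k for k in range(1, 21)])
```

The scalar arithmetic avoids any matrix library, which is the kind of implementation that fits a Cortex M4 class microcontroller; a production filter would also tune q and r from sensor data.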
A high performance, low power computational platform for complex sensing operations in smart cities

KAUST Repository

Jiang, Jiming

2017-02-02

This paper presents a new wireless platform designed for an integrated traffic/flash flood monitoring system. The sensor platform is built around a 32-bit ARM Cortex M4 microcontroller and a 2.4GHz 802.15.4 ISM compliant radio module. It can be interfaced with fixed traffic sensors, or receive data from vehicle transponders. This platform is specifically designed for solar-powered, low bandwidth, high computational performance wireless sensor network applications. A self-recovering unit is designed to increase reliability and allow periodic hard resets, an essential requirement for sensor networks. Radio monitoring circuitry is proposed to monitor incoming and outgoing transmissions, simplifying software debugging. We illustrate the performance of this wireless sensor platform on complex problems arising in smart cities, such as traffic flow monitoring, machine-learning-based flash flood monitoring or Kalman-filter based vehicle trajectory estimation. All design files have been uploaded and shared in an open science framework, and can be accessed from [1]. The hardware design is under CERN Open Hardware License v1.2.
Visual Analysis of Cloud Computing Performance Using Behavioral Lines.

Science.gov (United States)

Muelder, Chris; Zhu, Biao; Chen, Wei; Zhang, Hongxin; Ma, Kwan-Liu

2016-02-29

Cloud computing is an essential technology to Big Data analytics and services. A cloud computing system is often comprised of a large number of parallel computing and storage devices. Monitoring the usage and performance of such a system is important for efficient operations, maintenance, and security. Tracing every application on a large cloud system is untenable due to scale and privacy issues. But profile data can be collected relatively efficiently by regularly sampling the state of the system, including properties such as CPU load, memory usage, network usage, and others, creating a set of multivariate time series for each system. Adequate tools for studying such large-scale, multidimensional data are lacking. In this paper, we present a visual based analysis approach to understanding and analyzing the performance and behavior of cloud computing systems. Our design is based on similarity measures and a layout method to portray the behavior of each compute node over time. When visualizing a large number of behavioral lines together, distinct patterns often appear suggesting particular types of performance bottleneck. The resulting system provides multiple linked views, which allow the user to interactively explore the data by examining the data or a selected subset at different levels of detail. Our case studies, which use datasets collected from two different cloud systems, show that this visual based approach is effective in identifying trends and anomalies of the systems.
Modern Embedded Computing Designing Connected, Pervasive, Media-Rich Systems

CERN Document Server

Barry, Peter

2012-01-01

Modern embedded systems are used for connected, media-rich, and highly integrated handheld devices such as mobile phones, digital cameras, and MP3 players. All of these embedded systems require networking, graphic user interfaces, and integration with PCs, as opposed to traditional embedded processors that can perform only limited functions for industrial applications. While most books focus on these controllers, Modern Embedded Computing provides a thorough understanding of the platform architecture of modern embedded computing systems that drive mobile devices. The book offers a comprehen
Safety of High Speed Ground Transportation Systems : Analytical Methodology for Safety Validation of Computer Controlled Subsystems : Volume 2. Development of a Safety Validation Methodology

Science.gov (United States)

1995-01-01

This report describes the development of a methodology designed to assure that a sufficiently high level of safety is achieved and maintained in computer-based systems which perform safety-critical functions in high-speed rail or magnetic levitation ...
Soft computing in green and renewable energy systems

Energy Technology Data Exchange (ETDEWEB)

Gopalakrishnan, Kasthurirangan [Iowa State Univ., Ames, IA (United States). Iowa Bioeconomy Inst.; US Department of Energy, Ames, IA (United States). Ames Lab; Kalogirou, Soteris [Cyprus Univ. of Technology, Limassol (Cyprus). Dept. of Mechanical Engineering and Materials Sciences and Engineering; Khaitan, Siddhartha Kumar (eds.) [Iowa State Univ. of Science and Technology, Ames, IA (United States). Dept. of Electrical Engineering and Computer Engineering

2011-07-01

Soft Computing in Green and Renewable Energy Systems provides a practical introduction to the application of soft computing techniques and hybrid intelligent systems for designing, modeling, characterizing, optimizing, forecasting, and performance prediction of green and renewable energy systems. Research is proceeding at jet speed on renewable energy (energy derived from natural resources such as sunlight, wind, tides, rain, geothermal heat, biomass, hydrogen, etc.) as policy makers, researchers, economists, and world agencies have joined forces in finding alternative sustainable energy solutions to current critical environmental, economic, and social issues. The innovative models, environmentally benign processes, data analytics, etc. employed in renewable energy systems are computationally-intensive, non-linear and complex as well as involve a high degree of uncertainty. Soft computing technologies, such as fuzzy sets and systems, neural science and systems, evolutionary algorithms and genetic programming, and machine learning, are ideal in handling the noise, imprecision, and uncertainty in the data, and yet achieve robust, low-cost solutions. As a result, intelligent and soft computing paradigms are finding increasing applications in the study of renewable energy systems. Researchers, practitioners, undergraduate and graduate students engaged in the study of renewable energy systems will find this book very useful. (orig.)
THE IMPROVEMENT OF COMPUTER NETWORK PERFORMANCE WITH BANDWIDTH MANAGEMENT IN KEMURNIAN II SENIOR HIGH SCHOOL

Directory of Open Access Journals (Sweden)

Bayu Kanigoro

2012-05-01

This research describes the improvement of computer network performance with bandwidth management at Kemurnian II Senior High School. The main issue is the absence of bandwidth division among computers: a user who is downloading data absorbs all of the provided bandwidth, leaving none for other users. In addition, IP addresses have been divided by room (computer, teacher, and administration rooms) to support the learning process at the school, and a wireless network is needed. The method consists of on-site observation and interviews with related parties at the school, followed by network analysis and the design of a new network topology, including the wireless network and its configuration, and bandwidth separation and limits on a MikroTik router. As a result, network traffic at Kemurnian II Senior High School can be shared evenly among users, and IX and IIX traffic are separated, which improves network access speed at the school and enables the wireless network. Keywords: Bandwidth Management; Wireless Network
RGCA: A Reliable GPU Cluster Architecture for Large-Scale Internet of Things Computing Based on Effective Performance-Energy Optimization.

Science.gov (United States)

Fang, Yuling; Chen, Qingkui; Xiong, Neal N; Zhao, Deyu; Wang, Jingjuan

2017-08-04

This paper aims to develop a low-cost, high-performance and high-reliability computing system to process large-scale data using common data mining algorithms in the Internet of Things (IoT) computing environment. Considering the characteristics of IoT data processing, similar to mainstream high performance computing, we use a GPU (Graphics Processing Unit) cluster to achieve better IoT services. Firstly, we present an energy consumption calculation method (ECCM) based on WSNs. Then, using the CUDA (Compute Unified Device Architecture) programming model, we propose a Two-level Parallel Optimization Model (TLPOM) which exploits reasonable resource planning and common compiler optimization techniques to obtain the best blocks and threads configuration considering the resource constraints of each node. The key to this part is dynamically coupling Thread-Level Parallelism (TLP) and Instruction-Level Parallelism (ILP) to improve the performance of the algorithms without additional energy consumption. Finally, combining the ECCM and the TLPOM, we use the Reliable GPU Cluster Architecture (RGCA) to obtain a high-reliability computing system considering the nodes' diversity, algorithm characteristics, etc. The results show that with TLPOM the performance of the algorithms increased significantly, by 34.1%, 33.96% and 24.07% on average for Fermi, Kepler and Maxwell respectively, and that the RGCA ensures that our IoT computing system provides low-cost and high-reliability services.
High Performance Work System, HRD Climate and Organisational Performance: An Empirical Study

Science.gov (United States)

Muduli, Ashutosh

2015-01-01

Purpose: This paper aims to study the relationship between high-performance work system (HPWS) and organizational performance and to examine the role of human resource development (HRD) Climate in mediating the relationship between HPWS and the organizational performance in the context of the power sector of India. Design/methodology/approach: The…
Imaging performance of a hybrid x-ray computed tomography-fluorescence molecular tomography system using priors.

Science.gov (United States)

Ale, Angelique; Schulz, Ralf B; Sarantopoulos, Athanasios; Ntziachristos, Vasilis

2010-05-01

The performance is studied of two newly introduced and previously suggested methods that incorporate priors into inversion schemes associated with data from a recently developed hybrid x-ray computed tomography and fluorescence molecular tomography system, the latter based on CCD camera photon detection. The unique data set studied attains accurately registered data of high spatially sampled photon fields propagating through tissue along 360 degrees projections. Approaches that incorporate structural prior information were included in the inverse problem by adding a penalty term to the minimization function utilized for image reconstructions. Results were compared as to their performance with simulated and experimental data from a lung inflammation animal model and against the inversions achieved when not using priors. The importance of using priors over stand-alone inversions is also showcased with high spatial sampling simulated and experimental data. The approach of optimal performance in resolving fluorescent biodistribution in small animals is also discussed. Inclusion of prior information from x-ray CT data in the reconstruction of the fluorescence biodistribution leads to improved agreement between the reconstruction and validation images for both simulated and experimental data.
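The idea of adding a penalty term to the minimization function can be written out generically. The formula below is a standard Tikhonov-style sketch and not necessarily the functional used by the authors; the symbols $W$ (forward model of photon propagation), $y$ (measured fluorescence data), $x$ (fluorochrome distribution), $L$ (a regularization matrix built from the x-ray CT segmentation) and $\lambda$ (penalty weight) are all assumptions introduced for illustration.

```latex
\hat{x} \;=\; \arg\min_{x}\; \left\| W x - y \right\|_2^2 \;+\; \lambda \left\| L x \right\|_2^2
```

With $L$ set to the identity this reduces to a stand-alone inversion with no anatomical information, which corresponds to the "not using priors" baseline that the abstract compares against.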

A virtual computing infrastructure for TS-CV SCADA systems

CERN Document Server

Poulsen, S

2008-01-01

In modern data centres, it is an emerging trend to operate and manage computers as software components or logical resources and not as physical machines. This technique is known as “virtualisation” and the new computers are referred to as “virtual machines” (VMs). Multiple VMs can be consolidated on a single hardware platform and managed in ways that are not possible with physical machines. However, this is not yet widely practiced for control system deployment. In TS-CV, a collection of VMs or a “virtual infrastructure” has been installed since 2005 for SCADA systems, PLC program development, and alarm transmission. This makes it possible to consolidate distributed, heterogeneous operating systems and applications on a limited number of standardised high-performance servers in the Central Control Room (CCR). More generally, virtualisation assists in offering continuous computing services for controls and maintaining performance and assuring quality. Implementing our systems in a vi...
Secure computing on reconfigurable systems

OpenAIRE

Fernandes Chaves, R.J.

2007-01-01

This thesis proposes a Secure Computing Module (SCM) for reconfigurable computing systems. SC provides a protected and reliable computational environment, where data security and protection against malicious attacks to the system is assured. SC is strongly based on encryption algorithms and on the attestation of the executed functions. The use of SC on reconfigurable devices has the advantage of being highly adaptable to the application and the user requirements, while providing high performa...
A Case Study on Neural Inspired Dynamic Memory Management Strategies for High Performance Computing.

Energy Technology Data Exchange (ETDEWEB)

Vineyard, Craig Michael [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Verzi, Stephen Joseph [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

2017-09-01

As high performance computing architectures pursue more computational power, there is a need for increased memory capacity and bandwidth as well. A multi-level memory (MLM) architecture addresses this need by combining multiple memory types with different characteristics as varying levels of the same architecture. How to efficiently utilize this memory infrastructure is an unknown challenge, and in this research we sought to investigate whether neural inspired approaches can meaningfully help with memory management. In particular we explored neurogenesis inspired resource allocation, and were able to show a neural inspired mixed controller policy can beneficially impact how MLM architectures utilize memory.
2003 Conference for Computing in High Energy and Nuclear Physics

International Nuclear Information System (INIS)

Schalk, T.

2003-01-01

The conference was subdivided into the following separate tracks. Electronic presentations and/or videos are provided on the main website link. Sessions: Plenary Talks and Panel Discussion; Grid Architecture, Infrastructure, and Grid Security; HENP Grid Applications, Testbeds, and Demonstrations; HENP Computing Systems and Infrastructure; Monitoring; High Performance Networking; Data Acquisition, Triggers and Controls; First Level Triggers and Trigger Hardware; Lattice Gauge Computing; HENP Software Architecture and Software Engineering; Data Management and Persistency; Data Analysis Environment and Visualization; Simulation and Modeling; and Collaboration Tools and Information Systems
Computationally-optimized bone mechanical modeling from high-resolution structural images.

Directory of Open Access Journals (Sweden)

Jeremy F Magland

Image-based mechanical modeling of the complex micro-structure of human bone has shown promise as a non-invasive method for characterizing bone strength and fracture risk in vivo. In particular, elastic moduli obtained from image-derived micro-finite element (μFE) simulations have been shown to correlate well with results obtained by mechanical testing of cadaveric bone. However, most existing large-scale finite-element simulation programs require significant computing resources, which hampers their use in common laboratory and clinical environments. In this work, we theoretically derive and computationally evaluate the resources needed to perform such simulations (in terms of computer memory and computation time), which depend on the number of finite elements in the image-derived bone model. A detailed description of our approach is provided, which is specifically optimized for μFE modeling of the complex three-dimensional architecture of trabecular bone. Our implementation includes domain decomposition for parallel computing, a novel stopping criterion, and a system for speeding up convergence by pre-iterating on coarser grids. The performance of the system is demonstrated on dual quad-core Xeon 3.16 GHz CPUs equipped with 40 GB of RAM. Models of distal tibia derived from 3D in-vivo MR images in a patient comprising 200,000 elements required less than 30 seconds to converge (and 40 MB of RAM). To illustrate the system's potential for large-scale μFE simulations, axial stiffness was estimated from high-resolution micro-CT images of a voxel array of 90 million elements comprising the human proximal femur in seven hours of CPU time. In conclusion, the system described should enable image-based finite-element bone simulations in practical computation times on high-end desktop computers with applications to laboratory studies and clinical imaging.
Accessible high performance computing solutions for near real-time image processing for time critical applications

Science.gov (United States)

Bielski, Conrad; Lemoine, Guido; Syryczynski, Jacek

2009-09-01

High Performance Computing (HPC) hardware solutions such as grid computing and General-Purpose computing on a Graphics Processing Unit (GPGPU) are now accessible to users with general computing needs. Grid computing infrastructures in the form of computing clusters or blades are becoming commonplace and GPGPU solutions that leverage the processing power of the video card are quickly being integrated into personal workstations. Our interest in these HPC technologies stems from the need to produce near real-time maps from a combination of pre- and post-event satellite imagery in support of post-disaster management. Faster processing provides a twofold gain in this situation: 1. critical information can be provided faster and 2. more elaborate automated processing can be performed prior to providing the critical information. In our particular case, we test the use of the PANTEX index which is based on analysis of image textural measures extracted using anisotropic, rotation-invariant GLCM statistics. The use of this index, applied in a moving window, has been shown to successfully identify built-up areas in remotely sensed imagery. Built-up index image masks are important input to the structuring of damage assessment interpretation because they help optimise the workload. The performance of computing the PANTEX workflow is compared on two different HPC hardware architectures: (1) a blade server with 4 blades, each having dual quad-core CPUs and (2) a CUDA enabled GPU workstation. The reference platform is a dual CPU-quad core workstation and the PANTEX workflow total computing time is measured. Furthermore, as part of a qualitative evaluation, the differences in setting up and configuring various hardware solutions and the related software coding effort are presented.
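The GLCM statistics mentioned in this abstract can be illustrated with a minimal sketch. It builds a gray-level co-occurrence matrix for a single displacement inside one window and computes the standard contrast statistic. This is not the PANTEX workflow itself, and the names `glcm`, `contrast`, and `window` are hypothetical; a rotation-invariant measure would, presumably, combine several displacement directions.

```python
def glcm(image, dx, dy, levels):
    """Count co-occurrences of gray levels (i, j) at displacement (dx, dy)."""
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Only count pairs whose second pixel lies inside the window.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts


def contrast(counts):
    """GLCM contrast: (i - j)^2 averaged over all counted pixel pairs."""
    total = sum(sum(row) for row in counts)
    if total == 0:
        return 0.0
    weighted = sum((i - j) ** 2 * n
                   for i, row in enumerate(counts)
                   for j, n in enumerate(row))
    return weighted / total


# A 4x4 window with four gray levels (illustrative data, not from the paper).
window = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
horizontal = glcm(window, dx=1, dy=0, levels=4)
print(contrast(horizontal))
```

Each window is processed independently of the others, which is what makes a moving-window texture measure of this kind a natural fit for cluster or GPU parallelism.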
GPUs for real-time processing in HEP trigger systems (CHEP2013: 20. international conference on computing in high energy and nuclear physics)

Energy Technology Data Exchange (ETDEWEB)

Lamanna, G; Lamanna, G; Piandani, R [INFN, Pisa (Italy); Ammendola, R [INFN, Rome " Tor Vergata" (Italy); Bauce, M; Giagu, S; Messina, A [University, Rome " Sapienza" (Italy); Biagioni, A; Lonardo, A; Paolucci, P S; Rescigno, M; Simula, F; Vicini, P [INFN, Rome " Sapienza" (Italy); Fantechi, R [CERN, Geneve (Switzerland); Fiorini, M [University and INFN, Ferrara (Italy); Graverini, E; Pantaleo, F; Sozzi, M [University, Pisa (Italy)

2014-06-11

We describe a pilot project for the use of Graphics Processing Units (GPUs) for online triggering applications in High Energy Physics (HEP) experiments. Two major trends can be identified in the development of trigger and DAQ systems for HEP experiments: the massive use of general-purpose commodity systems such as commercial multicore PC farms for data acquisition, and the reduction of trigger levels implemented in hardware, towards a pure software selection system (trigger-less). The very innovative approach presented here aims at exploiting the parallel computing power of commercial GPUs to perform fast computations in software both at low- and high-level trigger stages. General-purpose computing on GPUs is emerging as a new paradigm in several fields of science, although so far applications have been tailored to the specific strengths of such devices as accelerators in offline computation. With the steady reduction of GPU latencies, and the increase in link and memory throughputs, the use of such devices for real-time applications in high-energy physics data acquisition and trigger systems is becoming very attractive. We discuss in detail the use of online parallel computing on GPUs for synchronous low-level trigger with fixed latency. In particular we show preliminary results on a first test in the NA62 experiment at CERN. The use of GPUs in high-level triggers is also considered; the ATLAS experiment (and in particular the muon trigger) at CERN will be taken as a study case of possible applications.
GPUs for real-time processing in HEP trigger systems (CHEP2013: 20. international conference on computing in high energy and nuclear physics)

International Nuclear Information System (INIS)

Lamanna, G; Lamanna, G; Piandani, R; Ammendola, R; Bauce, M; Giagu, S; Messina, A; Biagioni, A; Lonardo, A; Paolucci, P S; Rescigno, M; Simula, F; Vicini, P; Fantechi, R; Fiorini, M; Graverini, E; Pantaleo, F; Sozzi, M

2014-01-01

We describe a pilot project for the use of Graphics Processing Units (GPUs) for online triggering applications in High Energy Physics (HEP) experiments. Two major trends can be identified in the development of trigger and DAQ systems for HEP experiments: the massive use of general-purpose commodity systems such as commercial multicore PC farms for data acquisition, and the reduction of trigger levels implemented in hardware, towards a pure software selection system (trigger-less). The very innovative approach presented here aims at exploiting the parallel computing power of commercial GPUs to perform fast computations in software both at low- and high-level trigger stages. General-purpose computing on GPUs is emerging as a new paradigm in several fields of science, although so far applications have been tailored to the specific strengths of such devices as accelerators in offline computation. With the steady reduction of GPU latencies, and the increase in link and memory throughputs, the use of such devices for real-time applications in high-energy physics data acquisition and trigger systems is becoming very attractive. We discuss in detail the use of online parallel computing on GPUs for synchronous low-level trigger with fixed latency. In particular we show preliminary results on a first test in the NA62 experiment at CERN. The use of GPUs in high-level triggers is also considered; the ATLAS experiment (and in particular the muon trigger) at CERN will be taken as a study case of possible applications.
Exploring Infiniband Hardware Virtualization in OpenNebula towards Efficient High-Performance Computing

Energy Technology Data Exchange (ETDEWEB)

Pais Pitta de Lacerda Ruivo, Tiago [IIT, Chicago; Bernabeu Altayo, Gerard [Fermilab; Garzoglio, Gabriele [Fermilab; Timm, Steven [Fermilab; Kim, Hyun-Woo [Fermilab; Noh, Seo-Young [KISTI, Daejeon; Raicu, Ioan [IIT, Chicago

2014-11-11

It has been widely accepted that software virtualization has a big negative impact on high-performance computing (HPC) application performance. This work explores the potential use of Infiniband hardware virtualization in an OpenNebula cloud towards the efficient support of MPI-based workloads. We have implemented, deployed, and tested an Infiniband network on the FermiCloud private Infrastructure-as-a-Service (IaaS) cloud. To avoid software virtualization and minimize the virtualization overhead, we employed a technique called Single Root Input/Output Virtualization (SRIOV). Our solution spanned modifications to the Linux hypervisor as well as the OpenNebula manager. We evaluated the performance of the hardware virtualization on up to 56 virtual machines connected by up to 8 DDR Infiniband network links, with micro-benchmarks (latency and bandwidth) as well as with an MPI-intensive application (the HPL Linpack benchmark).
A methodology for performing computer security reviews

International Nuclear Information System (INIS)

Hunteman, W.J.

1991-01-01

DOE Order 5637.1, "Classified Computer Security," requires regular reviews of the computer security activities for an ADP system and for a site. Based on experiences gained in the Los Alamos computer security program through interactions with DOE facilities, we have developed a methodology to aid a site or security officer in performing a comprehensive computer security review. The methodology is designed to aid a reviewer in defining goals of the review (e.g., preparation for inspection), determining security requirements based on DOE policies, determining threats/vulnerabilities based on DOE and local threat guidance, and identifying critical system components to be reviewed. Application of the methodology will result in review procedures and checklists oriented to the review goals, the target system, and DOE policy requirements. The review methodology can be used to prepare for an audit or inspection and as a periodic self-check tool to determine the status of the computer security program for a site or specific ADP system. 1 tab
A methodology for performing computer security reviews

International Nuclear Information System (INIS)

Hunteman, W.J.

1991-01-01

This paper reports on DOE Order 5637.1, Classified Computer Security, which requires regular reviews of the computer security activities for an ADP system and for a site. Based on experiences gained in the Los Alamos computer security program through interactions with DOE facilities, the authors have developed a methodology to aid a site or security officer in performing a comprehensive computer security review. The methodology is designed to aid a reviewer in defining goals of the review (e.g., preparation for inspection), determining security requirements based on DOE policies, determining threats/vulnerabilities based on DOE and local threat guidance, and identifying critical system components to be reviewed. Application of the methodology will result in review procedures and checklists oriented to the review goals, the target system, and DOE policy requirements. The review methodology can be used to prepare for an audit or inspection and as a periodic self-check tool to determine the status of the computer security program for a site or specific ADP system.
Heads in the Cloud: A Primer on Neuroimaging Applications of High Performance Computing.

Science.gov (United States)

Shatil, Anwar S; Younas, Sohail; Pourreza, Hossein; Figley, Chase R

2015-01-01

With larger data sets and more sophisticated analyses, it is becoming increasingly common for neuroimaging researchers to push (or exceed) the limitations of standalone computer workstations. Nonetheless, although high-performance computing platforms such as clusters, grids and clouds are already in routine use by a small handful of neuroimaging researchers to increase their storage and/or computational power, the adoption of such resources by the broader neuroimaging community remains relatively uncommon. Therefore, the goal of the current manuscript is to: 1) inform prospective users about the similarities and differences between computing clusters, grids and clouds; 2) highlight their main advantages; 3) discuss when it may (and may not) be advisable to use them; 4) review some of their potential problems and barriers to access; and finally 5) give a few practical suggestions for how interested new users can start analyzing their neuroimaging data using cloud resources. Although the aim of cloud computing is to hide most of the complexity of the infrastructure management from end-users, we recognize that this can still be an intimidating area for cognitive neuroscientists, psychologists, neurologists, radiologists, and other neuroimaging researchers lacking a strong computational background. Therefore, with this in mind, we have aimed to provide a basic introduction to cloud computing in general (including some of the basic terminology, computer architectures, infrastructure and service models, etc.), a practical overview of the benefits and drawbacks, and a specific focus on how cloud resources can be used for various neuroimaging applications.
Brain inspired high performance electronics on flexible silicon

KAUST Repository

Sevilla, Galo T.

2014-06-01

The brain's stunning speed, energy efficiency and massive parallelism make it the role model for upcoming high performance computation systems. Although human brain components are a million times slower than state of the art silicon industry components [1], they can perform 10^16 operations per second while consuming less power than an electrical light bulb. In order to perform the same amount of computation with today's most advanced computers, the output of an entire power station would be needed. In that sense, to obtain brain-like computation, ultra-fast devices with ultra-low power consumption will have to be integrated in extremely reduced areas, achievable only if the brain's folded structure is mimicked. Therefore, to allow brain-inspired computation, a flexible and transparent platform will be needed to achieve foldable structures and their integration on asymmetric surfaces. In this work, we show a new method to fabricate 3D and planar FET architectures in flexible and semitransparent silicon fabric without compromising performance while maintaining the cost/yield advantage offered by silicon-based electronics.
Computer control system of the cooler-synchrotron TARN-II

International Nuclear Information System (INIS)

Watanabe, S.; Watanabe, T.; Yoshizawa, M.; Katayama, T.

1993-11-01

The client-server model enables us to develop a flexible control system such as the TARN-II computer control system. The system forms a single machine that includes a message bus for communication between its components. An auxiliary control path in the client-server model provides high-speed device control. The configuration and performance of the control system are described. (author)
Large Scale Document Inversion using a Multi-threaded Computing System

Science.gov (United States)

Jung, Sungbo; Chang, Dar-Jen; Park, Juw Won

2018-01-01

Current microprocessor architecture is moving towards multi-core/multi-threaded systems. This trend has led to a surge of interest in using multi-threaded computing devices, such as the Graphics Processing Unit (GPU), for general purpose computing. We can utilize the GPU in computation as a massively parallel coprocessor because the GPU consists of multiple cores. The GPU is also an affordable, attractive, and user-programmable commodity. Today a flood of information is entering the digital domain around the world. Huge volumes of data, such as digital libraries, social networking services, e-commerce product data, and reviews, are produced or collected every moment and grow dramatically in size. Although the inverted index is a useful data structure that can be used for full text searches or document retrieval, a large number of documents will require a tremendous amount of time to create the index. The performance of document inversion can be improved by a multi-threaded or multi-core GPU. Our approach is to implement a linear-time, hash-based, single program multiple data (SPMD) document inversion algorithm on the NVIDIA GPU/CUDA programming platform, utilizing the huge computational power of the GPU to develop high performance solutions for document indexing. Our proposed parallel document inversion system runs 2-3 times faster than a sequential system on two different test datasets from PubMed abstracts and e-commerce product reviews. CCS Concepts: • Information systems → Information retrieval • Computing methodologies → Massively parallel and high-performance simulations.
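The inverted index described in this abstract can be sketched as a short sequential program. This is a minimal illustration, assuming a simple alphanumeric tokenizer, and it is not the authors' CUDA implementation; the names `tokenize`, `invert`, and `docs` are hypothetical. Every term is placed with a hash-table lookup, so the work grows linearly with the total number of tokens, apart from a final sort of each posting list.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def invert(documents):
    """Build an inverted index: term -> sorted list of document ids.

    `documents` maps a document id to its text.
    """
    postings = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            # Hash-based insert; a set avoids duplicate ids per term.
            postings[term].add(doc_id)
    return {term: sorted(ids) for term, ids in postings.items()}


# Illustrative documents, not taken from the paper's datasets.
docs = {
    1: "GPU computing for document retrieval",
    2: "Document inversion on the GPU",
    3: "Full text search",
}
index = invert(docs)
print(index["gpu"])   # [1, 2]
print(index["text"])  # [3]
```

In an SPMD setting, one plausible design is for each thread to run the same loop body over its own slice of the documents, with the per-term posting sets merged afterwards; how the paper actually partitions the work is not stated in the abstract.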
Impact of new computing systems on finite element computations

International Nuclear Information System (INIS)

Noor, A.K.; Fulton, R.E.; Storaasi, O.O.

1983-01-01

Recent advances in computer technology that are likely to impact finite element computations are reviewed. The characteristics of supersystems, highly parallel systems, and small systems (mini and microcomputers) are summarized. The interrelations of numerical algorithms and software with parallel architectures are discussed. A scenario is presented for the future hardware/software environment and finite element systems. A number of research areas which have high potential for improving the effectiveness of finite element analysis in the new environment are identified.
WinHPC System Policies | High-Performance Computing | NREL

Science.gov (United States)

) cluster. The WinHPC login node (WinHPC02) is intended to allow users with approved access to connect to also be run from the login node. There is a single login node for this system so any applications
International Conference on Emerging Technologies for Information Systems, Computing, and Management

CERN Document Server

Ma, Tinghuai; Emerging Technologies for Information Systems, Computing, and Management

2013-01-01

This book aims to examine innovation in the fields of information technology, software engineering, industrial engineering, and management engineering. Topics covered in this publication include: Information System Security, Privacy, Quality Assurance, High-Performance Computing and Information System Management and Integration. The book presents papers from The Second International Conference for Emerging Technologies Information Systems, Computing, and Management (ICM2012), which was held on December 1 to 2, 2012 in Hangzhou, China.
CHEP95: Computing in high energy physics. Abstracts

International Nuclear Information System (INIS)

1995-01-01

These proceedings cover the technical papers on computation in High Energy Physics, including computer codes, computer devices, control systems, simulations, data acquisition systems. New approaches on computer architectures are also discussed
Frontiers of performance analysis on leadership-class systems

Energy Technology Data Exchange (ETDEWEB)

Fowler, R [Renaissance Computing Institute, UNC, Chapel Hill, North Carolina (United States); Adhianto, L; Fagan, M; Krentel, M; Mellor-Crummey, J; Tallent, N [Rice University, Houston, Texas (United States); Supinski, B de; Gamblin, T; Schulz, M [Lawrence Livermore National Laboratory (United States)

2009-07-01

The number of cores that high-end systems for scientific computing are employing is increasing rapidly. As a result, there is a pressing need for tools that can measure, model, and diagnose performance problems in highly-parallel runs. We describe two tools that employ complementary approaches for analysis at scale and we illustrate their use on DOE leadership-class systems.
Frontiers of performance analysis on leadership-class systems

International Nuclear Information System (INIS)

Fowler, R; Adhianto, L; Fagan, M; Krentel, M; Mellor-Crummey, J; Tallent, N; Supinski, B de; Gamblin, T; Schulz, M

2009-01-01

The number of cores that high-end systems for scientific computing are employing is increasing rapidly. As a result, there is a pressing need for tools that can measure, model, and diagnose performance problems in highly-parallel runs. We describe two tools that employ complementary approaches for analysis at scale and we illustrate their use on DOE leadership-class systems.
Enhancing performance of next generation FSO communication systems using soft computing-based predictions.

Science.gov (United States)

Kazaura, Kamugisha; Omae, Kazunori; Suzuki, Toshiji; Matsumoto, Mitsuji; Mutafungwa, Edward; Korhonen, Timo O; Murakami, Tadaaki; Takahashi, Koichi; Matsumoto, Hideki; Wakamori, Kazuhiko; Arimoto, Yoshinori

2006-06-12

The deterioration and deformation of a free-space optical beam wave-front as it propagates through the atmosphere can reduce the link availability and may introduce burst errors, thus degrading the performance of the system. We investigate the suitability of utilizing soft-computing (SC) based tools for improving performance of free-space optical (FSO) communications systems. The SC based tools are used for the prediction of key parameters of an FSO communications system. Measured data collected from an experimental FSO communication system is used as training and testing data for a proposed multi-layer neural network predictor (MNNP) used to predict future parameter values. The predicted parameters are essential for reducing transmission errors by improving the antenna's accuracy of tracking data beams. This is particularly essential for periods considered to be of strong atmospheric turbulence. The parameter values predicted using the proposed tool show acceptable conformity with original measurements.
High Performance Polar Decomposition on Distributed Memory Systems

KAUST Repository

Sukkari, Dalal E.

2016-08-08

The polar decomposition of a dense matrix is an important operation in linear algebra. It can be directly calculated through the singular value decomposition (SVD) or iteratively using the QR dynamically-weighted Halley algorithm (QDWH). The former is difficult to parallelize due to the preponderant number of memory-bound operations during the bidiagonal reduction. We investigate the latter scenario, which performs more floating-point operations but exposes at the same time more parallelism, and therefore, runs closer to the theoretical peak performance of the system, thanks to more compute-bound matrix operations. Profiling results show the performance scalability of QDWH for calculating the polar decomposition using around 9200 MPI processes on well and ill-conditioned matrices of 100K×100K problem size. We study then the performance impact of the QDWH-based polar decomposition as a pre-processing step toward calculating the SVD itself. The new distributed-memory implementation of the QDWH-SVD solver achieves up to five-fold speedup against current state-of-the-art vendor SVD implementations. © Springer International Publishing Switzerland 2016.
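The two routes to the polar decomposition named in this abstract can be written out as follows. This is a sketch in standard textbook notation rather than the paper's own: the symbols $U_p$, $H$, $X_k$, $\alpha$ and the weights $a_k, b_k, c_k$ are assumptions introduced here, and a square matrix $A$ is assumed for simplicity.

```latex
% Polar decomposition: unitary factor times Hermitian positive semidefinite factor
A = U_p H, \qquad U_p^{*} U_p = I, \qquad H = H^{*} \succeq 0 .

% Direct route via the SVD  A = U \Sigma V^{*}:
U_p = U V^{*}, \qquad H = V \Sigma V^{*} .

% Iterative route (dynamically weighted Halley), with \alpha \approx \|A\|_2:
X_0 = A / \alpha, \qquad
X_{k+1} = X_k \left( a_k I + b_k X_k^{*} X_k \right)
              \left( I + c_k X_k^{*} X_k \right)^{-1},
\qquad X_k \to U_p, \qquad H = U_p^{*} A .
```

With fixed weights $a_k = 3$, $b_k = 1$, $c_k = 3$ this reduces to the classical Halley iteration; the dynamically weighted variant re-chooses the weights at every step to speed up convergence. The "QR" in QDWH refers to evaluating each step through a QR factorization instead of an explicit matrix inverse, which is where the additional compute-bound work mentioned in the abstract comes from.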
Running Interactive Jobs on Peregrine | High-Performance Computing | NREL

Science.gov (United States)

shell prompt, which allows users to execute commands and scripts as they would on the login nodes. Login performed on the compute nodes rather than on login nodes. This page provides instructions and examples of , start GUIs etc. and the commands will execute on that node instead of on the login node. The -V option
An Adaptive Middleware for Improved Computational Performance

DEFF Research Database (Denmark)

Bonnichsen, Lars Frydendal

, we are improving computational performance by exploiting modern hardware features, such as dynamic voltage-frequency scaling and transactional memory. Adapting software is an iterative process, requiring that we continually revisit it to meet new requirements or realities; a time consuming process......The performance improvements in computer systems over the past 60 years have been fueled by an exponential increase in energy efficiency. In recent years, the phenomenon known as the end of Dennard’s scaling has slowed energy efficiency improvements — but improving computer energy efficiency...... is more important now than ever. Traditionally, most improvements in computer energy efficiency have come from improvements in lithography — the ability to produce smaller transistors — and computer architecture - the ability to apply those transistors efficiently. Since the end of scaling, we have seen...
GRID computing for experimental high energy physics

International Nuclear Information System (INIS)

Moloney, G.R.; Martin, L.; Seviour, E.; Taylor, G.N.; Moorhead, G.F.

2002-01-01

The Large Hadron Collider (LHC), to be completed at the CERN laboratory in 2006, will generate 11 petabytes of data per year. The processing of this large data stream requires a large, distributed computing infrastructure. A recent innovation in high performance distributed computing, the GRID, has been identified as an important tool in data analysis for the LHC. GRID computing has actual and potential application in many fields which require computationally intensive analysis of large, shared data sets. The Australian experimental High Energy Physics community has formed partnerships with the High Performance Computing community to establish a GRID node at the University of Melbourne. Through Australian membership of the ATLAS experiment at the LHC, Australian researchers have an opportunity to be involved in the European DataGRID project. This presentation will include an introduction to the GRID, and its application to experimental High Energy Physics. We will present the results of our studies, including participation in the first LHC data challenge.
Fault tolerant computing systems

International Nuclear Information System (INIS)

Randell, B.

1981-01-01

Fault tolerance involves the provision of strategies for error detection, damage assessment, fault treatment and error recovery. A survey is given of the different sorts of strategies used in highly reliable computing systems, together with an outline of recent research on the problems of providing fault tolerance in parallel and distributed computing systems. (orig.)
Decal electronics for printed high performance cmos electronic systems

KAUST Repository

Hussain, Muhammad Mustafa

2017-11-23

High performance complementary metal oxide semiconductor (CMOS) electronics are critical for any full-fledged electronic system. However, state-of-the-art CMOS electronics are rigid and bulky making them unusable for flexible electronic applications. While there exist bulk material reduction methods to flex them, such thinned CMOS electronics are fragile and vulnerable to handling for high throughput manufacturing. Here, we show a fusion of a CMOS technology compatible fabrication process for flexible CMOS electronics, with inkjet and conductive cellulose based interconnects, followed by additive manufacturing (i.e. 3D printing based packaging) and finally roll-to-roll printing of packaged decal electronics (thin film transistors based circuit components and sensors) focusing on printed high performance flexible electronic systems. This work provides the most pragmatic route for packaged flexible electronic systems for wide ranging applications.
Designing a Scalable Fault Tolerance Model for High Performance Computational Chemistry: A Case Study with Coupled Cluster Perturbative Triples.

Science.gov (United States)

van Dam, Hubertus J J; Vishnu, Abhinav; de Jong, Wibe A

2011-01-11

In the past couple of decades, the massive computational power provided by the most modern supercomputers has resulted in simulation of higher-order computational chemistry methods, previously considered intractable. As the system sizes continue to increase, the computational chemistry domain continues to escalate this trend using parallel computing with programming models such as Message Passing Interface (MPI) and Partitioned Global Address Space (PGAS) programming models such as Global Arrays. The ever increasing scale of these supercomputers comes at a cost of reduced Mean Time Between Failures (MTBF), currently on the order of days and projected to be on the order of hours for upcoming extreme scale systems. While traditional disk-based checkpointing methods are ubiquitous for storing intermediate solutions, they suffer from the high overhead of writing and recovering from checkpoints. In practice, checkpointing itself often brings the system down. Clearly, methods beyond checkpointing are imperative for handling the worsening problem of shrinking MTBF. In this paper, we address this challenge by designing and implementing an efficient fault tolerant version of the Coupled Cluster (CC) method with NWChem, using in-memory data redundancy. We present the challenges associated with our design, including an efficient data storage model, maintenance of at least one consistent data copy, and the recovery process. Our performance evaluation without faults shows that the current design exhibits a small overhead. In the presence of a simulated fault, the proposed design incurs negligible overhead in comparison to the state of the art implementation without faults.
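The idea of in-memory data redundancy described in this abstract can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not the NWChem or Global Arrays implementation: the class `ReplicatedStore`, the "next rank is the buddy" placement rule, and the simulated `fail` call are all hypothetical. Each block is kept in the memory of two different ranks, so the loss of any single rank leaves at least one consistent copy.

```python
class ReplicatedStore:
    """Toy model: each block lives on an owner rank and on a buddy rank."""

    def __init__(self, num_ranks):
        if num_ranks < 2:
            raise ValueError("need at least two ranks to hold a replica")
        self.num_ranks = num_ranks
        # memory[rank] maps block id -> data held in that rank's memory.
        self.memory = [dict() for _ in range(num_ranks)]

    def owner(self, block_id):
        return block_id % self.num_ranks

    def buddy(self, block_id):
        # Hypothetical placement rule: the replica goes to the next rank.
        return (self.owner(block_id) + 1) % self.num_ranks

    def put(self, block_id, data):
        # Write the primary copy and the replica as separate lists.
        self.memory[self.owner(block_id)][block_id] = list(data)
        self.memory[self.buddy(block_id)][block_id] = list(data)

    def fail(self, rank):
        # Simulate losing everything held in one rank's memory.
        self.memory[rank] = {}

    def get(self, block_id):
        # Read the primary copy, falling back to the replica.
        for rank in (self.owner(block_id), self.buddy(block_id)):
            if block_id in self.memory[rank]:
                return self.memory[rank][block_id]
        raise KeyError(f"block {block_id} lost on both ranks")


store = ReplicatedStore(num_ranks=4)
for block in range(8):
    store.put(block, [block, block * 2])

store.fail(1)  # one simulated fault
recovered = [store.get(block) for block in range(8)]
print(recovered == [[b, b * 2] for b in range(8)])
```

The trade-off in this sketch is twice the memory footprint in exchange for recovery without touching disk; a second fault on the adjacent rank before the replica is rebuilt would still lose data.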
Relationship between quality of care and choice of clinical computing system: retrospective analysis of family practice performance under the UK's quality and outcomes framework.

Science.gov (United States)

Kontopantelis, Evangelos; Buchan, Iain; Reeves, David; Checkland, Kath; Doran, Tim

2013-08-02

To investigate the relationship between performance on the UK Quality and Outcomes Framework pay-for-performance scheme and choice of clinical computer system. Retrospective longitudinal study. Data for 2007-2008 to 2010-2011, extracted from the clinical computer systems of general practices in England. All English practices participating in the pay-for-performance scheme: average 8257 each year, covering over 99% of the English population registered with a general practice. Levels of achievement on 62 quality-of-care indicators, measured as: reported achievement (levels of care after excluding inappropriate patients); population achievement (levels of care for all patients with the relevant condition) and percentage of available quality points attained. Multilevel mixed effects multiple linear regression models were used to identify population, practice and clinical computing system predictors of achievement. Seven clinical computer systems were consistently active in the study period, collectively holding approximately 99% of the market share. Of all population and practice characteristics assessed, choice of clinical computing system was the strongest predictor of performance across all three outcome measures. Differences between systems were greatest for intermediate outcomes indicators (eg, control of cholesterol levels). Under the UK's pay-for-performance scheme, differences in practice performance were associated with the choice of clinical computing system. This raises the question of whether particular system characteristics facilitate higher quality of care, better data recording or both. Inconsistencies across systems need to be understood and addressed, and researchers need to be cautious when generalising findings from samples of providers using a single computing system.
Misleading Performance Claims in Parallel Computations

Energy Technology Data Exchange (ETDEWEB)

Bailey, David H.

2009-05-29

In a previous humorous note entitled 'Twelve Ways to Fool the Masses,' I outlined twelve common ways in which performance figures for technical computer systems can be distorted. In this paper and accompanying conference talk, I give a reprise of these twelve 'methods' and give some actual examples that have appeared in peer-reviewed literature in years past. I then propose guidelines for reporting performance, the adoption of which would raise the level of professionalism and reduce the level of confusion, not only in the world of device simulation but also in the larger arena of technical computing.
Heads in the Cloud: A Primer on Neuroimaging Applications of High Performance Computing

Science.gov (United States)

Shatil, Anwar S.; Younas, Sohail; Pourreza, Hossein; Figley, Chase R.

2015-01-01

With larger data sets and more sophisticated analyses, it is becoming increasingly common for neuroimaging researchers to push (or exceed) the limitations of standalone computer workstations. Nonetheless, although high-performance computing platforms such as clusters, grids and clouds are already in routine use by a small handful of neuroimaging researchers to increase their storage and/or computational power, the adoption of such resources by the broader neuroimaging community remains relatively uncommon. Therefore, the goal of the current manuscript is to: 1) inform prospective users about the similarities and differences between computing clusters, grids and clouds; 2) highlight their main advantages; 3) discuss when it may (and may not) be advisable to use them; 4) review some of their potential problems and barriers to access; and finally 5) give a few practical suggestions for how interested new users can start analyzing their neuroimaging data using cloud resources. Although the aim of cloud computing is to hide most of the complexity of the infrastructure management from end-users, we recognize that this can still be an intimidating area for cognitive neuroscientists, psychologists, neurologists, radiologists, and other neuroimaging researchers lacking a strong computational background. Therefore, with this in mind, we have aimed to provide a basic introduction to cloud computing in general (including some of the basic terminology, computer architectures, infrastructure and service models, etc.), a practical overview of the benefits and drawbacks, and a specific focus on how cloud resources can be used for various neuroimaging applications. PMID:27279746
Heads in the Cloud: A Primer on Neuroimaging Applications of High Performance Computing

Directory of Open Access Journals (Sweden)

Anwar S. Shatil

2015-01-01

With larger data sets and more sophisticated analyses, it is becoming increasingly common for neuroimaging researchers to push (or exceed) the limitations of standalone computer workstations. Nonetheless, although high-performance computing platforms such as clusters, grids and clouds are already in routine use by a small handful of neuroimaging researchers to increase their storage and/or computational power, the adoption of such resources by the broader neuroimaging community remains relatively uncommon. Therefore, the goal of the current manuscript is to: 1) inform prospective users about the similarities and differences between computing clusters, grids and clouds; 2) highlight their main advantages; 3) discuss when it may (and may not) be advisable to use them; 4) review some of their potential problems and barriers to access; and finally 5) give a few practical suggestions for how interested new users can start analyzing their neuroimaging data using cloud resources. Although the aim of cloud computing is to hide most of the complexity of the infrastructure management from end-users, we recognize that this can still be an intimidating area for cognitive neuroscientists, psychologists, neurologists, radiologists, and other neuroimaging researchers lacking a strong computational background. Therefore, with this in mind, we have aimed to provide a basic introduction to cloud computing in general (including some of the basic terminology, computer architectures, infrastructure and service models, etc.), a practical overview of the benefits and drawbacks, and a specific focus on how cloud resources can be used for various neuroimaging applications.
High-performance secure multi-party computation for data mining applications

DEFF Research Database (Denmark)

Bogdanov, Dan; Niitsoo, Margus; Toft, Tomas

2012-01-01

Secure multi-party computation (MPC) is a technique well suited for privacy-preserving data mining. Even with the recent progress in two-party computation techniques such as fully homomorphic encryption, general MPC remains relevant as it has shown promising performance metrics in real...... operations such as multiplication and comparison. Secondly, the confidential processing of financial data requires the use of more complex primitives, including a secure division operation. This paper describes new protocols in the Sharemind model for secure multiplication, share conversion, equality, bit...
LANSCE target system performance

International Nuclear Information System (INIS)

Russell, G.J.; Gilmore, J.S.; Robinson, H.; Legate, G.L.; Bridge, A.; Sanchez, R.J.; Brewton, R.J.; Woods, R.; Hughes, H.G. III

1989-01-01

We measured neutron beam fluxes at LANSCE using gold foil activation techniques. We did an extensive computer simulation of the as-built LANSCE Target/Moderator/Reflector/Shield geometry. We used this mockup in a Monte Carlo calculation to predict LANSCE neutronic performance for comparison with measured results. For neutron beam fluxes at 1 eV, the ratio of measured data to calculated varies from ∼0.6-0.9. The computed 1 eV neutron leakage at the moderator surface is 3.9 x 10^10 n/eV-sr-s-μA for LANSCE high-intensity water moderators. The corresponding values for the LANSCE high-resolution water moderator and the liquid hydrogen moderator are 3.3 and 2.9 x 10^10, respectively. LANSCE predicted moderator intensities (per proton) for a tungsten target are essentially the same as ISIS predicted moderator intensities for a depleted uranium target. The calculated LANSCE steady state unperturbed thermal (E 13 n/cm 2 -s. The unique LANSCE split-target/flux-trap-moderator system is performing exceedingly well. The system has operated without a target or moderator change for over three years at nominal proton currents of ∼25 μA of 800-MeV protons. (author)
Environment Modules on the Peregrine System | High-Performance Computing |

Science.gov (United States)

NREL Environment Modules on the Peregrine System Environment Modules on the Peregrine System Peregrine uses environment modules to easily manage software environments. Environment modules facilitate modules commands set up a basic environment for the default compilers, tools and libraries, such as the
Automated validation of a computer operating system

Science.gov (United States)

Dervage, M. M.; Milberg, B. A.

1970-01-01

Programs apply selected input/output loads to complex computer operating system and measure performance of that system under such loads. Technique lends itself to checkout of computer software designed to monitor automated complex industrial systems.
Software on the Peregrine System | High-Performance Computing | NREL

Science.gov (United States)

on the Peregrine System Software on the Peregrine System NREL maintains a variety of applications environment modules for use on Peregrine. Applications View list of software applications by name and research area/discipline. Libraries View list of software libraries available for linking and loading
A high performance hierarchical storage management system for the Canadian tier-1 centre at TRIUMF

International Nuclear Information System (INIS)

Deatrich, D C; Liu, S X; Tafirout, R

2010-01-01

We describe in this paper the design and implementation of Tapeguy, a high performance non-proprietary Hierarchical Storage Management (HSM) system which is interfaced to dCache for efficient tertiary storage operations. The system has been successfully implemented at the Canadian Tier-1 Centre at TRIUMF. The ATLAS experiment will collect a large amount of data (approximately 3.5 Petabytes each year). An efficient HSM system will play a crucial role in the success of the ATLAS Computing Model, which is driven by intensive large-scale data analysis activities that will be performed continuously on the Worldwide LHC Computing Grid infrastructure. Tapeguy is Perl-based. It controls and manages data and tape libraries. Its architecture is scalable and includes Dataset Writing control, a Read-back Queuing mechanism and I/O tape drive load balancing as well as on-demand allocation of resources. A central MySQL database records metadata information for every file and transaction (for audit and performance evaluation), as well as an inventory of library elements. Tapeguy Dataset Writing was implemented to group files which are close in time and of similar type. Optional dataset path control dynamically allocates tape families and assigns tapes to them. Tape flushing is based on various strategies: time, threshold or external callback mechanisms. Tapeguy Read-back Queuing reorders all read requests by using an elevator algorithm, avoiding unnecessary tape loading and unloading. Implementation of priorities will guarantee file delivery to all clients in a timely manner.
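The read-back reordering described in this abstract can be illustrated with a minimal sketch. It is not Tapeguy's Perl implementation; the request tuple `(tape_id, position, file_name)` and the function names `elevator_order` and `count_mounts` are hypothetical, chosen only to show how an elevator-style sweep groups requests per tape and reads each tape in one pass.

```python
from collections import defaultdict


def elevator_order(requests, current_tape=None):
    """Reorder (tape_id, position, file_name) read requests.

    Requests are grouped per tape so that each tape is mounted once, and
    within a tape they are served in ascending position (a single sweep).
    The currently mounted tape, if any, is served first.
    """
    by_tape = defaultdict(list)
    for tape_id, position, file_name in requests:
        by_tape[tape_id].append((position, file_name))

    # False sorts before True, so the mounted tape comes first.
    tape_order = sorted(by_tape, key=lambda t: (t != current_tape, t))

    ordered = []
    for tape_id in tape_order:
        for position, file_name in sorted(by_tape[tape_id]):
            ordered.append((tape_id, position, file_name))
    return ordered


def count_mounts(sequence, current_tape=None):
    """Count how many tape loads a request sequence would trigger."""
    mounts = 0
    for tape_id, _, _ in sequence:
        if tape_id != current_tape:
            mounts += 1
            current_tape = tape_id
    return mounts


if __name__ == "__main__":
    # Hypothetical arrival order, alternating between two tapes.
    arrivals = [("T2", 50, "a"), ("T1", 10, "b"), ("T2", 5, "c"),
                ("T1", 90, "d"), ("T2", 70, "e")]
    print(count_mounts(arrivals), count_mounts(elevator_order(arrivals)))
```

Serving requests in arrival order would remount a tape every time the tape changes, while the reordered queue mounts each tape once. Priorities, which the abstract mentions as a further step, are deliberately left out of this sketch.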
ExaGeoStat: A High Performance Unified Framework for Geostatistics on Manycore Systems

KAUST Repository

Abdulah, Sameh

2017-08-09

We present ExaGeoStat, a high performance framework for geospatial statistics in climate and environment modeling. In contrast to simulation based on partial differential equations derived from first-principles modeling, ExaGeoStat employs a statistical model based on the evaluation of the Gaussian log-likelihood function, which operates on a large dense covariance matrix. Generated by the parametrizable Matern covariance function, the resulting matrix is symmetric and positive definite. The computational tasks involved during the evaluation of the Gaussian log-likelihood function become daunting as the number n of geographical locations grows, as O(n^2) storage and O(n^3) operations are required. While many approximation methods have been devised from the side of statistical modeling to ameliorate these polynomial complexities, we are interested here in the complementary approach of evaluating the exact algebraic result by exploiting advances in solution algorithms and many-core computer architectures. Using state-of-the-art high performance dense linear algebra libraries associated with various leading edge parallel architectures (Intel KNLs, NVIDIA GPUs, and distributed-memory systems), ExaGeoStat raises the game for statistical applications from climate and environmental science. ExaGeoStat provides a reference evaluation of statistical parameters, with which to assess the validity of the various approaches based on approximation. The framework takes a first step in the merger of large-scale data analytics and extreme computing for geospatial statistical applications, to be followed by additional complexity reducing improvements from the solver side that can be implemented under the same interface. Thus, a single uncompromised statistical model can ultimately be executed in a wide variety of emerging exascale environments.
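To make the cost structure concrete, the following is a small pure-Python sketch of an exact Gaussian log-likelihood evaluation through a Cholesky factorization. It is not ExaGeoStat code. As an assumption, it uses the exponential covariance (the Matern family with smoothness 1/2) and a zero-mean model, and all function names are hypothetical. The dense matrix accounts for the quadratic storage and the factorization for the cubic operation count mentioned in the abstract.

```python
import math


def exp_covariance(locations, variance, length_scale):
    """Dense covariance matrix C_ij = variance * exp(-d_ij / length_scale)."""
    n = len(locations)
    return [[variance * math.exp(-math.dist(locations[i], locations[j]) / length_scale)
             for j in range(n)] for i in range(n)]


def cholesky(a):
    """Lower-triangular factor of a symmetric positive definite matrix (cubic cost)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(low[i][k] * low[j][k] for k in range(j))
            low[i][j] = math.sqrt(s) if i == j else s / low[j][j]
    return low


def forward_solve(low, b):
    """Solve low * y = b for a lower-triangular matrix."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i])
    return y


def gaussian_loglik(z, cov):
    """Zero-mean Gaussian log-likelihood of observations z under covariance cov."""
    n = len(z)
    low = cholesky(cov)
    log_det = 2.0 * sum(math.log(low[i][i]) for i in range(n))  # log|cov|
    y = forward_solve(low, z)
    quad = sum(v * v for v in y)  # z^T cov^{-1} z
    return -0.5 * (n * math.log(2.0 * math.pi) + log_det + quad)


if __name__ == "__main__":
    locs = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
    cov = exp_covariance(locs, variance=1.0, length_scale=1.0)
    print(gaussian_loglik([0.5, -0.3, 0.1], cov))
```

A parameter estimate would come from maximizing this function over the covariance parameters, which repeats the factorization at every optimizer step; that repetition is why the dense linear algebra dominates the run time.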

Additive Manufacturing and High-Performance Computing: a Disruptive Latent Technology

Science.gov (United States)

Goodwin, Bruce

2015-03-01

This presentation discusses the relationship between recent advances in Additive Manufacturing (AM) technology, High-Performance Computing (HPC) simulation and design capabilities, and related advances in Uncertainty Quantification (UQ), and then examines their impacts upon national and international security. The presentation surveys how AM accelerates the fabrication process, while HPC combined with UQ provides a fast track for the engineering design cycle. The combination of AM and HPC/UQ almost eliminates the engineering design and prototype iterative cycle, thereby dramatically reducing cost of production and time-to-market. These methods thereby present significant benefits for US national interests, both civilian and military, in an age of austerity. Finally, considering cyber security issues and the advent of the "cloud," these disruptive, currently latent technologies may well enable proliferation and so challenge both nuclear and non-nuclear aspects of international security.
ON-BOARD COMPUTER SYSTEM FOR KITSAT-1 AND 2

Directory of Open Access Journals (Sweden)

H. S. Kim

1996-06-01

KITSAT-1 and 2 are microsatellites weighing 50 kg, and all the on-board data are processed by the on-board computer system. Hence, these on-board computers must be highly reliable and be designed within tight power consumption, mass and size constraints. The on-board computer (OBC) systems for KITSAT-1 and 2 are also designed with simple, flexible hardware for reliability, and software takes more responsibility than hardware. The KITSAT-1 and 2 on-board computer system consists of OBC186 as the primary OBC and OBC80 as its backup. OBC186 runs the spacecraft operating system (SCOS), which has real-time multi-tasking capability. Since their launch, OBC186 and OBC80 have been operating successfully until today. In this paper, we describe the development of OBC186 hardware and software and analyze its in-orbit operation performance.
Homemade Buckeye-Pi: A Learning Many-Node Platform for High-Performance Parallel Computing

Science.gov (United States)

Amooie, M. A.; Moortgat, J.

2017-12-01

We report on the "Buckeye-Pi" cluster, the supercomputer developed in The Ohio State University School of Earth Sciences from 128 inexpensive Raspberry Pi (RPi) 3 Model B single-board computers. Each RPi is equipped with a fast quad-core 1.2 GHz ARMv8 64-bit processor, 1 GB of RAM, and a 32 GB microSD card for local storage. Therefore, the cluster has a total RAM of 128 GB that is distributed over the individual nodes and a flash capacity of 4 TB with 512 processors, while it benefits from low power consumption, easy portability, and low total cost. The cluster uses the Message Passing Interface protocol to manage the communications between nodes. These features render our platform the most powerful RPi supercomputer to date and suitable for educational applications in high-performance computing (HPC) and handling of large datasets. In particular, we use the Buckeye-Pi to implement optimized parallel codes in our in-house simulator for subsurface media flows with the goal of achieving a massively parallelized, scalable code. We present benchmarking results for the computational performance across various numbers of RPi nodes. We believe our project could inspire scientists and students to consider the proposed unconventional cluster architecture as a mainstream and feasible learning platform for challenging engineering and scientific problems.
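The aggregate figures quoted in this abstract follow directly from the per-node specification. The short check below reproduces them; the constant names are illustrative, and the only assumption is that 1 TB is counted as 1024 GB.

```python
# Per-node figures as stated in the abstract.
NODES = 128
CORES_PER_NODE = 4      # quad-core ARMv8 processor
RAM_GB_PER_NODE = 1
FLASH_GB_PER_NODE = 32  # microSD card


def cluster_totals(nodes=NODES):
    """Return (cores, ram_gb, flash_tb) for a cluster of identical nodes."""
    cores = nodes * CORES_PER_NODE
    ram_gb = nodes * RAM_GB_PER_NODE
    flash_tb = nodes * FLASH_GB_PER_NODE / 1024  # assumes 1 TB = 1024 GB
    return cores, ram_gb, flash_tb


if __name__ == "__main__":
    cores, ram_gb, flash_tb = cluster_totals()
    print(f"{cores} cores, {ram_gb} GB RAM, {flash_tb:g} TB flash")
```

Note that the RAM is distributed: no single process can address more than the 1 GB on its own node, which is why message passing between nodes is needed for larger problems.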
Highly reliable computer network for real time system

International Nuclear Information System (INIS)

Mohammed, F.A.; Omar, A.A.; Ayad, N.M.A.; Madkour, M.A.I.; Ibrahim, M.K.

1988-01-01

Many computer networks have been studied, with different trends regarding the network architecture and the various protocols that govern data transfers and guarantee a reliable communication among all. A hierarchical network structure has been proposed to provide a simple and inexpensive way of realizing a reliable real-time computer network. In such an architecture, all computers at the same level are connected to a common serial channel through intelligent nodes that collectively control data transfers over the serial channel. This level of the computer network can be considered a local area computer network (LACN) that can be used in a nuclear power plant control system, since such a system has geographically dispersed subsystems. Network expansion would be straight the common channel for each added computer (HOST). All the nodes are designed around a microprocessor chip to provide the required intelligence. The node can be divided into two sections, namely a common section that interfaces with the serial data channel and a private section that interfaces with the host computer. The private section would naturally tend to have some variations in the hardware details to match the requirements of individual host computers. fig 7
A comprehensive approach to decipher biological computation to achieve next generation high-performance exascale computing.

Energy Technology Data Exchange (ETDEWEB)

James, Conrad D.; Schiess, Adrian B.; Howell, Jamie; Baca, Michael J.; Partridge, L. Donald; Finnegan, Patrick Sean; Wolfley, Steven L.; Dagel, Daryl James; Spahn, Olga Blum; Harper, Jason C.; Pohl, Kenneth Roy; Mickel, Patrick R.; Lohn, Andrew; Marinella, Matthew

2013-10-01

The human brain (volume = 1200 cm^3) consumes 20 W and is capable of performing > 10^16 operations/s. Current supercomputer technology has reached 10^15 operations/s, yet it requires 1500 m^3 and 3 MW, giving the brain a 10^12 advantage in operations/s/W/cm^3. Thus, to reach exascale computation, two achievements are required: 1) improved understanding of computation in biological tissue, and 2) a paradigm shift towards neuromorphic computing where hardware circuits mimic properties of neural tissue. To address 1), we will interrogate corticostriatal networks in mouse brain tissue slices, specifically with regard to their frequency filtering capabilities as a function of input stimulus. To address 2), we will instantiate biological computing characteristics such as multi-bit storage into hardware devices with future computational and memory applications. Resistive memory devices will be modeled, designed, and fabricated in the MESA facility in consultation with our internal and external collaborators.
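The quoted advantage of about 10^12 can be checked from the figures in the abstract, reading the supercomputer rate as 10^15 operations/s. The sketch below is only a back-of-the-envelope calculation; the function name `ops_density` is hypothetical.

```python
import math


def ops_density(ops_per_s, power_w, volume_cm3):
    """Operations per second, per watt, per cubic centimetre."""
    return ops_per_s / power_w / volume_cm3


# Brain: 10^16 operations/s, 20 W, 1200 cm^3.
brain = ops_density(1e16, 20.0, 1200.0)

# Supercomputer: 10^15 operations/s, 3 MW, 1500 m^3 (1 m^3 = 10^6 cm^3).
machine = ops_density(1e15, 3e6, 1500.0 * 100**3)

ratio = brain / machine

if __name__ == "__main__":
    print(f"brain:   {brain:.3g} ops/s/W/cm^3")
    print(f"machine: {machine:.3g} ops/s/W/cm^3")
    print(f"ratio:   {ratio:.3g} (about 10^{round(math.log10(ratio))})")
```

The ratio comes out slightly under 2 x 10^12, consistent with the order of magnitude stated in the abstract.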
Interactive Data Exploration for High-Performance Fluid Flow Computations through Porous Media

KAUST Repository

Perovic, Nevena

2014-09-01

The advent of huge data in high-performance computing (HPC) applications such as fluid flow simulations usually hinders the interactive processing and exploration of simulation results. Such interactive data exploration not only allows scientists to 'play' with their data but also to visualise huge (distributed) data sets in both an efficient and easy way. Therefore, we propose an HPC data exploration service based on a sliding window concept that enables researchers to access remote data (available on a supercomputer or cluster) during simulation runtime without exceeding any bandwidth limitations between the HPC back-end and the user front-end.
High-reliability computing for the smarter planet

International Nuclear Information System (INIS)

Quinn, Heather M.; Graham, Paul; Manuzzato, Andrea; Dehon, Andre

2010-01-01

The geometric rate of improvement of transistor size and integrated circuit performance, known as Moore's Law, has been an engine of growth for our economy, enabling new products and services, creating new value and wealth, increasing safety, and removing menial tasks from our daily lives. Affordable, highly integrated components have enabled both life-saving technologies and rich entertainment applications. Anti-lock brakes, insulin monitors, and GPS-enabled emergency response systems save lives. Cell phones, internet appliances, virtual worlds, realistic video games, and mp3 players enrich our lives and connect us together. Over the past 40 years of silicon scaling, the increasing capabilities of inexpensive computation have transformed our society through automation and ubiquitous communications. In this paper, we will present the concept of the smarter planet, how reliability failures affect current systems, and methods that can be used to increase the reliable adoption of new automation in the future. We will illustrate these issues using a number of different electronic devices in a couple of different scenarios. Recently IBM has been presenting the idea of a 'smarter planet.' In smarter planet documents, IBM discusses increased computer automation of roadways, banking, healthcare, and infrastructure, as automation could create more efficient systems. A necessary component of the smarter planet concept is to ensure that these new systems have very high reliability. Even extremely rare reliability problems can easily escalate to problematic scenarios when implemented at very large scales. For life-critical systems, such as automobiles, infrastructure, medical implantables, and avionic systems, unmitigated failures could be dangerous. As more automation moves into these types of critical systems, reliability failures will need to be managed. As computer automation continues to increase in our society, the need for greater radiation reliability is necessary
High-reliability computing for the smarter planet

Energy Technology Data Exchange (ETDEWEB)

Quinn, Heather M [Los Alamos National Laboratory; Graham, Paul [Los Alamos National Laboratory; Manuzzato, Andrea [UNIV OF PADOVA; Dehon, Andre [UNIV OF PENN; Carter, Nicholas [INTEL CORPORATION

2010-01-01

The geometric rate of improvement of transistor size and integrated circuit performance, known as Moore's Law, has been an engine of growth for our economy, enabling new products and services, creating new value and wealth, increasing safety, and removing menial tasks from our daily lives. Affordable, highly integrated components have enabled both life-saving technologies and rich entertainment applications. Anti-lock brakes, insulin monitors, and GPS-enabled emergency response systems save lives. Cell phones, internet appliances, virtual worlds, realistic video games, and mp3 players enrich our lives and connect us together. Over the past 40 years of silicon scaling, the increasing capabilities of inexpensive computation have transformed our society through automation and ubiquitous communications. In this paper, we will present the concept of the smarter planet, how reliability failures affect current systems, and methods that can be used to increase the reliable adoption of new automation in the future. We will illustrate these issues using a number of different electronic devices in a couple of different scenarios. Recently IBM has been presenting the idea of a 'smarter planet.' In smarter planet documents, IBM discusses increased computer automation of roadways, banking, healthcare, and infrastructure, as automation could create more efficient systems. A necessary component of the smarter planet concept is to ensure that these new systems have very high reliability. Even extremely rare reliability problems can easily escalate to problematic scenarios when implemented at very large scales. For life-critical systems, such as automobiles, infrastructure, medical implantables, and avionic systems, unmitigated failures could be dangerous. As more automation moves into these types of critical systems, reliability failures will need to be managed. As computer automation continues to increase in our society, the need for greater radiation reliability is
Information Technology Service Management with Cloud Computing Approach to Improve Administration System and Online Learning Performance

Directory of Open Access Journals (Sweden)

Wilianto Wilianto

2015-10-01

This work discusses the development of information technology service management using a cloud computing approach to improve the performance of the administration system and online learning at STMIK IBBI Medan, Indonesia. The network topology is modeled and simulated for system administration and online learning. The same network topology is developed in cloud computing using Amazon AWS architecture. The model is designed and modeled using Riverbed Academic Edition Modeler to obtain values of the parameters: delay, load, CPU utilization, and throughput. The simulation results are the following. For network topology 1, without cloud computing, the average delay is 54 ms, load 110 000 bits/s, CPU utilization 1.1%, and throughput 440 bits/s. With cloud computing, the average delay is 45 ms, load 2 800 bits/s, CPU utilization 0.03%, and throughput 540 bits/s. For network topology 2, without cloud computing, the average delay is 39 ms, load 3 500 bits/s, CPU utilization 0.02%, and throughput database server 1 400 bits/s. With cloud computing, the average delay is 26 ms, load 5 400 bits/s, CPU utilization email server 0.0001%, FTP server 0.001%, HTTP server 0.0002%, throughput email server 85 bits/s, FTP server 100 bits/s, and HTTP server 95 bits/s. Thus, the delay, the load, and the CPU utilization decrease, but the throughput increases. Information technology service management with a cloud computing approach has better performance.
Computational Performance of a Parallelized Three-Dimensional High-Order Spectral Element Toolbox

Science.gov (United States)

Bosshard, Christoph; Bouffanais, Roland; Clémençon, Christian; Deville, Michel O.; Fiétier, Nicolas; Gruber, Ralf; Kehtari, Sohrab; Keller, Vincent; Latt, Jonas

In this paper, a comprehensive performance review of an MPI-based high-order three-dimensional spectral element method C++ toolbox is presented. The focus is put on the performance evaluation of several aspects with a particular emphasis on the parallel efficiency. The performance evaluation is analyzed with help of a time prediction model based on a parameterization of the application and the hardware resources. A tailor-made CFD computation benchmark case is introduced and used to carry out this review, stressing the particular interest for clusters with up to 8192 cores. Some problems in the parallel implementation have been detected and corrected. The theoretical complexities with respect to the number of elements, to the polynomial degree, and to communication needs are correctly reproduced. It is concluded that this type of code has a nearly perfect speed up on machines with thousands of cores, and is ready to make the step to next-generation petaflop machines.
Hard Real-Time Performances in Multiprocessor-Embedded Systems Using ASMP-Linux

Directory of Open Access Journals (Sweden)

Daniel Pierre Bovet

2008-01-01

Multiprocessor systems, especially those based on multicore or multithreaded processors, and new operating system architectures can satisfy the ever increasing computational requirements of embedded systems. ASMP-LINUX is a modified, high responsiveness, open-source hard real-time operating system for multiprocessor systems capable of providing high real-time performance while keeping the code simple and not impacting the performance of the rest of the system. Moreover, ASMP-LINUX does not require code changes or application recompiling/relinking. In order to assess the performance of ASMP-LINUX, benchmarks have been performed on several hardware platforms and configurations.
Hard Real-Time Performances in Multiprocessor-Embedded Systems Using ASMP-Linux

Directory of Open Access Journals (Sweden)

Betti Emiliano

2008-01-01

Multiprocessor systems, especially those based on multicore or multithreaded processors, and new operating system architectures can satisfy the ever increasing computational requirements of embedded systems. ASMP-LINUX is a modified, high responsiveness, open-source hard real-time operating system for multiprocessor systems capable of providing high real-time performance while keeping the code simple and not impacting the performance of the rest of the system. Moreover, ASMP-LINUX does not require code changes or application recompiling/relinking. In order to assess the performance of ASMP-LINUX, benchmarks have been performed on several hardware platforms and configurations.
High Performance Gigabit Ethernet Switches for DAQ Systems

CERN Document Server

Barczyk, Artur

2005-01-01

Commercially available high performance Gigabit Ethernet (GbE) switches are optimized mostly for Internet and standard LAN application traffic. DAQ systems, on the other hand, usually make use of very specific traffic patterns, with e.g. deterministic arrival times. Industry's accepted loss-less limit of 99.999% may still be unacceptably high for DAQ purposes, as e.g. in the case of the LHCb readout system. In addition, even switches passing this criterion under random traffic can show significantly higher loss rates if subject to our traffic pattern, mainly due to buffer memory limitations. We have evaluated the performance of several switches, ranging from "pizza-box" devices with 24 or 48 ports up to chassis-based core switches, in a test-bed capable of emulating realistic traffic patterns as expected in the readout system of our experiment. The results obtained in our tests have been used to refine and parametrize our packet level simulation of the complete LHCb readout network. In this paper we report on the...
Guide to improving the performance of a manipulator system for nuclear fuel handling through computer controls. Final report

International Nuclear Information System (INIS)

Evans, J.M. Jr.; Albus, J.S.; Barbera, A.J.; Rosenthal, R.; Truitt, W.B.

1975-11-01

The Office of Developmental Automation and Control Technology of the Institute for Computer Sciences and Technology of the National Bureau of Standards provides advising services, standards and guidelines on interface and computer control systems, and performance specifications for the procurement and use of computer controlled manipulators and other computer based automation systems. These outputs help other agencies and industry apply this technology to increase productivity and improve work quality by removing men from hazardous environments. In FY 74 personnel from the Oak Ridge National Laboratory visited NBS to discuss the feasibility of using computer control techniques to improve the operation of remote control manipulators in nuclear fuel reprocessing. Subsequent discussions led to an agreement for NBS to develop a conceptual design for such a computer control system for the PaR Model 3000 manipulator in the Thorium Uranium Recycle Facility (TURF) at ORNL. This report provides the required analysis and conceptual design. Complete computer programs are included for testing of computer interfaces and for actual robot control in both point-to-point and continuous path modes
Analysis of scalability of high-performance 3D image processing platform for virtual colonoscopy.

Science.gov (United States)

Yoshida, Hiroyuki; Wu, Yin; Cai, Wenli

2014-03-19

One of the key challenges in three-dimensional (3D) medical imaging is to enable the fast turn-around time that is often required for interactive or real-time response. This inevitably requires not only high computational power but also high memory bandwidth due to the massive amount of data that need to be processed. For this purpose, we previously developed a software platform for high-performance 3D medical image processing, called the HPC 3D-MIP platform, which employs increasingly available and affordable commodity computing systems such as multicore, cluster, and cloud computing systems. To achieve scalable high-performance computing, the platform employed size-adaptive, distributable block volumes as a core data structure for efficient parallelization of a wide range of 3D-MIP algorithms, supported task scheduling for efficient load distribution and balancing, and consisted of layered parallel software libraries that allow image processing applications to share common functionalities. We evaluated the performance of the HPC 3D-MIP platform by applying it to computationally intensive processes in virtual colonoscopy. Experimental results showed a 12-fold performance improvement on a workstation with 12-core CPUs over the original sequential implementation of the processes, indicating the efficiency of the platform. Analysis of performance scalability based on Amdahl's law for symmetric multicore chips showed the potential for high performance scalability of the HPC 3D-MIP platform when a larger number of cores is available.
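For orientation, the classic form of Amdahl's law can be sketched in a few lines. This is the textbook formula, not the symmetric-multicore variant the abstract refers to, and the function names and the parallel fraction of 0.95 in the example are illustrative assumptions rather than values from the paper.

```python
def amdahl_speedup(parallel_fraction, cores):
    """Classic Amdahl speedup: 1 / ((1 - p) + p / n)."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / cores)


def parallel_fraction_from_speedup(speedup, cores):
    """Invert Amdahl's law: the parallel fraction implied by a measured speedup."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / cores)


if __name__ == "__main__":
    # An illustrative code that is 95% parallel saturates well below the core count.
    for n in (12, 48, 192):
        print(n, round(amdahl_speedup(0.95, n), 2))
    # A 12-fold speedup on 12 cores implies an almost fully parallel workload.
    print(parallel_fraction_from_speedup(12.0, 12))
```

Under this simple model, a 12-fold speedup on 12 cores corresponds to a parallel fraction of essentially 1, which is the regime in which adding cores continues to pay off.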
Design and applications of Computed Industrial Tomographic Imaging System (CITIS)

International Nuclear Information System (INIS)

Ramakrishna, G.S.; Umesh Kumar; Datta, S.S.; Rao, S.M.

1996-01-01

Computed tomographic imaging is an advanced technique for nondestructive testing (NDT) and examination. For the first time in India, a computer-aided tomography system has been indigenously developed at BARC for testing industrial components and was successfully demonstrated. In addition to Computed Tomography (CT), the system can also perform Digital Radiography (DR) to serve as a powerful tool for NDT applications. It has wide applications in nuclear, space and allied fields. The authors have developed a computed industrial tomographic imaging system with a Cesium 137 gamma radiation source for nondestructive examination of engineering and industrial specimens. This presentation highlights the design and development of a prototype system and its software for image reconstruction, simulation and display. The paper also describes results obtained with several test specimens, current development and the possibility of using neutrons as well as high energy x-rays in computed tomography. (author)
A high-performance digital control system for TCV

International Nuclear Information System (INIS)

Lister, J.B.; Dutch, M.J.; Milne, P.G.; Means, R.W.

1997-10-01

The TCV hybrid analogue-digital plasma control system has been superseded by a high performance Digital Plasma Control System, DPCS, made possible by recent advances in off the shelf technology. We discuss the basic requirements for such a control system and present the design and specifications which were laid down. The nominal and final performances are presented and the complete design is given in detail. The integration of the new system into the current operation of the TCV tokamak is described. The procurement of this system has required close collaboration between the end-users and two commercial suppliers with one of the latter taking full responsibility for the system integration. The impact of this approach on the design and commissioning costs for the TCV project is presented. New possibilities offered by this new system are discussed, including possible work relevant to ITER plasma control development. (author) 3 figs., 5 refs
A high-performance digital control system for TCV

Energy Technology Data Exchange (ETDEWEB)

Lister, J.B.; Dutch, M.J. [Ecole Polytechnique Federale, Lausanne (Switzerland). Centre de Recherche en Physique des Plasma (CRPP); Milne, P.G. [Pentland System Ltd., Livingstone (United Kingdom); Means, R.W. [HNC Software Inc., San Diego, CA (United States)

1997-10-01

The TCV hybrid analogue-digital plasma control system has been superseded by a high performance Digital Plasma Control System, DPCS, made possible by recent advances in off the shelf technology. We discuss the basic requirements for such a control system and present the design and specifications which were laid down. The nominal and final performances are presented and the complete design is given in detail. The integration of the new system into the current operation of the TCV tokamak is described. The procurement of this system has required close collaboration between the end-users and two commercial suppliers with one of the latter taking full responsibility for the system integration. The impact of this approach on the design and commissioning costs for the TCV project is presented. New possibilities offered by this new system are discussed, including possible work relevant to ITER plasma control development. (author) 3 figs., 5 refs.
Computer network for electric power control systems. Chubu denryoku (kabu) denryoku keito seigyoyo computer network

Energy Technology Data Exchange (ETDEWEB)

Tsuneizumi, T. (Chubu Electric Power Co. Inc., Nagoya (Japan)); Shimomura, S.; Miyamura, N. (Fuji Electric Co. Ltd., Tokyo (Japan))

1992-06-03

A computer network for electric power control systems was developed that applies the Open Systems Interconnection (OSI) model, an international standard for communications protocols. In structuring the OSI network, the session layer was accessed directly from the operation functions when transmitting high-speed, small-capacity information. File transfer, access and control, which can transfer large-capacity data in bulk, was applied when transmitting low-speed, large-capacity information. A verification test of the realtime computer network (RCN) implementation rules was conducted according to a verification model using a minicomputer, and results satisfying practical performance requirements were obtained. For the application interface, kernel, health check and two-route transmission functions were provided as connection control functions, along with a transmission verification function and a function for discarding late arrivals. For the system implementation, a dualized communication server (CS) structure was adopted. The hardware structure may be either a system with the CS function contained in the host computer or a separately installed system. 5 figs., 6 tabs.
A Robust and Fast System for CTC Computer-Aided Detection of Colorectal Lesions

Directory of Open Access Journals (Sweden)

Gareth Beddoe

2010-01-01

We present a complete, end-to-end computer-aided detection (CAD) system for identifying lesions in the colon, imaged with computed tomography (CT). This system includes facilities for colon segmentation, candidate generation, feature analysis, and classification. The algorithms have been designed to offer robust performance under variation in image data and patient preparation. By utilizing efficient 2D and 3D processing, software optimizations, multi-threading, feature selection, and an optimized cascade classifier, the CAD system quickly determines a set of detection marks. The colon CAD system has been validated on the largest set of data to date, and demonstrates excellent performance, in terms of its high sensitivity, low false positive rate, and computational efficiency.

Understanding and Improving the Performance Consistency of Distributed Computing Systems

NARCIS (Netherlands)

Yigitbasi, M.N.

2012-01-01

With the increasing adoption of distributed systems in both academia and industry, and with the increasing computational and storage requirements of distributed applications, users inevitably demand more from these systems. Moreover, users also depend on these systems for latency and throughput
PISA and High-Performing Education Systems: Explaining Singapore's Education Success

Science.gov (United States)

Deng, Zongyi; Gopinathan, S.

2016-01-01

Singapore's remarkable performance in Programme for International Student Assessment (PISA) has placed it among the world's high-performing education systems (HPES). In the literature on HPES, its "secret formula" for education success is explained in terms of teacher quality, school leadership, system characteristics and educational…
Annual Performance Assessment of Complex Fenestration Systems in Sunny Climates Using Advanced Computer Simulations

Directory of Open Access Journals (Sweden)

Chantal Basurto

2015-12-01

Complex Fenestration Systems (CFS) are advanced daylighting systems that are placed on the upper part of a window to improve the indoor daylight distribution within rooms. Due to their double function of daylight redirection and solar protection, they are considered a solution to mitigate the unfavorable effects of admitting direct sunlight in buildings located in prevailing sunny climates (risk of glare and overheating). Accordingly, an adequate assessment of their performance should include an annual evaluation of the main aspects relevant to the use of daylight in such regions: the indoor illuminance distribution, thermal comfort, and the visual comfort of the occupants. Such evaluation is possible with the use of computer simulations combined with the bi-directional scattering distribution function (BSDF) data of these systems. This study explores the use of available methods to assess the visible and thermal annual performance of five different CFS using advanced computer simulations. To achieve results, on-site daylight monitoring was carried out in a building located in a predominantly sunny climate, and the collected data was used to create and calibrate a virtual model used to carry out the simulations. The results can be employed to select the CFS that improves the visual and thermal interior environment for the occupants.
Direct numerical simulation of reactor two-phase flows enabled by high-performance computing

Energy Technology Data Exchange (ETDEWEB)

Fang, Jun; Cambareri, Joseph J.; Brown, Cameron S.; Feng, Jinyong; Gouws, Andre; Li, Mengnan; Bolotnov, Igor A.

2018-04-01

Nuclear reactor two-phase flows remain a great engineering challenge, where the high-resolution two-phase flow database which can inform practical model development is still sparse due to the extreme reactor operation conditions and measurement difficulties. Owing to the rapid growth of computing power, direct numerical simulation (DNS) is enjoying renewed interest in investigating the related flow problems. A combination of DNS and an interface tracking method can provide a unique opportunity to study two-phase flows based on first principles calculations. More importantly, state-of-the-art high-performance computing (HPC) facilities are helping unlock this great potential. This paper reviews the recent research progress of two-phase flow DNS related to reactor applications. The progress in large-scale bubbly flow DNS has been focused not only on the sheer size of those simulations in terms of resolved Reynolds number, but also on the associated advanced modeling and analysis techniques. Specifically, the current areas of active research include modeling of sub-cooled boiling, bubble coalescence, as well as the advanced post-processing toolkit for bubbly flow simulations in reactor geometries. A novel bubble tracking method has been developed to track the evolution of bubbles in two-phase bubbly flow. Also, spectral analysis of the DNS database in different geometries has been performed to investigate the modulation of the energy spectrum slope due to bubble-induced turbulence. In addition, the single- and two-phase analysis results are presented for turbulent flows within the pressurized water reactor (PWR) core geometries. The related simulations can be carried out only on world-leading HPC platforms. These simulations are allowing more complex turbulence model development and validation for use in 3D multiphase computational fluid dynamics (M-CFD) codes.
Development of high performance scientific components for interoperability of computing packages

Energy Technology Data Exchange (ETDEWEB)

Gulabani, Teena Pratap [Iowa State Univ., Ames, IA (United States)

2008-01-01

Three major high performance quantum chemistry computational packages, NWChem, GAMESS and MPQC, have been developed by different research efforts following different design patterns. The goal is to achieve interoperability among these packages by overcoming the challenges caused by the different communication patterns and software design of each of these packages. Developing a chemistry algorithm is hard and time-consuming; integration of large quantum chemistry packages will allow resource sharing and thus avoid reinvention of the wheel. Creating connections between these incompatible packages is the major motivation of the proposed work. This interoperability is achieved by bringing the benefits of Component Based Software Engineering through a plug-and-play component framework called Common Component Architecture (CCA). In this thesis, I present a strategy and process used for interfacing two widely used and important computational chemistry methodologies: Quantum Mechanics and Molecular Mechanics. To show the feasibility of the proposed approach, the Tuning and Analysis Utility (TAU) has been coupled with the NWChem code and its CCA components. Results show that the overhead is negligible when compared to the ease and potential of organizing and coping with large-scale software applications.
Coal-fired high performance power generating system. Final report

Energy Technology Data Exchange (ETDEWEB)

NONE

1995-08-31

As a result of the investigations carried out during Phase 1 of the Engineering Development of Coal-Fired High-Performance Power Generation Systems (Combustion 2000), the UTRC-led Combustion 2000 Team is recommending the development of an advanced high performance power generation system (HIPPS) whose high efficiency and minimal pollutant emissions will enable the US to use its abundant coal resources to satisfy current and future demand for electric power. The high efficiency of the power plant, which is the key to minimizing the environmental impact of coal, can only be achieved using a modern gas turbine system. Minimization of emissions can be achieved by combustor design, and advanced air pollution control devices. The commercial plant design described herein is a combined cycle using either a frame-type gas turbine or an intercooled aeroderivative with clean air as the working fluid. The air is heated by a coal-fired high temperature advanced furnace (HITAF). The best performance from the cycle is achieved by using a modern aeroderivative gas turbine, such as the intercooled FT4000. A simplified schematic is shown. In the UTRC HIPPS, the conversion efficiency for the heavy frame gas turbine version will be 47.4% (HHV) compared to the approximately 35% that is achieved in conventional coal-fired plants. This cycle is based on a gas turbine operating at turbine inlet temperatures approaching 2,500 F. Using an aeroderivative type gas turbine, efficiencies of over 49% could be realized in advanced cycle configuration (Humid Air Turbine, or HAT). Performance of these power plants is given in a table.
DJFS: Providing Highly Reliable and High‐Performance File System with Small‐Sized

Directory of Open Access Journals (Sweden)

Junghoon Kim

2017-11-01

File systems and applications try to implement their own update protocols to guarantee data consistency, which is one of the most crucial aspects of computing systems. However, we found that the storage devices are substantially under‐utilized when preserving data consistency because they generate massive storage write traffic with many disk cache flush operations and force‐unit‐access (FUA) commands. In this paper, we present DJFS (Delta‐Journaling File System), which provides both a high level of performance and data consistency for different applications. We made three technical contributions to achieve our goal. First, to remove all storage accesses with disk cache flush operations and FUA commands, DJFS uses small‐sized NVRAM for a file system journal. Second, to reduce the access latency and space requirements of NVRAM, DJFS attempts to journal compressed differences of the modified blocks. Finally, to relieve explicit checkpointing overhead, DJFS aggressively reflects the checkpoint transactions to the file system area in units of the specified region. Our evaluation on the TPC‐C SQLite benchmark shows that, using our novel optimization schemes, DJFS outperforms Ext4 by up to 64.2 times with only 128 MB of NVRAM.
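The second contribution, journaling only a compressed difference of each modified block, can be illustrated with a small sketch. This is not the DJFS implementation: the XOR delta, the use of zlib, the 4096-byte block size, and the function names `make_delta` and `apply_delta` are all assumptions chosen for illustration.

```python
import zlib

BLOCK_SIZE = 4096  # assumed block size, not taken from the paper


def make_delta(old: bytes, new: bytes) -> bytes:
    """Return a compressed XOR difference between two equal-sized blocks.

    Unchanged bytes XOR to zero, so a block with a small modification
    yields a long run of zeros that compresses to a few dozen bytes.
    """
    if len(old) != len(new):
        raise ValueError("blocks must have the same length")
    xor = bytes(a ^ b for a, b in zip(old, new))
    return zlib.compress(xor)


def apply_delta(old: bytes, delta: bytes) -> bytes:
    """Rebuild the new block from the old block and a journaled delta."""
    xor = zlib.decompress(delta)
    return bytes(a ^ b for a, b in zip(old, xor))


if __name__ == "__main__":
    old_block = bytes(range(256)) * (BLOCK_SIZE // 256)
    new_block = bytearray(old_block)
    new_block[100:108] = b"modified"  # a small in-place update
    delta = make_delta(old_block, bytes(new_block))
    print(f"journal entry: {len(delta)} bytes instead of {BLOCK_SIZE}")
    assert apply_delta(old_block, delta) == bytes(new_block)
```

The design point this sketch tries to capture is that the journal entry scales with the size of the change rather than the size of the block, which is what makes a small NVRAM journal plausible.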
Contributing to the design of run-time systems dedicated to high performance computing; Contribution a l'elaboration d'environnements de programmation dedies au calcul scientifique hautes performances

Energy Technology Data Exchange (ETDEWEB)

Perache, M

2006-10-15

In the field of intensive scientific computing, the quest for performance has to face the increasing complexity of parallel architectures. Nowadays, these machines exhibit a deep memory hierarchy which complicates the design of efficient parallel applications. This thesis proposes a programming environment for designing efficient parallel programs on top of clusters of multi-processors. It features a programming model centered around collective communications and synchronizations, and provides load balancing facilities. The programming interface, named MPC, provides high level paradigms which are optimized according to the underlying architecture. The environment is fully functional and used within the CEA/DAM (TERANOVA) computing center. The evaluations presented in this document confirm the relevance of our approach. (author)
High Performance Embedded System for Real-Time Pattern Matching

CERN Document Server

Sotiropoulou, Calliope Louisa; The ATLAS collaboration; Gkaitatzis, Stamatios; Citraro, Saverio; Giannetti, Paola; Dell'Orso, Mauro

2016-01-01

We present an innovative and high performance embedded system for real-time pattern matching. This system is based on the evolution of hardware and algorithms developed for the field of High Energy Physics (HEP) and more specifically for the execution of extremely fast pattern matching for tracking of particles produced by proton-proton collisions in hadron collider experiments. A miniaturized version of this complex system is being developed for pattern matching in generic image processing applications. The design uses the flexibility of Field Programmable Gate Arrays (FPGAs) and the powerful Associative Memory Chip (ASIC) to achieve real-time performance. The system works as a contour identifier able to extract the salient features of an image. It is based on the principles of cognitive image processing, which means that it executes fast pattern matching and data reduction mimicking the operation of the human brain.
System-level tools and reconfigurable computing for next-generation HWIL systems

Science.gov (United States)

Stark, Derek; McAulay, Derek; Cantle, Allan J.; Devlin, Malachy

2001-08-01

Previous work has been presented on the creation of computing architectures called DIME, which addressed the particular computing demands of hardware-in-the-loop systems. These demands include low latency, high data rates and interfacing. While it is essential to have a capable platform for handling and processing of the data streams, the tools must also complement this so that a systems engineer is able to construct their final system. The paper will present work on the integration of system-level design tools, such as MATLAB and SIMULINK, with a reconfigurable computing platform. This will demonstrate how algorithms can be implemented and simulated in a familiar rapid application development environment before they are automatically transposed for downloading directly to the computing platform. This complements the established control tools, which handle the configuration and control of the processing systems, leading to a tool suite for system development and implementation. As the development tools have evolved, the core processing platform has also been enhanced. These improved platforms are based on dynamically reconfigurable computing, utilizing FPGA technologies, and parallel processing methods that more than double the performance and data bandwidth capabilities. This offers support for the processing of images in Infrared Scene Projectors with 1024 X 1024 resolutions at 400 Hz frame rates. The processing elements will use the latest generation of FPGAs, which implies that the presented systems will be rated in terms of tera (10^12) operations per second.
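To give a sense of the data rates implied by the 1024 X 1024 at 400 Hz figure, here is a short worked calculation. The pixel depth is not stated in the abstract, so the 2 bytes per pixel used below is purely an assumption for illustration.

```python
def pixel_rate(width: int, height: int, frame_rate_hz: int) -> int:
    """Pixels that must be produced per second."""
    return width * height * frame_rate_hz


def data_rate_bytes(width: int, height: int, frame_rate_hz: int,
                    bytes_per_pixel: int) -> int:
    """Raw output bandwidth in bytes per second."""
    return pixel_rate(width, height, frame_rate_hz) * bytes_per_pixel


if __name__ == "__main__":
    pixels = pixel_rate(1024, 1024, 400)
    # bytes_per_pixel=2 is an assumed value, not from the abstract.
    rate = data_rate_bytes(1024, 1024, 400, bytes_per_pixel=2)
    print(f"{pixels:,} pixels/s")  # 419,430,400 pixels/s
    print(f"{rate:,} bytes/s")     # 838,860,800 bytes/s
```

At roughly 4.2 x 10^8 pixels per second, even a few thousand operations per pixel lands in the 10^12 operations per second range quoted in the abstract.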
Statistical physics of fracture: scientific discovery through high-performance computing

International Nuclear Information System (INIS)

Kumar, Phani; Nukala, V V; Simunovic, Srdan; Mills, Richard T

2006-01-01

The paper presents the state-of-the-art algorithmic developments for simulating the fracture of disordered quasi-brittle materials using discrete lattice systems. Large scale simulations are often required to obtain accurate scaling laws; however, due to computational complexity, the simulations using the traditional algorithms were limited to small system sizes. We have developed two algorithms: a multiple sparse Cholesky downdating scheme for simulating 2D random fuse model systems, and a block-circulant preconditioner for simulating 3D random fuse model systems. Using these algorithms, we were able to simulate fracture of the largest ever lattice system sizes (L = 1024 in 2D, and L = 64 in 3D) with extensive statistical sampling. Our recent simulations on 1024 processors of Cray-XT3 and IBM Blue-Gene/L have further enabled us to explore fracture of 3D lattice systems of size L = 200, which is a significant computational achievement. These largest ever numerical simulations have enhanced our understanding of the physics of fracture; in particular, we analyze damage localization and its deviation from percolation behavior, scaling laws for damage density, universality of fracture strength distribution, size effect on the mean fracture strength, and finally the scaling of crack surface roughness.
High performance embedded system for real-time pattern matching

Energy Technology Data Exchange (ETDEWEB)

Sotiropoulou, C.-L., E-mail: c.sotiropoulou@cern.ch [University of Pisa, Largo B. Pontecorvo 3, 56127 Pisa (Italy); INFN-Pisa Section, Largo B. Pontecorvo 3, 56127 Pisa (Italy); Luciano, P. [University of Cassino and Southern Lazio, Gaetano di Biasio 43, Cassino 03043 (Italy); INFN-Pisa Section, Largo B. Pontecorvo 3, 56127 Pisa (Italy); Gkaitatzis, S. [Aristotle University of Thessaloniki, 54124 Thessaloniki (Greece); Citraro, S. [University of Pisa, Largo B. Pontecorvo 3, 56127 Pisa (Italy); INFN-Pisa Section, Largo B. Pontecorvo 3, 56127 Pisa (Italy); Giannetti, P. [INFN-Pisa Section, Largo B. Pontecorvo 3, 56127 Pisa (Italy); Dell'Orso, M. [University of Pisa, Largo B. Pontecorvo 3, 56127 Pisa (Italy); INFN-Pisa Section, Largo B. Pontecorvo 3, 56127 Pisa (Italy)

2017-02-11

In this paper we present an innovative and high performance embedded system for real-time pattern matching. This system is based on the evolution of hardware and algorithms developed for the field of High Energy Physics and more specifically for the execution of extremely fast pattern matching for tracking of particles produced by proton–proton collisions in hadron collider experiments. A miniaturized version of this complex system is being developed for pattern matching in generic image processing applications. The system works as a contour identifier able to extract the salient features of an image. It is based on the principles of cognitive image processing, which means that it executes fast pattern matching and data reduction mimicking the operation of the human brain. The pattern matching can be executed by a custom designed Associative Memory chip. The reference patterns are chosen by a complex training algorithm implemented on an FPGA device. Post processing algorithms (e.g. pixel clustering) are also implemented on the FPGA. The pattern matching can be executed on a 2D or 3D space, on black and white or grayscale images, depending on the application and thus increasing exponentially the processing requirements of the system. We present the firmware implementation of the training and pattern matching algorithm, performance and results on a latest generation Xilinx Kintex Ultrascale FPGA device. - Highlights: • A high performance embedded system for real-time pattern matching is proposed. • It is based on a system developed for High Energy Physics experiment triggers. • It mimics the operation of the human brain (cognitive image processing). • The process can be executed on 2D and 3D, black and white or grayscale images. • The implementation uses FPGAs and custom designed associative memory (AM) chips.
Building highly available control system applications with Advanced Telecom Computing Architecture and open standards

International Nuclear Information System (INIS)

Kazakov, Artem; Furukawa, Kazuro

2010-01-01

Requirements for modern and future control systems for large projects like the International Linear Collider demand high availability of control system components. Recently the telecom industry came up with a great open hardware specification, the Advanced Telecom Computing Architecture (ATCA). This specification is aimed at better reliability, availability and serviceability. Since its first market appearance in 2004, the ATCA platform has shown tremendous growth and proved to be stable and well represented by a number of vendors. ATCA is an industry standard for highly available systems. On the other hand, the Service Availability Forum (SAF), a consortium of leading communications and computing companies, describes the interaction between hardware and software. SAF defines a set of specifications such as the Hardware Platform Interface and the Application Interface Specification. SAF specifications provide an extensive description of highly available systems, services and their interfaces. Originally aimed at telecom applications, these specifications can be used for accelerator controls software as well. This study describes the benefits of using these specifications and their possible adoption in accelerator control systems. It is demonstrated how the EPICS Redundant IOC was extended using the Hardware Platform Interface specification, which made it possible to utilize the benefits of the ATCA platform.
Robust and High Order Computational Method for Parachute and Air Delivery and MAV System

Science.gov (United States)

2017-11-01

numerical algorithms and develop a computational platform for the study of the dynamic system involving highly complex geometric interface immersed in...students in their summer internship. Results Dissemination: Our research project has produced two publications in the Journal of Fluids and Structures, one...publication in the AIAA Journal, one in Communications in Computational Physics, along with several related publications in other journals. Two other
High-Performance Modeling of Carbon Dioxide Sequestration by Coupling Reservoir Simulation and Molecular Dynamics

KAUST Repository

Bao, Kai; Yan, Mi; Allen, Rebecca; Salama, Amgad; Lu, Ligang; Jordan, Kirk E.; Sun, Shuyu; Keyes, David E.

2015-01-01

The present work describes a parallel computational framework for carbon dioxide (CO2) sequestration simulation by coupling reservoir simulation and molecular dynamics (MD) on massively parallel high-performance-computing (HPC) systems
The ongoing investigation of high performance parallel computing in HEP

CERN Document Server

Peach, Kenneth J; Böck, R K; Dobinson, Robert W; Hansroul, M; Norton, Alan Robert; Willers, Ian Malcolm; Baud, J P; Carminati, F; Gagliardi, F; McIntosh, E; Metcalf, M; Robertson, L; CERN. Geneva. Detector Research and Development Committee

1993-01-01

Past and current exploitation of parallel computing in High Energy Physics is summarized and a list of R & D projects in this area is presented. The applicability of new parallel hardware and software to physics problems is investigated, in the light of the requirements for computing power of LHC experiments and the current trends in the computer industry. Four main themes are discussed (possibilities for a finer grain of parallelism; fine-grain communication mechanism; usable parallel programming environment; different programming models and architectures, using standard commercial products). Parallel computing technology is potentially of interest for offline and vital for real time applications in LHC. A substantial investment in applications development and evaluation of state of the art hardware and software products is needed. A solid development environment is required at an early stage, before mainline LHC program development begins.
Development of high-performance solar LED lighting system

KAUST Repository

Huang, B.J.; Wu, M.S.; Hsu, P.C.; Chen, J.W.; Chen, K.Y.

2010-01-01

The present study developed a high-performance charge/discharge controller for stand-alone solar LED lighting system by incorporating an nMPPO system design, a PWM battery charge control, and a PWM battery discharge control to directly drive the LED. The MPPT controller can then be removed from the stand-alone solar system and the charged capacity of the battery increases 9.7%. For LED driven by PWM current directly from battery, a reliability test for the light decay of LED lamps was performed continuously for 13,200 h. It has shown that the light decay of PWM-driven LED is the same as that of constant-current driven LED. The switching energy loss of the MOSFET in the PWM battery discharge control is less than 1%. Three solar-powered LED lighting systems (18 W, 100 W and 150 W LED) were designed and built. The long-term outdoor field test results have shown that the system performance is satisfactory with the control system developed in the present study. The loss of load probability for the 18 W solar LED system is 14.1% in winter and zero in summer. For the 100 W solar LED system, the loss of load probability is 3.6% in spring. © 2009 Elsevier Ltd. All rights reserved.
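The loss of load probability (LLP) figures quoted above can be read as the fraction of operating periods in which the system could not supply the lamp. A minimal sketch of that bookkeeping follows; the abstract does not define how LLP was computed, so the per-period definition and the name `loss_of_load_probability` are assumptions, and the sample log is synthetic rather than measured data.

```python
from typing import Sequence


def loss_of_load_probability(load_served: Sequence[bool]) -> float:
    """Fraction of periods in which the load was NOT fully served.

    load_served[i] is True if the battery could power the LED for the
    whole of period i (for example, one night), False otherwise.
    """
    if not load_served:
        raise ValueError("need at least one period")
    failures = sum(1 for served in load_served if not served)
    return failures / len(load_served)


if __name__ == "__main__":
    # Synthetic log: 141 failed periods out of 1000, chosen only so the
    # result matches the form of the 14.1% winter figure in the abstract.
    log = [False] * 141 + [True] * 859
    print(f"LLP = {loss_of_load_probability(log):.1%}")
```

Other definitions exist, such as the ratio of unserved energy to demanded energy; the period-counting form is used here only because it is the simplest to state.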
Development of high-performance solar LED lighting system

KAUST Repository

Huang, B.J.

2010-08-01

The present study developed a high-performance charge/discharge controller for stand-alone solar LED lighting system by incorporating an nMPPO system design, a PWM battery charge control, and a PWM battery discharge control to directly drive the LED. The MPPT controller can then be removed from the stand-alone solar system and the charged capacity of the battery increases 9.7%. For LED driven by PWM current directly from battery, a reliability test for the light decay of LED lamps was performed continuously for 13,200 h. It has shown that the light decay of PWM-driven LED is the same as that of constant-current driven LED. The switching energy loss of the MOSFET in the PWM battery discharge control is less than 1%. Three solar-powered LED lighting systems (18 W, 100 W and 150 W LED) were designed and built. The long-term outdoor field test results have shown that the system performance is satisfactory with the control system developed in the present study. The loss of load probability for the 18 W solar LED system is 14.1% in winter and zero in summer. For the 100 W solar LED system, the loss of load probability is 3.6% in spring. © 2009 Elsevier Ltd. All rights reserved.
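The loss of load probability quoted in the abstract above can be illustrated with a small calculation. The abstract does not give the authors' exact definition, so this is only a sketch that assumes one common convention: the fraction of operating periods in which the energy delivered fell short of demand. The function name and the nightly figures are hypothetical and are not data from the paper.

```python
def loss_of_load_probability(demand_wh, supplied_wh):
    """Fraction of periods in which supplied energy fell short of demand.

    Assumed definition (period-count based); the paper may use another one.
    """
    if len(demand_wh) != len(supplied_wh) or not demand_wh:
        raise ValueError("need two non-empty series of equal length")
    shortfalls = sum(1 for d, s in zip(demand_wh, supplied_wh) if s < d)
    return shortfalls / len(demand_wh)


# Hypothetical week: nightly demand vs. energy actually delivered (Wh).
demand = [216, 216, 216, 216, 216, 216, 216]
supplied = [216, 216, 150, 216, 216, 90, 216]

print(f"LOLP = {loss_of_load_probability(demand, supplied):.1%}")
```

An energy-based variant (unmet Wh divided by demanded Wh) is equally plausible and would give a different number for the same series.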

The ACP [Advanced Computer Program] multiprocessor system at Fermilab

International Nuclear Information System (INIS)

Nash, T.; Areti, H.; Atac, R.

1986-09-01

The Advanced Computer Program at Fermilab has developed a multiprocessor system which is easy to use and uniquely cost effective for many high energy physics problems. The system is based on single board computers which cost under $2000 each to build including 2 Mbytes of on board memory. These standard VME modules each run experiment reconstruction code in Fortran at speeds approaching that of a VAX 11/780. Two versions have been developed: one uses Motorola's 68020 32 bit microprocessor, the other runs with AT&T's 32100. Both include the corresponding floating point coprocessor chip. The first system, when fully configured, uses 70 each of the two types of processors. A 53 processor system has been operated for several months with essentially no down time by computer operators in the Fermilab Computer Center, performing at nearly the capacity of 6 CDC Cyber 175 mainframe computers. The VME crates in which the processing "nodes" sit are connected via a high speed "Branch Bus" to one or more MicroVAX computers which act as hosts handling system resource management and all I/O in offline applications. An interface from Fastbus to the Branch Bus has been developed for online use which has been tested error free at 20 Mbytes/sec for 48 hours. ACP hardware modules are now available commercially. A major package of software, including a simulator that runs on any VAX, has been developed. It allows easy migration of existing programs to this multiprocessor environment. This paper describes the ACP Multiprocessor System and early experience with it at Fermilab and elsewhere.
High-throughput computational search for strengthening precipitates in alloys

International Nuclear Information System (INIS)

Kirklin, S.; Saal, James E.; Hegde, Vinay I.; Wolverton, C.

2016-01-01

The search for high-strength alloys and precipitation hardened systems has largely been accomplished through Edisonian trial and error experimentation. Here, we present a novel strategy using high-throughput computational approaches to search for promising precipitate/alloy systems. We perform density functional theory (DFT) calculations of an extremely large space of ∼200,000 potential compounds in search of effective strengthening precipitates for a variety of different alloy matrices, e.g., Fe, Al, Mg, Ni, Co, and Ti. Our search strategy involves screening phases that are likely to produce coherent precipitates (based on small lattice mismatch) and are composed of relatively common alloying elements. When combined with the Open Quantum Materials Database (OQMD), we can computationally screen for precipitates that either have a stable two-phase equilibrium with the host matrix, or are likely to precipitate as metastable phases. Our search produces (for the structure types considered) nearly all currently known high-strength precipitates in a variety of fcc, bcc, and hcp matrices, thus giving us confidence in the strategy. In addition, we predict a number of new, currently-unknown precipitate systems that should be explored experimentally as promising high-strength alloy chemistries.
Command vector memory systems: high performance at low cost

OpenAIRE

Corbal San Adrián, Jesús; Espasa Sans, Roger; Valero Cortés, Mateo

1998-01-01

The focus of this paper is on designing both a low cost and high performance, high bandwidth vector memory system that takes advantage of modern commodity SDRAM memory chips. To successfully extract the full bandwidth from SDRAM parts, we propose a new memory system organization based on sending commands to the memory system as opposed to sending individual addresses. A command specifies, in a few bytes, a request for multiple independent memory words. A command is similar to a burst found in...
Distributed computer controls for accelerator systems

Science.gov (United States)

Moore, T. L.

1989-04-01

A distributed control system has been designed and installed at the Lawrence Livermore National Laboratory Multiuser Tandem Facility using an extremely modular approach in hardware and software. The two tiered, geographically organized design allowed total system implementation within four months with a computer and instrumentation cost of approximately $100k. Since the system structure is modular, application to a variety of facilities is possible. Such a system allows rethinking of operational style of the facilities, making possible highly reproducible and unattended operation. The impact of industry standards, i.e., UNIX, CAMAC, and IEEE-802.3, and the use of a graphics-oriented controls software suite allowed the effective implementation of the system. The definition, design, implementation, operation and total system performance will be discussed.
Distributed computer controls for accelerator systems

International Nuclear Information System (INIS)

Moore, T.L.

1989-01-01

A distributed control system has been designed and installed at the Lawrence Livermore National Laboratory Multiuser Tandem Facility using an extremely modular approach in hardware and software. The two tiered, geographically organized design allowed total system implementation within four months with a computer and instrumentation cost of approximately $100k. Since the system structure is modular, application to a variety of facilities is possible. Such a system allows rethinking of operational style of the facilities, making possible highly reproducible and unattended operation. The impact of industry standards, i.e., UNIX, CAMAC, and IEEE-802.3, and the use of a graphics-oriented controls software suite allowed the effective implementation of the system. The definition, design, implementation, operation and total system performance will be discussed. (orig.)
Distributed computer controls for accelerator systems

International Nuclear Information System (INIS)

Moore, T.L.

1988-09-01

A distributed control system has been designed and installed at the Lawrence Livermore National Laboratory Multi-user Tandem Facility using an extremely modular approach in hardware and software. The two tiered, geographically organized design allowed total system implementation within four months with a computer and instrumentation cost of approximately $100K. Since the system structure is modular, application to a variety of facilities is possible. Such a system allows rethinking of operational style of the facilities, making possible highly reproducible and unattended operation. The impact of industry standards, i.e., UNIX, CAMAC, and IEEE-802.3, and the use of a graphics-oriented controls software suite allowed the efficient implementation of the system. The definition, design, implementation, operation and total system performance will be discussed. 3 refs
Fault tolerant computer control for a Maglev transportation system

Science.gov (United States)

Lala, Jaynarayan H.; Nagle, Gail A.; Anagnostopoulos, George

1994-01-01

Magnetically levitated (Maglev) vehicles operating on dedicated guideways at speeds of 500 km/hr are an emerging transportation alternative to short-haul air and high-speed rail. They have the potential to offer a service significantly more dependable than air and with less operating cost than both air and high-speed rail. Maglev transportation derives these benefits by using magnetic forces to suspend a vehicle 8 to 200 mm above the guideway. Magnetic forces are also used for propulsion and guidance. The combination of high speed, short headways, stringent ride quality requirements, and a distributed offboard propulsion system necessitates high levels of automation for the Maglev control and operation. Very high levels of safety and availability will be required for the Maglev control system. This paper describes the mission scenario, functional requirements, and dependability and performance requirements of the Maglev command, control, and communications system. A distributed hierarchical architecture consisting of vehicle on-board computers, wayside zone computers, a central computer facility, and communication links between these entities was synthesized to meet the functional and dependability requirements on the maglev. Two variations of the basic architecture are described: the Smart Vehicle Architecture (SVA) and the Zone Control Architecture (ZCA). Preliminary dependability modeling results are also presented.
Development of a computer-aided digital reactivity computer system for PWRs

International Nuclear Information System (INIS)

Chung, S.-K.; Sung, K.-Y.; Kim, D.; Cho, D.-Y.

1993-01-01

Reactor physics tests at initial startup and after reloading are performed to verify the nuclear design and to ensure safe operation. Two kinds of reactivity computers, analog and digital, have been widely used in the pressurized water reactor (PWR) core physics test. The test data of both reactivity computers are displayed only on the strip chart recorder, and these data are managed by hand, so the accuracy of the test results depends on operator expertise and experience. This paper describes the development of the computer-aided digital reactivity computer system (DRCS), which is enhanced by system management software and an improved system for the application of the PWR core physics test.
LHCb: LHCb Muon System Performance at High Luminosity

CERN Multimedia

Pinci, D

2013-01-01

The LHCb detector was conceived to operate with an average luminosity of $2 \times 10^{32}$ cm$^{-2}$ s$^{-1}$. During the last year of the LHC run, the whole apparatus proved able to acquire and manage data produced at a luminosity as high as $4 \times 10^{32}$ cm$^{-2}$ s$^{-1}$. In these conditions, all sub-detectors operated at average particle rates higher than the design ones, and in particular the Multi-Wire Proportional Chambers equipping the Muon System had to sustain a particle rate as high as 250 kHz/cm$^{2}$. In order to study the possibility of increasing the operating luminosity of the whole experiment, several tests were performed. The effective beam luminosity at the interaction point of LHCb was increased in several steps up to $10^{33}$ cm$^{-2}$ s$^{-1}$ and in each step the behavior of all the detectors in the Muon System was recorded. The data analysis has allowed the performance of the Muon System to be studied as a function of the LHC luminosity and the results are r...
SAME4HPC: A Promising Approach in Building a Scalable and Mobile Environment for High-Performance Computing

Energy Technology Data Exchange (ETDEWEB)

Karthik, Rajasekar [ORNL

2014-01-01

In this paper, an architecture for building a Scalable And Mobile Environment For High-Performance Computing with spatial capabilities, called SAME4HPC, is described using cutting-edge technologies and standards such as Node.js, HTML5, ECMAScript 6, and PostgreSQL 9.4. Mobile devices are increasingly becoming powerful enough to run high-performance apps. At the same time, there exist a significant number of low-end and older devices that rely heavily on the server or the cloud infrastructure to do the heavy lifting. Our architecture aims to support both of these types of devices to provide a high-performance, rich user experience. A cloud infrastructure consisting of OpenStack with Ubuntu, GeoServer, and high-performance JavaScript frameworks are some of the key open-source and industry standard practices that have been adopted in this architecture.
IMPROVING THE PERFORMANCE OF THE LINEAR SYSTEMS SOLVERS USING CUDA

Directory of Open Access Journals (Sweden)

BOGDAN OANCEA

2012-05-01

Parallel computing can offer an enormous performance advantage for very large applications in almost any field: scientific computing, computer vision, databases, data mining, and economics. GPUs are high performance many-core processors that can obtain very high FLOP rates. Since the first idea of using GPUs for general purpose computing, things have evolved and now there are several approaches to GPU programming: CUDA from NVIDIA and Stream from AMD. CUDA is now a popular programming model for general purpose computations on GPUs for C/C++ programmers. A great number of applications have been ported to the CUDA programming model and they obtain speedups of orders of magnitude compared to optimized CPU implementations. In this paper we present an implementation of a library for solving linear systems using the CUDA framework. We present the results of performance tests and show that using a GPU one can obtain speedups of approximately 80 times compared with a CPU implementation.
A Correlated Model for Evaluating Performance and Energy of Cloud System Given System Reliability

Directory of Open Access Journals (Sweden)

Hongli Zhang

2015-01-01

The serious issue of energy consumption for high performance computing systems has attracted much attention. Performance and energy-saving have become important measures of a computing system. In the cloud computing environment, the systems usually allocate various resources (such as CPU, memory, storage, etc.) on multiple virtual machines (VMs) for executing tasks. Therefore, the problem of resource allocation for running VMs should have significant influence on both system performance and energy consumption. For different processor utilizations assigned to the VM, there exists a tradeoff between energy consumption and task completion time when a given task is executed by the VMs. Moreover, the hardware failure, software failure and restoration characteristics also have obvious influences on overall performance and energy. In this paper, a correlated model is built to analyze both performance and energy in the VM execution environment given the reliability restriction, and an optimization model is presented to derive the most effective solution of processor utilization for the VM. Then, the tradeoff between energy-saving and task completion time is studied and balanced when the VMs execute given tasks. Numerical examples are illustrated to build the performance-energy correlated model and evaluate the expected values of task completion time and consumed energy.
Development of low-cost high-performance multispectral camera system at Banpil

Science.gov (United States)

Oduor, Patrick; Mizuno, Genki; Olah, Robert; Dutta, Achyut K.

2014-05-01

Banpil Photonics (Banpil) has developed a low-cost high-performance multispectral camera system for Visible to Short-Wave Infrared (VIS-SWIR) imaging for the most demanding high-sensitivity and high-speed military, commercial and industrial applications. The 640x512 pixel InGaAs uncooled camera system is designed to provide a compact, small form factor to within a cubic inch, high sensitivity needing less than 100 electrons, high dynamic range exceeding 190 dB, high frame rates greater than 1000 frames per second (FPS) at full resolution, and low power consumption below 1 W. These are practically all of the features highly desirable in military imaging applications to expand deployment to every warfighter, while also maintaining the low-cost structure demanded for scaling into commercial markets. This paper describes Banpil's development of the camera system including the features of the image sensor with an innovation integrating advanced digital electronics functionality, which has made the confluence of high-performance capabilities on the same imaging platform practical at low cost. It discusses the strategies employed including innovations of the key components (e.g. focal plane array (FPA) and Read-Out Integrated Circuitry (ROIC)) within our control while maintaining a fabless model, and strategic collaboration with partners to attain additional cost reductions on optics, electronics, and packaging. We highlight the challenges and potential opportunities for further cost reductions to achieve a goal of a sub-$1000 uncooled high-performance camera system. Finally, a brief overview of emerging military, commercial and industrial applications that will benefit from this high performance imaging system and their forecast cost structure is presented.
14 CFR 415.123 - Computing systems and software.

Science.gov (United States)

2010-01-01

... 14 Aeronautics and Space 4 2010-01-01 2010-01-01 false Computing systems and software. 415.123... Launch Vehicle From a Non-Federal Launch Site § 415.123 Computing systems and software. (a) An applicant's safety review document must describe all computing systems and software that perform a safety...
Computer proficiency questionnaire: assessing low and high computer proficient seniors.

Science.gov (United States)

Boot, Walter R; Charness, Neil; Czaja, Sara J; Sharit, Joseph; Rogers, Wendy A; Fisk, Arthur D; Mitzner, Tracy; Lee, Chin Chin; Nair, Sankaran

2015-06-01

Computers and the Internet have the potential to enrich the lives of seniors and aid in the performance of important tasks required for independent living. A prerequisite for reaping these benefits is having the skills needed to use these systems, which is highly dependent on proper training. One prerequisite for efficient and effective training is being able to gauge current levels of proficiency. We developed a new measure (the Computer Proficiency Questionnaire, or CPQ) to measure computer proficiency in the domains of computer basics, printing, communication, Internet, calendaring software, and multimedia use. Our aim was to develop a measure appropriate for individuals with a wide range of proficiencies from noncomputer users to extremely skilled users. To assess the reliability and validity of the CPQ, a diverse sample of older adults, including 276 older adults with no or minimal computer experience, was recruited and asked to complete the CPQ. The CPQ demonstrated excellent reliability (Cronbach's α = .98), with subscale reliabilities ranging from .86 to .97. Age, computer use, and general technology use all predicted CPQ scores. Factor analysis revealed three main factors of proficiency related to Internet and e-mail use; communication and calendaring; and computer basics. Based on our findings, we also developed a short-form CPQ (CPQ-12) with similar properties but 21 fewer questions. The CPQ and CPQ-12 are useful tools to gauge computer proficiency for training and research purposes, even among low computer proficient older adults. © The Author 2013. Published by Oxford University Press on behalf of The Gerontological Society of America. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
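The reliability figure reported above (Cronbach's α) follows the standard formula α = k/(k−1) · (1 − Σ item variances / variance of total scores), where k is the number of items. A minimal sketch of that calculation is given here; the function name and the response data are hypothetical and are not taken from the CPQ study.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a table of responses.

    scores: list of respondents, each a list of k item scores.
    Uses sample variances (n - 1 denominator) throughout.
    """
    k = len(scores[0])
    if k < 2 or len(scores) < 2:
        raise ValueError("need at least two items and two respondents")
    items = list(zip(*scores))                      # one column per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Hypothetical responses: 4 respondents x 3 items on a 1-5 scale.
responses = [
    [1, 2, 1],
    [3, 3, 2],
    [4, 4, 5],
    [5, 5, 4],
]
print(f"alpha = {cronbach_alpha(responses):.3f}")
```

Values close to 1 indicate that the items vary together, which is what the reported α = .98 suggests for the full questionnaire.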
A Performance/Cost Evaluation for a GPU-Based Drug Discovery Application on Volunteer Computing

Science.gov (United States)

Guerrero, Ginés D.; Imbernón, Baldomero; García, José M.

2014-01-01

Bioinformatics is an interdisciplinary research field that develops tools for the analysis of large biological databases, and, thus, the use of high performance computing (HPC) platforms is mandatory for the generation of useful biological knowledge. The latest generation of graphics processing units (GPUs) has democratized the use of HPC as they push desktop computers to cluster-level performance. Many applications within this field have been developed to leverage these powerful and low-cost architectures. However, these applications still need to scale to larger GPU-based systems to enable remarkable advances in the fields of healthcare, drug discovery, genome research, etc. The inclusion of GPUs in HPC systems exacerbates power and temperature issues, increasing the total cost of ownership (TCO). This paper explores the benefits of volunteer computing to scale bioinformatics applications as an alternative to owning large GPU-based local infrastructures. We use as a benchmark a GPU-based drug discovery application called BINDSURF, whose computational requirements go beyond a single desktop machine. Volunteer computing is presented as a cheap and valid HPC system for those bioinformatics applications that need to process huge amounts of data and where the response time is not a critical factor. PMID:25025055
Selection of a computer code for Hanford low-level waste engineered-system performance assessment

International Nuclear Information System (INIS)

McGrail, B.P.; Mahoney, L.A.

1995-10-01

Planned performance assessments for the proposed disposal of low-level waste (LLW) glass produced from remediation of wastes stored in underground tanks at Hanford, Washington will require calculations of radionuclide release rates from the subsurface disposal facility. These calculations will be done with the aid of computer codes. Currently available computer codes were ranked in terms of the feature sets implemented in the code that match a set of physical, chemical, numerical, and functional capabilities needed to assess release rates from the engineered system. The needed capabilities were identified from an analysis of the important physical and chemical processes expected to affect LLW glass corrosion and the mobility of radionuclides. The highest ranked computer code was found to be the ARES-CT code developed at PNL for the US Department of Energy for evaluation of and land disposal sites
Development and Performance of the Modularized, High-performance Computing and Hybrid-architecture Capable GEOS-Chem Chemical Transport Model

Science.gov (United States)

Long, M. S.; Yantosca, R.; Nielsen, J.; Linford, J. C.; Keller, C. A.; Payer Sulprizio, M.; Jacob, D. J.

2014-12-01

The GEOS-Chem global chemical transport model (CTM), used by a large atmospheric chemistry research community, has been reengineered to serve as a platform for a range of computational atmospheric chemistry science foci and applications. Development included modularization for coupling to general circulation and Earth system models (ESMs) and the adoption of co-processor capable atmospheric chemistry solvers. This was done using an Earth System Modeling Framework (ESMF) interface that operates independently of GEOS-Chem scientific code to permit seamless transition from the GEOS-Chem stand-alone serial CTM to deployment as a coupled ESM module. In this manner, the continual stream of updates contributed by the CTM user community is automatically available for broader applications, which remain state-of-science and directly referenceable to the latest version of the standard GEOS-Chem CTM. These developments are now available as part of the standard version of the GEOS-Chem CTM. The system has been implemented as an atmospheric chemistry module within the NASA GEOS-5 ESM. The coupled GEOS-5/GEOS-Chem system was tested for weak and strong scalability and performance with a tropospheric oxidant-aerosol simulation. Results confirm that the GEOS-Chem chemical operator scales efficiently for any number of processes. Although inclusion of atmospheric chemistry in ESMs is computationally expensive, the excellent scalability of the chemical operator means that the relative cost goes down with increasing number of processes, making fine-scale resolution simulations possible.
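The weak and strong scalability tests mentioned above are usually summarised as parallel efficiencies. The sketch below uses the conventional definitions (strong scaling: fixed total problem size; weak scaling: fixed work per process). The function names and timings are hypothetical and are not results from the GEOS-5/GEOS-Chem study.

```python
def strong_scaling_efficiency(t_base, p_base, t_p, p):
    """Fixed total problem size: ideal run time falls in proportion to 1/p."""
    return (t_base * p_base) / (t_p * p)


def weak_scaling_efficiency(t_base, t_p):
    """Fixed work per process: ideal run time stays constant as p grows."""
    return t_base / t_p


# Hypothetical wall-clock times in seconds, keyed by process count.
strong_runs = {48: 1200.0, 96: 620.0, 192: 330.0}
base_p = min(strong_runs)
for p, t in sorted(strong_runs.items()):
    eff = strong_scaling_efficiency(strong_runs[base_p], base_p, t, p)
    print(f"{p:4d} processes: strong-scaling efficiency {eff:.2f}")
```

An efficiency near 1.0 at every process count is what "scales efficiently" means in this sense; values well below 1.0 indicate communication or load-imbalance overhead.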
Tackling some of the most intricate geophysical challenges via high-performance computing

Science.gov (United States)

Khosronejad, A.

2016-12-01

Recently, the world has witnessed significant enhancements in the computing power of supercomputers. Computer clusters in conjunction with advanced mathematical algorithms have set the stage for developing and applying powerful numerical tools to tackle some of the most intricate geophysical challenges that today's engineers face. One such challenge is to understand how turbulent flows, in real-world settings, interact with (a) rigid and/or mobile complex bed bathymetry of waterways and sea-beds in coastal areas; (b) objects with complex geometry that are fully or partially immersed; and (c) the free surface of waterways and water surface waves in coastal areas. This understanding is especially important because turbulent flows in real-world environments are often bounded by geometrically complex boundaries, which dynamically deform and give rise to multi-scale and multi-physics transport phenomena, and are characterized by multi-lateral interactions among various phases (e.g. air/water/sediment phases). Herein, I present some of the multi-scale and multi-physics geophysical fluid mechanics processes that I have attempted to study using an in-house high-performance computational model, the so-called VFS-Geophysics. More specifically, I will present the simulation results of turbulence/sediment/solute/turbine interactions in real-world settings. Parts of the simulations I present are performed to gain scientific insights into processes such as sand wave formation (A. Khosronejad and F. Sotiropoulos, (2014), Numerical simulation of sand waves in a turbulent open channel flow, Journal of Fluid Mechanics, 753:150-216), while others are carried out to predict the effects of climate change and large flood events on societal infrastructures (A. Khosronejad, et al., (2016), Large eddy simulation of turbulence and solute transport in a forested headwater stream, Journal of Geophysical Research, doi: 10.1002/2014JF003423).
Computational fluid dynamics (CFD) assisted performance evaluation of the Twincer (TM) disposable high-dose dry powder inhaler

NARCIS (Netherlands)

de Boer, Anne H.; Hagedoorn, Paul; Woolhouse, Robert; Wynn, Ed

Objectives To use computational fluid dynamics (CFD) for evaluating and understanding the performance of the high-dose disposable Twincer (TM) dry powder inhaler, as well as to learn the effect of design modifications on dose entrainment, powder dispersion and retention behaviour. Methods Comparison
