WorldWideScience

Sample records for high-performance computing application

  1. DURIP: High Performance Computing in Biomathematics Applications

    Science.gov (United States)

    2017-05-10

    The goal of this award was to enhance the capabilities of the Department of Applied Mathematics and Statistics (AMS) at the University of California, Santa Cruz (UCSC) to conduct research and research-related education in areas of...

  2. High Performance Computing Software Applications for Space Situational Awareness

    Science.gov (United States)

    Giuliano, C.; Schumacher, P.; Matson, C.; Chun, F.; Duncan, B.; Borelli, K.; Desonia, R.; Gusciora, G.; Roe, K.

    The High Performance Computing Software Applications Institute for Space Situational Awareness (HSAI-SSA) has completed its first full year of applications development. The emphasis of our work in this first year was on improving space surveillance sensor models and image enhancement software. These applications are the Space Surveillance Network Analysis Model (SSNAM), the Air Force Space Fence simulation (SimFence), and the physically constrained iterative deconvolution (PCID) image enhancement software tool. Specifically, we have demonstrated an order-of-magnitude speed-up in these codes running on the latest Cray XD-1 Linux supercomputer (Hoku) at the Maui High Performance Computing Center. The software application improvements that HSAI-SSA has made have had a significant impact on the warfighter and have fundamentally changed the role of high performance computing in SSA.
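
    The record does not reproduce the PCID algorithm itself. Purely as a rough illustration of iterative image deconvolution of the kind used to sharpen blurred sensor imagery, here is a minimal Richardson-Lucy sketch in Python; the algorithm choice, names, and test data are assumptions, not the HSAI-SSA code.

    ```python
    # Minimal Richardson-Lucy deconvolution sketch (illustration only).
    import numpy as np
    from scipy.signal import fftconvolve

    def richardson_lucy(observed, psf, n_iter=30, eps=1e-12):
        """Iteratively estimate the unblurred image given a known PSF."""
        estimate = np.full_like(observed, observed.mean())
        psf_mirror = psf[::-1, ::-1]
        for _ in range(n_iter):
            blurred = fftconvolve(estimate, psf, mode="same")
            ratio = observed / (blurred + eps)          # data / model
            estimate *= fftconvolve(ratio, psf_mirror, mode="same")
        return estimate

    if __name__ == "__main__":
        rng = np.random.default_rng(0)
        truth = rng.random((64, 64))
        psf = np.ones((5, 5)) / 25.0                    # box blur as stand-in PSF
        observed = fftconvolve(truth, psf, mode="same")
        restored = richardson_lucy(observed, psf)
        print("mean residual:", float(np.abs(restored - truth).mean()))
    ```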

  3. High-performance computing using FPGAs

    CERN Document Server

    Benkrid, Khaled

    2013-01-01

    This book is concerned with the emerging field of High Performance Reconfigurable Computing (HPRC), which aims to harness the high performance and relatively low power of reconfigurable hardware, in the form of Field Programmable Gate Arrays (FPGAs), in High Performance Computing (HPC) applications. It presents the latest developments in this field from the points of view of applications, architecture, and tools and methodologies. We hope that this work will form a reference for existing researchers in the field, and entice new researchers and developers to join the HPRC community. The book includes: thirteen application chapters which present the most important application areas tackled by high performance reconfigurable computers, namely financial computing, bioinformatics and computational biology, data search and processing, stencil computation (e.g. computational fluid dynamics and seismic modeling), cryptanalysis, astronomical N-body simulation, and circuit simulation; and seven architecture chapters which...

  4. RAPPORT: running scientific high-performance computing applications on the cloud.

    Science.gov (United States)

    Cohen, Jeremy; Filippis, Ioannis; Woodbridge, Mark; Bauer, Daniela; Hong, Neil Chue; Jackson, Mike; Butcher, Sarah; Colling, David; Darlington, John; Fuchs, Brian; Harvey, Matt

    2013-01-28

    Cloud computing infrastructure is now widely used in many domains, but one area where there has been more limited adoption is research computing, in particular for running scientific high-performance computing (HPC) software. The Robust Application Porting for HPC in the Cloud (RAPPORT) project took advantage of existing links between computing researchers and application scientists in the fields of bioinformatics, high-energy physics (HEP) and digital humanities, to investigate running a set of scientific HPC applications from these domains on cloud infrastructure. In this paper, we focus on the bioinformatics and HEP domains, describing the applications and target cloud platforms. We conclude that, while there are many factors that need consideration, there is no fundamental impediment to the use of cloud infrastructure for running many types of HPC applications and, in some cases, there is potential for researchers to benefit significantly from the flexibility offered by cloud platforms.

  5. High Performance Computing in Science and Engineering '15 : Transactions of the High Performance Computing Center

    CERN Document Server

    Kröner, Dietmar; Resch, Michael

    2016-01-01

    This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS) in 2015. The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.

  6. High Performance Computing in Science and Engineering '17 : Transactions of the High Performance Computing Center

    CERN Document Server

    Kröner, Dietmar; Resch, Michael; HLRS 2017

    2018-01-01

    This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS) in 2017. The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.

  7. High-performance computing for airborne applications

    International Nuclear Information System (INIS)

    Quinn, Heather M.; Manuzatto, Andrea; Fairbanks, Tom; Dallmann, Nicholas; Desgeorges, Rose

    2010-01-01

    Recently, there have been attempts to move common satellite tasks to unmanned aerial vehicles (UAVs). UAVs are significantly cheaper to buy than satellites and easier to deploy on an as-needed basis. The more benign radiation environment also allows for an aggressive adoption of state-of-the-art commercial computational devices, which increases the amount of data that can be collected. There are a number of commercial computing devices currently available that are well-suited to high-performance computing. These devices range from specialized computational devices, such as field-programmable gate arrays (FPGAs) and digital signal processors (DSPs), to traditional computing platforms, such as microprocessors. Even though the radiation environment is relatively benign, these devices could be susceptible to single-event effects. In this paper, we present radiation data for high-performance computing devices in an accelerated neutron environment. These devices include a multi-core digital signal processor, two field-programmable gate arrays, and a microprocessor. From these results, we found that all of these devices are suitable for many airplane environments without reliability problems.

  8. High performance parallel computing of flows in complex geometries: II. Applications

    International Nuclear Information System (INIS)

    Gourdain, N; Gicquel, L; Staffelbach, G; Vermorel, O; Duchaine, F; Boussuge, J-F; Poinsot, T

    2009-01-01

    Present regulations on pollutant emissions and noise, together with economic constraints, require new approaches and designs in the fields of energy supply and transportation. It is now well established that the next breakthrough will come from a better understanding of unsteady flow effects and from considering the entire system rather than isolated components. However, these aspects are still neither well captured by numerical approaches nor well understood, whatever the design stage considered. The main challenge lies in the computational requirements imposed by such complex systems when they are simulated on supercomputers. This paper shows how these challenges can be addressed by using parallel computing platforms for distinct elements of a more complex system, as encountered in aeronautical applications. Based on numerical simulations performed with modern aerodynamic and reactive flow solvers, this work underlines the value of high-performance computing for solving flows in complex industrial configurations such as aircraft, combustion chambers and turbomachines. Performance indicators related to parallel computing efficiency are presented, showing that establishing fair criteria is a difficult task for complex industrial applications. Examples of numerical simulations performed on industrial systems are also described, with particular attention to computational time and the potential design improvements obtained with high-fidelity and multi-physics computing methods. These simulations use either unsteady Reynolds-averaged Navier-Stokes methods or large eddy simulation and deal with turbulent unsteady flows, such as coupled flow phenomena (thermo-acoustic instabilities, buffet, etc.). Some of the difficulties with grid generation and data analysis encountered in these complex industrial applications are also presented.

  9. Topic 14+16: High-performance and scientific applications and extreme-scale computing (Introduction)

    KAUST Repository

    Downes, Turlough P.

    2013-01-01

    As our understanding of the world around us increases it becomes more challenging to make use of what we already know, and to increase our understanding still further. Computational modeling and simulation have become critical tools in addressing this challenge. The requirements of high-resolution, accurate modeling have outstripped the ability of desktop computers and even small clusters to provide the necessary compute power. Many applications in the scientific and engineering domains now need very large amounts of compute time, while other applications, particularly in the life sciences, frequently have large data I/O requirements. There is thus a growing need for a range of high performance applications which can utilize parallel compute systems effectively, which have efficient data handling strategies and which have the capacity to utilise current and future systems. The High Performance and Scientific Applications topic aims to highlight recent progress in the use of advanced computing and algorithms to address the varied, complex and increasing challenges of modern research throughout both the "hard" and "soft" sciences. This necessitates being able to use large numbers of compute nodes, many of which are equipped with accelerators, and to deal with difficult I/O requirements. © 2013 Springer-Verlag.

  10. High Performance Computing - Power Application Programming Interface Specification.

    Energy Technology Data Exchange (ETDEWEB)

    Laros, James H.,; Kelly, Suzanne M.; Pedretti, Kevin; Grant, Ryan; Olivier, Stephen Lecler; Levenhagen, Michael J.; DeBonis, David

    2014-08-01

    Measuring and controlling the power and energy consumption of high performance computing systems by various components in the software stack is an active research area [13, 3, 5, 10, 4, 21, 19, 16, 7, 17, 20, 18, 11, 1, 6, 14, 12]. Implementations in lower level software layers are beginning to emerge in some production systems, which is very welcome. To be most effective, a portable interface to measurement and control features would significantly facilitate participation by all levels of the software stack. We present a proposal for a standard power Application Programming Interface (API) that endeavors to cover the entire software space, from generic hardware interfaces to the input from the computer facility manager.
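
    The record above does not reproduce the API itself. Purely as a hypothetical illustration of what a portable power measurement and control interface spanning the software stack might look like, here is a toy Python sketch; none of the names or numbers below are taken from the actual Power API specification.

    ```python
    # Hypothetical sketch of a portable power measurement/control facade.
    # Names and values are illustrative only, not the Sandia Power API.
    from dataclasses import dataclass

    @dataclass
    class PowerDomain:
        name: str          # e.g. "node", "socket", "memory"
        watts: float       # last measured power draw
        cap_watts: float   # current power cap

    class PowerInterface:
        """Toy facade that a scheduler, runtime, or facility tool might call."""
        def __init__(self):
            self._domains = {"node0": PowerDomain("node0", 250.0, 300.0)}

        def measure(self, domain: str) -> float:
            return self._domains[domain].watts

        def set_cap(self, domain: str, watts: float) -> None:
            self._domains[domain].cap_watts = watts

    if __name__ == "__main__":
        api = PowerInterface()
        print("node0 draw:", api.measure("node0"), "W")
        api.set_cap("node0", 275.0)   # e.g. a facility manager lowering the cap
    ```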

  11. High Performance Computing in Science and Engineering '16 : Transactions of the High Performance Computing Center, Stuttgart (HLRS) 2016

    CERN Document Server

    Kröner, Dietmar; Resch, Michael

    2016-01-01

    This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS) in 2016. The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.

  12. High-performance computing in seismology

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1996-09-01

    The scientific, technical, and economic importance of the issues discussed here presents a clear agenda for future research in computational seismology. In this way these problems will drive advances in high-performance computing in the field of seismology. There is a broad community that will benefit from this work, including the petroleum industry, research geophysicists, engineers concerned with seismic hazard mitigation, and governments charged with enforcing a comprehensive test ban treaty. These advances may also lead to new applications for seismological research. The recent application of high-resolution seismic imaging of the shallow subsurface for the environmental remediation industry is an example of this activity. This report makes the following recommendations: (1) focused efforts to develop validated documented software for seismological computations should be supported, with special emphasis on scalable algorithms for parallel processors; (2) the education of seismologists in high-performance computing technologies and methodologies should be improved; (3) collaborations between seismologists and computational scientists and engineers should be increased; (4) the infrastructure for archiving, disseminating, and processing large volumes of seismological data should be improved.

  13. Contemporary high performance computing from petascale toward exascale

    CERN Document Server

    Vetter, Jeffrey S

    2013-01-01

    Contemporary High Performance Computing: From Petascale toward Exascale focuses on the ecosystems surrounding the world's leading centers for high performance computing (HPC). It covers many of the important factors involved in each ecosystem: computer architectures, software, applications, facilities, and sponsors. The first part of the book examines significant trends in HPC systems, including computer architectures, applications, performance, and software. It discusses the growth from terascale to petascale computing and the influence of the TOP500 and Green500 lists. The second part of the

  14. High-performance scientific computing in the cloud

    Science.gov (United States)

    Jorissen, Kevin; Vila, Fernando; Rehr, John

    2011-03-01

    Cloud computing has the potential to open up high-performance computational science to a much broader class of researchers, owing to its ability to provide on-demand, virtualized computational resources. However, before such approaches can become commonplace, user-friendly tools must be developed that hide the unfamiliar cloud environment and streamline the management of cloud resources for many scientific applications. We have recently shown that high-performance cloud computing is feasible for parallelized x-ray spectroscopy calculations. We now present benchmark results for a wider selection of scientific applications focusing on electronic structure and spectroscopic simulation software in condensed matter physics. These applications are driven by an improved portable interface that can manage virtual clusters and run various applications in the cloud. We also describe a next generation of cluster tools, aimed at improved performance and a more robust cluster deployment. Supported by NSF grant OCI-1048052.

  15. High Performance Computing in Science and Engineering '14

    CERN Document Server

    Kröner, Dietmar; Resch, Michael

    2015-01-01

    This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS). The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.

  16. Quantum Accelerators for High-performance Computing Systems

    Energy Technology Data Exchange (ETDEWEB)

    Humble, Travis S. [ORNL; Britt, Keith A. [ORNL; Mohiyaddin, Fahd A. [ORNL

    2017-11-01

    We define some of the programming and system-level challenges facing the application of quantum processing to high-performance computing. Alongside barriers to physical integration, prominent differences in the execution of quantum and conventional programs challenge the intersection of these computational models. Following a brief overview of the state of the art, we discuss recent advances in programming and execution models for hybrid quantum-classical computing. We discuss a novel quantum-accelerator framework that uses specialized kernels to offload select workloads while integrating with existing computing infrastructure. We elaborate on the role of the host operating system in managing these unique accelerator resources, the prospects for deploying quantum modules, and the requirements placed on the language hierarchy connecting these different system components. We draw on recent advances in the modeling and simulation of quantum computing systems with the development of architectures for hybrid high-performance computing systems and the realization of software stacks for controlling quantum devices. Finally, we present simulation results that describe the expected system-level behavior of high-performance computing systems composed of compute nodes with quantum processing units. We describe performance for these hybrid systems in terms of time-to-solution, accuracy, and energy consumption, and we use simple application examples to estimate the performance advantage of quantum acceleration.

  17. High performance computing in linear control

    International Nuclear Information System (INIS)

    Datta, B.N.

    1993-01-01

    Remarkable progress has been made in both the theory and the applications of all important areas of control. The theory is rich and very sophisticated. Some beautiful applications of control theory are presently being made in aerospace, biomedical engineering, industrial engineering, robotics, economics, power systems, etc. Unfortunately, the same assessment of progress does not hold in general for computations in control theory; control theory is lagging behind other areas of science and engineering in this respect. There is currently a revolution going on in the world of high performance scientific computing. Many powerful computers with vector and parallel processing have been built and have become available in recent years. These supercomputers offer very high speed in computations. Highly efficient software, based on powerful algorithms, has been developed for these advanced computers and has also contributed to increased performance. While workers in many areas of science and engineering have taken great advantage of these hardware and software developments, control scientists and engineers, unfortunately, have not been able to take much advantage of them.

  18. DOE research in utilization of high-performance computers

    International Nuclear Information System (INIS)

    Buzbee, B.L.; Worlton, W.J.; Michael, G.; Rodrigue, G.

    1980-12-01

    Department of Energy (DOE) and other Government research laboratories depend on high-performance computer systems to accomplish their programmatic goals. As the most powerful computer systems become available, they are acquired by these laboratories so that advances can be made in their disciplines. These advances are often the result of added sophistication to numerical models whose execution is made possible by high-performance computer systems. However, high-performance computer systems have become increasingly complex; consequently, it has become increasingly difficult to realize their potential performance. The result is a need for research on issues related to the utilization of these systems. This report gives a brief description of high-performance computers, and then addresses the use of and future needs for high-performance computers within DOE, the growing complexity of applications within DOE, and areas of high-performance computer systems warranting research. 1 figure

  19. High Performance Computing in Science and Engineering '02 : Transactions of the High Performance Computing Center

    CERN Document Server

    Jäger, Willi

    2003-01-01

    This book presents the state-of-the-art in modeling and simulation on supercomputers. Leading German research groups present their results achieved on high-end systems of the High Performance Computing Center Stuttgart (HLRS) for the year 2002. Reports cover all fields of supercomputing simulation ranging from computational fluid dynamics to computer science. Special emphasis is given to industrially relevant applications. Moreover, by presenting results for both vector systems and microprocessor-based systems, the book makes it possible to compare the performance levels and usability of a variety of supercomputer architectures. It therefore becomes an indispensable guidebook for assessing the impact of the Japanese Earth Simulator project on supercomputing in the years to come.

  20. High Performance Computing - Power Application Programming Interface Specification Version 2.0.

    Energy Technology Data Exchange (ETDEWEB)

    Laros, James H. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Grant, Ryan [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Levenhagen, Michael J. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Olivier, Stephen Lecler [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Pedretti, Kevin [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Ward, H. Lee [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Younge, Andrew J. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2017-03-01

    Measuring and controlling the power and energy consumption of high performance computing systems by various components in the software stack is an active research area. Implementations in lower level software layers are beginning to emerge in some production systems, which is very welcome. To be most effective, a portable interface to measurement and control features would significantly facilitate participation by all levels of the software stack. We present a proposal for a standard power Application Programming Interface (API) that endeavors to cover the entire software space, from generic hardware interfaces to the input from the computer facility manager.

  1. High-Performance Java Codes for Computational Fluid Dynamics

    Science.gov (United States)

    Riley, Christopher; Chatterjee, Siddhartha; Biswas, Rupak; Biegel, Bryan (Technical Monitor)

    2001-01-01

    The computational science community is reluctant to write large-scale computationally intensive applications in Java due to concerns over Java's poor performance, despite the claimed software engineering advantages of its object-oriented features. Naive Java implementations of numerical algorithms can perform poorly compared to corresponding Fortran or C implementations. To achieve high performance, Java applications must be designed with good performance as a primary goal. This paper presents the object-oriented design and implementation of two real-world applications from the field of Computational Fluid Dynamics (CFD): a finite-volume fluid flow solver (LAURA, from NASA Langley Research Center), and an unstructured mesh adaptation algorithm (2D_TAG, from NASA Ames Research Center). This work builds on our previous experience with the design of high-performance numerical libraries in Java. We examine the performance of the applications using the currently available Java infrastructure and show that the Java version of the flow solver LAURA performs almost within a factor of 2 of the original procedural version. Our Java version of the mesh adaptation algorithm 2D_TAG performs within a factor of 1.5 of its original procedural version on certain platforms. Our results demonstrate that object-oriented software design principles are not necessarily inimical to high performance.

  2. High Performance Networks From Supercomputing to Cloud Computing

    CERN Document Server

    Abts, Dennis

    2011-01-01

    Datacenter networks provide the communication substrate for large parallel computer systems that form the ecosystem for high performance computing (HPC) systems and modern Internet applications. The design of new datacenter networks is motivated by an array of applications ranging from communication-intensive climatology, complex material simulations and molecular dynamics to such Internet applications as Web search, language translation, collaborative Internet applications, streaming video and voice-over-IP. For both Supercomputing and Cloud Computing, the network enables distributed applications...

  3. High Performance Computing - Power Application Programming Interface Specification Version 1.4

    Energy Technology Data Exchange (ETDEWEB)

    Laros III, James H. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); DeBonis, David [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Grant, Ryan [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Kelly, Suzanne M. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Levenhagen, Michael J. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Olivier, Stephen Lecler [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Pedretti, Kevin [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2016-10-01

    Measuring and controlling the power and energy consumption of high performance computing systems by various components in the software stack is an active research area [13, 3, 5, 10, 4, 21, 19, 16, 7, 17, 20, 18, 11, 1, 6, 14, 12]. Implementations in lower level software layers are beginning to emerge in some production systems, which is very welcome. To be most effective, a portable interface to measurement and control features would significantly facilitate participation by all levels of the software stack. We present a proposal for a standard power Application Programming Interface (API) that endeavors to cover the entire software space, from generic hardware interfaces to the input from the computer facility manager.

  4. High-performance floating-point image computing workstation for medical applications

    Science.gov (United States)

    Mills, Karl S.; Wong, Gilman K.; Kim, Yongmin

    1990-07-01

    The medical imaging field relies increasingly on imaging and graphics techniques in diverse applications with needs similar to (or more stringent than) those of the military, industrial and scientific communities. However, most image processing and graphics systems available for use in medical imaging today are either expensive, specialized, or in most cases both. High performance imaging and graphics workstations which can provide real-time results for a number of applications, while maintaining affordability and flexibility, can facilitate the application of digital image computing techniques in many different areas. This paper describes the hardware and software architecture of a medium-cost floating-point image processing and display subsystem for the NeXT computer, and its applications as a medical imaging workstation. Medical imaging applications of the workstation include use in a Picture Archiving and Communications System (PACS), as a multimodal image processing and 3-D graphics workstation for a broad range of imaging modalities, and as an electronic alternator utilizing its multiple-monitor display capability and its large and fast frame buffer. The subsystem provides a 2048 x 2048 x 32-bit frame buffer (16 Mbytes of image storage) and supports both 8-bit gray scale and 32-bit true color images. When used to display 8-bit gray scale images, up to four different 256-color palettes may be used for each of four 2K x 2K x 8-bit image frames. Three of these image frames can be used simultaneously to provide pixel-selectable region-of-interest display. A 1280 x 1024 pixel screen with 1:1 aspect ratio can be windowed into the frame buffer for display of any portion of the processed image or images. In addition, the system provides hardware support for integer zoom and an 82-color cursor. This subsystem is implemented on an add-in board occupying a single slot in the NeXT computer. Up to three boards may be added to the NeXT for multiple display capability (e

  5. Department of Energy research in utilization of high-performance computers

    International Nuclear Information System (INIS)

    Buzbee, B.L.; Worlton, W.J.; Michael, G.; Rodrigue, G.

    1980-08-01

    Department of Energy (DOE) and other Government research laboratories depend on high-performance computer systems to accomplish their programmatic goals. As the most powerful computer systems become available, they are acquired by these laboratories so that advances can be made in their disciplines. These advances are often the result of added sophistication to numerical models, the execution of which is made possible by high-performance computer systems. However, high-performance computer systems have become increasingly complex, and consequently it has become increasingly difficult to realize their potential performance. The result is a need for research on issues related to the utilization of these systems. This report gives a brief description of high-performance computers, and then addresses the use of and future needs for high-performance computers within DOE, the growing complexity of applications within DOE, and areas of high-performance computer systems warranting research. 1 figure

  6. Quantum Accelerators for High-Performance Computing Systems

    OpenAIRE

    Britt, Keith A.; Mohiyaddin, Fahd A.; Humble, Travis S.

    2017-01-01

    We define some of the programming and system-level challenges facing the application of quantum processing to high-performance computing. Alongside barriers to physical integration, prominent differences in the execution of quantum and conventional programs challenge the intersection of these computational models. Following a brief overview of the state of the art, we discuss recent advances in programming and execution models for hybrid quantum-classical computing. We discuss a novel quantu...

  7. Cloud object store for checkpoints of high performance computing applications using decoupling middleware

    Science.gov (United States)

    Bent, John M.; Faibish, Sorin; Grider, Gary

    2016-04-19

    Cloud object storage is enabled for checkpoints of high performance computing applications using a middleware process. A plurality of files, such as checkpoint files, generated by a plurality of processes in a parallel computing system are stored by obtaining said plurality of files from said parallel computing system; converting said plurality of files to objects using a log structured file system middleware process; and providing said objects for storage in a cloud object storage system. The plurality of processes may run, for example, on a plurality of compute nodes. The log structured file system middleware process may be embodied, for example, as a Parallel Log-Structured File System (PLFS). The log structured file system middleware process optionally executes on a burst buffer node.
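
    As a rough, hypothetical sketch of the general idea of turning per-process checkpoint files into objects in a cloud store (not the PLFS-based middleware claimed above), assuming an S3-compatible endpoint via boto3; bucket, key, and path names are made up for the example.

    ```python
    # Illustrative sketch: upload per-process checkpoint files as cloud objects.
    import pathlib
    import boto3

    def checkpoints_to_objects(checkpoint_dir: str, bucket: str, job_id: str) -> None:
        s3 = boto3.client("s3")
        for path in sorted(pathlib.Path(checkpoint_dir).glob("rank*.ckpt")):
            key = f"{job_id}/{path.name}"            # one object per process file
            s3.put_object(Bucket=bucket, Key=key, Body=path.read_bytes())

    # Example (placeholder paths/bucket):
    # checkpoints_to_objects("/scratch/job42/ckpt", "my-checkpoint-bucket", "job42")
    ```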

  8. High performance parallel computers for science

    International Nuclear Information System (INIS)

    Nash, T.; Areti, H.; Atac, R.; Biel, J.; Cook, A.; Deppe, J.; Edel, M.; Fischler, M.; Gaines, I.; Hance, R.

    1989-01-01

    This paper reports that Fermilab's Advanced Computer Program (ACP) has been developing cost-effective, yet practical, parallel computers for high energy physics since 1984. The ACP's latest developments are proceeding in two directions. A Second Generation ACP Multiprocessor System for experiments will include $3500 RISC processors each with performance over 15 VAX MIPS. To support such high performance, the new system allows parallel I/O, parallel interprocess communication, and parallel host processes. The ACP Multi-Array Processor has been developed for theoretical physics. Each $4000 node is a FORTRAN or C programmable pipelined 20 Mflops (peak), 10 MByte single board computer. These are plugged into a 16-port crossbar switch crate which handles both inter- and intra-crate communication. The crates are connected in a hypercube. Site-oriented applications like lattice gauge theory are supported by system software called CANOPY, which makes the hardware virtually transparent to users. A 256 node, 5 GFlop, system is under construction

  9. Enabling High-Performance Computing as a Service

    KAUST Repository

    AbdelBaky, Moustafa

    2012-10-01

    With the right software infrastructure, clouds can provide scientists with as-a-service access to high-performance computing resources. An award-winning prototype framework transforms the Blue Gene/P system into an elastic cloud to run a representative HPC application. © 2012 IEEE.

  10. BEAGLE: an application programming interface and high-performance computing library for statistical phylogenetics.

    Science.gov (United States)

    Ayres, Daniel L; Darling, Aaron; Zwickl, Derrick J; Beerli, Peter; Holder, Mark T; Lewis, Paul O; Huelsenbeck, John P; Ronquist, Fredrik; Swofford, David L; Cummings, Michael P; Rambaut, Andrew; Suchard, Marc A

    2012-01-01

    Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software.
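
    BEAGLE's own C API is not shown in this record. As a minimal, self-contained illustration of the underlying per-site likelihood computation that such a library accelerates, here is Felsenstein's pruning algorithm for one site on a tiny three-taxon tree ((A,B),C) under a Jukes-Cantor model, in plain NumPy; this is not BEAGLE code and the branch lengths are invented.

    ```python
    # Minimal Felsenstein pruning example (one site, tree ((A,B),C)) with NumPy.
    import numpy as np

    def jc69(t: float) -> np.ndarray:
        """Jukes-Cantor transition probability matrix for branch length t."""
        e = np.exp(-4.0 * t / 3.0)
        p_same, p_diff = 0.25 + 0.75 * e, 0.25 - 0.25 * e
        return np.full((4, 4), p_diff) + np.eye(4) * (p_same - p_diff)

    def leaf_partial(state: int) -> np.ndarray:
        v = np.zeros(4)
        v[state] = 1.0          # observed base: 0=A, 1=C, 2=G, 3=T
        return v

    def node_partial(left, t_left, right, t_right):
        # Conditional likelihoods at an internal node given its two children.
        return (jc69(t_left) @ left) * (jc69(t_right) @ right)

    # Site pattern: A at taxon A, A at taxon B, G at taxon C (branch lengths invented).
    ab = node_partial(leaf_partial(0), 0.1, leaf_partial(0), 0.1)
    root = node_partial(ab, 0.05, leaf_partial(2), 0.2)
    likelihood = float(0.25 * root.sum())   # uniform base frequencies at the root
    print("site likelihood:", likelihood)
    ```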

  11. An Application-Based Performance Evaluation of NASA's Nebula Cloud Computing Platform

    Science.gov (United States)

    Saini, Subhash; Heistand, Steve; Jin, Haoqiang; Chang, Johnny; Hood, Robert T.; Mehrotra, Piyush; Biswas, Rupak

    2012-01-01

    The high performance computing (HPC) community has shown tremendous interest in exploring cloud computing because of its high potential benefits. In this paper, we examine the feasibility, performance, and scalability of production-quality scientific and engineering applications of interest to NASA on NASA's cloud computing platform, called Nebula, hosted at Ames Research Center. This work presents a comprehensive evaluation of Nebula using NUTTCP, HPCC, NPB, I/O, and MPI function benchmarks as well as four applications representative of the NASA HPC workload. Specifically, we compare Nebula performance on some of these benchmarks and applications to that of NASA's Pleiades supercomputer, a traditional HPC system. We also investigate the impact of virtIO and jumbo frames on interconnect performance. Overall results indicate that on Nebula (i) virtIO and jumbo frames improve network bandwidth by a factor of 5x, (ii) there is a significant virtualization layer overhead of about 10% to 25%, (iii) write performance is lower by a factor of 25x, (iv) latency for short MPI messages is very high, and (v) overall performance is 15% to 48% lower than that on Pleiades for NASA HPC applications. We also comment on the usability of the cloud platform.

  12. GPU-based high-performance computing for radiation therapy

    International Nuclear Information System (INIS)

    Jia, Xun; Jiang, Steve B; Ziegenhein, Peter

    2014-01-01

    Recent developments in radiation therapy demand high computational power to solve challenging problems in a timely fashion in a clinical environment. The graphics processing unit (GPU), as an emerging high-performance computing platform, has been introduced to radiotherapy. It is particularly attractive due to its high computational power, small size, and low cost for facility deployment and maintenance. Over the past few years, GPU-based high-performance computing in radiotherapy has experienced rapid development. A substantial body of work has been produced, in which large acceleration factors compared with the conventional CPU platform have been observed. In this paper, we first give a brief introduction to the GPU hardware structure and programming model. We then review the current applications of GPU in major imaging-related and therapy-related problems encountered in radiotherapy. A comparison of GPU with other platforms is also presented. (topical review)
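
    As a generic illustration of the GPU programming model the review introduces (not code from any radiotherapy package), here is a minimal data-parallel kernel written with Numba's CUDA JIT; it assumes an NVIDIA GPU and the Numba package, and the workload is an invented elementwise operation.

    ```python
    # Minimal data-parallel GPU kernel with Numba's CUDA JIT (requires an NVIDIA GPU).
    import numpy as np
    from numba import cuda

    @cuda.jit
    def saxpy(a, x, y, out):
        i = cuda.grid(1)                    # global thread index
        if i < x.shape[0]:
            out[i] = a * x[i] + y[i]        # one array element per GPU thread

    n = 1 << 20
    x = np.random.rand(n).astype(np.float32)
    y = np.random.rand(n).astype(np.float32)
    out = np.zeros_like(x)

    threads_per_block = 256
    blocks = (n + threads_per_block - 1) // threads_per_block
    saxpy[blocks, threads_per_block](np.float32(2.0), x, y, out)
    print("max error:", float(np.abs(out - (2.0 * x + y)).max()))
    ```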

  13. Software Systems for High-performance Quantum Computing

    Energy Technology Data Exchange (ETDEWEB)

    Humble, Travis S [ORNL; Britt, Keith A [ORNL

    2016-01-01

    Quantum computing promises new opportunities for solving hard computational problems, but harnessing this novelty requires breakthrough concepts in the design, operation, and application of computing systems. We define some of the challenges facing the development of quantum computing systems as well as software-based approaches that can be used to overcome these challenges. Following a brief overview of the state of the art, we present quantum programming and execution models, the development of architectures for hybrid high-performance computing systems, and the realization of software stacks for quantum networking. This leads to a discussion of the role that conventional computing plays in the quantum paradigm and how some of the current challenges for exascale computing overlap with those facing quantum computing.

  14. Inclusive vision for high performance computing at the CSIR

    CSIR Research Space (South Africa)

    Gazendam, A

    2006-02-01

    ...and computationally intensive applications. A number of different technologies and standards were identified as core to the open and distributed high-performance infrastructure envisaged...

  15. A high performance scientific cloud computing environment for materials simulations

    Science.gov (United States)

    Jorissen, K.; Vila, F. D.; Rehr, J. J.

    2012-09-01

    We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including tools for execution and monitoring performance, as well as efficient I/O utilities that enable seamless connections to and from the cloud. Our SCC platform is optimized for the Amazon Elastic Compute Cloud (EC2). We present benchmarks for prototypical scientific applications and demonstrate performance comparable to local compute clusters. To facilitate code execution and provide user-friendly access, we have also integrated cloud computing capability in a JAVA-based GUI. Our SCC platform may be an alternative to traditional HPC resources for materials science or quantum chemistry applications.
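
    The SCC toolset itself is not reproduced in this record. As a rough, hypothetical sketch of programmatically provisioning a small "virtual cluster" on EC2 with boto3, the kind of automation such a toolset wraps, here is a toy example; the AMI ID, instance type, and key pair name are placeholders.

    ```python
    # Toy sketch: launch a handful of EC2 instances to act as a virtual cluster.
    import boto3

    def launch_virtual_cluster(n_nodes: int = 4):
        ec2 = boto3.client("ec2")
        resp = ec2.run_instances(
            ImageId="ami-0123456789abcdef0",   # placeholder scientific VM image
            InstanceType="c5.xlarge",
            MinCount=n_nodes,
            MaxCount=n_nodes,
            KeyName="my-keypair",              # placeholder key pair
        )
        return [inst["InstanceId"] for inst in resp["Instances"]]

    # ids = launch_virtual_cluster(4)  # then configure MPI/NFS across the nodes
    ```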

  16. Computational Biology and High Performance Computing 2000

    Energy Technology Data Exchange (ETDEWEB)

    Simon, Horst D.; Zorn, Manfred D.; Spengler, Sylvia J.; Shoichet, Brian K.; Stewart, Craig; Dubchak, Inna L.; Arkin, Adam P.

    2000-10-19

    The pace of extraordinary advances in molecular biology has accelerated in the past decade due in large part to discoveries coming from genome projects on human and model organisms. The advances in the genome project so far, happening well ahead of schedule and under budget, have exceeded any dreams by its protagonists, let alone formal expectations. Biologists expect the next phase of the genome project to be even more startling in terms of dramatic breakthroughs in our understanding of human biology, the biology of health and of disease. Only today can biologists begin to envision the necessary experimental, computational and theoretical steps necessary to exploit genome sequence information for its medical impact, its contribution to biotechnology and economic competitiveness, and its ultimate contribution to environmental quality. High performance computing has become one of the critical enabling technologies, which will help to translate this vision of future advances in biology into reality. Biologists are increasingly becoming aware of the potential of high performance computing. The goal of this tutorial is to introduce the exciting new developments in computational biology and genomics to the high performance computing community.

  17. Contemporary high performance computing from petascale toward exascale

    CERN Document Server

    Vetter, Jeffrey S

    2015-01-01

    A continuation of Contemporary High Performance Computing: From Petascale toward Exascale, this second volume continues the discussion of HPC flagship systems, major application workloads, facilities, and sponsors. The book includes figures and pictures that capture the state of existing systems: pictures of buildings, systems in production, floorplans, and many block diagrams and charts to illustrate system design and performance.

  18. HIGH-PERFORMANCE COMPUTING FOR THE STUDY OF EARTH AND ENVIRONMENTAL SCIENCE MATERIALS USING SYNCHROTRON X-RAY COMPUTED MICROTOMOGRAPHY

    International Nuclear Information System (INIS)

    FENG, H.; JONES, K.W.; MCGUIGAN, M.; SMITH, G.J.; SPILETIC, J.

    2001-01-01

    Synchrotron x-ray computed microtomography (CMT) is a non-destructive method for examination of rock, soil, and other types of samples studied in the earth and environmental sciences. The high x-ray intensities of the synchrotron source make possible the acquisition of tomographic volumes at a high rate that requires the application of high-performance computing techniques for data reconstruction to produce the three-dimensional volumes, for their visualization, and for data analysis. These problems are exacerbated by the need to share information between collaborators at widely separated locations over both local and wide-area networks. A summary of the CMT technique and examples of applications are given here together with a discussion of the applications of high-performance computing methods to improve the experimental techniques and analysis of the data.

  19. HIGH-PERFORMANCE COMPUTING FOR THE STUDY OF EARTH AND ENVIRONMENTAL SCIENCE MATERIALS USING SYNCHROTRON X-RAY COMPUTED MICROTOMOGRAPHY.

    Energy Technology Data Exchange (ETDEWEB)

    FENG,H.; JONES,K.W.; MCGUIGAN,M.; SMITH,G.J.; SPILETIC,J.

    2001-10-12

    Synchrotron x-ray computed microtomography (CMT) is a non-destructive method for examination of rock, soil, and other types of samples studied in the earth and environmental sciences. The high x-ray intensities of the synchrotron source make possible the acquisition of tomographic volumes at a high rate that requires the application of high-performance computing techniques for data reconstruction to produce the three-dimensional volumes, for their visualization, and for data analysis. These problems are exacerbated by the need to share information between collaborators at widely separated locations over both local and wide-area networks. A summary of the CMT technique and examples of applications are given here together with a discussion of the applications of high-performance computing methods to improve the experimental techniques and analysis of the data.
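
    As a small, hypothetical illustration of the tomographic reconstruction step at the heart of CMT pipelines (filtered back-projection on a standard test phantom, not the parallel reconstruction codes discussed above), assuming scikit-image is available:

    ```python
    # Illustrative filtered back-projection on a 2D test slice with scikit-image.
    import numpy as np
    from skimage.data import shepp_logan_phantom
    from skimage.transform import radon, iradon, rescale

    phantom = rescale(shepp_logan_phantom(), 0.25)        # small test slice
    angles = np.linspace(0.0, 180.0, 90, endpoint=False)  # projection angles (deg)
    sinogram = radon(phantom, theta=angles)               # simulate projections
    reconstruction = iradon(sinogram, theta=angles)       # filtered back-projection
    print("mean abs error:", float(np.abs(reconstruction - phantom).mean()))
    ```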

  20. 5th International Conference on High Performance Scientific Computing

    CERN Document Server

    Hoang, Xuan; Rannacher, Rolf; Schlöder, Johannes

    2014-01-01

    This proceedings volume gathers a selection of papers presented at the Fifth International Conference on High Performance Scientific Computing, which took place in Hanoi on March 5-9, 2012. The conference was organized by the Institute of Mathematics of the Vietnam Academy of Science and Technology (VAST), the Interdisciplinary Center for Scientific Computing (IWR) of Heidelberg University, Ho Chi Minh City University of Technology, and the Vietnam Institute for Advanced Study in Mathematics. The contributions cover the broad interdisciplinary spectrum of scientific computing and present recent advances in theory, development of methods, and practical applications. Subjects covered include mathematical modeling; numerical simulation; methods for optimization and control; parallel computing; software development; and applications of scientific computing in physics, mechanics and biomechanics, material science, hydrology, chemistry, biology, biotechnology, medicine, sports, psychology, transport, logistics, com...

  1. 3rd International Conference on High Performance Scientific Computing

    CERN Document Server

    Kostina, Ekaterina; Phu, Hoang; Rannacher, Rolf

    2008-01-01

    This proceedings volume contains a selection of papers presented at the Third International Conference on High Performance Scientific Computing held at the Hanoi Institute of Mathematics, Vietnamese Academy of Science and Technology (VAST), March 6-10, 2006. The conference has been organized by the Hanoi Institute of Mathematics, Interdisciplinary Center for Scientific Computing (IWR), Heidelberg, and its International PhD Program ``Complex Processes: Modeling, Simulation and Optimization'', and Ho Chi Minh City University of Technology. The contributions cover the broad interdisciplinary spectrum of scientific computing and present recent advances in theory, development of methods, and applications in practice. Subjects covered are mathematical modelling, numerical simulation, methods for optimization and control, parallel computing, software development, applications of scientific computing in physics, chemistry, biology and mechanics, environmental and hydrology problems, transport, logistics and site loca...

  2. 6th International Conference on High Performance Scientific Computing

    CERN Document Server

    Phu, Hoang; Rannacher, Rolf; Schlöder, Johannes

    2017-01-01

    This proceedings volume highlights a selection of papers presented at the Sixth International Conference on High Performance Scientific Computing, which took place in Hanoi, Vietnam on March 16-20, 2015. The conference was jointly organized by the Heidelberg Institute of Theoretical Studies (HITS), the Institute of Mathematics of the Vietnam Academy of Science and Technology (VAST), the Interdisciplinary Center for Scientific Computing (IWR) at Heidelberg University, and the Vietnam Institute for Advanced Study in Mathematics, Ministry of Education. The contributions cover a broad, interdisciplinary spectrum of scientific computing and showcase recent advances in theory, methods, and practical applications. Subjects covered include numerical simulation, methods for optimization and control, parallel computing, and software development, as well as the applications of scientific computing in physics, mechanics, biomechanics and robotics, material science, hydrology, biotechnology, medicine, transport, scheduling, and in...

  3. A High Performance VLSI Computer Architecture For Computer Graphics

    Science.gov (United States)

    Chin, Chi-Yuan; Lin, Wen-Tai

    1988-10-01

    A VLSI computer architecture, consisting of multiple processors, is presented in this paper to satisfy modern computer graphics demands, e.g. high resolution, realistic animation, and real-time display. All processors share a global memory which is partitioned into multiple banks. Through a crossbar network, data from one memory bank can be broadcast to many processors. Processors are physically interconnected through a hyper-crossbar network (a crossbar-like network). By programming the network, the topology of communication links among processors can be reconfigured to satisfy the specific dataflows of different applications. Each processor consists of a controller, arithmetic operators, local memory, a local crossbar network, and I/O ports to communicate with other processors, memory banks, and a system controller. Operations in each processor are characterized into two modes, i.e. object domain and space domain, to fully utilize the data-independence characteristics of graphics processing. Special graphics features such as 3D-to-2D conversion, shadow generation, texturing, and reflection can be easily handled. With the current high density interconnection (MI) technology, it is feasible to implement a 64-processor system to achieve 2.5 billion operations per second, a performance needed in most advanced graphics applications.

  4. High-Performance Compute Infrastructure in Astronomy: 2020 Is Only Months Away

    Science.gov (United States)

    Berriman, B.; Deelman, E.; Juve, G.; Rynge, M.; Vöckler, J. S.

    2012-09-01

    By 2020, astronomy will be awash with as much as 60 PB of public data. Full scientific exploitation of such massive volumes of data will require high-performance computing on server farms co-located with the data. Development of this computing model will be a community-wide enterprise that has profound cultural and technical implications. Astronomers must be prepared to develop environment-agnostic applications that support parallel processing. The community must investigate the applicability and cost-benefit of emerging technologies such as cloud computing to astronomy, and must engage the Computer Science community to develop science-driven cyberinfrastructure such as workflow schedulers and optimizers. We report here the results of collaborations between a science center, IPAC, and a Computer Science research institute, ISI. These collaborations may be considered pathfinders in developing a high-performance compute infrastructure in astronomy. These collaborations investigated two exemplar large-scale science-driver workflow applications: 1) calculation of an infrared atlas of the Galactic Plane at 18 different wavelengths by placing data from multiple surveys on a common plate scale and co-registering all the pixels; 2) calculation of an atlas of periodicities present in the public Kepler data sets, which currently contain 380,000 light curves. These products have been generated with two workflow applications, written in C for performance and designed to support parallel processing on multiple environments and platforms, but with different compute resource needs: the Montage image mosaic engine is I/O-bound, and the NASA Star and Exoplanet Database periodogram code is CPU-bound. Our presentation will report cost and performance metrics and lessons learned for continuing development. Applicability of Cloud Computing: Commercial Cloud providers generally charge for all operations, including processing, transfer of input and output data, and storage of data
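
    To make the CPU-bound periodogram workload concrete, here is a tiny Lomb-Scargle example on a synthetic, unevenly sampled light curve using SciPy; this is an illustration in the spirit of the Kepler periodogram atlas described above, not the NASA periodogram code, and all numbers are invented.

    ```python
    # Small CPU-bound periodogram example on a synthetic light curve (illustration only).
    import numpy as np
    from scipy.signal import lombscargle

    rng = np.random.default_rng(1)
    t = np.sort(rng.uniform(0.0, 30.0, 500))       # unevenly sampled times (days)
    period_true = 3.7
    flux = (1.0 + 0.01 * np.sin(2 * np.pi * t / period_true)
            + 0.002 * rng.standard_normal(t.size))

    periods = np.linspace(0.5, 10.0, 4000)
    ang_freqs = 2 * np.pi / periods                # angular frequencies for lombscargle
    power = lombscargle(t, flux - flux.mean(), ang_freqs)
    print("best-fit period (days):", periods[np.argmax(power)])
    ```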

  5. Heads in the Cloud: A Primer on Neuroimaging Applications of High Performance Computing.

    Science.gov (United States)

    Shatil, Anwar S; Younas, Sohail; Pourreza, Hossein; Figley, Chase R

    2015-01-01

    With larger data sets and more sophisticated analyses, it is becoming increasingly common for neuroimaging researchers to push (or exceed) the limitations of standalone computer workstations. Nonetheless, although high-performance computing platforms such as clusters, grids and clouds are already in routine use by a small handful of neuroimaging researchers to increase their storage and/or computational power, the adoption of such resources by the broader neuroimaging community remains relatively uncommon. Therefore, the goal of the current manuscript is to: 1) inform prospective users about the similarities and differences between computing clusters, grids and clouds; 2) highlight their main advantages; 3) discuss when it may (and may not) be advisable to use them; 4) review some of their potential problems and barriers to access; and finally 5) give a few practical suggestions for how interested new users can start analyzing their neuroimaging data using cloud resources. Although the aim of cloud computing is to hide most of the complexity of the infrastructure management from end-users, we recognize that this can still be an intimidating area for cognitive neuroscientists, psychologists, neurologists, radiologists, and other neuroimaging researchers lacking a strong computational background. Therefore, with this in mind, we have aimed to provide a basic introduction to cloud computing in general (including some of the basic terminology, computer architectures, infrastructure and service models, etc.), a practical overview of the benefits and drawbacks, and a specific focus on how cloud resources can be used for various neuroimaging applications.

  6. Analysis of Application Power and Schedule Composition in a High Performance Computing Environment

    Energy Technology Data Exchange (ETDEWEB)

    Elmore, Ryan [National Renewable Energy Lab. (NREL), Golden, CO (United States); Gruchalla, Kenny [National Renewable Energy Lab. (NREL), Golden, CO (United States); Phillips, Caleb [National Renewable Energy Lab. (NREL), Golden, CO (United States); Purkayastha, Avi [National Renewable Energy Lab. (NREL), Golden, CO (United States); Wunder, Nick [National Renewable Energy Lab. (NREL), Golden, CO (United States)

    2016-01-05

    As the capacity of high performance computing (HPC) systems continues to grow, small changes in energy management have the potential to produce significant energy savings. In this paper, we employ an extensive informatics system for aggregating and analyzing real-time performance and power use data to evaluate energy footprints of jobs running in an HPC data center. We look at the effects of algorithmic choices for a given job on the resulting energy footprints, analyze application-specific power consumption, and summarize average power use in the aggregate. All of these views reveal meaningful power variance between classes of applications as well as chosen methods for a given job. Using these data, we discuss energy-aware cost-saving strategies based on reordering the HPC job schedule. Using historical job and power data, we present a hypothetical job schedule reordering that: (1) reduces the facility's peak power draw and (2) manages power in conjunction with a large-scale photovoltaic array. Lastly, we leverage this data to understand the practical limits on predicting key power use metrics at the time of submission.
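
    A minimal sketch of the kind of greedy schedule-reordering heuristic discussed above, assuming hypothetical job records with known average power draw; the study itself used historical job and power data and facility constraints not modelled here:

        # Greedy reordering: start the most power-hungry queued jobs first, but
        # only while the running set stays under a chosen peak-power cap.
        # Job records and the cap are illustrative, not NREL data; for brevity,
        # started jobs are assumed to keep running.
        def reorder(jobs, power_cap):
            """jobs: list of (name, avg_kw); returns (start order, deferred jobs)."""
            order, load = [], 0.0
            pending = sorted(jobs, key=lambda j: j[1], reverse=True)
            while pending:
                pick = next((j for j in pending if load + j[1] <= power_cap), None)
                if pick is None:           # nothing fits: wait for capacity
                    break
                pending.remove(pick)
                order.append(pick[0])
                load += pick[1]
            return order, pending

        print(reorder([("cfd", 90.0), ("viz", 20.0), ("ml", 45.0)], power_cap=120.0))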

  7. High-performance computing — an overview

    Science.gov (United States)

    Marksteiner, Peter

    1996-08-01

    An overview of high-performance computing (HPC) is given. Different types of computer architectures used in HPC are discussed: vector supercomputers, high-performance RISC processors, various parallel computers like symmetric multiprocessors, workstation clusters, massively parallel processors. Software tools and programming techniques used in HPC are reviewed: vectorizing compilers, optimization and vector tuning, optimization for RISC processors; parallel programming techniques like shared-memory parallelism, message passing and data parallelism; and numerical libraries.
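
    To make one of the surveyed programming techniques concrete, a toy data-parallel example follows (standard-library Python rather than the HPC languages surveyed; the work decomposition, not the kernel, is the point):

        # Data parallelism: the same operation applied to disjoint chunks of a
        # large array, with a process pool standing in for the nodes of a
        # parallel machine.
        from multiprocessing import Pool

        def kernel(chunk):
            return sum(x * x for x in chunk)          # per-chunk partial result

        if __name__ == "__main__":
            data = list(range(1_000_000))
            chunks = [data[i::4] for i in range(4)]   # 4-way decomposition
            with Pool(4) as pool:
                partials = pool.map(kernel, chunks)   # scatter work, gather results
            print(sum(partials))                      # final reduction step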

  8. Computer performance optimization systems, applications, processes

    CERN Document Server

    Osterhage, Wolfgang W

    2013-01-01

    Computing power performance was important at times when hardware was still expensive, because hardware had to be put to the best use. Later on this criterion was no longer critical, since hardware had become inexpensive. Meanwhile, however, people have realized that performance again plays a significant role, because of the major drain on system resources involved in developing complex applications. This book distinguishes between three levels of performance optimization: the system level, application level and business processes level. On each, optimizations can be achieved and cost-cutting potential identified.

  9. Bringing high-performance computing to the biologist's workbench: approaches, applications, and challenges

    International Nuclear Information System (INIS)

    Oehmen, C S; Cannon, W R

    2008-01-01

    Data-intensive and high-performance computing are poised to significantly impact the future of biological research, which is increasingly driven by the prevalence of high-throughput experimental methodologies for genome sequencing, transcriptomics, proteomics, and other areas. Large centers such as NIH's National Center for Biotechnology Information, The Institute for Genomic Research, and the DOE's Joint Genome Institute have made extensive use of multiprocessor architectures to deal with some of the challenges of processing, storing and curating exponentially growing genomic and proteomic datasets, thus enabling users to rapidly access a growing public data source, as well as use analysis tools transparently on high-performance computing resources. Applying this computational power to single-investigator analysis, however, often relies on users to provide their own computational resources, forcing them to endure the learning curve of porting, building, and running software on multiprocessor architectures. Solving the next generation of large-scale biology challenges using multiprocessor machines-from small clusters to emerging petascale machines-can most practically be realized if this learning curve can be minimized through a combination of workflow management, data management and resource allocation as well as intuitive interfaces and compatibility with existing common data formats.

  10. Benchmarking high performance computing architectures with CMS’ skeleton framework

    Science.gov (United States)

    Sexton-Kennedy, E.; Gartung, P.; Jones, C. D.

    2017-10-01

    In 2012 CMS evaluated which underlying concurrency technology would be the best to use for its multi-threaded framework. The available technologies were evaluated on the high throughput computing systems dominating the resources in use at that time. A skeleton framework benchmarking suite that emulates the tasks performed within a CMSSW application was used to select Intel's Threading Building Blocks library, based on the measured overheads in both memory and CPU on the different technologies benchmarked. In 2016 CMS will get access to high performance computing resources that use new many-core architectures; machines such as Cori Phase 1&2, Theta, and Mira. Because of this we have revived the 2012 benchmark to test its performance and conclusions on these new architectures. This talk will discuss the results of this exercise.

  11. A High Performance COTS Based Computer Architecture

    Science.gov (United States)

    Patte, Mathieu; Grimoldi, Raoul; Trautner, Roland

    2014-08-01

    Using Commercial Off The Shelf (COTS) electronic components for space applications is a long-standing idea. Indeed, the difference in processing performance and energy efficiency between radiation-hardened components and COTS components is so important that COTS components are very attractive for use in mass- and power-constrained systems. However, using COTS components in space is not straightforward, as one must account for the effects of the space environment on the COTS components' behavior. In the frame of the ESA-funded activity called High Performance COTS Based Computer, Airbus Defense and Space and its subcontractor OHB CGS have developed and prototyped a versatile COTS-based architecture for high performance processing. The rest of the paper is organized as follows: in a first section we will start by recapitulating the interests and constraints of using COTS components for space applications; then we will briefly describe existing fault mitigation architectures and present our solution for fault mitigation, based on a component called the SmartIO; in the last part of the paper we will describe the prototyping activities executed during the HiP CBC project.

  12. Heads in the Cloud: A Primer on Neuroimaging Applications of High Performance Computing

    Science.gov (United States)

    Shatil, Anwar S.; Younas, Sohail; Pourreza, Hossein; Figley, Chase R.

    2015-01-01

    With larger data sets and more sophisticated analyses, it is becoming increasingly common for neuroimaging researchers to push (or exceed) the limitations of standalone computer workstations. Nonetheless, although high-performance computing platforms such as clusters, grids and clouds are already in routine use by a small handful of neuroimaging researchers to increase their storage and/or computational power, the adoption of such resources by the broader neuroimaging community remains relatively uncommon. Therefore, the goal of the current manuscript is to: 1) inform prospective users about the similarities and differences between computing clusters, grids and clouds; 2) highlight their main advantages; 3) discuss when it may (and may not) be advisable to use them; 4) review some of their potential problems and barriers to access; and finally 5) give a few practical suggestions for how interested new users can start analyzing their neuroimaging data using cloud resources. Although the aim of cloud computing is to hide most of the complexity of the infrastructure management from end-users, we recognize that this can still be an intimidating area for cognitive neuroscientists, psychologists, neurologists, radiologists, and other neuroimaging researchers lacking a strong computational background. Therefore, with this in mind, we have aimed to provide a basic introduction to cloud computing in general (including some of the basic terminology, computer architectures, infrastructure and service models, etc.), a practical overview of the benefits and drawbacks, and a specific focus on how cloud resources can be used for various neuroimaging applications. PMID:27279746

  13. Heads in the Cloud: A Primer on Neuroimaging Applications of High Performance Computing

    Directory of Open Access Journals (Sweden)

    Anwar S. Shatil

    2015-01-01

    Full Text Available With larger data sets and more sophisticated analyses, it is becoming increasingly common for neuroimaging researchers to push (or exceed) the limitations of standalone computer workstations. Nonetheless, although high-performance computing platforms such as clusters, grids and clouds are already in routine use by a small handful of neuroimaging researchers to increase their storage and/or computational power, the adoption of such resources by the broader neuroimaging community remains relatively uncommon. Therefore, the goal of the current manuscript is to: 1) inform prospective users about the similarities and differences between computing clusters, grids and clouds; 2) highlight their main advantages; 3) discuss when it may (and may not) be advisable to use them; 4) review some of their potential problems and barriers to access; and finally 5) give a few practical suggestions for how interested new users can start analyzing their neuroimaging data using cloud resources. Although the aim of cloud computing is to hide most of the complexity of the infrastructure management from end-users, we recognize that this can still be an intimidating area for cognitive neuroscientists, psychologists, neurologists, radiologists, and other neuroimaging researchers lacking a strong computational background. Therefore, with this in mind, we have aimed to provide a basic introduction to cloud computing in general (including some of the basic terminology, computer architectures, infrastructure and service models, etc.), a practical overview of the benefits and drawbacks, and a specific focus on how cloud resources can be used for various neuroimaging applications.

  14. High performance computing in Windows Azure cloud

    OpenAIRE

    Ambruš, Dejan

    2013-01-01

    High performance, security, availability, scalability, flexibility and lower maintenance costs have contributed substantially to the growing popularity of cloud computing in all spheres of life, especially in business. In fact, cloud computing offers even more than this. Using virtual computing clusters, a runtime environment for high performance computing can also be implemented efficiently in a cloud. There are many advantages but also some disadvantages of cloud computing, some ...

  15. SCEAPI: A unified Restful Web API for High-Performance Computing

    Science.gov (United States)

    Rongqiang, Cao; Haili, Xiao; Shasha, Lu; Yining, Zhao; Xiaoning, Wang; Xuebin, Chi

    2017-10-01

    The development of scientific computing is increasingly moving to collaborative web and mobile applications. All these applications need high-quality programming interfaces for accessing heterogeneous computing resources consisting of clusters, grid computing or cloud computing. In this paper, we introduce our high-performance computing environment that integrates computing resources from 16 HPC centers across China. Then we present a bundle of web services called SCEAPI and describe how it can be used to access HPC resources with HTTP or HTTPS protocols. We discuss SCEAPI from several aspects including architecture, implementation and security, and address specific challenges in designing compatible interfaces and protecting sensitive data. We describe the functions of SCEAPI including authentication, file transfer and job management for creating, submitting and monitoring jobs, and how to use SCEAPI in an easy-to-use way. Finally, we discuss how to exploit more HPC resources quickly for the ATLAS experiment by implementing a custom ARC compute element based on SCEAPI, and our work shows that SCEAPI is an easy-to-use and effective solution to extend opportunistic HPC resources.
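
    A hedged sketch of what job submission through a RESTful HPC web API of this kind typically looks like; the endpoint paths, token scheme and JSON fields below are hypothetical and are not the documented SCEAPI interface:

        # Hypothetical REST workflow: authenticate, submit a job, poll its state.
        # Endpoint URLs and payload fields are invented for illustration.
        import json, time, urllib.request

        BASE = "https://hpc.example.org/api"          # placeholder service

        def call(path, payload=None, token=None):
            headers = {"Content-Type": "application/json"}
            if token:
                headers["Authorization"] = "Bearer " + token
            data = json.dumps(payload).encode() if payload is not None else None
            req = urllib.request.Request(BASE + path, data=data, headers=headers)
            with urllib.request.urlopen(req) as resp:     # POST if data, else GET
                return json.load(resp)

        token = call("/auth", {"user": "alice", "password": "secret"})["token"]
        job = call("/jobs", {"script": "run_simulation.sh", "cores": 128}, token)
        while call("/jobs/%s" % job["id"], token=token)["state"] not in ("DONE", "FAILED"):
            time.sleep(30)                                # poll until the job finishes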

  16. Parallel Backprojection: A Case Study in High-Performance Reconfigurable Computing

    Directory of Open Access Journals (Sweden)

    Cordes Ben

    2009-01-01

    Full Text Available High-performance reconfigurable computing (HPRC) is a novel approach to provide large-scale computing power to modern scientific applications. Using both general-purpose processors and FPGAs allows application designers to exploit fine-grained and coarse-grained parallelism, achieving high degrees of speedup. One scientific application that benefits from this technique is backprojection, an image formation algorithm that can be used as part of a synthetic aperture radar (SAR) processing system. We present an implementation of backprojection for SAR on an HPRC system. Using simulated data taken at a variety of ranges, our implementation runs over 200 times faster than a similar software program, with an overall application speedup better than 50x. The backprojection application is easily parallelizable, achieving near-linear speedup when run on multiple nodes of a clustered HPRC system. The results presented can be applied to other systems and other algorithms with similar characteristics.
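
    A minimal, unoptimized sketch of the backprojection idea described above (pure Python with toy geometry and data; the implementation in the paper targets FPGAs and real SAR pulses):

        # Backprojection: each output pixel accumulates, over all radar pulses,
        # the echo sample whose round-trip delay matches the pulse-to-pixel range.
        import math

        def backproject(pulses, positions, pixels, c=3e8, fs=1e8):
            image = {}
            for px in pixels:                             # (x, y) ground point
                acc = 0.0
                for echo, (sx, sy, sz) in zip(pulses, positions):
                    r = math.dist((sx, sy, sz), (px[0], px[1], 0.0))
                    idx = int(round(2 * r / c * fs))      # range-bin index
                    if 0 <= idx < len(echo):
                        acc += echo[idx]
                image[px] = acc
            return image

        # Toy call: one pulse recorded 100 m above a single ground pixel.
        print(backproject([[1.0] * 128], [(0.0, 0.0, 100.0)], [(0.0, 0.0)]))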

  17. Parallel Backprojection: A Case Study in High-Performance Reconfigurable Computing

    Directory of Open Access Journals (Sweden)

    2009-03-01

    Full Text Available High-performance reconfigurable computing (HPRC) is a novel approach to provide large-scale computing power to modern scientific applications. Using both general-purpose processors and FPGAs allows application designers to exploit fine-grained and coarse-grained parallelism, achieving high degrees of speedup. One scientific application that benefits from this technique is backprojection, an image formation algorithm that can be used as part of a synthetic aperture radar (SAR) processing system. We present an implementation of backprojection for SAR on an HPRC system. Using simulated data taken at a variety of ranges, our implementation runs over 200 times faster than a similar software program, with an overall application speedup better than 50x. The backprojection application is easily parallelizable, achieving near-linear speedup when run on multiple nodes of a clustered HPRC system. The results presented can be applied to other systems and other algorithms with similar characteristics.

  18. High performance computing and communications: Advancing the frontiers of information technology

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-12-31

    This report, which supplements the President's Fiscal Year 1997 Budget, describes the interagency High Performance Computing and Communications (HPCC) Program. The HPCC Program will celebrate its fifth anniversary in October 1996 with an impressive array of accomplishments to its credit. Over its five-year history, the HPCC Program has focused on developing high performance computing and communications technologies that can be applied to computation-intensive applications. Major highlights for FY 1996: (1) High performance computing systems enable practical solutions to complex problems with accuracies not possible five years ago; (2) HPCC-funded research in very large scale networking techniques has been instrumental in the evolution of the Internet, which continues exponential growth in size, speed, and availability of information; (3) The combination of hardware capability measured in gigaflop/s, networking technology measured in gigabit/s, and new computational science techniques for modeling phenomena has demonstrated that very large scale accurate scientific calculations can be executed across heterogeneous parallel processing systems located thousands of miles apart; (4) Federal investments in HPCC software R and D support researchers who pioneered the development of parallel languages and compilers, high performance mathematical, engineering, and scientific libraries, and software tools--technologies that allow scientists to use powerful parallel systems to focus on Federal agency mission applications; and (5) HPCC support for virtual environments has enabled the development of immersive technologies, where researchers can explore and manipulate multi-dimensional scientific and engineering problems. Educational programs fostered by the HPCC Program have brought into classrooms new science and engineering curricula designed to teach computational science. This document contains a small sample of the significant HPCC Program accomplishments in FY 1996.

  19. Multi-Language Programming Environments for High Performance Java Computing

    OpenAIRE

    Vladimir Getov; Paul Gray; Sava Mintchev; Vaidy Sunderam

    1999-01-01

    Recent developments in processor capabilities, software tools, programming languages and programming paradigms have brought about new approaches to high performance computing. A steadfast component of this dynamic evolution has been the scientific community’s reliance on established scientific packages. As a consequence, programmers of high‐performance applications are reluctant to embrace evolving languages such as Java. This paper describes the Java‐to‐C Interface (JCI) tool which provides ...

  20. A Heterogeneous High-Performance System for Computational and Computer Science

    Science.gov (United States)

    2016-11-15

    expand the research infrastructure at the institution but also to enhance the high-performance computing training provided to both undergraduate and... cloud computing, supercomputing, and the availability of cheap memory and storage led to enormous amounts of data to be sifted through in forensic... High-Performance Computing (HPC) tools that can be integrated with existing curricula and support our research to modernize and dramatically advance

  1. Parameters that affect parallel processing for computational electromagnetic simulation codes on high performance computing clusters

    Science.gov (United States)

    Moon, Hongsik

    What is the impact of multicore and associated advanced technologies on computational software for science? Most researchers and students have multicore laptops or desktops for their research and they need computing power to run computational software packages. Computing power was initially derived from Central Processing Unit (CPU) clock speed. That changed when increases in clock speed became constrained by power requirements. Chip manufacturers turned to multicore CPU architectures and associated technological advancements to create the CPUs for the future. Most software applications benefited from the increased computing power the same way that increases in clock speed helped applications run faster. However, for Computational ElectroMagnetics (CEM) software developers, this change was not an obvious benefit - it appeared to be a detriment. Developers were challenged to find a way to correctly utilize the advancements in hardware so that their codes could benefit. The solution was parallelization, and this dissertation details the investigation to address these challenges. Prior to multicore CPUs, advanced computer technologies were compared on performance using benchmark software, and the metric was FLoating-point Operations Per Second (FLOPS), which indicates system performance for scientific applications that make heavy use of floating-point calculations. Is FLOPS an effective metric for parallelized CEM simulation tools on new multicore systems? Parallel CEM software needs to be benchmarked not only by FLOPS but also by the performance of other parameters related to type and utilization of the hardware, such as CPU, Random Access Memory (RAM), hard disk, network, etc. The codes need to be optimized for more than just FLOPS and new parameters must be included in benchmarking. In this dissertation, the parallel CEM software named High Order Basis Based Integral Equation Solver (HOBBIES) is introduced. This code was developed to address the needs of the
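
    A small sketch of the achieved-FLOPS measurement that the dissertation argues is insufficient on its own (a naive pure-Python matrix multiply, purely illustrative; real benchmarks use optimized kernels):

        # Measure an achieved floating-point rate for a naive dense matrix
        # multiply; 2*n**3 (one multiply and one add per inner step) is the
        # standard operation count for C = A * B.
        import random, time

        def matmul_flops(n=200):
            a = [[random.random() for _ in range(n)] for _ in range(n)]
            b = [[random.random() for _ in range(n)] for _ in range(n)]
            t0 = time.perf_counter()
            c = [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
                 for i in range(n)]
            elapsed = time.perf_counter() - t0
            return 2 * n ** 3 / elapsed       # floating-point operations per second

        print("achieved FLOPS:", matmul_flops())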

  2. DEISA2: supporting and developing a European high-performance computing ecosystem

    International Nuclear Information System (INIS)

    Lederer, H

    2008-01-01

    The DEISA Consortium has deployed and operated the Distributed European Infrastructure for Supercomputing Applications. Through the EU FP7 DEISA2 project (funded for three years as of May 2008), the consortium is continuing to support and enhance the distributed high-performance computing infrastructure and its activities and services relevant for applications enabling, operation, and technologies, as these are indispensable for the effective support of computational sciences for high-performance computing (HPC). The service-provisioning model will be extended from one that supports single projects to one supporting virtual European communities. Collaborative activities will also be carried out with new European and other international initiatives. Of strategic importance is cooperation with the PRACE project, which is preparing for the installation of a limited number of leadership-class Tier-0 supercomputers in Europe. The key role and aim of DEISA will be to deliver a turnkey operational solution for a persistent European HPC ecosystem that will integrate national Tier-1 centers and the new Tier-0 centers

  3. High performance parallel computers for science: New developments at the Fermilab advanced computer program

    International Nuclear Information System (INIS)

    Nash, T.; Areti, H.; Atac, R.

    1988-08-01

    Fermilab's Advanced Computer Program (ACP) has been developing highly cost-effective, yet practical, parallel computers for high energy physics since 1984. The ACP's latest developments are proceeding in two directions. A Second Generation ACP Multiprocessor System for experiments will include $3500 RISC processors, each with performance over 15 VAX MIPS. To support such high performance, the new system allows parallel I/O, parallel interprocess communication, and parallel host processes. The ACP Multi-Array Processor has been developed for theoretical physics. Each $4000 node is a FORTRAN- or C-programmable pipelined 20 MFlops (peak), 10 MByte single-board computer. These are plugged into a 16-port crossbar switch crate which handles both inter- and intra-crate communication. The crates are connected in a hypercube. Site-oriented applications like lattice gauge theory are supported by system software called CANOPY, which makes the hardware virtually transparent to users. A 256-node, 5 GFlop system is under construction. 10 refs., 7 figs

  4. High Performance Numerical Computing for High Energy Physics: A New Challenge for Big Data Science

    International Nuclear Information System (INIS)

    Pop, Florin

    2014-01-01

    Modern physics is based on both theoretical analysis and experimental validation. Complex scenarios like subatomic dimensions, high energy, and lower absolute temperature are frontiers for many theoretical models. Simulation with stable numerical methods represents an excellent instrument for high accuracy analysis, experimental validation, and visualization. High performance computing support offers the possibility of running simulations at large scale and in parallel, but the volume of data generated by these experiments creates a new challenge for Big Data Science. This paper presents existing computational methods for high energy physics (HEP) analyzed from two perspectives: numerical methods and high performance computing. The computational methods presented are Monte Carlo methods and simulations of HEP processes, Markovian Monte Carlo, unfolding methods in particle physics, kernel estimation in HEP, and Random Matrix Theory used in the analysis of particle spectra. All of these methods produce data-intensive applications, which introduce new challenges and requirements for ICT systems architecture, programming paradigms, and storage capabilities.
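
    As a deliberately elementary instance of the Monte Carlo family mentioned above (a stand-in for, and far simpler than, the HEP event generators and estimators discussed):

        # Monte Carlo integration: average the integrand at random points and
        # scale by the size of the integration region.
        import math, random

        def mc_integrate(f, a, b, samples=100_000, seed=1):
            rng = random.Random(seed)
            total = sum(f(rng.uniform(a, b)) for _ in range(samples))
            return (b - a) * total / samples

        # Example: the integral of sin(x) over [0, pi] is exactly 2.
        print(mc_integrate(math.sin, 0.0, math.pi))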

  5. Multicore Challenges and Benefits for High Performance Scientific Computing

    Directory of Open Access Journals (Sweden)

    Ida M.B. Nielsen

    2008-01-01

    Full Text Available Until recently, performance gains in processors were achieved largely by improvements in clock speeds and instruction level parallelism. Thus, applications could obtain performance increases with relatively minor changes by upgrading to the latest generation of computing hardware. Currently, however, processor performance improvements are realized by using multicore technology and hardware support for multiple threads within each core, and taking full advantage of this technology to improve the performance of applications requires exposure of extreme levels of software parallelism. We will here discuss the architecture of parallel computers constructed from many multicore chips as well as techniques for managing the complexity of programming such computers, including the hybrid message-passing/multi-threading programming model. We will illustrate these ideas with a hybrid distributed memory matrix multiply and a quantum chemistry algorithm for energy computation using Møller–Plesset perturbation theory.
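
    A toy sketch of the hybrid structure described above, with processes standing in for distributed-memory ranks and threads inside each process (CPython threads will not actually speed up this CPU-bound kernel; the structure, not the speedup, is the point):

        # Hybrid parallelism in miniature: the first level splits rows across
        # "ranks" (processes); the second level splits each rank's share across
        # threads, mirroring the message-passing/multi-threading model.
        from concurrent.futures import ThreadPoolExecutor
        from multiprocessing import Pool

        def rank_work(rows):
            with ThreadPoolExecutor(max_workers=2) as threads:
                return sum(threads.map(lambda r: sum(x * x for x in r), rows))

        if __name__ == "__main__":
            matrix = [[float(i * j) for j in range(512)] for i in range(512)]
            ranks = 4
            shares = [matrix[r::ranks] for r in range(ranks)]   # row distribution
            with Pool(ranks) as pool:
                print(sum(pool.map(rank_work, shares)))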

  6. Resilient and Robust High Performance Computing Platforms for Scientific Computing Integrity

    Energy Technology Data Exchange (ETDEWEB)

    Jin, Yier [Univ. of Central Florida, Orlando, FL (United States)

    2017-07-14

    As technology advances, computer systems are subject to increasingly sophisticated cyber-attacks that compromise both their security and integrity. High performance computing platforms used in commercial and scientific applications involving sensitive, or even classified data, are frequently targeted by powerful adversaries. This situation is made worse by a lack of fundamental security solutions that both perform efficiently and are effective at preventing threats. Current security solutions fail to address the threat landscape and ensure the integrity of sensitive data. As challenges rise, both private and public sectors will require robust technologies to protect their computing infrastructure. The research outcomes from this project aim to address all these challenges. For example, we present LAZARUS, a novel technique to harden kernel Address Space Layout Randomization (KASLR) against paging-based side-channel attacks. In particular, our scheme allows for fine-grained protection of the virtual memory mappings that implement the randomization. We demonstrate the effectiveness of our approach by hardening a recent Linux kernel with LAZARUS, mitigating all of the previously presented side-channel attacks on KASLR. Our extensive evaluation shows that LAZARUS incurs only 0.943% overhead for standard benchmarks, and is therefore highly practical. We also introduce HA2lloc, a hardware-assisted allocator that is capable of leveraging an extended memory management unit to detect memory errors in the heap. We also perform testing using HA2lloc in a simulation environment and find that the approach is capable of preventing common memory vulnerabilities.

  7. Multi-Language Programming Environments for High Performance Java Computing

    Directory of Open Access Journals (Sweden)

    Vladimir Getov

    1999-01-01

    Full Text Available Recent developments in processor capabilities, software tools, programming languages and programming paradigms have brought about new approaches to high performance computing. A steadfast component of this dynamic evolution has been the scientific community’s reliance on established scientific packages. As a consequence, programmers of high‐performance applications are reluctant to embrace evolving languages such as Java. This paper describes the Java‐to‐C Interface (JCI) tool which provides application programmers wishing to use Java with immediate accessibility to existing scientific packages. The JCI tool also facilitates rapid development and reuse of existing code. These benefits are provided at minimal cost to the programmer. While beneficial to the programmer, the additional advantages of mixed‐language programming in terms of application performance and portability are addressed in detail within the context of this paper. In addition, we discuss how the JCI tool is complementing other ongoing projects such as IBM’s High‐Performance Compiler for Java (HPCJ) and IceT’s metacomputing environment.

  8. A checkpoint compression study for high-performance computing systems

    Energy Technology Data Exchange (ETDEWEB)

    Ibtesham, Dewan [Univ. of New Mexico, Albuquerque, NM (United States). Dept. of Computer Science; Ferreira, Kurt B. [Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States). Scalable System Software Dept.; Arnold, Dorian [Univ. of New Mexico, Albuquerque, NM (United States). Dept. of Computer Science

    2015-02-17

    As high-performance computing systems continue to increase in size and complexity, higher failure rates and increased overheads for checkpoint/restart (CR) protocols have raised concerns about the practical viability of CR protocols for future systems. Previously, compression has proven to be a viable approach for reducing checkpoint data volumes and, thereby, reducing CR protocol overhead, leading to improved application performance. In this article, we further explore compression-based CR optimization by examining its baseline performance and scaling properties, evaluating whether improved compression algorithms might lead to even better application performance, and comparing checkpoint compression against and alongside other software- and hardware-based optimizations. The highlights of our results are: (1) compression is a very viable CR optimization; (2) generic, text-based compression algorithms appear to perform near optimally for checkpoint data compression and faster compression algorithms will not lead to better application performance; (3) compression-based optimizations fare well against and alongside other software-based optimizations; and (4) while hardware-based optimizations outperform software-based ones, they are not as cost-effective.
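
    A small sketch of the measurement at the heart of such a study: compress a synthetic checkpoint buffer with a generic compressor and report ratio and time (zlib as a stand-in; the article evaluates several algorithms on real checkpoint data):

        # Compress a synthetic "checkpoint" and report the compression ratio and
        # the time spent compressing - the two quantities a checkpoint-compression
        # study trades off against the write bandwidth saved.
        import math, struct, time, zlib

        # Synthetic checkpoint: packed doubles with some regularity, as state
        # arrays in scientific codes often have.
        state = struct.pack("%dd" % 100_000,
                            *(math.sin(i * 0.001) for i in range(100_000)))

        t0 = time.perf_counter()
        compressed = zlib.compress(state, level=6)
        elapsed = time.perf_counter() - t0
        print("ratio: %.2f  compress time: %.3fs"
              % (len(state) / len(compressed), elapsed))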

  9. Study of application technology of ultra-high speed computer to the elucidation of complex phenomena

    International Nuclear Information System (INIS)

    Sekiguchi, Tomotsugu

    1996-01-01

    As a first step in developing application technology for applying ultra-high speed computers to the elucidation of complex phenomena, the basic design of a numerical information library for decentralized computer networks is explained. Establishing the system makes it possible to construct an efficient application environment for ultra-high speed computers that scales across different computing systems. We named the system Ninf (Network Information Library for High Performance Computing). The library's application technology is summarized as follows: application technology for libraries in a distributed environment, numeric constants, retrieval of values, a library of special functions, a computing library, the Ninf library interface, the Ninf remote library, and registration. With this system, users can run programs that concentrate numerical-analysis expertise, with high precision, reliability and speed. (S.Y.)

  10. 8th International Workshop on Parallel Tools for High Performance Computing

    CERN Document Server

    Gracia, José; Knüpfer, Andreas; Resch, Michael; Nagel, Wolfgang

    2015-01-01

    Numerical simulation and modelling using High Performance Computing has evolved into an established technique in academic and industrial research. At the same time, the High Performance Computing infrastructure is becoming ever more complex. For instance, most of the current top systems around the world use thousands of nodes in which classical CPUs are combined with accelerator cards in order to enhance their compute power and energy efficiency. This complexity can only be mastered with adequate development and optimization tools. Key topics addressed by these tools include parallelization on heterogeneous systems, performance optimization for CPUs and accelerators, debugging of increasingly complex scientific applications, and optimization of energy usage in the spirit of green IT. This book represents the proceedings of the 8th International Parallel Tools Workshop, held October 1-2, 2014 in Stuttgart, Germany, a forum for discussing the latest advancements in parallel tools.

  11. 9th International Workshop on Parallel Tools for High Performance Computing

    CERN Document Server

    Hilbrich, Tobias; Niethammer, Christoph; Gracia, José; Nagel, Wolfgang; Resch, Michael

    2016-01-01

    High Performance Computing (HPC) remains a driver that offers huge potentials and benefits for science and society. However, a profound understanding of the computational matters and specialized software is needed to arrive at effective and efficient simulations. Dedicated software tools are important parts of the HPC software landscape, and support application developers. Even though a tool is by definition not a part of an application, but rather a supplemental piece of software, it can make a fundamental difference during the development of an application. Such tools aid application developers in the context of debugging, performance analysis, and code optimization, and therefore make a major contribution to the development of robust and efficient parallel software. This book introduces a selection of the tools presented and discussed at the 9th International Parallel Tools Workshop held in Dresden, Germany, September 2-3, 2015, which offered an established forum for discussing the latest advances in paral...

  12. Development of a Computational Steering Framework for High Performance Computing Environments on Blue Gene/P Systems

    KAUST Repository

    Danani, Bob K.

    2012-07-01

    Computational steering has revolutionized the traditional workflow in high performance computing (HPC) applications. The standard workflow that consists of preparation of an application’s input, running of a simulation, and visualization of simulation results in a post-processing step is now transformed into a real-time interactive workflow that significantly reduces development and testing time. Computational steering provides the capability to direct or re-direct the progress of a simulation application at run-time. It allows modification of application-defined control parameters at run-time using various user-steering applications. In this project, we propose a computational steering framework for HPC environments that provides an innovative solution and easy-to-use platform, which allows users to connect and interact with running application(s) in real-time. This framework uses RealityGrid as the underlying steering library and adds several enhancements to the library to enable steering support for Blue Gene systems. Included in the scope of this project is the development of a scalable and efficient steering relay server that supports many-to-many connectivity between multiple steered applications and multiple steering clients. Steered applications can range from intermediate simulation and physical modeling applications to complex computational fluid dynamics (CFD) applications or advanced visualization applications. The Blue Gene supercomputer presents special challenges for remote access because the compute nodes reside on private networks. This thesis presents an implemented solution and demonstrates it on representative applications. Thorough implementation details and application enablement steps are also presented in this thesis to encourage direct usage of this framework.
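
    A much-reduced sketch of the steering idea (a simulation loop that polls for externally updated control parameters); the thesis builds this on the RealityGrid library and a steering relay server for Blue Gene systems, none of which is modelled here:

        # Computational steering in miniature: the solver re-reads a control file
        # between iterations, so a user can change parameters of the running
        # simulation without restarting it. File name and fields are hypothetical.
        import json, os, time

        CONTROL = "steer.json"     # written by a (hypothetical) steering client

        def read_controls(params):
            if os.path.exists(CONTROL):
                with open(CONTROL) as f:
                    params.update(json.load(f))
            return params

        params = {"dt": 0.01, "output_every": 100, "stop": False}
        step = 0
        while not params["stop"] and step < 1_000:   # bounded so the sketch ends
            params = read_controls(params)           # pick up steering commands
            step += 1                                # ... advance the simulation ...
            if step % params["output_every"] == 0:
                print("step", step, "dt =", params["dt"])
            time.sleep(params["dt"])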

  13. BurstMem: A High-Performance Burst Buffer System for Scientific Applications

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Teng [Auburn University, Auburn, Alabama; Oral, H Sarp [ORNL; Wang, Yandong [Auburn University, Auburn, Alabama; Settlemyer, Bradley W [ORNL; Atchley, Scott [ORNL; Yu, Weikuan [Auburn University, Auburn, Alabama

    2014-01-01

    The growth of computing power on large-scale systems requires a commensurate high-bandwidth I/O system. Many parallel file systems are designed to provide fast, sustainable I/O in response to applications' soaring requirements. To meet this need, a novel system is imperative to temporarily buffer the bursty I/O and gradually flush datasets to long-term parallel file systems. In this paper, we introduce the design of BurstMem, a high-performance burst buffer system. BurstMem provides a storage framework with efficient storage and communication management strategies. Our experiments demonstrate that BurstMem is able to speed up the I/O performance of scientific applications by up to a factor of 8.5 on leadership computer systems.
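
    The buffering pattern itself is easy to illustrate; a hedged sketch follows (an in-memory queue and a file stand in for the burst buffer and the parallel file system; nothing here reflects BurstMem's actual design details):

        # Burst-buffer pattern in miniature: the application appends checkpoint
        # bursts to a fast in-memory queue and returns immediately, while a
        # background drainer flushes them to slower long-term storage.
        import queue, threading, time

        burst_buffer = queue.Queue()

        def drainer(path="archive.log"):
            with open(path, "a") as slow_storage:
                while True:
                    block = burst_buffer.get()
                    if block is None:                # shutdown sentinel
                        return
                    time.sleep(0.1)                  # model slow backend I/O
                    slow_storage.write(block)

        t = threading.Thread(target=drainer)
        t.start()
        for step in range(5):
            burst_buffer.put("checkpoint %d\n" % step)   # fast, non-blocking append
        burst_buffer.put(None)
        t.join()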

  14. Optical interconnection networks for high-performance computing systems

    International Nuclear Information System (INIS)

    Biberman, Aleksandr; Bergman, Keren

    2012-01-01

    Enabled by silicon photonic technology, optical interconnection networks have the potential to be a key disruptive technology in computing and communication industries. The enduring pursuit of performance gains in computing, combined with stringent power constraints, has fostered the ever-growing computational parallelism associated with chip multiprocessors, memory systems, high-performance computing systems and data centers. Sustaining these parallelism growths introduces unique challenges for on- and off-chip communications, shifting the focus toward novel and fundamentally different communication approaches. Chip-scale photonic interconnection networks, enabled by high-performance silicon photonic devices, offer unprecedented bandwidth scalability with reduced power consumption. We demonstrate that the silicon photonic platforms have already produced all the high-performance photonic devices required to realize these types of networks. Through extensive empirical characterization in much of our work, we demonstrate such feasibility of waveguides, modulators, switches and photodetectors. We also demonstrate systems that simultaneously combine many functionalities to achieve more complex building blocks. We propose novel silicon photonic devices, subsystems, network topologies and architectures to enable unprecedented performance of these photonic interconnection networks. Furthermore, the advantages of photonic interconnection networks extend far beyond the chip, offering advanced communication environments for memory systems, high-performance computing systems, and data centers. (review article)

  15. A Performance/Cost Evaluation for a GPU-Based Drug Discovery Application on Volunteer Computing

    Science.gov (United States)

    Guerrero, Ginés D.; Imbernón, Baldomero; García, José M.

    2014-01-01

    Bioinformatics is an interdisciplinary research field that develops tools for the analysis of large biological databases, and, thus, the use of high performance computing (HPC) platforms is mandatory for the generation of useful biological knowledge. The latest generation of graphics processing units (GPUs) has democratized the use of HPC as they push desktop computers to cluster-level performance. Many applications within this field have been developed to leverage these powerful and low-cost architectures. However, these applications still need to scale to larger GPU-based systems to enable remarkable advances in the fields of healthcare, drug discovery, genome research, etc. The inclusion of GPUs in HPC systems exacerbates power and temperature issues, increasing the total cost of ownership (TCO). This paper explores the benefits of volunteer computing to scale bioinformatics applications as an alternative to owning large GPU-based local infrastructures. We use as a benchmark a GPU-based drug discovery application called BINDSURF, whose computational requirements go beyond a single desktop machine. Volunteer computing is presented as a cheap and valid HPC system for those bioinformatics applications that need to process huge amounts of data and where the response time is not a critical factor. PMID:25025055

  16. NCI's High Performance Computing (HPC) and High Performance Data (HPD) Computing Platform for Environmental and Earth System Data Science

    Science.gov (United States)

    Evans, Ben; Allen, Chris; Antony, Joseph; Bastrakova, Irina; Gohar, Kashif; Porter, David; Pugh, Tim; Santana, Fabiana; Smillie, Jon; Trenham, Claire; Wang, Jingbo; Wyborn, Lesley

    2015-04-01

    The National Computational Infrastructure (NCI) has established a powerful and flexible in-situ petascale computational environment to enable both high performance computing and Data-intensive Science across a wide spectrum of national environmental and earth science data collections - in particular climate, observational data and geoscientific assets. This paper examines 1) the computational environments that support the modelling and data processing pipelines, 2) the analysis environments and methods to support data analysis, and 3) the progress so far to harmonise the underlying data collections for future interdisciplinary research across these large volume data collections. NCI has established 10+ PBytes of major national and international data collections from both the government and research sectors based on six themes: 1) weather, climate, and earth system science model simulations, 2) marine and earth observations, 3) geosciences, 4) terrestrial ecosystems, 5) water and hydrology, and 6) astronomy, social and biosciences. Collectively they span the lithosphere, crust, biosphere, hydrosphere, troposphere, and stratosphere. The data is largely sourced from NCI's partners (which include the custodians of many of the major Australian national-scale scientific collections), leading research communities, and collaborating overseas organisations. New infrastructures created at NCI mean the data collections are now accessible within an integrated High Performance Computing and Data (HPC-HPD) environment - a 1.2 PFlop supercomputer (Raijin), an HPC-class 3000-core OpenStack cloud system and several highly connected large-scale high-bandwidth Lustre filesystems. The hardware was designed at inception to ensure that it would allow the layered software environment to flexibly accommodate the advancement of future data science. New approaches to software technology and data models have also had to be developed to enable access to these large and exponentially

  17. Debugging a high performance computing program

    Science.gov (United States)

    Gooding, Thomas M.

    2013-08-20

    Methods, apparatus, and computer program products are disclosed for debugging a high performance computing program by gathering lists of addresses of calling instructions for a plurality of threads of execution of the program, assigning the threads to groups in dependence upon the addresses, and displaying the groups to identify defective threads.
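
    The grouping idea is easy to mimic at small scale; a hedged Python sketch that groups live threads by their current call stack so outliers stand out (the patented method targets HPC programs at far larger scale):

        # Group threads by where they currently are: threads whose stacks differ
        # from the majority are candidates for being hung or defective.
        # sys._current_frames() is CPython-specific introspection.
        import sys, threading, time, traceback
        from collections import defaultdict

        def snapshot_groups():
            groups = defaultdict(list)
            for tid, frame in sys._current_frames().items():
                key = tuple(traceback.format_stack(frame)[-2:])   # innermost frames
                groups[key].append(tid)
            return groups

        def worker(delay):
            time.sleep(delay)

        threads = [threading.Thread(target=worker, args=(d,)) for d in (5.0, 5.0, 0.1)]
        for t in threads:
            t.start()
        time.sleep(0.5)
        for stack, tids in snapshot_groups().items():
            print(len(tids), "thread(s) at:", stack[-1].strip().splitlines()[0])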

  18. Ubiquitous Green Computing Techniques for High Demand Applications in Smart Environments

    Directory of Open Access Journals (Sweden)

    Jose M. Moya

    2012-08-01

    Full Text Available Ubiquitous sensor network deployments, such as the ones found in Smart cities and Ambient intelligence applications, require constantly increasing high computational demands in order to process data and offer services to users. The nature of these applications implies the usage of data centers. Research has paid much attention to the energy consumption of the sensor nodes in WSN infrastructures. However, supercomputing facilities are the ones presenting a higher economic and environmental impact due to their very high power consumption. The latter problem, however, has been disregarded in the field of smart environment services. This paper proposes an energy-minimization workload assignment technique, based on heterogeneity and application-awareness, that redistributes low-demand computational tasks from high-performance facilities to idle nodes with low and medium resources in the WSN infrastructure. These non-optimal allocation policies reduce the energy consumed by the whole infrastructure and the total execution time.

  19. Ubiquitous green computing techniques for high demand applications in Smart environments.

    Science.gov (United States)

    Zapater, Marina; Sanchez, Cesar; Ayala, Jose L; Moya, Jose M; Risco-Martín, José L

    2012-01-01

    Ubiquitous sensor network deployments, such as the ones found in Smart cities and Ambient intelligence applications, require constantly increasing high computational demands in order to process data and offer services to users. The nature of these applications implies the usage of data centers. Research has paid much attention to the energy consumption of the sensor nodes in WSN infrastructures. However, supercomputing facilities are the ones presenting a higher economic and environmental impact due to their very high power consumption. The latter problem, however, has been disregarded in the field of smart environment services. This paper proposes an energy-minimization workload assignment technique, based on heterogeneity and application-awareness, that redistributes low-demand computational tasks from high-performance facilities to idle nodes with low and medium resources in the WSN infrastructure. These non-optimal allocation policies reduce the energy consumed by the whole infrastructure and the total execution time.

  20. High Performance Computing Facility Operational Assessment, FY 2010 Oak Ridge Leadership Computing Facility

    Energy Technology Data Exchange (ETDEWEB)

    Bland, Arthur S Buddy [ORNL; Hack, James J [ORNL; Baker, Ann E [ORNL; Barker, Ashley D [ORNL; Boudwin, Kathlyn J. [ORNL; Kendall, Ricky A [ORNL; Messer, Bronson [ORNL; Rogers, James H [ORNL; Shipman, Galen M [ORNL; White, Julia C [ORNL

    2010-08-01

    Oak Ridge National Laboratory's (ORNL's) Cray XT5 supercomputer, Jaguar, kicked off the era of petascale scientific computing in 2008 with applications that sustained more than a thousand trillion floating point calculations per second - or 1 petaflop. Jaguar continues to grow even more powerful as it helps researchers broaden the boundaries of knowledge in virtually every domain of computational science, including weather and climate, nuclear energy, geosciences, combustion, bioenergy, fusion, and materials science. Their insights promise to broaden our knowledge in areas that are vitally important to the Department of Energy (DOE) and the nation as a whole, particularly energy assurance and climate change. The science of the 21st century, however, will demand further revolutions in computing, supercomputers capable of a million trillion calculations a second - 1 exaflop - and beyond. These systems will allow investigators to continue attacking global challenges through modeling and simulation and to unravel longstanding scientific questions. Creating such systems will also require new approaches to daunting challenges. High-performance systems of the future will need to be codesigned for scientific and engineering applications with best-in-class communications networks and data-management infrastructures and teams of skilled researchers able to take full advantage of these new resources. The Oak Ridge Leadership Computing Facility (OLCF) provides the nation's most powerful open resource for capability computing, with a sustainable path that will maintain and extend national leadership for DOE's Office of Science (SC). The OLCF has engaged a world-class team to support petascale science and to take a dramatic step forward, fielding new capabilities for high-end science. This report highlights the successful delivery and operation of a petascale system and shows how the OLCF fosters application development teams, developing cutting-edge tools

  1. Research Activity in Computational Physics utilizing High Performance Computing: Co-authorship Network Analysis

    Science.gov (United States)

    Ahn, Sul-Ah; Jung, Youngim

    2016-10-01

    The research activities of the computational physicists utilizing high performance computing are analyzed using bibliometric approaches. This study aims at providing the computational physicists utilizing high-performance computing and policy planners with useful bibliometric results for an assessment of research activities. In order to achieve this purpose, we carried out a co-authorship network analysis of journal articles to assess the research activities of researchers for high-performance computational physics as a case study. For this study, we used journal articles of the Scopus database from Elsevier covering the time period of 2004-2013. We extracted the author rank in the physics field utilizing high-performance computing by the number of papers published during ten years from 2004. Finally, we drew the co-authorship network for the 45 top authors and their coauthors, and described some features of the co-authorship network in relation to the author rank. Suggestions for further studies are discussed.

  2. Trends in high-performance computing for engineering calculations.

    Science.gov (United States)

    Giles, M B; Reguly, I

    2014-08-13

    High-performance computing has evolved remarkably over the past 20 years, and that progress is likely to continue. However, in recent years, this progress has been achieved through greatly increased hardware complexity with the rise of multicore and manycore processors, and this is affecting the ability of application developers to achieve the full potential of these systems. This article outlines the key developments on the hardware side, both in the recent past and in the near future, with a focus on two key issues: energy efficiency and the cost of moving data. It then discusses the much slower evolution of system software, and the implications of all of this for application developers. © 2014 The Author(s) Published by the Royal Society. All rights reserved.

  3. Implementing an Affordable High-Performance Computing for Teaching-Oriented Computer Science Curriculum

    Science.gov (United States)

    Abuzaghleh, Omar; Goldschmidt, Kathleen; Elleithy, Yasser; Lee, Jeongkyu

    2013-01-01

    With the advances in computing power, high-performance computing (HPC) platforms have had an impact on not only scientific research in advanced organizations but also computer science curriculum in the educational community. For example, multicore programming and parallel systems are highly desired courses in the computer science major. However,…

  4. The application of cloud computing to scientific workflows: a study of cost and performance.

    Science.gov (United States)

    Berriman, G Bruce; Deelman, Ewa; Juve, Gideon; Rynge, Mats; Vöckler, Jens-S

    2013-01-28

    The current model of transferring data from data centres to desktops for analysis will soon be rendered impractical by the accelerating growth in the volume of science datasets. Processing will instead often take place on high-performance servers co-located with data. Evaluations of how new technologies such as cloud computing would support such a new distributed computing model are urgently needed. Cloud computing is a new way of purchasing computing and storage resources on demand through virtualization technologies. We report here the results of investigations of the applicability of commercial cloud computing to scientific computing, with an emphasis on astronomy, including investigations of what types of applications can be run cheaply and efficiently on the cloud, and an example of an application well suited to the cloud: processing a large dataset to create a new science product.

  5. High performance statistical computing with parallel R: applications to biology and climate modelling

    International Nuclear Information System (INIS)

    Samatova, Nagiza F; Branstetter, Marcia; Ganguly, Auroop R; Hettich, Robert; Khan, Shiraj; Kora, Guruprasad; Li, Jiangtian; Ma, Xiaosong; Pan, Chongle; Shoshani, Arie; Yoginath, Srikanth

    2006-01-01

    Ultrascale computing and high-throughput experimental technologies have enabled the production of scientific data about complex natural phenomena. With this opportunity comes a new problem: the massive quantities of data so produced. Answers to fundamental questions about the nature of those phenomena remain largely hidden in the produced data. The goal of this work is to provide a scalable high performance statistical data analysis framework to help scientists perform interactive analyses of these raw data to extract knowledge. Towards this goal we have been developing an open source parallel statistical analysis package, called Parallel R, that lets scientists employ a wide range of statistical analysis routines on high performance shared and distributed memory architectures without having to deal with the intricacies of parallelizing these routines.
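
    A toy illustration of the embarrassingly parallel statistics pattern that Parallel R targets, written here in Python for illustration only (Parallel R itself parallelizes R routines):

        # Parallel bootstrap of a mean: each worker resamples independently and
        # the pooled replicates give a simple percentile confidence interval.
        import random
        from multiprocessing import Pool

        def bootstrap_means(args):
            data, reps, seed = args
            rng = random.Random(seed)
            return [sum(rng.choices(data, k=len(data))) / len(data)
                    for _ in range(reps)]

        if __name__ == "__main__":
            data = [random.gauss(10.0, 2.0) for _ in range(1_000)]
            jobs = [(data, 250, seed) for seed in range(4)]    # 4 x 250 replicates
            with Pool(4) as pool:
                means = sorted(m for chunk in pool.map(bootstrap_means, jobs)
                               for m in chunk)
            print("95%% CI: (%.2f, %.2f)" % (means[25], means[-26]))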

  6. A Performance/Cost Evaluation for a GPU-Based Drug Discovery Application on Volunteer Computing

    Directory of Open Access Journals (Sweden)

    Ginés D. Guerrero

    2014-01-01

    Full Text Available Bioinformatics is an interdisciplinary research field that develops tools for the analysis of large biological databases, and, thus, the use of high performance computing (HPC) platforms is mandatory for the generation of useful biological knowledge. The latest generation of graphics processing units (GPUs) has democratized the use of HPC as they push desktop computers to cluster-level performance. Many applications within this field have been developed to leverage these powerful and low-cost architectures. However, these applications still need to scale to larger GPU-based systems to enable remarkable advances in the fields of healthcare, drug discovery, genome research, etc. The inclusion of GPUs in HPC systems exacerbates power and temperature issues, increasing the total cost of ownership (TCO). This paper explores the benefits of volunteer computing to scale bioinformatics applications as an alternative to owning large GPU-based local infrastructures. We use as a benchmark a GPU-based drug discovery application called BINDSURF, whose computational requirements go beyond a single desktop machine. Volunteer computing is presented as a cheap and valid HPC system for those bioinformatics applications that need to process huge amounts of data and where the response time is not a critical factor.

  7. InfoMall: An Innovative Strategy for High-Performance Computing and Communications Applications Development.

    Science.gov (United States)

    Mills, Kim; Fox, Geoffrey

    1994-01-01

    Describes the InfoMall, a program led by the Northeast Parallel Architectures Center (NPAC) at Syracuse University (New York). The InfoMall features a partnership of approximately 24 organizations offering linked programs in High Performance Computing and Communications (HPCC) technology integration, software development, marketing, education and…

  8. US QCD computational performance studies with PERI

    International Nuclear Information System (INIS)

    Zhang, Y; Fowler, R; Huck, K; Malony, A; Porterfield, A; Reed, D; Shende, S; Taylor, V; Wu, X

    2007-01-01

    We report on some of the interactions between two SciDAC projects: The National Computational Infrastructure for Lattice Gauge Theory (USQCD), and the Performance Engineering Research Institute (PERI). Many modern scientific programs consistently report the need for faster computational resources to maintain global competitiveness. However, as the size and complexity of emerging high end computing (HEC) systems continue to rise, achieving good performance on such systems is becoming ever more challenging. In order to take full advantage of the resources, it is crucial to understand the characteristics of relevant scientific applications and the systems these applications are running on. Using tools developed under PERI and by other performance measurement researchers, we studied the performance of two applications, MILC and Chroma, on several high performance computing systems at DOE laboratories. In the case of Chroma, we discuss how the use of C++ and modern software engineering and programming methods are driving the evolution of performance tools

  9. 7th International Workshop on Parallel Tools for High Performance Computing

    CERN Document Server

    Gracia, José; Nagel, Wolfgang; Resch, Michael

    2014-01-01

    Current advances in High Performance Computing (HPC) increasingly impact efficient software development workflows. Programmers for HPC applications need to consider trends such as increased core counts, multiple levels of parallelism, reduced memory per core, and I/O system challenges in order to derive well performing and highly scalable codes. At the same time, the increasing complexity adds further sources of program defects. While novel programming paradigms and advanced system libraries provide solutions for some of these challenges, appropriate supporting tools are indispensable. Such tools aid application developers in debugging, performance analysis, or code optimization and therefore make a major contribution to the development of robust and efficient parallel software. This book introduces a selection of the tools presented and discussed at the 7th International Parallel Tools Workshop, held in Dresden, Germany, September 3-4, 2013.  

  10. High Performance Computing in Science and Engineering '99 : Transactions of the High Performance Computing Center

    CERN Document Server

    Jäger, Willi

    2000-01-01

    The book contains reports about the most significant projects from science and engineering of the Federal High Performance Computing Center Stuttgart (HLRS). They were carefully selected in a peer-review process and are showcases of an innovative combination of state-of-the-art modeling, novel algorithms and the use of leading-edge parallel computer technology. The projects of HLRS are using supercomputer systems operated jointly by university and industry and therefore a special emphasis has been put on the industrial relevance of results and methods.

  11. A data acquisition computer for high energy physics applications DAFNE:- hardware manual

    International Nuclear Information System (INIS)

    Barlow, J.; Seller, P.; De-An, W.

    1983-07-01

    A high-performance, stand-alone computer system based on the Motorola 68000 microprocessor has been built at the Rutherford Appleton Laboratory. Although the design was strongly influenced by the requirement to provide a compact data acquisition computer for the high energy physics environment, the system is sufficiently general to find applications in a wider area. It provides colour graphics and tape and disc storage together with access to CAMAC systems. This report is the hardware manual of the data acquisition computer, DAFNE (Data Acquisition For Nuclear Experiments), and as such contains a full description of the hardware structure of the computer system. (author)

  12. Embedded High Performance Scalable Computing Systems

    National Research Council Canada - National Science Library

    Ngo, David

    2003-01-01

    The Embedded High Performance Scalable Computing Systems (EHPSCS) program is a cooperative agreement between Sanders, a Lockheed Martin Company, and DARPA that ran for three years, from Apr 1995 - Apr 1998...

  13. Towards the development of run times leveraging virtualization for high performance computing

    International Nuclear Information System (INIS)

    Diakhate, F.

    2010-12-01

    In recent years, there has been a growing interest in using virtualization to improve the efficiency of data centers. This success is rooted in virtualization's excellent fault tolerance and isolation properties, in the overall flexibility it brings, and in its ability to exploit multi-core architectures efficiently. These characteristics also make virtualization an ideal candidate to tackle issues found in new compute cluster architectures. However, in spite of recent improvements in virtualization technology, overheads in the execution of parallel applications remain, which prevent its use in the field of high performance computing. In this thesis, we propose a virtual device dedicated to message passing between virtual machines, so as to improve the performance of parallel applications executed in a cluster of virtual machines. We also introduce a set of techniques facilitating the deployment of virtualized parallel applications. These functionalities have been implemented as part of a runtime system which allows users to benefit from virtualization's properties in a way that is as transparent as possible while minimizing performance overheads. (author)

  14. 14th annual Results and Review Workshop on High Performance Computing in Science and Engineering

    CERN Document Server

    Nagel, Wolfgang E; Resch, Michael M; Transactions of the High Performance Computing Center, Stuttgart (HLRS) 2011; High Performance Computing in Science and Engineering '11

    2012-01-01

    This book presents the state-of-the-art in simulation on supercomputers. Leading researchers present results achieved on systems of the High Performance Computing Center Stuttgart (HLRS) for the year 2011. The reports cover all fields of computational science and engineering, ranging from CFD to computational physics and chemistry, to computer science, with a special emphasis on industrially relevant applications. Presenting results for both vector systems and microprocessor-based systems, the book allows readers to compare the performance levels and usability of various architectures. As HLRS

  15. Overview of Parallel Platforms for Common High Performance Computing

    Directory of Open Access Journals (Sweden)

    T. Fryza

    2012-04-01

    Full Text Available The paper deals with various parallel platforms used for high-performance computing in the signal processing domain. More precisely, methods exploiting multicore central processing units, such as the message passing interface and OpenMP, are taken into account. The properties of the programming methods are experimentally demonstrated in the application of a fast Fourier transform and a discrete cosine transform, and they are compared with the capabilities of MATLAB's built-in functions and of Texas Instruments digital signal processors with very long instruction word architectures. New FFT and DCT implementations were proposed and tested. The implementations were compared with CPU-based computing methods and with the capabilities of the Texas Instruments digital signal processing library on C6747 floating-point DSPs. The optimal combination of computing methods in the signal processing domain, together with the implementation of new, fast routines, is proposed as well.
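
    As a rough, hedged illustration of the style of comparison described (not the paper's actual MPI/OpenMP or TI DSP code), the sketch below times a batch of FFTs computed serially and with a process pool in Python; the signal sizes and worker count are arbitrary assumptions.

        # Illustrative timing sketch: a batch of 1D FFTs computed serially vs. with
        # a process pool. It only mirrors the kind of comparison in the abstract;
        # the paper itself used MPI/OpenMP and TI DSP library routines.
        import time
        import numpy as np
        from multiprocessing import Pool

        def fft_one(signal):
            return np.fft.fft(signal)

        def run(n_signals=256, n_samples=2**16, n_workers=4):
            rng = np.random.default_rng(1)
            signals = [rng.standard_normal(n_samples) for _ in range(n_signals)]

            t0 = time.perf_counter()
            serial = [fft_one(s) for s in signals]
            t_serial = time.perf_counter() - t0

            t0 = time.perf_counter()
            with Pool(n_workers) as pool:
                parallel = pool.map(fft_one, signals)
            t_parallel = time.perf_counter() - t0

            print(f"serial: {t_serial:.3f} s, {n_workers} workers: {t_parallel:.3f} s")

        if __name__ == "__main__":
            run()

    Note that the process-pool version pays serialization overhead for every signal, which is itself a useful reminder of why coarse-grained decomposition and shared-memory approaches such as OpenMP matter for this workload.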

  16. High Performance Computing in Science and Engineering '98 : Transactions of the High Performance Computing Center

    CERN Document Server

    Jäger, Willi

    1999-01-01

    The book contains reports about the most significant projects from science and industry that are using the supercomputers of the Federal High Performance Computing Center Stuttgart (HLRS). These projects are from different scientific disciplines, with a focus on engineering, physics and chemistry. They were carefully selected in a peer-review process and are showcases for an innovative combination of state-of-the-art physical modeling, novel algorithms and the use of leading-edge parallel computer technology. As HLRS is in close cooperation with industrial companies, special emphasis has been put on the industrial relevance of results and methods.

  17. Software Applications on the Peregrine System | High-Performance Computing

    Science.gov (United States)

    Catalog page listing software applications available on the Peregrine system; only fragments of the catalog table survive in this record. Recoverable entries include: the General Algebraic Modeling System (GAMS), a high-level modeling system for mathematical programming (statistics and analysis); the Gurobi Optimizer, a solver for mathematical programming (statistics and analysis); LAMMPS (chemistry); and the R Statistical Computing Environment (statistics and analysis). A further fragment describing reactivities and vibrational, electronic and NMR spectra belongs to an unnamed chemistry package.

  18. Visualization and Data Analysis for High-Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Sewell, Christopher Meyer [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2016-09-27

    This is a set of slides from a guest lecture for a class at the University of Texas, El Paso on visualization and data analysis for high-performance computing. The topics covered are the following: trends in high-performance computing; scientific visualization, such as OpenGL, ray tracing and volume rendering, VTK, and ParaView; data science at scale, such as in-situ visualization, image databases, distributed memory parallelism, shared memory parallelism, VTK-m, "big data", and then an analysis example.

  19. High-performance secure multi-party computation for data mining applications

    DEFF Research Database (Denmark)

    Bogdanov, Dan; Niitsoo, Margus; Toft, Tomas

    2012-01-01

    Secure multi-party computation (MPC) is a technique well suited for privacy-preserving data mining. Even with the recent progress in two-party computation techniques such as fully homomorphic encryption, general MPC remains relevant as it has shown promising performance metrics in real...... operations such as multiplication and comparison. Secondly, the confidential processing of financial data requires the use of more complex primitives, including a secure division operation. This paper describes new protocols in the Sharemind model for secure multiplication, share conversion, equality, bit...
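
    The protocols mentioned build on secret sharing. As a minimal, hedged sketch of the general idea only (this is not Sharemind's actual three-party protocol), the Python snippet below additively shares values over the ring of 32-bit integers and shows that addition can be done locally on shares; secure multiplication and comparison require additional interaction between parties, which is omitted here.

        # Minimal sketch of additive secret sharing over the ring Z_{2^32}.
        # It illustrates splitting a secret among parties and local addition of
        # shares; it is NOT the Sharemind protocol and omits the interactive steps
        # needed for secure multiplication, comparison, or share conversion.
        import secrets

        MOD = 2**32

        def share(secret, n_parties=3):
            # First n-1 shares are uniformly random; the last makes the sum work out.
            shares = [secrets.randbelow(MOD) for _ in range(n_parties - 1)]
            shares.append((secret - sum(shares)) % MOD)
            return shares

        def reconstruct(shares):
            return sum(shares) % MOD

        def add_shared(a_shares, b_shares):
            # Addition is local: each party adds its own shares, no communication.
            return [(a + b) % MOD for a, b in zip(a_shares, b_shares)]

        if __name__ == "__main__":
            a, b = 1234, 5678
            sa, sb = share(a), share(b)
            assert reconstruct(add_shared(sa, sb)) == (a + b) % MOD
            print("shared addition reconstructs to", reconstruct(add_shared(sa, sb)))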

  20. High performance computing on vector systems

    CERN Document Server

    Roller, Sabine

    2008-01-01

    Presents the developments in high-performance computing and simulation on modern supercomputer architectures. This book covers trends in hardware and software development in general and specifically the vector-based systems and heterogeneous architectures. It presents innovative fields like coupled multi-physics or multi-scale simulations.

  1. High Performance Computing Modernization Program Kerberos Throughput Test Report

    Science.gov (United States)

    2017-10-26

    Naval Research Laboratory, Washington, DC 20375-5320. NRL/MR/5524--17-9751, High Performance Computing Modernization Program Kerberos Throughput Test Report, by Daniel G. Gdula and a second author whose name is cut off in the record. Only the report cover and standard report-documentation form fields are recoverable from this record; no abstract text is available.

  2. CUDA/GPU Technology : Parallel Programming For High Performance Scientific Computing

    OpenAIRE

    YUHENDRA; KUZE, Hiroaki; JOSAPHAT, Tetuko Sri Sumantyo

    2009-01-01

    [ABSTRACT] Graphics processing units (GPUs), originally designed for computer video cards, have emerged as the most powerful chip in a high-performance workstation. In terms of high-performance computation capabilities, GPUs deliver much more powerful performance than conventional CPUs by means of parallel processing. In 2007, the birth of the Compute Unified Device Architecture (CUDA) and CUDA-enabled GPUs by NVIDIA Corporation brought a revolution in the general-purpose GPU a...

  3. Low-cost, high-performance and efficiency computational photometer design

    Science.gov (United States)

    Siewert, Sam B.; Shihadeh, Jeries; Myers, Randall; Khandhar, Jay; Ivanov, Vitaly

    2014-05-01

    Researchers at the University of Alaska Anchorage and University of Colorado Boulder have built a low-cost, high-performance and high-efficiency drop-in-place Computational Photometer (CP) to test in field applications ranging from port security and safety monitoring to environmental compliance monitoring and surveying. The CP integrates off-the-shelf visible spectrum cameras with near- to long-wavelength infrared detectors and high-resolution digital snapshots in a single device. The proof of concept combines three or more detectors into a single multichannel imaging system that can time-correlate read-out, capture, and image-process all of the channels concurrently with high performance and energy efficiency. The dual-channel continuous read-out is combined with a third high-definition digital snapshot capability and has been designed using an FPGA (Field Programmable Gate Array) to capture, decimate, down-convert, re-encode, and transform images from two standard-definition CCD (Charge Coupled Device) cameras at 30 Hz. The continuous stereo vision can be time-correlated to megapixel high-definition snapshots. This proof of concept has been fabricated as a four-layer PCB (Printed Circuit Board) suitable for use in education and research for low-cost, high-efficiency field monitoring applications that need multispectral and three-dimensional imaging capabilities. Initial testing is in progress and includes field testing in ports, potential test flights in unmanned aerial systems, and future planned missions to image harsh environments in the Arctic including volcanic plumes, ice formation, and arctic marine life.

  4. Application of High Performance Computing to Earthquake Hazard and Disaster Estimation in Urban Area

    Directory of Open Access Journals (Sweden)

    Muneo Hori

    2018-02-01

    Full Text Available Integrated earthquake simulation (IES) is a seamless simulation analyzing all processes of earthquake hazard and disaster. There are two difficulties in carrying out IES, namely the requirement of large-scale computation and the requirement of numerous analysis models for structures in an urban area; they are addressed by taking advantage of high performance computing (HPC) and by developing a system of automated model construction. HPC is a key element in developing IES, as IES needs to analyze wave propagation and amplification processes in an underground structure; a high-fidelity model of the underground structure exceeds 100 billion degrees of freedom. Examples of IES for the Tokyo Metropolis are presented; the numerical computation is performed using the K computer, Japan's national supercomputer. The estimation of earthquake hazard and disaster for a given earthquake scenario is made by the ground motion simulation and the urban area seismic response simulation, respectively, for a target area of 10,000 m × 10,000 m.

  5. RISC Processors and High Performance Computing

    Science.gov (United States)

    Bailey, David H.; Saini, Subhash; Craw, James M. (Technical Monitor)

    1995-01-01

    This tutorial will discuss the top five RISC microprocessors and the parallel systems in which they are used. It will provide a unique cross-machine comparison not available elsewhere. The effective performance of these processors will be compared by citing standard benchmarks in the context of real applications. The latest NAS Parallel Benchmarks, both absolute performance and performance per dollar, will be listed. The next generation of the NPB will be described. The tutorial will conclude with a discussion of future directions in the field. Technology Transfer Considerations: All of these computer systems are commercially available internationally. Information about these processors is available in the public domain, mostly from the vendors themselves. The NAS Parallel Benchmarks and their results have been previously approved numerous times for public release, beginning back in 1991.

  6. Micromagnetics on high-performance workstation and mobile computational platforms

    Science.gov (United States)

    Fu, S.; Chang, R.; Couture, S.; Menarini, M.; Escobar, M. A.; Kuteifan, M.; Lubarda, M.; Gabay, D.; Lomakin, V.

    2015-05-01

    The feasibility of using high-performance desktop and embedded mobile computational platforms is presented, including multi-core Intel central processing units, Nvidia desktop graphics processing units, and the Nvidia Jetson TK1 platform. The FastMag finite-element-method-based micromagnetic simulator is used as a testbed, showing high efficiency on all the platforms. Optimization aspects of improving the performance of the mobile systems are discussed. The high performance, low cost, low power consumption, and rapid performance increase of the embedded mobile systems make them a promising candidate for micromagnetic simulations. Such architectures can be used as standalone systems or can be built into low-power computing clusters.

  7. Monitoring SLAC High Performance UNIX Computing Systems

    International Nuclear Information System (INIS)

    Lettsome, Annette K.

    2005-01-01

    Knowledge of the effectiveness and efficiency of computers is important when working with high performance systems. The monitoring of such systems is advantageous in order to foresee possible misfortunes or system failures. Ganglia is a software system designed for high performance computing systems to retrieve specific monitoring information. An alternative storage facility for Ganglia's collected data is needed since its default storage system, the round-robin database (RRD), struggles with data integrity. The creation of a script-driven MySQL database solves this dilemma. This paper describes the process taken in the creation and implementation of the MySQL database for use by Ganglia. Comparisons between data storage by both databases are made using gnuplot and Ganglia's real-time graphical user interface.
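
    As a hedged illustration of the script-driven approach described above: the report used MySQL, but this standalone sketch substitutes Python's built-in sqlite3 module so it runs anywhere, and the table layout and metric fields are assumptions rather than the paper's schema.

        # Illustrative stand-in for the script-driven metrics database described
        # above. The report stored Ganglia data in MySQL; sqlite3 is used here only
        # to keep the sketch self-contained. Schema and field names are assumptions.
        import sqlite3
        import time

        def init_db(path="ganglia_metrics.db"):
            conn = sqlite3.connect(path)
            conn.execute("""CREATE TABLE IF NOT EXISTS metrics (
                                host TEXT, metric TEXT, value REAL, ts INTEGER)""")
            return conn

        def insert_metric(conn, host, metric, value, ts=None):
            conn.execute("INSERT INTO metrics VALUES (?, ?, ?, ?)",
                         (host, metric, value, ts or int(time.time())))
            conn.commit()

        if __name__ == "__main__":
            conn = init_db()
            insert_metric(conn, "node001", "load_one", 0.42)
            for row in conn.execute("SELECT * FROM metrics ORDER BY ts DESC LIMIT 5"):
                print(row)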

  8. Accelerating Scientific Applications using High Performance Dense and Sparse Linear Algebra Kernels on GPUs

    KAUST Repository

    Abdelfattah, Ahmad

    2015-01-15

    High performance computing (HPC) platforms are evolving to more heterogeneous configurations to support the workloads of various applications. The current hardware landscape is composed of traditional multicore CPUs equipped with hardware accelerators that can handle high levels of parallelism. Graphical Processing Units (GPUs) are popular high performance hardware accelerators in modern supercomputers. GPU programming has a different model than that for CPUs, which means that many numerical kernels have to be redesigned and optimized specifically for this architecture. GPUs usually outperform multicore CPUs in some compute intensive and massively parallel applications that have regular processing patterns. However, most scientific applications rely on crucial memory-bound kernels and may witness bottlenecks due to the overhead of the memory bus latency. They can still take advantage of the GPU compute power capabilities, provided that an efficient architecture-aware design is achieved. This dissertation presents a uniform design strategy for optimizing critical memory-bound kernels on GPUs. Based on hierarchical register blocking, double buffering and latency hiding techniques, this strategy leverages the performance of a wide range of standard numerical kernels found in dense and sparse linear algebra libraries. The work presented here focuses on matrix-vector multiplication kernels (MVM) as representative and most important memory-bound operations in this context. Each kernel inherits the benefits of the proposed strategies. By exposing a proper set of tuning parameters, the strategy is flexible enough to suit different types of matrices, ranging from large dense matrices, to sparse matrices with dense block structures, while high performance is maintained. Furthermore, the tuning parameters are used to maintain the relative performance across different GPU architectures. Multi-GPU acceleration is proposed to scale the performance on several devices. The
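
    As a hedged sketch of the kind of memory-bound GPU kernel discussed, here is a plain matrix-vector multiply written with Numba's CUDA JIT (the use of Numba is an assumption for illustration; this is not the dissertation's tuned kernels and it includes none of the register blocking or double buffering described above). It requires a CUDA-capable GPU.

        # Hedged sketch: straightforward GPU matrix-vector multiply, y = A @ x.
        # Each matrix element is read once and used once, which is why MVM is
        # memory-bound; the optimizations discussed in the abstract are omitted.
        import numpy as np
        from numba import cuda

        @cuda.jit
        def mvm_kernel(A, x, y):
            row = cuda.grid(1)            # one thread per row of A
            if row < A.shape[0]:
                acc = 0.0
                for j in range(A.shape[1]):
                    acc += A[row, j] * x[j]
                y[row] = acc

        def gpu_mvm(A, x, threads_per_block=128):
            d_A, d_x = cuda.to_device(A), cuda.to_device(x)
            d_y = cuda.device_array(A.shape[0], dtype=A.dtype)
            blocks = (A.shape[0] + threads_per_block - 1) // threads_per_block
            mvm_kernel[blocks, threads_per_block](d_A, d_x, d_y)
            return d_y.copy_to_host()

        if __name__ == "__main__":
            A = np.random.rand(4096, 4096)
            x = np.random.rand(4096)
            assert np.allclose(gpu_mvm(A, x), A @ x)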

  9. Commercialization issues and funding opportunities for high-performance optoelectronic computing modules

    Science.gov (United States)

    Hessenbruch, John M.; Guilfoyle, Peter S.

    1997-01-01

    Low power, optoelectronic integrated circuits are being developed for high speed switching and data processing applications. These high performance optoelectronic computing modules consist of three primary components: vertical cavity surface emitting lasers, diffractive optical interconnect elements, and detector/amplifier/laser driver arrays. Following the design and fabrication of an HPOC module prototype, selected commercial funding sources will be evaluated to support a product development stage. These include the formation of a strategic alliance with one or more microprocessor or telecommunications vendors, and/or equity investment from one or more venture capital firms.

  10. Integrated State Estimation and Contingency Analysis Software Implementation using High Performance Computing Techniques

    Energy Technology Data Exchange (ETDEWEB)

    Chen, Yousu; Glaesemann, Kurt R.; Rice, Mark J.; Huang, Zhenyu

    2015-12-31

    Power system simulation tools are traditionally developed in sequential mode and their codes are optimized for single-core computing only. However, the increasing complexity of power grid models requires more intensive computation. The traditional simulation tools will soon not be able to meet grid operation requirements. Therefore, power system simulation tools need to evolve accordingly to provide faster and better results for grid operations. This paper presents an integrated state estimation and contingency analysis software implementation using high performance computing techniques. The software is able to solve large state estimation problems within one second and achieves a near-linear speedup of 9,800 with 10,000 cores for the contingency analysis application. The performance evaluation is presented to show its effectiveness.
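
    Under the usual definition of parallel efficiency (speedup divided by core count), the quoted figures imply roughly 98% efficiency; a one-line check:

        # Parallel efficiency implied by the reported contingency-analysis result.
        speedup, cores = 9800, 10000
        print(f"parallel efficiency = {speedup / cores:.0%}")   # 98%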

  11. AHPCRC - Army High Performance Computing Research Center

    Science.gov (United States)

    2010-01-01

    Fragmentary record from the Army High Performance Computing Research Center (AHPCRC, www.ahpcrc.org); recoverable content notes particular interest in the ability of a distributed jamming network (DJN) to jam signals in all or part of a sensor or communications network, alongside research areas such as reasoning and assistive technologies. The remainder of the record consists of personnel-directory fragments.

  12. Use of high performance computing to examine the effectiveness of aquifer remediation

    International Nuclear Information System (INIS)

    Tompson, A.F.B.; Ashby, S.F.; Falgout, R.D.; Smith, S.G.; Fogwell, T.W.; Loosmore, G.A.

    1994-06-01

    Large-scale simulation of fluid flow and chemical migration is being used to study the effectiveness of pump-and-treat restoration of a contaminated, saturated aquifer. A three-element approach focusing on geostatistical representations of heterogeneous aquifers, high-performance computing strategies for simulating flow, migration, and reaction processes in large three-dimensional systems, and highly-resolved simulations of flow and chemical migration in porous formations will be discussed. Results from a preliminary application of this approach to examine pumping behavior at a real, heterogeneous field site will be presented. Future activities will emphasize parallel computations in larger, dynamic, and nonlinear (two-phase) flow problems as well as improved interpretive methods for defining detailed material property distributions

  13. Enabling high performance computational science through combinatorial algorithms

    International Nuclear Information System (INIS)

    Boman, Erik G; Bozdag, Doruk; Catalyurek, Umit V; Devine, Karen D; Gebremedhin, Assefaw H; Hovland, Paul D; Pothen, Alex; Strout, Michelle Mills

    2007-01-01

    The Combinatorial Scientific Computing and Petascale Simulations (CSCAPES) Institute is developing algorithms and software for combinatorial problems that play an enabling role in scientific and engineering computations. Discrete algorithms will be increasingly critical for achieving high performance for irregular problems on petascale architectures. This paper describes recent contributions by researchers at the CSCAPES Institute in the areas of load balancing, parallel graph coloring, performance improvement, and parallel automatic differentiation

  14. Enabling high performance computational science through combinatorial algorithms

    Energy Technology Data Exchange (ETDEWEB)

    Boman, Erik G [Discrete Algorithms and Math Department, Sandia National Laboratories (United States); Bozdag, Doruk [Biomedical Informatics, and Electrical and Computer Engineering, Ohio State University (United States); Catalyurek, Umit V [Biomedical Informatics, and Electrical and Computer Engineering, Ohio State University (United States); Devine, Karen D [Discrete Algorithms and Math Department, Sandia National Laboratories (United States); Gebremedhin, Assefaw H [Computer Science and Center for Computational Science, Old Dominion University (United States); Hovland, Paul D [Mathematics and Computer Science Division, Argonne National Laboratory (United States); Pothen, Alex [Computer Science and Center for Computational Science, Old Dominion University (United States); Strout, Michelle Mills [Computer Science, Colorado State University (United States)

    2007-07-15

    The Combinatorial Scientific Computing and Petascale Simulations (CSCAPES) Institute is developing algorithms and software for combinatorial problems that play an enabling role in scientific and engineering computations. Discrete algorithms will be increasingly critical for achieving high performance for irregular problems on petascale architectures. This paper describes recent contributions by researchers at the CSCAPES Institute in the areas of load balancing, parallel graph coloring, performance improvement, and parallel automatic differentiation.
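
    One of the combinatorial kernels named above is graph coloring. As a minimal, hedged sketch of the sequential greedy baseline that parallel coloring algorithms build on (this is not the CSCAPES implementation), consider the following Python snippet.

        # Minimal greedy graph coloring: give each vertex the smallest color not
        # used by an already-colored neighbor. Parallel variants color independent
        # sets of vertices concurrently and then repair conflicts; only the
        # sequential baseline is sketched here.
        def greedy_coloring(adjacency):
            colors = {}
            for v in adjacency:                  # iteration order affects color count
                used = {colors[u] for u in adjacency[v] if u in colors}
                c = 0
                while c in used:
                    c += 1
                colors[v] = c
            return colors

        if __name__ == "__main__":
            # Small example graph: a 4-cycle plus one chord.
            graph = {0: [1, 3], 1: [0, 2, 3], 2: [1, 3], 3: [0, 1, 2]}
            print(greedy_coloring(graph))        # {0: 0, 1: 1, 2: 0, 3: 2}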

  15. Mastering cloud computing foundations and applications programming

    CERN Document Server

    Buyya, Rajkumar; Selvi, SThamarai

    2013-01-01

    Mastering Cloud Computing is designed for undergraduate students learning to develop cloud computing applications. Tomorrow's applications won't live on a single computer but will be deployed from and reside on a virtual server, accessible anywhere, any time. Tomorrow's application developers need to understand the requirements of building apps for these virtual systems, including concurrent programming, high-performance computing, and data-intensive systems. The book introduces the principles of distributed and parallel computing underlying cloud architectures and specifical

  16. COMPUTERS: Teraflops for Europe; EEC Working Group on High Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Anon.

    1991-03-15

    In little more than a decade, simulation on high performance computers has become an essential tool for theoretical physics, capable of solving a vast range of crucial problems inaccessible to conventional analytic mathematics. In many ways, computer simulation has become the calculus for interacting many-body systems, a key to the study of transitions from isolated to collective behaviour.

  17. COMPUTERS: Teraflops for Europe; EEC Working Group on High Performance Computing

    International Nuclear Information System (INIS)

    Anon.

    1991-01-01

    In little more than a decade, simulation on high performance computers has become an essential tool for theoretical physics, capable of solving a vast range of crucial problems inaccessible to conventional analytic mathematics. In many ways, computer simulation has become the calculus for interacting many-body systems, a key to the study of transitions from isolated to collective behaviour

  18. High performance computing environment for multidimensional image analysis.

    Science.gov (United States)

    Rao, A Ravishankar; Cecchi, Guillermo A; Magnasco, Marcelo

    2007-07-10

    The processing of images acquired through microscopy is a challenging task due to the large size of datasets (several gigabytes) and the fast turnaround time required. If the throughput of the image processing stage is significantly increased, it can have a major impact in microscopy applications. We present a high performance computing (HPC) solution to this problem. This involves decomposing the spatial 3D image into segments that are assigned to unique processors, and matched to the 3D torus architecture of the IBM Blue Gene/L machine. Communication between segments is restricted to the nearest neighbors. When running on a 2 Ghz Intel CPU, the task of 3D median filtering on a typical 256 megabyte dataset takes two and a half hours, whereas by using 1024 nodes of Blue Gene, this task can be performed in 18.8 seconds, a 478x speedup. Our parallel solution dramatically improves the performance of image processing, feature extraction and 3D reconstruction tasks. This increased throughput permits biologists to conduct unprecedented large scale experiments with massive datasets.
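
    As a hedged, small-scale illustration of the decomposition idea described (splitting the volume into slabs with ghost layers and filtering the pieces independently), the sketch below uses plain Python and SciPy on a single node; it is not the Blue Gene/L MPI implementation, and the slab count and filter size are arbitrary choices.

        # Illustrative slab decomposition for a 3D median filter: split the volume
        # along z, give each slab a halo (ghost layer) so the window sees neighboring
        # data, filter slabs in worker processes, then trim halos and reassemble.
        import numpy as np
        from multiprocessing import Pool
        from scipy.ndimage import median_filter

        SIZE = 3                      # median window edge length
        HALO = SIZE // 2

        def filter_slab(slab):
            return median_filter(slab, size=SIZE)

        def parallel_median(volume, n_slabs=4):
            edges = np.linspace(0, volume.shape[0], n_slabs + 1, dtype=int)
            padded = [volume[max(a - HALO, 0):min(b + HALO, volume.shape[0])]
                      for a, b in zip(edges[:-1], edges[1:])]
            with Pool(n_slabs) as pool:
                filtered = pool.map(filter_slab, padded)
            trimmed = []
            for (a, b), f in zip(zip(edges[:-1], edges[1:]), filtered):
                lo = a - max(a - HALO, 0)      # halo planes present at slab start
                trimmed.append(f[lo:lo + (b - a)])
            return np.concatenate(trimmed, axis=0)

        if __name__ == "__main__":
            vol = np.random.rand(64, 64, 64)
            assert np.allclose(parallel_median(vol), median_filter(vol, size=SIZE))

    The halo width follows from the filter size: each output plane needs SIZE // 2 neighboring planes in every direction, so that interior slab boundaries reproduce exactly what a whole-volume filter would compute.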

  19. Using High Performance Computing to Examine the Processes of Neurogenesis Underlying Pattern Separation/Completion of Episodic Information.

    Energy Technology Data Exchange (ETDEWEB)

    Aimone, James Bradley [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Betty, Rita [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2015-03-01

    Using High Performance Computing to Examine the Processes of Neurogenesis Underlying Pattern Separation/Completion of Episodic Information - Sandia researchers developed novel methods and metrics for studying the computational function of neurogenesis, thus generating substantial impact to the neuroscience and neural computing communities. This work could benefit applications in machine learning and other analysis activities.

  20. High End Computing Technologies for Earth Science Applications: Trends, Challenges, and Innovations

    Science.gov (United States)

    Parks, John (Technical Monitor); Biswas, Rupak; Yan, Jerry C.; Brooks, Walter F.; Sterling, Thomas L.

    2003-01-01

    Earth science applications of the future will stress the capabilities of even the highest performance supercomputers in the areas of raw compute power, mass storage management, and software environments. These NASA mission critical problems demand usable multi-petaflops and exabyte-scale systems to fully realize their science goals. With an exciting vision of the technologies needed, NASA has established a comprehensive program of advanced research in computer architecture, software tools, and device technology to ensure that, in partnership with US industry, it can meet these demanding requirements with reliable, cost effective, and usable ultra-scale systems. NASA will exploit, explore, and influence emerging high end computing architectures and technologies to accelerate the next generation of engineering, operations, and discovery processes for NASA Enterprises. This article captures this vision and describes the concepts, accomplishments, and the potential payoff of the key thrusts that will help meet the computational challenges in Earth science applications.

  1. What Physicists Should Know About High Performance Computing - Circa 2002

    Science.gov (United States)

    Frederick, Donald

    2002-08-01

    High Performance Computing (HPC) is a dynamic, cross-disciplinary field that traditionally has involved applied mathematicians, computer scientists, and others primarily from the various disciplines that have been major users of HPC resources - physics, chemistry, engineering, with increasing use by those in the life sciences. There is a technological dynamic that is powered by economic as well as by technical innovations and developments. This talk will discuss practical ideas to be considered when developing numerical applications for research purposes. Even with the rapid pace of development in the field, the author believes that these concepts will not become obsolete for a while, and will be of use to scientists who either are considering, or who have already started down the HPC path. These principles will be applied in particular to current parallel HPC systems, but there will also be references of value to desktop users. The talk will cover such topics as: computing hardware basics, single-cpu optimization, compilers, timing, numerical libraries, debugging and profiling tools and the emergence of Computational Grids.

  2. Enabling High-Performance Computing as a Service

    KAUST Repository

    AbdelBaky, Moustafa; Parashar, Manish; Kim, Hyunjoo; Jordan, Kirk E.; Sachdeva, Vipin; Sexton, James; Jamjoom, Hani; Shae, Zon-Yin; Pencheva, Gergina; Tavakoli, Reza; Wheeler, Mary F.

    2012-01-01

    With the right software infrastructure, clouds can provide scientists with as a service access to high-performance computing resources. An award-winning prototype framework transforms the Blue Gene/P system into an elastic cloud to run a

  3. The Centre of High-Performance Scientific Computing, Geoverbund, ABC/J - Geosciences enabled by HPSC

    Science.gov (United States)

    Kollet, Stefan; Görgen, Klaus; Vereecken, Harry; Gasper, Fabian; Hendricks-Franssen, Harrie-Jan; Keune, Jessica; Kulkarni, Ketan; Kurtz, Wolfgang; Sharples, Wendy; Shrestha, Prabhakar; Simmer, Clemens; Sulis, Mauro; Vanderborght, Jan

    2016-04-01

    The Centre of High-Performance Scientific Computing (HPSC TerrSys) was founded in 2011 to establish a centre of competence in high-performance scientific computing in terrestrial systems and the geosciences, enabling fundamental and applied geoscientific research in the Geoverbund ABC/J (geoscientific research alliance of the Universities of Aachen, Cologne, Bonn and the Research Centre Jülich, Germany). The specific goals of HPSC TerrSys are to achieve relevance at the national and international level in (i) the development and application of HPSC technologies in the geoscientific community; (ii) student education; (iii) HPSC services and support also to the wider geoscientific community; and in (iv) the industry and public sectors via, e.g., useful applications and data products. A key feature of HPSC TerrSys is the Simulation Laboratory Terrestrial Systems, which is located at the Jülich Supercomputing Centre (JSC) and provides extensive capabilities with respect to porting, profiling, tuning and performance monitoring of geoscientific software in JSC's supercomputing environment. We will present a summary of success stories of HPSC applications including integrated terrestrial model development, parallel profiling and its application from watersheds to the continent; massively parallel data assimilation using physics-based models and ensemble methods; quasi-operational terrestrial water and energy monitoring; and convection-permitting climate simulations over Europe. The success stories stress the need for a formalized education of students in the application of HPSC technologies in the future.

  4. Acceleration of FDTD mode solver by high-performance computing techniques.

    Science.gov (United States)

    Han, Lin; Xi, Yanping; Huang, Wei-Ping

    2010-06-21

    A two-dimensional (2D) compact finite-difference time-domain (FDTD) mode solver is developed based on wave equation formalism in combination with the matrix pencil method (MPM). The method is validated for calculation of both real guided and complex leaky modes of typical optical waveguides against the benchmark finite-difference (FD) eigenmode solver. By taking advantage of the inherent parallel nature of the FDTD algorithm, the mode solver is implemented on graphics processing units (GPUs) using the compute unified device architecture (CUDA). It is demonstrated that the high-performance computing technique leads to significant acceleration of the FDTD mode solver, with more than 30 times improvement in computational efficiency in comparison with the conventional FDTD mode solver running on the CPU of a standard desktop computer. The computational efficiency of the accelerated FDTD method is of the same order of magnitude as the standard finite-difference eigenmode solver, yet requires much less memory (e.g., less than 10%). Therefore, the new method may serve as an efficient, accurate and robust tool for mode calculation of optical waveguides even when the conventional eigenvalue mode solvers are no longer applicable due to memory limitations.
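
    As a hedged sketch of the core computational pattern being accelerated, here is a single 2D FDTD field-update step written with vectorized NumPy; the coefficients are normalized placeholders, and the paper's compact wave-equation formulation, matrix pencil post-processing, and CUDA implementation are not reproduced.

        # Hedged sketch of one 2D FDTD (TMz) update step with vectorized NumPy.
        # It shows the stencil pattern that maps well onto GPU threads.
        import numpy as np

        def fdtd_step(Ez, Hx, Hy, ce=0.5, ch=0.5):
            # Update magnetic field components from the curl of Ez.
            Hx[:, :-1] -= ch * (Ez[:, 1:] - Ez[:, :-1])
            Hy[:-1, :] += ch * (Ez[1:, :] - Ez[:-1, :])
            # Update Ez (interior points) from the curl of H.
            Ez[1:-1, 1:-1] += ce * ((Hy[1:-1, 1:-1] - Hy[:-2, 1:-1])
                                    - (Hx[1:-1, 1:-1] - Hx[1:-1, :-2]))
            return Ez, Hx, Hy

        if __name__ == "__main__":
            n = 200
            Ez = np.zeros((n, n)); Hx = np.zeros((n, n)); Hy = np.zeros((n, n))
            Ez[n // 2, n // 2] = 1.0            # point excitation
            for _ in range(100):
                fdtd_step(Ez, Hx, Hy)
            print("field energy proxy:", float(np.sum(Ez**2)))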

  5. Accessible high performance computing solutions for near real-time image processing for time critical applications

    Science.gov (United States)

    Bielski, Conrad; Lemoine, Guido; Syryczynski, Jacek

    2009-09-01

    High Performance Computing (HPC) hardware solutions such as grid computing and General-Purpose computing on a Graphics Processing Unit (GPGPU) are now accessible to users with general computing needs. Grid computing infrastructures in the form of computing clusters or blades are becoming commonplace, and GPGPU solutions that leverage the processing power of the video card are quickly being integrated into personal workstations. Our interest in these HPC technologies stems from the need to produce near real-time maps from a combination of pre- and post-event satellite imagery in support of post-disaster management. Faster processing provides a twofold gain in this situation: (1) critical information can be provided faster and (2) more elaborate automated processing can be performed prior to providing the critical information. In our particular case, we test the use of the PANTEX index, which is based on analysis of image textural measures extracted using anisotropic, rotation-invariant GLCM statistics. The use of this index, applied in a moving window, has been shown to successfully identify built-up areas in remotely sensed imagery. Built-up index image masks are important input to the structuring of damage assessment interpretation because they help optimise the workload. The performance of the PANTEX workflow is compared on two different HPC hardware architectures: (1) a blade server with 4 blades, each having dual quad-core CPUs, and (2) a CUDA-enabled GPU workstation. The reference platform is a dual-CPU quad-core workstation, and the PANTEX workflow total computing time is measured. Furthermore, as part of a qualitative evaluation, the differences in setting up and configuring various hardware solutions and the related software coding effort are presented.
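
    As a hedged illustration of the GLCM texture measures underlying the PANTEX index, the snippet below computes gray-level co-occurrence contrast with scikit-image on two toy images and keeps the minimum over several offsets; this is not the operational PANTEX implementation, and the window size, offsets, and quantization are illustrative choices.

        # Hedged sketch of GLCM-based texture in the spirit of the PANTEX built-up
        # index: compute co-occurrence contrast for several offsets and keep the
        # minimum as a rotation-robust score. Parameters are illustrative only.
        import numpy as np
        from skimage.feature import graycomatrix, graycoprops  # greycomatrix in older versions

        def pantex_like(window, levels=32):
            q = (window.astype(float) / window.max() * (levels - 1)).astype(np.uint8)
            glcm = graycomatrix(q, distances=[1, 2],
                                angles=[0, np.pi / 4, np.pi / 2],
                                levels=levels, symmetric=True, normed=True)
            # Contrast per (distance, angle); the minimum gives a conservative score.
            return float(graycoprops(glcm, "contrast").min())

        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            smooth = np.outer(np.linspace(1, 255, 64), np.ones(64))    # low texture
            noisy = rng.integers(1, 255, size=(64, 64)).astype(float)  # high texture
            print("smooth:", pantex_like(smooth), "noisy:", pantex_like(noisy))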

  6. Using the Eclipse Parallel Tools Platform to Assist Earth Science Model Development and Optimization on High Performance Computers

    Science.gov (United States)

    Alameda, J. C.

    2011-12-01

    Development and optimization of computational science models, particularly on high-performance computers (and, with the advent of ubiquitous multicore processor systems, on practically every system), has been accomplished with basic software tools: typically, command-line compilers, debuggers, and performance tools that have not changed substantially since the days of serial and early vector computers. However, model complexity, including the complexity added by modern message passing libraries such as MPI, and the need for hybrid code models (such as OpenMP plus MPI) to take full advantage of high-performance computers with an increasing core count per shared-memory node, have made development and optimization of such codes an increasingly arduous task. Additional architectural developments, such as many-core processors, only complicate the situation further. In this paper, we describe how our NSF-funded project, "SI2-SSI: A Productive and Accessible Development Workbench for HPC Applications Using the Eclipse Parallel Tools Platform" (WHPC), seeks to improve the Eclipse Parallel Tools Platform, an environment designed to support scientific code development targeted at a diverse set of high-performance computing systems. Our WHPC project takes an application-centric view to improve PTP. We are using a set of scientific applications, each with a variety of challenges, both to drive further improvements to the scientific applications themselves and to understand shortcomings in Eclipse PTP from an application developer's perspective, which in turn drives the list of improvements we seek to make. We are also partnering with performance tool providers to drive higher-quality performance tool integration. We have partnered with the Cactus group at Louisiana State University to improve Eclipse's ability to work with computational frameworks and extremely complex build systems, as well as to develop educational materials to incorporate into

  7. High performance cloud auditing and applications

    CERN Document Server

    Choi, Baek-Young; Song, Sejun

    2014-01-01

    This book mainly focuses on cloud security and high performance computing for cloud auditing. The book discusses emerging challenges and techniques developed for high performance semantic cloud auditing, and presents the state of the art in cloud auditing, computing and security techniques with focus on technical aspects and feasibility of auditing issues in federated cloud computing environments.   In summer 2011, the United States Air Force Research Laboratory (AFRL) CyberBAT Cloud Security and Auditing Team initiated the exploration of the cloud security challenges and future cloud auditing research directions that are covered in this book. This work was supported by the United States government funds from the Air Force Office of Scientific Research (AFOSR), the AFOSR Summer Faculty Fellowship Program (SFFP), the Air Force Research Laboratory (AFRL) Visiting Faculty Research Program (VFRP), the National Science Foundation (NSF) and the National Institute of Health (NIH). All chapters were partially suppor...

  8. High-performance computing in accelerating structure design and analysis

    International Nuclear Information System (INIS)

    Li Zenghai; Folwell, Nathan; Ge Lixin; Guetz, Adam; Ivanov, Valentin; Kowalski, Marc; Lee, Lie-Quan; Ng, Cho-Kuen; Schussman, Greg; Stingelin, Lukas; Uplenchwar, Ravindra; Wolf, Michael; Xiao, Liling; Ko, Kwok

    2006-01-01

    Future high-energy accelerators such as the Next Linear Collider (NLC) will accelerate multi-bunch beams of high current and low emittance to obtain high luminosity, which put stringent requirements on the accelerating structures for efficiency and beam stability. While numerical modeling has been quite standard in accelerator R and D, designing the NLC accelerating structure required a new simulation capability because of the geometric complexity and level of accuracy involved. Under the US DOE Advanced Computing initiatives (first the Grand Challenge and now SciDAC), SLAC has developed a suite of electromagnetic codes based on unstructured grids and utilizing high-performance computing to provide an advanced tool for modeling structures at accuracies and scales previously not possible. This paper will discuss the code development and computational science research (e.g. domain decomposition, scalable eigensolvers, adaptive mesh refinement) that have enabled the large-scale simulations needed for meeting the computational challenges posed by the NLC as well as projects such as the PEP-II and RIA. Numerical results will be presented to show how high-performance computing has made a qualitative improvement in accelerator structure modeling for these accelerators, either at the component level (single cell optimization), or on the scale of an entire structure (beam heating and long-range wakefields)

  9. The ongoing investigation of high performance parallel computing in HEP

    CERN Document Server

    Peach, Kenneth J; Böck, R K; Dobinson, Robert W; Hansroul, M; Norton, Alan Robert; Willers, Ian Malcolm; Baud, J P; Carminati, F; Gagliardi, F; McIntosh, E; Metcalf, M; Robertson, L; CERN. Geneva. Detector Research and Development Committee

    1993-01-01

    Past and current exploitation of parallel computing in High Energy Physics is summarized and a list of R & D projects in this area is presented. The applicability of new parallel hardware and software to physics problems is investigated, in the light of the requirements for computing power of LHC experiments and the current trends in the computer industry. Four main themes are discussed (possibilities for a finer grain of parallelism; fine-grain communication mechanism; usable parallel programming environment; different programming models and architectures, using standard commercial products). Parallel computing technology is potentially of interest for offline and vital for real time applications in LHC. A substantial investment in applications development and evaluation of state of the art hardware and software products is needed. A solid development environment is required at an early stage, before mainline LHC program development begins.

  10. Automatic Energy Schemes for High Performance Applications

    Energy Technology Data Exchange (ETDEWEB)

    Sundriyal, Vaibhav [Iowa State Univ., Ames, IA (United States)

    2013-01-01

    Although high-performance computing traditionally focuses on the efficient execution of large-scale applications, both energy and power have become critical concerns when approaching exascale. Drastic increases in the power consumption of supercomputers significantly affect their operating costs and failure rates. In modern microprocessor architectures, equipped with dynamic voltage and frequency scaling (DVFS) and CPU clock modulation (throttling), the power consumption may be controlled in software. Additionally, the network interconnect, such as InfiniBand, may be exploited to maximize energy savings, while application performance loss and frequency-switching overheads must be carefully balanced. This work first studies two important collective communication operations, all-to-all and allgather, and proposes energy-saving strategies on a per-call basis. Next, it targets point-to-point communications, grouping them into phases and applying frequency scaling to save energy by exploiting architectural and communication stalls. Finally, it proposes an automatic runtime system which combines both collective and point-to-point communications into phases, and applies throttling in addition to DVFS to maximize energy savings. The experimental results are presented for NAS parallel benchmark problems as well as for the realistic parallel electronic structure calculations performed by the widely used quantum chemistry package GAMESS. Close to maximum energy savings were obtained with substantially low performance loss on the given platform.
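
    As a rough conceptual sketch of the per-phase approach described (lowering the CPU frequency around a communication-bound phase and restoring it afterwards): the mpi4py collective call is real, but set_frequency below is a hypothetical placeholder, since actual frequency control is platform-specific (for example, Linux cpufreq sysfs files that require privileges) and is not specified in the abstract.

        # Conceptual sketch only: lower the CPU frequency around a communication
        # phase and restore it afterwards. set_frequency is a HYPOTHETICAL helper.
        import numpy as np
        from mpi4py import MPI

        def set_frequency(khz):
            # Placeholder: on a real system this might write to
            # /sys/devices/system/cpu/cpu*/cpufreq/scaling_setspeed (root required).
            pass

        def energy_aware_allgather(comm, local_chunk,
                                   low_khz=1_200_000, high_khz=2_400_000):
            set_frequency(low_khz)     # CPU is mostly waiting during communication
            gathered = comm.allgather(local_chunk)
            set_frequency(high_khz)    # restore full speed for compute phases
            return np.concatenate(gathered)

        if __name__ == "__main__":
            comm = MPI.COMM_WORLD
            chunk = np.full(4, comm.Get_rank(), dtype=np.int64)
            result = energy_aware_allgather(comm, chunk)
            if comm.Get_rank() == 0:
                print(result)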

  11. High-performance heat pipes for heat recovery applications

    Science.gov (United States)

    Saaski, E. W.; Hartl, J. H.

    1980-01-01

    Methods to improve the performance of reflux heat pipes for heat recovery applications were examined both analytically and experimentally. Various models for the estimation of reflux heat pipe transport capacity were surveyed in the literature and compared with experimental data. A high transport capacity reflux heat pipe was developed that provides up to a factor of 10 capacity improvement over conventional open tube designs; analytical models were developed for this device and incorporated into a computer program HPIPE. Good agreement of the model predictions with data for R-11 and benzene reflux heat pipes was obtained.

  12. Exploring Infiniband Hardware Virtualization in OpenNebula towards Efficient High-Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Pais Pitta de Lacerda Ruivo, Tiago [IIT, Chicago; Bernabeu Altayo, Gerard [Fermilab; Garzoglio, Gabriele [Fermilab; Timm, Steven [Fermilab; Kim, Hyun-Woo [Fermilab; Noh, Seo-Young [KISTI, Daejeon; Raicu, Ioan [IIT, Chicago

    2014-11-11

    It has been widely accepted that software virtualization has a big negative impact on high-performance computing (HPC) application performance. This work explores the potential use of Infiniband hardware virtualization in an OpenNebula cloud towards the efficient support of MPI-based workloads. We have implemented, deployed, and tested an Infiniband network on the FermiCloud private Infrastructure-as-a-Service (IaaS) cloud. To avoid software virtualization and minimize the virtualization overhead, we employed a technique called Single Root Input/Output Virtualization (SRIOV). Our solution spanned modifications to the Linux hypervisor as well as to the OpenNebula manager. We evaluated the performance of the hardware virtualization on up to 56 virtual machines connected by up to 8 DDR Infiniband network links, with micro-benchmarks (latency and bandwidth) as well as with an MPI-intensive application (the HPL Linpack benchmark).

  13. Evaluation of high-performance computing software

    Energy Technology Data Exchange (ETDEWEB)

    Browne, S.; Dongarra, J. [Univ. of Tennessee, Knoxville, TN (United States); Rowan, T. [Oak Ridge National Lab., TN (United States)

    1996-12-31

    The absence of unbiased and up-to-date comparative evaluations of high-performance computing software complicates a user's search for the appropriate software package. The National HPCC Software Exchange (NHSE) is attacking this problem using an approach that includes independent evaluations of software, incorporation of author and user feedback into the evaluations, and Web access to the evaluations. We are applying this approach to the Parallel Tools Library (PTLIB), a new software repository for parallel systems software and tools, and HPC-Netlib, a high performance branch of the Netlib mathematical software repository. Updating the evaluations with feedback and making them available via the Web helps ensure accuracy and timeliness, and using independent reviewers produces unbiased comparative evaluations that are difficult to find elsewhere.

  14. Scalability of DL_POLY on High Performance Computing Platform

    CSIR Research Space (South Africa)

    Mabakane, Mabule S

    2017-12-01

    Full Text Available (SACJ 29(3), December 2017; only a fragment of the abstract is recoverable): ... when using many processors within the compute nodes of the supercomputer. The type of processors in the compute nodes and their memory also play an important role in the overall performance of a parallel application running on a supercomputer. DL...

  15. DEVICE TECHNOLOGY. Nanomaterials in transistors: From high-performance to thin-film applications.

    Science.gov (United States)

    Franklin, Aaron D

    2015-08-14

    For more than 50 years, silicon transistors have been continuously shrunk to meet the projections of Moore's law but are now reaching fundamental limits on speed and power use. With these limits at hand, nanomaterials offer great promise for improving transistor performance and adding new applications through the coming decades. With different transistors needed in everything from high-performance servers to thin-film display backplanes, it is important to understand the targeted application needs when considering new material options. Here the distinction between high-performance and thin-film transistors is reviewed, along with the benefits and challenges to using nanomaterials in such transistors. In particular, progress on carbon nanotubes, as well as graphene and related materials (including transition metal dichalcogenides and X-enes), outlines the advances and further research needed to enable their use in transistors for high-performance computing, thin films, or completely new technologies such as flexible and transparent devices. Copyright © 2015, American Association for the Advancement of Science.

  16. GRID computing for experimental high energy physics

    International Nuclear Information System (INIS)

    Moloney, G.R.; Martin, L.; Seviour, E.; Taylor, G.N.; Moorhead, G.F.

    2002-01-01

    Full text: The Large Hadron Collider (LHC), to be completed at the CERN laboratory in 2006, will generate 11 petabytes of data per year. The processing of this large data stream requires a large, distributed computing infrastructure. A recent innovation in high performance distributed computing, the GRID, has been identified as an important tool in data analysis for the LHC. GRID computing has actual and potential application in many fields which require computationally intensive analysis of large, shared data sets. The Australian experimental High Energy Physics community has formed partnerships with the High Performance Computing community to establish a GRID node at the University of Melbourne. Through Australian membership of the ATLAS experiment at the LHC, Australian researchers have an opportunity to be involved in the European DataGRID project. This presentation will include an introduction to the GRID, and its application to experimental High Energy Physics. We will present the results of our studies, including participation in the first LHC data challenge.

  17. High Performance Computing Operations Review Report

    Energy Technology Data Exchange (ETDEWEB)

    Cupps, Kimberly C. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2013-12-19

    The High Performance Computing Operations Review (HPCOR) meeting—requested by the ASC and ASCR program headquarters at DOE—was held November 5 and 6, 2013, at the Marriott Hotel in San Francisco, CA. The purpose of the review was to discuss the processes and practices for HPC integration and its related software and facilities. Experiences and lessons learned from the most recent systems deployed were covered in order to benefit the deployment of new systems.

  18. High Performance Computing in Science and Engineering '08 : Transactions of the High Performance Computing Center

    CERN Document Server

    Kröner, Dietmar; Resch, Michael

    2009-01-01

    The discussions and plans on all scientific, advisory, and political levels to realize an even larger “European Supercomputer” in Germany, where the hardware costs alone will be hundreds of millions of Euros – much more than in the past – are getting closer to realization. As part of the strategy, the three national supercomputing centres HLRS (Stuttgart), NIC/JSC (Jülich) and LRZ (Munich) have formed the Gauss Centre for Supercomputing (GCS) as a new virtual organization enabled by an agreement between the Federal Ministry of Education and Research (BMBF) and the state ministries for research of Baden-Württemberg, Bayern, and Nordrhein-Westfalen. Already today, the GCS provides the most powerful high-performance computing infrastructure in Europe. Through GCS, HLRS participates in the European project PRACE (Partnership for Advanced Computing in Europe) and extends its reach to all European member countries. These activities align well with the activities of HLRS in the European HPC infrastructur...

  19. High performance computing in science and engineering '09: transactions of the High Performance Computing Center, Stuttgart (HLRS) 2009

    National Research Council Canada - National Science Library

    Nagel, Wolfgang E; Kröner, Dietmar; Resch, Michael

    2010-01-01

    ...), NIC/JSC (Jülich), and LRZ (Munich). As part of that strategic initiative, NIC/JSC already installed the first phase of the GCS HPC Tier-0 resources in May 2009, an IBM Blue Gene/P with roughly 300,000 cores, this time in Jülich. With that, the GCS provides the most powerful high-performance computing infrastructure in Europe alread...

  20. International Conference on Modern Mathematical Methods and High Performance Computing in Science and Technology

    CERN Document Server

    Srivastava, HM; Venturino, Ezio; Resch, Michael; Gupta, Vijay

    2016-01-01

    The book discusses important results in modern mathematical models and high performance computing, such as applied operations research, simulation of operations, statistical modeling and applications, invisibility regions and regular meta-materials, unmanned vehicles, modern radar techniques/SAR imaging, satellite remote sensing, coding, and robotic systems. Furthermore, it is valuable as a reference work and as a basis for further study and research. All contributing authors are respected academicians, scientists and researchers from around the globe. All the papers were presented at the international conference on Modern Mathematical Methods and High Performance Computing in Science & Technology (M3HPCST 2015), held at Raj Kumar Goel Institute of Technology, Ghaziabad, India, from 27–29 December 2015, and peer-reviewed by international experts. The conference provided an exceptional platform for leading researchers, academicians, developers, engineers and technocrats from a broad range of disciplines ...

  1. The contribution of high-performance computing and modelling for industrial development

    CSIR Research Space (South Africa)

    Sithole, Happy

    2017-10-01

    Full Text Available The contribution of high-performance computing and modelling for industrial development, Dr Happy Sithole and Dr Onno Ubbink. Strategic context: high-performance computing (HPC) combined with machine learning and artificial intelligence presents opportunities to non...

  2. Lightweight Provenance Service for High-Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Dai, Dong; Chen, Yong; Carns, Philip; Jenkins, John; Ross, Robert

    2017-09-09

    Provenance describes detailed information about the history of a piece of data, containing the relationships among elements such as users, processes, jobs, and workflows that contribute to the existence of data. Provenance is key to supporting many data management functionalities that are increasingly important in operations such as identifying data sources, parameters, or assumptions behind a given result; auditing data usage; or understanding details about how inputs are transformed into outputs. Despite its importance, however, provenance support is largely underdeveloped in highly parallel architectures and systems. One major challenge is the demanding requirements of providing provenance service in situ. The need to remain lightweight and to be always on often conflicts with the need to be transparent and offer an accurate catalog of details regarding the applications and systems. To tackle this challenge, we introduce a lightweight provenance service, called LPS, for high-performance computing (HPC) systems. LPS leverages a kernel instrument mechanism to achieve transparency and introduces representative execution and flexible granularity to capture comprehensive provenance with controllable overhead. Extensive evaluations and use cases have confirmed its efficiency and usability. We believe that LPS can be integrated into current and future HPC systems to support a variety of data management needs.
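
    A rough illustration of the kind of lineage record such a service has to capture is sketched below. The field names and the JSON output are hypothetical and chosen only for the example; LPS itself works in situ via kernel instrumentation, which is not reproduced here.

```python
# A hypothetical lineage record for illustration only; LPS itself captures
# this information in situ through kernel instrumentation.
import json
import time
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class ProvenanceRecord:
    """One lineage entry: which process, run by whom, turned which inputs into which outputs."""
    user: str
    job_id: str
    process: str
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)
    timestamp: float = field(default_factory=time.time)

    def to_json(self) -> str:
        return json.dumps(asdict(self))


if __name__ == "__main__":
    rec = ProvenanceRecord(
        user="alice",
        job_id="slurm-123456",
        process="climate_postproc",
        inputs=["/scratch/raw/run42.nc"],
        outputs=["/project/monthly_means.nc"],
    )
    print(rec.to_json())  # a catalog of such records answers "where did this file come from?"
```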

  3. The path toward HEP High Performance Computing

    International Nuclear Information System (INIS)

    Apostolakis, John; Brun, René; Gheata, Andrei; Wenzel, Sandro; Carminati, Federico

    2014-01-01

    High Energy Physics code has been known for making poor use of high performance computing architectures. Efforts in optimising HEP code on vector and RISC architectures have yielded limited results, and recent studies have shown that, on modern architectures, it achieves between 10% and 50% of peak performance. Although several successful attempts have been made to port selected codes to GPUs, no major HEP code suite has a 'High Performance' implementation. With the LHC undergoing a major upgrade and a number of challenging experiments on the drawing board, HEP can no longer neglect the less-than-optimal performance of its code and has to try to make the best use of the hardware. This activity is one of the foci of the SFT group at CERN, which hosts, among others, the Root and Geant4 projects. The activity of the experiments is shared and coordinated via a Concurrency Forum, where the experience in optimising HEP code is presented and discussed. Another activity is the Geant-V project, centred on the development of a high-performance prototype for particle transport. Achieving a good concurrency level on the emerging parallel architectures without a complete redesign of the framework can only be done by parallelizing at event level, or with a much larger effort at track level. Apart from the shareable data structures, this typically implies a multiplication factor in terms of memory consumption compared to the single-threaded version, together with sub-optimal handling of event processing tails. Besides this, the low-level instruction pipelining of modern processors cannot be used efficiently to speed up the program. We have implemented a framework that allows scheduling vectors of particles to an arbitrary number of computing resources in a fine-grain parallel approach. The talk will review the current optimisation activities within the SFT group with a particular emphasis on the development perspectives towards a simulation framework able to profit...

  4. High performance computing and communications: FY 1997 implementation plan

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1996-12-01

    The High Performance Computing and Communications (HPCC) Program was formally authorized by passage, with bipartisan support, of the High-Performance Computing Act of 1991, signed on December 9, 1991. The original Program, in which eight Federal agencies participated, has now grown to twelve agencies. This Plan provides a detailed description of the agencies' FY 1996 HPCC accomplishments and FY 1997 HPCC plans. Section 3 of this Plan provides an overview of the HPCC Program. Section 4 contains more detailed definitions of the Program Component Areas, with an emphasis on the overall directions and milestones planned for each PCA. Appendix A provides a detailed look at HPCC Program activities within each agency.

  5. High performance computing and communications: FY 1996 implementation plan

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1995-05-16

    The High Performance Computing and Communications (HPCC) Program was formally authorized by passage of the High Performance Computing Act of 1991, signed on December 9, 1991. Twelve federal agencies, in collaboration with scientists and managers from US industry, universities, and research laboratories, have developed the Program to meet the challenges of advancing computing and associated communications technologies and practices. This plan provides a detailed description of the agencies' HPCC implementation plans for FY 1995 and FY 1996. This Implementation Plan contains three additional sections. Section 3 provides an overview of the HPCC Program definition and organization. Section 4 contains a breakdown of the five major components of the HPCC Program, with an emphasis on the overall directions and milestones planned for each one. Section 5 provides a detailed look at HPCC Program activities within each agency.

  6. High performance computing in science and engineering Garching/Munich 2016

    Energy Technology Data Exchange (ETDEWEB)

    Wagner, Siegfried; Bode, Arndt; Bruechle, Helmut; Brehm, Matthias (eds.)

    2016-11-01

    Computer simulations are the well-established third pillar of natural sciences along with theory and experimentation. Particularly high performance computing is growing fast and constantly demands more and more powerful machines. To keep pace with this development, in spring 2015, the Leibniz Supercomputing Centre installed the high performance computing system SuperMUC Phase 2, only three years after the inauguration of its sibling SuperMUC Phase 1. Thereby, the compute capabilities were more than doubled. This book covers the time-frame June 2014 until June 2016. Readers will find many examples of outstanding research in the more than 130 projects that are covered in this book, with each one of these projects using at least 4 million core-hours on SuperMUC. The largest scientific communities using SuperMUC in the last two years were computational fluid dynamics simulations, chemistry and material sciences, astrophysics, and life sciences.

  7. Performance Measurements in a High Throughput Computing Environment

    CERN Document Server

    AUTHOR|(CDS)2145966; Gribaudo, Marco

    The IT infrastructures of companies and research centres are implementing new technologies to satisfy the increasing need of computing resources for big data analysis. In this context, resource profiling plays a crucial role in identifying areas where the improvement of the utilisation efficiency is needed. In order to deal with the profiling and optimisation of computing resources, two complementary approaches can be adopted: the measurement-based approach and the model-based approach. The measurement-based approach gathers and analyses performance metrics executing benchmark applications on computing resources. Instead, the model-based approach implies the design and implementation of a model as an abstraction of the real system, selecting only those aspects relevant to the study. This Thesis originates from a project carried out by the author within the CERN IT department. CERN is an international scientific laboratory that conducts fundamental researches in the domain of elementary particle physics. The p...

  8. Power/energy use cases for high performance computing

    Energy Technology Data Exchange (ETDEWEB)

    Laros, James H. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Kelly, Suzanne M. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Hammond, Steven [National Renewable Energy Lab. (NREL), Golden, CO (United States); Elmore, Ryan [National Renewable Energy Lab. (NREL), Golden, CO (United States); Munch, Kristin [National Renewable Energy Lab. (NREL), Golden, CO (United States)

    2013-12-01

    Power and Energy have been identified as a first order challenge for future extreme scale high performance computing (HPC) systems. In practice the breakthroughs will need to be provided by the hardware vendors. But to make the best use of the solutions in an HPC environment, it will likely require periodic tuning by facility operators and software components. This document describes the actions and interactions needed to maximize power resources. It strives to cover the entire operational space that an HPC system occupies. The descriptions are presented as formal use cases, as documented in the Unified Modeling Language Specification [1]. The document is intended to provide a common understanding to the HPC community of the necessary management and control capabilities. Assuming a common understanding can be achieved, the next step will be to develop a set of Application Programming Interfaces (APIs) that hardware vendors and software developers could utilize to steer power consumption.

  9. High Performance Computing Facility Operational Assessment 2015: Oak Ridge Leadership Computing Facility

    Energy Technology Data Exchange (ETDEWEB)

    Barker, Ashley D. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Bernholdt, David E. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Bland, Arthur S. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Gary, Jeff D. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Hack, James J. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; McNally, Stephen T. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Rogers, James H. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Smith, Brian E. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Straatsma, T. P. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Sukumar, Sreenivas Rangan [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Thach, Kevin G. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Tichenor, Suzy [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Vazhkudai, Sudharshan S. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Wells, Jack C. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility

    2016-03-01

    Oak Ridge National Laboratory’s (ORNL’s) Leadership Computing Facility (OLCF) continues to surpass its operational target goals: supporting users; delivering fast, reliable systems; creating innovative solutions for high-performance computing (HPC) needs; and managing risks, safety, and security aspects associated with operating one of the most powerful computers in the world. The results can be seen in the cutting-edge science delivered by users and the praise from the research community. Calendar year (CY) 2015 was filled with outstanding operational results and accomplishments: a very high rating from users on overall satisfaction that ties the highest-ever mark set in CY 2014; the greatest number of core-hours delivered to research projects; the largest percentage of capability usage since the OLCF began tracking the metric in 2009; and success in delivering on the allocation of 60, 30, and 10% of core hours offered for the INCITE (Innovative and Novel Computational Impact on Theory and Experiment), ALCC (Advanced Scientific Computing Research Leadership Computing Challenge), and Director’s Discretionary programs, respectively. These accomplishments, coupled with the extremely high utilization rate, represent the fulfillment of the promise of Titan: maximum use by maximum-size simulations. The impact of all of these successes and more is reflected in the accomplishments of OLCF users, with publications this year in notable journals Nature, Nature Materials, Nature Chemistry, Nature Physics, Nature Climate Change, ACS Nano, Journal of the American Chemical Society, and Physical Review Letters, as well as many others. The achievements included in the 2015 OLCF Operational Assessment Report reflect first-ever or largest simulations in their communities; for example Titan enabled engineers in Los Angeles and the surrounding region to design and begin building improved critical infrastructure by enabling the highest-resolution Cybershake map for Southern

  10. A Framework for Debugging Geoscience Projects in a High Performance Computing Environment

    Science.gov (United States)

    Baxter, C.; Matott, L.

    2012-12-01

    High performance computing (HPC) infrastructure has become ubiquitous in today's world with the emergence of commercial cloud computing and academic supercomputing centers. Teams of geoscientists, hydrologists and engineers can take advantage of this infrastructure to undertake large research projects - for example, linking one or more site-specific environmental models with soft computing algorithms, such as heuristic global search procedures, to perform parameter estimation and predictive uncertainty analysis, and/or design least-cost remediation systems. However, the size, complexity and distributed nature of these projects can make identifying failures in the associated numerical experiments using conventional ad-hoc approaches both time-consuming and ineffective. To address these problems a multi-tiered debugging framework has been developed. The framework allows for quickly isolating and remedying a number of potential experimental failures, including: failures in the HPC scheduler; bugs in the soft computing code; bugs in the modeling code; and permissions and access control errors. The utility of the framework is demonstrated via application to a series of over 200,000 numerical experiments involving a suite of 5 heuristic global search algorithms and 15 mathematical test functions serving as cheap analogues for the simulation-based optimization of pump-and-treat subsurface remediation systems.

  11. High Performance Computing Multicast

    Science.gov (United States)

    2012-02-01

    Only front matter of this report is indexed here: a citation ("A History of the Virtual Synchrony Replication Model," in Replication: Theory and Practice, Charron-Bost, B., Pedone, F., and Schiper, A., Eds.) and part of the acronym list (HPC: High Performance Computing; IP/IPv4: Internet Protocol (version 4.0); IPMC: Internet Protocol MultiCast; LAN: Local Area Network; MCMD: Dr. Multicast; MPI).

  12. A high performance scientific cloud computing environment for materials simulations

    OpenAIRE

    Jorissen, Kevin; Vila, Fernando D.; Rehr, John J.

    2011-01-01

    We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including...

  13. The path toward HEP High Performance Computing

    CERN Document Server

    Apostolakis, John; Carminati, Federico; Gheata, Andrei; Wenzel, Sandro

    2014-01-01

    High Energy Physics code has been known for making poor use of high performance computing architectures. Efforts in optimising HEP code on vector and RISC architectures have yielded limited results, and recent studies have shown that, on modern architectures, it achieves between 10% and 50% of peak performance. Although several successful attempts have been made to port selected codes to GPUs, no major HEP code suite has a 'High Performance' implementation. With the LHC undergoing a major upgrade and a number of challenging experiments on the drawing board, HEP can no longer neglect the less-than-optimal performance of its code and has to try to make the best use of the hardware. This activity is one of the foci of the SFT group at CERN, which hosts, among others, the Root and Geant4 projects. The activity of the experiments is shared and coordinated via a Concurrency Forum, where the experience in optimising HEP code is presented and discussed. Another activity is the Geant-V project, centred on th...

  14. High-performance computing for structural mechanics and earthquake/tsunami engineering

    CERN Document Server

    Hori, Muneo; Ohsaki, Makoto

    2016-01-01

    Huge earthquakes and tsunamis have caused serious damage to important structures such as civil infrastructure elements, buildings and power plants around the globe. To quantitatively evaluate such damage processes and to design effective prevention and mitigation measures, the latest high-performance computational mechanics technologies, which include terascale to petascale computers, can offer powerful tools. The phenomena covered in this book include seismic wave propagation in the crust and soil, seismic response of infrastructure elements such as tunnels considering soil-structure interactions, seismic response of high-rise buildings, seismic response of nuclear power plants, tsunami run-up over coastal towns and tsunami inundation considering fluid-structure interactions. The book provides all necessary information for addressing these phenomena, ranging from the fundamentals of high-performance computing for finite element methods, key algorithms of accurate dynamic structural analysis, fluid flows ...

  15. High performance computing in power and energy systems

    Energy Technology Data Exchange (ETDEWEB)

    Khaitan, Siddhartha Kumar [Iowa State Univ., Ames, IA (United States); Gupta, Anshul (eds.) [IBM Watson Research Center, Yorktown Heights, NY (United States)

    2013-07-01

    The twin challenge of meeting global energy demands in the face of growing economies and populations and restricting greenhouse gas emissions is one of the most daunting ones that humanity has ever faced. Smart electrical generation and distribution infrastructure will play a crucial role in meeting these challenges. We would need to develop capabilities to handle large volumes of data generated by power system components like PMUs, DFRs and other data acquisition devices, as well as the capacity to process these data at high resolution via multi-scale and multi-period simulations, cascading and security analysis, interaction between hybrid systems (electric, transport, gas, oil, coal, etc.) and so on, to get meaningful information in real time to ensure a secure, reliable and stable power system grid. Advanced research on development and implementation of market-ready leading-edge high-speed enabling technologies and algorithms for solving real-time, dynamic, resource-critical problems will be required for dynamic security analysis targeted towards successful implementation of Smart Grid initiatives. This book aims to bring together some of the latest research developments as well as thoughts on the future research directions of the high performance computing applications in electric power systems planning, operations, security, markets, and grid integration of alternate sources of energy, etc.

  16. Cloud object store for archive storage of high performance computing data using decoupling middleware

    Science.gov (United States)

    Bent, John M.; Faibish, Sorin; Grider, Gary

    2015-06-30

    Cloud object storage is enabled for archived data, such as checkpoints and results, of high performance computing applications using a middleware process. A plurality of archived files, such as checkpoint files and results, generated by a plurality of processes in a parallel computing system are stored by obtaining the plurality of archived files from the parallel computing system; converting the plurality of archived files to objects using a log structured file system middleware process; and providing the objects for storage in a cloud object storage system. The plurality of processes may run, for example, on a plurality of compute nodes. The log structured file system middleware process may be embodied, for example, as a Parallel Log-Structured File System (PLFS). The log structured file system middleware process optionally executes on a burst buffer node.
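
    A minimal sketch of the middleware idea, assuming a flat key/bytes object interface: checkpoint files produced by a parallel job are walked, turned into objects, and handed to an object store. The in-memory dict standing in for the cloud bucket and all names are illustrative only; PLFS and a real object-store client would replace them.

```python
# Sketch only: convert archived checkpoint files into (key, bytes) objects and
# hand them to an object store. The dict "bucket" and all names are stand-ins;
# PLFS and a real object-store client (e.g. S3) would replace them.
import os
from pathlib import Path


def archive_to_objects(checkpoint_dir: str, bucket: dict, prefix: str = "ckpt") -> None:
    """Turn every file under checkpoint_dir into an object keyed by its relative path."""
    root = Path(checkpoint_dir)
    for path in root.rglob("*"):
        if path.is_file():
            key = f"{prefix}/{path.relative_to(root)}"
            bucket[key] = path.read_bytes()  # a real client would stream or multipart-upload


if __name__ == "__main__":
    store = {}                                   # stand-in for a cloud bucket
    os.makedirs("/tmp/ckpt_demo", exist_ok=True)
    Path("/tmp/ckpt_demo/rank0.chk").write_bytes(b"\x00" * 1024)
    archive_to_objects("/tmp/ckpt_demo", store)
    print(sorted(store))                         # ['ckpt/rank0.chk']
```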

  17. High-performance implementation of Chebyshev filter diagonalization for interior eigenvalue computations

    Energy Technology Data Exchange (ETDEWEB)

    Pieper, Andreas [Ernst-Moritz-Arndt-Universität Greifswald (Germany); Kreutzer, Moritz [Friedrich-Alexander-Universität Erlangen-Nürnberg (Germany); Alvermann, Andreas, E-mail: alvermann@physik.uni-greifswald.de [Ernst-Moritz-Arndt-Universität Greifswald (Germany); Galgon, Martin [Bergische Universität Wuppertal (Germany); Fehske, Holger [Ernst-Moritz-Arndt-Universität Greifswald (Germany); Hager, Georg [Friedrich-Alexander-Universität Erlangen-Nürnberg (Germany); Lang, Bruno [Bergische Universität Wuppertal (Germany); Wellein, Gerhard [Friedrich-Alexander-Universität Erlangen-Nürnberg (Germany)

    2016-11-15

    We study Chebyshev filter diagonalization as a tool for the computation of many interior eigenvalues of very large sparse symmetric matrices. In this technique the subspace projection onto the target space of wanted eigenvectors is approximated with filter polynomials obtained from Chebyshev expansions of window functions. After the discussion of the conceptual foundations of Chebyshev filter diagonalization we analyze the impact of the choice of the damping kernel, search space size, and filter polynomial degree on the computational accuracy and effort, before we describe the necessary steps towards a parallel high-performance implementation. Because Chebyshev filter diagonalization avoids the need for matrix inversion it can deal with matrices and problem sizes that are presently not accessible with rational function methods based on direct or iterative linear solvers. To demonstrate the potential of Chebyshev filter diagonalization for large-scale problems of this kind we include as an example the computation of the 10^2 innermost eigenpairs of a topological insulator matrix with dimension 10^9 derived from quantum physics applications.
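
    To make the procedure concrete, the following NumPy sketch applies a Jackson-damped Chebyshev expansion of the window indicator to a random block, iterates, and extracts Ritz pairs inside the target interval. It is a small dense-matrix illustration of the idea only, not the paper's sparse, parallel implementation; the window, degree, and block size below are arbitrary example values.

```python
# Dense NumPy illustration of Chebyshev filter diagonalization for interior
# eigenvalues. Window, degree, block size and the test matrix are arbitrary
# choices for the example; they are not taken from the paper.
import numpy as np


def chebyshev_filter(A, V, a, b, lmin, lmax, degree):
    """Apply a Jackson-damped Chebyshev expansion of the indicator of [a, b] to the block V."""
    c, e = (lmax + lmin) / 2.0, (lmax - lmin) / 2.0   # map the spectrum into [-1, 1]
    ta, tb = np.arccos((a - c) / e), np.arccos((b - c) / e)
    M = degree
    Y = np.zeros_like(V)
    Tprev, Tcur = V, (A @ V - c * V) / e              # T_0(A~)V and T_1(A~)V
    for k in range(M + 1):
        if k == 0:
            coef, Tk = (ta - tb) / np.pi, Tprev
        else:
            coef, Tk = 2.0 * (np.sin(k * ta) - np.sin(k * tb)) / (np.pi * k), Tcur
        # Jackson damping suppresses Gibbs oscillations of the truncated expansion
        g = ((M - k + 1) * np.cos(np.pi * k / (M + 1))
             + np.sin(np.pi * k / (M + 1)) / np.tan(np.pi / (M + 1))) / (M + 1)
        Y += g * coef * Tk
        if k >= 1:                                    # T_{k+1} = 2*A~*T_k - T_{k-1}
            Tprev, Tcur = Tcur, 2.0 * (A @ Tcur - c * Tcur) / e - Tprev
    return Y


def filtered_eigs(A, a, b, block=12, degree=150, iters=5, seed=0):
    """Filtered subspace iteration plus Rayleigh-Ritz; returns Ritz pairs inside [a, b]."""
    rng = np.random.default_rng(seed)
    lmin, lmax = np.linalg.eigvalsh(A)[[0, -1]]       # cheap here; use Lanczos bounds at scale
    V = rng.standard_normal((A.shape[0], block))
    for _ in range(iters):
        V = chebyshev_filter(A, V, a, b, lmin, lmax, degree)
        V, _ = np.linalg.qr(V)                        # re-orthonormalise the filtered block
    theta, U = np.linalg.eigh(V.T @ A @ V)            # Rayleigh-Ritz on the filtered subspace
    keep = (theta >= a) & (theta <= b)
    return theta[keep], V @ U[:, keep]


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    Q, _ = np.linalg.qr(rng.standard_normal((200, 200)))
    A = Q @ np.diag(np.linspace(0.0, 10.0, 200)) @ Q.T    # known spectrum on [0, 10]
    vals, vecs = filtered_eigs(A, 4.8, 5.2)
    print(np.round(vals, 3))                              # interior eigenvalues without factorising A
```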

  18. Mixed-Language High-Performance Computing for Plasma Simulations

    Directory of Open Access Journals (Sweden)

    Quanming Lu

    2003-01-01

    Full Text Available Java is receiving increasing attention as the most popular platform for distributed computing. However, programmers are still reluctant to embrace Java as a tool for writing scientific and engineering applications due to its still noticeable performance drawbacks compared with other programming languages such as Fortran or C. In this paper, we present a hybrid Java/Fortran implementation of a parallel particle-in-cell (PIC) algorithm for plasma simulations. In our approach, the time-consuming components of this application are designed and implemented as Fortran subroutines, while less calculation-intensive components usually involved in building the user interface are written in Java. The two types of software modules have been glued together using the Java native interface (JNI). Our mixed-language PIC code was tested and its performance compared with pure Java and Fortran versions of the same algorithm on a Sun E6500 SMP system and a Linux cluster of Pentium III machines.
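
    The JNI glue pattern itself is not reproduced here; as a stand-in, the following Python/ctypes sketch shows the same mixed-language idea on Unix-like systems, with the C math library playing the role of the compiled number-crunching kernel. All roles are illustrative; the paper's actual code couples Java to Fortran subroutines through JNI.

```python
# Illustrative only: the C math library plays the compiled "hot kernel" and
# ctypes plays the role that JNI plays in the paper. Works on Unix-like
# systems where find_library("m") resolves.
import ctypes
import ctypes.util

libm = ctypes.CDLL(ctypes.util.find_library("m"))   # the compiled library
libm.cos.restype = ctypes.c_double                  # declare the foreign signature,
libm.cos.argtypes = [ctypes.c_double]               # much as a JNI native method is declared


def driver(n: int) -> float:
    """High-level orchestration stays in the scripting language; per-element work is compiled."""
    return sum(libm.cos(0.001 * i) for i in range(n))


if __name__ == "__main__":
    print(driver(10_000))
```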

  19. Implementation of the Principal Component Analysis onto High-Performance Computer Facilities for Hyperspectral Dimensionality Reduction: Results and Comparisons

    Directory of Open Access Journals (Sweden)

    Ernestina Martel

    2018-06-01

    Full Text Available Dimensionality reduction represents a critical preprocessing step in order to increase the efficiency and the performance of many hyperspectral imaging algorithms. However, dimensionality reduction algorithms, such as the Principal Component Analysis (PCA), suffer from their computationally demanding nature, which makes their implementation on high-performance computing architectures advisable for applications under strict latency constraints. This work presents the implementation of the PCA algorithm onto two different high-performance devices, namely, an NVIDIA Graphics Processing Unit (GPU) and a Kalray manycore, uncovering a highly valuable set of tips and tricks in order to take full advantage of the inherent parallelism of these high-performance computing platforms, and hence reducing the time required to process a given hyperspectral image. Moreover, the results obtained with different hyperspectral images have been compared with those obtained with a recently published field programmable gate array (FPGA)-based implementation of the PCA algorithm, providing, for the first time in the literature, a comprehensive analysis that highlights the pros and cons of each option.
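
    A plain NumPy reference of the band-reduction steps being accelerated (centering, covariance, eigendecomposition, projection) is sketched below on a synthetic cube. It is only a CPU baseline for orientation; the GPU, manycore, and FPGA versions compared in the paper parallelize these same stages.

```python
# CPU baseline in NumPy for the PCA band reduction; the shapes and the random
# cube are synthetic stand-ins for a real hyperspectral image.
import numpy as np


def pca_reduce(cube: np.ndarray, n_components: int) -> np.ndarray:
    """cube: (rows, cols, bands) hyperspectral image -> (rows, cols, n_components)."""
    rows, cols, bands = cube.shape
    X = cube.reshape(-1, bands).astype(np.float64)
    X -= X.mean(axis=0)                          # centre each spectral band
    cov = (X.T @ X) / (X.shape[0] - 1)           # bands x bands covariance matrix
    _, eigvecs = np.linalg.eigh(cov)             # eigenvectors, ascending eigenvalues
    top = eigvecs[:, ::-1][:, :n_components]     # leading principal directions
    return (X @ top).reshape(rows, cols, n_components)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    hyperspectral = rng.random((100, 100, 224))  # synthetic 224-band image
    print(pca_reduce(hyperspectral, 10).shape)   # (100, 100, 10)
```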

  20. High performance computing system in the framework of the Higgs boson studies

    CERN Document Server

    Belyaev, Nikita; The ATLAS collaboration

    2017-01-01

    The Higgs boson physics is one of the most important and promising fields of study in modern High Energy Physics. To perform precision measurements of the Higgs boson properties, the use of fast and efficient instruments of Monte Carlo event simulation is required. Due to the increasing amount of data and the growing complexity of the simulation software tools, the computing resources currently available for Monte Carlo simulation on the LHC GRID are not sufficient. One of the possibilities to address this shortfall of computing resources is the usage of institutes' computer clusters, commercial computing resources and supercomputers. In this paper, a brief description of the Higgs boson physics and of the Monte Carlo generation and event simulation techniques is presented. A description of modern high performance computing systems and tests of their performance are also discussed. These studies have been performed on the Worldwide LHC Computing Grid and the Kurchatov Institute Data Processing Center, including Tier...

  1. Contributing to the design of run-time systems dedicated to high performance computing

    International Nuclear Information System (INIS)

    Perache, M.

    2006-10-01

    In the field of intensive scientific computing, the quest for performance has to face the increasing complexity of parallel architectures. Nowadays, these machines exhibit a deep memory hierarchy which complicates the design of efficient parallel applications. This thesis proposes a programming environment allowing to design efficient parallel programs on top of clusters of multi-processors. It features a programming model centered around collective communications and synchronizations, and provides load balancing facilities. The programming interface, named MPC, provides high level paradigms which are optimized according to the underlying architecture. The environment is fully functional and used within the CEA/DAM (TERANOVA) computing center. The evaluations presented in this document confirm the relevance of our approach. (author)

  2. Scientific Grand Challenges: Forefront Questions in Nuclear Science and the Role of High Performance Computing

    International Nuclear Information System (INIS)

    Khaleel, Mohammad A.

    2009-01-01

    This report is an account of the deliberations and conclusions of the workshop on 'Forefront Questions in Nuclear Science and the Role of High Performance Computing' held January 26-28, 2009, co-sponsored by the U.S. Department of Energy (DOE) Office of Nuclear Physics (ONP) and the DOE Office of Advanced Scientific Computing (ASCR). Representatives from the national and international nuclear physics communities, as well as from the high performance computing community, participated. The purpose of this workshop was to (1) identify forefront scientific challenges in nuclear physics and then determine which, if any, of these could be aided by high performance computing at the extreme scale; (2) establish how and why new high performance computing capabilities could address issues at the frontiers of nuclear science; (3) provide nuclear physicists the opportunity to influence the development of high performance computing; and (4) provide the nuclear physics community with plans for development of future high performance computing capability by DOE ASCR.

  3. Scientific Grand Challenges: Forefront Questions in Nuclear Science and the Role of High Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Khaleel, Mohammad A.

    2009-10-01

    This report is an account of the deliberations and conclusions of the workshop on "Forefront Questions in Nuclear Science and the Role of High Performance Computing" held January 26-28, 2009, co-sponsored by the U.S. Department of Energy (DOE) Office of Nuclear Physics (ONP) and the DOE Office of Advanced Scientific Computing (ASCR). Representatives from the national and international nuclear physics communities, as well as from the high performance computing community, participated. The purpose of this workshop was to 1) identify forefront scientific challenges in nuclear physics and then determine which, if any, of these could be aided by high performance computing at the extreme scale; 2) establish how and why new high performance computing capabilities could address issues at the frontiers of nuclear science; 3) provide nuclear physicists the opportunity to influence the development of high performance computing; and 4) provide the nuclear physics community with plans for development of future high performance computing capability by DOE ASCR.

  4. A performance model for the communication in fast multipole methods on high-performance computing platforms

    KAUST Repository

    Ibeid, Huda

    2016-03-04

    Exascale systems are predicted to have approximately 1 billion cores, assuming gigahertz cores. Limitations on affordable network topologies for distributed memory systems of such massive scale bring new challenges to the currently dominant parallel programming model. Currently, there are many efforts to evaluate the hardware and software bottlenecks of exascale designs. It is therefore of interest to model application performance and to understand what changes need to be made to ensure extrapolated scalability. The fast multipole method (FMM) was originally developed for accelerating N-body problems in astrophysics and molecular dynamics but has recently been extended to a wider range of problems. Its high arithmetic intensity combined with its linear complexity and asynchronous communication patterns make it a promising algorithm for exascale systems. In this paper, we discuss the challenges for FMM on current parallel computers and future exascale architectures, with a focus on internode communication. We focus on the communication part only; the efficiency of the computational kernels is beyond the scope of the present study. We develop a performance model that considers the communication patterns of the FMM and observe a good match between our model and the actual communication time on four high-performance computing (HPC) systems, when latency, bandwidth, network topology, and multicore penalties are all taken into account. To our knowledge, this is the first formal characterization of internode communication in FMM that validates the model against actual measurements of communication time. The ultimate communication model is predictive in an absolute sense; however, on complex systems, this objective is often out of reach or of a difficulty out of proportion to its benefit when there exists a simpler model that is inexpensive and sufficient to guide coding decisions leading to improved scaling. The current model provides such guidance.
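
    In the simplest spirit of such models, communication time can be estimated with a latency-bandwidth (alpha-beta) term per message, as in the toy function below. The machine parameters and message counts are made-up placeholders; the published model is richer, folding in network topology and multicore penalties.

```python
# Toy alpha-beta estimate; the latency/bandwidth values and message counts are
# illustrative placeholders, not measured machine parameters.
def comm_time(n_messages: int, total_bytes: float,
              latency_s: float = 1.0e-6, bandwidth_bytes_per_s: float = 10.0e9) -> float:
    """Hockney-style model: every message pays the latency, every byte pays the bandwidth."""
    return n_messages * latency_s + total_bytes / bandwidth_bytes_per_s


if __name__ == "__main__":
    # hypothetical M2L-like exchange: 512 local boxes, 189 neighbour interactions each,
    # 8 kB of multipole coefficients per message (all numbers made up)
    msgs = 512 * 189
    print(f"estimated communication time: {comm_time(msgs, msgs * 8192):.4f} s")
```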

  5. Simple, parallel, high-performance virtual machines for extreme computations

    International Nuclear Information System (INIS)

    Chokoufe Nejad, Bijan; Ohl, Thorsten; Reuter, Jurgen

    2014-11-01

    We introduce a high-performance virtual machine (VM) written in a numerically fast language like Fortran or C to evaluate very large expressions. We discuss the general concept of how to perform computations in terms of a VM and present specifically a VM that is able to compute tree-level cross sections for any number of external legs, given the corresponding byte code from the optimal matrix element generator, O'Mega. Furthermore, this approach allows us to formulate the parallel computation of a single phase space point in a simple and obvious way. We analyze the scaling behaviour with multiple threads as well as the benefits and drawbacks that are introduced with this method. Our implementation of a VM can run faster than the corresponding native, compiled code for certain processes and compilers, especially for very high multiplicities, and in general has runtimes of the same order of magnitude. By avoiding the tedious compile and link steps, which may fail for source code files of gigabyte sizes, new processes or complex higher order corrections that are currently out of reach could be evaluated with a VM given enough computing power.
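
    The core idea, evaluating a pre-compiled byte-code stream on a stack machine instead of compiling and linking huge source files, can be illustrated with a toy interpreter. The instruction set below is invented for the example and has nothing to do with O'Mega's actual byte code or the Fortran/C virtual machine of the paper.

```python
# Toy stack-based byte-code interpreter; the instruction set is invented for
# this example and is unrelated to O'Mega's byte code.
import math
import operator

BINOPS = {"ADD": operator.add, "SUB": operator.sub, "MUL": operator.mul, "DIV": operator.truediv}


def run(bytecode, env):
    """Evaluate a list of (opcode, arg) instructions on a value stack."""
    stack = []
    for op, arg in bytecode:
        if op == "PUSH":
            stack.append(arg)                 # literal constant
        elif op == "LOAD":
            stack.append(env[arg])            # named input, e.g. a momentum component
        elif op in BINOPS:
            b, a = stack.pop(), stack.pop()
            stack.append(BINOPS[op](a, b))
        elif op == "SIN":
            stack.append(math.sin(stack.pop()))
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack.pop()


if __name__ == "__main__":
    # byte code for sin(x) * (x + 2.0), compiled once and evaluated for many points
    code = [("LOAD", "x"), ("SIN", None), ("LOAD", "x"),
            ("PUSH", 2.0), ("ADD", None), ("MUL", None)]
    for x in (0.1, 0.5, 1.0):
        print(x, run(code, {"x": x}))
```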

  6. Computer algebra applications

    International Nuclear Information System (INIS)

    Calmet, J.

    1982-01-01

    A survey of applications based either on fundamental algorithms in computer algebra or on the use of a computer algebra system is presented. Recent work in biology, chemistry, physics, mathematics and computer science is discussed. In particular, applications in high energy physics (quantum electrodynamics), celestial mechanics and general relativity are reviewed. (Auth.)

  7. A Secure Web Application Providing Public Access to High-Performance Data Intensive Scientific Resources - ScalaBLAST Web Application

    International Nuclear Information System (INIS)

    Curtis, Darren S.; Peterson, Elena S.; Oehmen, Chris S.

    2008-01-01

    This work presents the ScalaBLAST Web Application (SWA), a web based application implemented using the PHP script language, MySQL DBMS, and Apache web server under a GNU/Linux platform. SWA is an application built as part of the Data Intensive Computer for Complex Biological Systems (DICCBS) project at the Pacific Northwest National Laboratory (PNNL). SWA delivers accelerated throughput of bioinformatics analysis via high-performance computing through a convenient, easy-to-use web interface. This approach greatly enhances emerging fields of study in biology such as ontology-based homology, and multiple whole genome comparisons which, in the absence of a tool like SWA, require a heroic effort to overcome the computational bottleneck associated with genome analysis. The current version of SWA includes a user account management system, a web based user interface, and a backend process that generates the files necessary for the Internet scientific community to submit a ScalaBLAST parallel processing job on a dedicated cluster

  8. FY 1992 Blue Book: Grand Challenges: High Performance Computing and Communications

    Data.gov (United States)

    Networking and Information Technology Research and Development, Executive Office of the President — High performance computing and computer communications networks are becoming increasingly important to scientific advancement, economic competition, and national...

  9. High-Performance Operating Systems

    DEFF Research Database (Denmark)

    Sharp, Robin

    1999-01-01

    Notes prepared for the DTU course 49421 "High Performance Operating Systems". The notes deal with quantitative and qualitative techniques for use in the design and evaluation of operating systems in computer systems for which performance is an important parameter, such as real-time applications, communication systems and multimedia systems.

  10. Computational Fluid Dynamics (CFD) Computations With Zonal Navier-Stokes Flow Solver (ZNSFLOW) Common High Performance Computing Scalable Software Initiative (CHSSI) Software

    National Research Council Canada - National Science Library

    Edge, Harris

    1999-01-01

    ...), computational fluid dynamics (CFD) 6 project. Under the project, a proven zonal Navier-Stokes solver was rewritten for scalable parallel performance on both shared memory and distributed memory high performance computers...

  11. Component-based software for high-performance scientific computing

    Energy Technology Data Exchange (ETDEWEB)

    Alexeev, Yuri; Allan, Benjamin A; Armstrong, Robert C; Bernholdt, David E; Dahlgren, Tamara L; Gannon, Dennis; Janssen, Curtis L; Kenny, Joseph P; Krishnan, Manojkumar; Kohl, James A; Kumfert, Gary; McInnes, Lois Curfman; Nieplocha, Jarek; Parker, Steven G; Rasmussen, Craig; Windus, Theresa L

    2005-01-01

    Recent advances in both computational hardware and multidisciplinary science have given rise to an unprecedented level of complexity in scientific simulation software. This paper describes an ongoing grass roots effort aimed at addressing complexity in high-performance computing through the use of Component-Based Software Engineering (CBSE). Highlights of the benefits and accomplishments of the Common Component Architecture (CCA) Forum and SciDAC ISIC are given, followed by an illustrative example of how the CCA has been applied to drive scientific discovery in quantum chemistry. Thrusts for future research are also described briefly.

  12. Component-based software for high-performance scientific computing

    International Nuclear Information System (INIS)

    Alexeev, Yuri; Allan, Benjamin A; Armstrong, Robert C; Bernholdt, David E; Dahlgren, Tamara L; Gannon, Dennis; Janssen, Curtis L; Kenny, Joseph P; Krishnan, Manojkumar; Kohl, James A; Kumfert, Gary; McInnes, Lois Curfman; Nieplocha, Jarek; Parker, Steven G; Rasmussen, Craig; Windus, Theresa L

    2005-01-01

    Recent advances in both computational hardware and multidisciplinary science have given rise to an unprecedented level of complexity in scientific simulation software. This paper describes an ongoing grass roots effort aimed at addressing complexity in high-performance computing through the use of Component-Based Software Engineering (CBSE). Highlights of the benefits and accomplishments of the Common Component Architecture (CCA) Forum and SciDAC ISIC are given, followed by an illustrative example of how the CCA has been applied to drive scientific discovery in quantum chemistry. Thrusts for future research are also described briefly

  13. High-performance computing on GPUs for resistivity logging of oil and gas wells

    Science.gov (United States)

    Glinskikh, V.; Dudaev, A.; Nechaev, O.; Surodina, I.

    2017-10-01

    We developed and implemented into software an algorithm for high-performance simulation of electrical logs from oil and gas wells using high-performance heterogeneous computing. The numerical solution of the 2D forward problem is based on the finite-element method and the Cholesky decomposition for solving a system of linear algebraic equations (SLAE). Software implementations of the algorithm were made using NVIDIA CUDA technology and computing libraries, allowing us to perform the decomposition of the SLAE and find its solution on the central processing unit (CPU) and the graphics processing unit (GPU). The calculation time is analyzed depending on the matrix size and the number of its non-zero elements. We estimated the computing speed on the CPU and GPU, including high-performance heterogeneous CPU-GPU computing. Using the developed algorithm, we simulated resistivity data in realistic models.
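
    The computational core described above is the solution of A x = b with a symmetric positive-definite finite-element matrix via Cholesky factorization. A small dense CPU reference in SciPy is sketched below for orientation; the paper's implementation works on sparse systems and offloads the factorization and triangular solves to the GPU with CUDA libraries, which is not shown here.

```python
# Dense CPU sketch of the Cholesky-based solve; the real FEM system is sparse
# and the paper performs these steps on the GPU.
import numpy as np
from scipy.linalg import cho_factor, cho_solve


def solve_spd(A: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Solve A x = b for symmetric positive-definite A via A = L L^T."""
    factor = cho_factor(A)          # the expensive factorisation step
    return cho_solve(factor, b)     # two cheap triangular solves


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    M = rng.standard_normal((500, 500))
    A = M @ M.T + 500.0 * np.eye(500)   # synthetic SPD stand-in for the FEM matrix
    b = rng.standard_normal(500)
    x = solve_spd(A, b)
    print(np.allclose(A @ x, b))        # True
```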

  14. High performance computing and communications: FY 1995 implementation plan

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1994-04-01

    The High Performance Computing and Communications (HPCC) Program was formally established following passage of the High Performance Computing Act of 1991 signed on December 9, 1991. Ten federal agencies in collaboration with scientists and managers from US industry, universities, and laboratories have developed the HPCC Program to meet the challenges of advancing computing and associated communications technologies and practices. This plan provides a detailed description of the agencies' HPCC implementation plans for FY 1994 and FY 1995. This Implementation Plan contains three additional sections. Section 3 provides an overview of the HPCC Program definition and organization. Section 4 contains a breakdown of the five major components of the HPCC Program, with an emphasis on the overall directions and milestones planned for each one. Section 5 provides a detailed look at HPCC Program activities within each agency. Although the Department of Education is an official HPCC agency, its current funding and reporting of crosscut activities goes through the Committee on Education and Health Resources, not the HPCC Program. For this reason the Implementation Plan covers nine HPCC agencies.

  15. Interactive Data Exploration for High-Performance Fluid Flow Computations through Porous Media

    KAUST Repository

    Perovic, Nevena

    2014-09-01

    © 2014 IEEE. The huge data volumes produced by high-performance computing (HPC) applications such as fluid flow simulations usually hinder the interactive processing and exploration of simulation results. Such interactive data exploration not only allows scientists to 'play' with their data but also to visualise huge (distributed) data sets in both an efficient and easy way. Therefore, we propose an HPC data exploration service based on a sliding window concept that enables researchers to access remote data (available on a supercomputer or cluster) during simulation runtime without exceeding any bandwidth limitations between the HPC back-end and the user front-end.
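
    A minimal sliding-window reader conveys the concept: only the currently requested window of a large result set is materialised on the front-end. Here numpy.memmap over a local file stands in for the remote HPC back-end, so the bandwidth management of the proposed service itself is not modelled.

```python
# numpy.memmap over a local file stands in for the remote back-end; only the
# requested window is copied out.
import numpy as np

SHAPE = (4096, 4096)                                 # size of the "remote" result field


def make_dataset(path: str) -> None:
    """Write a large array to disk to play the role of remote simulation output."""
    data = np.memmap(path, dtype=np.float32, mode="w+", shape=SHAPE)
    data[:] = np.add.outer(np.arange(SHAPE[0]), np.arange(SHAPE[1]))
    data.flush()


def read_window(path: str, row0: int, col0: int, height: int, width: int) -> np.ndarray:
    """Fetch just one window; a viewer slides this as the user pans or zooms."""
    data = np.memmap(path, dtype=np.float32, mode="r", shape=SHAPE)
    return np.array(data[row0:row0 + height, col0:col0 + width])   # copy only the window


if __name__ == "__main__":
    make_dataset("/tmp/field.dat")
    window = read_window("/tmp/field.dat", 1000, 2000, 256, 256)
    print(window.shape, window[0, 0])    # (256, 256) 3000.0
```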

  16. Implementing Molecular Dynamics for Hybrid High Performance Computers - 1. Short Range Forces

    International Nuclear Information System (INIS)

    Brown, W. Michael; Wang, Peng; Plimpton, Steven J.; Tharrington, Arnold N.

    2011-01-01

    The use of accelerators such as general-purpose graphics processing units (GPGPUs) has become popular in scientific computing applications due to their low cost, impressive floating-point capabilities, high memory bandwidth, and low electrical power requirements. Hybrid high performance computers, machines with more than one type of floating-point processor, are now becoming more prevalent due to these advantages. In this work, we discuss several important issues in porting a large molecular dynamics code for use on parallel hybrid machines - (1) choosing a hybrid parallel decomposition that works on central processing units (CPUs) with distributed memory and accelerator cores with shared memory, (2) minimizing the amount of code that must be ported for efficient acceleration, (3) utilizing the available processing power from both many-core CPUs and accelerators, and (4) choosing a programming model for acceleration. We present our solution to each of these issues for short-range force calculation in the molecular dynamics package LAMMPS. We describe algorithms for efficient short range force calculation on hybrid high performance machines. We describe a new approach for dynamic load balancing of work between CPU and accelerator cores. We describe the Geryon library that allows a single code to compile with both CUDA and OpenCL for use on a variety of accelerators. Finally, we present results on a parallel test cluster containing 32 Fermi GPGPUs and 180 CPU cores.
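
    The dynamic load-balancing idea, shifting work between CPU cores and the accelerator so that both finish a timestep at roughly the same moment, can be sketched with a simple rate-based update rule. The rule and the simulated timings below are illustrative only and are not the scheme implemented in LAMMPS.

```python
# Illustrative rate-based rebalancing rule with simulated timings; not the
# scheme implemented in LAMMPS.
def rebalance(f_gpu: float, t_gpu: float, t_cpu: float, damping: float = 0.5) -> float:
    """Update the accelerator's share of the work from the previous step's timings."""
    if t_gpu <= 0.0 or t_cpu <= 0.0:
        return f_gpu
    # the share at which both devices would have finished simultaneously
    balanced = f_gpu * t_cpu / (f_gpu * t_cpu + (1.0 - f_gpu) * t_gpu)
    return (1.0 - damping) * f_gpu + damping * balanced   # damped update


if __name__ == "__main__":
    gpu_rate, cpu_rate = 5.0, 1.0          # pretend the accelerator is 5x faster (made up)
    f = 0.5                                # start with an even split
    for step in range(6):
        t_gpu, t_cpu = f / gpu_rate, (1.0 - f) / cpu_rate   # simulated kernel times
        f = rebalance(f, t_gpu, t_cpu)
        print(f"step {step}: accelerator fraction = {f:.3f}")
    # converges toward 5/6 ~ 0.833, where both sides finish together
```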

  17. Computational Environments and Analysis methods available on the NCI High Performance Computing (HPC) and High Performance Data (HPD) Platform

    Science.gov (United States)

    Evans, B. J. K.; Foster, C.; Minchin, S. A.; Pugh, T.; Lewis, A.; Wyborn, L. A.; Evans, B. J.; Uhlherr, A.

    2014-12-01

    The National Computational Infrastructure (NCI) has established a powerful in-situ computational environment to enable both high performance computing and data-intensive science across a wide spectrum of national environmental data collections - in particular climate, observational data and geoscientific assets. This paper examines 1) the computational environments that support the modelling and data processing pipelines, 2) the analysis environments and methods to support data analysis, and 3) the progress in addressing harmonisation of the underlying data collections for future transdisciplinary research that enables accurate climate projections. NCI makes available more than 10 PB of major data collections from both the government and research sectors based on six themes: 1) weather, climate, and earth system science model simulations, 2) marine and earth observations, 3) geosciences, 4) terrestrial ecosystems, 5) water and hydrology, and 6) astronomy, social and biosciences. Collectively they span the lithosphere, crust, biosphere, hydrosphere, troposphere, and stratosphere. The data is largely sourced from NCI's partners (which include the custodians of many of the national scientific records), major research communities, and collaborating overseas organisations. The data is accessible within an integrated HPC-HPD environment - a 1.2 PFlop supercomputer (Raijin), an HPC-class 3000-core OpenStack cloud system and several highly connected large-scale and high-bandwidth Lustre filesystems. This computational environment supports a catalogue of integrated reusable software and workflows from earth system and ecosystem modelling, weather research, satellite and other observed data processing and analysis. To enable transdisciplinary research on this scale, data needs to be harmonised so that researchers can readily apply techniques and software across the corpus of data available and not be constrained to work within artificial disciplinary boundaries. Future challenges will

  18. High performance computing network for cloud environment using simulators

    OpenAIRE

    Singh, N. Ajith; Hemalatha, M.

    2012-01-01

    Cloud computing is the next generation of computing. Adopting cloud computing is like signing up for a new kind of website: the GUI that controls the cloud directly controls the hardware resources and your application. The difficult part of cloud computing is deploying it in a real environment. It is difficult to know the exact cost and resource requirements until the service is actually bought, and likewise whether it will support the existing applications that are available on traditional...

  19. High performance computer code for molecular dynamics simulations

    International Nuclear Information System (INIS)

    Levay, I.; Toekesi, K.

    2007-01-01

    Complete text of publication follows. Molecular Dynamics (MD) simulation is a widely used technique for modeling complicated physical phenomena. Since 2005 we have been developing an MD simulation code for PC computers. The computer code is written in the C++ object-oriented programming language. The aim of our work is twofold: a) to develop a fast computer code for the study of the random walk of guest atoms in a Be crystal, and b) three-dimensional (3D) visualization of the particles' motion. In this case we mimic the motion of the guest atoms in the crystal (diffusion-type motion) and the motion of atoms in the crystal lattice (crystal deformation). Nowadays it is common to use graphics devices for computationally intensive problems. There are several ways to use this extreme processing performance, but programming these devices has never been as easy as it is now. The CUDA (Compute Unified Device Architecture) introduced by NVIDIA in 2007 is very useful for every processor-hungry application. A unified-architecture GPU includes 96-128 or more stream processors, so the raw calculation performance is 576 GFLOPS, ten times faster than the fastest dual-core CPU [Fig. 1]. Our improved MD simulation software uses this new technology, which speeds it up: the code runs 10 times faster in the critical calculation segment. Although the GPU is a very powerful tool, it has a highly parallel structure, which means that we have to create an algorithm that works on several processors without deadlock. Our code currently uses 256 threads and shared and constant on-chip memory instead of global memory, which is roughly 100 times slower. It is possible to implement the entire algorithm on the GPU, so we do not need to download and upload the data in every iteration. For maximal throughput, every thread runs the same instructions

  20. High performance graphics processors for medical imaging applications

    International Nuclear Information System (INIS)

    Goldwasser, S.M.; Reynolds, R.A.; Talton, D.A.; Walsh, E.S.

    1989-01-01

    This paper describes a family of high-performance graphics processors with special hardware for interactive visualization of 3D human anatomy. The basic architecture expands to multiple parallel processors, each processor using pipelined arithmetic and logical units for high-speed rendering of Computed Tomography (CT), Magnetic Resonance (MR) and Positron Emission Tomography (PET) data. User-selectable display alternatives include multiple 2D axial slices, reformatted images in sagittal or coronal planes and shaded 3D views. Special facilities support applications requiring color-coded display of multiple datasets (such as radiation therapy planning), or dynamic replay of time-varying volumetric data (such as cine-CT or gated MR studies of the beating heart). The current implementation is a single processor system which generates reformatted images in true real time (30 frames per second), and shaded 3D views in a few seconds per frame. It accepts full scale medical datasets in their native formats, so that minimal preprocessing delay exists between data acquisition and display.

  1. Unravelling the structure of matter on high-performance computers

    International Nuclear Information System (INIS)

    Kieu, T.D.; McKellar, B.H.J.

    1992-11-01

    The various phenomena and the different forms of matter in nature are believed to be the manifestation of only a handful of fundamental building blocks - the elementary particles - which interact through the four fundamental forces. In the study of the structure of matter at this level one has to consider forces which are not sufficiently weak to be treated as small perturbations to the system, an example of which is the strong force that binds the nucleons together. High-performance computers, both vector and parallel machines, have facilitated the necessary non-perturbative treatments. The principles and techniques of computer simulations applied to Quantum Chromodynamics are explained; examples include the strong interactions, the calculation of the mass of nucleons and their decay rates. Some commercial and special-purpose high-performance machines for such calculations are also mentioned. 3 refs., 2 tabs

  2. Peer-to-peer computing for secure high performance data copying

    International Nuclear Information System (INIS)

    Hanushevsky, A.; Trunov, A.; Cottrell, L.

    2001-01-01

    The BaBar Copy Program (bbcp) is an excellent representative of peer-to-peer (P2P) computing. It is also a pioneering application of its type in the P2P arena. Built upon the foundation of its predecessor, Secure Fast Copy (sfcp), bbcp incorporates significant improvements in performance and usability. As with sfcp, bbcp uses ssh for authentication, providing an elegant and simple working model: if you can ssh to a location, you can copy files to or from that location. To fully support this notion, bbcp transparently supports 3rd party copy operations. The program also incorporates several mechanisms to deal with firewall security, the bane of P2P computing. To achieve high performance in a wide area network, bbcp allows a user to independently specify the number of parallel network streams, TCP window size, and the file I/O blocking factor. Using these parameters, data is pipelined from source to target to provide a uniform traffic pattern that maximizes router efficiency. For improved recoverability, bbcp also keeps track of copy operations so that an operation can be restarted from the point of failure at a later time, minimizing the amount of network traffic in the event of a copy failure. Here, the authors present the bbcp architecture, its various features, and the reasons for their inclusion
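
    The tunables mentioned above (number of parallel streams and the I/O blocking factor) can be illustrated with a chunked local copy driven by a thread pool, as sketched below. This is only an analogue of bbcp's pipelined wide-area transfer, not its protocol; the parameter names mirror its knobs in spirit only.

```python
# Chunked local copy with a thread pool; "streams" and "block_size" mirror
# bbcp's tunables in spirit only, and no network protocol is involved.
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def copy_chunk(src: str, dst: str, offset: int, length: int) -> None:
    with open(src, "rb") as fin, open(dst, "r+b") as fout:
        fin.seek(offset)
        fout.seek(offset)
        fout.write(fin.read(length))


def parallel_copy(src: str, dst: str, streams: int = 4, block_size: int = 1 << 20) -> None:
    size = os.path.getsize(src)
    with open(dst, "wb") as f:
        f.truncate(size)                                  # pre-size the target file
    with ThreadPoolExecutor(max_workers=streams) as pool:
        for off in range(0, size, block_size):            # one task per block
            pool.submit(copy_chunk, src, dst, off, min(block_size, size - off))


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as src:
        src.write(os.urandom(4 << 20))                    # 4 MiB of test data
    dst = src.name + ".copy"
    parallel_copy(src.name, dst)
    print(os.path.getsize(dst) == os.path.getsize(src.name))   # True
```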

  3. Peer-to-Peer Computing for Secure High Performance Data Copying

    International Nuclear Information System (INIS)

    2002-01-01

    The BaBar Copy Program (bbcp) is an excellent representative of peer-to-peer (P2P) computing. It is also a pioneering application of its type in the P2P arena. Built upon the foundation of its predecessor, Secure Fast Copy (sfcp), bbcp incorporates significant improvements in performance and usability. As with sfcp, bbcp uses ssh for authentication, providing an elegant and simple working model: if you can ssh to a location, you can copy files to or from that location. To fully support this notion, bbcp transparently supports 3rd party copy operations. The program also incorporates several mechanisms to deal with firewall security, the bane of P2P computing. To achieve high performance in a wide area network, bbcp allows a user to independently specify the number of parallel network streams, TCP window size, and the file I/O blocking factor. Using these parameters, data is pipelined from source to target to provide a uniform traffic pattern that maximizes router efficiency. For improved recoverability, bbcp also keeps track of copy operations so that an operation can be restarted from the point of failure at a later time, minimizing the amount of network traffic in the event of a copy failure. Here, we present the bbcp architecture, its various features, and the reasons for their inclusion

  4. SCinet Architecture: Featured at the International Conference for High Performance Computing, Networking, Storage and Analysis 2016

    Energy Technology Data Exchange (ETDEWEB)

    Lyonnais, Marc; Smith, Matt; Mace, Kate P.

    2017-02-06

    SCinet is the purpose-built network that operates during the International Conference for High Performance Computing, Networking, Storage and Analysis (Supercomputing, or SC). Created each year for the conference, SCinet brings to life a high-capacity network that supports the applications and experiments that are a hallmark of the SC conference. The network links the convention center to research and commercial networks around the world. This resource serves as a platform for exhibitors to demonstrate the advanced computing resources of their home institutions and elsewhere by supporting a wide variety of applications. Volunteers from academia, government and industry work together to design and deliver the SCinet infrastructure. Industry vendors and carriers donate millions of dollars in equipment and services needed to build and support the local and wide area networks. Planning begins more than a year in advance of each SC conference and culminates in a high-intensity installation in the days leading up to the conference. The SCinet architecture for SC16 illustrates a dramatic increase in participation from the vendor community, particularly those that focus on network equipment. Software-Defined Networking (SDN) and Data Center Networking (DCN) are present in nearly all aspects of the design.

  5. High performance graphics processor based computed tomography reconstruction algorithms for nuclear and other large scale applications.

    Energy Technology Data Exchange (ETDEWEB)

    Jimenez, Edward S. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Orr, Laurel J. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Thompson, Kyle R. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2013-09-01

    The goal of this work is to develop a fast computed tomography (CT) reconstruction algorithm based on graphics processing units (GPU) that achieves significant improvement over traditional central processing unit (CPU) based implementations. The main challenge in developing a CT algorithm that is capable of handling very large datasets is parallelizing the algorithm in such a way that data transfer does not hinder performance of the reconstruction algorithm. General-purpose computing on graphics processing units (GPGPU) is a new technology that the Science and Technology (S&T) community is starting to adopt in many fields where CPU-based computing is the norm. GPGPU programming requires a new approach to algorithm development that utilizes massively multi-threaded environments. Multi-threaded algorithms in general are difficult to optimize, since performance bottlenecks arise, such as memory latencies, that are non-existent in single-threaded algorithms. If an efficient GPU-based CT reconstruction algorithm can be developed, computational times could be improved by a factor of 20. Additionally, cost benefits will be realized as commodity graphics hardware could potentially replace expensive supercomputers and high-end workstations. This project will take advantage of the CUDA programming environment and attempt to parallelize the task in such a way that multiple slices of the reconstruction volume are computed simultaneously. This work will also take advantage of the GPU memory by utilizing asynchronous memory transfers, GPU texture memory, and (when possible) pinned host memory so that the memory transfer bottleneck inherent to GPGPU is amortized. Additionally, this work will take advantage of GPU-specific hardware (i.e., fast texture memory, pixel-pipelines, hardware interpolators, and varying memory hierarchy) that will allow for additional performance improvements.
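
    As a hedged illustration of the memory-transfer techniques mentioned above (pinned host memory, asynchronous copies, and per-slice concurrency), the sketch below uses the Numba CUDA bindings for Python rather than the CUDA C environment of the project, and its kernel is a trivial placeholder standing in for a real reconstruction step. It requires a CUDA-capable GPU to run, and all sizes are illustrative.

        # Sketch (not the project's code): overlap host-device transfers with per-slice
        # kernels using pinned host memory and CUDA streams, via Numba's CUDA bindings.
        import numpy as np
        from numba import cuda

        @cuda.jit
        def scale_slice(sl, factor):
            # Placeholder kernel standing in for a back-projection update of one slice.
            i, j = cuda.grid(2)
            if i < sl.shape[0] and j < sl.shape[1]:
                sl[i, j] *= factor

        def process_volume(n_slices=8, ny=512, nx=512):
            streams = [cuda.stream() for _ in range(n_slices)]
            # Pinned (page-locked) host buffers allow truly asynchronous copies.
            host_slices = [cuda.pinned_array((ny, nx), dtype=np.float32) for _ in range(n_slices)]
            for h in host_slices:
                h[:] = 1.0

            threads = (16, 16)
            blocks = ((ny + 15) // 16, (nx + 15) // 16)

            for h, s in zip(host_slices, streams):
                d = cuda.to_device(h, stream=s)              # async H2D copy on this stream
                scale_slice[blocks, threads, s](d, 2.0)      # kernel queued on the same stream
                d.copy_to_host(h, stream=s)                  # async D2H copy

            for s in streams:
                s.synchronize()
            return host_slices

        if __name__ == "__main__":
            out = process_volume()
            print(out[0][0, 0])   # expect 2.0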

  6. Towards Portable Large-Scale Image Processing with High-Performance Computing.

    Science.gov (United States)

    Huo, Yuankai; Blaber, Justin; Damon, Stephen M; Boyd, Brian D; Bao, Shunxing; Parvathaneni, Prasanna; Noguera, Camilo Bermudez; Chaganti, Shikha; Nath, Vishwesh; Greer, Jasmine M; Lyu, Ilwoo; French, William R; Newton, Allen T; Rogers, Baxter P; Landman, Bennett A

    2018-05-03

    High-throughput, large-scale medical image computing demands tight integration of high-performance computing (HPC) infrastructure for data storage, job distribution, and image processing. The Vanderbilt University Institute for Imaging Science (VUIIS) Center for Computational Imaging (CCI) has constructed a large-scale image storage and processing infrastructure that is composed of (1) a large-scale image database using the eXtensible Neuroimaging Archive Toolkit (XNAT), (2) a content-aware job scheduling platform using the Distributed Automation for XNAT pipeline automation tool (DAX), and (3) a wide variety of encapsulated image processing pipelines called "spiders." The VUIIS CCI medical image data storage and processing infrastructure has housed and processed nearly half a million medical image volumes with the Vanderbilt Advanced Computing Center for Research and Education (ACCRE), the HPC facility at Vanderbilt University. The initial deployment was natively deployed (i.e., direct installations on a bare-metal server) within the ACCRE hardware and software environments, which led to issues of portability and sustainability. First, it could be laborious to deploy the entire VUIIS CCI medical image data storage and processing infrastructure to another HPC center with varying hardware infrastructure, library availability, and software permission policies. Second, the spiders were not developed in an isolated manner, which has led to software dependency issues during system upgrades or remote software installation. To address such issues, herein, we describe recent innovations using containerization techniques with XNAT/DAX which are used to isolate the VUIIS CCI medical image data storage and processing infrastructure from the underlying hardware and software environments. The newly presented XNAT/DAX solution has the following new features: (1) multi-level portability from system level to the application level, (2) flexible and dynamic software

  7. Performance of particle in cell methods on highly concurrent computational architectures

    International Nuclear Information System (INIS)

    Adams, M.F.; Ethier, S.; Wichmann, N.

    2009-01-01

    Particle in cell (PIC) methods are effective in computing the Vlasov-Poisson system of equations used in simulations of magnetic fusion plasmas. PIC methods use grid-based computations, for solving Poisson's equation or more generally Maxwell's equations, as well as Monte-Carlo type methods to sample the Vlasov equation. The presence of two types of discretizations, deterministic field solves and Monte-Carlo methods for the Vlasov equation, poses challenges in understanding and optimizing performance on today's large-scale computers, which require high levels of concurrency. These challenges arise from the need to optimize two very different types of processes and the interactions between them. Modern cache-based high-end computers have very deep memory hierarchies and high degrees of concurrency which must be utilized effectively to achieve good performance. The effective use of these machines requires maximizing concurrency by eliminating serial or redundant work and minimizing global communication. A related issue is minimizing the memory traffic between levels of the memory hierarchy because performance is often limited by the bandwidths and latencies of the memory system. This paper discusses some of the performance issues, particularly in regard to parallelism, of PIC methods. The gyrokinetic toroidal code (GTC) is used for these studies and a new radial grid decomposition is presented and evaluated. Scaling of the code is demonstrated on ITER-sized plasmas with up to 16K Cray XT3/4 cores.
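
    The interplay of grid-based field solves and particle sampling described above can be illustrated with a minimal, serial 1-D electrostatic PIC sketch in Python (this is not the GTC code; the parameters are arbitrary and deposition uses the simplest nearest-grid-point scheme).

        # Minimal serial 1-D electrostatic PIC sketch (not GTC): particles sample the
        # Vlasov equation while Poisson's equation is solved on a grid each step.
        import numpy as np

        def pic_step(x, v, q_over_m, grid_n, length, dt):
            dx = length / grid_n

            # 1) Charge deposition (nearest grid point), the particle-to-grid phase.
            idx = np.floor(x / dx + 0.5).astype(int) % grid_n
            rho = np.zeros(grid_n)
            np.add.at(rho, idx, 1.0)
            rho = rho / rho.mean() - 1.0          # normalized deviation from neutrality

            # 2) Field solve: d2(phi)/dx2 = -rho via FFT on the periodic grid.
            k = 2.0 * np.pi * np.fft.fftfreq(grid_n, d=dx)
            rho_k = np.fft.fft(rho)
            phi_k = np.zeros_like(rho_k)
            phi_k[1:] = rho_k[1:] / (k[1:] ** 2)
            E = -np.real(np.fft.ifft(1j * k * phi_k))   # E = -d(phi)/dx

            # 3) Gather the field back to the particles and push them.
            v = v + q_over_m * E[idx] * dt
            x = (x + v * dt) % length
            return x, v

        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            n_part, grid_n, length, dt = 10000, 64, 2 * np.pi, 0.05
            x = rng.uniform(0.0, length, n_part)
            v = rng.normal(0.0, 1.0, n_part)
            for _ in range(100):
                x, v = pic_step(x, v, -1.0, grid_n, length, dt)
            print("mean kinetic energy:", 0.5 * np.mean(v ** 2))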

  8. Performance of particle in cell methods on highly concurrent computational architectures

    International Nuclear Information System (INIS)

    Adams, M F; Ethier, S; Wichmann, N

    2007-01-01

    Particle in cell (PIC) methods are effective in computing the Vlasov-Poisson system of equations used in simulations of magnetic fusion plasmas. PIC methods use grid-based computations, for solving Poisson's equation or more generally Maxwell's equations, as well as Monte-Carlo type methods to sample the Vlasov equation. The presence of two types of discretizations, deterministic field solves and Monte-Carlo methods for the Vlasov equation, poses challenges in understanding and optimizing performance on today's large-scale computers, which require high levels of concurrency. These challenges arise from the need to optimize two very different types of processes and the interactions between them. Modern cache-based high-end computers have very deep memory hierarchies and high degrees of concurrency which must be utilized effectively to achieve good performance. The effective use of these machines requires maximizing concurrency by eliminating serial or redundant work and minimizing global communication. A related issue is minimizing the memory traffic between levels of the memory hierarchy because performance is often limited by the bandwidths and latencies of the memory system. This paper discusses some of the performance issues, particularly in regard to parallelism, of PIC methods. The gyrokinetic toroidal code (GTC) is used for these studies and a new radial grid decomposition is presented and evaluated. Scaling of the code is demonstrated on ITER-sized plasmas with up to 16K Cray XT3/4 cores.

  9. High performance computing system in the framework of the Higgs boson studies

    CERN Document Server

    Belyaev, Nikita; The ATLAS collaboration; Velikhov, Vasily; Konoplich, Rostislav

    2017-01-01

    Higgs boson physics is one of the most important and promising fields of study in modern high energy physics. It is important to note that Grid computing resources are becoming strictly limited due to the increasing amount of statistics required for physics analyses and the unprecedented LHC performance. One possibility to address the shortfall of computing resources is the usage of computer institutes' clusters, commercial computing resources and supercomputers. To perform precision measurements of the Higgs boson properties under these conditions, it is also essential to have effective instruments to simulate kinematic distributions of signal events. In this talk we give a brief description of the modern distribution reconstruction method called Morphing and perform a few efficiency tests to demonstrate its potential. These studies have been performed on the WLCG and the Kurchatov Institute’s Data Processing Center, including a Tier-1 GRID site and a supercomputer as well. We also analyze the CPU efficienc...

  10. Service Oriented Architecture for High Level Applications

    International Nuclear Information System (INIS)

    Chu, P.

    2012-01-01

    Standalone high level applications often suffer from poor performance and reliability due to lengthy initialization, heavy computation and rapid graphical update. Service-oriented architecture (SOA) aims to separate the initialization and computation from the applications and to distribute such work to various service providers. Heavy computation, such as beam tracking, will be done periodically on a dedicated server, and data will be available to client applications at all times. Industry-standard service architecture can help to improve the performance, reliability and maintainability of the service. Robustness will also be improved by reducing the complexity of individual client applications.

  11. Technologies and tools for high-performance distributed computing. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Karonis, Nicholas T.

    2000-05-01

    In this project we studied the practical use of the MPI message-passing interface in advanced distributed computing environments. We built on the existing software infrastructure provided by the Globus Toolkit™, the MPICH portable implementation of MPI, and the MPICH-G integration of MPICH with Globus. As a result of this project we have replaced MPICH-G with its successor MPICH-G2, which is also an integration of MPICH with Globus. MPICH-G2 delivers significant improvements in message passing performance when compared to its predecessor MPICH-G and was based on superior software design principles, resulting in a software base in which it was much easier to make the functional extensions and improvements we did. Using Globus services we replaced the default implementation of MPI's collective operations in MPICH-G2 with more efficient multilevel topology-aware collective operations which, in turn, led to the development of a new timing methodology for broadcasts [8]. MPICH-G2 was extended to include client/server functionality from the MPI-2 standard [23] to facilitate remote visualization applications and, through the use of MPI idioms, MPICH-G2 provided application-level control of quality-of-service parameters as well as application-level discovery of underlying Grid-topology information. Finally, MPICH-G2 was successfully used in a number of applications including an award-winning record-setting computation in numerical relativity. In the sections that follow we describe in detail the accomplishments of this project, we present experimental results quantifying the performance improvements, and conclude with a discussion of our applications experiences. This project resulted in a significant increase in the utility of MPICH-G2.
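
    The multilevel, topology-aware collective idea can be sketched with mpi4py (this is not the MPICH-G2 implementation; grouping ranks by processor name is a simple stand-in for real Grid-topology discovery, and the broadcast root is assumed to be global rank 0).

        # Hedged sketch of a two-level, topology-aware broadcast (not MPICH-G2 itself):
        # broadcast across node leaders first, then within each node's local communicator.
        from mpi4py import MPI

        def two_level_bcast(obj, comm=MPI.COMM_WORLD):
            # Assumes the broadcast root is global rank 0.
            rank = comm.Get_rank()

            # Group ranks by host name as a crude stand-in for topology discovery.
            name = MPI.Get_processor_name()
            names = comm.allgather(name)
            color = sorted(set(names)).index(name)          # one color per node
            node_comm = comm.Split(color, key=rank)

            # Local rank 0 on each node acts as that node's leader.
            is_leader = node_comm.Get_rank() == 0
            leader_comm = comm.Split(0 if is_leader else MPI.UNDEFINED, key=rank)

            if is_leader:
                obj = leader_comm.bcast(obj, root=0)        # inter-node stage
            obj = node_comm.bcast(obj, root=0)              # intra-node stage

            if leader_comm != MPI.COMM_NULL:
                leader_comm.Free()
            node_comm.Free()
            return obj

        if __name__ == "__main__":
            comm = MPI.COMM_WORLD
            data = {"payload": 42} if comm.Get_rank() == 0 else None
            print(comm.Get_rank(), two_level_bcast(data))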

  12. High-performance computational fluid dynamics: a custom-code approach

    International Nuclear Information System (INIS)

    Fannon, James; Náraigh, Lennon Ó; Loiseau, Jean-Christophe; Valluri, Prashant; Bethune, Iain

    2016-01-01

    We introduce a modified and simplified version of the pre-existing fully parallelized three-dimensional Navier–Stokes flow solver known as TPLS. We demonstrate how the simplified version can be used as a pedagogical tool for the study of computational fluid dynamics (CFD) and parallel computing. TPLS is at its heart a two-phase flow solver, and uses calls to a range of external libraries to accelerate its performance. However, in the present context we narrow the focus of the study to basic hydrodynamics and parallel computing techniques, and the code is therefore simplified and modified to simulate pressure-driven single-phase flow in a channel, using only relatively simple Fortran 90 code with MPI parallelization, but no calls to any other external libraries. The modified code is analysed in order to both validate its accuracy and investigate its scalability up to 1000 CPU cores. Simulations are performed for several benchmark cases in pressure-driven channel flow, including a turbulent simulation, wherein the turbulence is incorporated via the large-eddy simulation technique. The work may be of use to advanced undergraduate and graduate students as an introductory study in CFD, while also providing insight for those interested in more general aspects of high-performance computing. (paper)
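
    TPLS itself is Fortran 90 with MPI; as a hedged, language-shifted illustration of the basic parallel-computing technique involved, the mpi4py sketch below performs the halo (ghost-cell) exchange needed by a 1-D domain decomposition before a stencil update. The local grid size and the smoothing step are illustrative only.

        # Minimal sketch (not TPLS): halo/ghost-cell exchange for a 1-D domain
        # decomposition, the basic MPI pattern behind distributed stencil updates.
        import numpy as np
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        nx_local = 16                               # interior cells per rank (assumed)
        u = np.full(nx_local + 2, float(rank))      # +2 ghost cells, one at each end

        left = rank - 1 if rank > 0 else MPI.PROC_NULL
        right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

        # Send my last interior cell to the right neighbour, receive into my left ghost.
        comm.Sendrecv(sendbuf=u[nx_local:nx_local + 1], dest=right, sendtag=0,
                      recvbuf=u[0:1], source=left, recvtag=0)
        # Send my first interior cell to the left neighbour, receive into my right ghost.
        comm.Sendrecv(sendbuf=u[1:2], dest=left, sendtag=1,
                      recvbuf=u[nx_local + 1:nx_local + 2], source=right, recvtag=1)

        # Ghost cells are now filled; a stencil update (e.g. one smoothing sweep) can follow.
        u_new = u.copy()
        u_new[1:-1] = 0.5 * (u[:-2] + u[2:])
        print(rank, u_new[1], u_new[nx_local])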

  13. High-performance computational fluid dynamics: a custom-code approach

    Science.gov (United States)

    Fannon, James; Loiseau, Jean-Christophe; Valluri, Prashant; Bethune, Iain; Náraigh, Lennon Ó.

    2016-07-01

    We introduce a modified and simplified version of the pre-existing fully parallelized three-dimensional Navier-Stokes flow solver known as TPLS. We demonstrate how the simplified version can be used as a pedagogical tool for the study of computational fluid dynamics (CFD) and parallel computing. TPLS is at its heart a two-phase flow solver, and uses calls to a range of external libraries to accelerate its performance. However, in the present context we narrow the focus of the study to basic hydrodynamics and parallel computing techniques, and the code is therefore simplified and modified to simulate pressure-driven single-phase flow in a channel, using only relatively simple Fortran 90 code with MPI parallelization, but no calls to any other external libraries. The modified code is analysed in order to both validate its accuracy and investigate its scalability up to 1000 CPU cores. Simulations are performed for several benchmark cases in pressure-driven channel flow, including a turbulent simulation, wherein the turbulence is incorporated via the large-eddy simulation technique. The work may be of use to advanced undergraduate and graduate students as an introductory study in CFD, while also providing insight for those interested in more general aspects of high-performance computing.

  14. High-Performance Networking

    CERN Multimedia

    CERN. Geneva

    2003-01-01

    The series will start with a historical introduction about what people saw as high performance message communication in their time and how that developed into today's standard computer network communication. It will be followed by a far more technical part that uses the High Performance Computer Network standards of the 90's, with 1 Gbit/sec systems as an introduction, for an in-depth explanation of the three new 10 Gbit/s network and interconnect technology standards that already exist or are emerging. If necessary for a good understanding, some sidesteps will be included to explain important protocols as well as necessary details of the Wide Area Network (WAN) standards concerned, including some basics of wavelength multiplexing (DWDM). Some remarks will be made concerning the rapidly expanding applications of networked storage.

  15. FY 1993 Blue Book: Grand Challenges 1993: High Performance Computing and Communications

    Data.gov (United States)

    Networking and Information Technology Research and Development, Executive Office of the President — High performance computing and computer communications networks are becoming increasingly important to scientific advancement, economic competition, and national...

  16. High-Performance Computing Paradigm and Infrastructure

    CERN Document Server

    Yang, Laurence T

    2006-01-01

    With hyperthreading in Intel processors, hypertransport links in next generation AMD processors, multi-core silicon in today's high-end microprocessors from IBM and emerging grid computing, parallel and distributed computers have moved into the mainstream

  17. Rapid Prototyping of High Performance Signal Processing Applications

    Science.gov (United States)

    Sane, Nimish

    Advances in embedded systems for digital signal processing (DSP) are enabling many scientific projects and commercial applications. At the same time, these applications are key to driving advances in many important kinds of computing platforms. In this region of high performance DSP, rapid prototyping is critical for faster time-to-market (e.g., in the wireless communications industry) or time-to-science (e.g., in radio astronomy). DSP system architectures have evolved from being based on application specific integrated circuits (ASICs) to incorporate reconfigurable off-the-shelf field programmable gate arrays (FPGAs), the latest multiprocessors such as graphics processing units (GPUs), or heterogeneous combinations of such devices. We, thus, have a vast design space to explore based on performance trade-offs, and expanded by the multitude of possibilities for target platforms. In order to allow systematic design space exploration, and develop scalable and portable prototypes, model based design tools are increasingly used in design and implementation of embedded systems. These tools allow scalable high-level representations, model based semantics for analysis and optimization, and portable implementations that can be verified at higher levels of abstractions and targeted toward multiple platforms for implementation. The designer can experiment using such tools at an early stage in the design cycle, and employ the latest hardware at later stages. In this thesis, we have focused on dataflow-based approaches for rapid DSP system prototyping. This thesis contributes to various aspects of dataflow-based design flows and tools as follows: 1. We have introduced the concept of topological patterns, which exploits commonly found repetitive patterns in DSP algorithms to allow scalable, concise, and parameterizable representations of large scale dataflow graphs in high-level languages. We have shown how an underlying design tool can systematically exploit a high

  18. A comprehensive approach to decipher biological computation to achieve next generation high-performance exascale computing.

    Energy Technology Data Exchange (ETDEWEB)

    James, Conrad D.; Schiess, Adrian B.; Howell, Jamie; Baca, Michael J.; Partridge, L. Donald; Finnegan, Patrick Sean; Wolfley, Steven L.; Dagel, Daryl James; Spahn, Olga Blum; Harper, Jason C.; Pohl, Kenneth Roy; Mickel, Patrick R.; Lohn, Andrew; Marinella, Matthew

    2013-10-01

    The human brain (volume = 1200 cm^3) consumes 20 W and is capable of performing >10^16 operations/s. Current supercomputer technology has reached 10^15 operations/s, yet it requires 1500 m^3 and 3 MW, giving the brain a 10^12 advantage in operations/s/W/cm^3. Thus, to reach exascale computation, two achievements are required: 1) improved understanding of computation in biological tissue, and 2) a paradigm shift towards neuromorphic computing where hardware circuits mimic properties of neural tissue. To address 1), we will interrogate corticostriatal networks in mouse brain tissue slices, specifically with regard to their frequency filtering capabilities as a function of input stimulus. To address 2), we will instantiate biological computing characteristics such as multi-bit storage into hardware devices with future computational and memory applications. Resistive memory devices will be modeled, designed, and fabricated in the MESA facility in consultation with our internal and external collaborators.

  19. Computational Performance of a Parallelized Three-Dimensional High-Order Spectral Element Toolbox

    Science.gov (United States)

    Bosshard, Christoph; Bouffanais, Roland; Clémençon, Christian; Deville, Michel O.; Fiétier, Nicolas; Gruber, Ralf; Kehtari, Sohrab; Keller, Vincent; Latt, Jonas

    In this paper, a comprehensive performance review of an MPI-based high-order three-dimensional spectral element method C++ toolbox is presented. The focus is put on the performance evaluation of several aspects, with a particular emphasis on the parallel efficiency. The performance evaluation is analyzed with the help of a time prediction model based on a parameterization of the application and the hardware resources. A tailor-made CFD computation benchmark case is introduced and used to carry out this review, stressing the particular interest for clusters with up to 8192 cores. Some problems in the parallel implementation have been detected and corrected. The theoretical complexities with respect to the number of elements, to the polynomial degree, and to communication needs are correctly reproduced. It is concluded that this type of code has a nearly perfect speed-up on machines with thousands of cores, and is ready to make the step to next-generation petaflop machines.

  20. Performance comparison between Java and JNI for optimal implementation of computational micro-kernels

    OpenAIRE

    Halli , Nassim; Charles , Henri-Pierre; Méhaut , Jean-François

    2015-01-01

    General purpose CPUs used in high performance computing (HPC) support a vector instruction set and an out-of-order engine dedicated to increasing instruction-level parallelism. Hence, related optimizations are currently critical to improve the performance of applications requiring numerical computation. Moreover, the use of a Java run-time environment such as the HotSpot Java Virtual Machine (JVM) in high performance computing is a promising alternative. It benefits ...

  1. Nuclear forces and high-performance computing: The perfect match

    International Nuclear Information System (INIS)

    Luu, T; Walker-Loud, A

    2009-01-01

    High-performance computing is now enabling the calculation of certain hadronic interaction parameters directly from Quantum Chromodynamics, the quantum field theory that governs the behavior of quarks and gluons and is ultimately responsible for the nuclear strong force. In this paper we briefly describe the state of the field and show how other aspects of hadronic interactions will be ascertained in the near future. We give estimates of computational requirements needed to obtain these goals, and outline a procedure for incorporating these results into the broader nuclear physics community.

  2. Optical Thermal Characterization Enables High-Performance Electronics Applications

    Energy Technology Data Exchange (ETDEWEB)

    2016-02-01

    NREL developed a modeling and experimental strategy to characterize thermal performance of materials. The technique provides critical data on thermal properties with relevance for electronics packaging applications. Thermal contact resistance and bulk thermal conductivity were characterized for new high-performance materials such as thermoplastics, boron-nitride nanosheets, copper nanowires, and atomically bonded layers. The technique is an important tool for developing designs and materials that enable power electronics packaging with small footprint, high power density, and low cost for numerous applications.

  3. Performance Issues in High Performance Fortran Implementations of Sensor-Based Applications

    Directory of Open Access Journals (Sweden)

    David R. O'hallaron

    1997-01-01

    Applications that get their inputs from sensors are an important and often overlooked application domain for High Performance Fortran (HPF). Such sensor-based applications typically perform regular operations on dense arrays, and often have latency and throughput requirements that can only be achieved with parallel machines. This article describes a study of sensor-based applications, including the fast Fourier transform, synthetic aperture radar imaging, narrowband tracking radar processing, multibaseline stereo imaging, and medical magnetic resonance imaging. The applications are written in a dialect of HPF developed at Carnegie Mellon, and are compiled by the Fx compiler for the Intel Paragon. The main results of the study are that (1) it is possible to realize good performance for realistic sensor-based applications written in HPF and (2) the performance of the applications is determined by the performance of three core operations: independent loops (i.e., loops with no dependences between iterations), reductions, and index permutations. The article discusses the implications for HPF implementations and introduces some simple tests that implementers and users can use to measure the efficiency of the loops, reductions, and index permutations generated by an HPF compiler.
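
    For readers unfamiliar with the three core operations named above, the short NumPy sketch below shows one example of each (an independent elementwise loop, a reduction, and an index permutation). It is purely illustrative; the study's codes are written in HPF and compiled with the Fx compiler.

        # The three core operations identified above, written as NumPy one-liners
        # (illustrative only; the study's applications are HPF compiled by Fx).
        import numpy as np

        x = np.random.default_rng(0).standard_normal((4, 1024))

        # 1) Independent loop: the same elementwise work on every sample, with no
        #    dependences between iterations, so it parallelizes trivially.
        y = np.abs(x) ** 2

        # 2) Reduction: combine values across an axis (e.g. total energy per channel).
        energy = y.sum(axis=1)

        # 3) Index permutation: reorder data, e.g. the shuffles and transposes that
        #    appear inside FFT-based sensor processing.
        perm = np.argsort(np.random.default_rng(1).random(y.shape[1]))
        z = y[:, perm]

        print(energy.shape, z.shape)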

  4. NINJA: Java for High Performance Numerical Computing

    Directory of Open Access Journals (Sweden)

    José E. Moreira

    2002-01-01

    When Java was first introduced, there was a perception that its many benefits came at a significant performance cost. In the particularly performance-sensitive field of numerical computing, initial measurements indicated a hundred-fold performance disadvantage between Java and more established languages such as Fortran and C. Although much progress has been made, and Java now can be competitive with C/C++ in many important situations, significant performance challenges remain. Existing Java virtual machines are not yet capable of performing the advanced loop transformations and automatic parallelization that are now common in state-of-the-art Fortran compilers. Java also has difficulties in implementing complex arithmetic efficiently. These performance deficiencies can be attacked with a combination of class libraries (packages, in Java) that implement truly multidimensional arrays and complex numbers, and new compiler techniques that exploit the properties of these class libraries to enable other, more conventional, optimizations. Two compiler techniques, versioning and semantic expansion, can be leveraged to allow fully automatic optimization and parallelization of Java code. Our measurements with the NINJA prototype Java environment show that Java can be competitive in performance with highly optimized and tuned Fortran code.

  5. FY 1995 Blue Book: High Performance Computing and Communications: Technology for the National Information Infrastructure

    Data.gov (United States)

    Networking and Information Technology Research and Development, Executive Office of the President — The Federal High Performance Computing and Communications HPCC Program was created to accelerate the development of future generations of high performance computers...

  6. High-Performance Computer Modeling of the Cosmos-Iridium Collision

    Energy Technology Data Exchange (ETDEWEB)

    Olivier, S; Cook, K; Fasenfest, B; Jefferson, D; Jiang, M; Leek, J; Levatin, J; Nikolaev, S; Pertica, A; Phillion, D; Springer, K; De Vries, W

    2009-08-28

    This paper describes the application of a new, integrated modeling and simulation framework, encompassing the space situational awareness (SSA) enterprise, to the recent Cosmos-Iridium collision. This framework is based on a flexible, scalable architecture to enable efficient simulation of the current SSA enterprise, and to accommodate future advancements in SSA systems. In particular, the code is designed to take advantage of massively parallel, high-performance computer systems available, for example, at Lawrence Livermore National Laboratory. We will describe the application of this framework to the recent collision of the Cosmos and Iridium satellites, including (1) detailed hydrodynamic modeling of the satellite collision and resulting debris generation, (2) orbital propagation of the simulated debris and analysis of the increased risk to other satellites, (3) calculation of the radar and optical signatures of the simulated debris and modeling of debris detection with space surveillance radar and optical systems, (4) determination of simulated debris orbits from modeled space surveillance observations and analysis of the resulting orbital accuracy, and (5) comparison of these modeling and simulation results with Space Surveillance Network observations. We will also discuss the use of this integrated modeling and simulation framework to analyze the risks and consequences of future satellite collisions and to assess strategies for mitigating or avoiding future incidents, including the addition of new sensor systems, used in conjunction with the Space Surveillance Network, for improving space situational awareness.

  7. Human Computer Music Performance

    OpenAIRE

    Dannenberg, Roger B.

    2012-01-01

    Human Computer Music Performance (HCMP) is the study of music performance by live human performers and real-time computer-based performers. One goal of HCMP is to create a highly autonomous artificial performer that can fill the role of a human, especially in a popular music setting. This will require advances in automated music listening and understanding, new representations for music, techniques for music synchronization, real-time human-computer communication, music generation, sound synt...

  8. Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications

    Science.gov (United States)

    Sun, Xian-He

    1997-01-01

    Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as the Intel Paragon, IBM SP2, and Cray Origin2000, have successfully delivered high performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan was to 1) develop highly accurate parallel numerical algorithms, 2) conduct preliminary testing to verify the effectiveness and potential of these algorithms, and 3) incorporate the newly developed algorithms into actual simulation packages. This work plan has been well achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) adopting a mathematical geometry which has a better capacity to describe the fluid, and (2) using a compact scheme to gain high-order accuracy in the numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm
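
    The PDD algorithm itself is not reproduced here, but as a reference point the classic serial Thomas algorithm below solves the kind of tridiagonal system that parallel solvers such as PDD partition across processors. This is a hedged sketch, not the project's code.

        # Serial Thomas algorithm for a tridiagonal system Ax = d (reference kernel only;
        # the PDD algorithm partitions such systems across processors, which is not shown).
        import numpy as np

        def thomas(a, b, c, d):
            """Solve a tridiagonal system.
            a: sub-diagonal (len n, a[0] unused), b: diagonal (len n),
            c: super-diagonal (len n, c[-1] unused), d: right-hand side (len n)."""
            n = len(b)
            cp = np.empty(n)
            dp = np.empty(n)
            cp[0] = c[0] / b[0]
            dp[0] = d[0] / b[0]
            for i in range(1, n):
                denom = b[i] - a[i] * cp[i - 1]
                cp[i] = c[i] / denom if i < n - 1 else 0.0
                dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
            x = np.empty(n)
            x[-1] = dp[-1]
            for i in range(n - 2, -1, -1):
                x[i] = dp[i] - cp[i] * x[i + 1]
            return x

        if __name__ == "__main__":
            n = 8
            a = np.full(n, -1.0); b = np.full(n, 2.0); c = np.full(n, -1.0)
            d = np.ones(n)
            x = thomas(a, b, c, d)
            A = np.diag(b) + np.diag(a[1:], -1) + np.diag(c[:-1], 1)
            print(np.allclose(A @ x, d))   # expect True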

  9. The high performance cluster computing system for BES offline data analysis

    International Nuclear Information System (INIS)

    Sun Yongzhao; Xu Dong; Zhang Shaoqiang; Yang Ting

    2004-01-01

    A high performance cluster computing system (EPCfarm) used for BES offline data analysis is introduced. The setup and the characteristics of the hardware and software of EPCfarm are described. PBS, a queue management package, and the performance of EPCfarm are also presented. (authors)

  10. Development of three-dimensional neoclassical transport simulation code with high performance Fortran on a vector-parallel computer

    International Nuclear Information System (INIS)

    Satake, Shinsuke; Okamoto, Masao; Nakajima, Noriyoshi; Takamaru, Hisanori

    2005-11-01

    A neoclassical transport simulation code (FORTEC-3D) applicable to three-dimensional configurations has been developed using High Performance Fortran (HPF). The adoption of parallelization techniques and of a hybrid simulation model in the δf Monte-Carlo transport simulation, which includes non-local transport effects in three-dimensional configurations, makes it possible to simulate the dynamics of global, non-local transport phenomena with a self-consistent radial electric field within a reasonable computation time. In this paper, the development of the transport code using HPF is reported. Optimization techniques used to achieve both high vectorization and parallelization efficiency, the adoption of a parallel random number generator, and benchmark results are shown. (author)
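
    One ingredient mentioned above, the parallel random number generator, can be illustrated with a small NumPy sketch: each parallel worker receives its own statistically independent child stream spawned from a single master seed, as a δf Monte-Carlo simulation requires. This is an illustration only, not the generator used in FORTEC-3D; the seed and worker count are arbitrary.

        # Sketch of per-worker independent random streams (not FORTEC-3D's generator):
        # each parallel worker spawns its own child stream from one master seed, so the
        # Monte-Carlo sampling stays reproducible and uncorrelated across workers.
        import numpy as np

        def make_worker_rngs(master_seed, n_workers):
            seq = np.random.SeedSequence(master_seed)
            return [np.random.default_rng(child) for child in seq.spawn(n_workers)]

        if __name__ == "__main__":
            rngs = make_worker_rngs(master_seed=2005, n_workers=4)   # arbitrary values
            # Each "worker" draws its own marker-particle velocities independently.
            samples = [rng.standard_normal(5) for rng in rngs]
            for i, s in enumerate(samples):
                print("worker", i, s)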

  11. Thinking processes used by high-performing students in a computer programming task

    Directory of Open Access Journals (Sweden)

    Marietjie Havenga

    2011-07-01

    Computer programmers must be able to understand programming source code and write programs that execute complex tasks to solve real-world problems. This article is a transdisciplinary study at the intersection of computer programming, education and psychology. It outlines the role of mental processes in the process of programming and indicates how successful thinking processes can support computer science students in writing correct and well-defined programs. A mixed methods approach was used to better understand the thinking activities and programming processes of participating students. Data collection involved both computer programs and students’ reflective thinking processes recorded in their journals. This enabled analysis of psychological dimensions of participants’ thinking processes and their problem-solving activities as they considered a programming problem. Findings indicate that the cognitive, reflective and psychological processes used by high-performing programmers contributed to their success in solving a complex programming problem. Based on the thinking processes of high performers, we propose a model of integrated thinking processes, which can support computer programming students. Keywords: Computer programming, education, mixed methods research, thinking processes. Disciplines: Computer programming, education, psychology

  12. Linear Subpixel Learning Algorithm for Land Cover Classification from WELD using High Performance Computing

    Science.gov (United States)

    Ganguly, S.; Kumar, U.; Nemani, R. R.; Kalia, S.; Michaelis, A.

    2017-12-01

    In this work, we use a Fully Constrained Least Squares Subpixel Learning Algorithm to unmix global WELD (Web Enabled Landsat Data) to obtain fractions or abundances of the substrate (S), vegetation (V) and dark object (D) classes. Because of the sheer nature of the data and compute needs, we leveraged the NASA Earth Exchange (NEX) high performance computing architecture to optimize and scale our algorithm for large-scale processing. Subsequently, the S-V-D abundance maps were characterized into 4 classes, namely forest, farmland, water and urban areas (together with NPP-VIIRS, the National Polar-orbiting Partnership Visible Infrared Imaging Radiometer Suite, nighttime lights data), over California, USA using a Random Forest classifier. Validation of these land cover maps with NLCD (National Land Cover Database) 2011 products and NAFD (North American Forest Dynamics) static forest cover maps showed that an overall classification accuracy of over 91% was achieved, which is a 6% improvement in unmixing-based classification relative to per-pixel based classification. As such, abundance maps continue to offer a useful alternative to classification maps derived from high-spatial-resolution data for forest inventory analysis, multi-class mapping for eco-climatic models and applications, fast multi-temporal trend analysis and for societal and policy-relevant applications needed at the watershed scale.
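
    A common way to implement fully constrained (non-negative, sum-to-one) least-squares unmixing of a single pixel is to solve a non-negative least-squares problem on an augmented endmember matrix. The sketch below follows that standard trick; it is not the authors' NEX-scale code, and the endmember spectra are synthetic.

        # Hedged sketch of fully constrained least-squares (FCLS) unmixing for one pixel:
        # non-negativity comes from NNLS, and the sum-to-one constraint is enforced
        # approximately by appending a heavily weighted row of ones (a standard trick).
        import numpy as np
        from scipy.optimize import nnls

        def fcls_unmix(endmembers, pixel, weight=1e3):
            """endmembers: (bands, p) matrix of S/V/D spectra; pixel: (bands,) spectrum."""
            bands, p = endmembers.shape
            A = np.vstack([endmembers, weight * np.ones((1, p))])
            b = np.concatenate([pixel, [weight]])
            abundances, _ = nnls(A, b)
            return abundances

        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            E = np.abs(rng.random((6, 3)))            # 6 bands, 3 endmembers (synthetic)
            true_f = np.array([0.2, 0.5, 0.3])
            pixel = E @ true_f + 0.01 * rng.standard_normal(6)
            f = fcls_unmix(E, pixel)
            print(f, f.sum())                         # abundances, summing to roughly 1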

  13. Usage of super high speed computer for clarification of complex phenomena

    International Nuclear Information System (INIS)

    Sekiguchi, Tomotsugu; Sato, Mitsuhisa; Nakata, Hideki; Tatebe, Osami; Takagi, Hiromitsu

    1999-01-01

    This study aims at the construction of an efficient application environment for super high speed computers that supports parallel distributed systems and can be easily ported to different computer systems and different numbers of processors, by conducting research and development on the super high speed computer application technology required for the elucidation of complicated phenomena in the nuclear power field by means of computational science. In order to realize such an environment, the Electrotechnical Laboratory has developed Ninf, a network numerical information library. The Ninf system can supply a global network infrastructure for worldwide, high-performance computing over wide-range distributed networks. (G.K.)

  14. Spatial Processing of Urban Acoustic Wave Fields from High-Performance Computations

    National Research Council Canada - National Science Library

    Ketcham, Stephen A; Wilson, D. K; Cudney, Harley H; Parker, Michael W

    2007-01-01

    .... The objective of this work is to develop spatial processing techniques for acoustic wave propagation data from three-dimensional high-performance computations to quantify scattering due to urban...

  15. Development of high performance scientific components for interoperability of computing packages

    Energy Technology Data Exchange (ETDEWEB)

    Gulabani, Teena Pratap [Iowa State Univ., Ames, IA (United States)

    2008-01-01

    Three major high performance quantum chemistry computational packages, NWChem, GAMESS and MPQC, have been developed by different research efforts following different design patterns. The goal is to achieve interoperability among these packages by overcoming the challenges caused by the different communication patterns and software design of each of these packages. A chemistry algorithm is hard to develop and time-consuming to implement; integration of large quantum chemistry packages will allow resource sharing and thus avoid reinvention of the wheel. Creating connections between these incompatible packages is the major motivation of the proposed work. This interoperability is achieved by bringing the benefits of Component Based Software Engineering through a plug-and-play component framework called the Common Component Architecture (CCA). In this thesis, I present a strategy and process used for interfacing two widely used and important computational chemistry methodologies: Quantum Mechanics and Molecular Mechanics. To show the feasibility of the proposed approach, the Tuning and Analysis Utility (TAU) has been coupled with the NWChem code and its CCA components. Results show that the overhead is negligible when compared to the ease and potential of organizing and coping with large-scale software applications.

  16. MaMR: High-performance MapReduce programming model for material cloud applications

    Science.gov (United States)

    Jing, Weipeng; Tong, Danyu; Wang, Yangang; Wang, Jingyuan; Liu, Yaqiu; Zhao, Peng

    2017-02-01

    With the increasing data size in materials science, existing programming models no longer satisfy the application requirements. MapReduce is a programming model that enables the easy development of scalable parallel applications to process big data on cloud computing systems. However, this model does not directly support the processing of multiple related data sets, and the processing performance does not reflect the advantages of cloud computing. To enhance the capability of workflow applications in material data processing, we defined a programming model for material cloud applications, called MaMR, that supports multiple different Map and Reduce functions running concurrently on a hybrid shared-memory BSP model. An optimized data sharing strategy to supply the shared data to the different Map and Reduce stages was also designed. We added a new merge phase to MapReduce that can efficiently merge data from the map and reduce modules. Experiments showed that the model and framework achieve significant performance improvements compared to previous work.
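
    For readers unfamiliar with the underlying model, the following is a minimal single-machine Python sketch of the generic map, shuffle and reduce structure that MaMR extends; it does not reproduce MaMR's concurrent multi-function execution, shared-memory BSP scheme, or merge phase.

        # Minimal single-machine MapReduce sketch (not MaMR): map tasks run concurrently,
        # their key/value outputs are shuffled by key, and a reduce function folds each group.
        from collections import defaultdict
        from concurrent.futures import ProcessPoolExecutor

        def map_fn(record):
            # Emit (key, value) pairs; here: word counts for one line of text.
            return [(word, 1) for word in record.split()]

        def reduce_fn(key, values):
            return key, sum(values)

        def mapreduce(records, workers=4):
            with ProcessPoolExecutor(max_workers=workers) as pool:
                mapped = list(pool.map(map_fn, records))   # map phase, run in parallel

            shuffled = defaultdict(list)                   # shuffle: group values by key
            for pairs in mapped:
                for k, v in pairs:
                    shuffled[k].append(v)

            return dict(reduce_fn(k, vs) for k, vs in shuffled.items())

        if __name__ == "__main__":
            data = ["material cloud data", "cloud computing for material science"]
            print(mapreduce(data))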

  17. Application of cluster computing in materials science

    International Nuclear Information System (INIS)

    Kuzmin, A.

    2006-01-01

    The solution of many problems in materials science requires that high performance computing (HPC) be used. Therefore, a cluster computer, the Latvian Super-cluster (LASC), was constructed at the Institute of Solid State Physics of the University of Latvia in 2002. The LASC is used for advanced research in the fields of quantum chemistry, solid state physics and nanomaterials. In this work we give an overview of currently available computational technologies and exemplify their application by the interpretation of x-ray absorption spectra of nano-sized ZnO. (author)

  18. Towards a Scalable and Adaptive Application Support Platform for Large-Scale Distributed E-Sciences in High-Performance Network Environments

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Chase Qishi [New Jersey Inst. of Technology, Newark, NJ (United States); Univ. of Memphis, TN (United States); Zhu, Michelle Mengxia [Southern Illinois Univ., Carbondale, IL (United States)

    2016-06-06

    The advent of large-scale collaborative scientific applications has demonstrated the potential for broad scientific communities to pool globally distributed resources to produce unprecedented data acquisition, movement, and analysis. System resources including supercomputers, data repositories, computing facilities, network infrastructures, storage systems, and display devices have been increasingly deployed at national laboratories and academic institutes. These resources are typically shared by large communities of users over the Internet or dedicated networks and hence exhibit an inherent dynamic nature in their availability, accessibility, capacity, and stability. Scientific applications using either experimental facilities or computation-based simulations with various physical, chemical, climatic, and biological models feature diverse scientific workflows as simple as linear pipelines or as complex as directed acyclic graphs, which must be executed and supported over wide-area networks with massively distributed resources. Application users oftentimes need to manually configure their computing tasks over networks in an ad hoc manner, hence significantly limiting the productivity of scientists and constraining the utilization of resources. The success of these large-scale distributed applications requires a highly adaptive and massively scalable workflow platform that provides automated and optimized computing and networking services. This project is to design and develop a generic Scientific Workflow Automation and Management Platform (SWAMP), which contains a web-based user interface specially tailored for a target application, a set of user libraries, and several easy-to-use computing and networking toolkits for application scientists to conveniently assemble, execute, monitor, and control complex computing workflows in heterogeneous high-performance network environments. SWAMP will enable the automation and management of the entire process of scientific

  19. Homemade Buckeye-Pi: A Learning Many-Node Platform for High-Performance Parallel Computing

    Science.gov (United States)

    Amooie, M. A.; Moortgat, J.

    2017-12-01

    We report on the "Buckeye-Pi" cluster, the supercomputer developed in The Ohio State University School of Earth Sciences from 128 inexpensive Raspberry Pi (RPi) 3 Model B single-board computers. Each RPi is equipped with fast Quad Core 1.2GHz ARMv8 64bit processor, 1GB of RAM, and 32GB microSD card for local storage. Therefore, the cluster has a total RAM of 128GB that is distributed on the individual nodes and a flash capacity of 4TB with 512 processors, while it benefits from low power consumption, easy portability, and low total cost. The cluster uses the Message Passing Interface protocol to manage the communications between each node. These features render our platform the most powerful RPi supercomputer to date and suitable for educational applications in high-performance-computing (HPC) and handling of large datasets. In particular, we use the Buckeye-Pi to implement optimized parallel codes in our in-house simulator for subsurface media flows with the goal of achieving a massively-parallelized scalable code. We present benchmarking results for the computational performance across various number of RPi nodes. We believe our project could inspire scientists and students to consider the proposed unconventional cluster architecture as a mainstream and a feasible learning platform for challenging engineering and scientific problems.

  20. High performance simulation for the Silva project using the tera computer

    International Nuclear Information System (INIS)

    Bergeaud, V.; La Hargue, J.P.; Mougery, F.; Boulet, M.; Scheurer, B.; Le Fur, J.F.; Comte, M.; Benisti, D.; Lamare, J. de; Petit, A.

    2003-01-01

    In the context of the SILVA Project (Atomic Vapor Laser Isotope Separation), numerical simulation of the plant-scale propagation of laser beams through uranium vapour was a great challenge. The PRODIGE code has been developed to achieve this goal. Here we focus on the task of achieving high performance simulation on the TERA computer. We describe the main issues in optimizing the parallelization of the PRODIGE code on TERA. Thus, we discuss advantages and drawbacks of the implemented diagonal parallelization scheme. As a consequence, it has been found fruitful to tune the code in three respects: memory allocation, MPI communications and interconnection network bandwidth usage. We stress the interest of MPI-IO in this context and the benefit obtained for production computations on TERA. Finally, we illustrate our developments. We indicate some performance measurements reflecting the good parallelization properties of PRODIGE on the TERA computer. The code is currently used for demonstrating the feasibility of the laser propagation at a plant enrichment level and for preparing the 2003 Menphis experiment. We conclude by emphasizing the contribution of high performance TERA simulation to the project. (authors)
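
    The MPI-IO pattern highlighted above can be sketched with mpi4py: each rank writes its own contiguous block of results into one shared file with a collective write at a rank-dependent offset. This is a hedged illustration, not the PRODIGE code; the file name and block size are assumptions.

        # Hedged sketch of collective MPI-IO output (not PRODIGE): every rank writes its
        # own contiguous block of the result array into one shared file at a
        # rank-dependent offset, using a collective write call.
        import numpy as np
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()

        n_local = 1024                                   # values per rank (assumed)
        data = np.full(n_local, rank, dtype=np.float64)  # stand-in for local results

        fh = MPI.File.Open(comm, "results.bin",          # hypothetical output file
                           MPI.MODE_WRONLY | MPI.MODE_CREATE)
        offset = rank * n_local * data.itemsize          # byte offset of this rank's block
        fh.Write_at_all(offset, data)                    # collective write
        fh.Close()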

  1. High performance simulation for the Silva project using the tera computer

    Energy Technology Data Exchange (ETDEWEB)

    Bergeaud, V.; La Hargue, J.P.; Mougery, F. [CS Communication and Systemes, 92 - Clamart (France); Boulet, M.; Scheurer, B. [CEA Bruyeres-le-Chatel, 91 - Bruyeres-le-Chatel (France); Le Fur, J.F.; Comte, M.; Benisti, D.; Lamare, J. de; Petit, A. [CEA Saclay, 91 - Gif sur Yvette (France)

    2003-07-01

    In the context of the SILVA Project (Atomic Vapor Laser Isotope Separation), numerical simulation of the plant-scale propagation of laser beams through uranium vapour was a great challenge. The PRODIGE code has been developed to achieve this goal. Here we focus on the task of achieving high performance simulation on the TERA computer. We describe the main issues in optimizing the parallelization of the PRODIGE code on TERA. Thus, we discuss advantages and drawbacks of the implemented diagonal parallelization scheme. As a consequence, it has been found fruitful to tune the code in three respects: memory allocation, MPI communications and interconnection network bandwidth usage. We stress the interest of MPI-IO in this context and the benefit obtained for production computations on TERA. Finally, we illustrate our developments. We indicate some performance measurements reflecting the good parallelization properties of PRODIGE on the TERA computer. The code is currently used for demonstrating the feasibility of the laser propagation at a plant enrichment level and for preparing the 2003 Menphis experiment. We conclude by emphasizing the contribution of high performance TERA simulation to the project. (authors)

  2. High performance computing applied to simulation of the flow in pipes; Computacao de alto desempenho aplicada a simulacao de escoamento em dutos

    Energy Technology Data Exchange (ETDEWEB)

    Cozin, Cristiane; Lueders, Ricardo; Morales, Rigoberto E.M. [Universidade Tecnologica Federal do Parana (UTFPR), Curitiba, PR (Brazil). Dept. de Engenharia Mecanica

    2008-07-01

    In recent years, the computer cluster has emerged as a real alternative for the solution of problems which require high performance computing. Consequently, it has driven the development of new applications. Among them, flow simulation represents a real computational burden, especially for large systems. This work presents a study of the use of parallel computing for numerical fluid flow simulation in pipelines. A mathematical flow model is numerically solved. In general, this procedure leads to a tridiagonal system of equations suitable to be solved by a parallel algorithm. In this work, this is accomplished by a parallel odd-even reduction method found in the literature, which is implemented in the Fortran programming language. A computational platform composed of twelve processors was used. Many measurements of CPU times for different tridiagonal system sizes and numbers of processors were obtained, highlighting the communication time between processors as an important issue to be considered when evaluating the performance of parallel applications. (author)

  3. Scalability of DL_POLY on High Performance Computing Platform

    Directory of Open Access Journals (Sweden)

    Mabule Samuel Mabakane

    2017-12-01

    This paper presents a case study on the scalability of several versions of the molecular dynamics code DL_POLY, performed on South Africa's Centre for High Performance Computing e1350 IBM Linux cluster, Sun system and Lengau supercomputers. Within this study, different problem sizes were designed and the same chosen systems were employed in order to test the performance of DL_POLY using weak and strong scalability. It was found that the speed-up results for the small systems were better than those for the large systems on both the Ethernet and Infiniband networks. However, simulations of large systems in DL_POLY performed well using the Infiniband network on the Lengau cluster as compared to the e1350 and Sun supercomputers.
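
    The strong-scaling quantities reported in studies like this follow directly from measured wall-clock times; the Python sketch below applies the usual definitions of speed-up and parallel efficiency to invented timings (the core counts and times are made up purely to illustrate the calculation).

        # Strong-scaling speed-up and parallel efficiency from measured wall-clock times.
        # The timings below are invented purely to illustrate the definitions used in
        # scalability studies such as this one.
        def strong_scaling(times_by_cores):
            base_cores = min(times_by_cores)
            t_base = times_by_cores[base_cores]
            rows = []
            for p in sorted(times_by_cores):
                speedup = t_base / times_by_cores[p]
                efficiency = speedup * base_cores / p
                rows.append((p, times_by_cores[p], speedup, efficiency))
            return rows

        if __name__ == "__main__":
            measured = {24: 1000.0, 48: 520.0, 96: 280.0, 192: 170.0}   # seconds (made up)
            for p, t, s, e in strong_scaling(measured):
                print(f"{p:4d} cores: {t:7.1f} s  speed-up {s:5.2f}  efficiency {e:4.2f}")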

  4. Towards OpenVL: Improving Real-Time Performance of Computer Vision Applications

    Science.gov (United States)

    Shen, Changsong; Little, James J.; Fels, Sidney

    Meeting constraints for real-time performance is a main issue for computer vision, especially for embedded computer vision systems. This chapter presents our progress on our open vision library (OpenVL), a novel software architecture to address efficiency through facilitating hardware acceleration, reusability, and scalability for computer vision systems. A logical image understanding pipeline is introduced to allow parallel processing. We also discuss progress on our middleware—vision library utility toolkit (VLUT)—that enables applications to operate transparently over a heterogeneous collection of hardware implementations. OpenVL works as a state machine, with an event-driven mechanism to provide users with application-level interaction. Various explicit or implicit synchronization and communication methods are supported among distributed processes in the logical pipelines. The intent of OpenVL is to allow users to quickly and easily recover useful information from multiple scenes, in a cross-platform, cross-language manner across various software environments and hardware platforms. To validate the critical underlying concepts of OpenVL, a human tracking system and a local positioning system are implemented and described. The novel architecture separates the specification of algorithmic details from the underlying implementation, allowing for different components to be implemented on an embedded system without recompiling code.

  5. Can We Build a Truly High Performance Computer Which is Flexible and Transparent?

    KAUST Repository

    Rojas, Jhonathan Prieto

    2013-09-10

    State-of-the-art computers need high performance transistors, which consume ultra-low power resulting in longer battery lifetime. Billions of transistors are integrated neatly using a matured silicon fabrication process to maintain the performance-per-cost advantage. In that context, low-cost mono-crystalline bulk silicon (100) based high performance transistors are considered the heart of today's computers. One limitation is silicon's rigidity and brittleness. Here we show a generic batch process to convert high performance silicon electronics into flexible and semi-transparent electronics while retaining performance, process compatibility, integration density and cost. We demonstrate high-k/metal gate stack based p-type metal oxide semiconductor field effect transistors on a 4 inch silicon fabric released from bulk silicon (100) wafers, with a sub-threshold swing of 80 mV dec^-1 and an on/off ratio of nearly 10^4 within 10% device uniformity, with a minimum bending radius of 5 mm and an average transmittance of ~7% in the visible spectrum.

  6. High-performance silicon photonics technology for telecommunications applications.

    Science.gov (United States)

    Yamada, Koji; Tsuchizawa, Tai; Nishi, Hidetaka; Kou, Rai; Hiraki, Tatsurou; Takeda, Kotaro; Fukuda, Hiroshi; Ishikawa, Yasuhiko; Wada, Kazumi; Yamamoto, Tsuyoshi

    2014-04-01

    By way of a brief review of Si photonics technology, we show that significant improvements in device performance are necessary for practical telecommunications applications. In order to improve device performance in Si photonics, we have developed a Si-Ge-silica monolithic integration platform, on which compact Si-Ge-based modulators/detectors and silica-based high-performance wavelength filters are monolithically integrated. The platform features low-temperature silica film deposition, which cannot damage Si-Ge-based active devices. Using this platform, we have developed various integrated photonic devices for broadband telecommunications applications.

  7. High-performance silicon photonics technology for telecommunications applications

    International Nuclear Information System (INIS)

    Yamada, Koji; Tsuchizawa, Tai; Nishi, Hidetaka; Kou, Rai; Hiraki, Tatsurou; Takeda, Kotaro; Fukuda, Hiroshi; Yamamoto, Tsuyoshi; Ishikawa, Yasuhiko; Wada, Kazumi

    2014-01-01

    By way of a brief review of Si photonics technology, we show that significant improvements in device performance are necessary for practical telecommunications applications. In order to improve device performance in Si photonics, we have developed a Si-Ge-silica monolithic integration platform, on which compact Si-Ge–based modulators/detectors and silica-based high-performance wavelength filters are monolithically integrated. The platform features low-temperature silica film deposition, which cannot damage Si-Ge–based active devices. Using this platform, we have developed various integrated photonic devices for broadband telecommunications applications. (review)

  9. Analysis and modeling of social influence in high performance computing workloads

    KAUST Repository

    Zheng, Shuai; Shae, Zon Yin; Zhang, Xiangliang; Jamjoom, Hani T.; Fong, Liana

    2011-01-01

    Social influence among users (e.g., collaboration on a project) creates bursty behavior in the underlying high performance computing (HPC) workloads. Using representative HPC and cluster workload logs, this paper identifies, analyzes, and quantifies

  10. High energy physics and cloud computing

    International Nuclear Information System (INIS)

    Cheng Yaodong; Liu Baoxu; Sun Gongxing; Chen Gang

    2011-01-01

    High Energy Physics (HEP) has been a strong promoter of computing technology, for example the WWW (World Wide Web) and grid computing. In the new era of cloud computing, HEP still has a strong demand, and major international high energy physics laboratories have launched a number of projects to research cloud computing technologies and applications. This paper describes the current developments in cloud computing and its applications in high energy physics. Some ongoing projects at the Institute of High Energy Physics, Chinese Academy of Sciences, including cloud storage, virtual computing clusters, and the BESIII elastic cloud, are also described briefly. (authors)

  11. A performance model for the communication in fast multipole methods on high-performance computing platforms

    KAUST Repository

    Ibeid, Huda; Yokota, Rio; Keyes, David E.

    2016-01-01

    model and the actual communication time on four high-performance computing (HPC) systems, when latency, bandwidth, network topology, and multicore penalties are all taken into account. To our knowledge, this is the first formal characterization

  12. Development of Desktop Computing Applications and Engineering Tools on GPUs

    DEFF Research Database (Denmark)

    Sørensen, Hans Henrik Brandenborg; Glimberg, Stefan Lemvig; Hansen, Toke Jansen

    (GPUs) for high-performance computing applications and software tools in science and engineering, inverse problems, visualization, imaging, dynamic optimization. The goals are to contribute to the development of new state-of-the-art mathematical models and algorithms for maximum throughout performance...

  13. 10th International Workshop on Parallel Tools for High Performance Computing

    CERN Document Server

    Gracia, José; Hilbrich, Tobias; Knüpfer, Andreas; Resch, Michael; Nagel, Wolfgang

    2017-01-01

    This book presents the proceedings of the 10th International Parallel Tools Workshop, held October 4-5, 2016 in Stuttgart, Germany – a forum to discuss the latest advances in parallel tools. High-performance computing plays an increasingly important role for numerical simulation and modelling in academic and industrial research. At the same time, using large-scale parallel systems efficiently is becoming more difficult. A number of tools addressing parallel program development and analysis have emerged from the high-performance computing community over the last decade, and what may have started as a collection of small helper scripts has now matured into production-grade frameworks. Powerful user interfaces and an extensive body of documentation allow easy usage by non-specialists.

  14. HIGH PERFORMANCE PHOTOGRAMMETRIC PROCESSING ON COMPUTER CLUSTERS

    Directory of Open Access Journals (Sweden)

    V. N. Adrov

    2012-07-01

    Most CPU-consuming tasks in photogrammetric processing can be done in parallel. The algorithms take independent bits as input and produce independent bits as output; this independence comes from the nature of such algorithms, since images, stereopairs or small image blocks can be processed independently. Many photogrammetric algorithms are fully automatic and do not require human interference. Photogrammetric workstations can perform tie point measurements, DTM calculations, orthophoto construction, mosaicking and many other service operations in parallel using distributed calculations. Distributed calculations save time, reducing several days of computation to several hours. Modern trends in computer technology show an increase in the number of CPU cores in workstations and in local network speeds, and as a result a drop in the price of supercomputers or computer clusters that can contain hundreds or even thousands of computing nodes. Common distributed processing in DPWs is usually targeted at interactive work with a limited number of CPU cores and is not optimized for centralized administration. The bottleneck of common distributed computing in photogrammetry can be the limited LAN throughput and storage performance, since the processing of huge amounts of large raster images is needed.
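
    The embarrassingly parallel structure described above can be sketched with a simple process pool; process_tile and the tile list below are hypothetical stand-ins for real photogrammetric operations (tie-point measurement, DTM generation, orthophoto construction), not part of any particular workstation package.

        # Minimal sketch of distributing independent image tiles over CPU cores.
        from multiprocessing import Pool

        def process_tile(tile_id: int) -> str:
            # Placeholder for CPU-heavy, fully independent per-tile work.
            return f"tile {tile_id} done"

        if __name__ == "__main__":
            tiles = list(range(64))            # independent work items
            with Pool(processes=8) as pool:    # e.g. one worker per core
                for result in pool.imap_unordered(process_tile, tiles):
                    print(result)

    The same pattern scales from a single multi-core workstation to a cluster scheduler; only the task distribution mechanism changes.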

  15. HPCToolkit: performance tools for scientific computing

    Energy Technology Data Exchange (ETDEWEB)

    Tallent, N; Mellor-Crummey, J; Adhianto, L; Fagan, M; Krentel, M [Department of Computer Science, Rice University, Houston, TX 77005 (United States)

    2008-07-15

    As part of the U.S. Department of Energy's Scientific Discovery through Advanced Computing (SciDAC) program, science teams are tackling problems that require simulation and modeling on petascale computers. As part of activities associated with the SciDAC Center for Scalable Application Development Software (CScADS) and the Performance Engineering Research Institute (PERI), Rice University is building software tools for performance analysis of scientific applications on the leadership-class platforms. In this poster abstract, we briefly describe the HPCToolkit performance tools and how they can be used to pinpoint bottlenecks in SPMD and multi-threaded parallel codes. We demonstrate HPCToolkit's utility by applying it to two SciDAC applications: the S3D code for simulation of turbulent combustion and the MFDn code for ab initio calculations of microscopic structure of nuclei.

  16. HPCToolkit: performance tools for scientific computing

    International Nuclear Information System (INIS)

    Tallent, N; Mellor-Crummey, J; Adhianto, L; Fagan, M; Krentel, M

    2008-01-01

    As part of the U.S. Department of Energy's Scientific Discovery through Advanced Computing (SciDAC) program, science teams are tackling problems that require simulation and modeling on petascale computers. As part of activities associated with the SciDAC Center for Scalable Application Development Software (CScADS) and the Performance Engineering Research Institute (PERI), Rice University is building software tools for performance analysis of scientific applications on the leadership-class platforms. In this poster abstract, we briefly describe the HPCToolkit performance tools and how they can be used to pinpoint bottlenecks in SPMD and multi-threaded parallel codes. We demonstrate HPCToolkit's utility by applying it to two SciDAC applications: the S3D code for simulation of turbulent combustion and the MFDn code for ab initio calculations of microscopic structure of nuclei

  17. Direct numerical simulation of reactor two-phase flows enabled by high-performance computing

    Energy Technology Data Exchange (ETDEWEB)

    Fang, Jun; Cambareri, Joseph J.; Brown, Cameron S.; Feng, Jinyong; Gouws, Andre; Li, Mengnan; Bolotnov, Igor A.

    2018-04-01

    Nuclear reactor two-phase flows remain a great engineering challenge, where high-resolution two-phase flow databases that can inform practical model development are still sparse due to the extreme reactor operating conditions and measurement difficulties. Owing to the rapid growth of computing power, direct numerical simulation (DNS) is enjoying renewed interest for investigating the related flow problems. A combination of DNS and an interface tracking method can provide a unique opportunity to study two-phase flows based on first principles calculations. More importantly, state-of-the-art high-performance computing (HPC) facilities are helping unlock this great potential. This paper reviews the recent research progress of two-phase flow DNS related to reactor applications. The progress in large-scale bubbly flow DNS has been focused not only on the sheer size of those simulations in terms of resolved Reynolds number, but also on the associated advanced modeling and analysis techniques. Specifically, the current areas of active research include modeling of sub-cooled boiling, bubble coalescence, as well as the advanced post-processing toolkit for bubbly flow simulations in reactor geometries. A novel bubble tracking method has been developed to track the evolution of bubbles in two-phase bubbly flow. Also, spectral analysis of DNS databases in different geometries has been performed to investigate the modulation of the energy spectrum slope due to bubble-induced turbulence. In addition, the single- and two-phase analysis results are presented for turbulent flows within the pressurized water reactor (PWR) core geometries. The related simulations can only be carried out on world-leading HPC platforms. These simulations are allowing more complex turbulence model development and validation for use in 3D multiphase computational fluid dynamics (M-CFD) codes.
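
    The spectral analysis mentioned above boils down to Fourier-transforming velocity fluctuations and examining the slope of the resulting energy spectrum; the following one-dimensional numpy sketch uses synthetic data (not DNS output) purely to illustrate the procedure.

        # Energy spectrum of a synthetic 1D velocity signal, illustrating the kind
        # of spectral analysis applied to bubbly-flow DNS databases.
        import numpy as np

        n = 4096
        x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
        rng = np.random.default_rng(0)
        u = np.sin(4 * x) + 0.3 * rng.standard_normal(n)      # synthetic velocity fluctuation

        u_hat = np.fft.rfft(u)
        energy = 0.5 * np.abs(u_hat) ** 2 / n                  # spectral energy (arbitrary units)
        k = np.fft.rfftfreq(n, d=x[1] - x[0]) * 2.0 * np.pi    # wavenumbers

        # The slope of log(energy) versus log(k) over the inertial range is the
        # quantity whose modulation by bubble-induced turbulence is of interest.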

  18. High-Performance Integrated Virtual Environment (HIVE) Tools and Applications for Big Data Analysis.

    Science.gov (United States)

    Simonyan, Vahan; Mazumder, Raja

    2014-09-30

    The High-performance Integrated Virtual Environment (HIVE) is a high-throughput cloud-based infrastructure developed for the storage and analysis of genomic and associated biological data. HIVE consists of a web-accessible interface for authorized users to deposit, retrieve, share, annotate, compute and visualize Next-generation Sequencing (NGS) data in a scalable and highly efficient fashion. The platform contains a distributed storage library and a distributed computational powerhouse linked seamlessly. Resources available through the interface include algorithms, tools and applications developed exclusively for the HIVE platform, as well as commonly used external tools adapted to operate within the parallel architecture of the system. HIVE is composed of a flexible infrastructure, which allows for simple implementation of new algorithms and tools. Currently, available HIVE tools include sequence alignment and nucleotide variation profiling tools, metagenomic analyzers, phylogenetic tree-building tools using NGS data, clone discovery algorithms, and recombination analysis algorithms. In addition to tools, HIVE also provides knowledgebases that can be used in conjunction with the tools for NGS sequence and metadata analysis.

  19. High-Performance Integrated Virtual Environment (HIVE) Tools and Applications for Big Data Analysis

    Directory of Open Access Journals (Sweden)

    Vahan Simonyan

    2014-09-01

    The High-performance Integrated Virtual Environment (HIVE) is a high-throughput cloud-based infrastructure developed for the storage and analysis of genomic and associated biological data. HIVE consists of a web-accessible interface for authorized users to deposit, retrieve, share, annotate, compute and visualize Next-generation Sequencing (NGS) data in a scalable and highly efficient fashion. The platform contains a distributed storage library and a distributed computational powerhouse linked seamlessly. Resources available through the interface include algorithms, tools and applications developed exclusively for the HIVE platform, as well as commonly used external tools adapted to operate within the parallel architecture of the system. HIVE is composed of a flexible infrastructure, which allows for simple implementation of new algorithms and tools. Currently, available HIVE tools include sequence alignment and nucleotide variation profiling tools, metagenomic analyzers, phylogenetic tree-building tools using NGS data, clone discovery algorithms, and recombination analysis algorithms. In addition to tools, HIVE also provides knowledgebases that can be used in conjunction with the tools for NGS sequence and metadata analysis.

  20. Advances in Computer Science, Engineering & Applications : Proceedings of the Second International Conference on Computer Science, Engineering & Applications

    CERN Document Server

    Zizka, Jan; Nagamalai, Dhinaharan

    2012-01-01

    The International conference series on Computer Science, Engineering & Applications (ICCSEA) aims to bring together researchers and practitioners from academia and industry to focus on understanding computer science, engineering and applications and to establish new collaborations in these areas. The Second International Conference on Computer Science, Engineering & Applications (ICCSEA-2012), held in Delhi, India, during May 25-27, 2012, attracted many local and international delegates, presenting a balanced mixture of intellect and research both from the East and from the West. Following a rigorous peer-review process, the best submissions were selected, leading to an exciting, rich and high-quality technical conference program, which featured high-impact presentations on the latest developments in various areas of computer science, engineering and applications research.

  2. Computational design for a wide-angle cermet-based solar selective absorber for high temperature applications

    International Nuclear Information System (INIS)

    Sakurai, Atsushi; Tanikawa, Hiroya; Yamada, Makoto

    2014-01-01

    The purpose of this study is to computationally design a wide-angle cermet-based solar selective absorber for high temperature applications by using a characteristic matrix method and a genetic algorithm. The present study investigates a solar selective absorber with a tungsten–silica (W–SiO2) cermet. Multilayer structures of 1, 2, 3, and 4 layers and a wide range of metal volume fractions are optimized. The predicted radiative properties show good solar performance: thermal emittances, especially beyond 2 μm, are quite low, while solar absorptance levels are high over a wide angular range, so that solar photons are effectively absorbed and infrared radiative heat loss can be decreased. -- Highlights: • Electromagnetic simulation of radiative properties by the characteristic matrix method. • Optimization of a multilayered W–SiO2 cermet-based absorber by a genetic algorithm. • A solar selective absorber with high solar performance is proposed.
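
    As a rough sketch of the optimization loop described above (hedged: the fitness function below is a toy placeholder for the characteristic-matrix evaluation of absorptance and emittance, and all parameters are invented), a genetic algorithm over per-layer parameters can look like this:

        # Toy genetic algorithm over layer parameters; fitness() is a placeholder for
        # the characteristic-matrix evaluation of solar absorptance and thermal emittance.
        import random

        N_LAYERS = 4

        def fitness(genome):
            # Hypothetical figure of merit: reward high "absorptance", penalise "emittance".
            absorptance = sum(genome) / len(genome)
            emittance = max(genome) - min(genome)
            return absorptance - 0.5 * emittance

        def evolve(pop_size=40, generations=100, mutation=0.1):
            pop = [[random.random() for _ in range(N_LAYERS)] for _ in range(pop_size)]
            for _ in range(generations):
                pop.sort(key=fitness, reverse=True)
                parents = pop[: pop_size // 2]
                children = []
                for _ in range(pop_size - len(parents)):
                    a, b = random.sample(parents, 2)
                    cut = random.randrange(1, N_LAYERS)
                    child = a[:cut] + b[cut:]                      # one-point crossover
                    if random.random() < mutation:
                        child[random.randrange(N_LAYERS)] = random.random()
                    children.append(child)
                pop = parents + children
            return max(pop, key=fitness)

        print(evolve())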

  3. Progress in a novel architecture for high performance processing

    Science.gov (United States)

    Zhang, Zhiwei; Liu, Meng; Liu, Zijun; Du, Xueliang; Xie, Shaolin; Ma, Hong; Ding, Guangxin; Ren, Weili; Zhou, Fabiao; Sun, Wenqin; Wang, Huijuan; Wang, Donglin

    2018-04-01

    High performance processing (HPP) is an innovative architecture targeting high performance computing with excellent power efficiency and computing performance. It is suitable for data intensive applications like supercomputing, machine learning and wireless communication. An example chip with four application-specific integrated circuit (ASIC) cores, the first generation of HPP cores, has been taped out successfully in the Taiwan Semiconductor Manufacturing Company (TSMC) 40 nm low power process. The innovative architecture shows great energy efficiency over the traditional central processing unit (CPU) and general-purpose computing on graphics processing units (GPGPU). Compared with MaPU, HPP has made great improvements in architecture. A chip with 32 HPP cores is being developed in the TSMC 16 nm FinFET Compact (FFC) technology process and is planned for commercial use. The peak performance of this chip can reach 4.3 teraFLOPS (TFLOPS) and its power efficiency is up to 89.5 gigaFLOPS per watt (GFLOPS/W).
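
    Taken at face value, the reported peak throughput and efficiency imply a chip power budget of roughly 48 W at peak; a one-line check (editorial, not from the record):

        # Implied power draw from the reported peak performance and power efficiency.
        peak_tflops = 4.3                 # reported peak, TFLOPS
        efficiency_gflops_per_w = 89.5    # reported efficiency, GFLOPS/W
        implied_power_w = peak_tflops * 1000.0 / efficiency_gflops_per_w
        print(f"implied power at peak: ~{implied_power_w:.0f} W")   # ~48 W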

  4. Export Controls: Implementation of the 1998 Legislative Mandate for High Performance Computers

    National Research Council Canada - National Science Library

    1999-01-01

    We found that most of the 938 proposed exports of high performance computers to civilian end users in countries of concern from February 3, 1998, when procedures implementing the 1998 authorization...

  5. Construction of Blaze at the University of Illinois at Chicago: A Shared, High-Performance, Visual Computer for Next-Generation Cyberinfrastructure-Accelerated Scientific, Engineering, Medical and Public Policy Research

    Energy Technology Data Exchange (ETDEWEB)

    Brown, Maxine D. [Acting Director, EVL; Leigh, Jason [PI

    2014-02-17

    The Blaze high-performance visual computing system serves the high-performance computing research and education needs of the University of Illinois at Chicago (UIC). Blaze consists of a state-of-the-art, networked computer cluster and an ultra-high-resolution visualization system called CAVE2(TM) that is currently not available anywhere else in Illinois. This system is connected via a high-speed 100-Gigabit network to the State of Illinois' I-WIRE optical network, as well as to national and international high speed networks such as Internet2 and the Global Lambda Integrated Facility. This enables Blaze to serve as an on-ramp to national cyberinfrastructure, such as the National Science Foundation’s Blue Waters petascale computer at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign and the Department of Energy’s Argonne Leadership Computing Facility (ALCF) at Argonne National Laboratory. DOE award # DE-SC005067, leveraged with NSF award #CNS-0959053 for “Development of the Next-Generation CAVE Virtual Environment (NG-CAVE),” enabled us to create a first-of-its-kind high-performance visual computing system. The UIC Electronic Visualization Laboratory (EVL) worked with two U.S. companies to advance their commercial products and maintain U.S. leadership in the global information technology economy. New applications are being enabled with the CAVE2/Blaze visual computing system, advancing scientific research and education in the U.S. and globally and helping to train the next-generation workforce.

  6. Computer science of the high performance; Informatica del alto rendimiento

    Energy Technology Data Exchange (ETDEWEB)

    Moraleda, A.

    2008-07-01

    High performance computing is taking shape as a powerful accelerator of the innovation process, drastically reducing the waiting time for access to results and findings in a growing number of processes and activities as complex and important as medicine, genetics, pharmacology, the environment, natural resource management or the simulation of complex processes in a wide variety of industries. (Author)

  7. Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

    CERN Document Server

    Abdurachmanov, David; Elmer, Peter; Eulisse, Giulio; Knight, Robert; Muzaffar, Shahzad

    2014-01-01

    Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost-efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG).

  8. Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

    Science.gov (United States)

    Abdurachmanov, David; Bockelman, Brian; Elmer, Peter; Eulisse, Giulio; Knight, Robert; Muzaffar, Shahzad

    2015-05-01

    Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost-efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG).

  9. Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

    International Nuclear Information System (INIS)

    Abdurachmanov, David; Bockelman, Brian; Elmer, Peter; Eulisse, Giulio; Muzaffar, Shahzad; Knight, Robert

    2015-01-01

    Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost-efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG). (paper)

  10. Business Models of High Performance Computing Centres in Higher Education in Europe

    Science.gov (United States)

    Eurich, Markus; Calleja, Paul; Boutellier, Roman

    2013-01-01

    High performance computing (HPC) service centres are a vital part of the academic infrastructure of higher education organisations. However, despite their importance for research and the necessary high capital expenditures, business research on HPC service centres is mostly missing. From a business perspective, it is important to find an answer to…

  11. Confabulation Based Real-time Anomaly Detection for Wide-area Surveillance Using Heterogeneous High Performance Computing Architecture

    Science.gov (United States)

    2015-06-01

    Report documentation excerpt (contract FA8750-12-1-0251): confabulation-based real-time anomaly detection for wide-area surveillance using a heterogeneous high performance computing architecture with accelerators, including graphics processor units (GPUs) and Intel Xeon Phi processors. Experimental results showed significant speedups.

  12. Using high performance interconnects in a distributed computing and mass storage environment

    International Nuclear Information System (INIS)

    Ernst, M.

    1994-01-01

    Detector Collaborations of the HERA Experiments typically involve more than 500 physicists from a few dozen institutes. These physicists require access to large amounts of data in a fully transparent manner. Important issues include Distributed Mass Storage Management Systems in a Distributed and Heterogeneous Computing Environment. At the very center of a distributed system, including tens of CPUs and network attached mass storage peripherals, are the communication links. Today scientists are witnessing an integration of computing and communication technology, with the 'network' becoming the computer. This contribution reports on a centrally operated computing facility for the HERA Experiments at DESY, including Symmetric Multiprocessor Machines (84 Processors), presently more than 400 GByte of magnetic disk and 40 TB of automated tape storage, tied together by a HIPPI 'network'. Focussing on the High Performance Interconnect technology, details will be provided about the HIPPI-based 'Backplane' configured around a 20 Gigabit/s Multi Media Router and the performance and efficiency of the related computer interfaces.

  13. Proposal for grid computing for nuclear applications

    International Nuclear Information System (INIS)

    Faridah Mohamad Idris; Wan Ahmad Tajuddin Wan Abdullah; Zainol Abidin Ibrahim; Zukhaimira Zolkapli

    2013-01-01

    The use of computer clusters for computational sciences, including computational physics, is vital as it provides computing power to crunch big numbers at a faster rate. In compute-intensive applications that require high resolution, such as Monte Carlo simulation, the use of computer clusters in a grid form, which supplies computational power to any node within the grid that needs it, has now become a necessity. In this paper, we describe how clusters running a specific application could use resources within the grid to speed up the computing process. (author)

  14. Green computing: efficient energy management of multiprocessor streaming applications via model checking

    NARCIS (Netherlands)

    Ahmad, W.

    2017-01-01

    Streaming applications such as virtual reality, video conferencing, and face detection, impose high demands on a system’s performance and battery life. With the advancement in mobile computing, these applications are increasingly implemented on battery-constrained platforms, such as gaming consoles,

  15. Field-programmable custom computing technology architectures, tools, and applications

    CERN Document Server

    Luk, Wayne; Pocek, Ken

    2000-01-01

    Field-Programmable Custom Computing Technology: Architectures, Tools, and Applications brings together in one place important contributions and up-to-date research results in this fast-moving area. In seven selected chapters, the book describes the latest advances in architectures, design methods, and applications of field-programmable devices for high-performance reconfigurable systems. The contributors to this work were selected from the leading researchers and practitioners in the field. It will be valuable to anyone working or researching in the field of custom computing technology. It serves as an excellent reference, providing insight into some of the most challenging issues being examined today.

  16. Autonomic Management of Application Workflows on Hybrid Computing Infrastructure

    Directory of Open Access Journals (Sweden)

    Hyunjoo Kim

    2011-01-01

    In this paper, we present a programming and runtime framework that enables the autonomic management of complex application workflows on hybrid computing infrastructures. The framework is designed to address system and application heterogeneity and dynamics to ensure that application objectives and constraints are satisfied. The need for such autonomic system and application management is becoming critical as computing infrastructures become increasingly heterogeneous, integrating different classes of resources from high-end HPC systems to commodity clusters and clouds. For example, the framework presented in this paper can be used to provision the appropriate mix of resources based on application requirements and constraints. The framework also monitors the system/application state and adapts the application and/or resources to respond to changing requirements or environment. To demonstrate the operation of the framework and to evaluate its ability, we employ a workflow used to characterize an oil reservoir executing on a hybrid infrastructure composed of TeraGrid nodes and Amazon EC2 instances of various types. Specifically, we show how different application objectives such as acceleration, conservation and resilience can be effectively achieved while satisfying deadline and budget constraints, using an appropriate mix of dynamically provisioned resources. Our evaluations also demonstrate that public clouds can be used to complement and reinforce the scheduling and usage of traditional high performance computing infrastructure.
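
    A minimal sketch of the kind of provisioning decision described above, assuming invented throughput and cost figures (none of the names or numbers come from the paper): cloud instances are added to a fixed HPC allocation until an estimated deadline is met or the budget would be exceeded.

        # Hypothetical provisioning rule: add cloud instances to a fixed HPC
        # allocation until the estimated completion time meets the deadline,
        # stopping early if the next instance would exceed the budget.
        def provision(work_units, deadline_h, budget,
                      hpc_nodes=16, hpc_rate=10.0,       # work units per node-hour (assumed)
                      cloud_rate=6.0, cloud_cost=0.50):  # per instance-hour (assumed)
            best = None
            for cloud in range(0, 1000):
                throughput = hpc_nodes * hpc_rate + cloud * cloud_rate
                hours = work_units / throughput
                cost = cloud * cloud_cost * hours
                if cost > budget:
                    break                # adding more cloud instances blows the budget
                best = (cloud, hours, cost)
                if hours <= deadline_h:
                    break                # deadline met within budget
            return best                  # (cloud instances, estimated hours, estimated cost)

        print(provision(work_units=5000, deadline_h=12, budget=100.0))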

  17. Computing in high energy physics

    International Nuclear Information System (INIS)

    Hertzberger, L.O.; Hoogland, W.

    1986-01-01

    This book deals with advanced computing applications in physics, and in particular in high energy physics environments. The main subjects covered are networking; vector and parallel processing; and embedded systems. Also examined are topics such as operating systems, future computer architectures and commercial computer products. The book presents solutions that are expected to cope with future computing problems in experimental and theoretical High Energy Physics. In the experimental environment, the large amounts of data to be processed pose special problems both on-line and off-line. For on-line data reduction, embedded special purpose computers, which are often used for trigger applications, are applied. For off-line processing, parallel computers such as emulator farms and the cosmic cube may be employed. The analysis of these topics is therefore a main feature of this volume.

  18. An Application Development Platform for Neuromorphic Computing

    Energy Technology Data Exchange (ETDEWEB)

    Dean, Mark [University of Tennessee (UT); Chan, Jason [University of Tennessee (UT); Daffron, Christopher [University of Tennessee (UT); Disney, Adam [University of Tennessee (UT); Reynolds, John [University of Tennessee (UT); Rose, Garrett [University of Tennessee (UT); Plank, James [University of Tennessee (UT); Birdwell, John Douglas [University of Tennessee (UT); Schuman, Catherine D [ORNL

    2016-01-01

    Dynamic Adaptive Neural Network Arrays (DANNAs) are neuromorphic computing systems developed as a hardware-based approach to the implementation of neural networks. They feature highly adaptive and programmable structural elements, which model artificial neural networks with spiking behavior. We design them to solve problems using evolutionary optimization. In this paper, we highlight the current hardware and software implementations of DANNA, including their features, functionalities and performance. We then describe the development of an Application Development Platform (ADP) to support efficient application implementation and testing of DANNA-based solutions. We conclude with future directions.

  19. Comprehensive Simulation Lifecycle Management for High Performance Computing Modeling and Simulation, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — There are significant logistical barriers to entry-level high performance computing (HPC) modeling and simulation (M IllinoisRocstar) sets up the infrastructure for...

  20. How to build a high-performance compute cluster for the Grid

    CERN Document Server

    Reinefeld, A

    2001-01-01

    The success of large-scale multi-national projects like the forthcoming analysis of the LHC particle collision data at CERN relies to a great extent on the ability to efficiently utilize computing and data-storage resources at geographically distributed sites. Currently, much effort is spent on the design of Grid management software (Datagrid, Globus, etc.), while the effective integration of computing nodes has been largely neglected up to now. This is the focus of our work. We present a framework for a high-performance cluster that can be used as a reliable computing node in the Grid. We outline the cluster architecture, the management of distributed data and the seamless integration of the cluster into the Grid environment. (11 refs).

  1. High-Performance Computing in Neuroscience for Data-Driven Discovery, Integration, and Dissemination

    International Nuclear Information System (INIS)

    Bouchard, Kristofer E.

    2016-01-01

    A lack of coherent plans to analyze, manage, and understand data threatens the various opportunities offered by new neuro-technologies. High-performance computing will allow exploratory analysis of massive datasets stored in standardized formats, hosted in open repositories, and integrated with simulations.

  2. Cloud computing applications for biomedical science: A perspective.

    Science.gov (United States)

    Navale, Vivek; Bourne, Philip E

    2018-06-01

    Biomedical research has become a digital data-intensive endeavor, relying on secure and scalable computing, storage, and network infrastructure, which has traditionally been purchased, supported, and maintained locally. For certain types of biomedical applications, cloud computing has emerged as an alternative to locally maintained traditional computing approaches. Cloud computing offers users pay-as-you-go access to services such as hardware infrastructure, platforms, and software for solving common biomedical computational problems. Cloud computing services offer secure on-demand storage and analysis and are differentiated from traditional high-performance computing by their rapid availability and scalability of services. As such, cloud services are engineered to address big data problems and enhance the likelihood of data and analytics sharing, reproducibility, and reuse. Here, we provide an introductory perspective on cloud computing to help the reader determine its value to their own research.

  3. A CAD Open Platform for High Performance Reconfigurable Systems in the EXTRA Project

    NARCIS (Netherlands)

    Rabozzi, M.; Brondolin, R.; Natale, G.; Del Sozzo, E.; Huebner, M.; Brokalakis, A.; Ciobanu, C.; Stroobandt, D.; Santambrogio, M.D.; Hübner, M.; Reis, R.; Stan, M.; Voros, N.

    2017-01-01

    As the power wall has become one of the main limiting factors for the performance of general purpose processors, the trend in High Performance Computing (HPC) is moving towards application-specific accelerators in order to meet the stringent performance requirements for exascale computing while

  4. Evaluation of External Memory Access Performance on a High-End FPGA Hybrid Computer

    Directory of Open Access Journals (Sweden)

    Konstantinos Kalaitzis

    2016-10-01

    The motivation of this research was to evaluate the main memory performance of a hybrid supercomputer such as the Convey HC-x, and ascertain how the controller performs in several access scenarios, vis-à-vis hand-coded memory prefetches. Such memory patterns are very useful in stencil computations. The theoretical bandwidth of the memory of the Convey is compared with the results of our measurements. The accurate study of the memory subsystem is particularly useful for users when they are developing their application-specific personality. Experiments were performed to measure the bandwidth between the coprocessor and the memory subsystem. The experiments aimed mainly at measuring the reading access speed of the memory from the Application Engines (FPGAs). Different ways of accessing data were used in order to find the most efficient way to access memory. This way was proposed for future work on the Convey HC-x. When performing a series of accesses to memory, non-uniform latencies occur. The Memory Controller of the Convey HC-x in the coprocessor attempts to cover this latency. We measure memory efficiency as the ratio of the number of memory accesses to the number of execution cycles. The result of this measurement converges to one in most cases. In addition, we performed experiments with hand-coded memory accesses. The analysis of the experimental results shows how the memory subsystem and Memory Controllers work. From this work we conclude that the memory controllers do an excellent job, largely because (transparently to the user) they seem to cache large amounts of data, and hence hand-coding is not needed in most situations.
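
    The efficiency metric described above is simply accesses per execution cycle; a small illustrative calculation with made-up counter values (and an assumed clock and word size) also yields an effective bandwidth figure:

        # Memory efficiency as accesses per execution cycle, plus effective bandwidth.
        # All numbers are made-up placeholders for measured counter values.
        accesses = 9_800_000          # memory accesses issued by the application engines
        cycles = 10_000_000           # execution cycles over the same interval
        clock_hz = 150e6              # assumed FPGA clock
        bytes_per_access = 8          # assumed 64-bit words

        efficiency = accesses / cycles                      # converges towards 1.0 in the paper
        bandwidth_gb_s = accesses * bytes_per_access * clock_hz / cycles / 1e9
        print(f"efficiency = {efficiency:.2f}, effective bandwidth = {bandwidth_gb_s:.2f} GB/s")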

  5. High-Performance Energy Applications and Systems

    Energy Technology Data Exchange (ETDEWEB)

    Miller, Barton [Univ. of Wisconsin, Madison, WI (United States)

    2014-01-01

    The Paradyn project has a history of developing algorithms, techniques, and software that push the cutting edge of tool technology for high-end computing systems. Under this funding, we are working on a three-year agenda to make substantial new advances in support of new and emerging Petascale systems. The overall goal for this work is to address the steady increase in complexity of these petascale systems. Our work covers two key areas: (1) The analysis, instrumentation and control of binary programs. Work in this area falls under the general framework of the Dyninst API tool kits. (2) Infrastructure for building tools and applications at extreme scale. Work in this area falls under the general framework of the MRNet scalability framework. Note that work done under this funding is closely related to work done under a contemporaneous grant, “Foundational Tools for Petascale Computing”, SC0003922/FG02-10ER25940, UW PRJ27NU.

  6. Optical high-performance computing: introduction to the JOSA A and Applied Optics feature.

    Science.gov (United States)

    Caulfield, H John; Dolev, Shlomi; Green, William M J

    2009-08-01

    The feature issues in both Applied Optics and the Journal of the Optical Society of America A focus on topics of immediate relevance to the community working in the area of optical high-performance computing.

  7. Performance Analysis of Ivshmem for High-Performance Computing in Virtual Machines

    Science.gov (United States)

    Ivanovic, Pavle; Richter, Harald

    2018-01-01

    High-performance computing (HPC) is rarely accomplished via virtual machines (VMs). In this paper, we present a remake of ivshmem which can change this. Ivshmem was a shared memory (SHM) between virtual machines on the same server, with SHM-access synchronization included, until about 5 years ago when newer versions of Linux and its virtualization library libvirt evolved. We restored that SHM-access synchronization feature because it is indispensable for HPC, and made ivshmem runnable with contemporary versions of Linux, libvirt, KVM, QEMU and especially MPICH, which is an implementation of MPI, the standard HPC communication library. Additionally, we transparently modified MPICH to include ivshmem, resulting in a three to ten times performance improvement compared to TCP/IP. Furthermore, we have transparently replaced MPI_PUT, a one-sided MPI communication mechanism, with our own MPI_PUT wrapper. As a result, our ivshmem even surpasses non-virtualized SHM data transfers for block lengths greater than 512 KBytes, showing the benefits of virtualization. All improvements were possible without using SR-IOV.
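
    For readers unfamiliar with MPI_PUT, the sketch below shows generic one-sided MPI communication through mpi4py; it illustrates the standard RMA semantics only, is not the authors' ivshmem-backed wrapper, and assumes an MPI installation with at least two ranks.

        # One-sided MPI "put" through an RMA window (generic illustration).
        from mpi4py import MPI
        import numpy as np

        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()

        # Each rank exposes a small buffer through an RMA window.
        buf = np.zeros(4, dtype='i')
        win = MPI.Win.Create(buf, comm=comm)

        win.Fence()                              # open an access epoch
        if rank == 0 and comm.Get_size() > 1:
            data = np.arange(4, dtype='i')
            win.Put([data, MPI.INT], 1)          # one-sided put into rank 1's window
        win.Fence()                              # close the epoch; data is now visible

        if rank == 1:
            print("rank 1 window contents:", buf)
        win.Free()

    Run under an MPI launcher, for example mpiexec -n 2 python put_demo.py (the file name is arbitrary).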

  8. Survey of computer codes applicable to waste facility performance evaluations

    International Nuclear Information System (INIS)

    Alsharif, M.; Pung, D.L.; Rivera, A.L.; Dole, L.R.

    1988-01-01

    This study is an effort to review existing information that is useful for developing an integrated model for predicting the performance of a radioactive waste facility. A summary description of 162 computer codes is given. The identified computer programs address the performance of waste packages, waste transport and equilibrium geochemistry, hydrological processes in unsaturated and saturated zones, and general waste facility performance assessment. Some programs also deal with thermal analysis, structural analysis, and special purposes. A number of these computer programs are being used by the US Department of Energy, the US Nuclear Regulatory Commission, and their contractors to analyze various aspects of waste package performance. Fifty-five of these codes were identified as being potentially useful for the analysis of low-level radioactive waste facilities located above the water table. The code summaries include authors, identification data, model types, and pertinent references. 14 refs., 5 tabs

  9. Improving the Eco-Efficiency of High Performance Computing Clusters Using EECluster

    Directory of Open Access Journals (Sweden)

    Alberto Cocaña-Fernández

    2016-03-01

    As data and supercomputing centres increase their performance to improve service quality and target more ambitious challenges every day, their carbon footprint also continues to grow, and has already reached the magnitude of the aviation industry. High power consumption is also becoming a remarkable bottleneck for the economic expansion of these infrastructures, due to the unavailability of sufficient energy sources. A substantial part of the problem is caused by the current energy consumption of High Performance Computing (HPC) clusters. To alleviate this situation, we present in this work EECluster, a tool that integrates with multiple open-source Resource Management Systems to significantly reduce the carbon footprint of clusters by improving their energy efficiency. EECluster implements a dynamic power management mechanism based on Computational Intelligence techniques, learning a set of rules through multi-criteria evolutionary algorithms. This approach enables cluster operators to find the optimal balance between a reduction in cluster energy consumption, service quality, and the number of reconfigurations. Experimental studies using both synthetic and actual workloads from a real-world cluster support the adoption of this tool to reduce the carbon footprint of HPC clusters.
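
    The learned on/off rules can be pictured as simple decision functions over the cluster state; the sketch below is a hand-written, hypothetical example of such a rule (thresholds invented), not one produced by EECluster's evolutionary algorithms.

        # Hypothetical power-management rule of the kind such a tool could learn:
        # decide how many idle nodes to keep powered, based on the queue state.
        def nodes_to_keep_on(queued_jobs: int, idle_nodes: int,
                             spare_target: int = 2, max_idle: int = 8) -> int:
            """Return how many of the currently idle nodes should stay powered."""
            # Keep enough nodes for the queued work plus a small spare pool,
            # but never more than a cap that bounds idle energy consumption.
            wanted = min(queued_jobs + spare_target, max_idle)
            return min(idle_nodes, wanted)

        print(nodes_to_keep_on(queued_jobs=0, idle_nodes=6))   # -> 2 (spares only)
        print(nodes_to_keep_on(queued_jobs=5, idle_nodes=6))   # -> 6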

  10. High-performance mass storage system for workstations

    Science.gov (United States)

    Chiang, T.; Tang, Y.; Gupta, L.; Cooperman, S.

    1993-01-01

    Reduced Instruction Set Computer (RISC) workstations and Personal Computers (PCs) are very popular tools for office automation, command and control, scientific analysis, database management, and many other applications. However, when using Input/Output (I/O) intensive applications, RISC workstations and PCs are often overburdened with the tasks of collecting, staging, storing, and distributing data. Also, even with standard high-performance peripherals and storage devices, the I/O function can still be a common bottleneck process. Therefore, the high-performance mass storage system, developed by Loral AeroSys' Independent Research and Development (IR&D) engineers, can offload I/O-related functions from a RISC workstation and provide high-performance I/O functions and external interfaces. The high-performance mass storage system has the capabilities to ingest high-speed real-time data, perform signal or image processing, and stage, archive, and distribute the data. This mass storage system uses a hierarchical storage structure, thus reducing the total data storage cost while maintaining high I/O performance. The high-performance mass storage system is a network of low-cost parallel processors and storage devices. The nodes in the network have special I/O functions such as: SCSI controller, Ethernet controller, gateway controller, RS232 controller, IEEE488 controller, and digital/analog converter. The nodes are interconnected through high-speed direct memory access links to form a network. The topology of the network is easily reconfigurable to maximize system throughput for various applications. This high-performance mass storage system takes advantage of a 'busless' architecture for maximum expandability. The mass storage system consists of magnetic disks, a WORM optical disk jukebox, and an 8 mm helical scan tape to form a hierarchical storage structure. Commonly used files are kept on magnetic disk for fast retrieval. The optical disks are used as archive

  11. Implementing and developing cloud computing applications

    CERN Document Server

    Sarna, David E Y

    2010-01-01

    From small start-ups to major corporations, companies of all sizes have embraced cloud computing for the scalability, reliability, and cost benefits it can provide. It has even been said that cloud computing may have a greater effect on our lives than the PC and dot-com revolutions combined.Filled with comparative charts and decision trees, Implementing and Developing Cloud Computing Applications explains exactly what it takes to build robust and highly scalable cloud computing applications in any organization. Covering the major commercial offerings available, it provides authoritative guidan

  12. Real-time Tsunami Inundation Prediction Using High Performance Computers

    Science.gov (United States)

    Oishi, Y.; Imamura, F.; Sugawara, D.

    2014-12-01

    Recently, offshore tsunami observation stations based on cabled ocean-bottom pressure gauges have been actively deployed, especially in Japan. These cabled systems are designed to provide real-time tsunami data before tsunamis reach coastlines, for disaster mitigation purposes. To receive real benefits from these observations, real-time analysis techniques that make effective use of these data are necessary. A representative study was made by Tsushima et al. (2009), who proposed a method to provide instant tsunami source prediction based on the acquired tsunami waveform data. As time passes, the prediction is improved by using updated waveform data. After a tsunami source is predicted, tsunami waveforms are synthesized from pre-computed tsunami Green functions of linear long wave equations. Tsushima et al. (2014) updated the method by combining the tsunami waveform inversion with an instant inversion of coseismic crustal deformation and improved the prediction accuracy and speed in the early stages. For disaster mitigation purposes, real-time predictions of tsunami inundation are also important. In this study, we discuss the possibility of real-time tsunami inundation predictions, which require faster-than-real-time tsunami inundation simulation in addition to instant tsunami source analysis. Although solving the non-linear shallow water equations for inundation prediction is computationally demanding, it has become feasible through recent developments in high performance computing technologies. We conducted parallel computations of tsunami inundation and achieved 6.0 TFLOPS by using 19,000 CPU cores. We employed a leap-frog finite difference method with nested staggered grids whose resolutions range from 405 m to 5 m. The resolution ratio of each nested domain was 1/3. The total number of grid points was 13 million, and the time step was 0.1 seconds. Tsunami sources of the 2011 Tohoku-oki earthquake were tested. The inundation prediction up to 2 hours after the
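
    For orientation, the core of such a solver is a staggered-grid finite difference update of the shallow water equations; the sketch below is a drastically simplified one-dimensional linear version with reflective boundaries and invented parameters, not the nested nonlinear scheme used in the study.

        # 1D linear shallow-water equations on a staggered grid with a leap-frog-style
        # update -- a drastically simplified analogue of the nested nonlinear scheme.
        import numpy as np

        g, depth = 9.81, 4000.0               # gravity, uniform ocean depth (m)
        nx, dx = 400, 5000.0                  # grid points, spacing (m)
        dt = 0.5 * dx / np.sqrt(g * depth)    # CFL-limited time step

        eta = np.exp(-((np.arange(nx) - nx // 2) * dx / 1.0e5) ** 2)  # initial hump (m)
        u = np.zeros(nx + 1)                  # velocities at cell faces (walls stay at rest)

        for _ in range(500):
            # Momentum update from the free-surface gradient (interior faces only).
            u[1:-1] -= g * dt / dx * (eta[1:] - eta[:-1])
            # Continuity update from the velocity divergence.
            eta -= depth * dt / dx * (u[1:] - u[:-1])

        print("max surface elevation after 500 steps:", eta.max())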

  13. Design of massively parallel hardware multi-processors for highly-demanding embedded applications

    NARCIS (Netherlands)

    Jozwiak, L.; Jan, Y.

    2013-01-01

    Many new embedded applications require complex computations to be performed to tight schedules, while at the same time demanding low energy consumption and low cost. For implementation of these highly-demanding applications, highly-optimized application-specific multi-processor system-on-a-chip

  14. High-performance insulator structures for accelerator applications

    International Nuclear Information System (INIS)

    Sampayan, S.E.; Caporaso, G.J.; Sanders, D.M.; Stoddard, R.D.; Trimble, D.O.; Elizondo, J.; Krogh, M.L.; Wieskamp, T.F.

    1997-05-01

    A new, high gradient insulator technology has been developed for accelerator systems. The concept involves the use of alternating layers of conductors and insulators with periods of order 1 mm or less. These structures perform many times better (about 1.5 to 4 times higher breakdown electric field) than conventional insulators in long pulse, short pulse, and alternating polarity applications. We describe our ongoing studies investigating the degradation of the breakdown electric field resulting from alternate fabrication techniques, the effect of gas pressure, the effect of the insulator-to-electrode interface gap spacing, and the performance of the insulator structure under bi-polar stress

  15. High performance systems

    Energy Technology Data Exchange (ETDEWEB)

    Vigil, M.B. [comp.

    1995-03-01

    This document provides a written compilation of the presentations and viewgraphs from the 1994 Conference on High Speed Computing given at the High Speed Computing Conference, "High Performance Systems," held at Gleneden Beach, Oregon, on April 18 through 21, 1994.

  16. Can We Build a Truly High Performance Computer Which is Flexible and Transparent?

    KAUST Repository

    Rojas, Jhonathan Prieto; Sevilla, Galo T.; Hussain, Muhammad Mustafa

    2013-01-01

    cost advantage. In that context, low-cost mono-crystalline bulk silicon (100) based high performance transistors are considered as the heart of today's computers. One limitation is silicon's rigidity and brittleness. Here we show a generic batch process

  17. Department of Energy Mathematical, Information, and Computational Sciences Division: High Performance Computing and Communications Program

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1996-11-01

    This document is intended to serve two purposes. Its first purpose is that of a program status report of the considerable progress that the Department of Energy (DOE) has made since 1993, the time of the last such report (DOE/ER-0536, The DOE Program in HPCC), toward achieving the goals of the High Performance Computing and Communications (HPCC) Program. The second purpose is that of a summary report of the many research programs administered by the Mathematical, Information, and Computational Sciences (MICS) Division of the Office of Energy Research under the auspices of the HPCC Program and to provide, wherever relevant, easy access to pertinent information about MICS-Division activities via universal resource locators (URLs) on the World Wide Web (WWW).

  18. Design and applications of Computed Industrial Tomographic Imaging System (CITIS)

    International Nuclear Information System (INIS)

    Ramakrishna, G.S.; Umesh Kumar; Datta, S.S.; Rao, S.M.

    1996-01-01

    Computed tomographic imaging is an advanced technique for nondestructive testing (NDT) and examination. For the first time in India, a computer-aided tomography system for testing industrial components has been indigenously developed at BARC and successfully demonstrated. The system, in addition to Computed Tomography (CT), can also perform Digital Radiography (DR), serving as a powerful tool for NDT applications. It has wide applications in the nuclear, space and allied fields. The authors have developed a computed industrial tomographic imaging system with a Cesium-137 gamma radiation source for nondestructive examination of engineering and industrial specimens. This presentation highlights the design and development of a prototype system and its software for image reconstruction, simulation and display. The paper also describes results obtained with several test specimens, current development work, and the possibility of using neutrons as well as high energy X-rays in computed tomography. (author)

  19. Quo vadis: Hydrologic inverse analyses using high-performance computing and a D-Wave quantum annealer

    Science.gov (United States)

    O'Malley, D.; Vesselinov, V. V.

    2017-12-01

    Classical microprocessors have had a dramatic impact on hydrology for decades, due largely to the exponential growth in computing power predicted by Moore's law. However, this growth is not expected to continue indefinitely and has already begun to slow. Quantum computing is an emerging alternative to classical microprocessors. Here, we demonstrated cutting edge inverse model analyses utilizing some of the best available resources in both worlds: high-performance classical computing and a D-Wave quantum annealer. The classical high-performance computing resources are utilized to build an advanced numerical model that assimilates data from O(10^5) observations, including water levels, drawdowns, and contaminant concentrations. The developed model accurately reproduces the hydrologic conditions at a Los Alamos National Laboratory contamination site, and can be leveraged to inform decision-making about site remediation. We demonstrate the use of a D-Wave 2X quantum annealer to solve hydrologic inverse problems. This work can be seen as an early step in quantum-computational hydrology. We compare and contrast our results with an early inverse approach in classical-computational hydrology that is comparable to the approach we use with quantum annealing. Our results show that quantum annealing can be useful for identifying regions of high and low permeability within an aquifer. While the problems we consider are small-scale compared to the problems that can be solved with modern classical computers, they are large compared to the problems that could be solved with early classical CPUs. Further, the binary nature of the high/low permeability problem makes it well-suited to quantum annealing, but challenging for classical computers.

  20. Computer technique for evaluating collimator performance

    International Nuclear Information System (INIS)

    Rollo, F.D.

    1975-01-01

    A computer program has been developed to theoretically evaluate the overall performance of collimators used with radioisotope scanners and γ cameras. The first step of the program involves the determination of the line spread function (LSF) and geometrical efficiency from the fundamental parameters of the collimator being evaluated. The working equations can be applied to any plane of interest. The resulting LSF is applied to subroutine computer programs which compute corresponding modulation transfer function and contrast efficiency functions. The latter function is then combined with appropriate geometrical efficiency data to determine the performance index function. The overall computer program allows one to predict from the physical parameters of the collimator alone how well the collimator will reproduce various sized spherical voids of activity in the image plane. The collimator performance program can be used to compare the performance of various collimator types, to study the effects of source depth on collimator performance, and to assist in the design of collimators. The theory of the collimator performance equation is discussed, a comparison between the experimental and theoretical LSF values is made, and examples of the application of the technique are presented
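
    The LSF-to-MTF step referred to above is a Fourier transform of the line spread function followed by normalization; a minimal numpy sketch with a synthetic Gaussian LSF (sample spacing and width are assumed values):

        # Compute a modulation transfer function (MTF) from a line spread function (LSF).
        import numpy as np

        dx = 0.5                                        # sample spacing, mm (assumed)
        x = np.arange(-50, 50) * dx
        lsf = np.exp(-x**2 / (2 * 4.0**2))              # synthetic Gaussian LSF, sigma = 4 mm

        mtf = np.abs(np.fft.rfft(lsf))
        mtf /= mtf[0]                                   # normalise so MTF(0) = 1
        freq = np.fft.rfftfreq(x.size, d=dx)            # spatial frequency, cycles/mm

        print("MTF at", freq[5], "cycles/mm =", round(mtf[5], 3))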

  1. Heterogeneous Gpu&Cpu Cluster For High Performance Computing In Cryptography

    Directory of Open Access Journals (Sweden)

    Michał Marks

    2012-01-01

    This paper addresses issues associated with distributed computing systems and the application of mixed GPU&CPU technology to data encryption and decryption algorithms. We describe a heterogeneous cluster HGCC formed by two types of nodes: Intel processor with NVIDIA graphics processing unit and AMD processor with AMD graphics processing unit (formerly ATI), and a novel software framework that hides the heterogeneity of our cluster and provides tools for solving complex scientific and engineering problems. Finally, we present the results of numerical experiments. The considered case study is concerned with parallel implementations of selected cryptanalysis algorithms. The main goal of the paper is to show the wide applicability of the GPU&CPU technology to large scale computation and data processing.
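
    Cryptanalytic workloads of this kind parallelize by splitting the search space into independent ranges; the sketch below shows the partitioning idea on CPUs only, with a trivial placeholder instead of a real cipher, and is not taken from the HGCC framework.

        # Splitting a brute-force key search into independent ranges, the way such
        # work is typically spread over heterogeneous GPU/CPU workers. The key
        # check is a trivial placeholder, not a real cipher.
        from concurrent.futures import ProcessPoolExecutor

        SECRET = 12_345_678          # stand-in for "the key that passes the decryption test"

        def search(chunk):
            start, stop = chunk
            for key in range(start, stop):
                if key == SECRET:    # placeholder for a real decryption/verification test
                    return key
            return None

        if __name__ == "__main__":
            keyspace, chunk_size = 20_000_000, 2_500_000
            chunks = [(s, min(s + chunk_size, keyspace)) for s in range(0, keyspace, chunk_size)]
            with ProcessPoolExecutor() as pool:
                for hit in pool.map(search, chunks):
                    if hit is not None:
                        print("key found:", hit)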

  2. A Lightweight, High-performance I/O Management Package for Data-intensive Computing

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Jun

    2011-06-22

    Our group has been working with ANL collaborators on bridging the gap between parallel file systems and local file systems during the course of this project period. We visited Argonne National Lab (Dr. Robert Ross's group) for one week in summer 2007, reviewed our project progress, and planned the activities for 2008-09. The PI met Dr. Robert Ross several times, including at the HEC FSIO workshop 2008, SC08, and SC10. We explored opportunities to develop a production system by leveraging our current prototype (SOGP+PVFS) into a new PVFS version, and delivered SOGP+PVFS code to the ANL PVFS2 group in 2008. We also discussed a potential project on developing new parallel programming models and runtime systems for data-intensive scalable computing (DISC); the methodology is to evolve MPI towards DISC by incorporating some functions of the Google MapReduce parallel programming model. More recently, we have been jointly exploring how to leverage existing work to perform (1) coordination/aggregation of local I/O operations prior to movement over the WAN, (2) efficient bulk data movement over the WAN, and (3) latency hiding techniques for latency-intensive operations. Since 2009, we have been applying Hadoop/MapReduce to some HEC applications with LANL scientists John Bent and Salman Habib. Another ongoing effort is to improve checkpoint performance at the I/O forwarding layer for the Roadrunner supercomputer with James Nunez and Gary Grider at LANL. Two senior undergraduates from our research group completed summer internships on high-performance file and storage system projects at LANL for three consecutive years starting in 2008. Both are now pursuing Ph.D. degrees in our group, will be in their fourth year of the program in Fall 2011, and will go to LANL over the winter break to advance the two above-mentioned efforts. Since 2009, we have been collaborating with several computer scientists (Gary Grider, John Bent, Parks Fields, James Nunez, Hsing

  3. Enabling the ATLAS Experiment at the LHC for High Performance Computing

    CERN Document Server

    AUTHOR|(CDS)2091107; Ereditato, Antonio

    In this thesis, I studied the feasibility of running computer data analysis programs from the Worldwide LHC Computing Grid, in particular large-scale simulations of the ATLAS experiment at the CERN LHC, on current general purpose High Performance Computing (HPC) systems. An approach for integrating HPC systems into the Grid is proposed, which has been implemented and tested on the „Todi” HPC machine at the Swiss National Supercomputing Centre (CSCS). Over the course of the test, more than 500000 CPU-hours of processing time have been provided to ATLAS, which is roughly equivalent to the combined computing power of the two ATLAS clusters at the University of Bern. This showed that current HPC systems can be used to efficiently run large-scale simulations of the ATLAS detector and of the detected physics processes. As a first conclusion of my work, one can argue that, in perspective, running large-scale tasks on a few large machines might be more cost-effective than running on relatively small dedicated com...

  4. A high performance, low power computational platform for complex sensing operations in smart cities

    KAUST Repository

    Jiang, Jiming; Claudel, Christian

    2017-01-01

    This paper presents a new wireless platform designed for an integrated traffic/flash flood monitoring system. The sensor platform is built around a 32-bit ARM Cortex M4 microcontroller and a 2.4 GHz 802.15.4 ISM compliant radio module. It can be interfaced with fixed traffic sensors, or receive data from vehicle transponders. This platform is specifically designed for solar-powered, low bandwidth, high computational performance wireless sensor network applications. A self-recovering unit is designed to increase reliability and allow periodic hard resets, an essential requirement for sensor networks. Radio monitoring circuitry is proposed to monitor incoming and outgoing transmissions, simplifying software debugging. We illustrate the performance of this wireless sensor platform on complex problems arising in smart cities, such as traffic flow monitoring, machine-learning-based flash flood monitoring or Kalman-filter based vehicle trajectory estimation. All design files have been uploaded and shared in an open science framework, and can be accessed from [1]. The hardware design is under CERN Open Hardware License v1.2.

  5. A high performance, low power computational platform for complex sensing operations in smart cities

    KAUST Repository

    Jiang, Jiming

    2017-02-02

    This paper presents a new wireless platform designed for an integrated traffic/flash flood monitoring system. The sensor platform is built around a 32-bit ARM Cortex M4 microcontroller and a 2.4 GHz 802.15.4 ISM compliant radio module. It can be interfaced with fixed traffic sensors, or receive data from vehicle transponders. This platform is specifically designed for solar-powered, low bandwidth, high computational performance wireless sensor network applications. A self-recovering unit is designed to increase reliability and allow periodic hard resets, an essential requirement for sensor networks. Radio monitoring circuitry is proposed to monitor incoming and outgoing transmissions, simplifying software debugging. We illustrate the performance of this wireless sensor platform on complex problems arising in smart cities, such as traffic flow monitoring, machine-learning-based flash flood monitoring or Kalman-filter based vehicle trajectory estimation. All design files have been uploaded and shared in an open science framework, and can be accessed from [1]. The hardware design is under CERN Open Hardware License v1.2.

  6. High performance parallel computing of flows in complex geometries: I. Methods

    International Nuclear Information System (INIS)

    Gourdain, N; Gicquel, L; Montagnac, M; Vermorel, O; Staffelbach, G; Garcia, M; Boussuge, J-F; Gazaix, M; Poinsot, T

    2009-01-01

    Efficient numerical tools, coupled with high-performance computers, have become a key element of the design process in the fields of energy supply and transportation. However, flow phenomena that occur in complex systems such as gas turbines and aircraft are still not fully understood, mainly because of the limitations of the models that are needed. In fact, most computational fluid dynamics (CFD) predictions found in industry today focus on a reduced or simplified version of the real system (such as a periodic sector) and are usually solved with a steady-state assumption. This paper shows how to overcome such barriers and how this new challenge can be addressed by developing flow solvers that run on high-end computing platforms using thousands of computing cores. Parallel strategies used by modern flow solvers are discussed with particular emphasis on mesh partitioning, load balancing and communication. Two examples are used to illustrate these concepts: a multi-block structured code and an unstructured code. The parallel computing strategies used with both flow solvers are detailed and compared. This comparison indicates that mesh partitioning and load balancing are more straightforward with unstructured grids than with multi-block structured meshes. However, the mesh-partitioning stage can be challenging for unstructured grids, mainly due to memory limitations of the newly developed massively parallel architectures. Finally, detailed investigations show that the impact of mesh partitioning on the numerical CFD solutions, due to rounding errors and block splitting, may be significant and should be carefully addressed before qualifying massively parallel CFD tools for routine industrial use.

  7. High-Precision Computation and Mathematical Physics

    International Nuclear Information System (INIS)

    Bailey, David H.; Borwein, Jonathan M.

    2008-01-01

    At the present time, IEEE 64-bit floating-point arithmetic is sufficiently accurate for most scientific applications. However, for a rapidly growing body of important scientific computing applications, a higher level of numeric precision is required. Such calculations are facilitated by high-precision software packages that include high-level language translation modules to minimize the conversion effort. This paper presents a survey of recent applications of these techniques and provides some analysis of their numerical requirements. These applications include supernova simulations, climate modeling, planetary orbit calculations, Coulomb n-body atomic systems, scattering amplitudes of quarks, gluons and bosons, nonlinear oscillator theory, Ising theory, quantum field theory and experimental mathematics. We conclude that high-precision arithmetic facilities are now an indispensable component of a modern large-scale scientific computing environment.

  8. High performance hybrid magnetic structure for biotechnology applications

    Science.gov (United States)

    Humphries, David E [El Cerrito, CA; Pollard, Martin J [El Cerrito, CA; Elkin, Christopher J [San Ramon, CA

    2009-02-03

    The present disclosure provides a high performance hybrid magnetic structure made from a combination of permanent magnets and ferromagnetic pole materials which are assembled in a predetermined array. The hybrid magnetic structure provides means for separation and other biotechnology applications involving holding, manipulation, or separation of magnetic or magnetizable molecular structures and targets. Also disclosed are further improvements to aspects of the hybrid magnetic structure, including additional elements and for adapting the use of the hybrid magnetic structure for use in biotechnology and high throughput processes.

  9. Information Power Grid: Distributed High-Performance Computing and Large-Scale Data Management for Science and Engineering

    Science.gov (United States)

    Johnston, William E.; Gannon, Dennis; Nitzberg, Bill

    2000-01-01

    We use the term "Grid" to refer to distributed, high performance computing and data handling infrastructure that incorporates geographically and organizationally dispersed, heterogeneous resources that are persistent and supported. This infrastructure includes: (1) Tools for constructing collaborative, application oriented Problem Solving Environments / Frameworks (the primary user interfaces for Grids); (2) Programming environments, tools, and services providing various approaches for building applications that use aggregated computing and storage resources, and federated data sources; (3) A comprehensive and consistent set of location independent tools and services for accessing and managing dynamic collections of widely distributed resources: heterogeneous computing systems, storage systems, real-time data sources and instruments, human collaborators, and communications systems; (4) Operational infrastructure including management tools for distributed systems and distributed resources, user services, accounting and auditing, strong and location independent user authentication and authorization, and overall system security services. The vision for NASA's Information Power Grid - a computing and data Grid - is that it will provide significant new capabilities to scientists and engineers by facilitating routine construction of information based problem solving environments / frameworks. Such Grids will knit together widely distributed computing, data, instrument, and human resources into just-in-time systems that can address complex and large-scale computing and data analysis problems. Examples of these problems include: (1) Coupled, multidisciplinary simulations too large for single systems (e.g., multi-component NPSS turbomachine simulation); (2) Use of widely distributed, federated data archives (e.g., simultaneous access to meteorological, topological, aircraft performance, and flight path scheduling databases supporting a National Air Space Simulation system); (3

  10. High Performance Multi-GPU SpMV for Multi-component PDE-Based Applications

    KAUST Repository

    Abdelfattah, Ahmad

    2015-07-25

    Leveraging optimization techniques (e.g., register blocking and double buffering) introduced in the context of KBLAS, a Level 2 BLAS high performance library on GPUs, the authors implement dense matrix-vector multiplications within a sparse-block structure. While these optimizations are important for high performance dense kernel executions, they are even more critical when dealing with sparse linear algebra operations. The most time-consuming phase of many multicomponent applications, such as models of reacting flows or petroleum reservoirs, is the solution at each implicit time step of large, sparse spatially structured or unstructured linear systems. The standard method is a preconditioned Krylov solver. The Sparse Matrix-Vector multiplication (SpMV) is, in turn, one of the most time-consuming operations in such solvers. Because there is no data reuse of the elements of the matrix within a single SpMV, kernel performance is limited by the speed at which data can be transferred from memory to registers, making the bus bandwidth the major bottleneck. On the other hand, in case of a multi-species model, the resulting Jacobian has a dense block structure. For contemporary petroleum reservoir simulations, the block size typically ranges from three to a few dozen among different models, and still larger blocks are relevant within adaptively model-refined regions of the domain, though generally the size of the blocks, related to the number of conserved species, is constant over large regions within a given model. This structure can be exploited beyond the convenience of a block compressed row data format, because it offers opportunities to hide the data motion with useful computations. The new SpMV kernel outperforms existing state-of-the-art implementations on single and multi-GPUs using matrices with dense block structure representative of porous media applications with both structured and unstructured multi-component grids.
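    The block compressed sparse row layout mentioned above is what allows a kernel to amortize index loads over dense sub-blocks. The reference loop below is only a didactic Python/NumPy sketch of the data layout and traversal order (SciPy's bsr_matrix is a production equivalent); the block size and values are assumptions for illustration.

```python
import numpy as np

def bsr_spmv(indptr, indices, blocks, x, b):
    """y = A @ x for a block-CSR matrix with square b-by-b dense blocks.

    indptr/indices follow the usual CSR convention but index block rows and
    block columns; `blocks` stacks the dense blocks in storage order.
    """
    n_block_rows = len(indptr) - 1
    y = np.zeros(n_block_rows * b)
    for i in range(n_block_rows):
        acc = np.zeros(b)
        for p in range(indptr[i], indptr[i + 1]):
            j = indices[p]
            # One dense b-by-b GEMV per block: this is where register
            # blocking and double buffering pay off on the GPU.
            acc += blocks[p] @ x[j * b:(j + 1) * b]
        y[i * b:(i + 1) * b] = acc
    return y

# Tiny 2-block-row example with block size b = 3 (values are arbitrary).
b = 3
indptr = np.array([0, 2, 3])
indices = np.array([0, 1, 1])
blocks = np.arange(3 * b * b, dtype=float).reshape(3, b, b)
x = np.ones(2 * b)
print(bsr_spmv(indptr, indices, blocks, x, b))
```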

  11. The Convergence of High Performance Computing and Large Scale Data Analytics

    Science.gov (United States)

    Duffy, D.; Bowen, M. K.; Thompson, J. H.; Yang, C. P.; Hu, F.; Wills, B.

    2015-12-01

    As the combinations of remote sensing observations and model outputs have grown, scientists are increasingly burdened with both the necessity and complexity of large-scale data analysis. Scientists are increasingly applying traditional high performance computing (HPC) solutions to solve their "Big Data" problems. While this approach has the benefit of limiting data movement, the HPC system is not optimized to run analytics, which can create problems that permeate throughout the HPC environment. To solve these issues and to alleviate some of the strain on the HPC environment, the NASA Center for Climate Simulation (NCCS) has created the Advanced Data Analytics Platform (ADAPT), which combines both HPC and cloud technologies to create an agile system designed for analytics. Large, commonly used data sets are stored in this system in a write once/read many file system, such as Landsat, MODIS, MERRA, and NGA. High performance virtual machines are deployed and scaled according to the individual scientist's requirements specifically for data analysis. On the software side, the NCCS and GMU are working with emerging commercial technologies and applying them to structured, binary scientific data in order to expose the data in new ways. Native NetCDF data is being stored within a Hadoop Distributed File System (HDFS) enabling storage-proximal processing through MapReduce while continuing to provide accessibility of the data to traditional applications. Once the data is stored within HDFS, an additional indexing scheme is built on top of the data and placed into a relational database. This spatiotemporal index enables extremely fast mappings of queries to data locations to dramatically speed up analytics. These are some of the first steps toward a single unified platform that optimizes for both HPC and large-scale data analysis, and this presentation will elucidate the resulting and necessary exascale architectures required for future systems.
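    The spatiotemporal index described above amounts to a table that maps bounding boxes and time ranges to data locations, so a query can be routed to files without scanning them. A much-simplified, hypothetical sketch using SQLite follows; the table layout, field names, and paths are assumptions, not the NCCS schema.

```python
import sqlite3

# Minimal spatiotemporal index: one row per data granule, recording its
# bounding box, time range, and where the bytes live in HDFS.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE granules (
    path TEXT, t_start TEXT, t_end TEXT,
    lat_min REAL, lat_max REAL, lon_min REAL, lon_max REAL)""")
con.executemany(
    "INSERT INTO granules VALUES (?, ?, ?, ?, ?, ?, ?)",
    [("hdfs:///merra/1980/01.nc", "1980-01-01", "1980-01-31", -90, 90, -180, 180),
     ("hdfs:///modis/tile_h08v05.hdf", "2015-06-01", "2015-06-08", 30, 40, -120, -110)])

# A query for a region and time window returns only the matching file paths,
# which can then be handed to storage-proximal MapReduce tasks.
rows = con.execute(
    """SELECT path FROM granules
       WHERE lat_max >= ? AND lat_min <= ?
         AND lon_max >= ? AND lon_min <= ?
         AND t_end >= ? AND t_start <= ?""",
    (32, 38, -118, -112, "2015-06-02", "2015-06-05")).fetchall()
print([r[0] for r in rows])
```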

  12. Application of High-performance Visual Analysis Methods to Laser Wakefield Particle Acceleration Data

    International Nuclear Information System (INIS)

    Rubel, Oliver; Prabhat, Mr.; Wu, Kesheng; Childs, Hank; Meredith, Jeremy; Geddes, Cameron G.R.; Cormier-Michel, Estelle; Ahern, Sean; Weber, Gunther H.; Messmer, Peter; Hagen, Hans; Hamann, Bernd; Bethel, E. Wes

    2008-01-01

    Our work combines and extends techniques from high-performance scientific data management and visualization to enable scientific researchers to gain insight from extremely large, complex, time-varying laser wakefield particle accelerator simulation data. We extend histogram-based parallel coordinates for use in visual information display as well as an interface for guiding and performing data mining operations, which are based upon multi-dimensional and temporal thresholding and data subsetting operations. To achieve very high performance on parallel computing platforms, we leverage FastBit, a state-of-the-art index/query technology, to accelerate data mining and multi-dimensional histogram computation. We show how these techniques are used in practice by scientific researchers to identify, visualize and analyze a particle beam in a large, time-varying dataset.
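    The data mining operations referenced above (multi-dimensional thresholding, subsetting, and histogram computation) can be stated compactly. The NumPy sketch below mimics, on synthetic particles, the kind of query that FastBit accelerates with bitmap indexes; the variable names and thresholds are illustrative assumptions.

```python
import numpy as np

# Synthetic particle table: longitudinal momentum px and transverse coordinate y.
rng = np.random.default_rng(1)
px = rng.gamma(shape=2.0, scale=1.0, size=100_000)
y = rng.normal(scale=0.5, size=100_000)

# Multi-dimensional threshold query, e.g. "beam" particles with high px and
# small |y|. FastBit answers this kind of predicate from bitmap indexes
# instead of scanning the arrays.
beam = (px > 5.0) & (np.abs(y) < 0.2)

# 2-D histogram of the selected subset, the building block of the
# histogram-based parallel coordinates display.
hist, px_edges, y_edges = np.histogram2d(px[beam], y[beam], bins=(32, 32))
print(f"selected {beam.sum()} of {px.size} particles; "
      f"densest bin holds {int(hist.max())}")
```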

  13. High Performance Computing Facility Operational Assessment, FY 2011 Oak Ridge Leadership Computing Facility

    Energy Technology Data Exchange (ETDEWEB)

    Baker, Ann E [ORNL; Bland, Arthur S Buddy [ORNL; Hack, James J [ORNL; Barker, Ashley D [ORNL; Boudwin, Kathlyn J. [ORNL; Kendall, Ricky A [ORNL; Messer, Bronson [ORNL; Rogers, James H [ORNL; Shipman, Galen M [ORNL; Wells, Jack C [ORNL; White, Julia C [ORNL

    2011-08-01

    Oak Ridge National Laboratory's Leadership Computing Facility (OLCF) continues to deliver the most powerful resources in the U.S. for open science. At 2.33 petaflops peak performance, the Cray XT Jaguar delivered more than 1.5 billion core hours in calendar year (CY) 2010 to researchers around the world for computational simulations relevant to national and energy security; advancing the frontiers of knowledge in physical sciences and areas of biological, medical, environmental, and computer sciences; and providing world-class research facilities for the nation's science enterprise. Scientific achievements by OLCF users range from collaboration with university experimentalists to produce a working supercapacitor that uses atom-thick sheets of carbon materials to finely determining the resolution requirements for simulations of coal gasifiers and their components, thus laying the foundation for development of commercial-scale gasifiers. OLCF users are pushing the boundaries with software applications sustaining more than one petaflop of performance in the quest to illuminate the fundamental nature of electronic devices. Other teams of researchers are working to resolve predictive capabilities of climate models, to refine and validate genome sequencing, and to explore the most fundamental materials in nature - quarks and gluons - and their unique properties. Details of these scientific endeavors - not possible without access to leadership-class computing resources - are detailed in Section 4 of this report and in the INCITE in Review. Effective operations of the OLCF play a key role in the scientific missions and accomplishments of its users. This Operational Assessment Report (OAR) will delineate the policies, procedures, and innovations implemented by the OLCF to continue delivering a petaflop-scale resource for cutting-edge research. The 2010 operational assessment of the OLCF yielded recommendations that have been addressed (Reference Section 1) and

  14. A C++11 implementation of arbitrary-rank tensors for high-performance computing

    Science.gov (United States)

    Aragón, Alejandro M.

    2014-11-01

    This article discusses an efficient implementation of tensors of arbitrary rank using some of the idioms introduced by the recently published C++ ISO Standard (C++11). With the aim of providing a basic building block for high-performance computing, a single Array class template is carefully crafted, from which vectors, matrices, and even higher-order tensors can be created. An expression template facility is also built around the array class template to provide convenient mathematical syntax. As a result, by using templates, an extra high-level layer is added to the C++ language for dealing with algebraic objects and their operations, without compromising performance. The implementation is tested running on both CPU and GPU.
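    As a rough analogue of the expression-template idea, the Python sketch below builds an expression tree through operator overloading and defers evaluation until explicitly requested. It is only a conceptual illustration: a real C++11 implementation fuses the loops at compile time and avoids temporaries, which this Python version cannot do.

```python
import numpy as np

class Expr:
    """Lazy array expression: operators build a tree; evaluation is deferred
    until .eval(), mimicking the structure of C++ expression templates."""
    def __add__(self, other):
        return BinOp(np.add, self, wrap(other))
    def __mul__(self, other):
        return BinOp(np.multiply, self, wrap(other))

class Leaf(Expr):
    def __init__(self, data):
        self.data = np.asarray(data, dtype=float)
    def eval(self):
        return self.data

class BinOp(Expr):
    def __init__(self, op, lhs, rhs):
        self.op, self.lhs, self.rhs = op, lhs, rhs
    def eval(self):
        # An expression-template library would fuse this into a single loop
        # with no temporaries; NumPy still materializes intermediates here.
        return self.op(self.lhs.eval(), self.rhs.eval())

def wrap(x):
    return x if isinstance(x, Expr) else Leaf(x)

a, b, c = Leaf([1, 2, 3]), Leaf([4, 5, 6]), Leaf([7, 8, 9])
expr = a + b * c          # builds an expression tree, computes nothing yet
print(expr.eval())        # [29. 42. 57.]
```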

  15. Performance Analysis of Cloud Computing Architectures Using Discrete Event Simulation

    Science.gov (United States)

    Stocker, John C.; Golomb, Andrew M.

    2011-01-01

    Cloud computing offers the economic benefit of on-demand resource allocation to meet changing enterprise computing needs. However, cloud computing is at a disadvantage compared to traditional hosting when it comes to providing predictable application and service performance. Cloud computing relies on resource scheduling in a virtualized network-centric server environment, which makes static performance analysis infeasible. We developed a discrete event simulation model to evaluate the overall effectiveness of organizations in executing their workflow in traditional and cloud computing architectures. The two-part model framework characterizes both the demand, using a probability distribution for each type of service request, and the enterprise computing resource constraints. Our simulations provide quantitative analysis to design and provision computing architectures that maximize overall mission effectiveness. We share our analysis of key resource constraints in cloud computing architectures and findings on the appropriateness of cloud computing in various applications.
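    A discrete event simulation of the kind described can be organized around a priority queue of timestamped events. The toy sketch below models service requests contending for a fixed pool of virtual machines; the arrival rate, service time, and pool size are invented parameters, not values from the study.

```python
import heapq
import random

def simulate(n_servers=4, arrival_rate=3.0, mean_service=1.0, horizon=1_000.0):
    """M/M/c-style discrete event simulation; returns mean wait before service."""
    random.seed(0)
    events = [(random.expovariate(arrival_rate), "arrival")]
    free, queue, latencies, now = n_servers, [], [], 0.0
    while events and now < horizon:
        now, kind = heapq.heappop(events)
        if kind == "arrival":
            queue.append(now)
            heapq.heappush(events, (now + random.expovariate(arrival_rate), "arrival"))
        else:  # a request finished service, releasing a server
            free += 1
        # Start service for queued requests whenever a server is free.
        while free and queue:
            start = queue.pop(0)
            latencies.append(now - start)
            free -= 1
            heapq.heappush(events, (now + random.expovariate(1.0 / mean_service), "departure"))
    return sum(latencies) / len(latencies)

print(f"mean wait before service: {simulate():.3f} time units")
```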

  16. Contributing to the design of run-time systems dedicated to high performance computing; Contribution a l'elaboration d'environnements de programmation dedies au calcul scientifique hautes performances

    Energy Technology Data Exchange (ETDEWEB)

    Perache, M

    2006-10-15

    In the field of intensive scientific computing, the quest for performance has to face the increasing complexity of parallel architectures. Nowadays, these machines exhibit a deep memory hierarchy which complicates the design of efficient parallel applications. This thesis proposes a programming environment for designing efficient parallel programs on top of clusters of multi-processors. It features a programming model centered around collective communications and synchronizations, and provides load balancing facilities. The programming interface, named MPC, provides high-level paradigms which are optimized for the underlying architecture. The environment is fully functional and used within the CEA/DAM (TERANOVA) computing center. The evaluations presented in this document confirm the relevance of our approach. (author)

  17. Human and Robotic Space Mission Use Cases for High-Performance Spaceflight Computing

    Science.gov (United States)

    Some, Raphael; Doyle, Richard; Bergman, Larry; Whitaker, William; Powell, Wesley; Johnson, Michael; Goforth, Montgomery; Lowry, Michael

    2013-01-01

    Spaceflight computing is a key resource in NASA space missions and a core determining factor of spacecraft capability, with ripple effects throughout the spacecraft, end-to-end system, and mission. Onboard computing can be aptly viewed as a "technology multiplier" in that advances provide direct dramatic improvements in flight functions and capabilities across the NASA mission classes, and enable new flight capabilities and mission scenarios, increasing science and exploration return. Space-qualified computing technology, however, has not advanced significantly in well over ten years and the current state of the practice fails to meet the near- to mid-term needs of NASA missions. Recognizing this gap, the NASA Game Changing Development Program (GCDP), under the auspices of the NASA Space Technology Mission Directorate, commissioned a study on space-based computing needs, looking out 15-20 years. The study resulted in a recommendation to pursue high-performance spaceflight computing (HPSC) for next-generation missions, and a decision to partner with the Air Force Research Lab (AFRL) in this development.

  18. STEMsalabim: A high-performance computing cluster friendly code for scanning transmission electron microscopy image simulations of thin specimens

    International Nuclear Information System (INIS)

    Oelerich, Jan Oliver; Duschek, Lennart; Belz, Jürgen; Beyer, Andreas; Baranovskii, Sergei D.; Volz, Kerstin

    2017-01-01

    Highlights: • We present STEMsalabim, a modern implementation of the multislice algorithm for simulation of STEM images. • Our package is highly parallelizable on high-performance computing clusters, combining shared and distributed memory architectures. • With STEMsalabim, computationally and memory expensive STEM image simulations can be carried out within reasonable time. - Abstract: We present a new multislice code for the computer simulation of scanning transmission electron microscope (STEM) images based on the frozen lattice approximation. Unlike existing software packages, the code is optimized to perform well on highly parallelized computing clusters, combining distributed and shared memory architectures. This enables efficient calculation of large lateral scanning areas of the specimen within the frozen lattice approximation and fine-grained sweeps of parameter space.

  19. STEMsalabim: A high-performance computing cluster friendly code for scanning transmission electron microscopy image simulations of thin specimens

    Energy Technology Data Exchange (ETDEWEB)

    Oelerich, Jan Oliver, E-mail: jan.oliver.oelerich@physik.uni-marburg.de; Duschek, Lennart; Belz, Jürgen; Beyer, Andreas; Baranovskii, Sergei D.; Volz, Kerstin

    2017-06-15

    Highlights: • We present STEMsalabim, a modern implementation of the multislice algorithm for simulation of STEM images. • Our package is highly parallelizable on high-performance computing clusters, combining shared and distributed memory architectures. • With STEMsalabim, computationally and memory expensive STEM image simulations can be carried out within reasonable time. - Abstract: We present a new multislice code for the computer simulation of scanning transmission electron microscope (STEM) images based on the frozen lattice approximation. Unlike existing software packages, the code is optimized to perform well on highly parallelized computing clusters, combining distributed and shared memory architectures. This enables efficient calculation of large lateral scanning areas of the specimen within the frozen lattice approximation and fine-grained sweeps of parameter space.

  20. Implementation of Scientific Computing Applications on the Cell Broadband Engine

    Directory of Open Access Journals (Sweden)

    Guochun Shi

    2009-01-01

    Full Text Available The Cell Broadband Engine architecture is a revolutionary processor architecture well suited for many scientific codes. This paper reports on an effort to implement several traditional high-performance scientific computing applications on the Cell Broadband Engine processor, including molecular dynamics, quantum chromodynamics and quantum chemistry codes. The paper discusses data and code restructuring strategies necessary to adapt the applications to the intrinsic properties of the Cell processor and demonstrates performance improvements achieved on the Cell architecture. It concludes with the lessons learned and provides practical recommendations on optimization techniques that are believed to be most appropriate.

  1. 3D printed high performance strain sensors for high temperature applications

    Science.gov (United States)

    Rahman, Md Taibur; Moser, Russell; Zbib, Hussein M.; Ramana, C. V.; Panat, Rahul

    2018-01-01

    Realization of high temperature physical measurement sensors, which are needed in many current and emerging technologies, is challenging due to the degradation of their electrical stability by drift currents, material oxidation, thermal strain, and creep. In this paper, for the first time, we demonstrate that 3D printed sensors show a metamaterial-like behavior, resulting in superior performance such as high sensitivity, low thermal strain, and enhanced thermal stability. The sensors were fabricated from silver (Ag) nanoparticles (NPs) using an advanced Aerosol Jet based additive printing method followed by thermal sintering. The sensors were tested under cyclic strain up to a temperature of 500 °C and showed a gauge factor of 3.15 ± 0.086, which is about 57% higher than that of commercially available gauges. The sensor thermal strain was also an order of magnitude lower than that of commercial gauges for operation up to a temperature of 500 °C. An analytical model was developed to account for the enhanced performance of such printed sensors, based on enhanced lateral contraction of the NP films due to porosity, a behavior akin to cellular metamaterials. The results demonstrate the potential of 3D printing technology as a pathway to realize highly stable and high-performance sensors for high temperature applications.
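    For orientation, the gauge factor quoted above relates the relative resistance change to the applied strain, GF = (ΔR/R0)/ε. The short Python check below uses an invented strain value to contrast the reported gauge factor with a conventional foil gauge near 2.0, which is consistent with the quoted ~57% improvement.

```python
# Gauge factor GF = (dR/R0) / strain. With the reported GF of 3.15, a printed
# sensor under 0.1% strain should show roughly a 0.315% resistance change,
# versus about 0.200% for a conventional foil gauge with GF near 2.0.
def delta_r_ratio(gauge_factor, strain):
    return gauge_factor * strain

strain = 1e-3  # 0.1% applied strain (assumed value for illustration)
for name, gf in [("printed NP sensor", 3.15), ("commercial foil gauge", 2.0)]:
    print(f"{name}: dR/R = {delta_r_ratio(gf, strain) * 100:.3f}%")
```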

  2. Optical packet switching in HPC : an analysis of applications performance

    NARCIS (Netherlands)

    Meyer, Hugo; Sancho, Jose Carlos; Mrdakovic, Milica; Miao, Wang; Calabretta, Nicola

    2018-01-01

    Optical Packet Switches (OPS) could provide the needed low latency transmissions in today large data centers. OPS can deliver lower latency and higher bandwidth than traditional electrical switches. These features are needed for parallel High Performance Computing (HPC) applications. For this

  3. An Interactive, Web-based High Performance Modeling Environment for Computational Epidemiology.

    Science.gov (United States)

    Deodhar, Suruchi; Bisset, Keith R; Chen, Jiangzhuo; Ma, Yifei; Marathe, Madhav V

    2014-07-01

    We present an integrated interactive modeling environment to support public health epidemiology. The environment combines a high resolution individual-based model with a user-friendly web-based interface that allows analysts to access the models and the analytics back-end remotely from a desktop or a mobile device. The environment is based on a loosely-coupled service-oriented-architecture that allows analysts to explore various counter factual scenarios. As the modeling tools for public health epidemiology are getting more sophisticated, it is becoming increasingly hard for non-computational scientists to effectively use the systems that incorporate such models. Thus an important design consideration for an integrated modeling environment is to improve ease of use such that experimental simulations can be driven by the users. This is achieved by designing intuitive and user-friendly interfaces that allow users to design and analyze a computational experiment and steer the experiment based on the state of the system. A key feature of a system that supports this design goal is the ability to start, stop, pause and roll-back the disease propagation and intervention application process interactively. An analyst can access the state of the system at any point in time and formulate dynamic interventions based on additional information obtained through state assessment. In addition, the environment provides automated services for experiment set-up and management, thus reducing the overall time for conducting end-to-end experimental studies. We illustrate the applicability of the system by describing computational experiments based on realistic pandemic planning scenarios. The experiments are designed to demonstrate the system's capability and enhanced user productivity.
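    The start/stop/pause/roll-back capability described above boils down to checkpointing simulation state at every step and restoring it on demand. The schematic Python sketch below uses a trivial SIR-like counter as a stand-in for the high-resolution individual-based model; the class and method names are invented for illustration.

```python
import copy

class SteerableSimulation:
    """Toy stand-in for an interactive epidemic simulation with rollback."""

    def __init__(self, susceptible=10_000, infected=10, beta=0.3, gamma=0.1):
        self.state = {"day": 0, "S": susceptible, "I": infected, "R": 0}
        self.beta, self.gamma = beta, gamma
        self.checkpoints = [copy.deepcopy(self.state)]

    def step(self):
        s, i = self.state["S"], self.state["I"]
        n = s + i + self.state["R"]
        new_inf = min(s, int(self.beta * s * i / n))
        new_rec = int(self.gamma * i)
        self.state["S"] -= new_inf
        self.state["I"] += new_inf - new_rec
        self.state["R"] += new_rec
        self.state["day"] += 1
        self.checkpoints.append(copy.deepcopy(self.state))

    def rollback(self, day):
        """Restore the state saved at `day` and discard later checkpoints."""
        self.state = copy.deepcopy(self.checkpoints[day])
        self.checkpoints = self.checkpoints[:day + 1]

    def intervene(self, new_beta):
        """Dynamic intervention driven by state assessment, e.g. a closure."""
        self.beta = new_beta

sim = SteerableSimulation()
for _ in range(20):
    sim.step()
sim.rollback(10)          # analyst rewinds to day 10 ...
sim.intervene(0.1)        # ... applies an intervention ...
for _ in range(10):       # ... and replays the remaining days
    sim.step()
print(sim.state)
```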

  4. Running Batch Jobs on Peregrine | High-Performance Computing | NREL

    Science.gov (United States)

    ... and run your application. Users typically create or edit job scripts using a text editor such as vi. Under "Using Resource Feature to Request Different Node Types": Peregrine has several types of compute nodes, which differ in the amount of memory and number of processor cores. The majority of the nodes have 24

  5. Human performance models for computer-aided engineering

    Science.gov (United States)

    Elkind, Jerome I. (Editor); Card, Stuart K. (Editor); Hochberg, Julian (Editor); Huey, Beverly Messick (Editor)

    1989-01-01

    This report discusses a topic important to the field of computational human factors: models of human performance and their use in computer-based engineering facilities for the design of complex systems. It focuses on a particular human factors design problem -- the design of cockpit systems for advanced helicopters -- and on a particular aspect of human performance -- vision and related cognitive functions. By focusing in this way, the authors were able to address the selected topics in some depth and develop findings and recommendations that they believe have application to many other aspects of human performance and to other design domains.

  6. High performance protection circuit for power electronics applications

    Energy Technology Data Exchange (ETDEWEB)

    Tudoran, Cristian D., E-mail: cristian.tudoran@itim-cj.ro; Dădârlat, Dorin N.; Toşa, Nicoleta; Mişan, Ioan [National Institute for Research and Development of Isotopic and Molecular Technologies, 67-103 Donat, PO 5 Box 700, 400293 Cluj-Napoca (Romania)

    2015-12-23

    In this paper we present a high performance protection circuit designed for the power electronics applications where the load currents can increase rapidly and exceed the maximum allowed values, like in the case of high frequency induction heating inverters or high frequency plasma generators. The protection circuit is based on a microcontroller and can be adapted for use on single-phase or three-phase power systems. Its versatility comes from the fact that the circuit can communicate with the protected system, having the role of a “sensor” or it can interrupt the power supply for protection, in this case functioning as an external, independent protection circuit.

  7. High Performance Spaceflight Computing (HPSC)

    Data.gov (United States)

    National Aeronautics and Space Administration — Space-based computing has not kept up with the needs of current and future NASA missions. We are developing a next-generation flight computing system that addresses...

  8. Building and measuring a high performance network architecture

    Energy Technology Data Exchange (ETDEWEB)

    Kramer, William T.C.; Toole, Timothy; Fisher, Chuck; Dugan, Jon; Wheeler, David; Wing, William R; Nickless, William; Goddard, Gregory; Corbato, Steven; Love, E. Paul; Daspit, Paul; Edwards, Hal; Mercer, Linden; Koester, David; Decina, Basil; Dart, Eli; Paul Reisinger, Paul; Kurihara, Riki; Zekauskas, Matthew J; Plesset, Eric; Wulf, Julie; Luce, Douglas; Rogers, James; Duncan, Rex; Mauth, Jeffery

    2001-04-20

    Once a year, the SC conferences present a unique opportunity to create and build one of the most complex and highest performance networks in the world. At SC2000, large-scale and complex local and wide area networking connections were demonstrated, including large-scale distributed applications running on different architectures. This project was designed to use the unique opportunity presented at SC2000 to create a testbed network environment and then use that network to demonstrate and evaluate high performance computational and communication applications. This testbed was designed to incorporate many interoperable systems and services and was designed for measurement from the very beginning. The end results were key insights into how to use novel, high performance networking technologies and to accumulate measurements that will give insights into the networks of the future.

  9. High-Precision Computation: Mathematical Physics and Dynamics

    International Nuclear Information System (INIS)

    Bailey, D.H.; Barrio, R.; Borwein, J.M.

    2010-01-01

    At the present time, IEEE 64-bit floating-point arithmetic is sufficiently accurate for most scientific applications. However, for a rapidly growing body of important scientific computing applications, a higher level of numeric precision is required. Such calculations are facilitated by high-precision software packages that include high-level language translation modules to minimize the conversion effort. This paper presents a survey of recent applications of these techniques and provides some analysis of their numerical requirements. These applications include supernova simulations, climate modeling, planetary orbit calculations, Coulomb n-body atomic systems, studies of the fine structure constant, scattering amplitudes of quarks, gluons and bosons, nonlinear oscillator theory, experimental mathematics, evaluation of orthogonal polynomials, numerical integration of ODEs, computation of periodic orbits, studies of the splitting of separatrices, detection of strange nonchaotic attractors, Ising theory, quantum field theory, and discrete dynamical systems. We conclude that high-precision arithmetic facilities are now an indispensable component of a modern large-scale scientific computing environment.

  10. High-Precision Computation: Mathematical Physics and Dynamics

    Energy Technology Data Exchange (ETDEWEB)

    Bailey, D. H.; Barrio, R.; Borwein, J. M.

    2010-04-01

    At the present time, IEEE 64-bit floating-point arithmetic is sufficiently accurate for most scientific applications. However, for a rapidly growing body of important scientific computing applications, a higher level of numeric precision is required. Such calculations are facilitated by high-precision software packages that include high-level language translation modules to minimize the conversion effort. This paper presents a survey of recent applications of these techniques and provides some analysis of their numerical requirements. These applications include supernova simulations, climate modeling, planetary orbit calculations, Coulomb n-body atomic systems, studies of the fine structure constant, scattering amplitudes of quarks, gluons and bosons, nonlinear oscillator theory, experimental mathematics, evaluation of orthogonal polynomials, numerical integration of ODEs, computation of periodic orbits, studies of the splitting of separatrices, detection of strange nonchaotic attractors, Ising theory, quantum field theory, and discrete dynamical systems. We conclude that high-precision arithmetic facilities are now an indispensable component of a modern large-scale scientific computing environment.

  11. Strengthening LLNL Missions through Laboratory Directed Research and Development in High Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Willis, D. K. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2016-12-01

    High performance computing (HPC) has been a defining strength of Lawrence Livermore National Laboratory (LLNL) since its founding. Livermore scientists have designed and used some of the world’s most powerful computers to drive breakthroughs in nearly every mission area. Today, the Laboratory is recognized as a world leader in the application of HPC to complex science, technology, and engineering challenges. Most importantly, HPC has been integral to the National Nuclear Security Administration’s (NNSA’s) Stockpile Stewardship Program—designed to ensure the safety, security, and reliability of our nuclear deterrent without nuclear testing. A critical factor behind Lawrence Livermore’s preeminence in HPC is the ongoing investments made by the Laboratory Directed Research and Development (LDRD) Program in cutting-edge concepts to enable efficient utilization of these powerful machines. Congress established the LDRD Program in 1991 to maintain the technical vitality of the Department of Energy (DOE) national laboratories. Since then, LDRD has been, and continues to be, an essential tool for exploring anticipated needs that lie beyond the planning horizon of our programs and for attracting the next generation of talented visionaries. Through LDRD, Livermore researchers can examine future challenges, propose and explore innovative solutions, and deliver creative approaches to support our missions. The present scientific and technical strengths of the Laboratory are, in large part, a product of past LDRD investments in HPC. Here, we provide seven examples of LDRD projects from the past decade that have played a critical role in building LLNL’s HPC, computer science, mathematics, and data science research capabilities, and describe how they have impacted LLNL’s mission.

  12. Research and Application of New Type of High Performance Titanium Alloy

    Directory of Open Access Journals (Sweden)

    ZHU Zhishou

    2016-06-01

    Full Text Available With the continuous expansion of the quantity and range of titanium alloy applications in fields such as aviation, space, weaponry, and the marine and chemical industries, even more demanding requirements have been placed on the comprehensive mechanical properties, cost, and processing characteristics of titanium alloys. Through alloying based on microstructure parameter design, combined with strengthening and toughening technologies such as fine-grain strengthening, phase transformation control, and process control, a new type of high-performance titanium alloy has been developed that offers a good combination of high strength and toughness, fatigue resistance, damage tolerance, and impact resistance. The new titanium alloy has expanded the quantity and level of applications in high-end fields, supported industrial upgrading and transformation, and met the application requirements of next-generation equipment.

  13. Polymer waveguides for electro-optical integration in data centers and high-performance computers.

    Science.gov (United States)

    Dangel, Roger; Hofrichter, Jens; Horst, Folkert; Jubin, Daniel; La Porta, Antonio; Meier, Norbert; Soganci, Ibrahim Murat; Weiss, Jonas; Offrein, Bert Jan

    2015-02-23

    To satisfy the intra- and inter-system bandwidth requirements of future data centers and high-performance computers, low-cost low-power high-throughput optical interconnects will become a key enabling technology. To tightly integrate optics with the computing hardware, particularly in the context of CMOS-compatible silicon photonics, optical printed circuit boards using polymer waveguides are considered as a formidable platform. IBM Research has already demonstrated the essential silicon photonics and interconnection building blocks. A remaining challenge is electro-optical packaging, i.e., the connection of the silicon photonics chips with the system. In this paper, we present a new single-mode polymer waveguide technology and a scalable method for building the optical interface between silicon photonics chips and single-mode polymer waveguides.

  14. A Linux Workstation for High Performance Graphics

    Science.gov (United States)

    Geist, Robert; Westall, James

    2000-01-01

    The primary goal of this effort was to provide a low-cost method of obtaining high-performance 3-D graphics using an industry standard library (OpenGL) on PC-class computers. Previously, users interested in doing substantial visualization or graphical manipulation were constrained to using specialized, custom hardware most often found in computers from Silicon Graphics (SGI). We provided an alternative to expensive SGI hardware by taking advantage of third-party 3-D graphics accelerators that have now become available at very affordable prices. To make use of this hardware, our goal was to provide a free, redistributable, and fully-compatible OpenGL work-alike library so that existing bodies of code could simply be recompiled for PC-class machines running a free version of Unix. This should allow substantial cost savings while greatly expanding the population of people with access to a serious graphics development and viewing environment. This should offer a means for NASA to provide a spectrum of graphics performance to its scientists, supplying high-end specialized SGI hardware for high-performance visualization while fulfilling the requirements of medium and lower performance applications with generic, off-the-shelf components and still maintaining compatibility between the two.

  15. Building highly available control system applications with Advanced Telecom Computing Architecture and open standards

    International Nuclear Information System (INIS)

    Kazakov, Artem; Furukawa, Kazuro

    2010-01-01

    Requirements for modern and future control systems for large projects like the International Linear Collider demand high availability of control system components. Recently, the telecom industry produced an open hardware specification, the Advanced Telecom Computing Architecture (ATCA), aimed at better reliability, availability and serviceability. Since its first market appearance in 2004, the ATCA platform has shown tremendous growth and proved to be stable and well represented by a number of vendors. ATCA is an industry standard for highly available systems. In parallel, the Service Availability Forum (SAF), a consortium of leading communications and computing companies, describes the interaction between hardware and software. SAF defines a set of specifications such as the Hardware Platform Interface and the Application Interface Specification. SAF specifications provide an extensive description of highly available systems, services and their interfaces. Originally aimed at telecom applications, these specifications can be used for accelerator controls software as well. This study describes the benefits of using these specifications and their possible adoption for accelerator control systems. It is demonstrated how the EPICS Redundant IOC was extended using the Hardware Platform Interface specification, which made it possible to utilize the benefits of the ATCA platform.

  16. High Performance Computing and Storage Requirements for Nuclear Physics: Target 2017

    Energy Technology Data Exchange (ETDEWEB)

    Gerber, Richard [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Wasserman, Harvey [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2014-04-30

    In April 2014, NERSC, ASCR, and the DOE Office of Nuclear Physics (NP) held a review to characterize high performance computing (HPC) and storage requirements for NP research through 2017. This review is the 12th in a series of reviews held by NERSC and Office of Science program offices that began in 2009. It is the second for NP, and the final in the second round of reviews that covered the six Office of Science program offices. This report is the result of that review.

  17. Analysis and modeling of social influence in high performance computing workloads

    KAUST Repository

    Zheng, Shuai

    2011-01-01

    Social influence among users (e.g., collaboration on a project) creates bursty behavior in the underlying high performance computing (HPC) workloads. Using representative HPC and cluster workload logs, this paper identifies, analyzes, and quantifies the level of social influence across HPC users. We show the existence of a social graph that is characterized by a pattern of dominant users and followers. This pattern also follows a power-law distribution, which is consistent with those observed in mainstream social networks. Given its potential impact on HPC workloads prediction and scheduling, we propose a fast-converging, computationally-efficient online learning algorithm for identifying social groups. Extensive evaluation shows that our online algorithm can (1) quickly identify the social relationships by using a small portion of incoming jobs and (2) can efficiently track group evolution over time. © 2011 Springer-Verlag.

  18. High-performance noncontact thermal diode via asymmetric nanostructures

    Science.gov (United States)

    Shen, Jiadong; Liu, Xianglei; He, Huan; Wu, Weitao; Liu, Baoan

    2018-05-01

    Electric diodes, though laying the foundation of modern electronics and information processing industries, suffer from ineffectiveness and even failure at high temperatures. Thermal diodes are promising alternatives to relieve above limitations, but usually possess low rectification ratios, and how to obtain a high-performance thermal rectification effect is still an open question. This paper proposes an efficient contactless thermal diode based on the near-field thermal radiation of asymmetric doped silicon nanostructures. The rectification ratio computed via exact scattering theories is demonstrated to be as high as 10 at a nanoscale gap distance and period, outperforming the counterpart flat-plate diode by more than one order of magnitude. This extraordinary performance mainly lies in the higher forward and lower reverse radiative heat flux within the low frequency band compared with the counterpart flat-plate diode, which is caused by a lower loss and smaller cut-off wavevector of nanostructures for the forward and reversed scheme, respectively. This work opens new routes to realize high performance thermal diodes, and may have wide applications in efficient thermal computing, thermal information processing, and thermal management.

  19. Study on the application of mobile internet cloud computing platform

    Science.gov (United States)

    Gong, Songchun; Fu, Songyin; Chen, Zheng

    2012-04-01

    The rapid development of computer technology has promoted the adoption of cloud computing platforms, which replace traditional resource service models and, after adjustments in multiple respects, meet users' needs for a variety of resources. Cloud computing offers advantages in many respects: it reduces the difficulty of operating the system and makes it easy for users to search, acquire and process resources. Accordingly, this paper takes the management of digital libraries as its research focus and analyzes the key technologies of the mobile internet cloud computing platform in operation. The popularization of computer technology has driven the creation of digital library models, whose core idea is to optimize the management of library resource information through computers and to construct a high-performance query and search platform that allows users to access the necessary information resources at any time. Cloud computing distributes computations across a large number of distributed computers and thereby implements connected services among multiple machines. Digital libraries, as a typical representative of cloud computing applications, can be used to analyze the key technologies of cloud computing.

  20. Guide to cloud computing for business and technology managers from distributed computing to cloudware applications

    CERN Document Server

    Kale, Vivek

    2014-01-01

    Guide to Cloud Computing for Business and Technology Managers: From Distributed Computing to Cloudware Applications unravels the mystery of cloud computing and explains how it can transform the operating contexts of business enterprises. It provides a clear understanding of what cloud computing really means, what it can do, and when it is practical to use. Addressing the primary management and operation concerns of cloudware, including performance, measurement, monitoring, and security, this pragmatic book:Introduces the enterprise applications integration (EAI) solutions that were a first ste

  1. Performability Modelling Tools, Evaluation Techniques and Applications

    NARCIS (Netherlands)

    Haverkort, Boudewijn R.H.M.

    1990-01-01

    This thesis deals with three aspects of quantitative evaluation of fault-tolerant and distributed computer and communication systems: performability evaluation techniques, performability modelling tools, and performability modelling applications. Performability modelling is a relatively new

  2. High-Bandwidth Tactical-Network Data Analysis in a High-Performance-Computing (HPC) Environment: Packet-Level Analysis

    Science.gov (United States)

    2015-09-01

    ... individual fragments using the hash-based method. In general, fragments appear in order and relatively close to each other in the file. A fragment ... data product derived from the data model is shown in Fig. 5, a Google Earth Keyhole Markup Language (KML) file. This product includes aggregate ... Acronyms: BLOb, binary large object; FPGA, field-programmable gate array; HPC, high-performance computing; IP, Internet Protocol; KML, Keyhole Markup Language.

  3. Practical clinical applications of the computer in nuclear medicine

    International Nuclear Information System (INIS)

    Price, R.R.; Erickson, J.J.; Patton, J.A.; Jones, J.P.; Lagan, J.E.; Rollo, F.D.

    1978-01-01

    The impact of the computer on the practice of nuclear medicine has been felt primarily in the area of rapid dynamic studies. At this time it is difficult to find a clinic which routinely performs computer processing of static images. The general purpose digital computer is a sophisticated and flexible instrument. The number of applications for which one can use the computer to augment data acquisition, analysis, or display is essentially unlimited. In this light, the purpose of this exhibit is not to describe all possible applications of the computer in nuclear medicine but rather to illustrate those applications which have generally been accepted as practical in the routine clinical environment. Specifically, we have chosen examples of computer augmented cardiac, and renal function studies as well as examples of relative organ blood flow studies. In addition, a short description of basic computer components and terminology along with a few examples of non-imaging applications are presented

  4. Solving Problems in Various Domains by Hybrid Models of High Performance Computations

    Directory of Open Access Journals (Sweden)

    Yurii Rogozhin

    2014-03-01

    Full Text Available This work presents a hybrid model of high performance computation. The model is based on a membrane system (P system) where some membranes may contain a quantum device that is triggered by the data entering the membrane. This model is intended to take advantage of both the biomolecular and quantum paradigms and to overcome some of their inherent limitations. The proposed approach is demonstrated on two selected problems: SAT and image retrieval.

  5. FY 1996 Blue Book: High Performance Computing and Communications: Foundations for America`s Information Future

    Data.gov (United States)

    Networking and Information Technology Research and Development, Executive Office of the President — The Federal High Performance Computing and Communications HPCC Program will celebrate its fifth anniversary in October 1996 with an impressive array of...

  6. FY 1997 Blue Book: High Performance Computing and Communications: Advancing the Frontiers of Information Technology

    Data.gov (United States)

    Networking and Information Technology Research and Development, Executive Office of the President — The Federal High Performance Computing and Communications HPCC Program will celebrate its fifth anniversary in October 1996 with an impressive array of...

  7. 2003 Conference for Computing in High Energy and Nuclear Physics

    International Nuclear Information System (INIS)

    Schalk, T.

    2003-01-01

    The conference was subdivided into the following separate tracks. Electronic presentations and/or videos are provided on the main website link. Sessions: Plenary Talks and Panel Discussion; Grid Architecture, Infrastructure, and Grid Security; HENP Grid Applications, Testbeds, and Demonstrations; HENP Computing Systems and Infrastructure; Monitoring; High Performance Networking; Data Acquisition, Triggers and Controls; First Level Triggers and Trigger Hardware; Lattice Gauge Computing; HENP Software Architecture and Software Engineering; Data Management and Persistency; Data Analysis Environment and Visualization; Simulation and Modeling; and Collaboration Tools and Information Systems.

  8. Challenges and opportunities of modeling plasma–surface interactions in tungsten using high-performance computing

    Energy Technology Data Exchange (ETDEWEB)

    Wirth, Brian D., E-mail: bdwirth@utk.edu [Department of Nuclear Engineering, University of Tennessee, Knoxville, TN 37996 (United States); Nuclear Science and Engineering Directorate, Oak Ridge National Laboratory, Oak Ridge, TN (United States); Hammond, K.D. [Department of Nuclear Engineering, University of Tennessee, Knoxville, TN 37996 (United States); Krasheninnikov, S.I. [University of California, San Diego, La Jolla, CA (United States); Maroudas, D. [University of Massachusetts, Amherst, Amherst, MA 01003 (United States)

    2015-08-15

    The performance of plasma facing components (PFCs) is critical for ITER and future magnetic fusion reactors. The ITER divertor will be tungsten, which is the primary candidate material for future reactors. Recent experiments involving tungsten exposure to low-energy helium plasmas reveal significant surface modification, including the growth of nanometer-scale tendrils of “fuzz” and formation of nanometer-sized bubbles in the near-surface region. The large span of spatial and temporal scales governing plasma surface interactions are among the challenges to modeling divertor performance. Fortunately, recent innovations in computational modeling, increasingly powerful high-performance computers, and improved experimental characterization tools provide a path toward self-consistent, experimentally validated models of PFC and divertor performance. Recent advances in understanding tungsten–helium interactions are reviewed, including such processes as helium clustering, which serve as nuclei for gas bubbles; and trap mutation, dislocation loop punching and bubble bursting; which together initiate surface morphological modification.

  9. Challenges and opportunities of modeling plasma–surface interactions in tungsten using high-performance computing

    International Nuclear Information System (INIS)

    Wirth, Brian D.; Hammond, K.D.; Krasheninnikov, S.I.; Maroudas, D.

    2015-01-01

    The performance of plasma facing components (PFCs) is critical for ITER and future magnetic fusion reactors. The ITER divertor will be tungsten, which is the primary candidate material for future reactors. Recent experiments involving tungsten exposure to low-energy helium plasmas reveal significant surface modification, including the growth of nanometer-scale tendrils of “fuzz” and formation of nanometer-sized bubbles in the near-surface region. The large span of spatial and temporal scales governing plasma surface interactions are among the challenges to modeling divertor performance. Fortunately, recent innovations in computational modeling, increasingly powerful high-performance computers, and improved experimental characterization tools provide a path toward self-consistent, experimentally validated models of PFC and divertor performance. Recent advances in understanding tungsten–helium interactions are reviewed, including such processes as helium clustering, which serve as nuclei for gas bubbles; and trap mutation, dislocation loop punching and bubble bursting; which together initiate surface morphological modification

  10. Scalable domain decomposition solvers for stochastic PDEs in high performance computing

    International Nuclear Information System (INIS)

    Desai, Ajit; Pettit, Chris; Poirel, Dominique; Sarkar, Abhijit

    2017-01-01

    Stochastic spectral finite element models of practical engineering systems may involve solutions of linear systems or linearized systems for non-linear problems with billions of unknowns. For stochastic modeling, it is therefore essential to design robust, parallel and scalable algorithms that can efficiently utilize high-performance computing to tackle such large-scale systems. Domain decomposition based iterative solvers can handle such systems. And though these algorithms exhibit excellent scalabilities, significant algorithmic and implementational challenges exist to extend them to solve extreme-scale stochastic systems using emerging computing platforms. Intrusive polynomial chaos expansion based domain decomposition algorithms are extended here to concurrently handle high resolution in both spatial and stochastic domains using an in-house implementation. Sparse iterative solvers with efficient preconditioners are employed to solve the resulting global and subdomain level local systems through multi-level iterative solvers. We also use parallel sparse matrix–vector operations to reduce the floating-point operations and memory requirements. Numerical and parallel scalabilities of these algorithms are presented for the diffusion equation having spatially varying diffusion coefficient modeled by a non-Gaussian stochastic process. Scalability of the solvers with respect to the number of random variables is also investigated.
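
    To make the domain decomposition idea concrete, the sketch below runs a classical alternating Schwarz iteration for a deterministic 1D Poisson problem with two overlapping subdomains. It is a minimal serial illustration of the subdomain-solve-plus-interface-exchange pattern that such solvers parallelize; the grid size, overlap and right-hand side are illustrative assumptions, and none of the stochastic (polynomial chaos) machinery of the paper is included.

```python
# Alternating Schwarz iteration for -u'' = f on [0, 1], u(0) = u(1) = 0,
# with two overlapping subdomains; a toy stand-in for domain decomposition.
import numpy as np

n = 101                                   # global grid points including boundaries
h = 1.0 / (n - 1)
x = np.linspace(0.0, 1.0, n)
f = np.ones(n)                            # right-hand side
u = np.zeros(n)                           # global iterate, homogeneous Dirichlet BCs

def solve_subdomain(lo, hi, left_bc, right_bc):
    """Direct solve of -u'' = f on grid indices lo..hi with Dirichlet data."""
    m = hi - lo - 1                       # interior unknowns of the subdomain
    A = (np.diag(2.0 * np.ones(m))
         - np.diag(np.ones(m - 1), 1)
         - np.diag(np.ones(m - 1), -1)) / h**2
    b = f[lo + 1:hi].copy()
    b[0] += left_bc / h**2                # fold boundary data into the RHS
    b[-1] += right_bc / h**2
    return np.linalg.solve(A, b)

# Two overlapping subdomains: indices [0, 60] and [40, 100] (overlap of 20 cells).
for it in range(50):
    u[1:60] = solve_subdomain(0, 60, u[0], u[60])        # uses current value at i = 60
    u[41:100] = solve_subdomain(40, 100, u[40], u[100])  # uses the freshly updated i = 40

exact = 0.5 * x * (1.0 - x)               # analytic solution for f = 1
print("max error:", np.abs(u - exact).max())
```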

  11. High performance computations using dynamical nucleation theory

    International Nuclear Information System (INIS)

    Windus, T L; Crosby, L D; Kathmann, S M

    2008-01-01

    Chemists continue to explore the use of very large computations to perform simulations that describe the molecular level physics of critical challenges in science. In this paper, we describe the Dynamical Nucleation Theory Monte Carlo (DNTMC) model - a model for determining molecular scale nucleation rate constants - and its parallel capabilities. The potential for bottlenecks and the challenges to running on future petascale or larger resources are delineated. A 'master-slave' solution is proposed to scale to the petascale and will be developed in the NWChem software. In addition, mathematical and data analysis challenges are described
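
    As a rough illustration of the proposed 'master-slave' organization, the sketch below task-farms independent Monte Carlo samples over MPI ranks with mpi4py: rank 0 hands out work and collects results while the other ranks loop over received tasks. The use of Python/mpi4py and the toy sampling function are assumptions made purely for illustration; the implementation described in the abstract targets the NWChem software.

```python
# Minimal master-worker task farm with mpi4py; run with at least 2 ranks,
# e.g.  mpiexec -n 4 python task_farm.py
from mpi4py import MPI
import random

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
TAG_WORK, TAG_STOP = 1, 2

def run_sample(seed):
    """Stand-in for one Monte Carlo sample (e.g. one nucleation configuration)."""
    random.seed(seed)
    return sum(random.random() for _ in range(1000))

if rank == 0:                                        # master: distribute tasks, gather results
    tasks = list(range(100))
    results, active = [], 0
    status = MPI.Status()
    for dest in range(1, size):                      # prime every worker with one task
        comm.send(tasks.pop(), dest=dest, tag=TAG_WORK)
        active += 1
    while active:
        result = comm.recv(source=MPI.ANY_SOURCE, status=status)
        results.append(result)
        src = status.Get_source()
        if tasks:
            comm.send(tasks.pop(), dest=src, tag=TAG_WORK)
        else:
            comm.send(None, dest=src, tag=TAG_STOP)  # no work left: tell worker to stop
            active -= 1
    print("mean over samples:", sum(results) / len(results))
else:                                                # worker: process tasks until told to stop
    status = MPI.Status()
    while True:
        task = comm.recv(source=0, status=status)
        if status.Get_tag() == TAG_STOP:
            break
        comm.send(run_sample(task), dest=0)
```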

  12. Contributing to the design of run-time systems dedicated to high performance computing; Contribution a l'elaboration d'environnements de programmation dedies au calcul scientifique hautes performances

    Energy Technology Data Exchange (ETDEWEB)

    Perache, M

    2006-10-15

    In the field of intensive scientific computing, the quest for performance has to face the increasing complexity of parallel architectures. Nowadays, these machines exhibit a deep memory hierarchy which complicates the design of efficient parallel applications. This thesis proposes a programming environment that allows the design of efficient parallel programs on top of clusters of multi-processors. It features a programming model centered around collective communications and synchronizations, and provides load balancing facilities. The programming interface, named MPC, provides high level paradigms which are optimized according to the underlying architecture. The environment is fully functional and used within the CEA/DAM (TERANOVA) computing center. The evaluations presented in this document confirm the relevance of our approach. (author)

  13. Profiling an application for power consumption during execution on a compute node

    Science.gov (United States)

    Archer, Charles J; Blocksome, Michael A; Peters, Amanda E; Ratterman, Joseph D; Smith, Brian E

    2013-09-17

    Methods, apparatus, and products are disclosed for profiling an application for power consumption during execution on a compute node that include: receiving an application for execution on a compute node; identifying a hardware power consumption profile for the compute node, the hardware power consumption profile specifying power consumption for compute node hardware during performance of various processing operations; determining a power consumption profile for the application in dependence upon the application and the hardware power consumption profile for the compute node; and reporting the power consumption profile for the application.
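
    The sketch below is a toy rendering of the profiling recipe in this abstract: combine a per-operation hardware power consumption profile for a compute node with an application's mix of operations to derive an application-level power consumption profile. The operation categories and wattages are invented for illustration; the patent describes the mechanism, not these values.

```python
# Toy application power profile: hardware watts-per-operation-class combined
# with the time an application spends in each class (all numbers hypothetical).
hardware_profile_watts = {"compute": 95.0, "memory": 70.0, "network": 55.0, "idle": 30.0}
application_mix_seconds = {"compute": 120.0, "memory": 45.0, "network": 20.0, "idle": 15.0}

def power_profile(hw_watts, app_seconds):
    """Return per-phase energy (joules) and the application's average power draw (watts)."""
    energy = {phase: hw_watts[phase] * sec for phase, sec in app_seconds.items()}
    avg_power = sum(energy.values()) / sum(app_seconds.values())
    return energy, avg_power

energy_joules, average_watts = power_profile(hardware_profile_watts, application_mix_seconds)
print("energy per phase (J):", energy_joules)
print("average power (W): %.1f" % average_watts)
```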

  14. High-performance file I/O in Java : existing approaches and bulk I/O extensions.

    Energy Technology Data Exchange (ETDEWEB)

    Bonachea, D.; Dickens, P.; Thakur, R.; Mathematics and Computer Science; Univ. of California at Berkeley; Illinois Institute of Technology

    2001-07-01

    There is a growing interest in using Java as the language for developing high-performance computing applications. To be successful in the high-performance computing domain, however, Java must not only be able to provide high computational performance, but also high-performance I/O. In this paper, we first examine several approaches that attempt to provide high-performance I/O in Java - many of which are not obvious at first glance - and evaluate their performance on two parallel machines, the IBM SP and the SGI Origin2000. We then propose extensions to the Java I/O library that address the deficiencies in the Java I/O API and improve performance dramatically. The extensions add bulk (array) I/O operations to Java, thereby removing much of the overhead currently associated with array I/O in Java. We have implemented the extensions in two ways: in a standard JVM using the Java Native Interface (JNI) and in a high-performance parallel dialect of Java called Titanium. We describe the two implementations and present performance results that demonstrate the benefits of the proposed extensions.

  15. GPU computing and applications

    CERN Document Server

    See, Simon

    2015-01-01

    This book presents a collection of state of the art research on GPU Computing and Application. The major part of this book is selected from the work presented at the 2013 Symposium on GPU Computing and Applications held in Nanyang Technological University, Singapore (Oct 9, 2013). Three major domains of GPU application are covered in the book including (1) Engineering design and simulation; (2) Biomedical Sciences; and (3) Interactive & Digital Media. The book also addresses the fundamental issues in GPU computing with a focus on big data processing. Researchers and developers in GPU Computing and Applications will benefit from this book. Training professionals and educators can also benefit from this book to learn the possible application of GPU technology in various areas.

  16. High-Throughput Computing on High-Performance Platforms: A Case Study

    Energy Technology Data Exchange (ETDEWEB)

    Oleynik, D [University of Texas at Arlington; Panitkin, S [Brookhaven National Laboratory (BNL); Matteo, Turilli [Rutgers University; Angius, Alessio [Rutgers University; Oral, H Sarp [ORNL; De, K [University of Texas at Arlington; Klimentov, A [Brookhaven National Laboratory (BNL); Wells, Jack C. [ORNL; Jha, S [Rutgers University

    2017-10-01

    The computing systems used by LHC experiments have historically consisted of the federation of hundreds to thousands of distributed resources, ranging from small to mid-size resources. In spite of the impressive scale of the existing distributed computing solutions, the federation of small to mid-size resources will be insufficient to meet projected future demands. This paper is a case study of how the ATLAS experiment has embraced Titan -- a DOE leadership facility -- in conjunction with traditional distributed high-throughput computing to reach sustained production scales of approximately 52M core-hours a year. The three main contributions of this paper are: (i) a critical evaluation of design and operational considerations to support the sustained, scalable and production usage of Titan; (ii) a preliminary characterization of a next generation executor for PanDA to support new workloads and advanced execution modes; and (iii) early lessons for how current and future experimental and observational systems can be integrated with production supercomputers and other platforms in a general and extensible manner.

  17. Optimal dynamic performance for high-precision actuators/stages

    International Nuclear Information System (INIS)

    Preissner, C.; Lee, S.-H.; Royston, T. J.; Shu, D.

    2002-01-01

    System dynamic performance of actuator/stage groups, such as those found in optical instrument positioning systems and other high-precision applications, is dependent upon both individual component behavior and the system configuration. Experimental modal analysis techniques were implemented to determine the six degree of freedom stiffnesses and damping for individual actuator components. These experimental data were then used in a multibody dynamic computer model to investigate the effect of stage group configuration. Running the computer model through the possible stage configurations and observing the predicted vibratory response determined the optimal stage group configuration. Configuration optimization can be performed for any group of stages, provided there is stiffness and damping data available for the constituent pieces

  18. Matrix multiplication operations with data pre-conditioning in a high performance computing architecture

    Science.gov (United States)

    Eichenberger, Alexandre E; Gschwind, Michael K; Gunnels, John A

    2013-11-05

    Mechanisms for performing matrix multiplication operations with data pre-conditioning in a high performance computing architecture are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation to a first target vector register. A load and splat operation is performed to load an element of a second vector operand and replicating the element to each of a plurality of elements of a second target vector register. A multiply add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product of the matrix multiplication operation is accumulated with other partial products of the matrix multiplication operation.
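
    The 'load and splat' scheme above is essentially the rank-1-update (outer-product) view of matrix multiplication: a column of the first operand is loaded as a vector, one element of the second operand is replicated across a vector, and a multiply-add accumulates partial products. The numpy sketch below mimics that data flow; it is only an analogy for the vector-register mechanics, not the patented hardware path.

```python
import numpy as np

def matmul_rank1(A, B):
    """Accumulate C = A @ B as a sum of rank-1 (column times splatted-row) updates."""
    m, k = A.shape
    k2, n = B.shape
    assert k == k2
    C = np.zeros((m, n))
    for p in range(k):
        a_col = A[:, p:p + 1]      # "vector load" of one column of the first operand
        b_row = B[p, :]            # each element is broadcast ("splat") across a row of C
        C += a_col * b_row         # multiply-add accumulating partial products
    return C

A = np.random.rand(4, 3)
B = np.random.rand(3, 5)
assert np.allclose(matmul_rank1(A, B), A @ B)
```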

  19. High performance computing, supercomputing, náročné počítání

    Czech Academy of Sciences Publication Activity Database

    Okrouhlík, Miloslav

    2003-01-01

    Roč. 10, č. 5 (2003), s. 429-438 ISSN 1210-2717 R&D Projects: GA ČR GA101/02/0072 Institutional research plan: CEZ:AV0Z2076919 Keywords: high performance computing * vector and parallel computers * programming tools for parallelization Subject RIV: BI - Acoustics

  20. LFK, FORTRAN Application Performance Test

    International Nuclear Information System (INIS)

    McMahon, F.H.

    1991-01-01

    1 - Description of program or function: LFK, the Livermore FORTRAN Kernels, is a computer performance test that measures a realistic floating-point performance range for FORTRAN applications. Informally known as the Livermore Loops test, the LFK test may be used as a computer performance test, as a test of compiler accuracy (via checksums) and efficiency, or as a hardware endurance test. The LFK test, which focuses on FORTRAN as used in computational physics, measures the joint performance of the computer CPU, the compiler, and the computational structures in units of Mega-flops/sec or Mflops. A C language version of subroutine KERNEL is also included which executes 24 samples of C numerical computation. The 24 kernels are a hydrodynamics code fragment, a fragment from an incomplete Cholesky conjugate gradient code, the standard inner product function of linear algebra, a fragment from a banded linear equations routine, a segment of a tridiagonal elimination routine, an example of a general linear recurrence equation, an equation of state fragment, part of an alternating direction implicit integration code, an integrate predictor code, a difference predictor code, a first sum, a first difference, a fragment from a two-dimensional particle-in-cell code, a part of a one-dimensional particle-in-cell code, an example of how casually FORTRAN can be written, a Monte Carlo search loop, an example of an implicit conditional computation, a fragment of a two-dimensional explicit hydrodynamics code, a general linear recurrence equation, part of a discrete ordinates transport program, a simple matrix calculation, a segment of a Planck distribution procedure, a two-dimensional implicit hydrodynamics fragment, and determination of the location of the first minimum in an array. 2 - Method of solution: CPU performance rates depend strongly on the maturity of FORTRAN compiler machine code optimization. The LFK test-bed executes the set of 24 kernels three times, resetting the DO
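
    As a minimal illustration of the kind of measurement LFK reports, the sketch below times one simple kernel (the inner product of linear algebra, one of the 24 kernels listed above) and converts the count of floating-point operations into a Mflop/s rate. Pure Python is used here only for illustration; the benchmark proper times compiled FORTRAN (or C) kernels and repeats the whole kernel set.

```python
import time

def inner_product(x, y):
    """Standard inner product kernel: q = sum_i x_i * y_i."""
    q = 0.0
    for xi, yi in zip(x, y):
        q += xi * yi               # one multiply and one add per element
    return q

n, reps = 10_000, 100
x = [0.5] * n
y = [2.0] * n

t0 = time.perf_counter()
for _ in range(reps):
    s = inner_product(x, y)
elapsed = time.perf_counter() - t0

flops = 2.0 * n * reps             # two floating-point operations per element
print(f"result = {s}, rate = {flops / elapsed / 1e6:.1f} Mflop/s")
```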

  1. Platform computing

    CERN Multimedia

    2002-01-01

    "Platform Computing releases first grid-enabled workload management solution for IBM eServer Intel and UNIX high performance computing clusters. This Out-of-the-box solution maximizes the performance and capability of applications on IBM HPC clusters" (1/2 page) .

  2. Performance evaluation of scientific programs on advanced architecture computers

    International Nuclear Information System (INIS)

    Walker, D.W.; Messina, P.; Baille, C.F.

    1988-01-01

    Recently a number of advanced architecture machines have become commercially available. These new machines promise better cost-performance than traditional computers, and some of them have the potential of competing with current supercomputers, such as the Cray X-MP, in terms of maximum performance. This paper describes an on-going project to evaluate a broad range of advanced architecture computers using a number of complete scientific application programs. The computers to be evaluated include distributed-memory machines such as the NCUBE, INTEL and Caltech/JPL hypercubes, and the MEIKO computing surface; shared-memory, bus architecture machines such as the Sequent Balance and the Alliant; very long instruction word machines such as the Multiflow Trace 7/200 computer; traditional supercomputers such as the Cray X-MP and Cray-2; and SIMD machines such as the Connection Machine. Currently 11 application codes from a number of scientific disciplines have been selected, although it is not intended to run all codes on all machines. Results are presented for two of the codes (QCD and missile tracking), and future work is proposed

  3. International Conference on Frontiers of Intelligent Computing : Theory and Applications

    CERN Document Server

    Bhateja, Vikrant; Udgata, Siba; Pattnaik, Prasant

    2017-01-01

    The book is a collection of high-quality peer-reviewed research papers presented at International Conference on Frontiers of Intelligent Computing: Theory and applications (FICTA 2016) held at School of Computer Engineering, KIIT University, Bhubaneswar, India during 16 – 17 September 2016. The book presents theories, methodologies, new ideas, experiences and applications in all areas of intelligent computing and its applications to various engineering disciplines like computer science, electronics, electrical and mechanical engineering.

  4. Requirements for high performance computing for lattice QCD. Report of the ECFA working panel

    International Nuclear Information System (INIS)

    Jegerlehner, F.; Kenway, R.D.; Martinelli, G.; Michael, C.; Pene, O.; Petersson, B.; Petronzio, R.; Sachrajda, C.T.; Schilling, K.

    2000-01-01

    This report, prepared at the request of the European Committee for Future Accelerators (ECFA), contains an assessment of the High Performance Computing resources which will be required in coming years by European physicists working in Lattice Field Theory and a review of the scientific opportunities which these resources would open. (orig.)

  5. High-performance computing on the Intel Xeon Phi how to fully exploit MIC architectures

    CERN Document Server

    Wang, Endong; Shen, Bo; Zhang, Guangyong; Lu, Xiaowei; Wu, Qing; Wang, Yajuan

    2014-01-01

    The aim of this book is to explain to high-performance computing (HPC) developers how to utilize the Intel® Xeon Phi™ series products efficiently. To that end, it introduces some computing grammar, programming technology and optimization methods for using many-integrated-core (MIC) platforms and also offers tips and tricks for actual use, based on the authors' first-hand optimization experience.The material is organized in three sections. The first section, "Basics of MIC", introduces the fundamentals of MIC architecture and programming, including the specific Intel MIC programming environment

  6. Department of Energy: MICS (Mathematical Information, and Computational Sciences Division). High performance computing and communications program

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1996-06-01

    This document is intended to serve two purposes. Its first purpose is that of a program status report of the considerable progress that the Department of Energy (DOE) has made since 1993, the time of the last such report (DOE/ER-0536, "The DOE Program in HPCC"), toward achieving the goals of the High Performance Computing and Communications (HPCC) Program. The second purpose is that of a summary report of the many research programs administered by the Mathematical, Information, and Computational Sciences (MICS) Division of the Office of Energy Research under the auspices of the HPCC Program and to provide, wherever relevant, easy access to pertinent information about MICS-Division activities via universal resource locators (URLs) on the World Wide Web (WWW). The information pointed to by the URL is updated frequently, and the interested reader is urged to access the WWW for the latest information.

  7. Applications of symbolic algebraic computation

    International Nuclear Information System (INIS)

    Brown, W.S.; Hearn, A.C.

    1979-01-01

    This paper is a survey of applications of systems for symbolic algebraic computation. In most successful applications, calculations that can be taken to a given order by hand are then extended one or two more orders by computer. Furthermore, with a few notable exceptions, these applications also involve numerical computation in some way. Therefore the authors emphasize the interface between symbolic and numerical computation, including: 1. Computations with both symbolic and numerical phases. 2. Data involving both the unpredictable size and shape that typify symbolic computation and the (usually inexact) numerical values that characterize numerical computation. 3. Applications of one field to the other. It is concluded that the fields of symbolic and numerical computation can advance most fruitfully in harmony rather than in competition. (Auth.)

  8. Analysis and Modeling of Social Influence in High Performance Computing Workloads

    KAUST Repository

    Zheng, Shuai

    2011-06-01

    High Performance Computing (HPC) is becoming a common tool in many research areas. Social influence (e.g., project collaboration) among increasing users of HPC systems creates bursty behavior in underlying workloads. This bursty behavior is increasingly common with the advent of grid computing and cloud computing. Mining the user bursty behavior is important for HPC workloads prediction and scheduling, which has direct impact on overall HPC computing performance. A representative work in this area is the Mixed User Group Model (MUGM), which clusters users according to the resource demand features of their submissions, such as duration time and parallelism. However, MUGM has some difficulties when implemented in real-world system. First, representing user behaviors by the features of their resource demand is usually difficult. Second, these features are not always available. Third, measuring the similarities among users is not a well-defined problem. In this work, we propose a Social Influence Model (SIM) to identify, analyze, and quantify the level of social influence across HPC users. The advantage of the SIM model is that it finds HPC communities by analyzing user job submission time, thereby avoiding the difficulties of MUGM. An offline algorithm and a fast-converging, computationally-efficient online learning algorithm for identifying social groups are proposed. Both offline and online algorithms are applied on several HPC and grid workloads, including Grid 5000, EGEE 2005 and 2007, and KAUST Supercomputing Lab (KSL) BGP data. From the experimental results, we show the existence of a social graph, which is characterized by a pattern of dominant users and followers. In order to evaluate the effectiveness of identified user groups, we show the pattern discovered by the offline algorithm follows a power-law distribution, which is consistent with those observed in mainstream social networks. We finally conclude the thesis and discuss future directions of our work.

  9. High performance stream computing for particle beam transport simulations

    International Nuclear Information System (INIS)

    Appleby, R; Bailey, D; Higham, J; Salt, M

    2008-01-01

    Understanding modern particle accelerators requires simulating charged particle transport through the machine elements. These simulations can be very time consuming due to the large number of particles and the need to consider many turns of a circular machine. Stream computing offers an attractive way to dramatically improve the performance of such simulations by calculating the simultaneous transport of many particles using dedicated hardware. Modern Graphics Processing Units (GPUs) are powerful and affordable stream computing devices. The results of simulations of particle transport through the booster-to-storage-ring transfer line of the DIAMOND synchrotron light source using an NVidia GeForce 7900 GPU are compared to the standard transport code MAD. It is found that particle transport calculations are suitable for stream processing and large performance increases are possible. The accuracy and potential speed gains are compared and the prospects for future work in the area are discussed
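
    The performance win described above comes from transporting many particles through each beamline element at once. The numpy sketch below shows the same data-parallel idea on a CPU: an entire bundle of phase-space vectors is pushed through a toy transfer line in a single matrix operation, which is the arithmetic a GPU would spread over its stream cores. The drift and thin-quadrupole matrices are standard linear optics, but the element lengths, strengths and beam distribution are made-up values, not the DIAMOND transfer line.

```python
import numpy as np

def drift(L):
    """(x, x') transfer matrix of a drift of length L."""
    return np.array([[1.0, L], [0.0, 1.0]])

def thin_quad(f):
    """(x, x') transfer matrix of a thin focusing lens with focal length f."""
    return np.array([[1.0, 0.0], [-1.0 / f, 1.0]])

n_particles = 1_000_000
X = np.vstack([np.random.normal(0.0, 1e-3, n_particles),   # x  [m]
               np.random.normal(0.0, 1e-4, n_particles)])   # x' [rad]

line = [drift(2.0), thin_quad(5.0), drift(2.0)]    # a toy transfer line
M = np.linalg.multi_dot(line[::-1])                # combined matrix: last element leftmost
X_out = M @ X                                      # transport every particle at once

print("rms x at exit: %.3e m" % X_out[0].std())
```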

  10. Precision ring rolling technique and application in high-performance bearing manufacturing

    Directory of Open Access Journals (Sweden)

    Hua Lin

    2015-01-01

    Full Text Available High-performance bearings have significant applications in many important industrial fields, such as automobiles, precision machine tools and wind power. Precision ring rolling is an advanced rotary forming technique for manufacturing high-performance seamless bearing rings and can thereby improve the working life of a bearing. In this paper, three kinds of precision ring rolling techniques adapted to different dimensional ranges of bearings are introduced: cold ring rolling for small-scale bearings, hot radial ring rolling for medium-scale bearings and hot radial-axial ring rolling for large-scale bearings. The forming principles, technological features and forming equipment for the three kinds of precision ring rolling techniques are summarized, the technological development and industrial application in China are introduced, and the main technological development trend is described.

  11. FPGAs in High Performance Computing: Results from Two LDRD Projects.

    Energy Technology Data Exchange (ETDEWEB)

    Underwood, Keith D; Ulmer, Craig D.; Thompson, David; Hemmert, Karl Scott

    2006-11-01

    Field programmable gate arrays (FPGAs) have been used as alternative computational devices for over a decade; however, they have not been used for traditional scientific computing due to their perceived lack of floating-point performance. In recent years, there has been a surge of interest in alternatives to traditional microprocessors for high performance computing. Sandia National Labs began two projects to determine whether FPGAs would be a suitable alternative to microprocessors for high performance scientific computing and, if so, how they should be integrated into the system. We present results that indicate that FPGAs could have a significant impact on future systems. FPGAs have the potential to have order of magnitude levels of performance wins on several key algorithms; however, there are serious questions as to whether the system integration challenge can be met. Furthermore, there remain challenges in FPGA programming and system level reliability when using FPGA devices. Acknowledgment: Arun Rodrigues provided valuable support and assistance in the use of the Structural Simulation Toolkit within an FPGA context. Curtis Janssen and Steve Plimpton provided valuable insights into the workings of two Sandia applications (MPQC and LAMMPS, respectively).

  12. Application of computers in a Radiological Survey Program

    International Nuclear Information System (INIS)

    Berven, B.A.; Blair, M.S.; Doane, R.W.; Little, C.A.; Perdue, P.T.

    1984-01-01

    A brief description of some of the applications of computers in a radiological survey program is presented. It has been our experience that computers and computer software have allowed our staff personnel to more productively use their time by using computers to perform the mechanical acquisition, analyses, and storage of data. It is hoped that other organizations may similarly profit from this experience. This effort will ultimately minimize errors and reduce program costs

  13. A Comparative Study of Multi-material Data Structures for Computational Physics Applications

    Energy Technology Data Exchange (ETDEWEB)

    Garimella, Rao Veerabhadra [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Robey, Robert W. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2017-01-31

    The data structures used to represent the multi-material state of a computational physics application can have a drastic impact on the performance of the application. We look at efficient data structures for sparse applications where there may be many materials, but only one or few in most computational cells. We develop simple performance models for use in selecting possible data structures and programming patterns. We verify the analytic models of performance through a small test program of the representative cases.
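
    A minimal sketch of the trade-off studied in the report: a dense cell-by-material array reserves a slot for every material in every cell, while a compressed, cell-centric layout stores only the materials actually present. The cell count, material count and mixed-cell fraction below are illustrative assumptions, not the report's test configuration.

```python
import random

n_cells, n_materials = 100_000, 50
mixed_fraction = 0.05                      # most cells are pure, only a few are mixed

# Compressed layout: per cell, a short list of (material id, volume fraction) pairs.
compressed = []
for _ in range(n_cells):
    if random.random() < mixed_fraction:
        m1, m2 = random.sample(range(n_materials), 2)
        compressed.append([(m1, 0.6), (m2, 0.4)])
    else:
        compressed.append([(random.randrange(n_materials), 1.0)])

dense_entries = n_cells * n_materials                      # one slot per cell per material
compressed_entries = sum(len(cell) for cell in compressed)
print("dense entries:     ", dense_entries)
print("compressed entries:", compressed_entries)
print("compression ratio:  %.1fx" % (dense_entries / compressed_entries))
```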

  14. A computational study of high entropy alloys

    Science.gov (United States)

    Wang, Yang; Gao, Michael; Widom, Michael; Hawk, Jeff

    2013-03-01

    As a new class of advanced materials, high-entropy alloys (HEAs) exhibit a wide variety of excellent materials properties, including high strength, reasonable ductility with appreciable work-hardening, corrosion and oxidation resistance, wear resistance, and outstanding diffusion-barrier performance, especially at elevated and high temperatures. In this talk, we will explain our computational approach to the study of HEAs that employs the Korringa-Kohn-Rostoker coherent potential approximation (KKR-CPA) method. The KKR-CPA method uses Green's function technique within the framework of multiple scattering theory and is uniquely designed for the theoretical investigation of random alloys from the first principles. The application of the KKR-CPA method will be discussed as it pertains to the study of structural and mechanical properties of HEAs. In particular, computational results will be presented for AlxCoCrCuFeNi (x = 0, 0.3, 0.5, 0.8, 1.0, 1.3, 2.0, 2.8, and 3.0), and these results will be compared with experimental information from the literature.

  15. High Performance Computing for Solving Fractional Differential Equations with Applications

    OpenAIRE

    Zhang, Wei

    2014-01-01

    Fractional calculus is the generalization of integer-order calculus to rational order. This subject has at least three hundred years of history. However, it was traditionally regarded as a pure mathematical field and lacked real world applications for a very long time. In recent decades, fractional calculus has re-attracted the attention of scientists and engineers. For example, many researchers have found that fractional calculus is a useful tool for describing hereditary materials and p...

  16. CloudMC: a cloud computing application for Monte Carlo simulation.

    Science.gov (United States)

    Miras, H; Jiménez, R; Miras, C; Gomà, C

    2013-04-21

    This work presents CloudMC, a cloud computing application, developed in Windows Azure®, the platform of the Microsoft® cloud, for the parallelization of Monte Carlo simulations in a dynamic virtual cluster. CloudMC is a web application designed to be independent of the Monte Carlo code on which the simulations are based: the simulations just need to be of the form input files → executable → output files. To study the performance of CloudMC in Windows Azure®, Monte Carlo simulations with penelope were performed on different instance (virtual machine) sizes, and for different numbers of instances. The instance size was found to have no effect on the simulation runtime. It was also found that the decrease in time with the number of instances followed Amdahl's law, with a slight deviation due to the increase in the fraction of non-parallelizable time with increasing number of instances. A simulation that would have required 30 h of CPU on a single instance was completed in 48.6 min when executed on 64 instances in parallel (speedup of 37×). Furthermore, the use of cloud computing for parallel computing offers some advantages over conventional clusters: high accessibility, scalability and pay per usage. Therefore, it is strongly believed that cloud computing will play an important role in making Monte Carlo dose calculation a reality in future clinical practice.
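
    The scaling behaviour reported above follows Amdahl's law, S(N) = 1 / ((1 - p) + p / N) for a parallelizable fraction p on N instances. The short check below compares the speedup observed in the abstract (30 h of serial CPU versus 48.6 min on 64 instances) with the prediction for an assumed parallel fraction; the value of p is a guess chosen to land near the reported 37x, not a number taken from the paper.

```python
def amdahl_speedup(p, n):
    """Amdahl's law: speedup on n workers for parallelizable fraction p."""
    return 1.0 / ((1.0 - p) + p / n)

p = 0.988                                    # assumed parallelizable fraction (illustrative)
for n in (1, 8, 16, 32, 64):
    print(f"N = {n:3d}  predicted speedup = {amdahl_speedup(p, n):5.1f}x")

serial_hours = 30.0                          # reported single-instance CPU time
observed_minutes = 48.6                      # reported wall time on 64 instances
print("observed speedup = %.1fx" % (serial_hours * 60.0 / observed_minutes))
```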

  17. Performance Management of High Performance Computing for Medical Image Processing in Amazon Web Services.

    Science.gov (United States)

    Bao, Shunxing; Damon, Stephen M; Landman, Bennett A; Gokhale, Aniruddha

    2016-02-27

    Adopting high performance cloud computing for medical image processing is a popular trend given the pressing needs of large studies. Amazon Web Services (AWS) provide reliable, on-demand, and inexpensive cloud computing services. Our research objective is to implement an affordable, scalable and easy-to-use AWS framework for the Java Image Science Toolkit (JIST). JIST is a plugin for Medical-Image Processing, Analysis, and Visualization (MIPAV) that provides a graphical pipeline implementation allowing users to quickly test and develop pipelines. JIST is DRMAA-compliant allowing it to run on portable batch system grids. However, as new processing methods are implemented and developed, memory may often be a bottleneck for not only lab computers, but also possibly some local grids. Integrating JIST with the AWS cloud alleviates these possible restrictions and does not require users to have deep knowledge of programming in Java. Workflow definition/management and cloud configurations are two key challenges in this research. Using a simple unified control panel, users have the ability to set the numbers of nodes and select from a variety of pre-configured AWS EC2 nodes with different numbers of processors and memory storage. Intuitively, we configured Amazon S3 storage to be mounted by pay-for-use Amazon EC2 instances. Hence, S3 storage is recognized as a shared cloud resource. The Amazon EC2 instances provide pre-installs of all necessary packages to run JIST. This work presents an implementation that facilitates the integration of JIST with AWS. We describe the theoretical cost/benefit formulae to decide between local serial execution versus cloud computing and apply this analysis to an empirical diffusion tensor imaging pipeline.

  18. Performance management of high performance computing for medical image processing in Amazon Web Services

    Science.gov (United States)

    Bao, Shunxing; Damon, Stephen M.; Landman, Bennett A.; Gokhale, Aniruddha

    2016-03-01

    Adopting high performance cloud computing for medical image processing is a popular trend given the pressing needs of large studies. Amazon Web Services (AWS) provide reliable, on-demand, and inexpensive cloud computing services. Our research objective is to implement an affordable, scalable and easy-to-use AWS framework for the Java Image Science Toolkit (JIST). JIST is a plugin for Medical-Image Processing, Analysis, and Visualization (MIPAV) that provides a graphical pipeline implementation allowing users to quickly test and develop pipelines. JIST is DRMAA-compliant allowing it to run on portable batch system grids. However, as new processing methods are implemented and developed, memory may often be a bottleneck for not only lab computers, but also possibly some local grids. Integrating JIST with the AWS cloud alleviates these possible restrictions and does not require users to have deep knowledge of programming in Java. Workflow definition/management and cloud configurations are two key challenges in this research. Using a simple unified control panel, users have the ability to set the numbers of nodes and select from a variety of pre-configured AWS EC2 nodes with different numbers of processors and memory storage. Intuitively, we configured Amazon S3 storage to be mounted by pay-for-use Amazon EC2 instances. Hence, S3 storage is recognized as a shared cloud resource. The Amazon EC2 instances provide pre-installs of all necessary packages to run JIST. This work presents an implementation that facilitates the integration of JIST with AWS. We describe the theoretical cost/benefit formulae to decide between local serial execution versus cloud computing and apply this analysis to an empirical diffusion tensor imaging pipeline.
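
    The abstract mentions cost/benefit formulae for choosing between local serial execution and cloud execution but does not state them, so the sketch below is only a generic illustration of that kind of comparison: elapsed time from an Amdahl-style model against the dollar cost of pay-per-use instances. The prices, runtimes and parallel fraction are invented, and the formula is not the authors' published one.

```python
def cloud_wall_hours(serial_hours, n_instances, parallel_fraction=0.95):
    """Amdahl-style estimate of elapsed time when the job is spread over n instances."""
    return serial_hours * ((1.0 - parallel_fraction) + parallel_fraction / n_instances)

def cloud_cost_dollars(wall_hours, n_instances, price_per_instance_hour=0.50):
    """Pay-per-use cost: every instance is billed for the full elapsed time."""
    return wall_hours * n_instances * price_per_instance_hour

serial_hours = 40.0                              # hypothetical local serial runtime
print(f"local serial: {serial_hours:5.1f} h elapsed, no marginal cost")
for n in (4, 16, 64):
    t = cloud_wall_hours(serial_hours, n)
    print(f"{n:3d} instances: {t:5.1f} h elapsed, ${cloud_cost_dollars(t, n):7.2f}")
```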

  19. High Performance Molecular Visualization: In-Situ and Parallel Rendering with EGL

    Science.gov (United States)

    Stone, John E.; Messmer, Peter; Sisneros, Robert; Schulten, Klaus

    2016-01-01

    Large scale molecular dynamics simulations produce terabytes of data that is impractical to transfer to remote facilities. It is therefore necessary to perform visualization tasks in-situ as the data are generated, or by running interactive remote visualization sessions and batch analyses co-located with direct access to high performance storage systems. A significant challenge for deploying visualization software within clouds, clusters, and supercomputers involves the operating system software required to initialize and manage graphics acceleration hardware. Recently, it has become possible for applications to use the Embedded-system Graphics Library (EGL) to eliminate the requirement for windowing system software on compute nodes, thereby eliminating a significant obstacle to broader use of high performance visualization applications. We outline the potential benefits of this approach in the context of visualization applications used in the cloud, on commodity clusters, and supercomputers. We discuss the implementation of EGL support in VMD, a widely used molecular visualization application, and we outline benefits of the approach for molecular visualization tasks on petascale computers, clouds, and remote visualization servers. We then provide a brief evaluation of the use of EGL in VMD, with tests using developmental graphics drivers on conventional workstations and on Amazon EC2 G2 GPU-accelerated cloud instance types. We expect that the techniques described here will be of broad benefit to many other visualization applications. PMID:27747137

  20. THE IMPROVEMENT OF COMPUTER NETWORK PERFORMANCE WITH BANDWIDTH MANAGEMENT IN KEMURNIAN II SENIOR HIGH SCHOOL

    Directory of Open Access Journals (Sweden)

    Bayu Kanigoro

    2012-05-01

    Full Text Available This research describes the improvement of computer network performance with bandwidth management in Kemurnian II Senior High School. The main issue addressed is the absence of bandwidth division among computers: when a user downloads data, that user absorbs the whole of the available bandwidth, so other users do not get any. In addition, IP addresses had been divided by room (computer, teacher and administration rooms) to support the learning process in Kemurnian II Senior High School, so a wireless network is needed. The methods used were on-site observation and interviews with the related parties at Kemurnian II Senior High School, analysis of the existing network, and the design of a new network topology, including the wireless network, its configuration, and bandwidth separation and limitation on a MikroTik router. The result is that network traffic at Kemurnian II Senior High School can be shared evenly among users, and IX and IIX traffic are separated, which improves the speed of network access at the school together with the implementation of the wireless network. Keywords: Bandwidth Management; Wireless Network

  1. WOMBAT: A Scalable and High-performance Astrophysical Magnetohydrodynamics Code

    Energy Technology Data Exchange (ETDEWEB)

    Mendygral, P. J.; Radcliffe, N.; Kandalla, K. [Cray Inc., St. Paul, MN 55101 (United States); Porter, D. [Minnesota Supercomputing Institute for Advanced Computational Research, Minneapolis, MN USA (United States); O’Neill, B. J.; Nolting, C.; Donnert, J. M. F.; Jones, T. W. [School of Physics and Astronomy, University of Minnesota, Minneapolis, MN 55455 (United States); Edmon, P., E-mail: pjm@cray.com, E-mail: nradclif@cray.com, E-mail: kkandalla@cray.com, E-mail: oneill@astro.umn.edu, E-mail: nolt0040@umn.edu, E-mail: donnert@ira.inaf.it, E-mail: twj@umn.edu, E-mail: dhp@umn.edu, E-mail: pedmon@cfa.harvard.edu [Institute for Theory and Computation, Center for Astrophysics, Harvard University, Cambridge, MA 02138 (United States)

    2017-02-01

    We present a new code for astrophysical magnetohydrodynamics specifically designed and optimized for high performance and scaling on modern and future supercomputers. We describe a novel hybrid OpenMP/MPI programming model that emerged from a collaboration between Cray, Inc. and the University of Minnesota. This design utilizes MPI-RMA optimized for thread scaling, which allows the code to run extremely efficiently at very high thread counts ideal for the latest generation of multi-core and many-core architectures. Such performance characteristics are needed in the era of “exascale” computing. We describe and demonstrate our high-performance design in detail with the intent that it may be used as a model for other, future astrophysical codes intended for applications demanding exceptional performance.

  2. WOMBAT: A Scalable and High-performance Astrophysical Magnetohydrodynamics Code

    International Nuclear Information System (INIS)

    Mendygral, P. J.; Radcliffe, N.; Kandalla, K.; Porter, D.; O’Neill, B. J.; Nolting, C.; Donnert, J. M. F.; Jones, T. W.; Edmon, P.

    2017-01-01

    We present a new code for astrophysical magnetohydrodynamics specifically designed and optimized for high performance and scaling on modern and future supercomputers. We describe a novel hybrid OpenMP/MPI programming model that emerged from a collaboration between Cray, Inc. and the University of Minnesota. This design utilizes MPI-RMA optimized for thread scaling, which allows the code to run extremely efficiently at very high thread counts ideal for the latest generation of multi-core and many-core architectures. Such performance characteristics are needed in the era of “exascale” computing. We describe and demonstrate our high-performance design in detail with the intent that it may be used as a model for other, future astrophysical codes intended for applications demanding exceptional performance.
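
    WOMBAT's communication layer relies on one-sided MPI (MPI-RMA), in which a rank writes directly into a window of memory exposed by another rank with no matching receive call on the target. The sketch below shows the basic fence/put pattern with mpi4py purely for brevity; the choice of Python here is an assumption for illustration and says nothing about the code's own implementation language or its thread-scaling optimizations.

```python
# One-sided ring exchange: each rank puts its data into its right-hand neighbour's window.
# Run with e.g.  mpiexec -n 4 python rma_ring.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

local = np.zeros(4, dtype="d")                 # window buffer exposed to other ranks
win = MPI.Win.Create(local, comm=comm)

outgoing = np.full(4, float(rank), dtype="d")  # data this rank pushes to its neighbour
target = (rank + 1) % size

win.Fence()                                    # open an RMA access epoch
win.Put(outgoing, target)                      # one-sided write into the neighbour's window
win.Fence()                                    # close the epoch; remote data is now visible

print(f"rank {rank} received from rank {(rank - 1) % size}: {local}")
win.Free()
```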

  3. Unified, Cross-Platform, Open-Source Library Package for High-Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Kozacik, Stephen [EM Photonics, Inc., Newark, DE (United States)

    2017-05-15

    Compute power is continually increasing, but this increased performance is largely found in sophisticated computing devices and supercomputer resources that are difficult to use, resulting in under-utilization. We developed a unified set of programming tools that will allow users to take full advantage of the new technology by allowing them to work at a level abstracted away from the platform specifics, encouraging the use of modern computing systems, including government-funded supercomputer facilities.

  4. Application of secondary ion mass spectrometry for the characterization of commercial high performance materials

    International Nuclear Information System (INIS)

    Gritsch, M.

    2000-09-01

    The industry today offers a vast number of high performance materials that have to meet the highest standards. Commercial high performance materials, though often sold in large quantities, still require ongoing research and development to keep up to date with increasing needs and decreasing tolerances. Furthermore, a variety of materials is on the market that are not fully understood in their microstructure, in the way they react under application conditions, and in which mechanisms are responsible for their degradation. Secondary Ion Mass Spectrometry (SIMS) is an analytical method that has now been in commercial use for over 30 years. Its main advantages are the very high detection sensitivity (down to ppb), the ability to measure all elements with isotopic sensitivity, the ability to obtain laterally resolved images, and the inherent capability of depth-profiling. These features make it an ideal tool for a wide field of applications within advanced material science. The present work gives an introduction to the principles of SIMS and shows its successful application for the characterization of commercially used high performance materials. Finally, a selected collection of my publications in reviewed journals illustrates the state of the art in applied materials research and development with dynamic SIMS. All publications focus on the application of dynamic SIMS to analytical questions arising during the production and improvement of high-performance materials. (author)

  5. A Case Study on Neural Inspired Dynamic Memory Management Strategies for High Performance Computing.

    Energy Technology Data Exchange (ETDEWEB)

    Vineyard, Craig Michael [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Verzi, Stephen Joseph [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2017-09-01

    As high performance computing architectures pursue more computational power there is a need for increased memory capacity and bandwidth as well. A multi-level memory (MLM) architecture addresses this need by combining multiple memory types with different characteristics as varying levels of the same architecture. How to efficiently utilize this memory infrastructure is an unknown challenge, and in this research we sought to investigate whether neural inspired approaches can meaningfully help with memory management. In particular we explored neurogenesis inspired resource allocation, and were able to show a neural inspired mixed controller policy can beneficially impact how MLM architectures utilize memory.

  6. A parallel calibration utility for WRF-Hydro on high performance computers

    Science.gov (United States)

    Wang, J.; Wang, C.; Kotamarthi, V. R.

    2017-12-01

    A successful modeling of complex hydrological processes comprises establishing an integrated hydrological model which simulates the hydrological processes in each water regime, calibrates and validates the model performance based on observation data, and estimates the uncertainties from different sources, especially those associated with parameters. Such a model system requires large computing resources and often has to be run on High Performance Computers (HPC). The recently developed WRF-Hydro modeling system provides a significant advancement in the capability to simulate regional water cycles more completely. The WRF-Hydro model has a large range of parameters such as those in the input table files — GENPARM.TBL, SOILPARM.TBL and CHANPARM.TBL — and several distributed scaling factors such as OVROUGHRTFAC. These parameters affect the behavior and outputs of the model and thus may need to be calibrated against the observations in order to obtain a good modeling performance. Having a parameter calibration tool specifically for automated calibration and uncertainty estimation of the WRF-Hydro model can provide significant convenience for the modeling community. In this study, we developed a customized tool using the parallel version of the model-independent parameter estimation and uncertainty analysis tool, PEST, to enable it to run on HPC systems with the PBS and SLURM workload managers and job schedulers. We also developed a series of PEST input file templates specifically for WRF-Hydro model calibration and uncertainty analysis. Here we present a flood case study that occurred in April 2013 over the Midwest. The sensitivity and uncertainties are analyzed using the customized PEST tool we developed.

  7. Monitoring performance of a highly distributed and complex computing infrastructure in LHCb

    Science.gov (United States)

    Mathe, Z.; Haen, C.; Stagni, F.

    2017-10-01

    In order to ensure optimal performance of the LHCb Distributed Computing, based on LHCbDIRAC, it is necessary to be able to inspect the behavior over time of many components: not only the agents and services on which the infrastructure is built, but also all the computing tasks and data transfers that are managed by this infrastructure. This consists of recording and then analyzing time series of a large number of observables, for which the usage of SQL relational databases is far from optimal. Therefore within DIRAC we have been studying novel possibilities based on NoSQL databases (ElasticSearch, OpenTSDB and InfluxDB); as a result of this study we developed a new monitoring system based on ElasticSearch. It has been deployed on the LHCb Distributed Computing infrastructure, for which it collects data from all the components (agents, services, jobs) and allows creating reports through Kibana and a web user interface, which is based on the DIRAC web framework. In this paper we describe this new implementation of the DIRAC monitoring system. We give details on the ElasticSearch implementation within the DIRAC general framework, as well as an overview of the advantages of the pipeline aggregation used for creating a dynamic bucketing of the time series. We present the advantages of using the ElasticSearch DSL high-level library for creating and running queries. Finally, we present the performance of the system.
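
    To give a flavour of the queries mentioned above, the sketch below uses the elasticsearch-dsl Python library to bucket monitoring records into a date histogram and attach an average metric to each bucket. The endpoint, index name, field names and interval are illustrative assumptions and do not reflect the actual LHCbDIRAC schema, and the production system's pipeline aggregations are not reproduced here.

```python
from elasticsearch import Elasticsearch
from elasticsearch_dsl import Search

client = Elasticsearch(["http://localhost:9200"])    # assumed endpoint

# Count running jobs per hour and average a (hypothetical) CPUTime field per bucket.
s = (Search(using=client, index="jobs")              # "jobs" is an assumed index name
     .filter("term", Status="Running")
     .extra(size=0))                                 # aggregations only, no hits returned

s.aggs.bucket("per_hour", "date_histogram",
              field="timestamp", fixed_interval="1h") \
      .metric("avg_cpu", "avg", field="CPUTime")

response = s.execute()
for bucket in response.aggregations.per_hour.buckets:
    print(bucket.key_as_string, bucket.doc_count, bucket.avg_cpu.value)
```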

  8. High performance in software development

    CERN Multimedia

    CERN. Geneva; Haapio, Petri; Liukkonen, Juha-Matti

    2015-01-01

    What are the ingredients of high-performing software? Software development, especially for large high-performance systems, is one of the most complex tasks mankind has ever tried. Technological change leads to huge opportunities but challenges our old ways of working. Processing large data sets, possibly in real time or with other tight computational constraints, requires an efficient solution architecture. Efficiency requirements span from the distributed storage and large-scale organization of computation and data onto the lowest level of processor and data bus behavior. Integrating performance behavior over these levels is especially important when the computation is resource-bounded, as it is in numerics: physical simulation, machine learning, estimation of statistical models, etc. For example, memory locality and utilization of vector processing are essential for harnessing the computing power of modern processor architectures due to the deep memory hierarchies of modern general-purpose computers. As a r...

  9. Routing performance analysis and optimization within a massively parallel computer

    Science.gov (United States)

    Archer, Charles Jens; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen

    2013-04-16

    An apparatus, program product and method optimize the operation of a massively parallel computer system by, in part, receiving actual performance data concerning an application executed by the plurality of interconnected nodes, and analyzing the actual performance data to identify an actual performance pattern. A desired performance pattern may be determined for the application, and an algorithm may be selected from among a plurality of algorithms stored within a memory, the algorithm being configured to achieve the desired performance pattern based on the actual performance data.

  10. High temperature superconductors applications in telecommunications

    Energy Technology Data Exchange (ETDEWEB)

    Kumar, A.A.; Li, J.; Zhang, M.F. [Prairie View A& M Univ., Texas (United States)

    1994-12-31

    The purpose of this paper is twofold: to discuss high temperature superconductors with specific reference to their employment in telecommunications applications; and to discuss a few of the limitations of the normally employed two-fluid model. While the debate on the actual usage of high temperature superconductors in the design of electronic and telecommunications devices (obvious advantages versus practical difficulties) needs to be settled in the near future, it is of great interest to investigate the parameters and the assumptions that will be employed in such designs. This paper deals with the issue of providing the microwave design engineer with performance data for such superconducting waveguides. The values of conductivity and surface resistance, which are the primary determining factors of a waveguide performance, are computed based on the two-fluid model. A comparison between two models, a theoretical one in terms of microscopic parameters (termed Model A) and an experimental fit in terms of macroscopic parameters (termed Model B), shows the limitations and the resulting ambiguities of the two-fluid model at high frequencies and at temperatures close to the transition temperature. The validity of the two-fluid model is then discussed. Our preliminary results show that the electrical transport description in the normal and superconducting phases as they are formulated in the two-fluid model needs to be modified to incorporate the new and special features of high temperature superconductors. Parameters describing the waveguide performance (conductivity, surface resistance and attenuation constant) will be computed. Potential applications in communications networks and large scale integrated circuits will be discussed. Some of the ongoing work will be reported. In particular, a brief proposal is made to investigate the effects of electromagnetic interference and the concomitant notion of electromagnetic compatibility (EMI/EMC) of high Tc superconductors.

  11. High temperature superconductors applications in telecommunications

    International Nuclear Information System (INIS)

    Kumar, A.A.; Li, J.; Zhang, M.F.

    1994-01-01

    The purpose of this paper is twofold: to discuss high temperature superconductors with specific reference to their employment in telecommunications applications; and to discuss a few of the limitations of the normally employed two-fluid model. While the debate on the actual usage of high temperature superconductors in the design of electronic and telecommunications devices (obvious advantages versus practical difficulties) needs to be settled in the near future, it is of great interest to investigate the parameters and the assumptions that will be employed in such designs. This paper deals with the issue of providing the microwave design engineer with performance data for such superconducting waveguides. The values of conductivity and surface resistance, which are the primary determining factors of a waveguide performance, are computed based on the two-fluid model. A comparison between two models, a theoretical one in terms of microscopic parameters (termed Model A) and an experimental fit in terms of macroscopic parameters (termed Model B), shows the limitations and the resulting ambiguities of the two-fluid model at high frequencies and at temperatures close to the transition temperature. The validity of the two-fluid model is then discussed. Our preliminary results show that the electrical transport description in the normal and superconducting phases as they are formulated in the two-fluid model needs to be modified to incorporate the new and special features of high temperature superconductors. Parameters describing the waveguide performance (conductivity, surface resistance and attenuation constant) will be computed. Potential applications in communications networks and large scale integrated circuits will be discussed. Some of the ongoing work will be reported. In particular, a brief proposal is made to investigate the effects of electromagnetic interference and the concomitant notion of electromagnetic compatibility (EMI/EMC) of high Tc superconductors
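
    For reference, the textbook two-fluid expressions for the microwave surface impedance of a superconductor in the low-frequency limit (omega * tau << 1) are R_s ≈ 0.5 * omega^2 * mu0^2 * lambda^3 * sigma_n for the surface resistance and X_s ≈ omega * mu0 * lambda for the surface reactance, where lambda is the penetration depth and sigma_n the normal-fluid conductivity. The numerical sketch below evaluates these with illustrative parameter values; it does not reproduce the paper's Model A or Model B.

```python
import math

mu0 = 4.0e-7 * math.pi           # vacuum permeability [H/m]
f = 10.0e9                       # microwave frequency [Hz]
omega = 2.0 * math.pi * f
lam = 200e-9                     # penetration depth [m], illustrative order of magnitude
sigma_n = 1.0e6                  # normal-fluid conductivity [S/m], illustrative

R_s = 0.5 * omega**2 * mu0**2 * lam**3 * sigma_n   # two-fluid surface resistance
X_s = omega * mu0 * lam                            # two-fluid surface reactance
print(f"R_s = {R_s * 1e6:.1f} micro-ohm, X_s = {X_s * 1e3:.1f} milli-ohm at {f / 1e9:.0f} GHz")
```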

  12. Leveraging the Power of High Performance Computing for Next Generation Sequencing Data Analysis: Tricks and Twists from a High Throughput Exome Workflow

    Science.gov (United States)

    Wonczak, Stephan; Thiele, Holger; Nieroda, Lech; Jabbari, Kamel; Borowski, Stefan; Sinha, Vishal; Gunia, Wilfried; Lang, Ulrich; Achter, Viktor; Nürnberg, Peter

    2015-01-01

    Next generation sequencing (NGS) has been a great success and is now a standard method of research in the life sciences. With this technology, dozens of whole genomes or hundreds of exomes can be sequenced in a rather short time, producing huge amounts of data. Complex bioinformatics analyses are required to turn these data into scientific findings. In order to run these analyses fast, automated workflows implemented on high performance computers are state of the art. While providing sufficient compute power and storage to meet the NGS data challenge, high performance computing (HPC) systems require special care when utilized for high throughput processing. This is especially true if the HPC system is shared by different users. Here, stability, robustness and maintainability are as important for automated workflows as speed and throughput. To achieve all of these aims, dedicated solutions have to be developed. In this paper, we present the tricks and twists that we utilized in the implementation of our exome data processing workflow. It may serve as a guideline for other high throughput data analysis projects using a similar infrastructure. The code implementing our solutions is provided in the supporting information files. PMID:25942438

  13. Infrastructure for Multiphysics Software Integration in High Performance Computing-Aided Science and Engineering

    Energy Technology Data Exchange (ETDEWEB)

    Campbell, Michael T. [Illinois Rocstar LLC, Champaign, IL (United States); Safdari, Masoud [Illinois Rocstar LLC, Champaign, IL (United States); Kress, Jessica E. [Illinois Rocstar LLC, Champaign, IL (United States); Anderson, Michael J. [Illinois Rocstar LLC, Champaign, IL (United States); Horvath, Samantha [Illinois Rocstar LLC, Champaign, IL (United States); Brandyberry, Mark D. [Illinois Rocstar LLC, Champaign, IL (United States); Kim, Woohyun [Illinois Rocstar LLC, Champaign, IL (United States); Sarwal, Neil [Illinois Rocstar LLC, Champaign, IL (United States); Weisberg, Brian [Illinois Rocstar LLC, Champaign, IL (United States)

    2016-10-15

    The project described in this report constructed and exercised an innovative multiphysics coupling toolkit called the Illinois Rocstar MultiPhysics Application Coupling Toolkit (IMPACT). IMPACT is an open source, flexible, natively parallel infrastructure for coupling multiple uniphysics simulation codes into multiphysics computational systems. IMPACT works with codes written in several high-performance-computing (HPC) programming languages, and is designed from the beginning for HPC multiphysics code development. It is designed to be minimally invasive to the individual physics codes being integrated, and has few requirements on those physics codes for integration. The goal of IMPACT is to provide the support needed to enable coupling existing tools together in unique and innovative ways to produce powerful new multiphysics technologies without extensive modification and rewrite of the physics packages being integrated. There are three major outcomes from this project: 1) construction, testing, application, and open-source release of the IMPACT infrastructure, 2) production of example open-source multiphysics tools using IMPACT, and 3) identification and engagement of interested organizations in the tools and applications resulting from the project. This last outcome represents the incipient development of a user community and application ecosystem being built using IMPACT. Multiphysics coupling standardization can only come from organizations working together to define needs and processes that span the space of necessary multiphysics outcomes, which Illinois Rocstar plans to continue driving toward. The IMPACT system, including source code, documentation, and test problems, is now available through the public GitHub.org system to anyone interested in multiphysics code coupling. Many of the basic documents explaining the use and architecture of IMPACT are also attached as appendices to this document. Online HTML documentation is available through the GitHub site.
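
    The minimally invasive coupling pattern described above can be sketched generically. The actual IMPACT API is documented on its GitHub site; the adapter classes, method names and toy physics updates below are hypothetical illustrations of the pattern, not IMPACT code.

```python
# Hypothetical illustration of a minimally invasive multiphysics coupling
# pattern (not the IMPACT API): each physics code is wrapped in a thin
# adapter exposing a few entry points, and an orchestrator exchanges
# interface data between the adapters at every coupling step.

class PhysicsAdapter:
    """Thin wrapper a host code would implement; methods are placeholders."""
    def advance(self, dt): ...
    def get_interface_data(self): ...
    def set_interface_data(self, data): ...

class HeatSolver(PhysicsAdapter):
    def __init__(self): self.T_wall = 300.0
    def advance(self, dt): self.T_wall += 5.0 * dt          # toy update
    def get_interface_data(self): return {"T_wall": self.T_wall}
    def set_interface_data(self, data): self.q_wall = data["q_wall"]

class FlowSolver(PhysicsAdapter):
    def __init__(self): self.q_wall = 0.0
    def advance(self, dt): self.q_wall = 0.1 * self.T_bc    # toy update
    def get_interface_data(self): return {"q_wall": self.q_wall}
    def set_interface_data(self, data): self.T_bc = data["T_wall"]

def couple(codes, dt, steps):
    for _ in range(steps):
        # exchange interface data, then advance each code independently
        data = [c.get_interface_data() for c in codes]
        codes[0].set_interface_data(data[1])
        codes[1].set_interface_data(data[0])
        for c in codes:
            c.advance(dt)

heat, flow = HeatSolver(), FlowSolver()
couple([heat, flow], dt=0.1, steps=10)
print(heat.T_wall, flow.q_wall)
```

    In this pattern each physics code only needs to expose a handful of entry points, which is the sense in which the coupling is minimally invasive.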

  14. The effects of laboratory inquiry-based experiments and computer simulations on high school students' performance and cognitive load in physics teaching

    Directory of Open Access Journals (Sweden)

    Radulović Branka

    2016-01-01

    Full Text Available The main goal of this study was to examine the extent to which different teaching instructions focused on the application of laboratory inquiry-based experiments (LIBEs and interactive computer based simulations (ICBSs improved understanding of physics content in high school students, compared to a traditional teaching approach. Additionally, the study examined how the applied instructions influenced students’ assessment of invested cognitive load. A convenience sample of this research included 187 high school students. A multiple-choice test of knowledge was used as a measuring instrument for the students’ performance. Each task in the test was followed by a five-point Likert-type scale for the evaluation of invested cognitive load. In addition to descriptive statistics, one-factor analysis of variance and Tukey’s post-hoc test were computed to determine significant differences in performance and cognitive load and to calculate the instructional efficiency of the applied instructional designs. The findings indicate that teaching instructions based on the use of LIBEs and ICBSs equally contribute to an increase in students’ performance and the reduction of cognitive load, unlike traditional teaching of Physics. The results obtained by the students from the LIBEs and ICBSs groups for the calculated instructional efficiency suggest that the applied teaching strategies represent effective teaching instructions. [Project of the Ministry of Science of the Republic of Serbia, No. 179010: The Quality of Education System in Serbia from European Perspective]
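
    The statistical treatment described above can be sketched on synthetic data: a one-factor ANOVA, Tukey's post-hoc comparisons, and an instructional-efficiency score per group. The numbers are invented, and the efficiency formula E = (zP - zE)/sqrt(2) follows the commonly used Paas/van Merrienboer definition, which is an assumption here rather than a detail taken from the paper.

```python
# Synthetic illustration of the analysis: one-factor ANOVA, Tukey's HSD
# post-hoc test (scipy.stats.tukey_hsd, available in recent SciPy releases),
# and instructional efficiency E = (z_performance - z_effort) / sqrt(2).
# All data are made up; the efficiency formula is an assumption.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
scores = {
    "traditional": rng.normal(60, 10, 60),      # test scores per group
    "LIBE":        rng.normal(72, 10, 60),
    "ICBS":        rng.normal(71, 10, 60),
}
effort = {k: rng.normal(m, 0.8, 60)             # 1-5 Likert-type ratings
          for k, m in [("traditional", 3.8), ("LIBE", 3.0), ("ICBS", 3.1)]}

F, p = stats.f_oneway(*scores.values())
print(f"one-factor ANOVA: F = {F:.2f}, p = {p:.4f}")
print(stats.tukey_hsd(*scores.values()))        # pairwise group comparisons

all_perf = np.concatenate(list(scores.values()))
all_eff = np.concatenate(list(effort.values()))
for name in scores:
    zP = (scores[name].mean() - all_perf.mean()) / all_perf.std()
    zE = (effort[name].mean() - all_eff.mean()) / all_eff.std()
    print(f"{name:12s} instructional efficiency E = {(zP - zE) / np.sqrt(2):+.2f}")
```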

  15. Polymorphous Computing Architecture (PCA) Application Benchmark 1: Three-Dimensional Radar Data Processing

    National Research Council Canada - National Science Library

    Lebak, J

    2001-01-01

    The DARPA Polymorphous Computing Architecture (PCA) program is building advanced computer architectures that can reorganize their computation and communication structures to achieve better overall application performance...

  16. Running Interactive Jobs on Peregrine | High-Performance Computing | NREL

    Science.gov (United States)

    Interactive jobs provide a shell prompt, which allows users to execute commands and scripts as they would on the login nodes, with the work performed on the compute nodes rather than on the login nodes. This page provides instructions and examples of how to start interactive jobs, launch GUIs, etc., so that the commands execute on the allocated compute node instead of on the login node. The -V option

  17. High-performance integrated field-effect transistor-based sensors

    Energy Technology Data Exchange (ETDEWEB)

    Adzhri, R., E-mail: adzhri@gmail.com [Institute of Nano Electronic Engineering (INEE), Universiti Malaysia Perlis (UniMAP), Kangar, Perlis (Malaysia); Md Arshad, M.K., E-mail: mohd.khairuddin@unimap.edu.my [Institute of Nano Electronic Engineering (INEE), Universiti Malaysia Perlis (UniMAP), Kangar, Perlis (Malaysia); School of Microelectronic Engineering (SoME), Universiti Malaysia Perlis (UniMAP), Kangar, Perlis (Malaysia); Gopinath, Subash C.B., E-mail: subash@unimap.edu.my [Institute of Nano Electronic Engineering (INEE), Universiti Malaysia Perlis (UniMAP), Kangar, Perlis (Malaysia); School of Bioprocess Engineering (SBE), Universiti Malaysia Perlis (UniMAP), Arau, Perlis (Malaysia); Ruslinda, A.R., E-mail: ruslinda@unimap.edu.my [Institute of Nano Electronic Engineering (INEE), Universiti Malaysia Perlis (UniMAP), Kangar, Perlis (Malaysia); Fathil, M.F.M., E-mail: faris.fathil@gmail.com [Institute of Nano Electronic Engineering (INEE), Universiti Malaysia Perlis (UniMAP), Kangar, Perlis (Malaysia); Ayub, R.M., E-mail: ramzan@unimap.edu.my [Institute of Nano Electronic Engineering (INEE), Universiti Malaysia Perlis (UniMAP), Kangar, Perlis (Malaysia); Nor, M. Nuzaihan Mohd, E-mail: m.nuzaihan@unimap.edu.my [Institute of Nano Electronic Engineering (INEE), Universiti Malaysia Perlis (UniMAP), Kangar, Perlis (Malaysia); Voon, C.H., E-mail: chvoon@unimap.edu.my [Institute of Nano Electronic Engineering (INEE), Universiti Malaysia Perlis (UniMAP), Kangar, Perlis (Malaysia)

    2016-04-21

    Field-effect transistors (FETs) have succeeded in modern electronics in an era of computers and hand-held applications. Currently, considerable attention has been paid to direct electrical measurements, which work by monitoring changes in intrinsic electrical properties. Further, FET-based sensing systems drastically reduce cost, are compatible with CMOS technology, and ease down-stream applications. Current technologies for sensing applications rely on time-consuming strategies and processes and can only be performed under recommended conditions. To overcome these obstacles, an overview is presented here in which we specifically focus on high-performance FET-based sensor integration with nano-sized materials, which requires understanding the interaction of surface materials with the surrounding environment. Therefore, we present strategies, material depositions, device structures and other characteristics involved in FET-based devices. Special attention was given to silicon and polyaniline nanowires and graphene, which have attracted much interest due to their remarkable properties in sensing applications. - Highlights: • Performance of FET-based biosensors for the detection of biomolecules is presented. • Silicon nanowire, polyaniline and graphene are the highlighted nanoscaled materials as sensing transducers. • The importance of surface material interaction with the surrounding environment is discussed. • Different device structure architectures for ease in fabrication and high sensitivity of sensing are presented.

  18. High-performance integrated field-effect transistor-based sensors

    International Nuclear Information System (INIS)

    Adzhri, R.; Md Arshad, M.K.; Gopinath, Subash C.B.; Ruslinda, A.R.; Fathil, M.F.M.; Ayub, R.M.; Nor, M. Nuzaihan Mohd; Voon, C.H.

    2016-01-01

    Field-effect transistors (FETs) have succeeded in modern electronics in an era of computers and hand-held applications. Currently, considerable attention has been paid to direct electrical measurements, which work by monitoring changes in intrinsic electrical properties. Further, FET-based sensing systems drastically reduce cost, are compatible with CMOS technology, and ease down-stream applications. Current technologies for sensing applications rely on time-consuming strategies and processes and can only be performed under recommended conditions. To overcome these obstacles, an overview is presented here in which we specifically focus on high-performance FET-based sensor integration with nano-sized materials, which requires understanding the interaction of surface materials with the surrounding environment. Therefore, we present strategies, material depositions, device structures and other characteristics involved in FET-based devices. Special attention was given to silicon and polyaniline nanowires and graphene, which have attracted much interest due to their remarkable properties in sensing applications. - Highlights: • Performance of FET-based biosensors for the detection of biomolecules is presented. • Silicon nanowire, polyaniline and graphene are the highlighted nanoscaled materials as sensing transducers. • The importance of surface material interaction with the surrounding environment is discussed. • Different device structure architectures for ease in fabrication and high sensitivity of sensing are presented.

  19. A task-based parallelism and vectorized approach to 3D Method of Characteristics (MOC) reactor simulation for high performance computing architectures

    Science.gov (United States)

    Tramm, John R.; Gunow, Geoffrey; He, Tim; Smith, Kord S.; Forget, Benoit; Siegel, Andrew R.

    2016-05-01

    In this study we present and analyze a formulation of the 3D Method of Characteristics (MOC) technique applied to the simulation of full core nuclear reactors. Key features of the algorithm include a task-based parallelism model that allows independent MOC tracks to be assigned to threads dynamically, ensuring load balancing, and a wide vectorizable inner loop that takes advantage of modern SIMD computer architectures. The algorithm is implemented in a set of highly optimized proxy applications in order to investigate its performance characteristics on CPU, GPU, and Intel Xeon Phi architectures. Speed, power, and hardware cost efficiencies are compared. Additionally, performance bottlenecks are identified for each architecture in order to determine the prospects for continued scalability of the algorithm on next generation HPC architectures.
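
    The algorithmic point above can be made concrete with a minimal flat-source MOC segment sweep: the loop over independent tracks is the natural unit of task parallelism, while the per-segment update is vectorized over energy groups, mimicking the wide SIMD inner loop the abstract describes. This is a generic formulation in NumPy, not the authors' optimized proxy applications.

```python
# Generic flat-source MOC segment sweep (illustrative only, not the authors'
# proxy applications).
import numpy as np

def sweep_track(psi_in, seg_lengths, sigma_t, q):
    """Attenuate angular flux along one track.

    psi_in      : (G,)   incoming angular flux per energy group
    seg_lengths : (S,)   segment lengths along the track
    sigma_t     : (S, G) total cross section of the region each segment crosses
    q           : (S, G) flat source in each crossed region
    Returns the outgoing flux and per-segment average flux (for tallies).
    """
    psi = psi_in.astype(float).copy()
    seg_avg = np.empty_like(sigma_t, dtype=float)
    for s, length in enumerate(seg_lengths):      # serial walk along the track
        tau = sigma_t[s] * length
        delta = (psi - q[s] / sigma_t[s]) * (1.0 - np.exp(-tau))  # per-group
        seg_avg[s] = q[s] / sigma_t[s] + delta / tau
        psi -= delta                              # outgoing flux of this segment
    return psi, seg_avg

rng = np.random.default_rng(1)
G, S = 64, 128
psi_out, seg_avg = sweep_track(np.ones(G),
                               rng.uniform(0.1, 1.0, S),
                               rng.uniform(0.5, 2.0, (S, G)),
                               rng.uniform(0.0, 1.0, (S, G)))
print(psi_out.mean(), seg_avg.mean())
```

    In a task-based scheme, each call to sweep_track would be an independent work unit handed to a thread, while the per-group arithmetic maps onto SIMD lanes or GPU threads.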

  20. Improving engineers' performance with computers

    International Nuclear Information System (INIS)

    Purvis, E.E. III

    1984-01-01

    The problem addressed is how to improve the performance of engineers in the design, operation, and maintenance of nuclear power plants. The application of computer science to this problem offers a challenge in maximizing the use of developments outside the nuclear industry and setting priorities to address the most fruitful areas first. Areas of potential benefit include database management through design, analysis, procurement, construction, operation, maintenance, cost, schedule and interface control and planning, and quality engineering on specifications, inspection, and training

  1. Detecting Distributed Scans Using High-Performance Query-DrivenVisualization

    Energy Technology Data Exchange (ETDEWEB)

    Stockinger, Kurt; Bethel, E. Wes; Campbell, Scott; Dart, Eli; Wu,Kesheng

    2006-09-01

    Modern forensic analytics applications, like network traffic analysis, perform high-performance hypothesis testing, knowledge discovery and data mining on very large datasets. One essential strategy to reduce the time required for these operations is to select only the most relevant data records for a given computation. In this paper, we present a set of parallel algorithms that demonstrate how an efficient selection mechanism -- bitmap indexing -- significantly speeds up a common analysis task, namely, computing conditional histograms on very large datasets. We present a thorough study of the performance characteristics of the parallel conditional histogram algorithms. As a case study, we compute conditional histograms for detecting distributed scans hidden in a dataset consisting of approximately 2.5 billion network connection records. We show that these conditional histograms can be computed on an interactive timescale (i.e., in seconds). We also show how to progressively modify the selection criteria to narrow the analysis and find the sources of the distributed scans.
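
    The conditional-histogram task can be illustrated on synthetic connection records. In the paper the record selection is accelerated with compressed bitmap indexes; in the toy sketch below, plain NumPy boolean masks stand in for those indexes, so only the shape of the computation is represented, not its scale.

```python
# Toy conditional histogram over synthetic connection records. Real systems
# use compressed bitmap indexes to avoid scanning all rows; here boolean
# masks stand in for those indexes.
import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000                                        # connection records
dest_port = rng.choice([22, 80, 443, 3389], size=n)
src_host  = rng.integers(0, 5000, size=n)            # anonymized source id
flags     = rng.choice(["S", "SA", "R"], size=n, p=[0.2, 0.7, 0.1])

# Conditional histogram: for SYN-only connections to port 22, count per source
mask = (dest_port == 22) & (flags == "S")            # the selection condition
hist = np.bincount(src_host[mask], minlength=5000)

# Sources with unusually many records under the condition are candidate
# scanners; progressively tightening the condition narrows the search.
suspects = np.argsort(hist)[-5:]
print(suspects, hist[suspects])
```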

  2. Towards Process Support for Migrating Applications to Cloud Computing

    DEFF Research Database (Denmark)

    Chauhan, Muhammad Aufeef; Babar, Muhammad Ali

    2012-01-01

    Cloud computing is an active area of research for industry and academia. There are a large number of organizations providing cloud computing infrastructure and services. In order to utilize these infrastructure resources and services, existing applications need to be migrated to clouds. However...... for supporting migration to cloud computing based on our experiences from migrating an Open Source System (OSS), Hackystat, to two different cloud computing platforms. We explained the process by performing a comparative analysis of our efforts to migrate Hackystat to Amazon Web Services and Google App Engine.... We also report the potential challenges, suitable solutions, and lessons learned to support the presented process framework. We expect that the reported experiences can serve as guidelines for those who intend to migrate software applications to cloud computing....

  3. Industrial applications of computer tomography

    International Nuclear Information System (INIS)

    Sheng Kanglong; Qiang Yujun; Yang Fujia

    1992-01-01

    Industrial computer tomography (CT) and its application is a rapidly developing field of high technology. CT systems have been playing important roles in nondestructive testing (NDT) of products and equipment for a number of industries. Recently, the technique has advanced into the area of industrial process control, bringing even greater benefit to mankind. The basic principles and typical structure of an industrial CT system are presented, and descriptions are given of some successful CT systems for either NDT application or process control purposes

  4. Accelerated Synchrotron X-ray Diffraction Data Analysis on a Heterogeneous High Performance Computing System

    Energy Technology Data Exchange (ETDEWEB)

    Qin, J; Bauer, M A, E-mail: qin.jinhui@gmail.com, E-mail: bauer@uwo.ca [Computer Science Department, University of Western Ontario, London, ON N6A 5B7 (Canada)

    2010-11-01

    The analysis of synchrotron X-ray Diffraction (XRD) data has been used by scientists and engineers to understand and predict properties of materials. However, the large volume of XRD image data and the intensive computations involved in the data analysis make it hard for researchers to quickly reach any conclusions about the images from an experiment when using conventional XRD data analysis software. Synchrotron time is valuable and delays in XRD data analysis can impact decisions about subsequent experiments or about materials that they are investigating. In order to improve the data analysis performance, ideally to achieve near real time data analysis during an XRD experiment, we designed and implemented software for accelerated XRD data analysis. The software has been developed for a heterogeneous high performance computing (HPC) system, comprised of IBM PowerXCell 8i processors and Intel quad-core Xeon processors. This paper describes the software and reports on the improved performance. The results indicate that it is possible for XRD data to be analyzed at the rate it is being produced.

  5. Accelerated Synchrotron X-ray Diffraction Data Analysis on a Heterogeneous High Performance Computing System

    International Nuclear Information System (INIS)

    Qin, J; Bauer, M A

    2010-01-01

    The analysis of synchrotron X-ray Diffraction (XRD) data has been used by scientists and engineers to understand and predict properties of materials. However, the large volume of XRD image data and the intensive computations involved in the data analysis make it hard for researchers to quickly reach any conclusions about the images from an experiment when using conventional XRD data analysis software. Synchrotron time is valuable and delays in XRD data analysis can impact decisions about subsequent experiments or about materials that they are investigating. In order to improve the data analysis performance, ideally to achieve near real time data analysis during an XRD experiment, we designed and implemented software for accelerated XRD data analysis. The software has been developed for a heterogeneous high performance computing (HPC) system, comprised of IBM PowerXCell 8i processors and Intel quad-core Xeon processors. This paper describes the software and reports on the improved performance. The results indicate that it is possible for XRD data to be analyzed at the rate it is being produced.

  6. High Performance Computing (HPC) Challenge (HPCC) Benchmark Suite Development

    National Research Council Canada - National Science Library

    Dongarra, J. J

    2005-01-01

    .... The applications of performance modeling are numerous, including evaluation of algorithms, optimization of code implementation, parallel library development, and comparison of system architectures...

  7. Develop feedback system for intelligent dynamic resource allocation to improve application performance.

    Energy Technology Data Exchange (ETDEWEB)

    Gentile, Ann C.; Brandt, James M.; Tucker, Thomas (Open Grid Computing, Inc., Austin, TX); Thompson, David

    2011-09-01

    This report provides documentation for the completion of the Sandia Level II milestone 'Develop feedback system for intelligent dynamic resource allocation to improve application performance'. This milestone demonstrates the use of a scalable data collection analysis and feedback system that enables insight into how an application is utilizing the hardware resources of a high performance computing (HPC) platform in a lightweight fashion. Further we demonstrate utilizing the same mechanisms used for transporting data for remote analysis and visualization to provide low latency run-time feedback to applications. The ultimate goal of this body of work is performance optimization in the face of the ever increasing size and complexity of HPC systems.

  8. SISYPHUS: A high performance seismic inversion factory

    Science.gov (United States)

    Gokhberg, Alexey; Simutė, Saulė; Boehm, Christian; Fichtner, Andreas

    2016-04-01

    branches for the static process setup, inversion iterations, and solver runs, each branch specifying information at the event, station and channel levels. The workflow management framework is based on an embedded scripting engine that allows definition of various workflow scenarios using a high-level scripting language and provides access to all available inversion components represented as standard library functions. At present the SES3D wave propagation solver is integrated in the solution; the work is in progress for interfacing with SPECFEM3D. A separate framework is designed for interoperability with an optimization module; the workflow manager and optimization process run in parallel and cooperate by exchanging messages according to a specially designed protocol. A library of high-performance modules implementing signal pre-processing, misfit and adjoint computations according to established good practices is included. Monitoring is based on information stored in the inversion state database and at present implements a command line interface; design of a graphical user interface is in progress. The software design fits well into the common massively parallel system architecture featuring a large number of computational nodes running distributed applications under control of batch-oriented resource managers. The solution prototype has been implemented on the "Piz Daint" supercomputer provided by the Swiss Supercomputing Centre (CSCS).

  9. Computers for Manned Space Applications Based on Commercial Off-the-Shelf Components

    Science.gov (United States)

    Vogel, T.; Gronowski, M.

    2009-05-01

    Similar to the consumer markets, there has been an ever increasing demand in processing power, signal processing capabilities and memory space also for computers used for science data processing in space. An important driver of this development has been the payload developers for the International Space Station, requesting high-speed data acquisition and fast control loops in increasingly complex systems. Current experiments now even perform video processing and compression with their payload controllers. Nowadays the requirements for a space qualified computer are often far beyond the capabilities of, for example, the classic SPARC architecture that is found in ERC32 or LEON CPUs. An increase in performance usually demands costly and power consuming application specific solutions. Continuous developments over the last few years have now led to an alternative approach that is based on complete electronics modules manufactured for commercial and industrial customers. Computer modules used in industrial environments with a high demand for reliability under harsh environmental conditions like chemical reactors, electrical power plants or on manufacturing lines are entered into a selection procedure. Promising candidates then undergo a detailed characterisation process developed by Astrium Space Transportation. After thorough analysis and some modifications, these modules can replace fully qualified custom built electronics in specific, although not safety critical applications in manned space. This paper focuses on the benefits of COTS-based electronics modules and the necessary analyses and modifications for their utilisation in manned space applications on the ISS. Some considerations regarding overall systems architecture will also be included. Furthermore this paper will also pinpoint issues that render such modules unsuitable for specific tasks, and justify the reasons. Finally, the conclusion of this paper will advocate the implementation of COTS based

  10. Computer Applications in Educational Audiology.

    Science.gov (United States)

    Mendel, Lisa Lucks; And Others

    1995-01-01

    This article provides an overview of how computer technologies can be used by educational audiologists. Computer technologies are classified into three categories: (1) information systems applications; (2) screening and diagnostic applications; and (3) intervention applications. (Author/DB)

  11. Symbolic computation and its application to high energy physics

    International Nuclear Information System (INIS)

    Hearn, A.C.

    1981-01-01

    It is clear that we are in the middle of an electronic revolution whose effect will be as profound as that of the industrial revolution. The continuing advances in computing technology will provide us with devices which will make present day computers appear primitive. In this environment, the algebraic and other non-numerical capabilities of such devices will become increasingly important. These lectures will review the present state of the field of algebraic computation and its potential for problem solving in high energy physics and related areas. We shall begin with a brief description of the available systems and examine the data objects which they consider. As an example of the facilities which these systems can offer, we shall then consider the problem of analytic integration, since this is so fundamental to many of the calculational techniques used by high energy physicists. Finally, we shall study the implications which the current developments in hardware technology hold for scientific problem solving. (orig.)

  12. Additive Manufacturing and High-Performance Computing: a Disruptive Latent Technology

    Science.gov (United States)

    Goodwin, Bruce

    2015-03-01

    This presentation will discuss the relationship between recent advances in Additive Manufacturing (AM) technology, High-Performance Computing (HPC) simulation and design capabilities, and related advances in Uncertainty Quantification (UQ), and then examine their impacts upon national and international security. The presentation surveys how AM accelerates the fabrication process, while HPC combined with UQ provides a fast track for the engineering design cycle. The combination of AM and HPC/UQ almost eliminates the iterative engineering design and prototype cycle, thereby dramatically reducing the cost of production and time-to-market. These methods thereby present significant benefits for US national interests, both civilian and military, in an age of austerity. Finally, considering cyber security issues and the advent of the "cloud," these disruptive, currently latent technologies may well enable proliferation and so challenge both nuclear and non-nuclear aspects of international security.

  13. Predictive Performance Tuning of OpenACC Accelerated Applications

    KAUST Repository

    Siddiqui, Shahzeb

    2014-05-04

    Graphics Processing Units (GPUs) are gradually becoming mainstream in supercomputing as their capabilities to significantly accelerate a large spectrum of scientific applications have been clearly identified and proven. Moreover, with the introduction of high level programming models such as OpenACC [1] and OpenMP 4.0 [2], these devices are becoming more accessible and practical to use by a larger scientific community. However, performance optimization of OpenACC accelerated applications usually requires an in-depth knowledge of the hardware and software specifications. We suggest a prediction-based performance tuning mechanism [3] to quickly tune OpenACC parameters for a given application to dynamically adapt to the execution environment on a given system. This approach is applied to a finite difference kernel to tune the OpenACC gang and vector clauses for mapping the compute kernels into the underlying accelerator architecture. Our experiments show a significant performance improvement against the default compiler parameters and a faster tuning by an order of magnitude compared to the brute force search tuning.

  14. GPU-based high performance Monte Carlo simulation in neutron transport

    Energy Technology Data Exchange (ETDEWEB)

    Heimlich, Adino; Mol, Antonio C.A.; Pereira, Claudio M.N.A. [Instituto de Engenharia Nuclear (IEN/CNEN-RJ), Rio de Janeiro, RJ (Brazil). Lab. de Inteligencia Artificial Aplicada], e-mail: cmnap@ien.gov.br

    2009-07-01

    Graphics Processing Units (GPU) are high performance co-processors intended, originally, to improve the use and quality of computer graphics applications. Since researchers and practitioners realized the potential of using GPU for general purpose, their application has been extended to other fields out of computer graphics scope. The main objective of this work is to evaluate the impact of using GPU in neutron transport simulation by Monte Carlo method. To accomplish that, GPU- and CPU-based (single and multicore) approaches were developed and applied to a simple, but time-consuming problem. Comparisons demonstrated that the GPU-based approach is about 15 times faster than a parallel 8-core CPU-based approach also developed in this work. (author)

  15. GPU-based high performance Monte Carlo simulation in neutron transport

    International Nuclear Information System (INIS)

    Heimlich, Adino; Mol, Antonio C.A.; Pereira, Claudio M.N.A.

    2009-01-01

    Graphics Processing Units (GPU) are high performance co-processors intended, originally, to improve the use and quality of computer graphics applications. Since researchers and practitioners realized the potential of using GPU for general purpose, their application has been extended to other fields out of computer graphics scope. The main objective of this work is to evaluate the impact of using GPU in neutron transport simulation by Monte Carlo method. To accomplish that, GPU- and CPU-based (single and multicore) approaches were developed and applied to a simple, but time-consuming problem. Comparisons demonstrated that the GPU-based approach is about 15 times faster than a parallel 8-core CPU-based approach also developed in this work. (author)
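
    The data-parallel character that makes Monte Carlo transport a good fit for GPUs can be shown with a small analog toy problem: a homogeneous 1-D slab with capture and isotropic scattering, advanced batch-wise with vectorized NumPy operations. This is a generic sketch, not the authors' GPU code, and all cross-section values are arbitrary.

```python
# Minimal analog Monte Carlo toy (not the authors' implementation): a whole
# batch of histories is advanced with vectorized operations, the same
# data-parallel pattern a GPU kernel exploits.
import numpy as np

def slab_transmission(n_hist, thickness, sigma_t, c_scatter, seed=0):
    rng = np.random.default_rng(seed)
    x = np.zeros(n_hist)                      # positions
    mu = np.ones(n_hist)                      # direction cosines (start forward)
    alive = np.ones(n_hist, dtype=bool)
    transmitted = 0
    while alive.any():
        step = -np.log(rng.random(alive.sum())) / sigma_t   # free flight
        x[alive] += mu[alive] * step
        escaped_right = alive & (x > thickness)
        escaped_left = alive & (x < 0.0)
        transmitted += escaped_right.sum()
        alive &= ~(escaped_right | escaped_left)
        # collision for the remaining histories: capture or isotropic scatter
        collided = alive.copy()
        survive = rng.random(collided.sum()) < c_scatter
        idx = np.flatnonzero(collided)
        alive[idx[~survive]] = False                         # captured
        mu[idx[survive]] = rng.uniform(-1.0, 1.0, survive.sum())
    return transmitted / n_hist

print(slab_transmission(200_000, thickness=5.0, sigma_t=1.0, c_scatter=0.7))
```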

  16. Characterization of high performance silicon-based VMJ PV cells for laser power transmission applications

    Science.gov (United States)

    Perales, Mico; Yang, Mei-huan; Wu, Cheng-liang; Hsu, Chin-wei; Chao, Wei-sheng; Chen, Kun-hsien; Zahuranec, Terry

    2016-03-01

    Continuing improvements in the cost and power of laser diodes have been critical in launching the emerging fields of power over fiber (PoF), and laser power beaming. Laser power is transmitted either over fiber (for PoF), or through free space (power beaming), and is converted to electricity by photovoltaic cells designed to efficiently convert the laser light. MH GoPower's vertical multi-junction (VMJ) PV cell, designed for high intensity photovoltaic applications, is fueling the emergence of this market, by enabling unparalleled photovoltaic receiver flexibility in voltage, cell size, and power output. Our research examined the use of the VMJ PV cell for laser power transmission applications. We fully characterized the performance of the VMJ PV cell under various laser conditions, including multiple near IR wavelengths and light intensities up to tens of watts per cm2. Results indicated VMJ PV cell efficiency over 40% for 9xx nm wavelengths, at laser power densities near 30 W/cm2. We also investigated the impact of the physical dimensions (length, width, and height) of the VMJ PV cell on its performance, showing similarly high performance across a wide range of cell dimensions. We then evaluated the VMJ PV cell performance within the power over fiber application, examining the cell's effectiveness in receiver packages that deliver target voltage, intensity, and power levels. By designing and characterizing multiple receivers, we illustrated techniques for packaging the VMJ PV cell for achieving high performance (> 30%), high power (> 185 W), and target voltages for power over fiber applications.
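
    The efficiency figures quoted above reduce to simple arithmetic: electrical output power divided by the optical power incident on the cell. A minimal sketch with placeholder numbers (not measurements from the paper):

```python
# Back-of-envelope conversion efficiency for a laser-illuminated PV cell;
# the numbers below are placeholders, not data from the paper.
def laser_pv_efficiency(p_out_w, intensity_w_per_cm2, cell_area_cm2):
    p_in = intensity_w_per_cm2 * cell_area_cm2      # optical power on the cell
    return p_out_w / p_in

# e.g. a 1 cm^2 cell delivering 12 W electrical under 30 W/cm^2 illumination
print(f"{laser_pv_efficiency(12.0, 30.0, 1.0):.1%}")
```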

  17. Parallel Computing:. Some Activities in High Energy Physics

    Science.gov (United States)

    Willers, Ian

    This paper examines some activities in High Energy Physics that utilise parallel computing. The topic includes all computing from the proposed SIMD front end detectors, the farming applications, high-powered RISC processors and the large machines in the computer centers. We start by looking at the motivation behind using parallelism for general purpose computing. The developments around farming are then described from its simplest form to the more complex system in Fermilab. Finally, there is a list of some developments that are happening close to the experiments.

  18. Application of parallel computing techniques to a large-scale reservoir simulation

    International Nuclear Information System (INIS)

    Zhang, Keni; Wu, Yu-Shu; Ding, Chris; Pruess, Karsten

    2001-01-01

    Even with the continual advances made in both computational algorithms and computer hardware used in reservoir modeling studies, large-scale simulation of fluid and heat flow in heterogeneous reservoirs remains a challenge. The problem commonly arises from intensive computational requirement for detailed modeling investigations of real-world reservoirs. This paper presents the application of a massive parallel-computing version of the TOUGH2 code developed for performing large-scale field simulations. As an application example, the parallelized TOUGH2 code is applied to develop a three-dimensional unsaturated-zone numerical model simulating flow of moisture, gas, and heat in the unsaturated zone of Yucca Mountain, Nevada, a potential repository for high-level radioactive waste. The modeling approach employs refined spatial discretization to represent the heterogeneous fractured tuffs of the system, using more than a million 3-D gridblocks. The problem of two-phase flow and heat transfer within the model domain leads to a total of 3,226,566 linear equations to be solved per Newton iteration. The simulation is conducted on a Cray T3E-900, a distributed-memory massively parallel computer. Simulation results indicate that the parallel computing technique, as implemented in the TOUGH2 code, is very efficient. The reliability and accuracy of the model results have been demonstrated by comparing them to those of small-scale (coarse-grid) models. These comparisons show that simulation results obtained with the refined grid provide more detailed predictions of the future flow conditions at the site, aiding in the assessment of proposed repository performance

  19. Computer-Related Task Performance

    DEFF Research Database (Denmark)

    Longstreet, Phil; Xiao, Xiao; Sarker, Saonee

    2016-01-01

    The existing information system (IS) literature has acknowledged computer self-efficacy (CSE) as an important factor contributing to enhancements in computer-related task performance. However, the empirical results of CSE on performance have not always been consistent, and increasing an individual......'s CSE is often a cumbersome process. Thus, we introduce the theoretical concept of self-prophecy (SP) and examine how this social influence strategy can be used to improve computer-related task performance. Two experiments are conducted to examine the influence of SP on task performance. Results show...... that SP and CSE interact to influence performance. Implications are then discussed in terms of organizations’ ability to increase performance....

  20. Grid Computing Application for Brain Magnetic Resonance Image Processing

    International Nuclear Information System (INIS)

    Valdivia, F; Crépeault, B; Duchesne, S

    2012-01-01

    This work emphasizes the use of grid computing and web technology for automatic post-processing of brain magnetic resonance images (MRI) in the context of neuropsychiatric (Alzheimer's disease) research. Post-acquisition image processing is achieved through the interconnection of several individual processes into pipelines. Each process has input and output data ports, options and execution parameters, and performs single tasks such as: a) extracting individual image attributes (e.g. dimensions, orientation, center of mass), b) performing image transformations (e.g. scaling, rotation, skewing, intensity standardization, linear and non-linear registration), c) performing image statistical analyses, and d) producing the necessary quality control images and/or files for user review. The pipelines are built to perform specific sequences of tasks on the alphanumeric data and MRIs contained in our database. The web application is coded in PHP and allows the creation of scripts to create, store and execute pipelines and their instances either on our local cluster or on high-performance computing platforms. To run an instance on an external cluster, the web application opens a communication tunnel through which it copies the necessary files, submits the execution commands and collects the results. We present results from system tests for the processing of a set of 821 brain MRIs from the Alzheimer's Disease Neuroimaging Initiative study via a nonlinear registration pipeline composed of 10 processes. Our results show successful execution on both local and external clusters, and a 4-fold increase in performance if using the external cluster. However, the latter's performance does not scale linearly as queue waiting times and execution overhead increase with the number of tasks to be executed.
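
    The pipeline-of-processes structure described above can be sketched generically; the actual system is a PHP web application driving cluster jobs, so the Python classes and process names below are purely illustrative.

```python
# Generic sketch of a pipeline of processes with named data ports
# (hypothetical names; not the system described in the record).
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Process:
    name: str
    run: Callable[[dict], dict]          # consumes/produces named data ports

@dataclass
class Pipeline:
    processes: list = field(default_factory=list)
    def execute(self, item: dict) -> dict:
        for p in self.processes:
            item.update(p.run(item))     # outputs become inputs downstream
        return item

pipe = Pipeline([
    Process("read_attributes", lambda d: {"dims": (256, 256, 180)}),
    Process("intensity_standardize", lambda d: {"normalized": True}),
    Process("nonlinear_register", lambda d: {"registered_to": "template"}),
    Process("qc_report", lambda d: {"qc": "ok"}),
])
print(pipe.execute({"subject": "ADNI-0001"}))
```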

  1. On the Performance of the Python Programming Language for Serial and Parallel Scientific Computations

    Directory of Open Access Journals (Sweden)

    Xing Cai

    2005-01-01

    Full Text Available This article addresses the performance of scientific applications that use the Python programming language. First, we investigate several techniques for improving the computational efficiency of serial Python codes. Then, we discuss the basic programming techniques in Python for parallelizing serial scientific applications. It is shown that an efficient implementation of the array-related operations is essential for achieving good parallel performance, as for the serial case. Once the array-related operations are efficiently implemented, probably using a mixed-language implementation, good serial and parallel performance become achievable. This is confirmed by a set of numerical experiments. Python is also shown to be well suited for writing high-level parallel programs.
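
    The central point, that efficient array-related operations dominate Python performance, is easy to reproduce by comparing an explicit per-element loop with the equivalent vectorized NumPy expression:

```python
# Per-element Python loop versus the vectorized NumPy equivalent of a*x + y.
import time
import numpy as np

def axpy_loop(a, x, y):
    out = np.empty_like(x)
    for i in range(x.size):              # interpreted per-element loop
        out[i] = a * x[i] + y[i]
    return out

n = 1_000_000
x, y = np.random.rand(n), np.random.rand(n)

t0 = time.perf_counter(); axpy_loop(2.0, x, y); t1 = time.perf_counter()
t2 = time.perf_counter(); _ = 2.0 * x + y;      t3 = time.perf_counter()
print(f"loop: {t1 - t0:.3f} s   vectorized: {t3 - t2:.4f} s")
```

    On a typical interpreter the vectorized form is orders of magnitude faster, which is why mixed-language or array-library implementations of the inner kernels are essential for both serial and parallel performance.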

  2. Performance analysis of InSb based QWFET for ultra high speed applications

    International Nuclear Information System (INIS)

    Subash, T. D.; Gnanasekaran, T.; Divya, C.

    2015-01-01

    An indium antimonide based QWFET (quantum well field effect transistor) with a gate length down to 50 nm has been designed and investigated for the first time for L-band radar applications at 230 GHz. The QWFETs are designed to meet the drive current requirements of the high performance node of the International Technology Roadmap for Semiconductors (ITRS) (Semiconductor Industry Association 2010). The performance of the device is investigated using the SYNOPSYS TCAD software. InSb based QWFETs could be a promising device technology for very low power and ultra-high speed performance with 5–10 times lower DC power dissipation. (semiconductor devices)

  3. Development and Performance of the Modularized, High-performance Computing and Hybrid-architecture Capable GEOS-Chem Chemical Transport Model

    Science.gov (United States)

    Long, M. S.; Yantosca, R.; Nielsen, J.; Linford, J. C.; Keller, C. A.; Payer Sulprizio, M.; Jacob, D. J.

    2014-12-01

    The GEOS-Chem global chemical transport model (CTM), used by a large atmospheric chemistry research community, has been reengineered to serve as a platform for a range of computational atmospheric chemistry science foci and applications. Development included modularization for coupling to general circulation and Earth system models (ESMs) and the adoption of co-processor capable atmospheric chemistry solvers. This was done using an Earth System Modeling Framework (ESMF) interface that operates independently of GEOS-Chem scientific code to permit seamless transition from the GEOS-Chem stand-alone serial CTM to deployment as a coupled ESM module. In this manner, the continual stream of updates contributed by the CTM user community is automatically available for broader applications, which remain state-of-science and directly referenceable to the latest version of the standard GEOS-Chem CTM. These developments are now available as part of the standard version of the GEOS-Chem CTM. The system has been implemented as an atmospheric chemistry module within the NASA GEOS-5 ESM. The coupled GEOS-5/GEOS-Chem system was tested for weak and strong scalability and performance with a tropospheric oxidant-aerosol simulation. Results confirm that the GEOS-Chem chemical operator scales efficiently for any number of processes. Although inclusion of atmospheric chemistry in ESMs is computationally expensive, the excellent scalability of the chemical operator means that the relative cost goes down with increasing number of processes, making fine-scale resolution simulations possible.

  4. Profiling an application for power consumption during execution on a plurality of compute nodes

    Science.gov (United States)

    Archer, Charles J.; Blocksome, Michael A.; Peters, Amanda E.; Ratterman, Joseph D.; Smith, Brian E.

    2012-08-21

    Methods, apparatus, and products are disclosed for profiling an application for power consumption during execution on a compute node that include: receiving an application for execution on a compute node; identifying a hardware power consumption profile for the compute node, the hardware power consumption profile specifying power consumption for compute node hardware during performance of various processing operations; determining a power consumption profile for the application in dependence upon the application and the hardware power consumption profile for the compute node; and reporting the power consumption profile for the application.
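
    A generic sketch of the profiling idea (hypothetical names and numbers, not the patented implementation): combine a hardware power-consumption profile, expressed as power per class of processing operation, with the time the application spends in each class to report an application-level power profile.

```python
# Hypothetical illustration of combining a hardware power profile with
# per-class application timings; all names and values are made up.
hardware_profile_w = {        # watts drawn during each operation class
    "compute": 95.0,
    "memory":  60.0,
    "network": 45.0,
    "idle":    30.0,
}

app_time_s = {"compute": 120.0, "memory": 40.0, "network": 25.0, "idle": 15.0}

def application_power_profile(hw_profile, time_per_class):
    energy = {k: hw_profile[k] * t for k, t in time_per_class.items()}  # joules
    total_t = sum(time_per_class.values())
    return {"energy_J": energy,
            "avg_power_W": sum(energy.values()) / total_t}

print(application_power_profile(hardware_profile_w, app_time_s))
```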

  5. Amorphous Zinc Oxide Integrated Wavy Channel Thin Film Transistor Based High Performance Digital Circuits

    KAUST Repository

    Hanna, Amir; Hussain, Aftab M.; Omran, Hesham; Alshareef, Sarah; Salama, Khaled N.; Hussain, Muhammad Mustafa

    2015-01-01

    High performance thin film transistor (TFT) can be a great driving force for display, sensor/actuator, integrated electronics, and distributed computation for Internet of Everything applications. While semiconducting oxides like zinc oxide (Zn

  6. Hadoop-GIS: A High Performance Spatial Data Warehousing System over MapReduce.

    Science.gov (United States)

    Aji, Ablimit; Wang, Fusheng; Vo, Hoang; Lee, Rubao; Liu, Qiaoling; Zhang, Xiaodong; Saltz, Joel

    2013-08-01

    Support of high performance queries on large volumes of spatial data becomes increasingly important in many application domains, including geospatial problems in numerous fields, location based services, and emerging scientific applications that are increasingly data- and compute-intensive. The emergence of massive scale spatial data is due to the proliferation of cost effective and ubiquitous positioning technologies, development of high resolution imaging technologies, and contribution from a large number of community users. There are two major challenges for managing and querying massive spatial data to support spatial queries: the explosion of spatial data, and the high computational complexity of spatial queries. In this paper, we present Hadoop-GIS - a scalable and high performance spatial data warehousing system for running large scale spatial queries on Hadoop. Hadoop-GIS supports multiple types of spatial queries on MapReduce through spatial partitioning, the customizable spatial query engine RESQUE, implicit parallel spatial query execution on MapReduce, and effective methods for amending query results through handling boundary objects. Hadoop-GIS utilizes global partition indexing and customizable on demand local spatial indexing to achieve efficient query processing. Hadoop-GIS is integrated into Hive to support declarative spatial queries with an integrated architecture. Our experiments have demonstrated the high efficiency of Hadoop-GIS on query response and high scalability to run on commodity clusters. Our comparative experiments have shown that the performance of Hadoop-GIS is on par with parallel SDBMS and outperforms SDBMS for compute-intensive queries. Hadoop-GIS is available as a set of libraries for processing spatial queries, and as an integrated software package in Hive.
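
    The partition-then-query idea can be sketched in a few lines: records are hashed into grid tiles (a stand-in for global partition indexing) and a window query inspects only the tiles it overlaps (a stand-in for on-demand local indexing). Boundary-object handling and the MapReduce execution layer described in the paper are omitted.

```python
# Toy sketch of grid-based spatial partitioning and a window query
# (not Hadoop-GIS itself); synthetic points, arbitrary tile size.
import numpy as np
from collections import defaultdict

rng = np.random.default_rng(3)
pts = rng.uniform(0, 100, size=(100_000, 2))
TILE = 10.0

tiles = defaultdict(list)                      # "global" partition index
for i, (x, y) in enumerate(pts):
    tiles[(int(x // TILE), int(y // TILE))].append(i)

def window_query(xmin, ymin, xmax, ymax):
    hits = []
    for tx in range(int(xmin // TILE), int(xmax // TILE) + 1):
        for ty in range(int(ymin // TILE), int(ymax // TILE) + 1):
            for i in tiles[(tx, ty)]:          # only candidate tiles scanned
                x, y = pts[i]
                if xmin <= x <= xmax and ymin <= y <= ymax:
                    hits.append(i)
    return hits

print(len(window_query(42.0, 42.0, 55.0, 47.5)))
```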

  7. High Performance Fortran for Aerospace Applications

    National Research Council Canada - National Science Library

    Mehrotra, Piyush

    2000-01-01

    .... HPF is a set of Fortran extensions designed to provide users with a high-level interface for programming data parallel scientific applications while delegating to the compiler/runtime system the task...

  8. Computer-aided engineering in High Energy Physics

    International Nuclear Information System (INIS)

    Bachy, G.; Hauviller, C.; Messerli, R.; Mottier, M.

    1988-01-01

    Computing, a standard tool for a long time in the High Energy Physics community, is being slowly introduced at CERN in the mechanical engineering field. The first major application was structural analysis, followed by Computer-Aided Design (CAD). Development work is now progressing towards Computer-Aided Engineering around a powerful database. This paper gives examples of the power of this approach applied to engineering for accelerators and detectors

  9. Proceedings of the 2011 2nd International Congress on Computer Applications and Computational Science

    CERN Document Server

    Nguyen, Quang

    2012-01-01

    The latest inventions in computer technology influence most human daily activities. In the near future, there is a tendency that all aspects of human life will become dependent on computer applications. In manufacturing, robotics and automation have become vital for high quality products. In education, the model of teaching and learning is focusing more on electronic media than traditional ones. Issues related to energy savings and the environment are becoming critical.   Computational Science should enhance the quality of human life, not only solve its problems. Computational Science should help humans to make wise decisions by presenting choices and their possible consequences. Computational Science should help us make sense of observations, understand natural language, plan and reason with extensive background knowledge. Intelligence with wisdom is perhaps an ultimate goal for human-oriented science.   This book is a compilation of some recent research findings in computer application and computational sci...

  10. High Performance Multi-GPU SpMV for Multi-component PDE-Based Applications

    KAUST Repository

    Abdelfattah, Ahmad; Ltaief, Hatem; Keyes, David E.

    2015-01-01

    -block structure. While these optimizations are important for high performance dense kernel executions, they are even more critical when dealing with sparse linear algebra operations. The most time-consuming phase of many multicomponent applications, such as models

  11. Low cost high performance uncertainty quantification

    KAUST Repository

    Bekas, C.

    2009-01-01

    Uncertainty quantification in risk analysis has become a key application. In this context, computing the diagonal of inverse covariance matrices is of paramount importance. Standard techniques, that employ matrix factorizations, incur a cubic cost which quickly becomes intractable with the current explosion of data sizes. In this work we reduce this complexity to quadratic with the synergy of two algorithms that gracefully complement each other and lead to a radically different approach. First, we turned to stochastic estimation of the diagonal. This allowed us to cast the problem as a linear system with a relatively small number of multiple right hand sides. Second, for this linear system we developed a novel, mixed precision, iterative refinement scheme, which uses iterative solvers instead of matrix factorizations. We demonstrate that the new framework not only achieves the much needed quadratic cost but in addition offers excellent opportunities for scaling at massively parallel environments. We based our implementation on BLAS 3 kernels that ensure very high processor performance. We achieved a peak performance of 730 TFlops on 72 BG/P racks, with a sustained performance of 73% of the theoretical peak. We stress that the techniques presented in this work are quite general and applicable to several other important applications. Copyright © 2009 ACM.
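
    The stochastic estimation step can be sketched with a Hutchinson-type diagonal estimator in which each probe requires only an iterative linear solve. Plain conjugate gradients is used below as a simplified stand-in for the paper's mixed-precision iterative refinement scheme; matrix size and probe count are arbitrary.

```python
# Simplified sketch: estimate diag(A^{-1}) with Rademacher probe vectors,
# solving each system A x = v iteratively instead of factorizing A.
# (Not the paper's mixed-precision scheme.)
import numpy as np
from scipy.sparse.linalg import cg, aslinearoperator

def estimate_inverse_diagonal(A, n_probes=64, seed=0):
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    num = np.zeros(n)
    den = np.zeros(n)
    op = aslinearoperator(A)
    for _ in range(n_probes):
        v = rng.choice([-1.0, 1.0], size=n)      # Rademacher probe
        x, info = cg(op, v)                      # iterative solve of A x = v
        num += v * x
        den += v * v
    return num / den                             # elementwise estimate of diag(A^{-1})

# Small SPD test matrix so the estimate can be compared with the exact diagonal
n = 200
M = np.random.default_rng(1).normal(size=(n, n))
A = M @ M.T + n * np.eye(n)
est = estimate_inverse_diagonal(A, n_probes=200)
print(est[:4])
print(np.diag(np.linalg.inv(A))[:4])
```

    Each probe costs one solve with a matrix-vector-product-only method, which is what removes the cubic factorization cost the abstract refers to.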

  12. Analysis of scalability of high-performance 3D image processing platform for virtual colonoscopy.

    Science.gov (United States)

    Yoshida, Hiroyuki; Wu, Yin; Cai, Wenli

    2014-03-19

    One of the key challenges in three-dimensional (3D) medical imaging is to enable the fast turn-around time, which is often required for interactive or real-time response. This inevitably requires not only high computational power but also high memory bandwidth due to the massive amount of data that need to be processed. For this purpose, we previously developed a software platform for high-performance 3D medical image processing, called the HPC 3D-MIP platform, which employs increasingly available and affordable commodity computing systems such as multicore, cluster, and cloud computing systems. To achieve scalable high-performance computing, the platform employed size-adaptive, distributable block volumes as a core data structure for efficient parallelization of a wide range of 3D-MIP algorithms, supported task scheduling for efficient load distribution and balancing, and consisted of layered parallel software libraries that allow image processing applications to share common functionalities. We evaluated the performance of the HPC 3D-MIP platform by applying it to computationally intensive processes in virtual colonoscopy. Experimental results showed a 12-fold performance improvement on a workstation with 12-core CPUs over the original sequential implementation of the processes, indicating the efficiency of the platform. Analysis of performance scalability based on Amdahl's law for symmetric multicore chips showed the potential of a high performance scalability of the HPC 3D-MIP platform when a larger number of cores is available.
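
    The scalability analysis mentioned above rests on Amdahl's law. The sketch below gives the classic formula together with the Hill-Marty symmetric-multicore variant (the usual reading of "Amdahl's law for symmetric multicore chips", assuming perf(r) = sqrt(r) for a core built from r base-core resources); the parallel fraction is an arbitrary example value.

```python
# Classic Amdahl's law and the Hill-Marty symmetric-multicore variant.
import math

def amdahl(f, n):
    """Speedup with parallel fraction f on n cores."""
    return 1.0 / ((1.0 - f) + f / n)

def amdahl_symmetric(f, n_bce, r):
    """Hill-Marty symmetric-chip speedup: n_bce base-core equivalents grouped
    into cores of r BCEs each, with per-core performance perf(r) = sqrt(r)."""
    perf = math.sqrt(r)
    return 1.0 / ((1.0 - f) / perf + f * r / (perf * n_bce))

f = 0.95                                   # example parallel fraction
for cores in (2, 4, 8, 12, 16, 64):
    print(cores, round(amdahl(f, cores), 2), round(amdahl_symmetric(f, cores, 1), 2))
```

    With r = 1 the two formulas coincide, which serves as a quick consistency check of the sketch.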

  13. Complex three dimensional modelling of porous media using high performance computing and multi-scale incompressible approach

    Science.gov (United States)

    Martin, R.; Orgogozo, L.; Noiriel, C. N.; Guibert, R.; Golfier, F.; Debenest, G.; Quintard, M.

    2013-05-01

    In the context of biofilm growth in porous media, we developed high performance computing tools to study the impact of biofilms on fluid transport through the pores of a solid matrix. Biofilms are consortia of micro-organisms that develop within polymeric extracellular substances, generally located at fluid-solid interfaces such as pore interfaces in a water-saturated porous medium. Several applications of biofilms in porous media are encountered, for instance in bio-remediation methods, where they allow the dissolution of organic pollutants. Many theoretical studies have been carried out on the resulting effective properties of these modified media ([1],[2],[3]), but the bio-colonized porous media under consideration are mainly described by simplified model geometries (stratified media, cubic networks of spheres ...). Recent experimental advances, however, have provided tomography images of bio-colonized porous media which allow us to observe realistic biofilm micro-structures inside the porous media [4]. To solve the closure systems of equations related to upscaling procedures in realistic porous media, we solve the fluid velocity field through the pores on complex geometries described by a huge number of cells (up to billions). Calculations are made on a realistic 3D sample geometry obtained by X-ray micro-tomography. The cell volumes come from a percolation experiment performed to estimate the impact of precipitation processes on fluid transport properties in porous media [5]. Average permeabilities of the sample are obtained from the velocities by using MPI-based high performance computing on up to 1000 processors. The steady-state Stokes equations are solved using a finite volume approach. Relaxation preconditioning is introduced to further accelerate the code. Good weak and strong scaling is reached, with results obtained in hours instead of weeks. Acceleration factors of 20 up to 40 can be reached. Tens of geometries can now be

  14. Neuromorphic Computing, Architectures, Models, and Applications. A Beyond-CMOS Approach to Future Computing, June 29-July 1, 2016, Oak Ridge, TN

    Energy Technology Data Exchange (ETDEWEB)

    Potok, Thomas [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Schuman, Catherine [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Patton, Robert [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Hylton, Todd [Brain Corporation, San Diego, CA (United States); Li, Hai [Univ. of Pittsburgh, PA (United States); Pino, Robinson [US Dept. of Energy, Washington, DC (United States)

    2016-12-31

    The White House and Department of Energy have been instrumental in driving the development of a neuromorphic computing program to help the United States continue its lead in basic research into (1) Beyond Exascale—high performance computing beyond Moore’s Law and von Neumann architectures, (2) Scientific Discovery—new paradigms for understanding increasingly large and complex scientific data, and (3) Emerging Architectures—assessing the potential of neuromorphic and quantum architectures. Neuromorphic computing spans a broad range of scientific disciplines from materials science to devices, to computer science, to neuroscience, all of which are required to solve the neuromorphic computing grand challenge. In our workshop we focus on the computer science aspects, specifically from a neuromorphic device through an application. Neuromorphic devices present a very different paradigm to the computer science community from traditional von Neumann architectures, which raises six major questions about building a neuromorphic application from the device level. We used these fundamental questions to organize the workshop program and to direct the workshop panels and discussions. From the white papers, presentations, panels, and discussions, there emerged several recommendations on how to proceed.

  15. Applications of Computer Algebra Conference

    CERN Document Server

    Martínez-Moro, Edgar

    2017-01-01

    The Applications of Computer Algebra (ACA) conference covers a wide range of topics from Coding Theory to Differential Algebra to Quantum Computing, focusing on the interactions of these and other areas with the discipline of Computer Algebra. This volume provides the latest developments in the field as well as its applications in various domains, including communications, modelling, and theoretical physics. The book will appeal to researchers and professors of computer algebra, applied mathematics, and computer science, as well as to engineers and computer scientists engaged in research and development.

  16. Computer applications in radiation protection

    International Nuclear Information System (INIS)

    Cole, P.R.; Moores, B.M.

    1995-01-01

    Computer applications in general, and in diagnostic radiology in particular, are becoming more widespread. Their application to the field of radiation protection in medical imaging, including quality control initiatives, is similarly becoming more widespread. Advances in computer technology have enabled departments of diagnostic radiology to have access to powerful yet affordable personal computers. The application of databases, expert systems and computer-based learning is under way. The executive information systems for the management of dose and QA data that are under way at IRS are discussed. An important consideration in developing these pragmatic software tools has been the range of computer literacy within the end user group. User interfaces have been specifically designed to reflect the requirements of the many end users who will have little or no computer knowledge. (Author)

  17. Topic 14+16: High-performance and scientific applications and extreme-scale computing (Introduction)

    KAUST Repository

    Downes, Turlough P.; Roller, Sabine P.; Seitsonen, Ari Paavo; Valcke, Sophie; Keyes, David E.; Sawley, Marie Christine; Schulthess, Thomas C.; Shalf, John M.

    2013-01-01

    and algorithms to address the varied, complex and increasing challenges of modern research throughout both the "hard" and "soft" sciences. This necessitates being able to use large numbers of compute nodes, many of which are equipped with accelerators

  18. DEVELOPMENT OF NEW VALVE STEELS FOR APPLICATION IN HIGH PERFORMANCE ENGINES

    Directory of Open Access Journals (Sweden)

    Alexandre Bellegard Farina

    2013-12-01

    Full Text Available UNS N07751 and UNS N07080 alloys are commonly applied in the production of automotive valves for high performance internal combustion engines. These alloys present high mechanical strength at elevated temperature and high resistance to oxidation, corrosion and creep, together with good microstructural stability. However, these alloys present low wear resistance and high cost due to their high nickel contents. This work presents the development of two new Ni-based alloys for application in high performance automotive valves as an alternative to the alloys UNS N07751 and UNS N07080. The newly developed alloys are based on a high nickel-chromium austenitic matrix with a dispersion of γ’ and γ’’ phases and containing different NbC contents. Owing to their reduced nickel content compared with the alloys currently in use, the new alloys present an economic advantage as substitutes for the UNS N07751 and UNS N07080 alloys.

  19. Intel Xeon Phi coprocessor high performance programming

    CERN Document Server

    Jeffers, James

    2013-01-01

    Authors Jim Jeffers and James Reinders spent two years helping educate customers about the prototype and pre-production hardware before Intel introduced the first Intel Xeon Phi coprocessor. They have distilled their own experiences coupled with insights from many expert customers, Intel Field Engineers, Application Engineers and Technical Consulting Engineers, to create this authoritative first book on the essentials of programming for this new architecture and these new products. This book is useful even before you ever touch a system with an Intel Xeon Phi coprocessor. To ensure that your applications run at maximum efficiency, the authors emphasize key techniques for programming any modern parallel computing system whether based on Intel Xeon processors, Intel Xeon Phi coprocessors, or other high performance microprocessors. Applying these techniques will generally increase your program performance on any system, and better prepare you for Intel Xeon Phi coprocessors and the Intel MIC architecture. It off...

  20. Computational intelligence in digital forensics forensic investigation and applications

    CERN Document Server

    Choo, Yun-Huoy; Abraham, Ajith; Srihari, Sargur

    2014-01-01

    Computational Intelligence techniques have been widely explored in various domains, including forensics. Forensic analysis encompasses the study of patterns that answer questions of interest in security, medical, legal and genetic studies. However, forensic analysis is usually performed through experiments in the lab, which is expensive in both cost and time. This book therefore seeks to explore the progress and advancement of computational intelligence techniques in different focus areas of forensic studies, and aims to build a stronger connection between computer scientists and forensic field experts. This book, Computational Intelligence in Digital Forensics: Forensic Investigation and Applications, is the first volume in the Intelligent Systems Reference Library series. The book presents original research results and innovative applications of computational intelligence in digital forensics. This edited volume contains seventeen chapters and presents the latest state-of-the-art advancement ...

  1. A methodology for performing computer security reviews

    International Nuclear Information System (INIS)

    Hunteman, W.J.

    1991-01-01

    DOE Order 5637.1, ''Classified Computer Security,'' requires regular reviews of the computer security activities for an ADP system and for a site. Based on experiences gained in the Los Alamos computer security program through interactions with DOE facilities, we have developed a methodology to aid a site or security officer in performing a comprehensive computer security review. The methodology is designed to aid a reviewer in defining goals of the review (e.g., preparation for inspection), determining security requirements based on DOE policies, determining threats/vulnerabilities based on DOE and local threat guidance, and identifying critical system components to be reviewed. Application of the methodology will result in review procedures and checklists oriented to the review goals, the target system, and DOE policy requirements. The review methodology can be used to prepare for an audit or inspection and as a periodic self-check tool to determine the status of the computer security program for a site or specific ADP system. 1 tab

  2. A methodology for performing computer security reviews

    International Nuclear Information System (INIS)

    Hunteman, W.J.

    1991-01-01

    This paper reports on DOE Order 5637.1, Classified Computer Security, which requires regular reviews of the computer security activities for an ADP system and for a site. Based on experiences gained in the Los Alamos computer security program through interactions with DOE facilities, the authors have developed a methodology to aid a site or security officer in performing a comprehensive computer security review. The methodology is designed to aid a reviewer in defining goals of the review (e.g., preparation for inspection), determining security requirements based on DOE policies, determining threats/vulnerabilities based on DOE and local threat guidance, and identifying critical system components to be reviewed. Application of the methodology will result in review procedures and checklists oriented to the review goals, the target system, and DOE policy requirements. The review methodology can be used to prepare for an audit or inspection and as a periodic self-check tool to determine the status of the computer security program for a site or specific ADP system

  3. WImpiBLAST: web interface for mpiBLAST to help biologists perform large-scale annotation using high performance computing.

    Directory of Open Access Journals (Sweden)

    Parichit Sharma

    Full Text Available The function of a newly sequenced gene can be discovered by determining its sequence homology with known proteins. BLAST is the most extensively used sequence analysis program for sequence similarity search in large databases of sequences. With the advent of next generation sequencing technologies it has now become possible to study genes and their expression at a genome-wide scale through RNA-seq and metagenome sequencing experiments. Functional annotation of all the genes is done by sequence similarity search against multiple protein databases. This annotation task is computationally very intensive and can take days to obtain complete results. The program mpiBLAST, an open-source parallelization of BLAST that achieves superlinear speedup, can be used to accelerate large-scale annotation by using supercomputers and high performance computing (HPC) clusters. Although many parallel bioinformatics applications using the Message Passing Interface (MPI) are available in the public domain, researchers are reluctant to use them due to lack of expertise in the Linux command line and relevant programming experience. With these limitations, it becomes difficult for biologists to use mpiBLAST for accelerating annotation. No web interface is available in the open-source domain for mpiBLAST. We have developed WImpiBLAST, a user-friendly open-source web interface for parallel BLAST searches. It is implemented in Struts 1.3 using a Java backbone and runs atop the open-source Apache Tomcat Server. WImpiBLAST supports script creation and job submission features and also provides a robust job management interface for system administrators. It combines script creation and modification features with job monitoring and management through the Torque resource manager on a Linux-based HPC cluster. Use case information highlights the acceleration of annotation analysis achieved by using WImpiBLAST. Here, we describe the WImpiBLAST web interface features and architecture
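
    A minimal sketch of the plumbing such a web interface hides from the user is shown below: it writes a Torque/PBS batch script that launches an mpiBLAST run and submits it with qsub. The queue name, resource limits, file paths and the mpiblast options (written here in the legacy blastall style that mpiBLAST mirrors) are illustrative assumptions, not WImpiBLAST's actual implementation.

        import subprocess
        from pathlib import Path

        def submit_mpiblast(query, database, out, nodes=4, ppn=8, queue="batch"):
            """Write a PBS script for an mpiBLAST run and submit it via qsub.

            All resource values, paths and flags are illustrative placeholders.
            """
            script = Path("mpiblast_job.sh")
            script.write_text(f"""#!/bin/bash
        #PBS -N mpiblast_annotation
        #PBS -q {queue}
        #PBS -l nodes={nodes}:ppn={ppn}
        #PBS -l walltime=12:00:00
        cd $PBS_O_WORKDIR
        mpirun -np {nodes * ppn} mpiblast -p blastx -d {database} -i {query} -o {out}
        """)
            # qsub prints the job identifier on success
            job_id = subprocess.check_output(["qsub", str(script)], text=True).strip()
            return job_id

        if __name__ == "__main__":
            print(submit_mpiblast("contigs.fasta", "nr", "annotation.txt"))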

  4. WImpiBLAST: web interface for mpiBLAST to help biologists perform large-scale annotation using high performance computing.

    Science.gov (United States)

    Sharma, Parichit; Mantri, Shrikant S

    2014-01-01

    The function of a newly sequenced gene can be discovered by determining its sequence homology with known proteins. BLAST is the most extensively used sequence analysis program for sequence similarity search in large databases of sequences. With the advent of next generation sequencing technologies it has now become possible to study genes and their expression at a genome-wide scale through RNA-seq and metagenome sequencing experiments. Functional annotation of all the genes is done by sequence similarity search against multiple protein databases. This annotation task is computationally very intensive and can take days to obtain complete results. The program mpiBLAST, an open-source parallelization of BLAST that achieves superlinear speedup, can be used to accelerate large-scale annotation by using supercomputers and high performance computing (HPC) clusters. Although many parallel bioinformatics applications using the Message Passing Interface (MPI) are available in the public domain, researchers are reluctant to use them due to lack of expertise in the Linux command line and relevant programming experience. With these limitations, it becomes difficult for biologists to use mpiBLAST for accelerating annotation. No web interface is available in the open-source domain for mpiBLAST. We have developed WImpiBLAST, a user-friendly open-source web interface for parallel BLAST searches. It is implemented in Struts 1.3 using a Java backbone and runs atop the open-source Apache Tomcat Server. WImpiBLAST supports script creation and job submission features and also provides a robust job management interface for system administrators. It combines script creation and modification features with job monitoring and management through the Torque resource manager on a Linux-based HPC cluster. Use case information highlights the acceleration of annotation analysis achieved by using WImpiBLAST. Here, we describe the WImpiBLAST web interface features and architecture, explain design

  5. Computational Approach for Securing Radiology-Diagnostic Data in Connected Health Network using High-Performance GPU-Accelerated AES.

    Science.gov (United States)

    Adeshina, A M; Hashim, R

    2017-03-01

    Diagnostic radiology is a core and integral part of modern medicine, paving the way for primary care physicians in disease diagnosis, treatment and therapy management. All recent standard healthcare procedures have benefitted immensely from contemporary information technology, which has transformed the approaches to acquiring, storing and sharing diagnostic data for efficient and timely diagnosis of diseases. The connected health network was introduced as an alternative to the ageing traditional concept of the healthcare system, improving hospital-physician connectivity and clinical collaboration. Undoubtedly, the modern medical approach has drastically improved healthcare, but at the expense of high computational cost and possible breaches of diagnostic privacy. Consequently, a number of cryptographic techniques have recently been applied to clinical applications, but the challenge of successfully encrypting both image and textual data persists. Furthermore, reducing the encryption-decryption time of medical datasets at a considerably lower computational cost, without jeopardizing the required security strength of the encryption algorithm, remains an outstanding issue. This study proposes a secured radiology-diagnostic data framework for a connected health network using high-performance GPU-accelerated Advanced Encryption Standard. The study was evaluated with radiology image datasets consisting of brain MR and CT datasets obtained from the Department of Surgery, University of North Carolina, USA, and the Swedish National Infrastructure for Computing. Sample patients' notes from the University of North Carolina School of Medicine at Chapel Hill were also used to evaluate the framework for its strength in encrypting-decrypting textual data in the form of medical reports. Significantly, the framework is not only able to accurately encrypt and decrypt medical image datasets, but it also
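
    The framework itself relies on GPU acceleration; as a hedged, CPU-based stand-in for the encryption step, the sketch below encrypts and decrypts an arbitrary byte stream (for example a radiology image file read from disk) with AES in GCM mode using the pycryptodome package. The file name, key size and key handling are placeholders, not the paper's scheme.

        from Crypto.Cipher import AES
        from Crypto.Random import get_random_bytes

        def encrypt_bytes(plaintext: bytes, key: bytes):
            # GCM provides both confidentiality and an integrity tag.
            cipher = AES.new(key, AES.MODE_GCM)
            ciphertext, tag = cipher.encrypt_and_digest(plaintext)
            return cipher.nonce, ciphertext, tag

        def decrypt_bytes(nonce: bytes, ciphertext: bytes, tag: bytes, key: bytes) -> bytes:
            cipher = AES.new(key, AES.MODE_GCM, nonce=nonce)
            # Raises ValueError if the data or tag was tampered with.
            return cipher.decrypt_and_verify(ciphertext, tag)

        if __name__ == "__main__":
            key = get_random_bytes(32)                 # AES-256 key (placeholder key handling)
            data = open("slice_001.dcm", "rb").read()  # hypothetical radiology image file
            nonce, ct, tag = encrypt_bytes(data, key)
            assert decrypt_bytes(nonce, ct, tag, key) == data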

  6. High-Performance Secure Database Access Technologies for HEP Grids

    Energy Technology Data Exchange (ETDEWEB)

    Matthew Vranicar; John Weicher

    2006-04-17

    The Large Hadron Collider (LHC) at the CERN Laboratory will become the largest scientific instrument in the world when it starts operations in 2007. Large Scale Analysis Computer Systems (computational grids) are required to extract rare signals of new physics from petabytes of LHC detector data. In addition to file-based event data, LHC data processing applications require access to large amounts of data in relational databases: detector conditions, calibrations, etc. U.S. high energy physicists demand efficient performance of grid computing applications in LHC physics research where world-wide remote participation is vital to their success. To empower physicists with data-intensive analysis capabilities a whole hyperinfrastructure of distributed databases cross-cuts a multi-tier hierarchy of computational grids. The crosscutting allows separation of concerns across both the global environment of a federation of computational grids and the local environment of a physicist’s computer used for analysis. Very few efforts are on-going in the area of database and grid integration research. Most of these are outside of the U.S. and rely on traditional approaches to secure database access via an extraneous security layer separate from the database system core, preventing efficient data transfers. Our findings are shared by the Database Access and Integration Services Working Group of the Global Grid Forum, who states that "Research and development activities relating to the Grid have generally focused on applications where data is stored in files. However, in many scientific and commercial domains, database management systems have a central role in data storage, access, organization, authorization, etc, for numerous applications.” There is a clear opportunity for a technological breakthrough, requiring innovative steps to provide high-performance secure database access technologies for grid computing. We believe that an innovative database architecture where the

  7. High-Performance Secure Database Access Technologies for HEP Grids

    International Nuclear Information System (INIS)

    Vranicar, Matthew; Weicher, John

    2006-01-01

    The Large Hadron Collider (LHC) at the CERN Laboratory will become the largest scientific instrument in the world when it starts operations in 2007. Large Scale Analysis Computer Systems (computational grids) are required to extract rare signals of new physics from petabytes of LHC detector data. In addition to file-based event data, LHC data processing applications require access to large amounts of data in relational databases: detector conditions, calibrations, etc. U.S. high energy physicists demand efficient performance of grid computing applications in LHC physics research where world-wide remote participation is vital to their success. To empower physicists with data-intensive analysis capabilities a whole hyperinfrastructure of distributed databases cross-cuts a multi-tier hierarchy of computational grids. The crosscutting allows separation of concerns across both the global environment of a federation of computational grids and the local environment of a physicist's computer used for analysis. Very few efforts are on-going in the area of database and grid integration research. Most of these are outside of the U.S. and rely on traditional approaches to secure database access via an extraneous security layer separate from the database system core, preventing efficient data transfers. Our findings are shared by the Database Access and Integration Services Working Group of the Global Grid Forum, who states that 'Research and development activities relating to the Grid have generally focused on applications where data is stored in files. However, in many scientific and commercial domains, database management systems have a central role in data storage, access, organization, authorization, etc, for numerous applications'. There is a clear opportunity for a technological breakthrough, requiring innovative steps to provide high-performance secure database access technologies for grid computing. We believe that an innovative database architecture where the secure

  8. DOE High Performance Computing Operational Review (HPCOR): Enabling Data-Driven Scientific Discovery at HPC Facilities

    Energy Technology Data Exchange (ETDEWEB)

    Gerber, Richard; Allcock, William; Beggio, Chris; Campbell, Stuart; Cherry, Andrew; Cholia, Shreyas; Dart, Eli; England, Clay; Fahey, Tim; Foertter, Fernanda; Goldstone, Robin; Hick, Jason; Karelitz, David; Kelly, Kaki; Monroe, Laura; Prabhat,; Skinner, David; White, Julia

    2014-10-17

    U.S. Department of Energy (DOE) High Performance Computing (HPC) facilities are on the verge of a paradigm shift in the way they deliver systems and services to science and engineering teams. Research projects are producing a wide variety of data at unprecedented scale and level of complexity, with community-specific services that are part of the data collection and analysis workflow. On June 18-19, 2014 representatives from six DOE HPC centers met in Oakland, CA at the DOE High Performance Computing Operational Review (HPCOR) to discuss how they can best provide facilities and services to enable large-scale data-driven scientific discovery at the DOE national laboratories. The report contains findings from that review.

  9. Security in Computer Applications

    CERN Multimedia

    CERN. Geneva

    2004-01-01

    Computer security has been an increasing concern for IT professionals for a number of years, yet despite all the efforts, computer systems and networks remain highly vulnerable to attacks of different kinds. Design flaws and security bugs in the underlying software are among the main reasons for this. This lecture addresses the following question: how to create secure software? The lecture starts with a definition of computer security and an explanation of why it is so difficult to achieve. It then introduces the main security principles (like least-privilege, or defense-in-depth) and discusses security in different phases of the software development cycle. The emphasis is put on the implementation part: most common pitfalls and security bugs are listed, followed by advice on best practice for security development. The last part of the lecture covers some miscellaneous issues like the use of cryptography, rules for networking applications, and social engineering threats. This lecture was first given on Thursd...

  10. JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing.

    Science.gov (United States)

    Brown, David K; Penkler, David L; Musyoka, Thommas M; Bishop, Özlem Tastan

    2015-01-01

    Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS.

  11. JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing.

    Directory of Open Access Journals (Sweden)

    David K Brown

    Full Text Available Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS.

  12. JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing

    Science.gov (United States)

    Brown, David K.; Penkler, David L.; Musyoka, Thommas M.; Bishop, Özlem Tastan

    2015-01-01

    Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS. PMID:26280450

  13. Coordinated Fault-Tolerance for High-Performance Computing Final Project Report

    Energy Technology Data Exchange (ETDEWEB)

    Panda, Dhabaleswar Kumar [The Ohio State University; Beckman, Pete

    2011-07-28

    existing publish-subscribe tools. We enhanced the intrinsic fault tolerance capabilities of representative implementations of a variety of key HPC software subsystems and integrated them with the FTB. Targeted software subsystems included: MPI communication libraries, checkpoint/restart libraries, resource managers and job schedulers, and system monitoring tools. Leveraging the aforementioned infrastructure, as well as developing and utilizing additional tools, we have examined issues associated with expanded, end-to-end fault response from both system and application viewpoints. From the standpoint of system operations, we have investigated log and root cause analysis, anomaly detection and fault prediction, and generalized notification mechanisms. Our applications work has included libraries for fault-tolerant linear algebra, application frameworks for coupled multiphysics applications, and external frameworks to support the monitoring and response for general applications. Our final goal was to engage the high-end computing community to increase awareness of tools and issues around coordinated end-to-end fault management.

  14. Polymorphous computing fabric

    Science.gov (United States)

    Wolinski, Christophe Czeslaw [Los Alamos, NM; Gokhale, Maya B [Los Alamos, NM; McCabe, Kevin Peter [Los Alamos, NM

    2011-01-18

    Fabric-based computing systems and methods are disclosed. A fabric-based computing system can include a polymorphous computing fabric that can be customized on a per application basis and a host processor in communication with said polymorphous computing fabric. The polymorphous computing fabric includes a cellular architecture that can be highly parameterized to enable a customized synthesis of fabric instances for a variety of enhanced application performances thereof. A global memory concept can also be included that provides the host processor random access to all variables and instructions associated with the polymorphous computing fabric.

  15. MUMAX: A new high-performance micromagnetic simulation tool

    International Nuclear Information System (INIS)

    Vansteenkiste, A.; Van de Wiele, B.

    2011-01-01

    We present MUMAX, a general-purpose micromagnetic simulation tool running on graphical processing units (GPUs). MUMAX is designed for high-performance computations and specifically targets large simulations. In that case, speedups of over a factor of 100 can be obtained compared to the CPU-based OOMMF program developed at NIST. MUMAX aims to be general and broadly applicable. It solves the classical Landau-Lifshitz equation taking into account the magnetostatic, exchange and anisotropy interactions, thermal effects and spin-transfer torque. Periodic boundary conditions can optionally be imposed. A spatial discretization using finite differences in two or three dimensions can be employed. MUMAX is publicly available as open-source software. It can thus be freely used and extended by the community. Due to its high computational performance, MUMAX should open up the possibility of running extensive simulations that would be nearly inaccessible with typical CPU-based simulators. - Highlights: → Novel, open-source micromagnetic simulator on GPU hardware. → Speedup of ≈100x compared to other widely used tools. → Extensively validated against standard problems. → Makes previously infeasible simulations accessible.
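
    For reference, the equation named above is commonly quoted in the Landau-Lifshitz form below, with phenomenological damping constant λ, gyromagnetic ratio γ and saturation magnetisation M_s; the interaction terms listed in the abstract enter through the effective field H_eff, with spin-transfer torque usually added as a separate term. This is the standard textbook form, quoted here for context rather than extracted from the article:

        \[
        \frac{\partial \mathbf{M}}{\partial t}
          = -\gamma\, \mathbf{M} \times \mathbf{H}_{\mathrm{eff}}
            - \frac{\lambda}{M_s}\, \mathbf{M} \times \left( \mathbf{M} \times \mathbf{H}_{\mathrm{eff}} \right)
        \]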

  16. An Evaluation of Windows-Based Computer Forensics Application Software Running on a Macintosh

    Directory of Open Access Journals (Sweden)

    Gregory H. Carlton

    2008-09-01

    Full Text Available The two most common computer forensics applications run exclusively on Microsoft Windows operating systems, yet contemporary computer forensics examinations frequently encounter one or more of the three most common operating system environments, namely Windows, OS-X, or some form of UNIX or Linux. Additionally, government and private computer forensics laboratories frequently encounter budget constraints that limit their access to computer hardware. Currently, Macintosh computer systems are marketed with the ability to accommodate these three common operating system environments, including Windows XP in native and virtual environments. We performed a series of experiments to measure the functionality and performance of the two most commonly used Windows-based computer forensics applications on a Macintosh running Windows XP in native mode and in two virtual environments relative to a similarly configured Dell personal computer. The research results are directly beneficial to practitioners, and the process illustrates effective pedagogy whereby students were engaged in applied research.

  17. Advanced in Computer Science and its Applications

    CERN Document Server

    Yen, Neil; Park, James; CSA 2013

    2014-01-01

    The theme of CSA focuses on the various aspects of computer science and its applications, and the conference provides an opportunity for academic and industry professionals to discuss the latest issues and progress in the area of computer science and its applications. This book therefore includes various theories and practical applications in computer science and its applications.

  18. SAME4HPC: A Promising Approach in Building a Scalable and Mobile Environment for High-Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Karthik, Rajasekar [ORNL

    2014-01-01

    In this paper, an architecture for building a Scalable And Mobile Environment For High-Performance Computing with spatial capabilities, called SAME4HPC, is described using cutting-edge technologies and standards such as Node.js, HTML5, ECMAScript 6, and PostgreSQL 9.4. Mobile devices are increasingly becoming powerful enough to run high-performance apps. At the same time, there exists a significant number of low-end and older devices that rely heavily on the server or the cloud infrastructure to do the heavy lifting. Our architecture aims to support both types of devices to provide high performance and a rich user experience. A cloud infrastructure consisting of OpenStack with Ubuntu, GeoServer, and high-performance JavaScript frameworks is among the key open-source and industry-standard technologies adopted in this architecture.

  19. High-Performance MIM Capacitors for a Secondary Power Supply Application

    Directory of Open Access Journals (Sweden)

    Jiliang Mu

    2018-02-01

    Full Text Available Microstructure is important to the development of energy devices with high performance. In this work, a three-dimensional Si-based metal-insulator-metal (MIM) capacitor has been reported, which is fabricated by microelectromechanical systems (MEMS) technology. Area enlargement is achieved by forming deep trenches in a silicon substrate using the deep reactive ion etching method. The results indicate that an area of 2.45 × 10³ mm² can be realized in the deep trench structure with a high aspect ratio of 30:1. Subsequently, a dielectric Al₂O₃ layer and electrode W/TiN layers are deposited by atomic layer deposition. The obtained capacitor has superior performance, such as a high breakdown voltage (34.1 V), a moderate energy density (≥1.23 mJ/cm² per unit planar area), a high breakdown electric field (6.1 ± 0.1 MV/cm), a low leakage current (10⁻⁷ A/cm² at 22.5 V), and a low quadratic voltage coefficient of capacitance (VCC) (≤63.1 ppm/V²). In addition, the device’s performance has been theoretically examined. The results show that the high energy supply and small leakage current can be attributed to the Poole–Frenkel emission in the high-field region and the trap-assisted tunneling in the low-field region. The reported capacitor has potential application as a secondary power supply.
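
    The Poole-Frenkel mechanism invoked for the high-field region is conventionally written as the field-assisted thermal emission current density below, with q the elementary charge, φ_B the trap barrier height, ε the high-frequency permittivity of the dielectric, and C a material-dependent prefactor; this is the standard textbook expression, quoted for context rather than taken from the paper:

        \[
        J_{\mathrm{PF}} = C\, E \exp\!\left[ -\frac{q\left(\phi_B - \sqrt{qE/(\pi\varepsilon)}\right)}{k_B T} \right]
        \]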

  20. Enabling Efficient Climate Science Workflows in High Performance Computing Environments

    Science.gov (United States)

    Krishnan, H.; Byna, S.; Wehner, M. F.; Gu, J.; O'Brien, T. A.; Loring, B.; Stone, D. A.; Collins, W.; Prabhat, M.; Liu, Y.; Johnson, J. N.; Paciorek, C. J.

    2015-12-01

    A typical climate science workflow often involves a combination of acquisition of data, modeling, simulation, analysis, visualization, publishing, and storage of results. Each of these tasks provides a myriad of challenges when running on a high performance computing environment such as Hopper or Edison at NERSC. Hurdles such as data transfer and management, job scheduling, parallel analysis routines, and publication require a lot of forethought and planning to ensure that proper quality control mechanisms are in place. These steps require effectively utilizing a combination of well tested and newly developed functionality to move data, perform analysis, apply statistical routines, and finally, serve results and tools to the greater scientific community. As part of the CAlibrated and Systematic Characterization, Attribution and Detection of Extremes (CASCADE) project, we highlight a stack of tools that our team utilizes and has developed to make large-scale simulation and analysis work commonplace. These tools assist in everything from the generation and procurement of data (HTAR/Globus) to the automated publication of results to portals like the Earth System Grid Federation (ESGF), while executing everything in between in a scalable, task-parallel way (MPI). We highlight the use and benefit of these tools by showing several climate science analysis use cases they have been applied to.

  1. Reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application

    Science.gov (United States)

    Archer, Charles J [Rochester, MN; Blocksome, Michael A [Rochester, MN; Peters, Amanda E [Cambridge, MA; Ratterman, Joseph D [Rochester, MN; Smith, Brian E [Rochester, MN

    2012-04-17

    Methods, apparatus, and products are disclosed for reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application that include: beginning, by each compute node, performance of a blocking operation specified by the parallel application, each compute node beginning the blocking operation asynchronously with respect to the other compute nodes; reducing, for each compute node, power to one or more hardware components of that compute node in response to that compute node beginning the performance of the blocking operation; and restoring, for each compute node, the power to the hardware components having power reduced in response to all of the compute nodes beginning the performance of the blocking operation.
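
    A greatly simplified, software-only analogue of the claimed method is sketched below with mpi4py (an assumption; the patent itself targets hardware power control): each rank begins the blocking operation asynchronously as a non-blocking barrier, then polls it at a low duty cycle so the core can idle instead of spinning, and resumes full-speed work once every rank has arrived.

        import time
        from mpi4py import MPI

        comm = MPI.COMM_WORLD

        def low_power_barrier(poll_interval=0.01):
            """Enter a barrier without busy-waiting: sleep between completion tests."""
            request = comm.Ibarrier()      # begin the blocking operation asynchronously
            while not request.Test():      # True once all ranks have begun the operation
                time.sleep(poll_interval)  # idle the core instead of spinning

        if __name__ == "__main__":
            # Hypothetical unbalanced workload: ranks arrive at the barrier at
            # different times, which is when power could be reduced.
            time.sleep(0.1 * comm.Get_rank())
            low_power_barrier()
            if comm.Get_rank() == 0:
                print("all ranks synchronized")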

  2. Reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application

    Science.gov (United States)

    Archer, Charles J [Rochester, MN; Blocksome, Michael A [Rochester, MN; Peters, Amanda A [Rochester, MN; Ratterman, Joseph D [Rochester, MN; Smith, Brian E [Rochester, MN

    2012-01-10

    Methods, apparatus, and products are disclosed for reducing power consumption while synchronizing a plurality of compute nodes during execution of a parallel application that include: beginning, by each compute node, performance of a blocking operation specified by the parallel application, each compute node beginning the blocking operation asynchronously with respect to the other compute nodes; reducing, for each compute node, power to one or more hardware components of that compute node in response to that compute node beginning the performance of the blocking operation; and restoring, for each compute node, the power to the hardware components having power reduced in response to all of the compute nodes beginning the performance of the blocking operation.

  3. Soft Computing Applications : Proceedings of the 5th International Workshop Soft Computing Applications

    CERN Document Server

    Fodor, János; Várkonyi-Kóczy, Annamária; Dombi, Joszef; Jain, Lakhmi

    2013-01-01

    This volume contains the Proceedings of the 5th International Workshop on Soft Computing Applications (SOFA 2012). The book covers a broad spectrum of soft computing techniques, with theoretical and practical applications employing knowledge and intelligence to find solutions for real-world industrial, economic and medical problems. The combination of such intelligent systems tools and a large number of applications introduces a need for a synergy of scientific and technological disciplines in order to show the great potential of Soft Computing in all domains. The conference papers included in these proceedings, published post conference, were grouped into the following areas of research: Soft Computing and Fusion Algorithms in Biometrics; Fuzzy Theory, Control and Applications; Modelling and Control Applications; Steps towa...

  4. PERC 2 High-End Computer System Performance: Scalable Science and Engineering

    Energy Technology Data Exchange (ETDEWEB)

    Daniel Reed

    2006-10-15

    During two years of SciDAC PERC-2, our activities had centered largely on development of new performance analysis techniques to enable efficient use on systems containing thousands or tens of thousands of processors. In addition, we continued our application engagement efforts and utilized our tools to study the performance of various SciDAC applications on a variety of HPC platforms.

  5. Computational materials design for energy applications

    Science.gov (United States)

    Ozolins, Vidvuds

    2013-03-01

    General adoption of sustainable energy technologies depends on the discovery and development of new high-performance materials. For instance, waste heat recovery and electricity generation via the solar thermal route require bulk thermoelectrics with a high figure of merit (ZT) and thermal stability at high-temperatures. Energy recovery applications (e.g., regenerative braking) call for the development of rapidly chargeable systems for electrical energy storage, such as electrochemical supercapacitors. Similarly, use of hydrogen as vehicular fuel depends on the ability to store hydrogen at high volumetric and gravimetric densities, as well as on the ability to extract it at ambient temperatures at sufficiently rapid rates. We will discuss how first-principles computational methods based on quantum mechanics and statistical physics can drive the understanding, improvement and prediction of new energy materials. We will cover prediction and experimental verification of new earth-abundant thermoelectrics, transition metal oxides for electrochemical supercapacitors, and kinetics of mass transport in complex metal hydrides. Research has been supported by the US Department of Energy under grant Nos. DE-SC0001342, DE-SC0001054, DE-FG02-07ER46433, and DE-FC36-08GO18136.

  6. THC-MP: High performance numerical simulation of reactive transport and multiphase flow in porous media

    Science.gov (United States)

    Wei, Xiaohui; Li, Weishan; Tian, Hailong; Li, Hongliang; Xu, Haixiao; Xu, Tianfu

    2015-07-01

    The numerical simulation of multiphase flow and reactive transport in porous media for complex subsurface problems is a computationally intensive application. To meet the increasing computational requirements, this paper presents a parallel computing method and architecture. Starting from TOUGHREACT, a well-established code for simulating subsurface multi-phase flow and reactive transport problems, we developed THC-MP, a high performance computing code for massively parallel computers that greatly extends the computational capability of the original code. The domain decomposition method was applied to the coupled numerical computing procedure in THC-MP. We designed the distributed data structure, implemented the data initialization and exchange between the computing nodes, and implemented the core solving module using a hybrid parallel iterative and direct solver. Numerical accuracy of THC-MP was verified through a CO2 injection-induced reactive transport problem by comparing the results obtained from parallel computing and sequential computing (the original code). Execution efficiency and code scalability were examined through field-scale carbon sequestration applications on a multicore cluster. The results demonstrate the enhanced performance of THC-MP on parallel computing facilities.
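
    As a hedged, minimal illustration of the inter-node data exchange that such a domain decomposition requires, the sketch below splits a 1-D field across MPI ranks and swaps one layer of ghost cells with each neighbour using mpi4py; it is generic halo-exchange boilerplate, not THC-MP's actual distributed data structure.

        import numpy as np
        from mpi4py import MPI

        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()

        # Local chunk of a 1-D field plus one ghost cell on each side.
        n_local = 100
        field = np.full(n_local + 2, float(rank))

        left = rank - 1 if rank > 0 else MPI.PROC_NULL
        right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

        # Exchange ghost layers: send my boundary interior cells, receive neighbours'.
        comm.Sendrecv(sendbuf=field[1:2], dest=left, recvbuf=field[-1:], source=right)
        comm.Sendrecv(sendbuf=field[-2:-1], dest=right, recvbuf=field[0:1], source=left)

        print(f"rank {rank}: ghost cells = ({field[0]}, {field[-1]})")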

  7. A Perspective on Computational Human Performance Models as Design Tools

    Science.gov (United States)

    Jones, Patricia M.

    2010-01-01

    The design of interactive systems, including levels of automation, displays, and controls, is usually based on design guidelines and iterative empirical prototyping. A complementary approach is to use computational human performance models to evaluate designs. An integrated strategy of model-based and empirical test and evaluation activities is particularly attractive as a methodology for verification and validation of human-rated systems for commercial space. This talk will review several computational human performance modeling approaches and their applicability to design of display and control requirements.

  8. Computer application in scientific investigations

    International Nuclear Information System (INIS)

    Govorun, N.N.

    1981-01-01

    A short review of computer development, application and software at JINR over the last 15 years is presented. The main trends of studies on computer application in experimental and theoretical investigations are enumerated: software for computers and their systems, software for data processing systems, the design of automatic and automated systems for measuring track detector images, the development of techniques for carrying out experiments on-line with computers, and packages of applied computer codes and specialized systems. The development of the on-line technique is successfully used in investigations of nuclear processes at relativistic energies. A new trend is the development of television methods of data output and its computer recording [ru

  9. Scientific Applications Performance Evaluation on Burst Buffer

    KAUST Repository

    Markomanolis, George S.

    2017-10-19

    Parallel I/O is an integral component of modern high performance computing, especially for storing and processing very large datasets, as in seismic imaging, CFD, combustion and weather modeling. The storage hierarchy nowadays includes additional layers, the latest being the use of SSD-based storage as a Burst Buffer for I/O acceleration. We present an in-depth analysis of how to use the Burst Buffer for specific cases and of how the internal MPI I/O aggregators operate according to the options the user provides at job submission. We analyze the performance of a range of I/O-intensive scientific applications, at various scales, on a large installation of the Lustre parallel file system compared to an SSD-based Burst Buffer. Our results show a performance improvement over Lustre when using the Burst Buffer. Moreover, we show results from a data hierarchy library which indicate that the standard I/O approaches are not enough to get the expected performance from this technology. The performance gain on the total execution time of the studied applications is between 1.16 and 3 times compared to Lustre. One of the test cases achieved an impressive I/O throughput of 900 GB/s on the Burst Buffer.
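
    One of the knobs discussed above, the behaviour of the MPI-IO aggregators during collective writes, is steered through ROMIO hints passed at file-open time. The mpi4py sketch below shows the mechanism; the hint names are standard ROMIO ones, but the values and the file path are placeholders rather than the settings used in the study.

        import numpy as np
        from mpi4py import MPI

        comm = MPI.COMM_WORLD

        # ROMIO hints steering collective buffering (aggregator count and behaviour).
        info = MPI.Info.Create()
        info.Set("romio_cb_write", "enable")   # force collective buffering on writes
        info.Set("cb_nodes", "8")              # number of aggregator nodes (placeholder value)

        fh = MPI.File.Open(comm, "output.dat",
                           MPI.MODE_CREATE | MPI.MODE_WRONLY, info)

        # Each rank writes its own contiguous block at a rank-dependent offset.
        block = np.full(1024, comm.Get_rank(), dtype=np.float64)
        offset = comm.Get_rank() * block.nbytes
        fh.Write_at_all(offset, block)         # collective write routed through the aggregators
        fh.Close()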

  10. Tackling some of the most intricate geophysical challenges via high-performance computing

    Science.gov (United States)

    Khosronejad, A.

    2016-12-01

    Recently, the world has been witnessing significant enhancements in the computing power of supercomputers. Computer clusters, in conjunction with advanced mathematical algorithms, have set the stage for developing and applying powerful numerical tools to tackle some of the most intricate geophysical challenges that today's engineers face. One such challenge is to understand how turbulent flows, in real-world settings, interact with (a) the rigid and/or mobile complex bed bathymetry of waterways and sea-beds in coastal areas; (b) objects with complex geometry that are fully or partially immersed; and (c) the free surface of waterways and water surface waves in the coastal area. This understanding is especially important because turbulent flows in real-world environments are often bounded by geometrically complex boundaries, which dynamically deform and give rise to multi-scale and multi-physics transport phenomena, and are characterized by multi-lateral interactions among various phases (e.g. air/water/sediment phases). Herein, I present some of the multi-scale and multi-physics geophysical fluid mechanics processes that I have attempted to study using an in-house high-performance computational model, the so-called VFS-Geophysics. More specifically, I will present simulation results of turbulence/sediment/solute/turbine interactions in real-world settings. Parts of the simulations I present are performed to gain scientific insights into processes such as sand wave formation (A. Khosronejad and F. Sotiropoulos (2014), Numerical simulation of sand waves in a turbulent open channel flow, Journal of Fluid Mechanics, 753:150-216), while others are carried out to predict the effects of climate change and large flood events on societal infrastructure (A. Khosronejad et al. (2016), Large eddy simulation of turbulence and solute transport in a forested headwater stream, Journal of Geophysical Research, doi: 10.1002/2014JF003423).

  11. Computational fluid dynamics (CFD) assisted performance evaluation of the Twincer (TM) disposable high-dose dry powder inhaler

    NARCIS (Netherlands)

    de Boer, Anne H.; Hagedoorn, Paul; Woolhouse, Robert; Wynn, Ed

    Objectives To use computational fluid dynamics (CFD) for evaluating and understanding the performance of the high-dose disposable Twincer (TM) dry powder inhaler, as well as to learn the effect of design modifications on dose entrainment, powder dispersion and retention behaviour. Methods Comparison

  12. Big Data and High-Performance Computing in Global Seismology

    Science.gov (United States)

    Bozdag, Ebru; Lefebvre, Matthieu; Lei, Wenjie; Peter, Daniel; Smith, James; Komatitsch, Dimitri; Tromp, Jeroen

    2014-05-01

    Much of our knowledge of Earth's interior is based on seismic observations and measurements. Adjoint methods provide an efficient way of incorporating 3D full wave propagation in iterative seismic inversions to enhance tomographic images and thus our understanding of processes taking place inside the Earth. Our aim is to take adjoint tomography, which has been successfully applied to regional and continental scale problems, further to image the entire planet. This is one of the extreme imaging challenges in seismology, mainly due to the intense computational requirements and the vast amount of high-quality seismic data that can potentially be assimilated. We have started low-resolution inversions (T > 30 s and T > 60 s for body and surface waves, respectively) with a limited data set (253 carefully selected earthquakes and seismic data from permanent and temporary networks) on Oak Ridge National Laboratory's Cray XK7 "Titan" system. Recent improvements in our 3D global wave propagation solvers, such as a GPU version of the SPECFEM3D_GLOBE package, will enable us to perform higher-resolution (T > 9 s) and longer-duration (~180 m) simulations to take advantage of high-frequency body waves and major-arc surface waves, thereby improving the imbalanced ray coverage resulting from the uneven global distribution of sources and receivers. Our ultimate goal is to use all earthquakes in the global CMT catalogue within the magnitude range of our interest and data from all available seismic networks. To take full advantage of computational resources, we need a solid framework to manage big data sets during numerical simulations, pre-processing (i.e., data requests and quality checks, processing data, window selection, etc.) and post-processing (i.e., pre-conditioning and smoothing kernels, etc.). We address the bottlenecks in our global seismic workflow, which mainly come from heavy I/O traffic during simulations and the pre- and post-processing stages, by defining new data

  13. Application of a personal computer in a high energy physics experiment

    International Nuclear Information System (INIS)

    Petta, P.

    1987-04-01

    UA1 is a detector at the CERN Super Proton Synchrotron collider; MacVEE (Microcomputer applied to the Control of VME Electronic Equipment) is a software development system for the data readout system and for the implementation of the user interface of the experiment control. A commercial personal computer is used. Examples of applications are the Data Acquisition Console, the Scanner Desc equipment and the AMERICA Ram Disks codes. Further topics are the MacUA1 development system for M68K-VME codes and an outline of the future MacVEE System Supervisor. 23 refs., 10 figs., 3 tabs. (qui)

  14. Cloud Computing and Its Applications in GIS

    Science.gov (United States)

    Kang, Cao

    2011-12-01

    Cloud computing is a novel computing paradigm that offers highly scalable and highly available distributed computing services. The objectives of this research are to: 1. analyze and understand cloud computing and its potential for GIS; 2. discover the feasibilities of migrating truly spatial GIS algorithms to distributed computing infrastructures; 3. explore a solution to host and serve large volumes of raster GIS data efficiently and speedily. These objectives thus form the basis for three professional articles. The first article is entitled "Cloud Computing and Its Applications in GIS". This paper introduces the concept, structure, and features of cloud computing. Features of cloud computing such as scalability, parallelization, and high availability make it a very capable computing paradigm. Unlike High Performance Computing (HPC), cloud computing uses inexpensive commodity computers. The uniform administration systems in cloud computing make it easier to use than GRID computing. Potential advantages of cloud-based GIS systems such as lower barrier to entry are consequently presented. Three cloud-based GIS system architectures are proposed: public cloud- based GIS systems, private cloud-based GIS systems and hybrid cloud-based GIS systems. Public cloud-based GIS systems provide the lowest entry barriers for users among these three architectures, but their advantages are offset by data security and privacy related issues. Private cloud-based GIS systems provide the best data protection, though they have the highest entry barriers. Hybrid cloud-based GIS systems provide a compromise between these extremes. The second article is entitled "A cloud computing algorithm for the calculation of Euclidian distance for raster GIS". Euclidean distance is a truly spatial GIS algorithm. Classical algorithms such as the pushbroom and growth ring techniques require computational propagation through the entire raster image, which makes it incompatible with the distributed nature
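
    For context on the second article, the single-machine form of the Euclidean distance computation it parallelizes is essentially a one-liner with SciPy, as sketched below on a synthetic raster; the cloud algorithm described above is what replaces this kind of whole-image propagation once the raster no longer fits on one node. The raster contents and cell size are invented for illustration.

        import numpy as np
        from scipy.ndimage import distance_transform_edt

        # Hypothetical binary raster: 1 = background, 0 = target features (e.g. roads).
        raster = np.ones((2048, 2048), dtype=np.uint8)
        raster[1024, 512] = 0
        raster[300:310, 1500] = 0

        # Euclidean distance from every cell to the nearest zero-valued cell,
        # scaled by an assumed 30 m cell size.
        distance = distance_transform_edt(raster, sampling=30.0)
        print(distance.max(), distance[1024, 515])   # the second value is 3 cells away -> 90.0 m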

  15. Recent developments of the MOA thruster, a high performance plasma accelerator for nuclear power and propulsion applications

    International Nuclear Information System (INIS)

    Frischauf, N.; Hettmer, M.; Grassauer, A.; Bartusch, T.; Koudelka, O.

    2008-01-01

    More than 60 years after the late Nobel laureate Hannes Alfven published a letter stating that oscillating magnetic fields can accelerate ionised matter via magneto-hydrodynamic interactions in a wave-like fashion, the technical implementation of Alfven waves for propulsive purposes has been proposed, patented and examined for the first time by a group of inventors. The name of the concept, which utilises Alfven waves to accelerate ionised matter for propulsive purposes, is MOA - Magnetic field Oscillating Amplified thruster. Alfven waves are generated by making use of two coils, one being permanently powered and serving also as a magnetic nozzle, the other being switched on and off in a cyclic way, deforming the field lines of the overall system. It is this deformation that generates Alfven waves, which are in the next step used to transport and compress the propulsive medium, in theory leading to a propulsion system with a much higher performance than any other electric propulsion system. Based on computer simulations, which were conducted to get a first estimate of the performance of the system, MOA is a highly flexible propulsion system whose performance parameters can easily be adapted by changing the mass flow and/or the power level. As such, the system is capable of delivering a maximum specific impulse of 13116 s (12.87 mN) at a power level of 11.16 kW, using Xe as propellant, but it can also be attuned to provide a thrust of 236.5 mN (2411 s) at 6.15 kW of power. While space propulsion is expected to be the prime application for MOA, supported by numerous applications such as Solar and/or Nuclear Electric Propulsion or even as an 'afterburner system' for Nuclear Thermal Propulsion, terrestrial applications can be thought of as well, making the system highly suited for a common space-terrestrial application research and utilization strategy. This paper presents the recent developments of the MOA Thruster R and D activities at QASAR, the company in

  16. An overview of the activities of the OECD/NEA Task Force on adapting computer codes in nuclear applications to parallel architectures

    Energy Technology Data Exchange (ETDEWEB)

    Kirk, B.L. [Oak Ridge National Lab., TN (United States); Sartori, E. [OCDE/OECD NEA Data Bank, Issy-les-Moulineaux (France); Viedma, L.G. de [Consejo de Seguridad Nuclear, Madrid (Spain)

    1997-06-01

    Subsequent to the introduction of High Performance Computing in the developed countries, the Organization for Economic Cooperation and Development/Nuclear Energy Agency (OECD/NEA) created the Task Force on Adapting Computer Codes in Nuclear Applications to Parallel Architectures (under the guidance of the Nuclear Science Committee's Working Party on Advanced Computing) to study the growth area in supercomputing and its applicability to the nuclear community's computer codes. The result has been four years of investigation for the Task Force in different subject fields - deterministic and Monte Carlo radiation transport, computational mechanics and fluid dynamics, nuclear safety, atmospheric models and waste management.

  17. An overview of the activities of the OECD/NEA Task Force on adapting computer codes in nuclear applications to parallel architectures

    International Nuclear Information System (INIS)

    Kirk, B.L.; Sartori, E.; Viedma, L.G. de

    1997-01-01

    Subsequent to the introduction of High Performance Computing in the developed countries, the Organization for Economic Cooperation and Development/Nuclear Energy Agency (OECD/NEA) created the Task Force on Adapting Computer Codes in Nuclear Applications to Parallel Architectures (under the guidance of the Nuclear Science Committee's Working Party on Advanced Computing) to study the growth area in supercomputing and its applicability to the nuclear community's computer codes. The result has been four years of investigation for the Task Force in different subject fields - deterministic and Monte Carlo radiation transport, computational mechanics and fluid dynamics, nuclear safety, atmospheric models and waste management

  18. Applications and issues in automotive computational aeroacoustics

    International Nuclear Information System (INIS)

    Karbon, K.J.; Kumarasamy, S.; Singh, R.

    2002-01-01

    Automotive aeroacoustics concerns the noise generated by the airflow around a moving vehicle. Previously regarded as a minor contributor, wind noise is now recognized as one of the dominant vehicle sound sources, since significant progress has been made in suppressing engine and tire noise. Currently, almost all aeroacoustic development work is performed experimentally on a full-scale vehicle in the wind tunnel. Any reduction in hardware models is recognized as a major enabler for bringing the vehicle to market quickly. In addition, prediction of noise sources and characteristics at the early stages of vehicle design will help reduce costly fixes at later stages. However, predictive methods such as Computational Fluid Dynamics (CFD) and Computational Aeroacoustics (CAA) are still under development and are not considered mainstream design tools. This paper presents some initial applications and findings of CFD and CAA analysis towards vehicle aeroacoustics. Transient Reynolds Averaged Navier Stokes (RANS) and Lighthill-Curle methods are used to model low frequency buffeting and high frequency wind rush noise. Benefits and limitations of the approaches are described. (author)

  19. Performance evaluation of throughput computing workloads using multi-core processors and graphics processors

    Science.gov (United States)

    Dave, Gaurav P.; Sureshkumar, N.; Blessy Trencia Lincy, S. S.

    2017-11-01

    The current trend in processor manufacturing focuses on multi-core architectures rather than increased clock speed for performance improvement. Graphics processors have become commodity hardware for fast co-processing in computer systems. Developments in IoT, social networking web applications, and big data have created a huge demand for data processing, and such throughput-intensive applications inherently contain data-level parallelism that is well suited to SIMD-based GPU architectures. This paper reviews the architectural aspects of multi-/many-core processors and graphics processors. Several case studies are used to compare the performance of throughput computing applications using shared-memory programming in OpenMP and CUDA-based programming.

  20. Computer Program Application Study for Newly Constructed Fossil Power Plant Performance Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Hyun; Park, Jong Jeng [Korea Electric Power Research Institute, Taejon (Korea, Republic of)

    1996-12-31

    The availability and economy of a power plant are significantly affected by equipment that degrades gradually as operation continues, which makes it important to evaluate plant performance accurately and to quantify its effect on plant economy. The existing methodology involves many calculation steps and requires substantial man-hours and effort, yet produces less precise results than desired. The project first aims to establish a methodology that can evaluate the performance of each piece of equipment and numerically analyze its inherent effect on cycle performance, and that further helps to determine more reasonable investment for effective plant economy. The project also implements this methodology in computer programs that run on a conventional personal computer with interactive graphical user interface facilities. (author). 44 refs., figs.

  1. Computer Program Application Study for Newly Constructed Fossil Power Plant Performance Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Hyun; Park, Jong Jeng [Korea Electric Power Research Institute, Taejon (Korea, Republic of)

    1997-12-31

    The availability and economy of a power plant are significantly affected by equipment that degrades gradually as operation continues, which makes it important to evaluate plant performance accurately and to quantify its effect on plant economy. The existing methodology involves many calculation steps and requires substantial man-hours and effort, yet produces less precise results than desired. The project first aims to establish a methodology that can evaluate the performance of each piece of equipment and numerically analyze its inherent effect on cycle performance, and that further helps to determine more reasonable investment for effective plant economy. The project also implements this methodology in computer programs that run on a conventional personal computer with interactive graphical user interface facilities. (author). 44 refs., figs.

  2. Designing a High Performance Parallel Personal Cluster

    OpenAIRE

    Kapanova, K. G.; Sellier, J. M.

    2016-01-01

    Today, many scientific and engineering areas require high performance computing to perform computationally intensive experiments. For example, many advances in transport phenomena, thermodynamics, material properties, computational chemistry and physics are possible only because of the availability of such large scale computing infrastructures. Yet many challenges are still open. The cost of energy consumption, cooling, competition for resources have been some of the reasons why the scientifi...

  3. Integer programming theory, applications, and computations

    CERN Document Server

    Taha, Hamdy A

    1975-01-01

    Integer Programming: Theory, Applications, and Computations provides information pertinent to the theory, applications, and computations of integer programming. This book presents the computational advantages of the various techniques of integer programming.Organized into eight chapters, this book begins with an overview of the general categorization of integer applications and explains the three fundamental techniques of integer programming. This text then explores the concept of implicit enumeration, which is general in a sense that it is applicable to any well-defined binary program. Other

  4. High-Level Synthesis: Productivity, Performance, and Software Constraints

    Directory of Open Access Journals (Sweden)

    Yun Liang

    2012-01-01

    Full Text Available FPGAs are an attractive platform for applications with high computation demand and low energy consumption requirements. However, design effort for FPGA implementations remains high—often an order of magnitude larger than design effort using high-level languages. Instead of this time-consuming process, high-level synthesis (HLS) tools generate hardware implementations from algorithm descriptions in languages such as C/C++ and SystemC. Such tools reduce design effort: high-level descriptions are more compact and less error prone. HLS tools promise hardware development abstracted from software designer knowledge of the implementation platform. In this paper, we present an unbiased study of the performance, usability and productivity of HLS using AutoPilot (a state-of-the-art HLS tool). In particular, we first evaluate AutoPilot using the popular embedded benchmark kernels. Then, to evaluate the suitability of HLS on real-world applications, we perform a case study of stereo matching, an active area of computer vision research that uses techniques also common for image denoising, image retrieval, feature matching, and face recognition. Based on our study, we provide insights on current limitations of mapping general-purpose software to hardware using HLS and some future directions for HLS tool development. We also offer several guidelines for hardware-friendly software design. For popular embedded benchmark kernels, the designs produced by HLS achieve 4X to 126X speedup over the software version. The stereo matching algorithms achieve between 3.5X and 67.9X speedup over software (but still less than manual RTL design), with a fivefold reduction in design effort versus manual RTL design.

  5. Student Advising and Retention Application in Cloud Computing Environment

    Directory of Open Access Journals (Sweden)

    Gurdeep S Hura

    2016-11-01

    Full Text Available This paper proposes a new user-friendly application that enhances and expands the current advising services of Gradesfirst, the tool currently used for advising and retention by the Athletic Department of UMES, with a view to adding new performance activities such as mentoring, tutoring, scheduling, and study hall hours to the existing tools. This application includes various measurements that can be used to monitor and improve the performance of the students in the Athletic Department of UMES by monitoring students’ weekly study hall hours and tutoring schedules. It also supervises tutors’ login and logout activities in order to monitor their effectiveness, supervises tutor-tutee interaction, and stores and analyzes the overall academic progress of each student. A dedicated server for providing services will be developed at the local site. The work was implemented in three steps. The first step involves the creation of an independent cloud computing environment that provides resources such as database creation, query-based statistical data, performance measures activities, and automated support of performance measures such as advising, mentoring, monitoring and tutoring. The second step involves the creation of an application known as Student Advising and Retention (SAR) application in a cloud computing environment. This application has been designed to be a comprehensive database management system which contains relevant data regarding student academic development and supports various strategic advising and monitoring of students. The third step involves the creation of a systematic advising chart and frameworks which help advisors. The paper shows ways of creating the most appropriate advising technique based on the student’s academic needs. The proposed application runs in a Windows-based system. As stated above, the proposed application is expected to enhance and expand the current advising service of the Gradesfirst tool. A brief

  6. Hybrid cloud and cluster computing paradigms for life science applications.

    Science.gov (United States)

    Qiu, Judy; Ekanayake, Jaliya; Gunarathne, Thilina; Choi, Jong Youl; Bae, Seung-Hee; Li, Hui; Zhang, Bingjing; Wu, Tak-Lon; Ruan, Yang; Ekanayake, Saliya; Hughes, Adam; Fox, Geoffrey

    2010-12-21

    Clouds and MapReduce have shown themselves to be broadly useful approaches to scientific computing, especially for parallel data-intensive applications. However, they have limited applicability to some areas, such as data mining, because MapReduce performs poorly on problems with the iterative structure present in the linear algebra that underlies much data analysis. Such problems can be run efficiently on clusters using MPI, leading to a hybrid cloud and cluster environment. This motivates the design and implementation of an open source Iterative MapReduce system, Twister. Comparisons of Amazon, Azure, and traditional Linux and Windows environments on common applications have shown encouraging performance and usability in several important non-iterative cases. These are linked to MPI applications for the final stages of the data analysis. Further, we have released the open source Twister Iterative MapReduce and benchmarked it against basic MapReduce (Hadoop) and MPI in information retrieval and life sciences applications. The hybrid cloud (MapReduce) and cluster (MPI) approach offers an attractive production environment, while Twister promises a uniform programming environment for many Life Sciences applications. We used the commercial clouds Amazon and Azure and the NSF resource FutureGrid to perform detailed comparisons and evaluations of different approaches to data intensive computing. Several applications were developed in MPI, MapReduce and Twister in these different environments.
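
    The iterative structure that plain MapReduce handles poorly can be seen in a toy k-means loop, where every pass re-reads the data and re-emits centroids; this is exactly the pattern an iterative runtime such as Twister caches between iterations. The sketch below is a minimal, single-process illustration of that control flow in plain Python with made-up data; it does not use the Twister or Hadoop APIs.

```python
import numpy as np

def map_assign(points, centroids):
    # "Map" step: label each point with the index of its nearest centroid.
    dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def reduce_update(points, labels, centroids):
    # "Reduce" step: average the points assigned to each cluster.
    new_centroids = centroids.copy()
    for j in range(len(centroids)):
        members = points[labels == j]
        if len(members):
            new_centroids[j] = members.mean(axis=0)
    return new_centroids

rng = np.random.default_rng(0)
points = rng.normal(size=(1000, 2))                       # made-up data set
centroids = points[rng.choice(len(points), 3, replace=False)]

# The outer loop is what basic MapReduce handles poorly: each pass would
# normally re-load the input, whereas iterative runtimes keep it cached.
for _ in range(10):
    labels = map_assign(points, centroids)
    centroids = reduce_update(points, labels, centroids)

print(centroids)
```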

  7. Characteristics and applications of a flat panel computer tomography system

    International Nuclear Information System (INIS)

    Knollmann, F.; Valencia, R.; Obenauer, S.; Buhk, J.H.

    2006-01-01

    Purpose: to assess a new flat panel volume computed tomography (FP-VCT) with very high isotropic spatial resolution as well as high Z-axis coverage. Materials and Methods: The prototype of an FP-VCT scanner with a detector cell size of 0.2 mm was used for numerous phantom studies, specimen examinations, and animal research projects. Results: The high spatial resolution of the new system can be used to accurately determine solid tumor volume, thus allowing for earlier assessment of the therapeutic response. In animal experimentation, whole-body perfusion mapping of mice is feasible. The high spatial resolution also improves the classification of coronary artery atherosclerotic plaques in the isolated post mortem human heart. With the depiction of intramyocardial segments of the coronary arteries, investigations of myocardial collateral circulation are feasible. In skeletal applications, an accurate analysis of the smallest bony structures, e.g., petrous bone and dental preparations, can be successfully performed, as well as repeated studies of fracture healing and the treatment of osteoporosis. Conclusion: The introduction of FP-VCT opens up new applications for CT, including the field of molecular imaging, which are highly attractive for future clinical applications. Present limitations, including limited temporal resolution, necessitate further improvement of the system. (orig.)

  8. A Review of Lightweight Thread Approaches for High Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Castello, Adrian; Pena, Antonio J.; Seo, Sangmin; Mayo, Rafael; Balaji, Pavan; Quintana-Orti, Enrique S.

    2016-09-12

    High-level, directive-based solutions are becoming the programming models (PMs) of choice for multi-/many-core architectures. Several solutions relying on operating system (OS) threads work perfectly well with a moderate number of cores. However, exascale systems will spawn hundreds of thousands of threads in order to exploit their massively parallel architectures, and conventional OS threads are too heavy for that purpose. Several lightweight thread (LWT) libraries have recently appeared, offering lighter mechanisms to tackle massive concurrency. In order to examine the suitability of LWTs in high-level runtimes, we develop a set of microbenchmarks consisting of commonly found patterns in current parallel codes. Moreover, we study the semantics offered by some LWT libraries in order to expose the similarities between different LWT application programming interfaces. This study reveals that a reduced set of LWT functions can be sufficient to cover the common parallel code patterns, and that those LWT libraries perform better than OS-thread-based solutions for the task and nested parallelism patterns that are becoming more popular with new architectures.

  9. Computationally-optimized bone mechanical modeling from high-resolution structural images.

    Directory of Open Access Journals (Sweden)

    Jeremy F Magland

    Full Text Available Image-based mechanical modeling of the complex micro-structure of human bone has shown promise as a non-invasive method for characterizing bone strength and fracture risk in vivo. In particular, elastic moduli obtained from image-derived micro-finite element (μFE) simulations have been shown to correlate well with results obtained by mechanical testing of cadaveric bone. However, most existing large-scale finite-element simulation programs require significant computing resources, which hamper their use in common laboratory and clinical environments. In this work, we theoretically derive and computationally evaluate the resources needed to perform such simulations (in terms of computer memory and computation time), which are dependent on the number of finite elements in the image-derived bone model. A detailed description of our approach is provided, which is specifically optimized for μFE modeling of the complex three-dimensional architecture of trabecular bone. Our implementation includes domain decomposition for parallel computing, a novel stopping criterion, and a system for speeding up convergence by pre-iterating on coarser grids. The performance of the system is demonstrated on a system with dual quad-core Xeon 3.16 GHz CPUs and 40 GB of RAM. Models of the distal tibia derived from 3D in-vivo MR images of a patient, comprising 200,000 elements, required less than 30 seconds (and 40 MB of RAM) to converge. To illustrate the system's potential for large-scale μFE simulations, axial stiffness was estimated from high-resolution micro-CT images of a voxel array of 90 million elements comprising the human proximal femur in seven hours of CPU time. In conclusion, the system described should enable image-based finite-element bone simulations in practical computation times on high-end desktop computers, with applications to laboratory studies and clinical imaging.
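
    The heart of such μFE simulations is an iterative solve of a large sparse, symmetric positive-definite system assembled from the voxel mesh. The sketch below is a generic stand-in, not the authors' optimized solver: it uses SciPy's conjugate-gradient routine with a simple Jacobi preconditioner on a Laplacian-like matrix, just to show the kind of memory-lean iterative solve the abstract describes.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import cg

# Stand-in "stiffness" matrix: a sparse SPD 2-D Laplacian on an n-by-n grid
# (a real voxel-based stiffness matrix would be assembled from the image).
n = 200
lap1d = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n))
K = sp.kron(sp.eye(n), lap1d) + sp.kron(lap1d, sp.eye(n))
f = np.ones(K.shape[0])                 # unit load vector, ~40,000 unknowns

# A Jacobi (diagonal) preconditioner costs only one extra vector of memory.
M = sp.diags(1.0 / K.diagonal())

u, info = cg(K, f, M=M, maxiter=5000)
print("converged" if info == 0 else f"stopped early, info={info}")
```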

  10. The NEA computer program library: a possible GDMS application

    International Nuclear Information System (INIS)

    Schuler, W.

    1978-01-01

    The NEA Computer Program Library maintains a series of eleven sequential computer files, used for linked applications in managing its stock of computer codes for nuclear reactor calculations, storing index and program abstract information, and administering its service to requesters. The high data redundancy between the files suggests that a database approach would be valid, and this paper suggests a possible 'schema' for a CODASYL GDMS

  11. A ground-up approach to High Throughput Cloud Computing in High-Energy Physics

    CERN Document Server

    AUTHOR|(INSPIRE)INSPIRE-00245123; Ganis, Gerardo; Bagnasco, Stefano

    The thesis explores various practical approaches to making existing High Throughput Computing applications common in High Energy Physics work on cloud-provided resources, as well as opening the possibility of running new applications. The work is divided into two parts: first, we describe the work done at the computing facility hosted by INFN Torino to entirely convert former Grid resources into cloud ones, eventually running Grid use cases on top along with many others in a more flexible way. Integration and conversion problems are duly described. The second part covers the development of solutions for automating the orchestration of cloud workers based on the load of a batch queue, and the development of HEP applications based on ROOT's PROOF that can adapt at runtime to a changing number of workers.

  12. A High Performance Block Eigensolver for Nuclear Configuration Interaction Calculations

    International Nuclear Information System (INIS)

    Aktulga, Hasan Metin; Afibuzzaman, Md.; Williams, Samuel; Buluc, Aydin; Shao, Meiyue

    2017-01-01

    As on-node parallelism increases and the performance gap between the processor and the memory system widens, achieving high performance in large-scale scientific applications requires an architecture-aware design of algorithms and solvers. We focus on the eigenvalue problem arising in nuclear Configuration Interaction (CI) calculations, where a few extreme eigenpairs of a sparse symmetric matrix are needed. Here, we consider a block iterative eigensolver whose main computational kernels are the multiplication of a sparse matrix with multiple vectors (SpMM), and tall-skinny matrix operations. We then present techniques to significantly improve the SpMM and the transpose operation SpMM^T by using the compressed sparse blocks (CSB) format. We achieve 3-4× speedup on the requisite operations over good implementations with the commonly used compressed sparse row (CSR) format. We develop a performance model that allows us to correctly estimate the performance of our SpMM kernel implementations, and we identify cache bandwidth as a potential performance bottleneck beyond DRAM. We also analyze and optimize the performance of LOBPCG kernels (inner product and linear combinations on multiple vectors) and show up to 15× speedup over using high performance BLAS libraries for these operations. The resulting high performance LOBPCG solver achieves 1.4× to 1.8× speedup over the existing Lanczos solver on a series of CI computations on high-end multicore architectures (Intel Xeons). We also analyze the performance of our techniques on an Intel Xeon Phi Knights Corner (KNC) processor.
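
    The block-iterative approach described above can be tried at small scale with SciPy's LOBPCG implementation, whose dominant kernel is exactly the sparse-matrix-times-multiple-vectors product (SpMM). The matrix below is a generic sparse symmetric stand-in, not a nuclear CI Hamiltonian, and the block size is arbitrary.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import lobpcg

rng = np.random.default_rng(1)

# Sparse symmetric stand-in matrix (a CI Hamiltonian would be far larger).
n = 5000
A = sp.random(n, n, density=1e-3, format="csr", random_state=1)
A = 0.5 * (A + A.T) + sp.diags(np.arange(1.0, n + 1.0))

# Block of starting vectors: the dominant kernel inside LOBPCG is the
# sparse-matrix-times-multiple-vectors product A @ X (SpMM).
k = 8
X = rng.standard_normal((n, k))

eigenvalues, eigenvectors = lobpcg(A, X, largest=False, tol=1e-6, maxiter=500)
print(np.sort(eigenvalues))
```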

  13. Enhancing Application Performance Using Mini-Apps: Comparison of Hybrid Parallel Programming Paradigms

    Science.gov (United States)

    Lawson, Gary; Sosonkina, Masha; Baurle, Robert; Hammond, Dana

    2017-01-01

    In many fields, real-world applications for High Performance Computing have already been developed. For these applications to stay up-to-date, new parallel strategies must be explored to yield the best performance; however, restructuring or modifying a real-world application may be daunting depending on the size of the code. In this case, a mini-app may be employed to quickly explore such options without modifying the entire code. In this work, several mini-apps have been created to enhance a real-world application performance, namely the VULCAN code for complex flow analysis developed at the NASA Langley Research Center. These mini-apps explore hybrid parallel programming paradigms with Message Passing Interface (MPI) for distributed memory access and either Shared MPI (SMPI) or OpenMP for shared memory accesses. Performance testing shows that MPI+SMPI yields the best execution performance, while requiring the largest number of code changes. A maximum speedup of 23 was measured for MPI+SMPI, but only 11 was measured for MPI+OpenMP.

  14. Mapping the Conjugate Gradient Algorithm onto High Performance Heterogeneous Computers

    Science.gov (United States)

    2014-05-01

  15. Computational electromagnetics recent advances and engineering applications

    CERN Document Server

    2014-01-01

    Emerging Topics in Computational Electromagnetics presents advances in Computational Electromagnetics. This book is designed to fill the existing gap in current CEM literature, which covers only the conventional numerical techniques for solving traditional EM problems. The book examines new algorithms, and applications of these algorithms, for solving problems of current interest that are not readily amenable to efficient treatment using existing techniques. The authors discuss solution techniques for problems arising in nanotechnology, bioEM, metamaterials, as well as multiscale problems. They present techniques that exploit recent advances in computer technology, such as parallel architectures, and address the increasing need to solve large and complex problems in a time-efficient manner by using highly scalable algorithms.

  16. Introducing high availability to non high available designed applications

    Energy Technology Data Exchange (ETDEWEB)

    Zelnicek, Pierre; Kebschull, Udo [Kirchhoff Institute of Physics, Ruprecht-Karls-University Heidelberg (Germany); Haaland, Oystein Senneset [Physic Institut, University of Bergen, Bergen (Norway); Lindenstruth, Volker [Frankfurt Institut fuer Advanced Studies, University Frankfurt (Germany)

    2010-07-01

    A common problem in scientific computing environments and compute clusters today is how to apply high availability to legacy applications. These applications are becoming more and more of a problem in increasingly complex environments and under business-grade availability constraints that require 24 x 7 x 365 operation. For the majority of applications, redesign is not an option, either because they are closed source or because the effort involved would be just as great as re-writing the application from scratch. Nor is letting normal operators restart and reconfigure the applications on backup nodes a solution. In addition to the possibility of mistakes by non-experts and the cost of keeping personnel at work 24/7, such operations would require administrator privileges within the compute environment and would therefore be a security risk. Therefore, these legacy applications have to be monitored and, if a failure occurs, autonomously migrated to a working node. The Pacemaker framework is designed for both tasks and ensures the availability of the legacy applications. Distributed redundant block devices are used for fault-tolerant distributed data storage. The result is an Availability Environment Classification 2 (AEC-2).

  17. Computing farms

    International Nuclear Information System (INIS)

    Yeh, G.P.

    2000-01-01

    High-energy physics, nuclear physics, space sciences, and many other fields face large challenges in computing. In recent years, PCs have achieved performance comparable to that of high-end UNIX workstations, at a small fraction of the price. We review the development and broad applications of commodity PCs as the solution to CPU needs, and look forward to the important and exciting future of large-scale PC computing

  18. Performance Aspects of Synthesizable Computing Systems

    DEFF Research Database (Denmark)

    Schleuniger, Pascal

    Embedded systems are used in a broad range of applications that demand high performance within severely constrained mechanical, power, and cost requirements. Embedded systems implemented in ASIC technology tend to provide the highest performance, lowest power consumption and lowest unit cost. How...

  19. National cyber defense high performance computing and analysis : concepts, planning and roadmap.

    Energy Technology Data Exchange (ETDEWEB)

    Hamlet, Jason R.; Keliiaa, Curtis M.

    2010-09-01

    There is a national cyber dilemma that threatens the very fabric of government, commercial and private use operations worldwide. Much is written about 'what' the problem is, and though the basis for this paper is an assessment of the problem space, we target the 'how' solution space of the wide-area national information infrastructure through the advancement of science, technology, evaluation and analysis with actionable results intended to produce a more secure national information infrastructure and a comprehensive national cyber defense capability. This cybersecurity High Performance Computing (HPC) analysis concepts, planning and roadmap activity was conducted as an assessment of cybersecurity analysis as a fertile area of research and investment for high value cybersecurity wide-area solutions. This report and a related SAND2010-4765 Assessment of Current Cybersecurity Practices in the Public Domain: Cyber Indications and Warnings Domain report are intended to provoke discussion throughout a broad audience about developing a cohesive HPC centric solution to wide-area cybersecurity problems.

  20. Integrated multi sensors and camera video sequence application for performance monitoring in archery

    Science.gov (United States)

    Taha, Zahari; Arif Mat-Jizat, Jessnor; Amirul Abdullah, Muhammad; Muazu Musa, Rabiu; Razali Abdullah, Mohamad; Fauzi Ibrahim, Mohamad; Hanafiah Shaharudin, Mohd Ali

    2018-03-01

    This paper explains the development of comprehensive archery performance monitoring software consisting of three camera views and five body sensors. The five body sensors evaluate biomechanically related variables of flexor and extensor muscle activity, heart rate, postural sway and bow movement during archery performance. The three camera views and the five body sensors are integrated into a single computer application that enables the user to view all the data in a single user interface. The five body sensors’ data are displayed in numerical and graphical form in real time. The information transmitted by the body sensors is processed with an embedded algorithm that automatically summarizes the athlete’s biomechanical performance and displays it in the application interface. This performance is later compared with the psycho-fitness performance pre-computed from data previously entered into the application. All the data (camera views, body sensors, performance computations) are recorded for further analysis by a sports scientist. The developed application serves as a powerful tool for assisting the coach and athletes in observing and identifying any incorrect technique employed during training, giving room for correction and re-evaluation to improve overall performance in the sport of archery.

  1. MODELING A COMPUTER APPLICATION FOR MANAGEMENT OF MAINTENANCE ACTIVITIES OF UNPAVED ROADS

    Directory of Open Access Journals (Sweden)

    Ludmília de Souza Dias

    2015-08-01

    Full Text Available This study presents a contribution to the modeling of a computer application employing a method of serviceability performance for unpaved roads, aiming at the management of maintenance/restoration activities of the primary surface layer. The proposed methodology consisted of field inspections during dry (April to September) and rainy (October to March) periods, during which objective evaluations were performed to survey defects and their densities and degrees of severity. To aid the functional classification of the analyzed road sections and the determination of the defect with the greatest influence on the serviceability of these roads, the method of serviceability performance proposed by Silva (2009) was implemented in the Visual Basic for Applications (VBA) language in Microsoft Excel software. With the use of the proposed computer application it was possible to identify, among the defects analyzed in the field through the index of serviceability of the sampling unit per defect type (ISUdef), which one had the greatest influence on determining the relative serviceability index per road section (IST). The results allow us to conclude that the computer application Road achieved satisfactory results, since the objective evaluation criteria applied to the road sections denote consistency regarding their serviceability.

  2. Performance implications of virtualization and hyper-threading on high energy physics applications in a Grid environment

    CERN Document Server

    Gilbert, Laura; Cobban, M; Iqbal, Saima; Jenwei, Hsieh; Newman, R; Pepper, R; Tseng, Jeffrey

    2005-01-01

    The simulations used in the field of high energy physics are compute intensive and exhibit a high level of data parallelism. These features make such simulations ideal candidates for Grid computing. We are taking as an example the GEANT4 detector simulation used for physics studies within the ATLAS experiment at CERN. One key issue in Grid computing is that of network and system security, which can potentially inhibit the wide spread use of such simulations. Virtualization provides a feasible solution because it allows the creation of virtual compute nodes in both local and remote compute clusters, thus providing an insulating layer which can play an important role in satisfying the security concerns of all parties involved. However, it has performance implications. This study provides quantitative estimates of the virtualization and hyper- threading overhead for GEANT on commodity clusters. Results show that virtualization has less than 15% run-time overhead, and that the best run time (with the non-SMP lice...

  3. Strategy Guideline: High Performance Residential Lighting

    Energy Technology Data Exchange (ETDEWEB)

    Holton, J.

    2012-02-01

    The Strategy Guideline: High Performance Residential Lighting has been developed to provide a tool for the understanding and application of high performance lighting in the home. The high performance lighting strategies featured in this guide are drawn from recent advances in commercial lighting for application to typical spaces found in residential buildings. This guide offers strategies to greatly reduce lighting energy use through the application of high quality fluorescent and light emitting diode (LED) technologies. It is important to note that these strategies not only save energy in the home but also serve to satisfy the homeowner's expectations for high quality lighting.

  4. The application of AFS in the high energy physics computing system

    International Nuclear Information System (INIS)

    Xu Dong; Yan Xiaofei; Chen Yaodong; Chen Gang; Yu Chuansong

    2010-01-01

    With the development of high energy physics, physics experiments are producing large amounts of data. The workload of data analysis is very large, and the analysis work needs to be carried out by many scientists together. The computing system must therefore provide more secure user management functions and a higher level of data-sharing ability. The article introduces a solution based on AFS in the high energy physics computing system, which not only makes user management safer, but also makes data sharing easier. (authors)

  5. Quantum Computation-Based Image Representation, Processing Operations and Their Applications

    Directory of Open Access Journals (Sweden)

    Fei Yan

    2014-10-01

    Full Text Available A flexible representation of quantum images (FRQI) was proposed to facilitate the extension of classical (non-quantum)-like image processing applications to the quantum computing domain. The representation encodes a quantum image in the form of a normalized state, which captures information about colors and their corresponding positions in the images. Since its conception, a handful of processing transformations have been formulated, among which are the geometric transformations on quantum images (GTQI) and the CTQI that are focused on the color information of the images. In addition, extensions and applications of FRQI representation, such as multi-channel representation for quantum images (MCQI), quantum image data searching, watermarking strategies for quantum images, a framework to produce movies on quantum computers and a blueprint for quantum video encryption and decryption have also been suggested. These proposals extend classical-like image and video processing applications to the quantum computing domain and offer a significant speed-up with low computational resources in comparison to performing the same tasks on traditional computing devices. Each of the algorithms and the mathematical foundations for their execution were simulated using classical computing resources, and their results were analyzed alongside other classical computing equivalents. The work presented in this review is intended to serve as the epitome of advances made in FRQI quantum image processing over the past five years and to stimulate further interest geared towards the realization of some secure and efficient image and video processing applications on quantum computers.

  6. Core-Shell Columns in High-Performance Liquid Chromatography: Food Analysis Applications

    OpenAIRE

    Preti, Raffaella

    2016-01-01

    The increased separation efficiency provided by the new technology of columns packed with core-shell particles in high-performance liquid chromatography (HPLC) has resulted in their widespread diffusion in several analytical fields: pharmaceutical, biological, environmental, and toxicological. The present paper presents their most recent applications in food analysis. Their use has proved to be particularly advantageous for the determination of compounds at trace levels or when a large am...

  7. Π4U: A high performance computing framework for Bayesian uncertainty quantification of complex models

    Science.gov (United States)

    Hadjidoukas, P. E.; Angelikopoulos, P.; Papadimitriou, C.; Koumoutsakos, P.

    2015-03-01

    We present Π4U, an extensible framework for non-intrusive Bayesian Uncertainty Quantification and Propagation (UQ+P) of complex and computationally demanding physical models that can exploit massively parallel computer architectures. The framework incorporates Laplace asymptotic approximations as well as stochastic algorithms, along with distributed numerical differentiation and task-based parallelism for heterogeneous clusters. Sampling is based on the Transitional Markov Chain Monte Carlo (TMCMC) algorithm and its variants. The optimization tasks associated with the asymptotic approximations are treated via the Covariance Matrix Adaptation Evolution Strategy (CMA-ES). A modified subset simulation method is used for posterior reliability measurements of rare events. The framework accommodates scheduling of multiple physical model evaluations based on an adaptive load balancing library and shows excellent scalability. In addition to the software framework, we also provide guidelines as to the applicability and efficiency of Bayesian tools when applied to computationally demanding physical models. Theoretical and computational developments are demonstrated with applications drawn from molecular dynamics, structural dynamics and granular flow.

  8. Π4U: A high performance computing framework for Bayesian uncertainty quantification of complex models

    International Nuclear Information System (INIS)

    Hadjidoukas, P.E.; Angelikopoulos, P.; Papadimitriou, C.; Koumoutsakos, P.

    2015-01-01

    We present Π4U, an extensible framework for non-intrusive Bayesian Uncertainty Quantification and Propagation (UQ+P) of complex and computationally demanding physical models that can exploit massively parallel computer architectures. The framework incorporates Laplace asymptotic approximations as well as stochastic algorithms, along with distributed numerical differentiation and task-based parallelism for heterogeneous clusters. Sampling is based on the Transitional Markov Chain Monte Carlo (TMCMC) algorithm and its variants. The optimization tasks associated with the asymptotic approximations are treated via the Covariance Matrix Adaptation Evolution Strategy (CMA-ES). A modified subset simulation method is used for posterior reliability measurements of rare events. The framework accommodates scheduling of multiple physical model evaluations based on an adaptive load balancing library and shows excellent scalability. In addition to the software framework, we also provide guidelines as to the applicability and efficiency of Bayesian tools when applied to computationally demanding physical models. Theoretical and computational developments are demonstrated with applications drawn from molecular dynamics, structural dynamics and granular flow.
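
    The task-based parallelism the framework relies on boils down to farming out many independent model (likelihood) evaluations per sampling stage. The sketch below illustrates only that pattern with Python's multiprocessing module and a made-up toy likelihood; it is not the Π4U API, and the TMCMC/CMA-ES machinery is omitted.

```python
import numpy as np
from multiprocessing import Pool

# Hypothetical, cheap stand-in for an expensive physical-model evaluation.
def log_likelihood(theta):
    data = np.array([1.2, 0.9, 1.1])            # made-up observations
    prediction = theta[0] * np.ones_like(data)  # trivial one-parameter model
    sigma = abs(theta[1]) + 1e-6
    residual = data - prediction
    return -0.5 * np.sum(residual**2) / sigma**2 - data.size * np.log(sigma)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    population = rng.normal(size=(256, 2))      # one TMCMC-like sample population

    # Every sample can be evaluated independently, so the evaluations are
    # farmed out to worker processes -- the task-level pattern a UQ stage uses.
    with Pool(processes=4) as pool:
        log_likes = pool.map(log_likelihood, population)

    print(max(log_likes))
```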

  9. Lifetime laser damage performance of β-Ga2O3 for high power applications

    Directory of Open Access Journals (Sweden)

    Jae-Hyuck Yoo

    2018-03-01

    Full Text Available Gallium oxide (Ga2O3) is an emerging wide bandgap semiconductor with potential applications in power electronics and high power optical systems where gallium nitride and silicon carbide have already demonstrated unique advantages compared to gallium arsenide and silicon-based devices. Establishing the stability and breakdown conditions of these next-generation materials is critical to assessing their potential performance in devices subjected to large electric fields. Here, using systematic laser damage performance tests, we establish that β-Ga2O3 has the highest lifetime optical damage performance of any conductive material measured to date, above 10 J/cm2 (1.4 GW/cm2). This has direct implications for its use as an active component in high power laser systems and may give insight into its utility for high-power switching applications. Both heteroepitaxial and bulk β-Ga2O3 samples were benchmarked against a heteroepitaxial gallium nitride sample, revealing an order of magnitude higher optical lifetime damage threshold for β-Ga2O3. Photoluminescence and Raman spectroscopy results suggest that the exceptional damage performance of β-Ga2O3 is due to lower absorptive defect concentrations and reduced epitaxial stress.

  10. Lifetime laser damage performance of β -Ga2O3 for high power applications

    Science.gov (United States)

    Yoo, Jae-Hyuck; Rafique, Subrina; Lange, Andrew; Zhao, Hongping; Elhadj, Selim

    2018-03-01

    Gallium oxide (Ga2O3) is an emerging wide bandgap semiconductor with potential applications in power electronics and high power optical systems where gallium nitride and silicon carbide have already demonstrated unique advantages compared to gallium arsenide and silicon-based devices. Establishing the stability and breakdown conditions of these next-generation materials is critical to assessing their potential performance in devices subjected to large electric fields. Here, using systematic laser damage performance tests, we establish that β-Ga2O3 has the highest lifetime optical damage performance of any conductive material measured to date, above 10 J/cm2 (1.4 GW/cm2). This has direct implications for its use as an active component in high power laser systems and may give insight into its utility for high-power switching applications. Both heteroepitaxial and bulk β-Ga2O3 samples were benchmarked against a heteroepitaxial gallium nitride sample, revealing an order of magnitude higher optical lifetime damage threshold for β-Ga2O3. Photoluminescence and Raman spectroscopy results suggest that the exceptional damage performance of β-Ga2O3 is due to lower absorptive defect concentrations and reduced epitaxial stress.

  11. Computer sciences

    Science.gov (United States)

    Smith, Paul H.

    1988-01-01

    The Computer Science Program provides advanced concepts, techniques, system architectures, algorithms, and software for both space and aeronautics information sciences and computer systems. The overall goal is to provide the technical foundation within NASA for the advancement of computing technology in aerospace applications. The research program is improving the state of knowledge of fundamental aerospace computing principles and advancing computing technology in space applications such as software engineering and information extraction from data collected by scientific instruments in space. The program includes the development of special algorithms and techniques to exploit the computing power provided by high performance parallel processors and special purpose architectures. Research is being conducted in the fundamentals of data base logic and improvement techniques for producing reliable computing systems.

  12. High performance computing of density matrix renormalization group method for 2-dimensional model. Parallelization strategy toward peta computing

    International Nuclear Information System (INIS)

    Yamada, Susumu; Igarashi, Ryo; Machida, Masahiko; Imamura, Toshiyuki; Okumura, Masahiko; Onishi, Hiroaki

    2010-01-01

    We parallelize the density matrix renormalization group (DMRG) method, which is a ground-state solver for one-dimensional quantum lattice systems. The parallelization allows us to extend the applicable range of the DMRG to n-leg ladders, i.e., quasi-two-dimensional cases. Such an extension is expected to bring about several breakthroughs in, e.g., quantum physics, chemistry, and nano-engineering. However, the straightforward parallelization requires all-to-all communications between all processes, which are unsuitable for multi-core systems, the mainstream of current parallel computers. Therefore, we optimize the all-to-all communications in two steps. The first is the elimination of the communications between all processes by merely rearranging the data distribution while keeping the communication volume unchanged. The second is the avoidance of communication conflicts by rescheduling the calculation and the communication. We evaluate the performance of the DMRG method on multi-core supercomputers and confirm that our two-step tuning is quite effective. (author)
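
    The all-to-all exchange that the authors eliminate and reschedule corresponds to MPI's Alltoall collective. The sketch below only demonstrates that communication pattern with mpi4py on a toy buffer; it is not the parallel DMRG code itself.

```python
# Run with, for example:  mpiexec -n 4 python alltoall_demo.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

# Each rank contributes one value per destination rank -- the kind of global
# redistribution a straightforwardly parallelized DMRG step performs at once.
sendbuf = np.full(size, rank, dtype="i")
recvbuf = np.empty(size, dtype="i")

comm.Alltoall(sendbuf, recvbuf)

print(f"rank {rank} received {recvbuf}")
```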

  13. Accomplish the Application Area in Cloud Computing

    OpenAIRE

    Bansal, Nidhi; Awasthi, Amit

    2012-01-01

    In surveying the application areas of cloud computing, we find that the breadth of areas it covers is its main asset. At a top level, it is an approach to IT in which many users, some even from different companies, get access to shared IT resources such as servers, routers and various file extensions, instead of each having their own dedicated servers. This offers many advantages like lower costs and higher efficiency. Unfortunately there have been some high profile incidents whe...

  14. Applications of interval computations

    CERN Document Server

    Kreinovich, Vladik

    1996-01-01

    Primary Audience for the Book • Specialists in numerical computations who are interested in algorithms with automatic result verification. • Engineers, scientists, and practitioners who desire results with automatic verification and who would therefore benefit from the experience of successful applications. • Students in applied mathematics and computer science who want to learn these methods. Goal Of the Book This book contains surveys of applications of interval computations, i.e., applications of numerical methods with automatic result verification, that were presented at an international workshop on the subject in El Paso, Texas, February 23-25, 1995. The purpose of this book is to disseminate detailed and surveyed information about existing and potential applications of this new growing field. Brief Description of the Papers At the most fundamental level, interval arithmetic operations work with sets: The result of a single arithmetic operation is the set of all possible results as the o...

  15. Application of Ionic Liquids in High Performance Reversed-Phase Chromatography

    Directory of Open Access Journals (Sweden)

    Wentao Bi

    2009-06-01

    Full Text Available Ionic liquids, considered “green” chemicals, are widely used in many areas of analytical chemistry due to their unique properties. Recently, ionic liquids have been used as a kind of novel additive in separation and combined with silica to synthesize new stationary phase as separation media. This review will focus on the properties and mechanisms of ionic liquids and their potential applications as mobile phase modifier and surface-bonded stationary phase in reversed-phase high performance liquid chromatography (RP-HPLC. Ionic liquids demonstrate advantages and potential in chromatographic field.

  16. The Benefits and Complexities of Operating Geographic Information Systems (GIS) in a High Performance Computing (HPC) Environment

    Science.gov (United States)

    Shute, J.; Carriere, L.; Duffy, D.; Hoy, E.; Peters, J.; Shen, Y.; Kirschbaum, D.

    2017-12-01

    The NASA Center for Climate Simulation (NCCS) at the Goddard Space Flight Center is building and maintaining an Enterprise GIS capability for its stakeholders, to include NASA scientists, industry partners, and the public. This platform is powered by three GIS subsystems operating in a highly-available, virtualized environment: 1) the Spatial Analytics Platform is the primary NCCS GIS and provides users discoverability of the vast DigitalGlobe/NGA raster assets within the NCCS environment; 2) the Disaster Mapping Platform provides mapping and analytics services to NASA's Disaster Response Group; and 3) the internal (Advanced Data Analytics Platform/ADAPT) enterprise GIS provides users with the full suite of Esri and open source GIS software applications and services. All systems benefit from NCCS's cutting edge infrastructure, to include an InfiniBand network for high speed data transfers; a mixed/heterogeneous environment featuring seamless sharing of information between Linux and Windows subsystems; and in-depth system monitoring and warning systems. Due to its co-location with the NCCS Discover High Performance Computing (HPC) environment and the Advanced Data Analytics Platform (ADAPT), the GIS platform has direct access to several large NCCS datasets including DigitalGlobe/NGA, Landsat, MERRA, and MERRA2. Additionally, the NCCS ArcGIS Desktop Windows virtual machines utilize existing NetCDF and OPeNDAP assets for visualization, modelling, and analysis - thus eliminating the need for data duplication. With the advent of this platform, Earth scientists have full access to vast data repositories and the industry-leading tools required for successful management and analysis of these multi-petabyte, global datasets. The full system architecture and integration with scientific datasets will be presented. Additionally, key applications and scientific analyses will be explained, to include the NASA Global Landslide Catalog (GLC) Reporter crowdsourcing application, the

  17. CloudMC: a cloud computing application for Monte Carlo simulation

    International Nuclear Information System (INIS)

    Miras, H; Jiménez, R; Miras, C; Gomà, C

    2013-01-01

    This work presents CloudMC, a cloud computing application—developed in Windows Azure®, the platform of the Microsoft® cloud—for the parallelization of Monte Carlo simulations in a dynamic virtual cluster. CloudMC is a web application designed to be independent of the Monte Carlo code in which the simulations are based—the simulations just need to be of the form: input files → executable → output files. To study the performance of CloudMC in Windows Azure®, Monte Carlo simulations with penelope were performed on different instance (virtual machine) sizes, and for different number of instances. The instance size was found to have no effect on the simulation runtime. It was also found that the decrease in time with the number of instances followed Amdahl's law, with a slight deviation due to the increase in the fraction of non-parallelizable time with increasing number of instances. A simulation that would have required 30 h of CPU on a single instance was completed in 48.6 min when executed on 64 instances in parallel (speedup of 37 ×). Furthermore, the use of cloud computing for parallel computing offers some advantages over conventional clusters: high accessibility, scalability and pay per usage. Therefore, it is strongly believed that cloud computing will play an important role in making Monte Carlo dose calculation a reality in future clinical practice. (note)
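
    The reported scaling can be sanity-checked against Amdahl's law. In the sketch below the parallel fraction (about 0.99) is an assumption chosen to roughly reproduce the reported 37x speedup on 64 instances; it is not a value taken from the paper.

```python
def amdahl_speedup(parallel_fraction: float, n_workers: int) -> float:
    """Ideal Amdahl's-law speedup on n_workers for a given parallel fraction."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / n_workers)

# A parallel fraction of roughly 0.99 (an assumed value) gives speedups of the
# same order as the reported 37x on 64 instances.
for n in (1, 8, 16, 32, 64):
    print(n, round(amdahl_speedup(0.99, n), 1))
```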

  18. In search of novel, high performance and intelligent materials for applications in severe and unconditioned environments

    International Nuclear Information System (INIS)

    Gyeabour Ayensu, A. I.; Normeshie, C. M. K.

    2007-01-01

    For extreme operating conditions in aerospace, nuclear power plants and medical applications, novel materials have become more competitive than traditional materials because of their unique characteristics. Extensive research programmes are being undertaken to develop high performance and knowledge-intensive new materials, since existing materials cannot meet the stringent technological requirements of advanced materials for emerging industries. The technologies of intermetallic compounds, nanostructural materials, advanced composites, and photonics materials are presented. In addition, medical biomaterial implants of high functional performance, based on biocompatibility, resistance against corrosion and degradation, and suitability for application in the hostile environment of the human body, are discussed. The opportunities for African researchers to collaborate in international research programmes to develop local raw materials into high performance materials are also highlighted. (au)

  19. Collectively loading an application in a parallel computer

    Science.gov (United States)

    Aho, Michael E.; Attinella, John E.; Gooding, Thomas M.; Miller, Samuel J.; Mundy, Michael B.

    2016-01-05

    Collectively loading an application in a parallel computer, the parallel computer comprising a plurality of compute nodes, including: identifying, by a parallel computer control system, a subset of compute nodes in the parallel computer to execute a job; selecting, by the parallel computer control system, one of the subset of compute nodes in the parallel computer as a job leader compute node; retrieving, by the job leader compute node from computer memory, an application for executing the job; and broadcasting, by the job leader to the subset of compute nodes in the parallel computer, the application for executing the job.
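
    The leader-based collective load described in this record resembles an MPI broadcast: one rank retrieves the application image and all other ranks in the job's subset receive it. The sketch below mimics that pattern with mpi4py and a placeholder byte string; it is not the parallel computer control system described in the record.

```python
# Run with, for example:  mpiexec -n 8 python collective_load.py
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

JOB_LEADER = 0   # one rank plays the role of the "job leader compute node"

if rank == JOB_LEADER:
    # The leader retrieves the application (placeholder bytes stand in for
    # an executable image read from storage).
    application_image = b"...executable bytes..."
else:
    application_image = None

# Broadcast the application to every compute node in the job's subset.
application_image = comm.bcast(application_image, root=JOB_LEADER)

print(f"rank {rank} got {len(application_image)} bytes")
```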

  20. SiGe epitaxial memory for neuromorphic computing with reproducible high performance based on engineered dislocations

    Science.gov (United States)

    Choi, Shinhyun; Tan, Scott H.; Li, Zefan; Kim, Yunjo; Choi, Chanyeol; Chen, Pai-Yu; Yeon, Hanwool; Yu, Shimeng; Kim, Jeehwan

    2018-01-01

    Although several types of architecture combining memory cells and transistors have been used to demonstrate artificial synaptic arrays, they usually present limited scalability and high power consumption. Transistor-free analog switching devices may overcome these limitations, yet the typical switching process they rely on—formation of filaments in an amorphous medium—is not easily controlled and hence hampers the spatial and temporal reproducibility of the performance. Here, we demonstrate analog resistive switching devices that possess desired characteristics for neuromorphic computing networks with minimal performance variations using a single-crystalline SiGe layer epitaxially grown on Si as a switching medium. Such epitaxial random access memories utilize threading dislocations in SiGe to confine metal filaments in a defined, one-dimensional channel. This confinement results in drastically enhanced switching uniformity and long retention/high endurance with a high analog on/off ratio. Simulations using the MNIST handwritten recognition data set prove that epitaxial random access memories can operate with an online learning accuracy of 95.1%.