WorldWideScience

Sample records for rbrc supercomputer program

  1. PROCEEDINGS OF RIKEN BNL RESEARCH CENTER WORKSHOP, VOLUME 77, RBRC SCIENTIFIC REVIEW COMMITTEE MEETING, OCTOBER 10-12, 2005

    International Nuclear Information System (INIS)

    SAMIOS, N.P.

    2005-01-01

    The eighth evaluation of the RIKEN BNL Research Center (RBRC) took place on October 10-12, 2005, at Brookhaven National Laboratory. The members of the Scientific Review Committee (SRC) were Dr. Jean-Paul Blaizot, Professor Makoto Kobayashi, Dr. Akira Masaike, Professor Charles Young Prescott (Chair), Professor Stephen Sharpe (absent), and Professor Jack Sandweiss. We are grateful to Professor Akira Ukawa, who was appointed to the SRC to cover Professor Sharpe's area of expertise. In addition to reviewing this year's program, the committee, augmented by Professor Kozi Nakai, evaluated the RBRC proposal for a five-year extension of the RIKEN BNL Collaboration MOU beyond 2007. Dr. Koji Kaya, Director of the Discovery Research Institute, RIKEN, Japan, presided over the session on the extension proposal. In order to illustrate the breadth and scope of the RBRC program, each member of the Center made a presentation on his/her research efforts. In addition, a special session was held in connection with the RBRC QCDSP and QCDOC supercomputers. Professor Norman H. Christ, a collaborator from Columbia University, gave a presentation on the progress and status of the project, and Professor Frithjof Karsch of BNL presented the first physics results from QCDOC. Although the main purpose of this review is a report to RIKEN Management (Dr. Ryoji Noyori, RIKEN President) on the health, scientific value, management and future prospects of the Center, the RBRC management felt that a compendium of the scientific presentations is of sufficient quality and interest to warrant a wider distribution. Therefore we have made this compilation and present it to the community for its information and enlightenment.

  2. PROCEEDINGS OF RIKEN BNL RESEARCH CENTER WORKSHOP, VOLUME 77, RBRC SCIENTIFIC REVIEW COMMITTEE MEETING, OCTOBER 10-12, 2005

    Energy Technology Data Exchange (ETDEWEB)

    SAMIOS, N.P.

    2005-10-10

    The eighth evaluation of the RIKEN BNL Research Center (RBRC) took place on October 10-12, 2005, at Brookhaven National Laboratory. The members of the Scientific Review Committee (SRC) were Dr. Jean-Paul Blaizot, Professor Makoto Kobayashi, Dr. Akira Masaike, Professor Charles Young Prescott (Chair), Professor Stephen Sharpe (absent), and Professor Jack Sandweiss. We are grateful to Professor Akira Ukawa, who was appointed to the SRC to cover Professor Sharpe's area of expertise. In addition to reviewing this year's program, the committee, augmented by Professor Kozi Nakai, evaluated the RBRC proposal for a five-year extension of the RIKEN BNL Collaboration MOU beyond 2007. Dr. Koji Kaya, Director of the Discovery Research Institute, RIKEN, Japan, presided over the session on the extension proposal. In order to illustrate the breadth and scope of the RBRC program, each member of the Center made a presentation on his/her research efforts. In addition, a special session was held in connection with the RBRC QCDSP and QCDOC supercomputers. Professor Norman H. Christ, a collaborator from Columbia University, gave a presentation on the progress and status of the project, and Professor Frithjof Karsch of BNL presented the first physics results from QCDOC. Although the main purpose of this review is a report to RIKEN Management (Dr. Ryoji Noyori, RIKEN President) on the health, scientific value, management and future prospects of the Center, the RBRC management felt that a compendium of the scientific presentations is of sufficient quality and interest to warrant a wider distribution. Therefore we have made this compilation and present it to the community for its information and enlightenment.

  3. Proceedings of RIKEN BNL Research Center Workshop, Volume 91, RBRC Scientific Review Committee Meeting

    Energy Technology Data Exchange (ETDEWEB)

    Samios,N.P.

    2008-11-17

    The ninth evaluation of the RIKEN BNL Research Center (RBRC) took place on Nov. 17-18, 2008, at Brookhaven National Laboratory. The members of the Scientific Review Committee (SRC) were Dr. Wit Busza (Chair), Dr. Miklos Gyulassy, Dr. Akira Masaike, Dr. Richard Milner, Dr. Alfred Mueller, and Dr. Akira Ukawa. We are pleased that Dr. Yasushige Yano, the Director of the Nishina Institute of RIKEN, Japan, participated in this meeting, both informing the committee of the activities of the Nishina Institute and the role of RBRC and serving as an observer of this review. In order to illustrate the breadth and scope of the RBRC program, each member of the Center made a presentation on his/her research efforts. This encompassed three major areas of investigation: theoretical, experimental and computational physics. In addition, the committee met privately with the fellows and postdocs to ascertain their opinions and concerns. Although the main purpose of this review is a report to RIKEN Management (Dr. Ryoji Noyori, RIKEN President) on the health, scientific value, management and future prospects of the Center, the RBRC management felt that a compendium of the scientific presentations is of sufficient quality and interest to warrant a wider distribution. Therefore we have made this compilation and present it to the community for its information and enlightenment.

  4. A training program for scientific supercomputing users

    Energy Technology Data Exchange (ETDEWEB)

    Hanson, F.; Moher, T.; Sabelli, N.; Solem, A.

    1988-01-01

    There is a need for a mechanism to transfer supercomputing technology into the hands of scientists and engineers in such a way that they will acquire a foundation of knowledge that will permit integration of supercomputing as a tool in their research. Most computing center training emphasizes computer-specific information about how to use a particular computer system; most academic programs teach concepts to computer scientists. Only a few brief courses and new programs are designed for computational scientists. This paper describes an eleven-week training program aimed principally at graduate and postdoctoral students in computationally intensive fields. The program is designed to balance the specificity of computing center courses, the abstractness of computer science courses, and the personal contact of traditional apprentice approaches. It is based on the experience of computer scientists and computational scientists, and consists of seminars and clinics given by many visiting and local faculty. It covers a variety of supercomputing concepts, issues, and practices related to architecture, operating systems, software design, numerical considerations, code optimization, graphics, communications, and networks. Its research component encourages understanding of scientific computing and supercomputer hardware issues. Flexibility in thinking about computing needs is emphasized by the use of several different supercomputer architectures, such as the Cray X-MP/48 at the National Center for Supercomputing Applications at the University of Illinois at Urbana-Champaign, the IBM 3090-600E/VF at the Cornell National Supercomputer Facility, and the Alliant FX/8 at the Advanced Computing Research Facility at Argonne National Laboratory. 11 refs., 6 tabs.

  5. PROCEEDINGS OF RIKEN BNL RESEARCH CENTER WORKSHOP: VOLUME 69 RBRC SCIENTIFIC REVIEW COMMITTEE MEETING

    International Nuclear Information System (INIS)

    SAMIOS, N.P.

    2005-01-01

    The RIKEN BNL Research Center (RBRC) was established in April 1997 at Brookhaven National Laboratory. It is funded by the 'Rikagaku Kenkyusho' (RIKEN, The Institute of Physical and Chemical Research) of Japan. The Center is dedicated to the study of strong interactions, including spin physics, lattice QCD, and RHIC physics, through the nurturing of a new generation of young physicists. The RBRC has both a theory and an experimental component. At present the theoretical group has 4 Fellows and 3 Research Associates as well as 11 RHIC Physics/University Fellows (academic year 2003-2004). To date there are approximately 30 graduates from the program, of which 13 have attained tenure positions at major institutions worldwide. The experimental group is smaller and has 2 Fellows, 3 RHIC Physics/University Fellows and 3 Research Associates, and historically 6 individuals have attained permanent positions. Beginning in 2001 a new RIKEN Spin Program (RSP) category was implemented at RBRC. These appointments are joint positions of RBRC and RIKEN and include the following positions in theory and experiment: RSP Researchers, RSP Research Associates, and Young Researchers, who are mentored by senior RBRC Scientists. A number of RIKEN Jr. Research Associates and Visiting Scientists also contribute to the physics program at the Center. RBRC has an active workshop program on strong interaction physics, with each workshop focused on a specific physics problem. Each workshop speaker is encouraged to select a few of the most important transparencies from his or her presentation, accompanied by a page of explanation. This material is collected at the end of the workshop by the organizer to form proceedings, which can therefore be available within a short time. To date there are sixty-nine proceedings volumes available. The construction of a 0.6 teraflops parallel processor, dedicated to lattice QCD, begun at the Center on February 19, 1998, was completed on August 28, 1998 and is still

  6. Scientific articles of the RBRC/CCAST Symposium on Spin Physics Lattice QCD and RHIC Physics

    International Nuclear Information System (INIS)

    2003-01-01

    This volume comprises scientific articles of the symposium on spin physics, lattice QCD and RHIC physics organized by the RIKEN BNL Research Center (RBRC) and the China Center of Advanced Science and Technology (CCAST). The talks discussed the spin structure of nucleons and other problems in RHIC physics.

  7. Japanese supercomputer technology

    International Nuclear Information System (INIS)

    Buzbee, B.L.; Ewald, R.H.; Worlton, W.J.

    1982-01-01

    In February 1982, computer scientists from the Los Alamos National Laboratory and Lawrence Livermore National Laboratory visited several Japanese computer manufacturers. The purpose of these visits was to assess the state of the art of Japanese supercomputer technology and to advise Japanese computer vendors of the needs of the US Department of Energy (DOE) for more powerful supercomputers. The Japanese foresee a domestic need for large-scale computing capabilities for nuclear fusion, image analysis for the Earth Resources Satellite, meteorological forecasting, electrical power system analysis (power flow, stability, optimization), structural and thermal analysis of satellites, and very large scale integrated circuit design and simulation. To meet this need, Japan has launched an ambitious program to advance supercomputer technology. This program is described.

  8. Supercomputational science

    CERN Document Server

    Wilson, S

    1990-01-01

    In contemporary research, the supercomputer now ranks, along with radio telescopes, particle accelerators and the other apparatus of "big science", as an expensive resource, which is nevertheless essential for state of the art research. Supercomputers are usually provided as shared central facilities. However, unlike telescopes and accelerators, they find a wide range of applications which extends across a broad spectrum of research activity. The difference in performance between a "good" and a "bad" computer program on a traditional serial computer may be a factor of two or three, but on a contemporary supercomputer it can easily be a factor of one hundred or even more! Furthermore, this factor is likely to increase with future generations of machines. In keeping with the large capital and recurrent costs of these machines, it is appropriate to devote effort to training and familiarization so that supercomputers are employed to best effect. This volume records the lectures delivered at a Summer School ...

  9. SCIENTIFIC PRESENTATIONS of the 11. MEETING OF THE MANAGEMENT STEERING COMMITTEE OF THE RIKEN BNL COLLABORATION (RBRC SCIENTIFIC ARTICLE, VOLUME 11)

    International Nuclear Information System (INIS)

    Samios, N.P.

    2005-01-01

    The RIKEN BNL Research Center (RBRC) was established in April 1997 at Brookhaven National Laboratory. It is funded by the 'Rikagaku Kenkyusho' (RIKEN, The Institute of Physical and Chemical Research) of Japan. The Center is dedicated to the study of strong interactions, including hard QCD/spin physics, lattice QCD and RHIC (Relativistic Heavy Ion Collider) physics, through the nurturing of a new generation of young physicists. The agreement was extended in 2002 for another five-year period. This 11th steering group meeting consisted of a series of reports on current activities and future perspectives. Presentation titles and authors included: 'RBRC operations and accomplishments' by Nicholas P. Samios, 'Theoretical physics at RIKEN-BNL Center: strong interactions and QCD' by Larry McLerran, 'RBRC experimental group and Wako base' by Hideto En'yo, 'The QCDOC project overview and status' by Norman H. Christ, 'RHIC spin physics' by Gerry Bunce, 'RHIC heavy ion program' by Yasuyuki Akiba, 'RIKEN's current status and future plans' by Samuel Aronson, 'Procedure for proposing renewal of the collaboration agreement in 2007' by Chiharu Shimoyamada, and 'New direction of RBRC beyond JFY 2007' by Nicholas P. Samios.

  10. Role of supercomputers in magnetic fusion and energy research programs

    International Nuclear Information System (INIS)

    Killeen, J.

    1985-06-01

    The importance of computer modeling in magnetic fusion (MFE) and energy research (ER) programs is discussed. The need for the most advanced supercomputers is described, and the role of the National Magnetic Fusion Energy Computer Center in meeting these needs is explained

  11. An assessment of worldwide supercomputer usage

    Energy Technology Data Exchange (ETDEWEB)

    Wasserman, H.J.; Simmons, M.L.; Hayes, A.H.

    1995-01-01

    This report provides a comparative study of advanced supercomputing usage in Japan and the United States as of Spring 1994. It is based on the findings of a group of US scientists whose careers have centered on programming, evaluating, and designing high-performance supercomputers for over ten years. The report is a follow-on to an assessment of supercomputing technology in Europe and Japan that was published in 1993. Whereas the previous study focused on supercomputer manufacturing capabilities, the primary focus of the current work was to compare where and how supercomputers are used. Research for this report was conducted through both literature studies and field research in Japan.

  12. SCIENTIFIC PRESENTATION. 7TH MEETING OF THE MANAGEMENT STEERING COMMITTEE OF THE RIKEN BNL COLLABORATION.

    Energy Technology Data Exchange (ETDEWEB)

    Lee, T.D.

    2001-02-13

    The RIKEN BNL Research Center (RBRC) was established in April 1997 at Brookhaven National Laboratory. It is funded by the 'Rikagaku Kenkyusho' (RIKEN, The Institute of Physical and Chemical Research) of Japan. The Center is dedicated to the study of strong interactions, including hard QCD/spin physics, lattice QCD and RHIC (Relativistic Heavy Ion Collider) physics, through the nurturing of a new generation of young physicists. The Director of RBRC is Professor T. D. Lee. The first years were dedicated to the establishment of a theory group. This has essentially been completed, consisting of Fellows, Postdocs, and RHIC Physics/University Fellows, with an active group of consultants. The center also organizes an extensive series of workshops on specific topics in strong interactions, with an accompanying series of published proceedings. In addition, a 0.6 teraflop parallel processor computer was constructed and has been operational since August 1998. It was awarded the Gordon Bell Prize for price/performance at Supercomputing 1998. An active experimental group centered around the spin physics program at RHIC has subsequently also been established at RBRC. It presently consists of five Fellows, one Postdoc and several scientific collaborators, with more appointments expected in the near future. Members and participants of RBRC on occasion develop articles such as this one, in the nature of a status report or a general review.

  13. SCIENTIFIC PRESENTATION. 7TH MEETING OF THE MANAGEMENT STEERING COMMITTEE OF THE RIKEN BNL COLLABORATION.

    Energy Technology Data Exchange (ETDEWEB)

    LEE,T.D.

    2001-02-13

    The RIKEN BNL Research Center (RBRC) was established in April 1997 at Brookhaven National Laboratory. It is funded by the 'Rikagaku Kenkyusho' (RIKEN, The Institute of Physical and Chemical Research) of Japan. The Center is dedicated to the study of strong interactions, including hard QCD/spin physics, lattice QCD and RHIC (Relativistic Heavy Ion Collider) physics, through the nurturing of a new generation of young physicists. The Director of RBRC is Professor T. D. Lee. The first years were dedicated to the establishment of a theory group. This has essentially been completed, consisting of Fellows, Postdocs, and RHIC Physics/University Fellows, with an active group of consultants. The center also organizes an extensive series of workshops on specific topics in strong interactions, with an accompanying series of published proceedings. In addition, a 0.6 teraflop parallel processor computer was constructed and has been operational since August 1998. It was awarded the Gordon Bell Prize for price/performance at Supercomputing 1998. An active experimental group centered around the spin physics program at RHIC has subsequently also been established at RBRC. It presently consists of five Fellows, one Postdoc and several scientific collaborators, with more appointments expected in the near future. Members and participants of RBRC on occasion develop articles such as this one, in the nature of a status report or a general review.

  14. Scientific presentation. 7th meeting of the management steering committee of the RIKEN BNL Collaboration

    International Nuclear Information System (INIS)

    Lee, T.D.

    2001-01-01

    The RIKEN BNL Research Center (RBRC) was established in April 1997 at Brookhaven National Laboratory. It is funded by the 'Rikagaku Kenkyusho' (RIKEN, The Institute of Physical and Chemical Research) of Japan. The Center is dedicated to the study of strong interactions, including hard QCD/spin physics, lattice QCD and RHIC (Relativistic Heavy Ion Collider) physics, through the nurturing of a new generation of young physicists. The Director of RBRC is Professor T. D. Lee. The first years were dedicated to the establishment of a theory group. This has essentially been completed, consisting of Fellows, Postdocs, and RHIC Physics/University Fellows, with an active group of consultants. The center also organizes an extensive series of workshops on specific topics in strong interactions, with an accompanying series of published proceedings. In addition, a 0.6 teraflop parallel processor computer was constructed and has been operational since August 1998. It was awarded the Gordon Bell Prize for price/performance at Supercomputing 1998. An active experimental group centered around the spin physics program at RHIC has subsequently also been established at RBRC. It presently consists of five Fellows, one Postdoc and several scientific collaborators, with more appointments expected in the near future. Members and participants of RBRC on occasion develop articles such as this one, in the nature of a status report or a general review.

  15. What is supercomputing?

    International Nuclear Information System (INIS)

    Asai, Kiyoshi

    1992-01-01

    Supercomputing means high-speed computation using a supercomputer. Supercomputers and the technical term 'supercomputing' have spread over the past ten years. The performances of the main computers installed so far in the Japan Atomic Energy Research Institute are compared. There are two methods to increase computing speed using existing circuit elements: the parallel processor system and the vector processor system. CRAY-1 was the first successful vector computer. Supercomputing technology was first applied to meteorological organizations in foreign countries, and to aviation and atomic energy research institutes in Japan. Supercomputing for atomic energy depends on the trend of technical development in atomic energy; its contents are divided into increasing the computing speed of existing simulation calculations and accelerating new technical developments in atomic energy. Examples of supercomputing in the Japan Atomic Energy Research Institute are reported. (K.I.)

  16. Adventures in supercomputing: An innovative program for high school teachers

    Energy Technology Data Exchange (ETDEWEB)

    Oliver, C.E.; Hicks, H.R.; Summers, B.G. [Oak Ridge National Lab., TN (United States); Staten, D.G. [Wartburg Central High School, TN (United States)

    1994-12-31

    Within the realm of education, seldom does an innovative program become available with the potential to change an educator's teaching methodology. Adventures in Supercomputing (AiS), sponsored by the U.S. Department of Energy (DOE), is such a program. It is a program for high school teachers that changes the teacher paradigm from a teacher-directed approach of teaching to a student-centered approach. "A student-centered classroom offers better opportunities for development of internal motivation, planning skills, goal setting and perseverance than does the traditional teacher-directed mode". Not only is the process of teaching changed, but the cross-curricula integration within the AiS materials is remarkable. Written from a teacher's perspective, this paper will describe the AiS program and its effects on teachers and students, primarily at Wartburg Central High School, in Wartburg, Tennessee. The AiS program in Tennessee is sponsored by Oak Ridge National Laboratory (ORNL).

  17. Advanced parallel processing with supercomputer architectures

    International Nuclear Information System (INIS)

    Hwang, K.

    1987-01-01

    This paper investigates advanced parallel processing techniques and innovative hardware/software architectures that can be applied to boost the performance of supercomputers. Critical issues on architectural choices, parallel languages, compiling techniques, resource management, concurrency control, programming environment, parallel algorithms, and performance enhancement methods are examined and the best answers are presented. The authors cover advanced processing techniques suitable for supercomputers, high-end mainframes, minisupers, and array processors. The coverage emphasizes vectorization, multitasking, multiprocessing, and distributed computing. In order to achieve these operation modes, parallel languages, smart compilers, synchronization mechanisms, load balancing methods, mapping parallel algorithms, operating system functions, application library, and multidiscipline interactions are investigated to ensure high performance. At the end, they assess the potentials of optical and neural technologies for developing future supercomputers

  18. Comments on the parallelization efficiency of the Sunway TaihuLight supercomputer

    OpenAIRE

    Végh, János

    2016-01-01

    In the world of supercomputers, the large number of processors makes it necessary to minimize the inefficiencies of parallelization, which appear as a sequential fraction of the program from the point of view of Amdahl's law. The recently suggested figure of merit is applied to the recently presented supercomputer, and the timeline of "Top 500" supercomputers is scrutinized using this metric. It is demonstrated that, in addition to the computing performance and power consumption, the new supercomputer i...
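
    For reference, Amdahl's law bounds the achievable speedup S(N) of a program with sequential fraction s when run on N processors (the symbols s and N here are generic notation, not taken from the paper):

      S(N) = \frac{1}{\,s + (1 - s)/N\,}, \qquad \lim_{N \to \infty} S(N) = \frac{1}{s}

    Even a sequential fraction as small as s = 0.001 caps the speedup near 1000 however many of a machine's millions of cores are used, which is why every inefficiency of parallelization can be treated as an effective addition to s.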

  19. Supercomputing - Use Cases, Advances, The Future (1/2)

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    Supercomputing has become a staple of science and the poster child for aggressive developments in silicon technology, energy efficiency and programming. In this series we examine the key components of supercomputing setups and the various advances – recent and past – that made headlines and delivered bigger and bigger machines. We also take a closer look at the future prospects of supercomputing, and the extent of its overlap with high throughput computing, in the context of main use cases ranging from oil exploration to market simulation. On the first day, we will focus on the history and theory of supercomputing, the top500 list and the hardware that makes supercomputers tick. Lecturer's short bio: Andrzej Nowak has 10 years of experience in computing technologies, primarily from CERN openlab and Intel. At CERN, he managed a research lab collaborating with Intel and was part of the openlab Chief Technology Office. Andrzej also worked closely and initiated projects with the private sector (e.g. HP an...

  20. Supercomputing - Use Cases, Advances, The Future (2/2)

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    Supercomputing has become a staple of science and the poster child for aggressive developments in silicon technology, energy efficiency and programming. In this series we examine the key components of supercomputing setups and the various advances – recent and past – that made headlines and delivered bigger and bigger machines. We also take a closer look at the future prospects of supercomputing, and the extent of its overlap with high throughput computing, in the context of main use cases ranging from oil exploration to market simulation. On the second day, we will focus on software and software paradigms driving supercomputers, workloads that need supercomputing treatment, advances in technology and possible future developments. Lecturer's short bio: Andrzej Nowak has 10 years of experience in computing technologies, primarily from CERN openlab and Intel. At CERN, he managed a research lab collaborating with Intel and was part of the openlab Chief Technology Office. Andrzej also worked closely and i...

  1. Adaptability of supercomputers to nuclear computations

    International Nuclear Information System (INIS)

    Asai, Kiyoshi; Ishiguro, Misako; Matsuura, Toshihiko.

    1983-01-01

    Recently, in the field of scientific and technical calculation, the usefulness of supercomputers represented by the CRAY-1 has been recognized, and they are utilized in various countries. The rapid computation of supercomputers is based on vector computation. The authors investigated the adaptability to vector computation of about 40 typical atomic energy codes over the past six years. Based on the results of this investigation, the adaptability of the vector computation capability of supercomputers to atomic energy codes, problems regarding their utilization, and future prospects are explained. The adaptability of individual calculation codes to vector computation depends largely on the algorithm and program structure used in the codes. The speed-up achieved with pipeline vector systems, the investigation carried out at the Japan Atomic Energy Research Institute and its results, and examples of vectorizing codes for atomic energy, environmental safety and nuclear fusion are reported. The speed-up factors for the 40 examples ranged from 1.5 to 9.0. It can be said that the adaptability of supercomputers to atomic energy codes is fairly good. (Kako, I.)
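
    As a minimal illustration of the loop structures that determine adaptability to vector computation (the function and array names below are invented for this sketch, not taken from any of the surveyed codes), the following C program contrasts a loop with independent iterations, which a vectorizing compiler can map onto pipeline vector or SIMD hardware, with a recurrence that cannot be vectorized directly:

      #include <stdio.h>

      #define N 1000000

      /* Element-wise update with no loop-carried dependence: every iteration is
       * independent, so a vectorizing compiler can map it onto pipelined vector
       * (or SIMD) units. */
      void axpy(double a, const double *x, double *y, int n)
      {
          for (int i = 0; i < n; i++)
              y[i] = a * x[i] + y[i];
      }

      /* Recurrence: y[i] depends on y[i-1], so this loop cannot be vectorized
       * directly; codes dominated by such patterns adapt poorly to vector units. */
      void prefix_sum(double *y, int n)
      {
          for (int i = 1; i < n; i++)
              y[i] += y[i - 1];
      }

      int main(void)
      {
          static double x[N], y[N];
          for (int i = 0; i < N; i++) { x[i] = 1.0; y[i] = 2.0; }
          axpy(3.0, x, y, N);       /* vectorizable */
          prefix_sum(y, N);         /* not directly vectorizable */
          printf("%f\n", y[N - 1]);
          return 0;
      }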

  2. KAUST Supercomputing Laboratory

    KAUST Repository

    Bailey, April Renee

    2011-11-15

    KAUST has partnered with IBM to establish a Supercomputing Research Center. KAUST is hosting the Shaheen supercomputer, named after the Arabian falcon famed for its swiftness of flight. This 16-rack IBM Blue Gene/P system is equipped with 4 gigabytes of memory per node and is capable of 222 teraflops, making the KAUST campus the site of one of the world's fastest supercomputers in an academic environment. KAUST is targeting petaflop capability within 3 years.

  3. KAUST Supercomputing Laboratory

    KAUST Repository

    Bailey, April Renee; Kaushik, Dinesh; Winfer, Andrew

    2011-01-01

    KAUST has partnered with IBM to establish a Supercomputing Research Center. KAUST is hosting the Shaheen supercomputer, named after the Arabian falcon famed for its swiftness of flight. This 16-rack IBM Blue Gene/P system is equipped with 4 gigabytes of memory per node and is capable of 222 teraflops, making the KAUST campus the site of one of the world's fastest supercomputers in an academic environment. KAUST is targeting petaflop capability within 3 years.

  4. Comprehensive efficiency analysis of supercomputer resource usage based on system monitoring data

    Science.gov (United States)

    Mamaeva, A. A.; Shaykhislamov, D. I.; Voevodin, Vad V.; Zhumatiy, S. A.

    2018-03-01

    One of the main problems of modern supercomputers is the low efficiency of their usage, which leads to significant idle time of computational resources and, in turn, slows down scientific research. This paper presents three approaches to studying the efficiency of supercomputer resource usage based on monitoring data analysis. The first approach performs an analysis of computing resource utilization statistics, which makes it possible to identify typical classes of programs, to explore the structure of the supercomputer job flow and to track overall trends in supercomputer behavior. The second approach is aimed specifically at analyzing off-the-shelf software packages and libraries installed on the supercomputer, since the efficiency of their usage is becoming an increasingly important factor for the efficient functioning of the entire supercomputer. Within the third approach, abnormal jobs – jobs with abnormally inefficient behavior that differs significantly from the standard behavior of the overall supercomputer job flow – are detected. For each approach, the results obtained in practice in the Supercomputer Center of Moscow State University are demonstrated.
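
    A minimal sketch of the third approach described above, flagging jobs whose resource utilization deviates strongly from the rest of the job flow (the sample data, the z-score-style test and the 1.5-sigma threshold are illustrative assumptions, not the detection method actually used at the Moscow State University center):

      #include <math.h>
      #include <stdio.h>

      /* Toy monitoring sample: average CPU utilization (0..1) of finished jobs. */
      static const double util[] = { 0.91, 0.88, 0.95, 0.90, 0.07, 0.93, 0.89, 0.12 };
      static const int njobs = sizeof(util) / sizeof(util[0]);

      int main(void)
      {
          double mean = 0.0, var = 0.0;
          for (int i = 0; i < njobs; i++) mean += util[i];
          mean /= njobs;
          for (int i = 0; i < njobs; i++) var += (util[i] - mean) * (util[i] - mean);
          double sd = sqrt(var / njobs);

          /* Flag jobs whose utilization lies far below the mean of the job flow
           * as candidates for abnormally inefficient behavior. */
          for (int i = 0; i < njobs; i++)
              if (util[i] < mean - 1.5 * sd)
                  printf("job %d looks abnormal (utilization %.2f, flow mean %.2f)\n",
                         i, util[i], mean);
          return 0;
      }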

  5. Enabling department-scale supercomputing

    Energy Technology Data Exchange (ETDEWEB)

    Greenberg, D.S.; Hart, W.E.; Phillips, C.A.

    1997-11-01

    The Department of Energy (DOE) national laboratories have one of the longest and most consistent histories of supercomputer use. The authors summarize the architecture of DOE's new supercomputers that are being built for the Accelerated Strategic Computing Initiative (ASCI). The authors then argue that in the near future scaled-down versions of these supercomputers with petaflop-per-weekend capabilities could become widely available to hundreds of research and engineering departments. The availability of such computational resources will allow simulation of physical phenomena to become a full-fledged third branch of scientific exploration, along with theory and experimentation. They describe the ASCI and other supercomputer applications at Sandia National Laboratories, and discuss which lessons learned from Sandia's long history of supercomputing can be applied in this new setting.

  6. Computational plasma physics and supercomputers

    International Nuclear Information System (INIS)

    Killeen, J.; McNamara, B.

    1984-09-01

    The Supercomputers of the 80's are introduced. They are 10 to 100 times more powerful than today's machines. The range of physics modeling in the fusion program is outlined. New machine architecture will influence particular codes, but parallel processing poses new coding difficulties. Increasing realism in simulations will require better numerics and more elaborate mathematics

  7. Computational plasma physics and supercomputers. Revision 1

    International Nuclear Information System (INIS)

    Killeen, J.; McNamara, B.

    1985-01-01

    The Supercomputers of the 80's are introduced. They are 10 to 100 times more powerful than today's machines. The range of physics modeling in the fusion program is outlined. New machine architecture will influence particular models, but parallel processing poses new programming difficulties. Increasing realism in simulations will require better numerics and more elaborate mathematical models

  8. Computational Dimensionalities of Global Supercomputing

    Directory of Open Access Journals (Sweden)

    Richard S. Segall

    2013-12-01

    This Invited Paper pertains to the subject of my Plenary Keynote Speech at the 17th World Multi-Conference on Systemics, Cybernetics and Informatics (WMSCI 2013), held in Orlando, Florida on July 9-12, 2013. The title of my Plenary Keynote Speech was: "Dimensionalities of Computation: from Global Supercomputing to Data, Text and Web Mining", but this Invited Paper will focus only on the "Computational Dimensionalities of Global Supercomputing" and is based upon a summary of the contents of several individual articles that have been previously written with myself as lead author and published in [75], [76], [77], [78], [79], [80] and [11]. The topics of the Plenary Speech included Overview of Current Research in Global Supercomputing [75], Open-Source Software Tools for Data Mining Analysis of Genomic and Spatial Images using High Performance Computing [76], Data Mining Supercomputing with SAS™ JMP® Genomics ([77], [79], [80]), and Visualization by Supercomputing Data Mining [81]. ______________________ [11.] Committee on the Future of Supercomputing, National Research Council (2003), The Future of Supercomputing: An Interim Report, ISBN-13: 978-0-309-09016-2, http://www.nap.edu/catalog/10784.html [75.] Segall, Richard S.; Zhang, Qingyu and Cook, Jeffrey S. (2013), "Overview of Current Research in Global Supercomputing", Proceedings of the Forty-Fourth Meeting of the Southwest Decision Sciences Institute (SWDSI), Albuquerque, NM, March 12-16, 2013. [76.] Segall, Richard S. and Zhang, Qingyu (2010), "Open-Source Software Tools for Data Mining Analysis of Genomic and Spatial Images using High Performance Computing", Proceedings of the 5th INFORMS Workshop on Data Mining and Health Informatics, Austin, TX, November 6, 2010. [77.] Segall, Richard S., Zhang, Qingyu and Pierce, Ryan M. (2010), "Data Mining Supercomputing with SAS™ JMP® Genomics: Research-in-Progress", Proceedings of the 2010 Conference on Applied Research in Information Technology, sponsored by

  9. Introduction to Reconfigurable Supercomputing

    CERN Document Server

    Lanzagorta, Marco; Rosenberg, Robert

    2010-01-01

    This book covers technologies, applications, tools, languages, procedures, advantages, and disadvantages of reconfigurable supercomputing using Field Programmable Gate Arrays (FPGAs). The target audience is the community of users of High Performance Computers (HPC) who may benefit from porting their applications into a reconfigurable environment. As such, this book is intended to guide the HPC user through the many algorithmic considerations, hardware alternatives, usability issues, programming languages, and design tools that need to be understood before embarking on the creation of reconfigur

  10. Status of supercomputers in the US

    International Nuclear Information System (INIS)

    Fernbach, S.

    1985-01-01

    Current supercomputers, that is, the Class VI machines which first became available in 1976, are being delivered in greater quantity than ever before. In addition, manufacturers are busily working on Class VII machines to be ready for delivery in CY 1987. Mainframes are being modified or designed to take on some features of the supercomputers, and new companies intent either on competing directly in the supercomputer arena or on providing entry-level systems from which to graduate to supercomputers are springing up everywhere. Even well-established organizations like IBM and CDC are adding machines with vector instructions to their repertoires. Japanese-manufactured supercomputers are also being introduced into the U.S. Will these begin to compete with those of U.S. manufacture? Are they truly competitive? It turns out that from both the hardware and software points of view they may be superior. We may be facing the same problems in supercomputers that we faced in video systems.

  11. Integration of Panda Workload Management System with supercomputers

    Science.gov (United States)

    De, K.; Jha, S.; Klimentov, A.; Maeno, T.; Mashinistov, R.; Nilsson, P.; Novikov, A.; Oleynik, D.; Panitkin, S.; Poyda, A.; Read, K. F.; Ryabinkin, E.; Teslyuk, A.; Velikhov, V.; Wells, J. C.; Wenaus, T.

    2016-09-01

    The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing on over 140 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3+ petaFLOPS, the next LHC data-taking runs will require more resources than Grid computing can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of the PanDA WMS with supercomputers in the United States, Europe and Russia (in particular with the Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF), the supercomputer at the National Research Center "Kurchatov Institute", IT4 in Ostrava, and others). The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and for local data management, with light-weight MPI wrappers to run single-threaded workloads in parallel on Titan's multi-core worker nodes. This implementation was tested with a variety of Monte Carlo workloads.
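
    A minimal sketch of the light-weight MPI wrapper idea mentioned above, in which each MPI rank launches one independent single-threaded payload so that many serial tasks fill a multi-core worker node (the "./payload" command and its arguments are placeholders, not part of the actual PanDA pilot framework):

      #include <mpi.h>
      #include <stdio.h>
      #include <stdlib.h>

      int main(int argc, char **argv)
      {
          int rank, size;
          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          /* Each rank runs one single-threaded payload on its own task index, so a
           * single MPI job fills all cores of a worker node with serial work.
           * "./payload" is a placeholder for the real simulation executable. */
          char cmd[256];
          snprintf(cmd, sizeof(cmd), "./payload --task %d --of %d", rank, size);
          int status = system(cmd);

          printf("rank %d finished payload with exit status %d\n", rank, status);
          MPI_Finalize();
          return 0;
      }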

  12. Supercomputing and related national projects in Japan

    International Nuclear Information System (INIS)

    Miura, Kenichi

    1985-01-01

    Japanese supercomputer development activities in the industry and research projects are outlined. Architecture, technology, software, and applications of Fujitsu's Vector Processor Systems are described as an example of Japanese supercomputers. Applications of supercomputers to high energy physics are also discussed. (orig.)

  13. Interactive real-time nuclear plant simulations on a UNIX based supercomputer

    International Nuclear Information System (INIS)

    Behling, S.R.

    1990-01-01

    Interactive real-time nuclear plant simulations are critically important to train nuclear power plant engineers and operators. In addition, real-time simulations can be used to test the validity and timing of plant technical specifications and operational procedures. To accurately and confidently simulate a nuclear power plant transient in real-time, sufficient computer resources must be available. Since some important transients cannot be simulated using preprogrammed responses or non-physical models, commonly used simulation techniques may not be adequate. However, the power of a supercomputer allows one to accurately calculate the behavior of nuclear power plants even during very complex transients. Many of these transients can be calculated in real-time or quicker on the fastest supercomputers. The concept of running interactive real-time nuclear power plant transients on a supercomputer has been tested. This paper describes the architecture of the simulation program, the techniques used to establish real-time synchronization, and other issues related to the use of supercomputers in a new and potentially very important area. (author)
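
    One common way to keep such a simulation synchronized with wall-clock time is to pace each time step against a monotonic real-time clock and sleep away any surplus; the POSIX C sketch below illustrates only this general pacing idea and is not the synchronization architecture of the program described in the abstract:

      #define _POSIX_C_SOURCE 200112L
      #include <stdio.h>
      #include <time.h>

      /* Advance the plant model by dt seconds of simulated time (placeholder). */
      static void advance_model(double dt) { (void)dt; }

      int main(void)
      {
          const double dt = 0.1;               /* simulated seconds per step */
          struct timespec next;
          clock_gettime(CLOCK_MONOTONIC, &next);

          for (int step = 0; step < 100; step++) {
              advance_model(dt);

              /* Schedule the next step exactly dt of wall-clock time later; if the
               * computation finished early, sleep off the remainder so simulated
               * time stays locked to real time. */
              next.tv_nsec += (long)(dt * 1e9);
              while (next.tv_nsec >= 1000000000L) {
                  next.tv_nsec -= 1000000000L;
                  next.tv_sec += 1;
              }
              clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
          }
          printf("simulated %.1f s of plant time in real time\n", 100 * dt);
          return 0;
      }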

  14. Porting Ordinary Applications to Blue Gene/Q Supercomputers

    Energy Technology Data Exchange (ETDEWEB)

    Maheshwari, Ketan C.; Wozniak, Justin M.; Armstrong, Timothy; Katz, Daniel S.; Binkowski, T. Andrew; Zhong, Xiaoliang; Heinonen, Olle; Karpeyev, Dmitry; Wilde, Michael

    2015-08-31

    Efficiently porting ordinary applications to Blue Gene/Q supercomputers is a significant challenge. Codes are often originally developed without considering advanced architectures and related tool chains. Science needs frequently lead users to want to run large numbers of relatively small jobs (often called many-task computing, an ensemble, or a workflow), which can conflict with supercomputer configurations. In this paper, we discuss techniques developed to execute ordinary applications on leadership-class supercomputers. We use the high-performance Swift parallel scripting framework and build two workflow execution techniques: sub-jobs and main-wrap. The sub-jobs technique, built on top of the IBM Blue Gene/Q resource manager Cobalt's sub-block jobs, lets users submit multiple, independent, repeated smaller jobs within a single larger resource block. The main-wrap technique is a scheme that enables C/C++ programs to be defined as functions that are wrapped by a high-performance Swift wrapper and that are invoked as a Swift script. We discuss the needs, benefits, technicalities, and current limitations of these techniques. We further discuss the real-world science enabled by these techniques and the results obtained.
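
    The main-wrap technique described above amounts to refactoring an application's main() into an ordinary function that a wrapper can invoke many times with different inputs. A minimal C sketch of that refactoring follows (the function and file names are invented, and the real scheme invokes the wrapped function from a Swift script rather than from the plain C driver shown here):

      #include <stdio.h>

      /* Formerly the application's main(): now an ordinary function that a wrapper
       * can invoke once per task with different inputs. */
      int app_main(const char *input, const char *output)
      {
          /* ... read input, compute, write output ... */
          printf("processed %s -> %s\n", input, output);
          return 0;
      }

      /* Trivial driver standing in for the high-level wrapper: it calls the wrapped
       * application for several independent tasks inside one resource allocation. */
      int main(void)
      {
          char in[64], out[64];
          for (int task = 0; task < 4; task++) {
              snprintf(in, sizeof(in), "task%d.in", task);
              snprintf(out, sizeof(out), "task%d.out", task);
              app_main(in, out);
          }
          return 0;
      }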

  15. TOP500 Supercomputers for June 2004

    Energy Technology Data Exchange (ETDEWEB)

    Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack; Simon, Horst D.

    2004-06-23

    23rd Edition of TOP500 List of World's Fastest Supercomputers Released: Japan's Earth Simulator Enters Third Year in Top Position MANNHEIM, Germany; KNOXVILLE, Tenn.; BERKELEY, Calif. In what has become a closely watched event in the world of high-performance computing, the 23rd edition of the TOP500 list of the world's fastest supercomputers was released today (June 23, 2004) at the International Supercomputer Conference in Heidelberg, Germany.

  16. TOP500 Supercomputers for June 2005

    Energy Technology Data Exchange (ETDEWEB)

    Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack; Simon, Horst D.

    2005-06-22

    25th Edition of TOP500 List of World's Fastest Supercomputers Released: DOE/LLNL BlueGene/L and IBM gain Top Positions MANNHEIM, Germany; KNOXVILLE, Tenn.; BERKELEY, Calif. In what has become a closely watched event in the world of high-performance computing, the 25th edition of the TOP500 list of the world's fastest supercomputers was released today (June 22, 2005) at the 20th International Supercomputing Conference (ISC2005) in Heidelberg, Germany.

  17. Flux-Level Transit Injection Experiments with NASA Pleiades Supercomputer

    Science.gov (United States)

    Li, Jie; Burke, Christopher J.; Catanzarite, Joseph; Seader, Shawn; Haas, Michael R.; Batalha, Natalie; Henze, Christopher; Christiansen, Jessie; Kepler Project, NASA Advanced Supercomputing Division

    2016-06-01

    Flux-Level Transit Injection (FLTI) experiments are executed with NASA's Pleiades supercomputer for the Kepler Mission. The latest release (9.3, January 2016) of the Kepler Science Operations Center Pipeline is used in the FLTI experiments. Their purpose is to validate the Analytic Completeness Model (ACM), which can be computed for all Kepler target stars, thereby enabling exoplanet occurrence rate studies. Pleiades, a facility of NASA's Advanced Supercomputing Division, is one of the world's most powerful supercomputers and represents NASA's state-of-the-art technology. We discuss the details of implementing the FLTI experiments on the Pleiades supercomputer. For example, taking into account that ~16 injections are generated by one core of the Pleiades processors in an hour, the “shallow” FLTI experiment, in which ~2000 injections are required per target star, can be done for 16% of all Kepler target stars in about 200 hours. Stripping down the transit search to bare bones, i.e. only searching adjacent high/low periods at high/low pulse durations, makes the computationally intensive FLTI experiments affordable. The design of the FLTI experiments and the analysis of the resulting data are presented in “Validating an Analytic Completeness Model for Kepler Target Stars Based on Flux-level Transit Injection Experiments” by Catanzarite et al. (#2494058). Kepler was selected as the 10th mission of the Discovery Program. Funding for the Kepler Mission has been provided by the NASA Science Mission Directorate.
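
    The per-star cost implied by the figures quoted above follows from a single division (using the approximate rates given in the abstract):

      \frac{\sim 2000 \ \text{injections per star}}{\sim 16 \ \text{injections per core-hour}} \approx 125 \ \text{core-hours per target star}

    Completing 16% of the Kepler target stars within about 200 wall-clock hours therefore corresponds to keeping many thousands of Pleiades cores busy concurrently.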

  18. Guide to dataflow supercomputing basic concepts, case studies, and a detailed example

    CERN Document Server

    Milutinovic, Veljko; Trifunovic, Nemanja; Giorgi, Roberto

    2015-01-01

    This unique text/reference describes an exciting and novel approach to supercomputing in the DataFlow paradigm. The major advantages and applications of this approach are clearly described, and a detailed explanation of the programming model is provided using simple yet effective examples. The work is developed from a series of lecture courses taught by the authors in more than 40 universities across more than 20 countries, and from research carried out by Maxeler Technologies, Inc. Topics and features: presents a thorough introduction to DataFlow supercomputing for big data problems; revie

  19. Personal Supercomputing for Monte Carlo Simulation Using a GPU

    Energy Technology Data Exchange (ETDEWEB)

    Oh, Jae-Yong; Koo, Yang-Hyun; Lee, Byung-Ho [Korea Atomic Energy Research Institute, Daejeon (Korea, Republic of)

    2008-05-15

    Since the usability, accessibility, and maintenance of a personal computer (PC) are very good, a PC is a useful computer simulation tool for researchers. It has enough calculation power to simulate a small-scale system, given the improved performance of a PC's CPU. However, if a system is large or involves long time scales, we need a cluster computer or a supercomputer. Recently, great changes have occurred in the PC calculation environment. A graphics processing unit (GPU) on a graphics card, previously used only to calculate display data, has calculation capability superior to a PC's CPU. This GPU calculation performance matches that of a supercomputer in 2000. Although it has such great calculation potential, it is not easy to program a simulation code for a GPU due to the difficult programming techniques required for converting a calculation matrix to a 3D rendering image using graphics APIs. In 2006, NVIDIA provided a Software Development Kit (SDK) for the programming environment of NVIDIA's graphics cards, called the Compute Unified Device Architecture (CUDA). It makes programming on the GPU easy without knowledge of the graphics APIs. This paper describes the basic architectures of NVIDIA's GPU and CUDA, and carries out a performance benchmark for the Monte Carlo simulation.
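
    As a minimal, CPU-only sketch of the kind of Monte Carlo calculation that benefits from GPU acceleration (the classic estimation of pi by random sampling; this is a generic textbook example, not the benchmark used in the paper, and a CUDA version would distribute the sampling loop across many GPU threads):

      #include <stdio.h>
      #include <stdlib.h>

      int main(void)
      {
          const long n = 10000000;    /* number of random samples */
          long hits = 0;
          srand(12345);

          /* Sample points uniformly in the unit square and count how many fall
           * inside the quarter circle; pi is estimated as 4 * hits / n.  Each
           * sample is independent, which is what makes the method map well onto
           * thousands of GPU threads. */
          for (long i = 0; i < n; i++) {
              double x = (double)rand() / RAND_MAX;
              double y = (double)rand() / RAND_MAX;
              if (x * x + y * y <= 1.0)
                  hits++;
          }
          printf("pi is approximately %.6f\n", 4.0 * (double)hits / (double)n);
          return 0;
      }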

  20. Personal Supercomputing for Monte Carlo Simulation Using a GPU

    International Nuclear Information System (INIS)

    Oh, Jae-Yong; Koo, Yang-Hyun; Lee, Byung-Ho

    2008-01-01

    Since the usability, accessibility, and maintenance of a personal computer (PC) are very good, a PC is a useful computer simulation tool for researchers. It has enough calculation power to simulate a small-scale system, given the improved performance of a PC's CPU. However, if a system is large or involves long time scales, we need a cluster computer or a supercomputer. Recently, great changes have occurred in the PC calculation environment. A graphics processing unit (GPU) on a graphics card, previously used only to calculate display data, has calculation capability superior to a PC's CPU. This GPU calculation performance matches that of a supercomputer in 2000. Although it has such great calculation potential, it is not easy to program a simulation code for a GPU due to the difficult programming techniques required for converting a calculation matrix to a 3D rendering image using graphics APIs. In 2006, NVIDIA provided a Software Development Kit (SDK) for the programming environment of NVIDIA's graphics cards, called the Compute Unified Device Architecture (CUDA). It makes programming on the GPU easy without knowledge of the graphics APIs. This paper describes the basic architectures of NVIDIA's GPU and CUDA, and carries out a performance benchmark for the Monte Carlo simulation.

  1. TOP500 Supercomputers for November 2003

    Energy Technology Data Exchange (ETDEWEB)

    Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack; Simon, Horst D.

    2003-11-16

    22nd Edition of TOP500 List of World's Fastest Supercomputers Released MANNHEIM, Germany; KNOXVILLE, Tenn.; BERKELEY, Calif. In what has become a much-anticipated event in the world of high-performance computing, the 22nd edition of the TOP500 list of the world's fastest supercomputers was released today (November 16, 2003). The Earth Simulator supercomputer retains the number one position with its Linpack benchmark performance of 35.86 Tflop/s ('teraflops', or trillions of calculations per second). It was built by NEC and installed last year at the Earth Simulator Center in Yokohama, Japan.

  2. Proceedings of the first energy research power supercomputer users symposium

    International Nuclear Information System (INIS)

    1991-01-01

    The Energy Research Power Supercomputer Users Symposium was arranged to showcase the richness of science that has been pursued and accomplished in this program through the use of supercomputers and now high performance parallel computers over the last year: this report is the collection of the presentations given at the Symposium. 'Power users' were invited by the ER Supercomputer Access Committee to show that the use of these computational tools and the associated data communications network, ESNet, go beyond merely speeding up computations. Today the work often directly contributes to the advancement of the conceptual developments in their fields and the computational and network resources form the very infrastructure of today's science. The Symposium also provided an opportunity, which is rare in this day of network access to computing resources, for the invited users to compare and discuss their techniques and approaches with those used in other ER disciplines. The significance of new parallel architectures was highlighted by the interesting evening talk given by Dr. Stephen Orszag of Princeton University.

  3. INTEL: Intel based systems move up in supercomputing ranks

    CERN Multimedia

    2002-01-01

    "The TOP500 supercomputer rankings released today at the Supercomputing 2002 conference show a dramatic increase in the number of Intel-based systems being deployed in high-performance computing (HPC) or supercomputing areas" (1/2 page).

  4. World's fastest supercomputer opens up to users

    Science.gov (United States)

    Xin, Ling

    2016-08-01

    China's latest supercomputer - Sunway TaihuLight - has claimed the crown as the world's fastest computer according to the latest TOP500 list, released at the International Supercomputer Conference in Frankfurt in late June.

  5. OpenMP Performance on the Columbia Supercomputer

    Science.gov (United States)

    Haoqiang, Jin; Hood, Robert

    2005-01-01

    This presentation discusses the Columbia supercomputer, one of the world's fastest supercomputers, providing 61 TFLOPs (10/20/04). It was conceived, designed, built, and deployed in just 120 days as a 20-node supercomputer built on proven 512-processor nodes. It is the largest SGI system in the world, with over 10,000 Intel Itanium 2 processors; it provides the largest node size incorporating commodity parts (512) and the largest shared-memory environment (2048), and with 88% efficiency it tops the scalar systems on the Top500 list.
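
    For context, the shared-memory programming model exercised on machines like Columbia is commonly expressed with OpenMP work-sharing directives; the short C example below is generic and is not taken from the benchmarks discussed in the presentation:

      #include <omp.h>
      #include <stdio.h>

      #define N 10000000

      int main(void)
      {
          static double a[N];
          double sum = 0.0;

          /* Distribute the loop iterations across the threads of one shared-memory
           * node; the reduction clause combines the per-thread partial sums. */
          #pragma omp parallel for reduction(+:sum)
          for (int i = 0; i < N; i++) {
              a[i] = 0.5 * i;
              sum += a[i];
          }

          printf("threads=%d sum=%.1f\n", omp_get_max_threads(), sum);
          return 0;
      }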

  6. Centralized supercomputer support for magnetic fusion energy research

    International Nuclear Information System (INIS)

    Fuss, D.; Tull, G.G.

    1984-01-01

    High-speed computers with large memories are vital to magnetic fusion energy research. Magnetohydrodynamic (MHD), transport, equilibrium, Vlasov, particle, and Fokker-Planck codes that model plasma behavior play an important role in designing experimental hardware and interpreting the resulting data, as well as in advancing plasma theory itself. The size, architecture, and software of supercomputers to run these codes are often the crucial constraints on the benefits such computational modeling can provide. Hence, vector computers such as the CRAY-1 offer a valuable research resource. To meet the computational needs of the fusion program, the National Magnetic Fusion Energy Computer Center (NMFECC) was established in 1974 at the Lawrence Livermore National Laboratory. Supercomputers at the central computing facility are linked to smaller computer centers at each of the major fusion laboratories by a satellite communication network. In addition to providing large-scale computing, the NMFECC environment stimulates collaboration and the sharing of computer codes and data among the many fusion researchers in a cost-effective manner

  7. Extending ATLAS Computing to Commercial Clouds and Supercomputers

    CERN Document Server

    Nilsson, P; The ATLAS collaboration; Filipcic, A; Klimentov, A; Maeno, T; Oleynik, D; Panitkin, S; Wenaus, T; Wu, W

    2014-01-01

    The Large Hadron Collider will resume data collection in 2015 with substantially increased computing requirements relative to its first 2009-2013 run. A near doubling of the energy and the data rate, high levels of event pile-up, and detector upgrades will mean the number and complexity of events to be analyzed will increase dramatically. A naive extrapolation of the Run 1 experience would suggest that a 5-6 fold increase in computing resources is needed - impossible within the anticipated flat computing budgets in the near future. Consequently ATLAS is engaged in an ambitious program to expand its computing to all available resources, notably including opportunistic use of commercial clouds and supercomputers. Such resources present new challenges in managing heterogeneity, supporting data flows, parallelizing workflows, provisioning software, and other aspects of distributed computing, all while minimizing operational load. We will present the ATLAS experience to date with clouds and supercomputers, and des...

  8. Desktop supercomputer: what can it do?

    Science.gov (United States)

    Bogdanov, A.; Degtyarev, A.; Korkhov, V.

    2017-12-01

    The paper addresses the issues of solving complex problems that require using supercomputers or multiprocessor clusters, which are available to most researchers nowadays. Efficient distribution of high performance computing resources according to actual application needs has been a major research topic since high-performance computing (HPC) technologies became widely introduced. At the same time, comfortable and transparent access to these resources was a key user requirement. In this paper we discuss approaches to build a virtual private supercomputer available at the user's desktop: a virtual computing environment tailored specifically for a target user with a particular target application. We describe and evaluate possibilities to create the virtual supercomputer based on light-weight virtualization technologies, and analyze the efficiency of our approach compared to traditional methods of HPC resource management.

  9. FPS scientific and supercomputers computers in chemistry

    International Nuclear Information System (INIS)

    Curington, I.J.

    1987-01-01

    FPS Array Processors, scientific computers, and highly parallel supercomputers are used in nearly all aspects of compute-intensive computational chemistry. A survey is made of work utilizing this equipment, both published and current research. The relationship of the computer architecture to computational chemistry is discussed, with specific reference to Molecular Dynamics, Quantum Monte Carlo simulations, and Molecular Graphics applications. Recent installations of the FPS T-Series are highlighted, and examples of Molecular Graphics programs running on the FPS-5000 are shown

  10. Desktop supercomputer: what can it do?

    International Nuclear Information System (INIS)

    Bogdanov, A.; Degtyarev, A.; Korkhov, V.

    2017-01-01

    The paper addresses the issues of solving complex problems that require using supercomputers or multiprocessor clusters available for most researchers nowadays. Efficient distribution of high performance computing resources according to actual application needs has been a major research topic since high-performance computing (HPC) technologies became widely introduced. At the same time, comfortable and transparent access to these resources was a key user requirement. In this paper we discuss approaches to build a virtual private supercomputer available at user's desktop: a virtual computing environment tailored specifically for a target user with a particular target application. We describe and evaluate possibilities to create the virtual supercomputer based on light-weight virtualization technologies, and analyze the efficiency of our approach compared to traditional methods of HPC resource management.

  11. TOP500 Supercomputers for November 2004

    Energy Technology Data Exchange (ETDEWEB)

    Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack; Simon, Horst D.

    2004-11-08

    24th Edition of TOP500 List of World's Fastest Supercomputers Released: DOE/IBM BlueGene/L and NASA/SGI's Columbia gain Top Positions MANNHEIM, Germany; KNOXVILLE, Tenn.; BERKELEY, Calif. In what has become a closely watched event in the world of high-performance computing, the 24th edition of the TOP500 list of the world's fastest supercomputers was released today (November 8, 2004) at the SC2004 Conference in Pittsburgh, Pa.

  12. TOP500 Supercomputers for June 2003

    Energy Technology Data Exchange (ETDEWEB)

    Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack; Simon, Horst D.

    2003-06-23

    21st Edition of TOP500 List of World's Fastest Supercomputers Released MANNHEIM, Germany; KNOXVILLE, Tenn.; BERKELEY, Calif. In what has become a much-anticipated event in the world of high-performance computing, the 21st edition of the TOP500 list of the world's fastest supercomputers was released today (June 23, 2003). The Earth Simulator supercomputer built by NEC and installed last year at the Earth Simulator Center in Yokohama, Japan, with its Linpack benchmark performance of 35.86 Tflop/s (teraflops or trillions of calculations per second), retains the number one position. The number 2 position is held by the re-measured ASCI Q system at Los Alamos National Laboratory. With 13.88 Tflop/s, it is the second system ever to exceed the 10 Tflop/s mark. ASCI Q was built by Hewlett-Packard and is based on the AlphaServer SC computer system.

  13. TOP500 Supercomputers for June 2002

    Energy Technology Data Exchange (ETDEWEB)

    Strohmaier, Erich; Meuer, Hans W.; Dongarra, Jack; Simon, Horst D.

    2002-06-20

    19th Edition of TOP500 List of World's Fastest Supercomputers Released MANNHEIM, Germany; KNOXVILLE, Tenn.; BERKELEY, Calif. In what has become a much-anticipated event in the world of high-performance computing, the 19th edition of the TOP500 list of the world's fastest supercomputers was released today (June 20, 2002). The recently installed Earth Simulator supercomputer at the Earth Simulator Center in Yokohama, Japan, is as expected the clear new number 1. Its performance of 35.86 Tflop/s (trillions of calculations per second) running the Linpack benchmark is almost five times higher than the performance of the now No. 2 IBM ASCI White system at Lawrence Livermore National Laboratory (7.2 Tflop/s). This powerful leapfrogging to the top by a system so much faster than the previous top system is unparalleled in the history of the TOP500.

  14. Status reports of supercomputing astrophysics in Japan

    International Nuclear Information System (INIS)

    Nakamura, Takashi; Nagasawa, Mikio

    1990-01-01

    The Workshop on Supercomputing Astrophysics was held at the National Laboratory for High Energy Physics (KEK, Tsukuba) from August 31 to September 2, 1989. More than 40 participants, physicists and astronomers, attended and discussed many topics in an informal atmosphere. The main purpose of this workshop was to survey the theoretical activities in computational astrophysics in Japan. It also aimed to promote effective collaboration among numerical experimentalists working on supercomputing techniques. The presented papers, covering hydrodynamics, plasma physics, gravitating systems, radiative transfer and general relativity, are all stimulating. In fact, these numerical calculations have now become possible in Japan owing to the power of Japanese supercomputers such as the HITAC S820, Fujitsu VP400E and NEC SX-2. (J.P.N.)

  15. The ETA10 supercomputer system

    International Nuclear Information System (INIS)

    Swanson, C.D.

    1987-01-01

    The ETA Systems, Inc. ETA 10 is a next-generation supercomputer featuring multiprocessing, a large hierarchical memory system, high performance input/output, and network support for both batch and interactive processing. Advanced technology used in the ETA 10 includes liquid nitrogen cooled CMOS logic with 20,000 gates per chip, a single printed circuit board for each CPU, and high density static and dynamic MOS memory chips. Software for the ETA 10 includes an underlying kernel that supports multiple user environments, a new ETA FORTRAN compiler with an advanced automatic vectorizer, a multitasking library and debugging tools. Possible developments for future supercomputers from ETA Systems are discussed. (orig.)

  16. PROCEEDINGS OF RIKEN BNL RESEARCH CENTER WORKSHOP, VOLUME 72, RHIC SPIN COLLABORATION MEETINGS XXXI, XXXII, XXXIII

    International Nuclear Information System (INIS)

    OGAWA, A.

    2005-01-01

    The RIKEN BNL Research Center (RBRC) was established in April 1997 at Brookhaven National Laboratory. It is funded by the ''Rikagaku Kenkyusho'' (RIKEN, The Institute of Physical and Chemical Research) of Japan. The Center is dedicated to the study of strong interactions, including spin physics, lattice QCD, and RHIC physics through the nurturing of a new generation of young physicists. The RBRC has both a theory and experimental component. At present the theoretical group has 4 Fellows and 3 Research Associates as well as 11 RHIC Physics/University Fellows (academic year 2003-2004). To date there are approximately 30 graduates from the program of which 13 have attained tenure positions at major institutions worldwide. The experimental group is smaller and has 2 Fellows and 3 RHIC Physics/University Fellows and 3 Research Associates, and historically 6 individuals have attained permanent positions. Beginning in 2001 a new RIKEN Spin Program (RSP) category was implemented at RBRC. These appointments are joint positions of RBRC and RIKEN and include the following positions in theory and experiment: RSP Researchers, RSP Research Associates, and Young Researchers, who are mentored by senior RBRC Scientists. A number of RIKEN Jr. Research Associates and Visiting Scientists also contribute to the physics program at the Center. RBRC has an active workshop program on strong interaction physics with each workshop focused on a specific physics problem. Each workshop speaker is encouraged to select a few of the most important transparencies from his or her presentation, accompanied by a page of explanation. This material is collected at the end of the workshop by the organizer to form proceedings, which can therefore be available within a short time. To date there are seventy-two proceeding volumes available. The construction of a 0.6 teraflops parallel processor, dedicated to lattice QCD, begun at the Center on February 19, 1998, was completed on August 28, 1998 and is still

  17. Integration of Titan supercomputer at OLCF with ATLAS Production System

    CERN Document Server

    AUTHOR|(SzGeCERN)643806; The ATLAS collaboration; De, Kaushik; Klimentov, Alexei; Nilsson, Paul; Oleynik, Danila; Padolski, Siarhei; Panitkin, Sergey; Wenaus, Torre

    2017-01-01

    The PanDA (Production and Distributed Analysis) workload management system was developed to meet the scale and complexity of distributed computing for the ATLAS experiment. PanDA managed resources are distributed worldwide, on hundreds of computing sites, with thousands of physicists accessing hundreds of Petabytes of data, and the rate of data processing already exceeds an Exabyte per year. While PanDA currently uses more than 200,000 cores at well over 100 Grid sites, future LHC data taking runs will require more resources than Grid computing can possibly provide. Additional computing and storage resources are required. Therefore, ATLAS is engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. In this paper we will describe a project aimed at integration of the ATLAS Production System with the Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF). The current approach utilizes a modified PanDA Pilot framework for jo...

  18. Integration of Titan supercomputer at OLCF with ATLAS production system

    CERN Document Server

    Panitkin, Sergey; The ATLAS collaboration

    2016-01-01

    The PanDA (Production and Distributed Analysis) workload management system was developed to meet the scale and complexity of distributed computing for the ATLAS experiment. PanDA managed resources are distributed worldwide, on hundreds of computing sites, with thousands of physicists accessing hundreds of Petabytes of data, and the rate of data processing already exceeds an Exabyte per year. While PanDA currently uses more than 200,000 cores at well over 100 Grid sites, future LHC data taking runs will require more resources than Grid computing can possibly provide. Additional computing and storage resources are required. Therefore, ATLAS is engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. In this talk we will describe a project aimed at integration of the ATLAS Production System with the Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF). The current approach utilizes a modified PanDA Pilot framework for job...

  19. Supercomputer algorithms for reactivity, dynamics and kinetics of small molecules

    International Nuclear Information System (INIS)

    Lagana, A.

    1989-01-01

    Even for small systems, the accurate characterization of reactive processes is so demanding of computer resources as to suggest the use of supercomputers having vector and parallel facilities. The full advantages of vector and parallel architectures can sometimes be obtained by simply modifying existing programs, vectorizing the manipulation of vectors and matrices, and requiring the parallel execution of independent tasks. More often, however, a significant time saving can be obtained only when the computer code undergoes a deeper restructuring, requiring a change in the computational strategy or, more radically, the adoption of a different theoretical treatment. This book discusses supercomputer strategies based upon exact and approximate methods aimed at calculating the electronic structure and the reactive properties of small systems. The book shows how, in recent years, intense design activity has led to the ability to calculate accurate electronic structures for reactive systems, exact and high-level approximations to three-dimensional reactive dynamics, and to efficient directive and declarative software for the modelling of complex systems

  20. Applications of supercomputing and the utility industry: Calculation of power transfer capabilities

    International Nuclear Information System (INIS)

    Jensen, D.D.; Behling, S.R.; Betancourt, R.

    1990-01-01

    Numerical models and iterative simulation using supercomputers can furnish cost-effective answers to utility industry problems that are all but intractable using conventional computing equipment. An example of the use of supercomputers by the utility industry is the determination of power transfer capability limits for power transmission systems. This work has the goal of markedly reducing the run time of transient stability codes used to determine power distributions following major system disturbances. To date, run times of several hours on a conventional computer have been reduced to several minutes on state-of-the-art supercomputers, with further improvements anticipated to reduce run times to less than a minute. In spite of the potential advantages of supercomputers, few utilities have sufficient need for a dedicated in-house supercomputing capability. This problem is resolved using a supercomputer center serving a geographically distributed user base coupled via high speed communication networks

  1. Supercomputers to transform Science

    CERN Multimedia

    2006-01-01

    "New insights into the structure of space and time, climate modeling, and the design of novel drugs, are but a few of the many research areas that will be transforned by the installation of three supercomputers at the Unversity of Bristol." (1/2 page)

  2. Convex unwraps its first grown-up supercomputer

    Energy Technology Data Exchange (ETDEWEB)

    Manuel, T.

    1988-03-03

    Convex Computer Corp.'s new supercomputer family is even more of an industry blockbuster than its first system. At a tenfold jump in performance, it's far from just an incremental upgrade over its first minisupercomputer, the C-1. The heart of the new family, the new C-2 processor, churning at 50 million floating-point operations/s, spawns a group of systems whose performance could pass for some fancy supercomputers, namely those of the Cray Research Inc. family. When added to the C-1, Convex's five new supercomputers create the C series, a six-member product group offering a performance range from 20 to 200 Mflops. They mark an important transition for Convex from a one-product high-tech startup to a multinational company with a wide-ranging product line. It's a tough transition but the Richardson, Texas, company seems to be doing it. The extended product line propels Convex into the upper end of the minisupercomputer class and nudges it into the low end of the big supercomputers. It positions Convex in an uncrowded segment of the market in the $500,000 to $1 million range offering 50 to 200 Mflops of performance. The company is making this move because the minisuper area, which it pioneered, quickly became crowded with new vendors, causing prices and gross margins to drop drastically.

  3. Supercomputer debugging workshop 1991 proceedings

    Energy Technology Data Exchange (ETDEWEB)

    Brown, J.

    1991-01-01

    This report discusses the following topics on supercomputer debugging: Distributed debugging; user interface to debugging tools and standards; debugging optimized codes; debugging parallel codes; and debugger performance and interface as analysis tools. (LSP)

  4. Supercomputer debugging workshop 1991 proceedings

    Energy Technology Data Exchange (ETDEWEB)

    Brown, J.

    1991-12-31

    This report discusses the following topics on supercomputer debugging: Distributed debugging; user interface to debugging tools and standards; debugging optimized codes; debugging parallel codes; and debugger performance and interface as analysis tools. (LSP)

  5. Micro-mechanical Simulations of Soils using Massively Parallel Supercomputers

    Directory of Open Access Journals (Sweden)

    David W. Washington

    2004-06-01

    Full Text Available In this research a computer program, Trubal version 1.51, based on the Discrete Element Method, was converted to run on a Connection Machine (CM-5), a massively parallel supercomputer with 512 nodes, to expedite the computational times of simulating geotechnical boundary value problems. The dynamic memory algorithm in the Trubal program did not perform efficiently on the CM-2 machine with the Single Instruction Multiple Data (SIMD) architecture. This was due to the communication overhead involving global array reductions, global array broadcasts and random data movement. Therefore, the dynamic memory algorithm in the Trubal program was converted to a static memory arrangement, and the Trubal program was successfully converted to run on CM-5 machines. The converted program was called "TRUBAL for Parallel Machines (TPM)." Simulating two physical triaxial experiments and comparing the simulation results with Trubal simulations validated the TPM program. With a 512-node CM-5 machine, TPM produced a nine-fold speedup, demonstrating the inherent parallelism within algorithms based on the Discrete Element Method.

  6. Integration Of PanDA Workload Management System With Supercomputers for ATLAS and Data Intensive Science

    Energy Technology Data Exchange (ETDEWEB)

    De, K [University of Texas at Arlington; Jha, S [Rutgers University; Klimentov, A [Brookhaven National Laboratory (BNL); Maeno, T [Brookhaven National Laboratory (BNL); Nilsson, P [Brookhaven National Laboratory (BNL); Oleynik, D [University of Texas at Arlington; Panitkin, S [Brookhaven National Laboratory (BNL); Wells, Jack C [ORNL; Wenaus, T [Brookhaven National Laboratory (BNL)

    2016-01-01

    The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe, and were recently credited for the discovery of a Higgs boson. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing on over 150 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3 petaFLOPS, LHC data taking runs require more resources than Grid computing can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with supercomputers in the United States, Europe and Russia (in particular with the Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF), the MIRA supercomputer at the Argonne Leadership Computing Facility (ALCF), the supercomputer at the National Research Center Kurchatov Institute, IT4 in Ostrava, and others). The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and local data management, with light-weight MPI wrappers to run single-threaded workloads in parallel on the LCFs' multi-core worker nodes. This implementation
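
    The light-weight MPI wrapper idea mentioned in the record above can be illustrated with a minimal sketch: a single batch job starts one MPI rank per core, and each rank simply launches its own independent, single-threaded payload, so a leadership-class allocation is filled without the payload itself being MPI-aware. The payload name and its command-line convention below are hypothetical placeholders, not the actual PanDA pilot interface.

```c
/* Minimal sketch of a light-weight MPI wrapper: every rank runs one
 * independent, single-threaded payload.  The "./payload" command and
 * its arguments are hypothetical placeholders. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank = 0, status = 0;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each rank works on its own input slice, e.g. events_<rank>.dat. */
    char cmd[256];
    snprintf(cmd, sizeof(cmd),
             "./payload --input events_%d.dat --seed %d", rank, rank);

    status = system(cmd);            /* run the serial workload */
    if (status != 0)
        fprintf(stderr, "rank %d: payload exited with %d\n", rank, status);

    /* The batch job ends only when every serial task has finished. */
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
    return status == 0 ? 0 : 1;
}
```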

  7. The ETA systems plans for supercomputers

    International Nuclear Information System (INIS)

    Swanson, C.D.

    1987-01-01

    The ETA Systems ETA 10 is a class VII supercomputer featuring multiprocessing, a large hierarchical memory system, high performance input/output, and network support for both batch and interactive processing. Advanced technology used in the ETA 10 includes liquid nitrogen cooled CMOS logic with 20,000 gates per chip, a single printed circuit board for each CPU, and high density static and dynamic MOS memory chips. Software for the ETA 10 includes an underlying kernel that supports multiple user environments, a new ETA FORTRAN compiler with an advanced automatic vectorizer, a multitasking library and debugging tools. Possible developments for future supercomputers from ETA Systems are discussed

  8. Automatic discovery of the communication network topology for building a supercomputer model

    Science.gov (United States)

    Sobolev, Sergey; Stefanov, Konstantin; Voevodin, Vadim

    2016-10-01

    The Research Computing Center of Lomonosov Moscow State University is developing the Octotron software suite for automatic monitoring and mitigation of emergency situations in supercomputers so as to maximize hardware reliability. The suite is based on a software model of the supercomputer. The model uses a graph to describe the computing system components and their interconnections. One of the most complex components of a supercomputer that needs to be included in the model is its communication network. This work describes the proposed approach for automatically discovering the Ethernet communication network topology in a supercomputer and its description in terms of the Octotron model. This suite automatically detects computing nodes and switches, collects information about them and identifies their interconnections. The application of this approach is demonstrated on the "Lomonosov" and "Lomonosov-2" supercomputers.
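
    As a rough illustration of the graph description used by such a model, the sketch below stores discovered compute nodes and switches as labelled vertices and their cabling as edges; the names, fixed array sizes and manual edge insertion are assumptions made for brevity and do not reflect the actual Octotron data model or its discovery mechanism.

```c
/* Toy graph of a cluster's Ethernet topology: vertices are compute
 * nodes or switches, edges are discovered links.  Names and sizes are
 * illustrative only. */
#include <stdio.h>
#include <string.h>

#define MAX_VERTICES 64
#define MAX_EDGES    128

typedef enum { COMPUTE_NODE, SWITCH } VertexKind;

typedef struct { char name[32]; VertexKind kind; } Vertex;
typedef struct { int from; int to; } Edge;

static Vertex vertices[MAX_VERTICES];
static Edge   edges[MAX_EDGES];
static int    n_vertices, n_edges;

static int add_vertex(const char *name, VertexKind kind)
{
    strncpy(vertices[n_vertices].name, name, sizeof(vertices[0].name) - 1);
    vertices[n_vertices].kind = kind;
    return n_vertices++;
}

static void add_edge(int from, int to)
{
    edges[n_edges].from = from;
    edges[n_edges].to   = to;
    n_edges++;
}

int main(void)
{
    int sw = add_vertex("edge-switch-01", SWITCH);
    int n1 = add_vertex("node-001", COMPUTE_NODE);
    int n2 = add_vertex("node-002", COMPUTE_NODE);

    add_edge(n1, sw);   /* links as they might be reported by discovery */
    add_edge(n2, sw);

    for (int i = 0; i < n_edges; ++i)
        printf("%s <-> %s\n",
               vertices[edges[i].from].name, vertices[edges[i].to].name);
    return 0;
}
```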

  9. PNNL supercomputer to become largest computing resource on the Grid

    CERN Multimedia

    2002-01-01

    Hewlett Packard announced that the US DOE Pacific Northwest National Laboratory will connect a 9.3-teraflop HP supercomputer to the DOE Science Grid. This will be the largest supercomputer attached to a computer grid anywhere in the world (1 page).

  10. Integration Of PanDA Workload Management System With Supercomputers for ATLAS and Data Intensive Science

    Science.gov (United States)

    Klimentov, A.; De, K.; Jha, S.; Maeno, T.; Nilsson, P.; Oleynik, D.; Panitkin, S.; Wells, J.; Wenaus, T.

    2016-10-01

    The LHC, operating at CERN, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing on over 150 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3 petaFLOPS, LHC data taking runs require more resources than the Grid can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with supercomputers in the United States, in particular with the Titan supercomputer at the Oak Ridge Leadership Computing Facility. The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and local data management, with light-weight MPI wrappers to run single-threaded workloads in parallel on the LCFs' multi-core worker nodes. This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms for the ALICE and ATLAS experiments, and it has been in full production for ATLAS since September 2015. We will present our current accomplishments with running PanDA at supercomputers and demonstrate our ability to use PanDA as a portal independent of the

  11. Integration Of PanDA Workload Management System With Supercomputers for ATLAS and Data Intensive Science

    International Nuclear Information System (INIS)

    Klimentov, A; Maeno, T; Nilsson, P; Panitkin, S; Wenaus, T; De, K; Oleynik, D; Jha, S; Wells, J

    2016-01-01

    The LHC, operating at CERN, is leading Big Data driven scientific explorations. Experiments at the LHC explore the fundamental nature of matter and the basic forces that shape our universe. ATLAS, one of the largest collaborations ever assembled in the sciences, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. The ATLAS experiment uses the PanDA (Production and Data Analysis) Workload Management System for managing the workflow for all data processing on over 150 data centers. Through PanDA, ATLAS physicists see a single computing facility that enables rapid scientific breakthroughs for the experiment, even though the data centers are physically scattered all over the world. While PanDA currently uses more than 250,000 cores with a peak performance of 0.3 petaFLOPS, LHC data taking runs require more resources than the Grid can possibly provide. To alleviate these challenges, LHC experiments are engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with supercomputers in the United States, in particular with the Titan supercomputer at the Oak Ridge Leadership Computing Facility. The current approach utilizes a modified PanDA pilot framework for job submission to the supercomputers' batch queues and local data management, with light-weight MPI wrappers to run single-threaded workloads in parallel on the LCFs' multi-core worker nodes. This implementation was tested with a variety of Monte-Carlo workloads on several supercomputing platforms for the ALICE and ATLAS experiments, and it has been in full production for ATLAS since September 2015. We will present our current accomplishments with running PanDA at supercomputers and demonstrate our ability to use PanDA as a portal independent of the

  12. Dust modelling and forecasting in the Barcelona Supercomputing Center: Activities and developments

    Energy Technology Data Exchange (ETDEWEB)

    Perez, C; Baldasano, J M; Jimenez-Guerrero, P; Jorba, O; Haustein, K; Basart, S [Earth Sciences Department. Barcelona Supercomputing Center. Barcelona (Spain); Cuevas, E [Izanaa Atmospheric Research Center. Agencia Estatal de Meteorologia, Tenerife (Spain); Nickovic, S [Atmospheric Research and Environment Branch, World Meteorological Organization, Geneva (Switzerland)], E-mail: carlos.perez@bsc.es

    2009-03-01

    The Barcelona Supercomputing Center (BSC) is the National Supercomputer Facility in Spain, hosting MareNostrum, one of the most powerful Supercomputers in Europe. The Earth Sciences Department of BSC operates daily regional dust and air quality forecasts and conducts intensive modelling research for short-term operational prediction. This contribution summarizes the latest developments and current activities in the field of sand and dust storm modelling and forecasting.

  13. Dust modelling and forecasting in the Barcelona Supercomputing Center: Activities and developments

    International Nuclear Information System (INIS)

    Perez, C; Baldasano, J M; Jimenez-Guerrero, P; Jorba, O; Haustein, K; Basart, S; Cuevas, E; Nickovic, S

    2009-01-01

    The Barcelona Supercomputing Center (BSC) is the National Supercomputer Facility in Spain, hosting MareNostrum, one of the most powerful Supercomputers in Europe. The Earth Sciences Department of BSC operates daily regional dust and air quality forecasts and conducts intensive modelling research for short-term operational prediction. This contribution summarizes the latest developments and current activities in the field of sand and dust storm modelling and forecasting.

  14. Integration of PanDA workload management system with Titan supercomputer at OLCF

    CERN Document Server

    AUTHOR|(INSPIRE)INSPIRE-00300320; Klimentov, Alexei; Oleynik, Danila; Panitkin, Sergey; Petrosyan, Artem; Vaniachine, Alexandre; Wenaus, Torre; Schovancova, Jaroslava

    2015-01-01

    The PanDA (Production and Distributed Analysis) workload management system (WMS) was developed to meet the scale and complexity of LHC distributed computing for the ATLAS experiment. While PanDA currently distributes jobs to more than 100,000 cores at well over 100 Grid sites, the next LHC data taking run will require more resources than Grid computing can possibly provide. To alleviate these challenges, ATLAS is engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with the Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF). The current approach utilizes a modified PanDA pilot framework for job submission to Titan's batch queues and local data management, with light-weight MPI wrappers to run single-threaded workloads in parallel on Titan's multi-core worker nodes. It also gives PanDA the new capability to collect, in real time, information about unused...

  15. Integration of PanDA workload management system with Titan supercomputer at OLCF

    CERN Document Server

    Panitkin, Sergey; The ATLAS collaboration; Klimentov, Alexei; Oleynik, Danila; Petrosyan, Artem; Schovancova, Jaroslava; Vaniachine, Alexandre; Wenaus, Torre

    2015-01-01

    The PanDA (Production and Distributed Analysis) workload management system (WMS) was developed to meet the scale and complexity of LHC distributed computing for the ATLAS experiment. While PanDA currently uses more than 100,000 cores at well over 100 Grid sites with a peak performance of 0.3 petaFLOPS, next LHC data taking run will require more resources than Grid computing can possibly provide. To alleviate these challenges, ATLAS is engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We will describe a project aimed at integration of PanDA WMS with Titan supercomputer at Oak Ridge Leadership Computing Facility (OLCF). Current approach utilizes modified PanDA pilot framework for job submission to Titan's batch queues and local data management, with light-weight MPI wrappers to run single threaded workloads in parallel on Titan's multi-core worker nodes. It also gives PanDA new capability to collect, in real tim...

  16. Supercomputers Of The Future

    Science.gov (United States)

    Peterson, Victor L.; Kim, John; Holst, Terry L.; Deiwert, George S.; Cooper, David M.; Watson, Andrew B.; Bailey, F. Ron

    1992-01-01

    Report evaluates supercomputer needs of five key disciplines: turbulence physics, aerodynamics, aerothermodynamics, chemistry, and mathematical modeling of human vision. Predicts these fields will require computer speed greater than 10^18 floating-point operations per second (FLOPS) and memory capacity greater than 10^15 words. Also, new parallel computer architectures and new structured numerical methods will make the necessary speed and capacity available.

  17. Computational Science with the Titan Supercomputer: Early Outcomes and Lessons Learned

    Science.gov (United States)

    Wells, Jack

    2014-03-01

    Modeling and simulation with petascale computing has supercharged the process of innovation and understanding, dramatically accelerating time-to-insight and time-to-discovery. This presentation will focus on early outcomes from the Titan supercomputer at the Oak Ridge National Laboratory. Titan has over 18,000 hybrid compute nodes consisting of both CPUs and GPUs. In this presentation, I will discuss the lessons we have learned in deploying Titan and preparing applications to move from conventional CPU architectures to a hybrid machine. I will present early results of materials applications running on Titan and the implications for the research community as we prepare for exascale supercomputers in the next decade. Lastly, I will provide an overview of user programs at the Oak Ridge Leadership Computing Facility with specific information on how researchers may apply for allocations of computing resources. This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725.

  18. NASA Advanced Supercomputing Facility Expansion

    Science.gov (United States)

    Thigpen, William W.

    2017-01-01

    The NASA Advanced Supercomputing (NAS) Division enables advances in high-end computing technologies and in modeling and simulation methods to tackle some of the toughest science and engineering challenges facing NASA today. The name "NAS" has long been associated with leadership and innovation throughout the high-end computing (HEC) community. We play a significant role in shaping HEC standards and paradigms, and provide leadership in the areas of large-scale InfiniBand fabrics, Lustre open-source filesystems, and hyperwall technologies. We provide an integrated high-end computing environment to accelerate NASA missions and make revolutionary advances in science. Pleiades, a petaflop-scale supercomputer, is used by scientists throughout the U.S. to support NASA missions, and is ranked among the most powerful systems in the world. One of our key focus areas is in modeling and simulation to support NASA's real-world engineering applications and make fundamental advances in modeling and simulation methods.

  19. ATLAS Software Installation on Supercomputers

    CERN Document Server

    Undrus, Alexander; The ATLAS collaboration

    2018-01-01

    PowerPC and high performance computers (HPC) are important resources for computing in the ATLAS experiment. The future LHC data processing will require more resources than Grid computing, currently using approximately 100,000 cores at well over 100 sites, can provide. Supercomputers are extremely powerful as they use the resources of hundreds of thousands of CPUs joined together. However, their architectures have different instruction sets. ATLAS binary software distributions for x86 chipsets do not fit these architectures, as emulation of these chipsets results in a huge performance loss. This presentation describes the methodology of ATLAS software installation from source code on supercomputers. The installation procedure includes downloading the ATLAS code base as well as the source of about 50 external packages, such as ROOT and Geant4, followed by compilation, and rigorous unit and integration testing. The presentation reports the application of this procedure at Titan HPC and Summit PowerPC at Oak Ridge Computin...

  20. JINR supercomputer of the module type for event parallel analysis

    International Nuclear Information System (INIS)

    Kolpakov, I.F.; Senner, A.E.; Smirnov, V.A.

    1987-01-01

    A model of a supercomputer with 50 million operations per second is suggested. Its realization allows one to solve JINR data analysis problems for large spectrometers (in particular, for the DELPHI collaboration). The suggested modular supercomputer is based on commercially available 32-bit microprocessors with a processing rate of about 1 MFLOPS. The processors are combined by means of the VME standard bus. A MicroVAX-II host computer organizes the operation of the system. Data input and output are realized via the MicroVAX-II peripherals. Users' software is based on FORTRAN-77. The supercomputer is connected to a JINR network port, and all JINR users get access to the suggested system

  1. Supercomputers and quantum field theory

    International Nuclear Information System (INIS)

    Creutz, M.

    1985-01-01

    A review is given of why recent simulations of lattice gauge theories have resulted in substantial demands from particle theorists for supercomputer time. These calculations have yielded first principle results on non-perturbative aspects of the strong interactions. An algorithm for simulating dynamical quark fields is discussed. 14 refs

  2. Performance characteristics of hybrid MPI/OpenMP implementations of NAS parallel benchmarks SP and BT on large-scale multicore supercomputers

    KAUST Repository

    Wu, Xingfu; Taylor, Valerie

    2011-01-01

    The NAS Parallel Benchmarks (NPB) are well-known applications with fixed algorithms for evaluating parallel systems and tools. Multicore supercomputers provide a natural programming paradigm for hybrid programs, whereby OpenMP can be used for data sharing among the cores that comprise a node and MPI can be used for communication between nodes. In this paper, we use the SP and BT benchmarks of MPI NPB 3.3 as a basis for a comparative approach to implement hybrid MPI/OpenMP versions of SP and BT. In particular, we compare the performance of the hybrid SP and BT with their MPI counterparts on large-scale multicore supercomputers. Our performance results indicate that the hybrid SP outperforms the MPI SP by up to 20.76%, and the hybrid BT outperforms the MPI BT by up to 8.58% on up to 10,000 cores on BlueGene/P at Argonne National Laboratory and Jaguar (Cray XT4/5) at Oak Ridge National Laboratory. We also use performance tools and MPI trace libraries available on these supercomputers to further investigate the performance characteristics of the hybrid SP and BT.
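
    The hybrid pattern described here, MPI between nodes and OpenMP among the cores of a node, can be sketched as follows; the loop body is a placeholder and is not taken from the SP or BT benchmarks. A typical launch would start one MPI rank per node and set the number of OpenMP threads to the node's core count.

```c
/* Minimal hybrid MPI/OpenMP sketch: one MPI rank per node combines
 * results with other nodes, while OpenMP threads share the work on
 * that node's cores.  The computation itself is a placeholder. */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

#define N 1000000

static double local[N];   /* per-node work array (placeholder data) */

int main(int argc, char **argv)
{
    int provided, rank;
    double sum = 0.0, total = 0.0;

    /* FUNNELED: only the main thread makes MPI calls. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* On-node parallelism: OpenMP threads share the loop iterations. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; ++i) {
        local[i] = (double)(rank + i) * 1e-6;   /* placeholder computation */
        sum += local[i];
    }

    /* Inter-node parallelism: MPI combines the per-node partial sums. */
    MPI_Reduce(&sum, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (rank == 0)
        printf("global sum = %f, threads per rank = %d\n",
               total, omp_get_max_threads());

    MPI_Finalize();
    return 0;
}
```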

  3. Performance characteristics of hybrid MPI/OpenMP implementations of NAS parallel benchmarks SP and BT on large-scale multicore supercomputers

    KAUST Repository

    Wu, Xingfu

    2011-03-29

    The NAS Parallel Benchmarks (NPB) are well-known applications with fixed algorithms for evaluating parallel systems and tools. Multicore supercomputers provide a natural programming paradigm for hybrid programs, whereby OpenMP can be used for data sharing among the cores that comprise a node and MPI can be used for communication between nodes. In this paper, we use the SP and BT benchmarks of MPI NPB 3.3 as a basis for a comparative approach to implement hybrid MPI/OpenMP versions of SP and BT. In particular, we compare the performance of the hybrid SP and BT with their MPI counterparts on large-scale multicore supercomputers. Our performance results indicate that the hybrid SP outperforms the MPI SP by up to 20.76%, and the hybrid BT outperforms the MPI BT by up to 8.58% on up to 10,000 cores on BlueGene/P at Argonne National Laboratory and Jaguar (Cray XT4/5) at Oak Ridge National Laboratory. We also use performance tools and MPI trace libraries available on these supercomputers to further investigate the performance characteristics of the hybrid SP and BT.

  4. New generation of docking programs: Supercomputer validation of force fields and quantum-chemical methods for docking.

    Science.gov (United States)

    Sulimov, Alexey V; Kutov, Danil C; Katkova, Ekaterina V; Ilin, Ivan S; Sulimov, Vladimir B

    2017-11-01

    Discovery of new inhibitors of the protein associated with a given disease is the initial and most important stage of the whole process of the rational development of new pharmaceutical substances. New inhibitors block the active site of the target protein and the disease is cured. Computer-aided molecular modeling can considerably increase the effectiveness of new inhibitor development. Reliable prediction of target protein inhibition by a small molecule (ligand) is defined by the accuracy of docking programs. Such programs position a ligand in the target protein and estimate the protein-ligand binding energy. The positioning accuracy of modern docking programs is satisfactory. However, the accuracy of binding energy calculations is too low to predict good inhibitors. For effective application of docking programs to new inhibitor development, the accuracy of binding energy calculations should be better than 1 kcal/mol. Reasons for the limited accuracy of modern docking programs are discussed. One of the most important aspects limiting this accuracy is the imperfection of protein-ligand energy calculations. Results of supercomputer validation of several force fields and quantum-chemical methods for docking are presented. The validation was performed by quasi-docking as follows. First, the low-energy minima spectra of 16 protein-ligand complexes were found by exhaustive minima search in the MMFF94 force field. Second, the energies of the lowest 8192 minima were recalculated with the CHARMM force field and the PM6-D3H4X and PM7 quantum-chemical methods for each complex. The analysis of the minima energies reveals that the docking positioning accuracies of the PM7 and PM6-D3H4X quantum-chemical methods and the CHARMM force field are close to one another and are better than the positioning accuracy of the MMFF94 force field. Copyright © 2017 Elsevier Inc. All rights reserved.

  5. Supercomputer applications in nuclear research

    International Nuclear Information System (INIS)

    Ishiguro, Misako

    1992-01-01

    The utilization of supercomputers at the Japan Atomic Energy Research Institute is mainly reported. The fields of atomic energy research which use supercomputers frequently and the contents of their computations are outlined. Vectorization is briefly explained, and the discussion covers nuclear fusion, nuclear reactor physics, the thermal-hydraulic safety of nuclear reactors, the parallelism inherent in atomic energy computations such as fluid dynamics, algorithms for vector processing, and the speedup obtained by vectorization. At present the Japan Atomic Energy Research Institute uses two FACOM VP 2600/10 systems and three M-780 systems. The contents of computation changed from criticality computations around 1970, through the analysis of LOCA after the TMI accident, to nuclear fusion research, the design of new reactor types and reactor safety assessment at present. The method of using computers also advanced from batch processing to time-sharing processing, from one-dimensional to three-dimensional computation, from steady linear to unsteady nonlinear computation, from experimental analysis to numerical simulation, and so on. (K.I.)

  6. Getting To Exascale: Applying Novel Parallel Programming Models To Lab Applications For The Next Generation Of Supercomputers

    Energy Technology Data Exchange (ETDEWEB)

    Dube, Evi [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Shereda, Charles [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Nau, Lee [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Harris, Lance [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2010-09-27

    As supercomputing moves toward exascale, node architectures will change significantly. CPU core counts on nodes will increase by an order of magnitude or more. Heterogeneous architectures will become more commonplace, with GPUs or FPGAs providing additional computational power. Novel programming models may make better use of on-node parallelism in these new architectures than do current models. In this paper we examine several of these novel models (UPC, CUDA, and OpenCL) to determine their suitability to LLNL scientific application codes. Our study consisted of several phases: we conducted interviews with code teams and selected two codes to port; we learned how to program in the new models and ported the codes; we debugged and tuned the ported applications; we measured results and documented our findings. We conclude that UPC is a challenge for porting code, Berkeley UPC is not very robust, and UPC is not suitable as a general alternative to OpenMP for a number of reasons. CUDA is well supported and robust but is a proprietary NVIDIA standard, while OpenCL is an open standard. Both are well suited to a specific set of application problems that can be run on GPUs, but some problems are not suited to GPUs. Further study of the landscape of novel models is recommended.

  7. Mistral Supercomputer Job History Analysis

    OpenAIRE

    Zasadziński, Michał; Muntés-Mulero, Victor; Solé, Marc; Ludwig, Thomas

    2018-01-01

    In this technical report, we show insights and results of operational data analysis from petascale supercomputer Mistral, which is ranked as 42nd most powerful in the world as of January 2018. Data sources include hardware monitoring data, job scheduler history, topology, and hardware information. We explore job state sequences, spatial distribution, and electric power patterns.

  8. Parallel processor programs in the Federal Government

    Science.gov (United States)

    Schneck, P. B.; Austin, D.; Squires, S. L.; Lehmann, J.; Mizell, D.; Wallgren, K.

    1985-01-01

    In 1982, a report dealing with the nation's research needs in high-speed computing called for increased access to supercomputing resources for the research community, research in computational mathematics, and increased research in the technology base needed for the next generation of supercomputers. Since that time a number of programs addressing future generations of computers, particularly parallel processors, have been started by U.S. government agencies. The present paper provides a description of the largest government programs in parallel processing. Established in fiscal year 1985 by the Institute for Defense Analyses for the National Security Agency, the Supercomputing Research Center will pursue research to advance the state of the art in supercomputing. Attention is also given to the DOE applied mathematical sciences research program, the NYU Ultracomputer project, the DARPA multiprocessor system architectures program, NSF research on multiprocessor systems, ONR activities in parallel computing, and NASA parallel processor projects.

  9. Visualization on supercomputing platform level II ASC milestone (3537-1B) results from Sandia.

    Energy Technology Data Exchange (ETDEWEB)

    Geveci, Berk (Kitware, Inc., Clifton Park, NY); Fabian, Nathan; Marion, Patrick (Kitware, Inc., Clifton Park, NY); Moreland, Kenneth D.

    2010-09-01

    This report provides documentation for the completion of the Sandia portion of the ASC Level II Visualization on the platform milestone. This ASC Level II milestone is a joint milestone between Sandia National Laboratories and Los Alamos National Laboratories. This milestone contains functionality required for performing visualization directly on a supercomputing platform, which is necessary for peta-scale visualization. Sandia's contribution concerns in-situ visualization, running a visualization in tandem with a solver. Visualization and analysis of petascale data is limited by several factors which must be addressed as ACES delivers the Cielo platform. Two primary difficulties are: (1) Performance of interactive rendering, which is the most computationally intensive portion of the visualization process. For terascale platforms, commodity clusters with graphics processors (GPUs) have been used for interactive rendering. For petascale platforms, visualization and rendering may be able to run efficiently on the supercomputer platform itself. (2) I/O bandwidth, which limits how much information can be written to disk. If we simply analyze the sparse information that is saved to disk, we miss the opportunity to analyze the rich information produced every timestep by the simulation. For the first issue, we are pursuing in-situ analysis, in which simulations are coupled directly with analysis libraries at runtime. This milestone will evaluate the visualization and rendering performance of current and next generation supercomputers in contrast to GPU-based visualization clusters, and evaluate the performance of common analysis libraries coupled with the simulation that analyze and write data to disk during a running simulation. This milestone will explore, evaluate and advance the maturity level of these technologies and their applicability to problems of interest to the ASC program. Scientific simulation on parallel supercomputers is traditionally performed in four
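
    The in-situ idea described above, coupling the solver with analysis at runtime so that the raw data never has to reach the disk, can be reduced to a very small sketch; the function names and the trivial "analysis" below are placeholders and do not represent the libraries used in the milestone.

```c
/* Sketch of in-situ coupling: the solver hands its field to an analysis
 * hook every timestep, so data is reduced while still in memory and only
 * a small summary reaches the disk.  All names are placeholders. */
#include <stdio.h>

#define NCELLS 1024

/* Hypothetical analysis hook: here it only reports the field maximum;
 * a real in-situ library would render images or compute richer statistics. */
static void analyze(const double *field, int n, int step)
{
    double max = field[0];
    for (int i = 1; i < n; ++i)
        if (field[i] > max)
            max = field[i];
    printf("step %d: max = %f\n", step, max);
}

/* Placeholder solver update standing in for the real simulation. */
static void advance(double *field, int n, int step)
{
    for (int i = 0; i < n; ++i)
        field[i] = (double)((i + step) % 97);
}

int main(void)
{
    static double field[NCELLS];

    for (int step = 0; step < 10; ++step) {
        advance(field, NCELLS, step);    /* solver work */
        analyze(field, NCELLS, step);    /* in-situ analysis, no raw-data I/O */
    }
    return 0;
}
```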

  10. Extracting the Textual and Temporal Structure of Supercomputing Logs

    Energy Technology Data Exchange (ETDEWEB)

    Jain, S; Singh, I; Chandra, A; Zhang, Z; Bronevetsky, G

    2009-05-26

    Supercomputers are prone to frequent faults that adversely affect their performance, reliability and functionality. System logs collected on these systems are a valuable resource of information about their operational status and health. However, their massive size, complexity, and lack of standard format make it difficult to automatically extract information that can be used to improve system management. In this work we propose a novel method to succinctly represent the contents of supercomputing logs, by using textual clustering to automatically find the syntactic structures of log messages. This information is used to automatically classify messages into semantic groups via an online clustering algorithm. Further, we describe a methodology for using the temporal proximity between groups of log messages to identify correlated events in the system. We apply our proposed methods to two large, publicly available supercomputing logs and show that our technique features nearly perfect accuracy for online log-classification and extracts meaningful structural and temporal message patterns that can be used to improve the accuracy of other log analysis techniques.
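
    One simple way to picture the "syntactic structure" of a log message is to mask its variable fields (counters, node numbers, addresses) so that messages of the same type collapse onto one template, which can then be grouped and counted. The digit-masking sketch below is far cruder than the clustering method proposed in the paper and is only meant as an illustration.

```c
/* Crude illustration of log-template extraction: runs of digits are
 * masked so that messages differing only in counters or node numbers
 * map to the same template string. */
#include <ctype.h>
#include <stdio.h>

/* Replace every run of digits in 'msg' with a single '#'. */
static void make_template(const char *msg, char *out, size_t outsz)
{
    size_t j = 0;
    for (size_t i = 0; msg[i] != '\0' && j + 1 < outsz; ++i) {
        if (isdigit((unsigned char)msg[i])) {
            out[j++] = '#';
            while (isdigit((unsigned char)msg[i + 1]))
                ++i;                    /* skip the rest of the digit run */
        } else {
            out[j++] = msg[i];
        }
    }
    out[j] = '\0';
}

int main(void)
{
    const char *messages[] = {
        "node 17: ECC error count 4",
        "node 233: ECC error count 12",
        "link 8 down",
    };
    char tmpl[256];

    for (size_t i = 0; i < sizeof(messages) / sizeof(messages[0]); ++i) {
        make_template(messages[i], tmpl, sizeof(tmpl));
        printf("%s\n", tmpl);  /* identical templates mark the same message type */
    }
    return 0;
}
```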

  11. Unique Methodologies for Nano/Micro Manufacturing Job Training Via Desktop Supercomputer Modeling and Simulation

    Energy Technology Data Exchange (ETDEWEB)

    Kimball, Clyde [Northern Illinois Univ., DeKalb, IL (United States); Karonis, Nicholas [Northern Illinois Univ., DeKalb, IL (United States); Lurio, Laurence [Northern Illinois Univ., DeKalb, IL (United States); Piot, Philippe [Northern Illinois Univ., DeKalb, IL (United States); Xiao, Zhili [Northern Illinois Univ., DeKalb, IL (United States); Glatz, Andreas [Northern Illinois Univ., DeKalb, IL (United States); Pohlman, Nicholas [Northern Illinois Univ., DeKalb, IL (United States); Hou, Minmei [Northern Illinois Univ., DeKalb, IL (United States); Demir, Veysel [Northern Illinois Univ., DeKalb, IL (United States); Song, Jie [Northern Illinois Univ., DeKalb, IL (United States); Duffin, Kirk [Northern Illinois Univ., DeKalb, IL (United States); Johns, Mitrick [Northern Illinois Univ., DeKalb, IL (United States); Sims, Thomas [Northern Illinois Univ., DeKalb, IL (United States); Yin, Yanbin [Northern Illinois Univ., DeKalb, IL (United States)

    2012-11-21

    This project establishes an initiative in high speed (Teraflop)/large-memory desktop supercomputing for modeling and simulation of dynamic processes important for energy and industrial applications. It provides a training ground for employment of current students in an emerging field with skills necessary to access the large supercomputing systems now present at DOE laboratories. It also provides a foundation for NIU faculty to quantum leap beyond their current small cluster facilities. The funding extends faculty and student capability to a new level of analytic skills with concomitant publication avenues. The components of the Hewlett Packard computer obtained by the DOE funds create a hybrid combination of a Graphics Processing System (12 GPU/Teraflops) and a Beowulf CPU system (144 CPU), the first expandable via the NIU GAEA system to ~60 Teraflops integrated with a 720 CPU Beowulf system. The software is based on access to the NVIDIA/CUDA library and the ability through MATLAB multiple licenses to create additional local programs. A number of existing programs are being transferred to the CPU Beowulf Cluster. Since the expertise necessary to create the parallel processing applications has recently been obtained at NIU, this effort for software development is in an early stage. The educational program has been initiated via formal tutorials and classroom curricula designed for the coming year. Specifically, the cost focus was on hardware acquisitions and appointment of graduate students for a wide range of applications in engineering, physics and computer science.

  12. Proceedings of RIKEN BNL Research Center Workshop: Thermal Photons and Dileptons in Heavy-Ion Collisions. Volume 119

    Energy Technology Data Exchange (ETDEWEB)

    David, G. [Brookhaven National Laboratory (BNL), Upton, NY (United States); Rapp, R. [Brookhaven National Laboratory (BNL), Upton, NY (United States); Ruan, L. [Brookhaven National Laboratory (BNL), Upton, NY (United States); Yee, H-U. [Brookhaven National Laboratory (BNL), Upton, NY (United States)

    2014-09-11

    The RIKEN BNL Research Center (RBRC) was established in April 1997 at Brookhaven National Laboratory. It is funded by the ''Rikagaku Kenkyusho'' (RIKEN, The Institute of Physical and Chemical Research) of Japan and the U. S. Department of Energy’s Office of Science. The RBRC is dedicated to the study of strong interactions, including spin physics, lattice QCD, and RHIC physics through the nurturing of a new generation of young physicists. The RBRC has theory, lattice gauge computing and experimental components. It is presently exploring the possibility of an astrophysics component being added to the program. The primary theme for this workshop related to sharing the latest experimental and theoretical developments in area of low transverse momentum (pT) dielectron and photons. All the presentations given at the workshop are included in this proceedings, primarily as PowerPoint presentations.

  13. SUPERCOMPUTERS FOR AIDING ECONOMIC PROCESSES WITH REFERENCE TO THE FINANCIAL SECTOR

    Directory of Open Access Journals (Sweden)

    Jerzy Balicki

    2014-12-01

    Full Text Available The article discusses the use of supercomputers to support business processes, with particular emphasis on the financial sector. Reference is made to selected projects that support economic development. In particular, we propose the use of supercomputers to perform artificial intelligence methods in banking. The proposed methods, combined with modern technology, enable a significant increase in the competitiveness of enterprises and banks by adding new functionality.

  14. Symbolic simulation of engineering systems on a supercomputer

    International Nuclear Information System (INIS)

    Ragheb, M.; Gvillo, D.; Makowitz, H.

    1986-01-01

    Model-Based Production-Rule systems for analysis are developed for the symbolic simulation of complex engineering systems on a CRAY X-MP supercomputer. The Fault-Tree and Event-Tree Analysis methodologies from Systems Analysis are used for problem representation and are coupled to the Rule-Based System Paradigm from Knowledge Engineering to provide modelling of engineering devices. Modelling is based on knowledge of the structure and function of the device rather than on human expertise alone. To implement the methodology, we developed a Production-Rule Analysis System, HAL-1986, that uses both backward-chaining and forward-chaining. The inference engine uses an Induction-Deduction-Oriented antecedent-consequent logic and is programmed in Portable Standard Lisp (PSL). The inference engine is general and can accommodate general modifications and additions to the knowledge base. The methodologies used are demonstrated using a model for the identification of faults, and subsequent recovery from abnormal situations, in Nuclear Reactor Safety Analysis. The use of the exposed methodologies for the prognostication of future device responses under operational and accident conditions using coupled symbolic and procedural programming is discussed
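
    The forward-chaining half of such a production-rule system can be pictured with a minimal sketch: facts are asserted, and any rule whose antecedents are all known adds its consequent, repeating until nothing new is derived. The facts and rules below are invented for illustration and have no connection to HAL-1986 or PSL.

```c
/* Minimal forward-chaining sketch: apply rules until no new facts are
 * derived.  Facts and rules are invented for illustration only. */
#include <stdio.h>
#include <string.h>

#define MAX_FACTS 32

static const char *facts[MAX_FACTS];
static int n_facts;

static int known(const char *f)
{
    for (int i = 0; i < n_facts; ++i)
        if (strcmp(facts[i], f) == 0) return 1;
    return 0;
}

static int assert_fact(const char *f)
{
    if (known(f) || n_facts == MAX_FACTS) return 0;
    facts[n_facts++] = f;
    return 1;                              /* something new was derived */
}

typedef struct { const char *if1, *if2, *then; } Rule;

int main(void)
{
    /* "antecedent AND antecedent -> consequent" */
    Rule rules[] = {
        { "pump tripped", "valve closed", "coolant flow lost" },
        { "coolant flow lost", "power high", "initiate shutdown" },
    };

    assert_fact("pump tripped");
    assert_fact("valve closed");
    assert_fact("power high");

    int changed = 1;
    while (changed) {                      /* iterate to a fixed point */
        changed = 0;
        for (size_t r = 0; r < sizeof(rules) / sizeof(rules[0]); ++r)
            if (known(rules[r].if1) && known(rules[r].if2))
                changed |= assert_fact(rules[r].then);
    }

    for (int i = 0; i < n_facts; ++i)
        printf("fact: %s\n", facts[i]);
    return 0;
}
```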

  15. Exploiting Thread Parallelism for Ocean Modeling on Cray XC Supercomputers

    Energy Technology Data Exchange (ETDEWEB)

    Sarje, Abhinav [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Jacobsen, Douglas W. [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Williams, Samuel W. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Ringler, Todd [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Oliker, Leonid [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2016-05-01

    The incorporation of increasing core counts in modern processors used to build state-of-the-art supercomputers is driving application development towards the exploitation of thread parallelism, in addition to distributed memory parallelism, with the goal of delivering efficient high-performance codes. In this work we describe the exploitation of threading and our experiences with it in a real-world ocean modeling application code, MPAS-Ocean. We present detailed performance analysis and comparisons of various approaches and configurations for threading on the Cray XC series supercomputers.

  16. Visualization environment of the large-scale data of JAEA's supercomputer system

    Energy Technology Data Exchange (ETDEWEB)

    Sakamoto, Kensaku [Japan Atomic Energy Agency, Center for Computational Science and e-Systems, Tokai, Ibaraki (Japan); Hoshi, Yoshiyuki [Research Organization for Information Science and Technology (RIST), Tokai, Ibaraki (Japan)

    2013-11-15

    In research and development across various fields of nuclear energy, visualization of calculated data is especially useful for understanding simulation results in an intuitive way. Many researchers who run simulations on the supercomputer at the Japan Atomic Energy Agency (JAEA) routinely transfer calculated data files from the supercomputer to their local PCs for visualization. In recent years, as calculated data have grown larger with improvements in supercomputer performance, it has become necessary to reduce visualization processing time and to use the JAEA network efficiently. As a solution, we introduced a remote visualization system that can utilize parallel processors on the supercomputer and reduce network usage by transferring only intermediate visualization data. This paper reports a study on the performance of image processing with the remote visualization system. The visualization processing time is measured and the influence of network speed is evaluated by varying the drawing mode, the size of the visualization data and the number of processors. Based on this study, a guideline is provided to show how the remote visualization system can be used effectively. An upgrade policy for the next system is also presented. (author)

  17. Multi-petascale highly efficient parallel supercomputer

    Science.gov (United States)

    Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.; Blumrich, Matthias A.; Boyle, Peter; Brunheroto, Jose R.; Chen, Dong; Cher, Chen -Yong; Chiu, George L.; Christ, Norman; Coteus, Paul W.; Davis, Kristan D.; Dozsa, Gabor J.; Eichenberger, Alexandre E.; Eisley, Noel A.; Ellavsky, Matthew R.; Evans, Kahn C.; Fleischer, Bruce M.; Fox, Thomas W.; Gara, Alan; Giampapa, Mark E.; Gooding, Thomas M.; Gschwind, Michael K.; Gunnels, John A.; Hall, Shawn A.; Haring, Rudolf A.; Heidelberger, Philip; Inglett, Todd A.; Knudson, Brant L.; Kopcsay, Gerard V.; Kumar, Sameer; Mamidala, Amith R.; Marcella, James A.; Megerian, Mark G.; Miller, Douglas R.; Miller, Samuel J.; Muff, Adam J.; Mundy, Michael B.; O'Brien, John K.; O'Brien, Kathryn M.; Ohmacht, Martin; Parker, Jeffrey J.; Poole, Ruth J.; Ratterman, Joseph D.; Salapura, Valentina; Satterfield, David L.; Senger, Robert M.; Smith, Brian; Steinmacher-Burow, Burkhard; Stockdell, William M.; Stunkel, Craig B.; Sugavanam, Krishnan; Sugawara, Yutaka; Takken, Todd E.; Trager, Barry M.; Van Oosten, James L.; Wait, Charles D.; Walkup, Robert E.; Watson, Alfred T.; Wisniewski, Robert W.; Wu, Peng

    2015-07-14

    A multi-petascale, highly efficient parallel supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, that allows for a maximum packaging density of processing nodes from an interconnect point of view. The supercomputer exploits technological advances in VLSI that enable a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, each having full access to all system resources. This enables adaptive partitioning of the processors to functions such as compute or messaging I/O on an application-by-application basis, and preferably adaptive partitioning of functions in accordance with various algorithmic phases within an application; if I/O or other processors are underutilized, they can participate in computation or communication. Nodes are interconnected by a five-dimensional torus network with DMA that optimally maximizes the throughput of packet communications between nodes and minimizes latency.
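
    On a five-dimensional torus each node has two neighbors per dimension, obtained by incrementing or decrementing each coordinate with wrap-around (the two coincide when a dimension has extent 2). The sketch below shows only this index arithmetic, with assumed torus extents; it is not drawn from the patent.

      // Nearest neighbors of a node on a 5-D torus: +/-1 in each dimension with
      // wrap-around. Torus extents and coordinates are arbitrary illustrative values.
      #include <array>
      #include <iostream>

      int main() {
          const std::array<int, 5> extent = {4, 4, 4, 4, 2};   // assumed shape
          std::array<int, 5> node = {1, 0, 3, 2, 1};           // example coordinates

          for (int d = 0; d < 5; ++d) {
              for (int step : {-1, +1}) {
                  std::array<int, 5> nb = node;
                  nb[d] = (nb[d] + step + extent[d]) % extent[d];  // wrap around
                  std::cout << "neighbor in dim " << d << ": (";
                  for (int i = 0; i < 5; ++i)
                      std::cout << nb[i] << (i < 4 ? "," : ")\n");
              }
          }
      }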

  18. Integration of PanDA workload management system with Titan supercomputer at OLCF

    Science.gov (United States)

    De, K.; Klimentov, A.; Oleynik, D.; Panitkin, S.; Petrosyan, A.; Schovancova, J.; Vaniachine, A.; Wenaus, T.

    2015-12-01

    The PanDA (Production and Distributed Analysis) workload management system (WMS) was developed to meet the scale and complexity of LHC distributed computing for the ATLAS experiment. While PanDA currently distributes jobs to more than 100,000 cores at well over 100 Grid sites, the future LHC data-taking runs will require more resources than Grid computing can possibly provide. To alleviate these challenges, ATLAS is engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. We describe a project aimed at the integration of the PanDA WMS with the Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF). The current approach utilizes a modified PanDA pilot framework for job submission to Titan's batch queues and local data management, with light-weight MPI wrappers to run single-threaded workloads in parallel on Titan's multicore worker nodes. It also gives PanDA a new capability to collect, in real time, information about unused worker nodes on Titan, which allows precise definition of the size and duration of jobs submitted to Titan according to available free resources. This capability significantly reduces PanDA job wait time while improving Titan's utilization efficiency. This implementation was tested with a variety of Monte-Carlo workloads on Titan and is being tested on several other supercomputing platforms. Notice: This manuscript has been authored by employees of Brookhaven Science Associates, LLC under Contract No. DE-AC02-98CH10886 with the U.S. Department of Energy. The publisher by accepting the manuscript for publication acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to publish or reproduce the published form of this manuscript, or allow others to do so, for United States Government purposes.
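
    The light-weight MPI wrapper idea, one rank per core each driving an otherwise serial payload, can be sketched as below. This is only one way such a wrapper can look; the payload script name and its argument convention are hypothetical, not the actual PanDA pilot interface.

      // Sketch of an MPI wrapper that runs N independent serial workloads inside
      // one batch allocation: each rank invokes the payload with a rank-specific
      // argument. Payload name and flag are hypothetical.
      #include <mpi.h>
      #include <cstdlib>
      #include <iostream>
      #include <string>

      int main(int argc, char** argv) {
          MPI_Init(&argc, &argv);
          int rank = 0, size = 1;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);
          if (rank == 0) std::cout << "launching " << size << " payload instances\n";

          // Each rank processes its own event range / input file.
          std::string cmd = "./run_payload.sh --task-id " + std::to_string(rank);
          int status = std::system(cmd.c_str());
          if (status != 0)
              std::cerr << "rank " << rank << ": payload exited with " << status << "\n";

          MPI_Barrier(MPI_COMM_WORLD);     // wait until all payloads have finished
          MPI_Finalize();
          return 0;
      }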

  19. QCD on the BlueGene/L Supercomputer

    International Nuclear Information System (INIS)

    Bhanot, G.; Chen, D.; Gara, A.; Sexton, J.; Vranas, P.

    2005-01-01

    In June 2004 QCD was simulated for the first time at sustained speed exceeding 1 TeraFlops in the BlueGene/L supercomputer at the IBM T.J. Watson Research Lab. The implementation and performance of QCD in the BlueGene/L is presented

  20. QCD on the BlueGene/L Supercomputer

    Science.gov (United States)

    Bhanot, G.; Chen, D.; Gara, A.; Sexton, J.; Vranas, P.

    2005-03-01

    In June 2004 QCD was simulated for the first time at sustained speed exceeding 1 TeraFlops in the BlueGene/L supercomputer at the IBM T.J. Watson Research Lab. The implementation and performance of QCD in the BlueGene/L is presented.

  1. Development of seismic tomography software for hybrid supercomputers

    Science.gov (United States)

    Nikitin, Alexandr; Serdyukov, Alexandr; Duchkov, Anton

    2015-04-01

    Seismic tomography is a technique used for computing velocity model of geologic structure from first arrival travel times of seismic waves. The technique is used in processing of regional and global seismic data, in seismic exploration for prospecting and exploration of mineral and hydrocarbon deposits, and in seismic engineering for monitoring the condition of engineering structures and the surrounding host medium. As a consequence of development of seismic monitoring systems and increasing volume of seismic data, there is a growing need for new, more effective computational algorithms for use in seismic tomography applications with improved performance, accuracy and resolution. To achieve this goal, it is necessary to use modern high performance computing systems, such as supercomputers with hybrid architecture that use not only CPUs, but also accelerators and co-processors for computation. The goal of this research is the development of parallel seismic tomography algorithms and software package for such systems, to be used in processing of large volumes of seismic data (hundreds of gigabytes and more). These algorithms and software package will be optimized for the most common computing devices used in modern hybrid supercomputers, such as Intel Xeon CPUs, NVIDIA Tesla accelerators and Intel Xeon Phi co-processors. In this work, the following general scheme of seismic tomography is utilized. Using the eikonal equation solver, arrival times of seismic waves are computed based on assumed velocity model of geologic structure being analyzed. In order to solve the linearized inverse problem, tomographic matrix is computed that connects model adjustments with travel time residuals, and the resulting system of linear equations is regularized and solved to adjust the model. The effectiveness of parallel implementations of existing algorithms on target architectures is considered. During the first stage of this work, algorithms were developed for execution on
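
    The linearized inversion step described above, relating travel-time residuals to slowness adjustments through a regularized tomographic matrix, can be written as damped normal equations (G^T G + lambda I) dm = G^T r. The sketch below solves a deliberately tiny dense instance; sizes, values and the direct solve are illustrative only, since real problems are huge, sparse and solved iteratively.

      // Toy damped least-squares update for linearized travel-time tomography:
      // solve (G^T G + lambda*I) dm = G^T r for the model adjustment dm.
      // Matrix sizes and values are illustrative; real systems are huge and sparse.
      #include <iostream>

      int main() {
          const int nrays = 3, ncells = 2;
          double G[nrays][ncells] = {{1.0, 0.5}, {0.2, 1.1}, {0.9, 0.9}}; // ray lengths per cell
          double r[nrays] = {0.03, -0.01, 0.02};                          // residuals (s)
          const double lambda = 0.1;                                      // damping

          // Form A = G^T G + lambda*I and b = G^T r.
          double A[ncells][ncells] = {}, b[ncells] = {};
          for (int i = 0; i < ncells; ++i) {
              for (int j = 0; j < ncells; ++j)
                  for (int k = 0; k < nrays; ++k) A[i][j] += G[k][i] * G[k][j];
              A[i][i] += lambda;
              for (int k = 0; k < nrays; ++k) b[i] += G[k][i] * r[k];
          }

          // Solve the 2x2 system directly.
          double det = A[0][0] * A[1][1] - A[0][1] * A[1][0];
          double dm0 = ( A[1][1] * b[0] - A[0][1] * b[1]) / det;
          double dm1 = (-A[1][0] * b[0] + A[0][0] * b[1]) / det;
          std::cout << "slowness adjustments: " << dm0 << ", " << dm1 << "\n";
      }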

  2. Graphics supercomputer for computational fluid dynamics research

    Science.gov (United States)

    Liaw, Goang S.

    1994-11-01

    The objective of this project is to purchase a state-of-the-art graphics supercomputer to improve the Computational Fluid Dynamics (CFD) research capability at Alabama A & M University (AAMU) and to support the Air Force research projects. A cutting-edge graphics supercomputer system, Onyx VTX, from Silicon Graphics Computer Systems (SGI), was purchased and installed. Other equipment including a desktop personal computer, PC-486 DX2 with a built-in 10-BaseT Ethernet card, a 10-BaseT hub, an Apple Laser Printer Select 360, and a notebook computer from Zenith were also purchased. A reading room has been converted to a research computer lab by adding some furniture and an air conditioning unit in order to provide an appropriate working environment for researchers and the purchased equipment. All the purchased equipment was successfully installed and is fully functional. Several research projects, including two existing Air Force projects, are being performed using these facilities.

  3. Simulation of x-rays in refractive structure by the Monte Carlo method using the supercomputer SKIF

    International Nuclear Information System (INIS)

    Yaskevich, Yu.R.; Kravchenko, O.I.; Soroka, I.I.; Chembrovskij, A.G.; Kolesnik, A.S.; Serikova, N.V.; Petrov, P.V.; Kol'chevskij, N.N.

    2013-01-01

    The software 'Xray-SKIF' for the simulation of X-rays in refractive structures by the Monte Carlo method using the supercomputer SKIF BSU was developed. The program generates a large number of rays propagated from a source to the refractive structure. The ray trajectory is calculated under the assumption of geometrical optics. Absorption is calculated for each ray inside the refractive structure. Dynamic arrays are used to store the calculated ray parameters, which allows the X-ray field distribution to be restored very quickly for different detector positions. It was found that increasing the number of processors leads to a proportional decrease in calculation time: simulating 10^8 X-rays on the supercomputer with 1 and 30 processors takes 3 hours and 6 minutes, respectively. 10^9 X-rays were calculated with the 'Xray-SKIF' software, which allows the X-ray field after the refractive structure to be reconstructed with a spatial resolution of 1 micron. (authors)
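
    The per-ray absorption step amounts to accumulating the path length inside the absorbing material and applying the Beer-Lambert law. The sketch below shows that kernel for randomly sampled rays through a flat slab; the geometry and attenuation coefficient are assumed illustrative values, not the actual Xray-SKIF refractive-lens model.

      // Monte Carlo sketch: sample rays, compute the chord length through an
      // absorbing slab and attenuate via Beer-Lambert, I = I0 * exp(-mu * L).
      // Slab thickness and attenuation coefficient are assumed values.
      #include <cmath>
      #include <iostream>
      #include <random>

      int main() {
          const double mu = 2.5;              // attenuation coefficient, 1/mm (assumed)
          const double slab_thickness = 1.0;  // mm
          const int nrays = 1000000;

          std::mt19937 gen(42);
          std::uniform_real_distribution<double> angle(-0.01, 0.01);  // small beam divergence

          double transmitted = 0.0;
          for (int i = 0; i < nrays; ++i) {
              double theta = angle(gen);
              double path = slab_thickness / std::cos(theta);  // chord through the slab
              transmitted += std::exp(-mu * path);             // Beer-Lambert attenuation
          }
          std::cout << "mean transmission = " << transmitted / nrays << "\n";
      }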

  4. Feynman diagrams sampling for quantum field theories on the QPACE 2 supercomputer

    Energy Technology Data Exchange (ETDEWEB)

    Rappl, Florian

    2016-08-01

    This work discusses the application of Feynman diagram sampling in quantum field theories. The method uses a computer simulation to sample the diagrammatic space obtained in a series expansion. For running large physical simulations, powerful computers are obligatory, effectively splitting the thesis into two parts. The first part deals with the method of Feynman diagram sampling. Here the theoretical background of the method itself is discussed. Additionally, important statistical concepts and the theory of the strong force, quantum chromodynamics, are introduced. This sets the context of the simulations. We create and evaluate a variety of models to estimate the applicability of diagrammatic methods. The method is then applied to sample the perturbative expansion of the vertex correction. In the end we obtain the value for the anomalous magnetic moment of the electron. The second part looks at the QPACE 2 supercomputer. This includes a short introduction to supercomputers in general, as well as a closer look at the architecture and the cooling system of QPACE 2. Guiding benchmarks of the InfiniBand network are presented. At the core of this part, a collection of best practices and useful programming concepts are outlined, which enables the development of efficient, yet easily portable, applications for the QPACE 2 system.

  5. A workbench for tera-flop supercomputing

    International Nuclear Information System (INIS)

    Resch, M.M.; Kuester, U.; Mueller, M.S.; Lang, U.

    2003-01-01

    Supercomputers currently reach a peak performance in the range of TFlop/s. With but one exception - the Japanese Earth Simulator - none of these systems has so far been able to also show a level of sustained performance for a variety of applications that comes close to the peak performance. Sustained TFlop/s are therefore rarely seen. The reasons are manifold and are well known: Bandwidth and latency both for main memory and for the internal network are the key internal technical problems. Cache hierarchies with large caches can bring relief but are no remedy to the problem. However, there are not only technical problems that inhibit the full exploitation by scientists of the potential of modern supercomputers. More and more organizational issues come to the forefront. This paper shows the approach of the High Performance Computing Center Stuttgart (HLRS) to deliver a sustained performance of TFlop/s for a wide range of applications from a large group of users spread over Germany. The core of the concept is the role of the data. Around this we design a simulation workbench that hides the complexity of interacting computers, networks and file systems from the user. (authors)

  6. A visual analytics system for optimizing the performance of large-scale networks in supercomputing systems

    Directory of Open Access Journals (Sweden)

    Takanori Fujiwara

    2018-03-01

    Full Text Available The overall efficiency of an extreme-scale supercomputer largely relies on the performance of its network interconnects. Several state-of-the-art supercomputers use networks based on the increasingly popular Dragonfly topology. It is crucial to study the behavior and performance of different parallel applications running on Dragonfly networks in order to make optimal system configurations and design choices, such as job scheduling and routing strategies. However, in order to study this temporal network behavior, we need a tool to analyze and correlate numerous sets of multivariate time-series data collected from the Dragonfly’s multi-level hierarchies. This paper presents such a tool, a visual analytics system, that uses the Dragonfly network to investigate the temporal behavior and optimize the communication performance of a supercomputer. We coupled interactive visualization with time-series analysis methods to help reveal hidden patterns in the network behavior with respect to different parallel applications and system configurations. Our system also provides multiple coordinated views for connecting behaviors observed at different levels of the network hierarchies, which effectively helps visual analysis tasks. We demonstrate the effectiveness of the system with a set of case studies. Our system and findings can not only help improve the communication performance of supercomputing applications, but also the network performance of next-generation supercomputers. Keywords: Supercomputing, Parallel communication network, Dragonfly networks, Time-series data, Performance analysis, Visual analytics

  7. KfK seminar series on supercomputing and visualization from May till September 1992

    International Nuclear Information System (INIS)

    Hohenhinnebusch, W.

    1993-05-01

    During the period from May 1992 to September 1992, a series of seminars was held at KfK on several topics of supercomputing in different fields of application. The aim was to demonstrate the importance of supercomputing and visualization in numerical simulations of complex physical and technical phenomena. This report contains the collection of all submitted seminar papers. (orig./HP) [de

  8. Parallel adaptation of a vectorised quantumchemical program system

    International Nuclear Information System (INIS)

    Van Corler, L.C.H.; Van Lenthe, J.H.

    1987-01-01

    Supercomputers, like the CRAY 1 or the Cyber 205, have had, and still have, a marked influence on Quantum Chemistry. Vectorization has led to a considerable increase in the performance of Quantum Chemistry programs. However, clock-cycle times more than a factor of 10 smaller than those of present supercomputers are not to be expected. Therefore future supercomputers will have to depend on parallel structures. Recently, the first examples of such supercomputers have been installed. To be prepared for this new generation of (parallel) supercomputers one should consider the concepts one wants to use and the kind of problems one will encounter during implementation of existing vectorized programs on those parallel systems. The authors implemented four important parts of a large quantumchemical program system (ATMOL), i.e. integrals, SCF, 4-index and Direct-CI, in the parallel environment at ECSEC (Rome, Italy). This system offers simulated parallelism on the host computer (IBM 4381) and real parallelism on at most 10 attached processors (FPS-164). Quantumchemical programs usually handle large amounts of data and very large, often sparse matrices. The transfer of that many data can cause problems concerning communication and overhead, in view of which shared memory and shared disks must be considered. The strategy and the tools that were used to parallelise the programs are shown. Also, some examples are presented to illustrate the effectiveness and performance of the system in Rome for these types of calculations.

  9. The Pawsey Supercomputer geothermal cooling project

    Science.gov (United States)

    Regenauer-Lieb, K.; Horowitz, F.; Western Australian Geothermal Centre Of Excellence, T.

    2010-12-01

    The Australian Government has funded the Pawsey supercomputer in Perth, Western Australia, providing computational infrastructure intended to support the future operations of the Australian Square Kilometre Array radiotelescope and to boost next-generation computational geosciences in Australia. Supplementary funds have been directed to the development of a geothermal exploration well to research the potential for direct heat use applications at the Pawsey Centre site. Cooling the Pawsey supercomputer may be achieved by geothermal heat exchange rather than by conventional electrical power cooling, thus reducing the carbon footprint of the Pawsey Centre and demonstrating an innovative green technology that is widely applicable in industry and urban centres across the world. The exploration well is scheduled to be completed in 2013, with drilling due to commence in the third quarter of 2011. One year is allocated to finalizing the design of the exploration, monitoring and research well. Success in the geothermal exploration and research program will result in an industrial-scale geothermal cooling facility at the Pawsey Centre, and will provide a world-class student training environment in geothermal energy systems. A similar system is partially funded and in advanced planning to provide base-load air-conditioning for the main campus of the University of Western Australia. Both systems are expected to draw ~80-95 degrees C water from aquifers lying between 2000 and 3000 meters depth from naturally permeable rocks of the Perth sedimentary basin. The geothermal water will be run through absorption chilling devices, which only require heat (as opposed to mechanical work) to power a chilled water stream adequate to meet the cooling requirements. Once the heat has been removed from the geothermal water, licensing issues require the water to be re-injected back into the aquifer system. These systems are intended to demonstrate the feasibility of powering large-scale air

  10. Proceedings of RIKEN BNL Research Center Workshop

    Energy Technology Data Exchange (ETDEWEB)

    Samios, Nicholas P. [Brookhaven National Lab. (BNL), Upton, NY (United States)

    2013-01-24

    The twelfth evaluation of the RIKEN BNL Research Center (RBRC) took place on November 6 – 8, 2012 at Brookhaven National Laboratory. The members of the Scientific Review Committee (SRC), present at the meeting, were: Prof. Wit Busza, Prof. Miklos Gyulassy, Prof. Kenichi Imai, Prof. Richard Milner (Chair), Prof. Alfred Mueller, Prof. Charles Young Prescott, and Prof. Akira Ukawa. We are pleased that Dr. Hideto En’yo, the Director of the Nishina Institute of RIKEN, Japan, participated in this meeting both in informing the committee of the activities of the RIKEN Nishina Center for Accelerator- Based Science and the role of RBRC and as an observer of this review. In order to illustrate the breadth and scope of the RBRC program, each member of the Center made a presentation on his/her research efforts. This encompassed three major areas of investigation: theoretical, experimental and computational physics. In addition, the committee met privately with the fellows and postdocs to ascertain their opinions and concerns. Although the main purpose of this review is a report to RIKEN management on the health, scientific value, management and future prospects of the Center, the RBRC management felt that a compendium of the scientific presentations are of sufficient quality and interest that they warrant a wider distribution. Therefore we have made this compilation and present it to the community for its information and enlightenment.

  11. MILC Code Performance on High End CPU and GPU Supercomputer Clusters

    Science.gov (United States)

    DeTar, Carleton; Gottlieb, Steven; Li, Ruizi; Toussaint, Doug

    2018-03-01

    With recent developments in parallel supercomputing architecture, many core, multi-core, and GPU processors are now commonplace, resulting in more levels of parallelism, memory hierarchy, and programming complexity. It has been necessary to adapt the MILC code to these new processors starting with NVIDIA GPUs, and more recently, the Intel Xeon Phi processors. We report on our efforts to port and optimize our code for the Intel Knights Landing architecture. We consider performance of the MILC code with MPI and OpenMP, and optimizations with QOPQDP and QPhiX. For the latter approach, we concentrate on the staggered conjugate gradient and gauge force. We also consider performance on recent NVIDIA GPUs using the QUDA library.
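
    The staggered conjugate gradient named above is an iterative Krylov solver whose cost is dominated by a matrix-vector product plus a few vector updates per iteration. The sketch below shows plain CG on a tiny symmetric positive-definite matrix to illustrate that kernel structure; it is not the MILC staggered Dirac operator nor the QPhiX/QUDA implementations.

      // Plain conjugate gradient on a small SPD matrix, illustrating the kernel
      // structure (mat-vec, dot products, axpy) behind solvers such as the
      // staggered CG in lattice QCD codes. Matrix and vectors are toy values.
      #include <cmath>
      #include <iostream>
      #include <vector>

      using Vec = std::vector<double>;

      Vec matvec(const std::vector<Vec>& A, const Vec& x) {
          Vec y(x.size(), 0.0);
          for (size_t i = 0; i < A.size(); ++i)
              for (size_t j = 0; j < x.size(); ++j) y[i] += A[i][j] * x[j];
          return y;
      }

      double dot(const Vec& a, const Vec& b) {
          double s = 0.0;
          for (size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
          return s;
      }

      int main() {
          std::vector<Vec> A = {{4, 1, 0}, {1, 3, 1}, {0, 1, 2}};  // SPD toy matrix
          Vec b = {1, 2, 3}, x(3, 0.0);
          Vec r = b, p = r;                       // x0 = 0, so r0 = b
          double rs_old = dot(r, r);

          for (int it = 0; it < 100 && std::sqrt(rs_old) > 1e-10; ++it) {
              Vec Ap = matvec(A, p);
              double alpha = rs_old / dot(p, Ap);
              for (size_t i = 0; i < x.size(); ++i) { x[i] += alpha * p[i]; r[i] -= alpha * Ap[i]; }
              double rs_new = dot(r, r);
              for (size_t i = 0; i < p.size(); ++i) p[i] = r[i] + (rs_new / rs_old) * p[i];
              rs_old = rs_new;
          }
          std::cout << "x = " << x[0] << ", " << x[1] << ", " << x[2] << "\n";
      }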

  12. MILC Code Performance on High End CPU and GPU Supercomputer Clusters

    Directory of Open Access Journals (Sweden)

    DeTar Carleton

    2018-01-01

    Full Text Available With recent developments in parallel supercomputing architecture, many core, multi-core, and GPU processors are now commonplace, resulting in more levels of parallelism, memory hierarchy, and programming complexity. It has been necessary to adapt the MILC code to these new processors starting with NVIDIA GPUs, and more recently, the Intel Xeon Phi processors. We report on our efforts to port and optimize our code for the Intel Knights Landing architecture. We consider performance of the MILC code with MPI and OpenMP, and optimizations with QOPQDP and QPhiX. For the latter approach, we concentrate on the staggered conjugate gradient and gauge force. We also consider performance on recent NVIDIA GPUs using the QUDA library.

  13. Application of Supercomputer Technologies for Simulation Of Socio-Economic Systems

    Directory of Open Access Journals (Sweden)

    Vladimir Valentinovich Okrepilov

    2015-06-01

    Full Text Available To date, extensive experience has been accumulated in the investigation of problems related to quality, the assessment of management systems, and the modeling of economic system sustainability. These studies have created a basis for the development of a new research area, Economics of Quality. Its tools make it possible to use model simulation to construct mathematical models that adequately reflect the role of quality in the natural, technical and social regularities governing complex socio-economic systems. In our firm belief, the extensive application and development of models, together with system modeling using supercomputer technologies, will bring research on socio-economic systems to an essentially new level. Moreover, the current research makes a significant contribution to the simulation of multi-agent social systems and, no less important, belongs to the priority areas in the development of science and technology in our country. This article is devoted to the application of supercomputer technologies in the social sciences, first of all with regard to the technical realization of large-scale agent-focused models (AFM). The essence of this tool is that, owing to the increase in computing power, it has become possible to describe the behavior of many separate fragments of a complex system, such as a socio-economic system. The article also deals with the experience of foreign scientists and practitioners in running AFM on supercomputers, and with the example of an AFM developed at CEMI RAS; the stages and methods of efficiently mapping the computational kernel of a multi-agent system onto the architecture of a modern supercomputer are analyzed. Experiments based on model simulation for forecasting the population of St. Petersburg according to three scenarios, as one of the major factors influencing the development of the socio-economic system and the quality of life of the population, are presented in the

  14. New Mexico High School Supercomputing Challenge, 1990--1995: Five years of making a difference to students, teachers, schools, and communities. Progress report

    Energy Technology Data Exchange (ETDEWEB)

    Foster, M.; Kratzer, D.

    1996-02-01

    The New Mexico High School Supercomputing Challenge is an academic program dedicated to increasing interest in science and math among high school students by introducing them to high performance computing. This report provides a summary and evaluation of the first five years of the program, describes the program and shows the impact that it has had on high school students, their teachers, and their communities. Goals and objectives are reviewed and evaluated, growth and development of the program are analyzed, and future directions are discussed.

  15. Proceedings of RIKEN BNL Research Center Workshop: The Approach to Equilibrium in Strongly Interacting Matter. Volume 118

    Energy Technology Data Exchange (ETDEWEB)

    Liao, J. [Brookhaven National Lab. (BNL), Upton, NY (United States); Venugopalan, R. [Brookhaven National Lab. (BNL), Upton, NY (United States); Berges, J. [Brookhaven National Lab. (BNL), Upton, NY (United States); Blaizot, J. -P. [Brookhaven National Lab. (BNL), Upton, NY (United States); Gelis, F. [Brookhaven National Lab. (BNL), Upton, NY (United States)

    2014-04-09

    The RIKEN BNL Research Center (RBRC) was established in April 1997 at Brookhaven National Laboratory. It is funded by the "Rikagaku Kenkyusho" (RIKEN, The Institute of Physical and Chemical Research) of Japan and the U. S. Department of Energy’s Office of Science. The RBRC is dedicated to the study of strong interactions, including spin physics, lattice QCD, and RHIC physics through the nurturing of a new generation of young physicists. The RBRC has theory, lattice gauge computing and experimental components. It is presently exploring the possibility of an astrophysics component being added to the program. The purpose of this Workshop is to critically review the recent progress on the theory and phenomenology of early time dynamics in relativistic heavy ion collisions from RHIC to LHC energies, to examine the various approaches to thermalization and existing issues, and to formulate new research efforts for the future. Topics slated to be covered include experimental evidence for equilibration/isotropization, comparison of various approaches, dependence on the initial conditions and couplings, and turbulent cascades and Bose-Einstein condensation.

  16. Tryton Supercomputer Capabilities for Analysis of Massive Data Streams

    Directory of Open Access Journals (Sweden)

    Krawczyk Henryk

    2015-09-01

    Full Text Available The recently deployed supercomputer Tryton, located in the Academic Computer Center of Gdansk University of Technology, provides great means for massively parallel processing. Moreover, the status of the Center as one of the main network nodes in the PIONIER network enables the fast and reliable transfer of data produced by miscellaneous devices scattered across the whole country. Typical examples of such data are streams containing radio-telescope and satellite observations. Their analysis, especially with real-time constraints, can be challenging and requires the use of dedicated software components. We propose a solution for such parallel analysis using the supercomputer, supervised by the KASKADA platform, which, in conjunction with immersive 3D visualization techniques, can be used to solve problems such as pulsar detection and chronometry, or oil-spill simulation on the sea surface.

  17. Performance modeling of hybrid MPI/OpenMP scientific applications on large-scale multicore supercomputers

    KAUST Repository

    Wu, Xingfu; Taylor, Valerie

    2013-01-01

    In this paper, we present a performance modeling framework based on memory bandwidth contention time and a parameterized communication model to predict the performance of OpenMP, MPI and hybrid applications with weak scaling on three large-scale multicore supercomputers: IBM POWER4, POWER5+ and BlueGene/P, and analyze the performance of these MPI, OpenMP and hybrid applications. We use STREAM memory benchmarks and Intel's MPI benchmarks to provide initial performance analysis and model validation of MPI and OpenMP applications on these multicore supercomputers because the measured sustained memory bandwidth can provide insight into the memory bandwidth that a system should sustain on scientific applications with the same amount of workload per core. In addition to using these benchmarks, we also use a weak-scaling hybrid MPI/OpenMP large-scale scientific application: Gyrokinetic Toroidal Code (GTC) in magnetic fusion to validate our performance model of the hybrid application on these multicore supercomputers. The validation results for our performance modeling method show less than 7.77% error rate in predicting the performance of hybrid MPI/OpenMP GTC on up to 512 cores on these multicore supercomputers. © 2013 Elsevier Inc.
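
    A STREAM-style triad kernel is the standard way to measure the sustained memory bandwidth that such a contention-based model feeds on. The sketch below is a minimal OpenMP version, not the official STREAM benchmark; the array size and the simple byte count are arbitrary illustrative choices.

      // Minimal STREAM-triad-style bandwidth probe with OpenMP threading.
      // Not the official STREAM benchmark; array size is an arbitrary choice.
      #include <omp.h>
      #include <iostream>
      #include <vector>

      int main() {
          const size_t n = size_t(1) << 25;         // ~32M doubles per array
          std::vector<double> a(n, 0.0), b(n, 1.0), c(n, 2.0);
          const double scalar = 3.0;

          double t0 = omp_get_wtime();
          #pragma omp parallel for
          for (size_t i = 0; i < n; ++i)
              a[i] = b[i] + scalar * c[i];          // triad: 2 loads + 1 store per element
          double t1 = omp_get_wtime();

          double bytes = 3.0 * sizeof(double) * n;  // traffic, ignoring write-allocate
          std::cout << "triad bandwidth ~ " << bytes / (t1 - t0) / 1e9 << " GB/s\n";
          return (a[0] == b[0] + scalar * c[0]) ? 0 : 1;  // keep the compiler honest
      }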

  18. Performance modeling of hybrid MPI/OpenMP scientific applications on large-scale multicore supercomputers

    KAUST Repository

    Wu, Xingfu

    2013-12-01

    In this paper, we present a performance modeling framework based on memory bandwidth contention time and a parameterized communication model to predict the performance of OpenMP, MPI and hybrid applications with weak scaling on three large-scale multicore supercomputers: IBM POWER4, POWER5+ and BlueGene/P, and analyze the performance of these MPI, OpenMP and hybrid applications. We use STREAM memory benchmarks and Intel's MPI benchmarks to provide initial performance analysis and model validation of MPI and OpenMP applications on these multicore supercomputers because the measured sustained memory bandwidth can provide insight into the memory bandwidth that a system should sustain on scientific applications with the same amount of workload per core. In addition to using these benchmarks, we also use a weak-scaling hybrid MPI/OpenMP large-scale scientific application: Gyrokinetic Toroidal Code (GTC) in magnetic fusion to validate our performance model of the hybrid application on these multicore supercomputers. The validation results for our performance modeling method show less than 7.77% error rate in predicting the performance of hybrid MPI/OpenMP GTC on up to 512 cores on these multicore supercomputers. © 2013 Elsevier Inc.

  19. Development of a Cloud Resolving Model for Heterogeneous Supercomputers

    Science.gov (United States)

    Sreepathi, S.; Norman, M. R.; Pal, A.; Hannah, W.; Ponder, C.

    2017-12-01

    A cloud resolving climate model is needed to reduce major systematic errors in climate simulations due to structural uncertainty in numerical treatments of convection - such as convective storm systems. This research describes the porting effort to enable the SAM (System for Atmosphere Modeling) cloud resolving model to run on heterogeneous supercomputers using GPUs (Graphical Processing Units). We have isolated a standalone configuration of SAM that is targeted to be integrated into the DOE ACME (Accelerated Climate Modeling for Energy) Earth System model. We have identified key computational kernels from the model and offloaded them to a GPU using the OpenACC programming model. Furthermore, we are investigating various optimization strategies intended to enhance GPU utilization, including loop fusion/fission, coalesced data access and loop refactoring to a higher abstraction level. We will present early performance results, lessons learned as well as optimization strategies. The computational platform used in this study is the Summitdev system, an early testbed that is one generation removed from Summit, the next leadership class supercomputer at Oak Ridge National Laboratory. The system contains 54 nodes wherein each node has 2 IBM POWER8 CPUs and 4 NVIDIA Tesla P100 GPUs. This work is part of a larger project, the ACME-MMF component of the U.S. Department of Energy (DOE) Exascale Computing Project. The ACME-MMF approach addresses structural uncertainty in cloud processes by replacing traditional parameterizations with cloud resolving "superparameterization" within each grid cell of the global climate model. Super-parameterization dramatically increases arithmetic intensity, making the MMF approach an ideal strategy to achieve good performance on emerging exascale computing architectures. The goal of the project is to integrate superparameterization into ACME, and explore its full potential to scientifically and computationally advance climate simulation and prediction.
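
    Offloading a kernel with OpenACC, as described above, amounts to annotating the loop and its data movement. The fragment below is a generic saxpy-style illustration with assumed array names, not an actual SAM/ACME kernel; a compiler without OpenACC support simply ignores the pragma.

      // Generic OpenACC offload sketch. The loop and data clauses illustrate the
      // approach; the kernel itself is a stand-in, not code from SAM/ACME.
      #include <iostream>
      #include <vector>

      int main() {
          const int n = 1 << 20;
          std::vector<float> x(n, 1.0f), y(n, 2.0f);
          const float a = 0.5f;
          float* xp = x.data();
          float* yp = y.data();

          // Copy x in, copy y both ways, and run the loop on the accelerator.
          #pragma acc parallel loop copyin(xp[0:n]) copy(yp[0:n])
          for (int i = 0; i < n; ++i)
              yp[i] = a * xp[i] + yp[i];

          std::cout << "y[0] = " << y[0] << "\n";   // expect 2.5
          return 0;
      }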

  20. Enabling Diverse Software Stacks on Supercomputers using High Performance Virtual Clusters.

    Energy Technology Data Exchange (ETDEWEB)

    Younge, Andrew J. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Pedretti, Kevin [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Grant, Ryan [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Brightwell, Ron [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2017-05-01

    While large-scale simulations have been the hallmark of the High Performance Computing (HPC) community for decades, Large Scale Data Analytics (LSDA) workloads are gaining attention within the scientific community not only as a processing component to large HPC simulations, but also as standalone scientific tools for knowledge discovery. With the path towards Exascale, new HPC runtime systems are also emerging in a way that differs from classical distributed computing models. However, system software for such capabilities on the latest extreme-scale DOE supercomputers needs to be enhanced to more appropriately support these types of emerging software ecosystems. In this paper, we propose the use of Virtual Clusters on advanced supercomputing resources to enable systems to support not only HPC workloads, but also emerging big data stacks. Specifically, we have deployed the KVM hypervisor within Cray's Compute Node Linux on an XC-series supercomputer testbed. We also use libvirt and QEMU to manage and provision VMs directly on compute nodes, leveraging Ethernet-over-Aries network emulation. To our knowledge, this is the first known use of KVM on a true MPP supercomputer. We investigate the overhead of our solution using HPC benchmarks, evaluating both single-node performance and weak scaling of a 32-node virtual cluster. Overall, we find that the single-node performance of our solution using KVM on a Cray is very efficient, with near-native performance. However, overhead increases by up to 20% as virtual cluster size increases, due to limitations of the Ethernet-over-Aries bridged network. Furthermore, we deploy Apache Spark with large data analysis workloads in a Virtual Cluster, effectively demonstrating how diverse software ecosystems can be supported by High Performance Virtual Clusters.

  1. Cellular-automata supercomputers for fluid-dynamics modeling

    International Nuclear Information System (INIS)

    Margolus, N.; Toffoli, T.; Vichniac, G.

    1986-01-01

    We report recent developments in the modeling of fluid dynamics, and give experimental results (including dynamical exponents) obtained using cellular automata machines. Because of their locality and uniformity, cellular automata lend themselves to an extremely efficient physical realization; with a suitable architecture, an amount of hardware resources comparable to that of a home computer can achieve (in the simulation of cellular automata) the performance of a conventional supercomputer
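
    The locality and uniformity that make cellular automata hardware-friendly are visible in the update loop itself: every cell applies the same rule to a small neighborhood, synchronously. The sketch below uses a simple majority rule on a periodic 2-D grid purely as an illustration; it is not the lattice-gas collision/streaming rule used for fluid modeling.

      // One synchronous update step of a 2-D cellular automaton with periodic
      // boundaries. The majority rule is only an illustration of locality and
      // uniformity; fluid-dynamics automata (lattice gases) use different rules.
      #include <iostream>
      #include <vector>

      int main() {
          const int N = 8;
          std::vector<std::vector<int>> grid(N, std::vector<int>(N, 0)), next = grid;
          grid[3][3] = grid[3][4] = grid[4][3] = grid[4][4] = 1;   // a small seed

          for (int step = 0; step < 5; ++step) {
              for (int i = 0; i < N; ++i)
                  for (int j = 0; j < N; ++j) {
                      int sum = 0;
                      for (int di = -1; di <= 1; ++di)             // 3x3 neighborhood
                          for (int dj = -1; dj <= 1; ++dj)
                              sum += grid[(i + di + N) % N][(j + dj + N) % N];
                      next[i][j] = (sum >= 5) ? 1 : 0;             // majority of 9 cells
                  }
              grid.swap(next);                                     // synchronous update
          }
          for (const auto& row : grid) {
              for (int v : row) std::cout << v;
              std::cout << "\n";
          }
      }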

  2. The TeraGyroid Experiment – Supercomputing 2003

    Directory of Open Access Journals (Sweden)

    R.J. Blake

    2005-01-01

    Full Text Available Amphiphiles are molecules with hydrophobic tails and hydrophilic heads. When dispersed in solvents, they self-assemble into complex mesophases including the beautiful cubic gyroid phase. The goal of the TeraGyroid experiment was to study defect pathways and dynamics in these gyroids. The UK's supercomputing and USA's TeraGrid facilities were coupled together, through a dedicated high-speed network, into a single computational Grid for research work that peaked around the Supercomputing 2003 conference. The gyroids were modeled using lattice Boltzmann methods, with parameter spaces explored using many 128^3-grid-point simulations; these data were used to inform the world's largest three-dimensional time-dependent simulation, with 1024^3 grid points. The experiment generated some 2 TBytes of useful data. In terms of Grid technology the project demonstrated the migration of simulations (using Globus middleware) to and fro across the Atlantic, exploiting the availability of resources. Integration of the systems accelerated the time to insight. Distributed visualisation of the output datasets enabled the parameter space of the interactions within the complex fluid to be explored from a number of sites, informed by discourse over the Access Grid. The project was sponsored by EPSRC (UK) and NSF (USA), with trans-Atlantic optical bandwidth provided by British Telecommunications.

  3. Analyzing the Interplay of Failures and Workload on a Leadership-Class Supercomputer

    Energy Technology Data Exchange (ETDEWEB)

    Meneses, Esteban [University of Pittsburgh; Ni, Xiang [University of Illinois at Urbana-Champaign; Jones, Terry R [ORNL; Maxwell, Don E [ORNL

    2015-01-01

    The unprecedented computational power of current supercomputers now makes possible the exploration of complex problems in many scientific fields, from genomic analysis to computational fluid dynamics. Modern machines are powerful because they are massive: they assemble millions of cores and a huge quantity of disks, cards, routers, and other components. But it is precisely the size of these machines that clouds the future of supercomputing. A system that comprises many components has a high chance to fail, and fail often. In order to make the next generation of supercomputers usable, it is imperative to use some type of fault tolerance platform to run applications on large machines. Most fault tolerance strategies can be optimized for the peculiarities of each system and boost efficacy by keeping the system productive. In this paper, we aim to understand how failure characterization can improve resilience in several layers of the software stack: applications, runtime systems, and job schedulers. We examine the Titan supercomputer, one of the fastest systems in the world. We analyze a full year of Titan in production and distill the failure patterns of the machine. By looking into Titan's log files and using the criteria of experts, we provide a detailed description of the types of failures. In addition, we inspect the job submission files and describe how the system is used. Using those two sources, we cross-correlate failures in the machine to executing jobs and provide a picture of how failures affect the user experience. We believe such characterization is fundamental in developing appropriate fault tolerance solutions for Cray systems similar to Titan.

  4. Critical evaluation of quality assurance in laboratory diagnosis of tuberculosis in selected nearby microscopic centers under RNTCP

    Directory of Open Access Journals (Sweden)

    Anuradha

    2013-08-01

    Full Text Available Objective: RNTCP relies on sputum smear microscopy for diagnosis, categorization of patients for treatment and assessment of their program. Therefore, it is crucial that the smear microscopy services provided are of the highest quality possible. The current study was undertaken to perform on-site evaluation and Random Blinded Rechecking (RBRC) of slides at selected microscopic centers. Material & Methods: Five microscopic centers were selected for onsite evaluation and Random Blinded Rechecking. Slides were collected monthly from the respective DMCs. A questionnaire was developed to assess the overall operational conditions at the DMCs and a checklist was prepared to record the observations during the visit. RBRC slides were read by two microbiologists independently and results were compared with RNTCP results. Slides were read before and after restaining the slides. Results: After the evaluation of the checklist and questionnaire, it was found that 100% of centers were following the charts for smear preparation, staining and grading with adequate stock supply. One out of 5 centers had the maximum number of slides with poor smear quality (16.7%), 8% with uneven smears and 14% with improper thickness. There was 100% concordance when reading five positive and five negative smears. The mean time spent on microscopic examination was 4.4 minutes, compared with the recommended time of 10 minutes. Out of 828 slides rechecked under RBRC, one low false negative error was found. Conclusion: The evaluation of quality control practices was found satisfactory. The laboratory staff was able to incorporate simple quality control procedures for AFB microscopy into their routine practice, resulting in reliable service. Onsite evaluation and RBRC are viable measures of laboratory performance and both should be continued.

  5. Supercomputer and cluster performance modeling and analysis efforts:2004-2006.

    Energy Technology Data Exchange (ETDEWEB)

    Sturtevant, Judith E.; Ganti, Anand; Meyer, Harold (Hal) Edward; Stevenson, Joel O.; Benner, Robert E., Jr. (.,; .); Goudy, Susan Phelps; Doerfler, Douglas W.; Domino, Stefan Paul; Taylor, Mark A.; Malins, Robert Joseph; Scott, Ryan T.; Barnette, Daniel Wayne; Rajan, Mahesh; Ang, James Alfred; Black, Amalia Rebecca; Laub, Thomas William; Vaughan, Courtenay Thomas; Franke, Brian Claude

    2007-02-01

    This report describes efforts by the Performance Modeling and Analysis Team to investigate performance characteristics of Sandia's engineering and scientific applications on the ASC capability and advanced architecture supercomputers, and Sandia's capacity Linux clusters. Efforts to model various aspects of these computers are also discussed. The goals of these efforts are to quantify and compare Sandia's supercomputer and cluster performance characteristics; to reveal strengths and weaknesses in such systems; and to predict performance characteristics of, and provide guidelines for, future acquisitions and follow-on systems. Described herein are the results obtained from running benchmarks and applications to extract performance characteristics and comparisons, as well as modeling efforts, obtained during the time period 2004-2006. The format of the report, with hypertext links to numerous additional documents, purposefully minimizes the document size needed to disseminate the extensive results from our research.

  6. Ultrascalable petaflop parallel supercomputer

    Science.gov (United States)

    Blumrich, Matthias A [Ridgefield, CT; Chen, Dong [Croton On Hudson, NY; Chiu, George [Cross River, NY; Cipolla, Thomas M [Katonah, NY; Coteus, Paul W [Yorktown Heights, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Hall, Shawn [Pleasantville, NY; Haring, Rudolf A [Cortlandt Manor, NY; Heidelberger, Philip [Cortlandt Manor, NY; Kopcsay, Gerard V [Yorktown Heights, NY; Ohmacht, Martin [Yorktown Heights, NY; Salapura, Valentina [Chappaqua, NY; Sugavanam, Krishnan [Mahopac, NY; Takken, Todd [Brewster, NY

    2010-07-20

    A massively parallel supercomputer of petaOPS-scale includes node architectures based upon System-On-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC) having up to four processing elements. The ASIC nodes are interconnected by multiple independent networks that optimally maximize the throughput of packet communications between nodes with minimal latency. The multiple networks may include three high-speed networks for parallel algorithm message passing including a Torus, collective network, and a Global Asynchronous network that provides global barrier and notification functions. These multiple independent networks may be collaboratively or independently utilized according to the needs or phases of an algorithm for optimizing algorithm processing performance. The use of a DMA engine is provided to facilitate message passing among the nodes without the expenditure of processing resources at the node.

  7. Productive Parallel Programming: The PCN Approach

    Directory of Open Access Journals (Sweden)

    Ian Foster

    1992-01-01

    Full Text Available We describe the PCN programming system, focusing on those features designed to improve the productivity of scientists and engineers using parallel supercomputers. These features include a simple notation for the concise specification of concurrent algorithms, the ability to incorporate existing Fortran and C code into parallel applications, facilities for reusing parallel program components, a portable toolkit that allows applications to be developed on a workstation or small parallel computer and run unchanged on supercomputers, and integrated debugging and performance analysis tools. We survey representative scientific applications and identify problem classes for which PCN has proved particularly useful.

  8. Direct exploitation of a top 500 Supercomputer for Analysis of CMS Data

    International Nuclear Information System (INIS)

    Cabrillo, I; Cabellos, L; Marco, J; Fernandez, J; Gonzalez, I

    2014-01-01

    The Altamira Supercomputer hosted at the Instituto de Fisica de Cantabria (IFCA) entered operation in summer 2012. Its last-generation FDR InfiniBand network, used for message passing in parallel jobs, supports the connection to General Parallel File System (GPFS) servers, enabling efficient simultaneous processing of multiple data-demanding jobs. Sharing a common GPFS system and a single LDAP-based identification with the existing Grid clusters at IFCA allows CMS researchers to exploit the large instantaneous capacity of this supercomputer to execute analysis jobs. The detailed experience of this opportunistic use for skimming and final analysis of CMS 2012 data for a specific physics channel, which resulted in an order-of-magnitude reduction of the waiting time, is presented.

  9. Toward a Proof of Concept Cloud Framework for Physics Applications on Blue Gene Supercomputers

    International Nuclear Information System (INIS)

    Dreher, Patrick; Scullin, William; Vouk, Mladen

    2015-01-01

    Traditional high performance supercomputers are capable of delivering large sustained state-of-the-art computational resources to physics applications over extended periods of time using batch processing mode operating environments. However, today there is an increasing demand for more complex workflows that involve large fluctuations in the levels of HPC physics computational requirements during the simulations. Some of the workflow components may also require a richer set of operating system features and schedulers than normally found in a batch oriented HPC environment. This paper reports on progress toward a proof of concept design that implements a cloud framework onto BG/P and BG/Q platforms at the Argonne Leadership Computing Facility. The BG/P implementation utilizes the Kittyhawk utility and the BG/Q platform uses an experimental heterogeneous FusedOS operating system environment. Both platforms use the Virtual Computing Laboratory as the cloud computing system embedded within the supercomputer. This proof of concept design allows a cloud to be configured so that it can capitalize on the specialized infrastructure capabilities of a supercomputer and the flexible cloud configurations without resorting to virtualization. Initial testing of the proof of concept system is done using the lattice QCD MILC code. These types of user reconfigurable environments have the potential to deliver experimental schedulers and operating systems within a working HPC environment for physics computations that may be different from the native OS and schedulers on production HPC supercomputers. (paper)

  10. Plane-wave electronic structure calculations on a parallel supercomputer

    International Nuclear Information System (INIS)

    Nelson, J.S.; Plimpton, S.J.; Sears, M.P.

    1993-01-01

    The development of iterative solutions of Schrodinger's equation in a plane-wave (pw) basis over the last several years has coincided with great advances in the computational power available for performing the calculations. These dual developments have enabled many new and interesting condensed matter phenomena to be studied from a first-principles approach. The authors present a detailed description of the implementation on a parallel supercomputer (hypercube) of the first-order equation-of-motion solution to Schrodinger's equation, using plane-wave basis functions and ab initio separable pseudopotentials. By distributing the plane-waves across the processors of the hypercube many of the computations can be performed in parallel, resulting in decreases in the overall computation time relative to conventional vector supercomputers. This partitioning also provides ample memory for large Fast Fourier Transform (FFT) meshes and the storage of plane-wave coefficients for many hundreds of energy bands. The usefulness of the parallel techniques is demonstrated by benchmark timings for both the FFT's and iterations of the self-consistent solution of Schrodinger's equation for different sized Si unit cells of up to 512 atoms
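
    Distributing the plane-wave coefficients across processors, as described above, can be sketched by assigning G-vectors to ranks round-robin and combining partial sums with a global reduction. The kinetic-energy-like accumulation below is a toy illustration with made-up coefficients and |G|^2 values, not the hypercube code itself.

      // Sketch: distribute plane-wave G-vectors round-robin over MPI ranks and
      // accumulate E = sum_G |c_G|^2 * |G|^2 / 2 with a global reduction.
      // Coefficients and G-vector magnitudes here are made-up values.
      #include <mpi.h>
      #include <cmath>
      #include <iostream>

      int main(int argc, char** argv) {
          MPI_Init(&argc, &argv);
          int rank = 0, size = 1;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          const int n_pw = 1000;                           // total number of plane waves (toy)
          double local = 0.0;
          for (int g = rank; g < n_pw; g += size) {        // round-robin ownership
              double g2 = 0.01 * g;                        // stand-in for |G|^2
              double c  = std::exp(-0.001 * g);            // stand-in coefficient
              local += 0.5 * c * c * g2;
          }

          double total = 0.0;
          MPI_Allreduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
          if (rank == 0) std::cout << "kinetic-energy-like sum = " << total << "\n";
          MPI_Finalize();
      }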

  11. Problem solving in nuclear engineering using supercomputers

    International Nuclear Information System (INIS)

    Schmidt, F.; Scheuermann, W.; Schatz, A.

    1987-01-01

    The availability of supercomputers enables the engineer to formulate new strategies for problem solving. One such strategy is the Integrated Planning and Simulation System (IPSS). With the integrated systems, simulation models with greater consistency and good agreement with actual plant data can be effectively realized. In the present work some of the basic ideas of IPSS are described as well as some of the conditions necessary to build such systems. Hardware and software characteristics as realized are outlined. (orig.) [de

  12. Assessment techniques for a learning-centered curriculum: evaluation design for adventures in supercomputing

    Energy Technology Data Exchange (ETDEWEB)

    Helland, B. [Ames Lab., IA (United States); Summers, B.G. [Oak Ridge National Lab., TN (United States)

    1996-09-01

    As the classroom paradigm shifts from being teacher-centered to being learner-centered, student assessments are evolving from typical paper and pencil testing to other methods of evaluation. Students should be probed for understanding, reasoning, and critical thinking abilities rather than their ability to return memorized facts. The assessment of the Department of Energy's pilot program, Adventures in Supercomputing (AiS), offers one example of assessment techniques developed for learner-centered curricula. This assessment has employed a variety of methods to collect student data. Methods of assessment used were traditional testing, performance testing, interviews, short questionnaires via email, and student presentations of projects. The data obtained from these sources have been analyzed by a professional assessment team at the Center for Children and Technology. The results have been used to improve the AiS curriculum and establish the quality of the overall AiS program. This paper will discuss the various methods of assessment used and the results.

  13. Visualizing quantum scattering on the CM-2 supercomputer

    International Nuclear Information System (INIS)

    Richardson, J.L.

    1991-01-01

    We implement parallel algorithms for solving the time-dependent Schroedinger equation on the CM-2 supercomputer. These methods are unconditionally stable as well as unitary at each time step and have the advantage of being spatially local and explicit. We show how to visualize the dynamics of quantum scattering using techniques for visualizing complex wave functions. Several scattering problems are solved to demonstrate the use of these methods. (orig.)
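
    One family of schemes with the properties named above (unitary at every step, spatially local and explicit) splits the Hamiltonian into non-overlapping two-site pieces and advances each piece by an exact 2x2 unitary, alternating even and odd bonds. The sketch below applies this first-order Trotter split to a 1-D tight-binding model; parameters are illustrative, and this is a generic member of that family, not the CM-2 implementation.

      // Local, explicit, exactly unitary time step for a 1-D tight-binding
      // Schroedinger equation: even bonds then odd bonds, each advanced by a
      // 2x2 unitary exp(i*theta*sigma_x). Norm is preserved to round-off.
      #include <cmath>
      #include <complex>
      #include <iostream>
      #include <vector>

      using cd = std::complex<double>;

      void apply_pair(std::vector<cd>& psi, int i, double theta) {
          // exp(i*theta*sigma_x) acting on amplitudes (psi[i], psi[i+1]).
          cd a = psi[i], b = psi[i + 1];
          cd c = std::cos(theta), s = cd(0.0, 1.0) * std::sin(theta);
          psi[i]     = c * a + s * b;
          psi[i + 1] = s * a + c * b;
      }

      int main() {
          const int n = 64;
          const double dt = 0.05, tau = 1.0;      // time step and hopping amplitude
          std::vector<cd> psi(n, cd(0, 0));
          psi[n / 2] = 1.0;                       // localized initial state

          for (int step = 0; step < 200; ++step) {
              for (int i = 0; i + 1 < n; i += 2) apply_pair(psi, i, dt * tau);  // even bonds
              for (int i = 1; i + 1 < n; i += 2) apply_pair(psi, i, dt * tau);  // odd bonds
          }

          double norm = 0.0;
          for (const cd& c : psi) norm += std::norm(c);
          std::cout << "norm after 200 steps = " << norm << "\n";  // stays 1 to round-off
      }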

  14. Design of multiple sequence alignment algorithms on parallel, distributed memory supercomputers.

    Science.gov (United States)

    Church, Philip C; Goscinski, Andrzej; Holt, Kathryn; Inouye, Michael; Ghoting, Amol; Makarychev, Konstantin; Reumann, Matthias

    2011-01-01

    The challenge of comparing two or more genomes that have undergone recombination and substantial amounts of segmental loss and gain has recently been addressed for small numbers of genomes. However, datasets of hundreds of genomes are now common and their sizes will only increase in the future. Multiple sequence alignment of hundreds of genomes remains an intractable problem due to quadratic increases in compute time and memory footprint. To date, most alignment algorithms are designed for commodity clusters without parallelism. Hence, we propose the design of a multiple sequence alignment algorithm on massively parallel, distributed memory supercomputers to enable research into comparative genomics on large data sets. Following the methodology of the sequential progressiveMauve algorithm, we design data structures including sequences and sorted k-mer lists on the IBM Blue Gene/P supercomputer (BG/P). Preliminary results show that we can reduce the memory footprint so that we can potentially align over 250 bacterial genomes on a single BG/P compute node. We verify our results on a dataset of E.coli, Shigella and S.pneumoniae genomes. Our implementation returns results matching those of the original algorithm but in 1/2 the time and with 1/4 the memory footprint for scaffold building. In this study, we have laid the basis for multiple sequence alignment of large-scale datasets on a massively parallel, distributed memory supercomputer, thus enabling comparison of hundreds instead of a few genome sequences within reasonable time.
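
    The sorted k-mer list data structure mentioned above can be illustrated in a few lines: extract all k-mers of a sequence, sort them, and look up matches by binary search. The toy sequence, k and query are arbitrary, and this is not the progressiveMauve or BG/P implementation.

      // Build a sorted k-mer list for a sequence and look up a query k-mer by
      // binary search -- the basic structure behind seed matching in multiple
      // sequence alignment. Sequence, k, and query are arbitrary toy values.
      #include <algorithm>
      #include <iostream>
      #include <string>
      #include <vector>

      int main() {
          const std::string seq = "ACGTACGGACGTTACG";
          const size_t k = 4;

          std::vector<std::string> kmers;
          for (size_t i = 0; i + k <= seq.size(); ++i)
              kmers.push_back(seq.substr(i, k));
          std::sort(kmers.begin(), kmers.end());        // sorted k-mer list

          const std::string query = "ACGT";
          auto range = std::equal_range(kmers.begin(), kmers.end(), query);
          std::cout << query << " occurs " << (range.second - range.first)
                    << " times\n";
      }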

  15. Modeling radiative transport in ICF plasmas on an IBM SP2 supercomputer

    International Nuclear Information System (INIS)

    Johansen, J.A.; MacFarlane, J.J.; Moses, G.A.

    1995-01-01

    At the University of Wisconsin-Madison the authors have integrated a collisional-radiative-equilibrium model into their CONRAD radiation-hydrodynamics code. This integrated package allows them to accurately simulate the transport processes involved in ICF plasmas; including the important effects of self-absorption of line-radiation. However, as they increase the amount of atomic structure utilized in their transport models, the computational demands increase nonlinearly. In an attempt to meet this increased computational demand, they have recently embarked on a mission to parallelize the CONRAD program. The parallel CONRAD development is being performed on an IBM SP2 supercomputer. The parallelism is based on a message passing paradigm, and is being implemented using PVM. At the present time they have determined that approximately 70% of the sequential program can be executed in parallel. Accordingly, they expect that the parallel version will yield a speedup on the order of three times that of the sequential version. This translates into only 10 hours of execution time for the parallel version, whereas the sequential version required 30 hours
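
    The quoted expectation is consistent with Amdahl's law for a parallel fraction p of 0.7, as the small check below shows (the speedup saturates near 1/(1 - p), roughly 3.3, matching the 30-hour to 10-hour estimate):

      # speedup(N) = 1 / ((1 - p) + p / N) for parallel fraction p on N processors
      p = 0.7
      for n in (4, 16, 64, 1024):
          print(n, round(1.0 / ((1.0 - p) + p / n), 2))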

  16. Novel Supercomputing Approaches for High Performance Linear Algebra Using FPGAs, Phase II

    Data.gov (United States)

    National Aeronautics and Space Administration — Supercomputing plays a major role in many areas of science and engineering, and it has had tremendous impact for decades in areas such as aerospace, defense, energy,...

  17. Critical evaluation of quality assurance in laboratory diagnosis of tuberculosis in selected nearby microscopic centers under RNTCP

    Directory of Open Access Journals (Sweden)

    Anuradha

    2013-01-01

    Full Text Available Objective: RNTCP relies on sputum smear microscopy for diagnosis, categorization of patients for treatment, and assessment of the program. Therefore, it is crucial that the smear microscopy services provided are of the highest quality possible. The current study was undertaken to carry out on-site evaluation and Random Blinded Rechecking (RBRC) of slides at selected microscopic centers. Material & Methods: Five microscopic centers were selected for on-site evaluation and Random Blinded Rechecking. Slides were collected monthly from the respective DMCs. A questionnaire was developed to assess the overall operational conditions at the DMCs, and a checklist was prepared to record the observations during the visit. RBRC slides were read by two microbiologists independently and the results were compared with the RNTCP results. Slides were read before and after restaining. Results: After evaluation of the checklist and questionnaire, it was found that 100% of the centers were following the charts for smear preparation, staining and grading, with an adequate stock supply. One out of the 5 centers had the maximum number of slides with poor smear quality (16.7%), 8% with uneven smears and 14% with improper thickness. There was 100% concordance when reading five positive and five negative smears. The mean time spent on microscopic examination was 4.4 minutes, compared with the recommended time of 10 minutes. Out of 828 slides rechecked under RBRC, one low false negative error was found. Conclusion: The evaluation of quality control practices was found satisfactory. The laboratory staff was able to incorporate simple quality control procedures for AFB microscopy into their routine practice, resulting in a reliable service. On-site evaluation and RBRC are viable measures of laboratory performance and both should be continued.

  18. BSMBench: a flexible and scalable supercomputer benchmark from computational particle physics

    CERN Document Server

    Bennett, Ed; Del Debbio, Luigi; Jordan, Kirk; Patella, Agostino; Pica, Claudio; Rago, Antonio

    2016-01-01

    Benchmarking plays a central role in the evaluation of High Performance Computing architectures. Several benchmarks have been designed that allow users to stress various components of supercomputers. In order for the figures they provide to be useful, benchmarks need to be representative of the most common real-world scenarios. In this work, we introduce BSMBench, a benchmarking suite derived from Monte Carlo code used in computational particle physics. The advantage of this suite (which can be freely downloaded from http://www.bsmbench.org/) over others is the capacity to vary the relative importance of computation and communication. This enables the tests to simulate various practical situations. To showcase BSMBench, we perform a wide range of tests on various architectures, from desktop computers to state-of-the-art supercomputers, and discuss the corresponding results. Possible future directions of development of the benchmark are also outlined.
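
    The idea of dialling the compute-to-communication ratio can be illustrated with a hedged mpi4py sketch (not BSMBench itself; buffer sizes and repetition counts are arbitrary): each rank performs a tunable amount of local arithmetic between halo exchanges with its neighbours, so the same loop can be made compute-bound or network-bound.

      from mpi4py import MPI
      import numpy as np

      comm = MPI.COMM_WORLD
      rank, size = comm.Get_rank(), comm.Get_size()
      left, right = (rank - 1) % size, (rank + 1) % size

      local = np.random.rand(1 << 16)     # per-rank working set
      halo = np.empty(1 << 10)            # receive buffer for the neighbour exchange

      def run(compute_reps, exchanges=50):
          t0 = MPI.Wtime()
          for _ in range(exchanges):
              for _ in range(compute_reps):          # tunable computation block
                  local[:] = np.sqrt(local * local + 1e-9)
              comm.Sendrecv(local[:halo.size], dest=right, recvbuf=halo, source=left)
          return MPI.Wtime() - t0

      for reps in (1, 4, 16):
          elapsed = run(reps)
          if rank == 0:
              print(f"compute reps {reps:3d}: {elapsed:.3f} s")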

  19. High Performance Networks From Supercomputing to Cloud Computing

    CERN Document Server

    Abts, Dennis

    2011-01-01

    Datacenter networks provide the communication substrate for large parallel computer systems that form the ecosystem for high performance computing (HPC) systems and modern Internet applications. The design of new datacenter networks is motivated by an array of applications ranging from communication intensive climatology, complex material simulations and molecular dynamics to such Internet applications as Web search, language translation, collaborative Internet applications, streaming video and voice-over-IP. For both Supercomputing and Cloud Computing the network enables distributed applicati

  20. Integration of Titan supercomputer at OLCF with ATLAS Production System

    Science.gov (United States)

    Barreiro Megino, F.; De, K.; Jha, S.; Klimentov, A.; Maeno, T.; Nilsson, P.; Oleynik, D.; Padolski, S.; Panitkin, S.; Wells, J.; Wenaus, T.; ATLAS Collaboration

    2017-10-01

    The PanDA (Production and Distributed Analysis) workload management system was developed to meet the scale and complexity of distributed computing for the ATLAS experiment. PanDA managed resources are distributed worldwide, on hundreds of computing sites, with thousands of physicists accessing hundreds of petabytes of data, and the rate of data processing already exceeds an exabyte per year. While PanDA currently uses more than 200,000 cores at well over 100 Grid sites, future LHC data taking runs will require more resources than Grid computing can possibly provide. Additional computing and storage resources are required. Therefore ATLAS is engaged in an ambitious program to expand the current computing model to include additional resources such as the opportunistic use of supercomputers. In this paper we describe a project aimed at integration of the ATLAS Production System with the Titan supercomputer at the Oak Ridge Leadership Computing Facility (OLCF). The current approach utilizes a modified PanDA Pilot framework for job submission to Titan's batch queues and local data management, with lightweight MPI wrappers to run single-node workloads in parallel on Titan's multi-core worker nodes. It provides for running standard ATLAS production jobs on unused (backfill) resources on Titan. The system has already allowed ATLAS to collect millions of core-hours per month on Titan and execute hundreds of thousands of jobs, while simultaneously improving Titan's utilization efficiency. We will discuss the details of the implementation, current experience with running the system, as well as future plans aimed at improvements in scalability and efficiency. Notice: This manuscript has been authored by employees of Brookhaven Science Associates, LLC under Contract No. DE-AC02-98CH10886 with the U.S. Department of Energy. The publisher by accepting the manuscript for publication acknowledges that the United States Government retains a non-exclusive, paid-up, irrevocable, world-wide license to

  1. Intelligent Personal Supercomputer for Solving Scientific and Technical Problems

    Directory of Open Access Journals (Sweden)

    Khimich, O.M.

    2016-09-01

    Full Text Available A new domestic intelligent personal supercomputer of hybrid architecture, Inparkom_pg, was developed for the mathematical modeling of processes in the defense industry, engineering, construction, etc. Intelligent software for the automatic investigation of problems of computational mathematics with approximate data of different structures was designed. Applied software for mathematical modeling problems in construction, welding and filtration processes was implemented.

  2. Supercomputers and the future of computational atomic scattering physics

    International Nuclear Information System (INIS)

    Younger, S.M.

    1989-01-01

    The advent of the supercomputer has opened new vistas for the computational atomic physicist. Problems of hitherto unparalleled complexity are now being examined using these new machines, and important connections with other fields of physics are being established. This talk briefly reviews some of the most important trends in computational scattering physics and suggests some exciting possibilities for the future. 7 refs., 2 figs

  3. Multi-petascale highly efficient parallel supercomputer

    Science.gov (United States)

    Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.; Blumrich, Matthias A.; Boyle, Peter; Brunheroto, Jose R.; Chen, Dong; Cher, Chen-Yong; Chiu, George L.; Christ, Norman; Coteus, Paul W.; Davis, Kristan D.; Dozsa, Gabor J.; Eichenberger, Alexandre E.; Eisley, Noel A.; Ellavsky, Matthew R.; Evans, Kahn C.; Fleischer, Bruce M.; Fox, Thomas W.; Gara, Alan; Giampapa, Mark E.; Gooding, Thomas M.; Gschwind, Michael K.; Gunnels, John A.; Hall, Shawn A.; Haring, Rudolf A.; Heidelberger, Philip; Inglett, Todd A.; Knudson, Brant L.; Kopcsay, Gerard V.; Kumar, Sameer; Mamidala, Amith R.; Marcella, James A.; Megerian, Mark G.; Miller, Douglas R.; Miller, Samuel J.; Muff, Adam J.; Mundy, Michael B.; O'Brien, John K.; O'Brien, Kathryn M.; Ohmacht, Martin; Parker, Jeffrey J.; Poole, Ruth J.; Ratterman, Joseph D.; Salapura, Valentina; Satterfield, David L.; Senger, Robert M.; Steinmacher-Burow, Burkhard; Stockdell, William M.; Stunkel, Craig B.; Sugavanam, Krishnan; Sugawara, Yutaka; Takken, Todd E.; Trager, Barry M.; Van Oosten, James L.; Wait, Charles D.; Walkup, Robert E.; Watson, Alfred T.; Wisniewski, Robert W.; Wu, Peng

    2018-05-15

    A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaflop scale includes node architectures based upon System-on-a-Chip technology, where each processing node comprises a single Application Specific Integrated Circuit (ASIC). The ASIC nodes are interconnected by a five-dimensional torus network that maximizes the throughput of packet communications between nodes and minimizes latency. The network implements a collective network and a global asynchronous network that provide global barrier and notification functions. The node design also integrates a list-based prefetcher. The memory system implements transactional memory, thread-level speculation, and a multiversioning cache that improves the soft error rate while supporting DMA functionality for parallel message passing.
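
    As a small illustration of the interconnect topology (not code from the patent), the helper below enumerates the nearest neighbours of a node in a five-dimensional torus: one step in either direction along each dimension, with wrap-around links closing the torus, giving ten links per node.

      def torus_neighbors(coord, dims):
          # +/-1 along each axis, modulo the torus extent in that dimension.
          neighbors = []
          for axis, extent in enumerate(dims):
              for step in (-1, +1):
                  n = list(coord)
                  n[axis] = (n[axis] + step) % extent
                  neighbors.append(tuple(n))
          return neighbors

      dims = (4, 4, 4, 4, 4)                          # hypothetical partition shape
      print(torus_neighbors((0, 0, 0, 0, 0), dims))   # the ten neighbouring nodes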

  4. Mathematical methods and supercomputing in nuclear applications. Proceedings. Vol. 2

    International Nuclear Information System (INIS)

    Kuesters, H.; Stein, E.; Werner, W.

    1993-04-01

    All papers of the two volumes are separately indexed in the data base. Main topics are: Progress in advanced numerical techniques, fluid mechanics, on-line systems, artificial intelligence applications, nodal methods reactor kinetics, reactor design, supercomputer architecture, probabilistic estimation of risk assessment, methods in transport theory, advances in Monte Carlo techniques, and man-machine interface. (orig.)

  5. Mathematical methods and supercomputing in nuclear applications. Proceedings. Vol. 1

    International Nuclear Information System (INIS)

    Kuesters, H.; Stein, E.; Werner, W.

    1993-04-01

    All papers of the two volumes are separately indexed in the data base. Main topics are: Progress in advanced numerical techniques, fluid mechanics, on-line systems, artificial intelligence applications, nodal methods reactor kinetics, reactor design, supercomputer architecture, probabilistic estimation of risk assessment, methods in transport theory, advances in Monte Carlo techniques, and man-machine interface. (orig.)

  6. PROCEEDINGS OF RIKEN BNL RESEARCH CENTER WORKSHOP, VOLUME 65, RHIC SPIN COLLABORATION MEETINGS XXVII, XXVIII, and XXX

    International Nuclear Information System (INIS)

    OGAWA, A.

    2004-01-01

    The RIKEN BNL Research Center (RBRC) was established in April 1997 at Brookhaven National Laboratory. It is funded by the 'Rikagaku Kenkyusho' (RIKEN, The Institute of Physical and Chemical Research) of Japan. The Center is dedicated to the study of strong interactions, including spin physics, lattice QCD, and RHIC physics through the nurturing of a new generation of young physicists. The RBRC has both a theory and experimental component. At present the theoretical group has 4 Fellows and 3 Research Associates as well as 11 RHIC Physics/University Fellows (academic year 2003-2004). To date there are approximately 30 graduates from the program, of which 13 have attained tenure positions at major institutions worldwide. The experimental group is smaller and has 2 Fellows and 3 RHIC Physics/University Fellows and 3 Research Associates, and historically 6 individuals have attained permanent positions. Beginning in 2001 a new RIKEN Spin Program (RSP) category was implemented at RBRC. These appointments are joint positions of RBRC and RIKEN and include the following positions in theory and experiment: RSP Researchers, RSP Research Associates, and Young Researchers, who are mentored by senior RBRC Scientists. A number of RIKEN Jr. Research Associates and Visiting Scientists also contribute to the physics program at the Center. RBRC has an active workshop program on strong interaction physics, with each workshop focused on a specific physics problem. Each workshop speaker is encouraged to select a few of the most important transparencies from his or her presentation, accompanied by a page of explanation. This material is collected at the end of the workshop by the organizer to form proceedings, which can therefore be available within a short time. To date there are sixty-nine proceedings volumes available. The construction of a 0.6 teraflops parallel processor, dedicated to lattice QCD, begun at the Center on February 19, 1998, was completed on August 28, 1998 and is still

  7. Design and performance characterization of electronic structure calculations on massively parallel supercomputers

    DEFF Research Database (Denmark)

    Romero, N. A.; Glinsvad, Christian; Larsen, Ask Hjorth

    2013-01-01

    Density functional theory (DFT) is the most widely employed electronic structure method because of its favorable scaling with system size and accuracy for a broad range of molecular and condensed-phase systems. The advent of massively parallel supercomputers has enhanced the scientific community

  8. Computational fluid dynamics research at the United Technologies Research Center requiring supercomputers

    Science.gov (United States)

    Landgrebe, Anton J.

    1987-01-01

    An overview of research activities at the United Technologies Research Center (UTRC) in the area of Computational Fluid Dynamics (CFD) is presented. The requirement and use of various levels of computers, including supercomputers, for the CFD activities is described. Examples of CFD directed toward applications to helicopters, turbomachinery, heat exchangers, and the National Aerospace Plane are included. Helicopter rotor codes for the prediction of rotor and fuselage flow fields and airloads were developed with emphasis on rotor wake modeling. Airflow and airload predictions and comparisons with experimental data are presented. Examples are presented of recent parabolized Navier-Stokes and full Navier-Stokes solutions for hypersonic shock-wave/boundary layer interaction, and hydrogen/air supersonic combustion. In addition, other examples of CFD efforts in turbomachinery Navier-Stokes methodology and separated flow modeling are presented. A brief discussion of the 3-tier scientific computing environment is also presented, in which the researcher has access to workstations, mid-size computers, and supercomputers.

  9. ParaBTM: A Parallel Processing Framework for Biomedical Text Mining on Supercomputers.

    Science.gov (United States)

    Xing, Yuting; Wu, Chengkun; Yang, Xi; Wang, Wei; Zhu, En; Yin, Jianping

    2018-04-27

    A prevailing way of extracting valuable information from biomedical literature is to apply text mining methods on unstructured texts. However, the massive amount of literature that needs to be analyzed poses a big data challenge to the processing efficiency of text mining. In this paper, we address this challenge by introducing parallel processing on a supercomputer. We developed paraBTM, a runnable framework that enables parallel text mining on the Tianhe-2 supercomputer. It employs a low-cost yet effective load balancing strategy to maximize the efficiency of parallel processing. We evaluated the performance of paraBTM on several datasets, utilizing three types of named entity recognition tasks as demonstration. Results show that, in most cases, the processing efficiency can be greatly improved with parallel processing, and the proposed load balancing strategy is simple and effective. In addition, our framework can be readily applied to other tasks of biomedical text mining besides NER.
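
    One simple, low-cost static strategy of the kind alluded to above (an illustration, not necessarily the exact scheme used in paraBTM) is longest-document-first assignment to the currently least-loaded worker:

      import heapq

      def balance(doc_lengths, n_workers):
          # Greedily assign the longest remaining document to the least-loaded worker.
          heap = [(0, w) for w in range(n_workers)]        # (current load, worker id)
          heapq.heapify(heap)
          assignment = {w: [] for w in range(n_workers)}
          for doc_id, length in sorted(enumerate(doc_lengths), key=lambda x: -x[1]):
              load, w = heapq.heappop(heap)
              assignment[w].append(doc_id)
              heapq.heappush(heap, (load + length, w))
          return assignment

      print(balance([900, 120, 300, 870, 450, 60, 700], n_workers=3))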

  10. Storage-Intensive Supercomputing Benchmark Study

    Energy Technology Data Exchange (ETDEWEB)

    Cohen, J; Dossa, D; Gokhale, M; Hysom, D; May, J; Pearce, R; Yoo, A

    2007-10-30

    Critical data science applications requiring frequent access to storage perform poorly on today's computing architectures. This project addresses efficient computation of data-intensive problems in national security and basic science by exploring, advancing, and applying a new form of computing called storage-intensive supercomputing (SISC). Our goal is to enable applications that simply cannot run on current systems, and, for a broad range of data-intensive problems, to deliver an order of magnitude improvement in price/performance over today's data-intensive architectures. This technical report documents much of the work done under LDRD 07-ERD-063 Storage Intensive Supercomputing during the period 05/07-09/07. The following chapters describe: (1) a new file I/O monitoring tool iotrace developed to capture the dynamic I/O profiles of Linux processes; (2) an out-of-core graph benchmark for level-set expansion of scale-free graphs; (3) an entity extraction benchmark consisting of a pipeline of eight components; and (4) an image resampling benchmark drawn from the SWarp program in the LSST data processing pipeline. The performance of the graph and entity extraction benchmarks was measured in three different scenarios: data sets residing on the NFS file server and accessed over the network; data sets stored on local disk; and data sets stored on the Fusion I/O parallel NAND Flash array. The image resampling benchmark compared performance of software-only to GPU-accelerated. In addition to the work reported here, an additional text processing application was developed that used an FPGA to accelerate n-gram profiling for language classification. The n-gram application will be presented at SC07 at the High Performance Reconfigurable Computing Technologies and Applications Workshop. The graph and entity extraction benchmarks were run on a Supermicro server housing the NAND Flash 40GB parallel disk array, the Fusion-io. The Fusion system specs are as follows

  11. Summary of multi-core hardware and programming model investigations

    Energy Technology Data Exchange (ETDEWEB)

    Kelly, Suzanne Marie; Pedretti, Kevin Thomas Tauke; Levenhagen, Michael J.

    2008-05-01

    This report summarizes our investigations into multi-core processors and programming models for parallel scientific applications. The motivation for this study was to better understand the landscape of multi-core hardware, future trends, and the implications on system software for capability supercomputers. The results of this study are being used as input into the design of a new open-source light-weight kernel operating system being targeted at future capability supercomputers made up of multi-core processors. A goal of this effort is to create an agile system that is able to adapt to and efficiently support whatever multi-core hardware and programming models gain acceptance by the community.

  12. Explaining the gap between theoretical peak performance and real performance for supercomputer architectures

    International Nuclear Information System (INIS)

    Schoenauer, W.; Haefner, H.

    1993-01-01

    The basic architectures of vector and parallel computers with their properties are presented. Then the memory size and the arithmetic operations in the context of memory bandwidth are discussed. For the exemplary discussion of a single operation, micro-measurements of the vector triad for the IBM 3090 VF and the CRAY Y-MP/8 are presented. They reveal the details of the losses for a single operation. Then we analyze the global performance of a whole supercomputer by identifying reduction factors that bring down the theoretical peak performance to the poor real performance. The responsibilities of the manufacturer and of the user for these losses are discussed. Then the price-performance ratio for different architectures in a snapshot of January 1991 is briefly mentioned. Finally, some remarks on a user-friendly architecture for a supercomputer are made. (orig.)
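
    The vector triad used in those micro-measurements is a = b + c * d; a hedged NumPy rendition is given below (array size arbitrary). Its performance is set by memory traffic rather than arithmetic, which is precisely the kind of reduction factor the paper dissects.

      import numpy as np
      import time

      n = 10_000_000
      b, c, d = (np.random.rand(n) for _ in range(3))
      a = np.empty(n)

      t0 = time.perf_counter()
      np.multiply(c, d, out=a)     # a = c * d
      np.add(a, b, out=a)          # a = a + b, i.e. a = b + c * d
      elapsed = time.perf_counter() - t0

      # Two flops per element; the limiting resource is memory bandwidth.
      print(f"{2 * n / elapsed / 1e9:.2f} GFLOP/s in {elapsed * 1e3:.1f} ms")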

  13. De Novo Ultrascale Atomistic Simulations On High-End Parallel Supercomputers

    Energy Technology Data Exchange (ETDEWEB)

    Nakano, A; Kalia, R K; Nomura, K; Sharma, A; Vashishta, P; Shimojo, F; van Duin, A; Goddard, III, W A; Biswas, R; Srivastava, D; Yang, L H

    2006-09-04

    We present a de novo hierarchical simulation framework for first-principles based predictive simulations of materials and their validation on high-end parallel supercomputers and geographically distributed clusters. In this framework, high-end chemically reactive and non-reactive molecular dynamics (MD) simulations explore a wide solution space to discover microscopic mechanisms that govern macroscopic material properties, into which highly accurate quantum mechanical (QM) simulations are embedded to validate the discovered mechanisms and quantify the uncertainty of the solution. The framework includes an embedded divide-and-conquer (EDC) algorithmic framework for the design of linear-scaling simulation algorithms with minimal bandwidth complexity and tight error control. The EDC framework also enables adaptive hierarchical simulation with automated model transitioning assisted by graph-based event tracking. A tunable hierarchical cellular decomposition parallelization framework then maps the O(N) EDC algorithms onto Petaflops computers, while achieving performance tunability through a hierarchy of parameterized cell data/computation structures, as well as its implementation using hybrid Grid remote procedure call + message passing + threads programming. High-end computing platforms such as IBM BlueGene/L, SGI Altix 3000 and the NSF TeraGrid provide an excellent test ground for the framework. On these platforms, we have achieved unprecedented scales of quantum-mechanically accurate and well validated, chemically reactive atomistic simulations--1.06 billion-atom fast reactive force-field MD and 11.8 million-atom (1.04 trillion grid points) quantum-mechanical MD in the framework of the EDC density functional theory on adaptive multigrids--in addition to 134 billion-atom non-reactive space-time multiresolution MD, with the parallel efficiency as high as 0.998 on 65,536 dual-processor BlueGene/L nodes. We have also achieved an automated execution of hierarchical QM

  14. Evaluation of Random Blinded Re-Checking of AFB Slides under Revised National Tuberculosis Control Programme in Solapur District

    Directory of Open Access Journals (Sweden)

    Swapnil Vishnu Lale

    2013-01-01

    Full Text Available Background: One of the important components of the revised national tuberculosis control programme is ‘Good quality diagnosis, primarily by sputum smear microscopy’. All efforts are made to ensure that the designated microscopy centers function at an optimal level. The process of ‘Random Blinded Re-Checking’ (RBRC) of Acid Fast Bacillus slides is built into the programme. Objectives: To study the relationship of the different types of errors detected in RBRC with respect to time, place and cost, and to study the stability and capability of the RBRC process. Methods: Analysis of secondary data of external quality assessment of Solapur district since January 2006 is supplemented by direct implementation of the programme from April 2011 till date. Data analysis is done using the statistical software Minitab version 16. Results: From January 2006 to May 2012, 42191 slides were re-checked in 77 RBRC sessions at the District Tuberculosis Center, Solapur. A total of 69 errors of different types were detected. Onsite evaluation and panel testing did not show any discordance. Barshi and Mangalwedha Tuberculosis Units (TUs) showed a significantly higher number of errors as compared to Karmala TU (P<0.002). A weighted Pareto chart revealed that the costliest forms of errors are high false negatives and low false negatives. Conclusion: Detection of errors in RBRC sessions follows a Poisson distribution. The process of RBRC is found to be in control and capable of achieving the desired target of detection of errors.
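
    The Poisson finding lends itself to a simple c-chart check (illustrative only, using the totals quoted in the abstract): with 69 errors over 77 sessions the mean error count per session is about 0.9, and any session exceeding the upper control limit c_bar + 3*sqrt(c_bar) would signal an out-of-control process.

      errors, sessions = 69, 77
      c_bar = errors / sessions                  # mean errors per RBRC session
      ucl = c_bar + 3 * c_bar ** 0.5             # c-chart upper control limit
      print(f"mean errors/session = {c_bar:.2f}, upper control limit = {ucl:.2f}")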

  15. HPL and STREAM Benchmarks on SANAM Supercomputer

    KAUST Repository

    Bin Sulaiman, Riman A.

    2017-01-01

    The SANAM supercomputer was jointly built by KACST and FIAS in 2012, ranking second that year in the Green500 list with a power efficiency of 2.3 GFLOPS/W (Rohr et al., 2014). It is a heterogeneous accelerator-based HPC system that has 300 compute nodes. Each node includes two Intel Xeon E5-2650 CPUs, two AMD FirePro S10000 dual GPUs and 128 GiB of main memory. In this work, the seven benchmarks of HPCC were installed and configured to reassess the performance of SANAM, as part of an unpublished master's thesis, after it was reassembled in the Kingdom of Saudi Arabia. We present here detailed results of the HPL and STREAM benchmarks.

  17. An efficient implementation of a backpropagation learning algorithm on quadrics parallel supercomputer

    International Nuclear Information System (INIS)

    Taraglio, S.; Massaioli, F.

    1995-08-01

    A parallel implementation of a library to build and train Multi Layer Perceptrons via the Back Propagation algorithm is presented. The target machine is the SIMD massively parallel supercomputer Quadrics. Performance measures are provided on three different machines with different numbers of processors, for two network examples. A sample source code is given.
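
    The computational core of such a library is dominated by dense matrix products, as the generic single-hidden-layer back propagation sketch below shows (plain NumPy, not the Quadrics code; network sizes, data and learning rate are arbitrary):

      import numpy as np

      rng = np.random.default_rng(0)
      X = rng.random((256, 8))                     # toy inputs
      T = rng.random((256, 2))                     # toy targets
      W1 = rng.standard_normal((8, 16)) * 0.1      # input-to-hidden weights
      W2 = rng.standard_normal((16, 2)) * 0.1      # hidden-to-output weights
      lr = 0.05

      for epoch in range(200):
          H = np.tanh(X @ W1)                      # forward pass
          Y = H @ W2
          E = Y - T                                # output error
          gW2 = H.T @ E                            # gradients via back propagation
          gW1 = X.T @ ((E @ W2.T) * (1.0 - H * H))
          W2 -= lr * gW2 / len(X)
          W1 -= lr * gW1 / len(X)

      print("final MSE:", float(np.mean((np.tanh(X @ W1) @ W2 - T) ** 2)))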

  18. Supercomputing Centers and Electricity Service Providers

    DEFF Research Database (Denmark)

    Patki, Tapasya; Bates, Natalie; Ghatikar, Girish

    2016-01-01

    Supercomputing Centers (SCs) have high and variable power demands, which increase the challenges of the Electricity Service Providers (ESPs) with regards to efficient electricity distribution and reliable grid operation. High penetration of renewable energy generation further exacerbates this problem. In order to develop a symbiotic relationship between the SCs and their ESPs and to support effective power management at all levels, it is critical to understand and analyze how the existing relationships were formed and how these are expected to evolve. In this paper, we first present results from a detailed, quantitative survey-based analysis and compare the perspectives of the European grid and SCs to the ones of the United States (US). We then show that, contrary to expectation, SCs in the US are more open toward cooperating and developing demand-management strategies with their ESPs.

  19. Multi-level programming paradigm for extreme computing

    International Nuclear Information System (INIS)

    Petiton, S.; Sato, M.; Emad, N.; Calvin, C.; Tsuji, M.; Dandouna, M.

    2013-01-01

    In order to propose a framework and programming paradigms for post-peta-scale computing, on the road to exa-scale computing and beyond, we introduced new languages, associated with a hierarchical multi-level programming paradigm, allowing scientific end-users and developers to program highly hierarchical architectures designed for extreme computing. In this paper, we explain the interest of such a hierarchical multi-level programming paradigm for extreme computing and how well it adapts to several large computational science applications, such as linear algebra solvers used for reactor core physics. We describe the YML language and framework, which allow graphs of parallel components to be described, scheduled and computed on supercomputers; the components themselves may be developed using a PGAS-like language such as XMP. Then, we propose experiments on supercomputers (such as the 'K' and 'Hopper' machines) with the hybrid method MERAM (Multiple Explicitly Restarted Arnoldi Method) as a case study for iterative methods manipulating sparse matrices, and the block Gauss-Jordan method as a case study for direct methods manipulating dense matrices. We conclude by proposing evolutions for this programming paradigm. (authors)

  20. Building more powerful less expensive supercomputers using Processing-In-Memory (PIM) LDRD final report.

    Energy Technology Data Exchange (ETDEWEB)

    Murphy, Richard C.

    2009-09-01

    This report details the accomplishments of the 'Building More Powerful Less Expensive Supercomputers Using Processing-In-Memory (PIM)' LDRD ('PIM LDRD', number 105809) for FY07-FY09. Latency dominates all levels of supercomputer design. Within a node, increasing memory latency, relative to processor cycle time, limits CPU performance. Between nodes, the same increase in relative latency impacts scalability. Processing-In-Memory (PIM) is an architecture that directly addresses this problem using enhanced chip fabrication technology and machine organization. PIMs combine high-speed logic and dense, low-latency, high-bandwidth DRAM, and lightweight threads that tolerate latency by performing useful work during memory transactions. This work examines the potential of PIM-based architectures to support mission critical Sandia applications and an emerging class of more data intensive informatics applications. This work has resulted in a stronger architecture/implementation collaboration between 1400 and 1700. Additionally, key technology components have impacted vendor roadmaps, and we are in the process of pursuing these new collaborations. This work has the potential to impact future supercomputer design and construction, reducing power and increasing performance. This final report is organized as follows: this summary chapter discusses the impact of the project (Section 1), provides an enumeration of publications and other public discussion of the work (Section 1), and concludes with a discussion of future work and impact from the project (Section 1). The appendix contains reprints of the refereed publications resulting from this work.

  1. Supercomputers and the mathematical modeling of high complexity problems

    International Nuclear Information System (INIS)

    Belotserkovskii, Oleg M

    2010-01-01

    This paper is a review of many works carried out by members of our scientific school in past years. The general principles of constructing numerical algorithms for high-performance computers are described. Several techniques are highlighted and these are based on the method of splitting with respect to physical processes and are widely used in computing nonlinear multidimensional processes in fluid dynamics, in studies of turbulence and hydrodynamic instabilities and in medicine and other natural sciences. The advances and developments related to the new generation of high-performance supercomputing in Russia are presented.

  2. C Versus Fortran-77 for Scientific Programming

    Directory of Open Access Journals (Sweden)

    Tom MacDonald

    1992-01-01

    Full Text Available The predominant programming language for numeric and scientific applications is Fortran-77, and supercomputers are primarily used to run large-scale numeric and scientific applications. Standard C is not widely used for numerical and scientific programming, yet Standard C provides many desirable linguistic features not present in Fortran-77. Furthermore, the existence of a standard library and preprocessor eliminates the worst portability problems. A comparison of Standard C and Fortran-77 shows several key deficiencies in C that reduce its ability to adequately solve some numerical problems. Some of these problems have already been addressed by the C standard but others remain. Standard C with a few extensions and modifications could be suitable for all numerical applications and could become more popular in supercomputing environments.

  3. Heat dissipation computations of a HVDC ground electrode using a supercomputer

    International Nuclear Information System (INIS)

    Greiss, H.; Mukhedkar, D.; Lagace, P.J.

    1990-01-01

    This paper reports on the temperature of the soil surrounding a High Voltage Direct Current (HVDC) toroidal ground electrode of practical dimensions, in both homogeneous and non-homogeneous soils, computed at incremental points in time using finite difference methods on a supercomputer. Curves of the response were computed and plotted at several locations within the soil in the vicinity of the ground electrode for various values of the soil parameters.
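
    A hedged two-dimensional stand-in for this kind of computation is sketched below (the paper's model is a three-dimensional toroidal electrode in layered soil; the grid size, diffusivity and source strength here are assumed): explicit finite-difference time stepping of dT/dt = alpha * laplacian(T) + q with a fixed far-field temperature.

      import numpy as np

      nx = ny = 128
      dx = 1.0                                   # metres
      alpha = 1.0e-6 * 3600.0                    # assumed soil thermal diffusivity, m^2/h
      dt = 0.2 * dx * dx / alpha                 # well inside the explicit stability limit
      T = np.zeros((nx, ny))                     # temperature rise above ambient
      q = np.zeros((nx, ny))
      q[60:68, 60:68] = 5e-4                     # hypothetical heat source (electrode region)

      for step in range(5000):
          lap = (np.roll(T, 1, 0) + np.roll(T, -1, 0) +
                 np.roll(T, 1, 1) + np.roll(T, -1, 1) - 4.0 * T) / dx**2
          T += dt * (alpha * lap + q)
          T[0, :] = T[-1, :] = T[:, 0] = T[:, -1] = 0.0   # fixed far-field boundary

      print("peak temperature rise:", float(T.max()))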

  4. Argonne National Lab deploys Force10 networks' massively dense ethernet switch for supercomputing cluster

    CERN Multimedia

    2003-01-01

    "Force10 Networks, Inc. today announced that Argonne National Laboratory (Argonne, IL) has successfully deployed Force10 E-Series switch/routers to connect to the TeraGrid, the world's largest supercomputing grid, sponsored by the National Science Foundation (NSF)" (1/2 page).

  5. A supercomputing application for reactors core design and optimization

    International Nuclear Information System (INIS)

    Hourcade, Edouard; Gaudier, Fabrice; Arnaud, Gilles; Funtowiez, David; Ammar, Karim

    2010-01-01

    Advanced nuclear reactor designs are often intuition-driven processes in which designers first develop or use simplified simulation tools for each physical phenomenon involved. As the project develops, complexity in each discipline increases, and the implementation of chaining/coupling capabilities adapted to a supercomputing optimization process is often postponed to a later step, so the task gets increasingly challenging. In the context of renewal in reactor designs, first-realization projects are often run in parallel with advanced design work, although they depend strongly on the final options. As a consequence, tools to globally assess and optimize reactor core features, with the accuracy of current design methods, are needed. This should be possible within reasonable simulation time and without requiring advanced computer skills at the project management level. Also, these tools should easily cope with modeling progress in each discipline throughout the project lifetime. An early-stage development of a multi-physics package adapted to supercomputing is presented. The URANIE platform, developed at CEA and based on the data analysis framework ROOT, is very well adapted to this approach. It provides diversified sampling techniques (SRS, LHS, qMC), fitting tools (neural networks...) and optimization techniques (genetic algorithms), and it makes database management and visualization very easy. In this paper, we present the various implementation steps of this core physics tool, in which neutronics, thermo-hydraulics, and fuel mechanics codes are run simultaneously. A relevant example of optimization of nuclear reactor safety characteristics is presented, and the flexibility of the URANIE tool is illustrated with several approaches to improve Pareto front quality. (author)
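
    Of the sampling techniques listed, Latin hypercube sampling is easy to sketch generically (a textbook construction, not URANIE code; the parameter ranges are hypothetical): each input is split into n equiprobable strata, one point is drawn per stratum, and the strata are shuffled independently across dimensions.

      import numpy as np

      def latin_hypercube(n, d, seed=0):
          rng = np.random.default_rng(seed)
          u = (rng.random((n, d)) + np.arange(n)[:, None]) / n   # one point per stratum
          for j in range(d):
              u[:, j] = rng.permutation(u[:, j])                 # decorrelate the dimensions
          return u

      # e.g. 8 samples of 3 hypothetical core-design parameters scaled to their ranges
      lo = np.array([0.5, 300.0, 0.01])
      hi = np.array([2.0, 600.0, 0.05])
      print(lo + latin_hypercube(8, 3) * (hi - lo))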

  6. Performance Evaluation of Supercomputers using HPCC and IMB Benchmarks

    Science.gov (United States)

    Saini, Subhash; Ciotti, Robert; Gunney, Brian T. N.; Spelce, Thomas E.; Koniges, Alice; Dossa, Don; Adamidis, Panagiotis; Rabenseifner, Rolf; Tiyyagura, Sunil R.; Mueller, Matthias

    2006-01-01

    The HPC Challenge (HPCC) benchmark suite and the Intel MPI Benchmark (IMB) are used to compare and evaluate the combined performance of processor, memory subsystem and interconnect fabric of five leading supercomputers - SGI Altix BX2, Cray X1, Cray Opteron Cluster, Dell Xeon cluster, and NEC SX-8. These five systems use five different networks (SGI NUMALINK4, Cray network, Myrinet, InfiniBand, and NEC IXS). The complete set of HPCC benchmarks is run on each of these systems. Additionally, we present Intel MPI Benchmarks (IMB) results to study the performance of 11 MPI communication functions on these systems.

  7. An Interface for Biomedical Big Data Processing on the Tianhe-2 Supercomputer.

    Science.gov (United States)

    Yang, Xi; Wu, Chengkun; Lu, Kai; Fang, Lin; Zhang, Yong; Li, Shengkang; Guo, Guixin; Du, YunFei

    2017-12-01

    Big data, cloud computing, and high-performance computing (HPC) are on the verge of convergence. Cloud computing is already playing an active part in big data processing with the help of big data frameworks like Hadoop and Spark. The recent upsurge of high-performance computing in China provides extra possibilities and capacity to address the challenges associated with big data. In this paper, we propose Orion, a big data interface on the Tianhe-2 supercomputer, to enable big data applications to run on Tianhe-2 via a single command or a shell script. Orion supports multiple users, and each user can launch multiple tasks. It minimizes the effort needed to initiate big data applications on the Tianhe-2 supercomputer via automated configuration. Orion follows the "allocate-when-needed" paradigm, and it avoids the idle occupation of computational resources. We tested the utility and performance of Orion using a big genomic dataset and achieved a satisfactory performance on Tianhe-2 with very few modifications to existing applications that were implemented in Hadoop/Spark. In summary, Orion provides a practical and economical interface for big data processing on Tianhe-2.

  8. Supercomputations and big-data analysis in strong-field ultrafast optical physics: filamentation of high-peak-power ultrashort laser pulses

    Science.gov (United States)

    Voronin, A. A.; Panchenko, V. Ya; Zheltikov, A. M.

    2016-06-01

    High-intensity ultrashort laser pulses propagating in gas media or in condensed matter undergo complex nonlinear spatiotemporal evolution where temporal transformations of optical field waveforms are strongly coupled to an intricate beam dynamics and ultrafast field-induced ionization processes. At the level of laser peak powers orders of magnitude above the critical power of self-focusing, the beam exhibits modulation instabilities, producing random field hot spots and breaking up into multiple noise-seeded filaments. This problem is described by a (3 + 1)-dimensional nonlinear field evolution equation, which needs to be solved jointly with the equation for ultrafast ionization of a medium. Analysis of this problem, which is equivalent to solving a billion-dimensional evolution problem, is only possible by means of supercomputer simulations augmented with coordinated big-data processing of large volumes of information acquired through theory-guiding experiments and supercomputations. Here, we review the main challenges of supercomputations and big-data processing encountered in strong-field ultrafast optical physics and discuss strategies to confront these challenges.

  9. Quantum Hamiltonian Physics with Supercomputers

    International Nuclear Information System (INIS)

    Vary, James P.

    2014-01-01

    The vision of solving the nuclear many-body problem in a Hamiltonian framework with fundamental interactions tied to QCD via Chiral Perturbation Theory is gaining support. The goals are to preserve the predictive power of the underlying theory, to test fundamental symmetries with the nucleus as laboratory and to develop new understandings of the full range of complex quantum phenomena. Advances in theoretical frameworks (renormalization and many-body methods) as well as in computational resources (new algorithms and leadership-class parallel computers) signal a new generation of theory and simulations that will yield profound insights into the origins of nuclear shell structure, collective phenomena and complex reaction dynamics. Fundamental discovery opportunities also exist in such areas as physics beyond the Standard Model of Elementary Particles, the transition between hadronic and quark–gluon dominated dynamics in nuclei and signals that characterize dark matter. I will review some recent achievements and present ambitious consensus plans along with their challenges for a coming decade of research that will build new links between theory, simulations and experiment. Opportunities for graduate students to embark upon careers in the fast developing field of supercomputer simulations is also discussed

  11. Object-Oriented Scientific Programming with Fortran 90

    Science.gov (United States)

    Norton, C.

    1998-01-01

    Fortran 90 is a modern language that introduces many important new features beneficial for scientific programming. We discuss our experiences in plasma particle simulation and unstructured adaptive mesh refinement on supercomputers, illustrating the features of Fortran 90 that support the object-oriented methodology.

  12. Coherent 40 Gb/s SP-16QAM and 80 Gb/s PDM-16QAM in an Optimal Supercomputer Optical Switch Fabric

    DEFF Research Database (Denmark)

    Karinou, Fotini; Borkowski, Robert; Zibar, Darko

    2013-01-01

    We demonstrate, for the first time, the feasibility of using 40 Gb/s SP-16QAM and 80 Gb/s PDM-16QAM in an optimized cell switching supercomputer optical interconnect architecture based on semiconductor optical amplifiers as ON/OFF gates.

  13. PROCEEDINGS OF RIKEN BNL RESEARCH CENTER WORKSHOP, VOLUME 66

    International Nuclear Information System (INIS)

    OGAWA, A.

    2005-01-01

    The RIKEN BNL Research Center (RBRC) was established in April 1997 at Brookhaven National Laboratory. It is funded by the 'Rikagaku Kenkyusho' (RIKEN, The Institute of Physical and Chemical Research) of Japan. The Center is dedicated to the study of strong interactions, including spin physics, lattice QCD, and RHIC physics through the nurturing of a new generation of young physicists. The RBRC has both a theory and experimental component. At present the theoretical group has 4 Fellows and 3 Research Associates as well as 11 RHIC Physics/University Fellows (academic year 2003-2004). To date there are approximately 30 graduates from the program, of which 13 have attained tenure positions at major institutions worldwide. The experimental group is smaller and has 2 Fellows and 3 RHIC Physics/University Fellows and 3 Research Associates, and historically 6 individuals have attained permanent positions. Beginning in 2001 a new RIKEN Spin Program (RSP) category was implemented at RBRC. These appointments are joint positions of RBRC and RIKEN and include the following positions in theory and experiment: RSP Researchers, RSP Research Associates, and Young Researchers, who are mentored by senior RBRC Scientists. A number of RIKEN Jr. Research Associates and Visiting Scientists also contribute to the physics program at the Center. RBRC has an active workshop program on strong interaction physics, with each workshop focused on a specific physics problem. Each workshop speaker is encouraged to select a few of the most important transparencies from his or her presentation, accompanied by a page of explanation. This material is collected at the end of the workshop by the organizer to form proceedings, which can therefore be available within a short time. To date there are sixty-nine proceedings volumes available. The construction of a 0.6 teraflops parallel processor, dedicated to lattice QCD, begun at the Center on February 19, 1998, was completed on August 28, 1998 and is still

  14. Cooperative visualization and simulation in a supercomputer environment

    International Nuclear Information System (INIS)

    Ruehle, R.; Lang, U.; Wierse, A.

    1993-01-01

    The article takes a closer look at the requirements imposed by the idea of integrating all the components into a homogeneous software environment. To this end, several methods for distributing applications depending on the problem type are discussed. The methods currently available at the University of Stuttgart Computer Center for the distribution of applications are further explained. Finally, the aims and characteristics of a European-sponsored project called PAGEIN, which fits perfectly into the line of developments at RUS, are explained. The aim of the project is to experiment with future cooperative working modes of aerospace scientists in a high-speed distributed supercomputing environment. Project results will have an impact on the development of real future scientific application environments. (orig./DG)

  15. Accelerating Science Impact through Big Data Workflow Management and Supercomputing

    Directory of Open Access Journals (Sweden)

    De K.

    2016-01-01

    Full Text Available The Large Hadron Collider (LHC), operating at the international CERN Laboratory in Geneva, Switzerland, is leading Big Data driven scientific explorations. ATLAS, one of the largest collaborations ever assembled in the history of science, is at the forefront of research at the LHC. To address an unprecedented multi-petabyte data processing challenge, the ATLAS experiment is relying on a heterogeneous distributed computational infrastructure. To manage the workflow for all data processing on hundreds of data centers, the PanDA (Production and Distributed Analysis) Workload Management System is used. An ambitious program to expand PanDA to all available computing resources, including opportunistic use of commercial and academic clouds and Leadership Computing Facilities (LCF), is being realized within the BigPanDA and megaPanDA projects. These projects are now exploring how PanDA might be used for managing computing jobs that run on supercomputers including OLCF's Titan and NRC-KI HPC2. The main idea is to reuse, as much as possible, existing components of the PanDA system that are already deployed on the LHC Grid for analysis of physics data. The next generation of PanDA will allow many data-intensive sciences employing a variety of computing platforms to benefit from ATLAS experience and proven tools in highly scalable processing.
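
    The fan-out of independent per-node payloads over supercomputer worker nodes, mentioned above for Titan, can be sketched with a lightweight MPI wrapper (hypothetical payload name and directory layout; the real PanDA pilot logic is far richer): every MPI rank launches one single-node job in its own working directory and only the exit codes are gathered.

      from mpi4py import MPI
      import os
      import subprocess
      import sys

      comm = MPI.COMM_WORLD
      rank = comm.Get_rank()

      payload = os.path.abspath("payload.sh")            # stand-in for the per-node production job
      workdir = os.path.join("run", f"rank_{rank:05d}")
      os.makedirs(workdir, exist_ok=True)

      # Each rank runs its payload independently; MPI is used only to fan out
      # and to gather the exit codes at the end.
      result = subprocess.run(["bash", payload, str(rank)],
                              cwd=workdir, capture_output=True, text=True)

      codes = comm.gather(result.returncode, root=0)
      if rank == 0:
          failed = sum(1 for code in codes if code != 0)
          print(f"{len(codes)} payloads finished, {failed} failed", file=sys.stderr)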

  16. Use of high performance networks and supercomputers for real-time flight simulation

    Science.gov (United States)

    Cleveland, Jeff I., II

    1993-01-01

    In order to meet the stringent time-critical requirements for real-time man-in-the-loop flight simulation, computer processing operations must be consistent in processing time and be completed in as short a time as possible. These operations include simulation mathematical model computation and data input/output to the simulators. In 1986, in response to increased demands for flight simulation performance, NASA's Langley Research Center (LaRC), working with the contractor, developed extensions to the Computer Automated Measurement and Control (CAMAC) technology which resulted in a factor of ten increase in the effective bandwidth and reduced latency of modules necessary for simulator communication. This technology extension is being used by more than 80 leading technological developers in the United States, Canada, and Europe. Included among the commercial applications are nuclear process control, power grid analysis, process monitoring, real-time simulation, and radar data acquisition. Personnel at LaRC are completing the development of the use of supercomputers for mathematical model computation to support real-time flight simulation. This includes the development of a real-time operating system and development of specialized software and hardware for the simulator network. This paper describes the data acquisition technology and the development of supercomputing for flight simulation.

  17. Federal Market Information Technology in the Post Flash Crash Era: Roles for Supercomputing

    Energy Technology Data Exchange (ETDEWEB)

    Bethel, E. Wes; Leinweber, David; Ruebel, Oliver; Wu, Kesheng

    2011-09-16

    This paper describes collaborative work between active traders, regulators, economists, and supercomputing researchers to replicate and extend investigations of the Flash Crash and other market anomalies in a National Laboratory HPC environment. Our work suggests that supercomputing tools and methods will be valuable to market regulators in achieving the goal of market safety, stability, and security. Research results using high frequency data and analytics are described, and directions for future development are discussed. Currently the key mechanism for preventing catastrophic market action is the “circuit breaker.” We believe a more graduated approach, similar to the “yellow light” approach in motorsports to slow down traffic, might be a better way to achieve the same goal. To enable this objective, we study a number of indicators that could foresee hazards in market conditions and explore options to confirm such predictions. Our tests confirm that Volume Synchronized Probability of Informed Trading (VPIN) and a version of the volume Herfindahl-Hirschman Index (HHI) for measuring market fragmentation can indeed give strong signals ahead of the Flash Crash event on May 6, 2010. This is a preliminary step toward a full-fledged early-warning system for unusual market conditions.
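
    To make the fragmentation indicator concrete, a volume HHI is the sum of squared volume shares across trading venues; it runs from 1/N for an even split to 1.0 when all volume sits at one venue. The venues and volumes below are invented for illustration.

      volumes = {"venue_A": 4.1e6, "venue_B": 2.3e6, "venue_C": 1.9e6, "venue_D": 0.7e6}
      total = sum(volumes.values())
      hhi = sum((v / total) ** 2 for v in volumes.values())
      print(f"volume HHI = {hhi:.3f}")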

  18. A fast random number generator for the Intel Paragon supercomputer

    Science.gov (United States)

    Gutbrod, F.

    1995-06-01

    A pseudo-random number generator is presented which makes optimal use of the architecture of the i860-microprocessor and which is expected to have a very long period. It is therefore a good candidate for use on the parallel supercomputer Paragon XP. In the assembler version, it needs 6.4 cycles for a real*4 random number. There is a FORTRAN routine which yields identical numbers up to rare and minor rounding discrepancies, and it needs 28 cycles. The FORTRAN performance on other microprocessors is somewhat better. Arguments for the quality of the generator and some numerical tests are given.

  19. Frequently updated noise threat maps created with use of supercomputing grid

    Directory of Open Access Journals (Sweden)

    Szczodrak Maciej

    2014-09-01

    Full Text Available Innovative supercomputing grid services devoted to noise threat evaluation are presented. The services described in this paper concern two issues: the first is related to noise mapping, while the second focuses on assessment of the noise dose and its influence on the human hearing system. The discussed services were developed within the PL-Grid Plus Infrastructure, which brings together Polish academic supercomputer centers. Selected experimental results achieved by using the proposed services are presented. The assessment of environmental noise threats includes creation of noise maps using either offline or online data acquired through a grid of monitoring stations. A concept of estimating the source model parameters based on the measured sound level, for the purpose of creating frequently updated noise maps, is presented. Connecting the noise mapping grid service with a distributed sensor network makes it possible to automatically update noise maps for a specified time period. Moreover, a unique attribute of the developed software is the estimation of the auditory effects evoked by exposure to noise. The estimation method uses a modified psychoacoustic model of hearing and is based on the calculated noise level values and on the given exposure period. Potential use scenarios of the grid services for research or educational purposes are introduced. Presentation of the predicted hearing threshold shift caused by exposure to excessive noise can raise public awareness of noise threats.
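
    The noise-dose side of the service rests on equivalent-level calculations of the standard form below (a generic acoustics formula given for illustration; the exposure profile is hypothetical, and the service itself additionally applies a psychoacoustic hearing model):

      import math

      # Equivalent continuous sound level over an exposure made of intervals at
      # different measured levels; durations in hours, levels in dB(A).
      exposure = [(55.0, 4.0), (72.0, 3.0), (88.0, 1.0)]   # hypothetical (level, duration) pairs
      total = sum(t for _, t in exposure)
      leq = 10.0 * math.log10(sum(t * 10 ** (L / 10.0) for L, t in exposure) / total)
      print(f"L_eq over {total:.0f} h = {leq:.1f} dB(A)")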

  20. Using the LANSCE irradiation facility to predict the number of fatal soft errors in one of the world's fastest supercomputers

    International Nuclear Information System (INIS)

    Michalak, S.E.; Harris, K.W.; Hengartner, N.W.; Takala, B.E.; Wender, S.A.

    2005-01-01

    Los Alamos National Laboratory (LANL) is home to the Los Alamos Neutron Science Center (LANSCE). LANSCE is a unique facility because its neutron spectrum closely mimics the neutron spectrum at terrestrial and aircraft altitudes, but is many times more intense. Thus, LANSCE provides an ideal setting for accelerated testing of semiconductor and other devices that are susceptible to cosmic ray induced neutrons. Many industrial companies use LANSCE to estimate device susceptibility to cosmic ray induced neutrons, and it has also been used to test parts from one of LANL's supercomputers, the ASC (Advanced Simulation and Computing Program) Q. This paper discusses our use of the LANSCE facility to study components in Q including a comparison with failure data from Q
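
    The accelerated-testing logic can be made concrete with a back-of-the-envelope rescaling (all numbers below are illustrative assumptions, not values from the Q study): failures observed under the intense beam are scaled by the ratio of beam flux to the natural neutron flux to estimate a field failure rate.

      beam_flux = 1.0e9            # neutrons/cm^2/h at the test station (assumed)
      natural_flux = 13.0          # neutrons/cm^2/h, rough sea-level figure (>10 MeV)
      observed_failures = 12       # failures seen during the beam exposure (assumed)
      test_hours = 40.0

      acceleration = beam_flux / natural_flux
      mtbf_field_hours = test_hours * acceleration / observed_failures
      print(f"acceleration ~{acceleration:.1e}x, estimated per-device MTBF ~{mtbf_field_hours:.2e} h")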

  1. Evaluating the networking characteristics of the Cray XC-40 Intel Knights Landing-based Cori supercomputer at NERSC

    Energy Technology Data Exchange (ETDEWEB)

    Doerfler, Douglas [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Austin, Brian [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Cook, Brandon [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Deslippe, Jack [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Kandalla, Krishna [Cray Inc, Bloomington, MN (United States); Mendygral, Peter [Cray Inc, Bloomington, MN (United States)

    2017-09-12

    There are many potential issues associated with deploying the Intel Xeon Phi™ (code named Knights Landing [KNL]) manycore processor in a large-scale supercomputer. One in particular is the ability to fully utilize the high-speed communications network, given that the serial performance of a Xeon Phi™ core is a fraction of a Xeon® core. In this paper, we take a look at the trade-offs associated with allocating enough cores to fully utilize the Aries high-speed network versus cores dedicated to computation, e.g., the trade-off between MPI and OpenMP. In addition, we evaluate new features of Cray MPI in support of KNL, such as internode optimizations. We also evaluate one-sided programming models such as Unified Parallel C. We quantify the impact of the above trade-offs and features using a suite of National Energy Research Scientific Computing Center applications.
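
    As a toy illustration of the core-allocation trade-off studied above, the sketch below enumerates ways of splitting a manycore node between MPI ranks and OpenMP threads while holding back a few cores for communication; the 68-core node size and the 2-core reservation are assumptions, not the paper's configuration.

```python
# Minimal sketch of the rank/thread allocation trade-off on a KNL-like node:
# enumerate splits of the cores between MPI ranks and OpenMP threads,
# optionally holding back cores for OS noise / network progress threads.
# The 68-core count and the 2-core reservation are illustrative assumptions.

cores_per_node = 68
reserved_for_comm = 2                      # cores kept free for progress threads
usable = cores_per_node - reserved_for_comm

print(f"{'MPI ranks':>10} {'OMP threads':>12} {'cores used':>11}")
for ranks in (1, 2, 4, 8, 16, 32, 66):
    threads = usable // ranks
    if threads >= 1:
        print(f"{ranks:>10} {threads:>12} {ranks * threads:>11}")
```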

  2. PKU-RBRC Workshop on Transverse Spin

    Energy Technology Data Exchange (ETDEWEB)

    Avakian,H.; Bunce, G.; Yuan, F.

    2008-06-30

    Understanding the structure of the nucleon is a fundamental question in subatomic physics, and it has been under intensive investigation for the last several years. Modern research focuses in particular on the spin structure of the nucleon. Experimental and theoretical investigations worldwide over the last few decades have established that, contrary to naive quark model expectations, quarks carry only about 30% of the total spin of the proton. The origin of the remaining spin is the key question in current hadronic physics and also the major driving force for current and future experiments, such as RHIC and CEBAF in the US, J-PARC in Japan, COMPASS at CERN in Europe, and FAIR at GSI in Germany. Among these studies, transverse-spin physics has developed actively and rapidly in the last few years. Recent studies reveal that transverse-spin physics is closely related to many fundamental properties of QCD dynamics, such as factorization and the non-trivial universality of the parton distribution and fragmentation functions. It was very timely to bring together the theorists and experimentalists in this field at this workshop to review and discuss the latest developments and future perspectives in hadronic spin physics. This workshop was very successful in many respects. First of all, it attracted almost every expert working in this field. We had more than eighty participants in total; among them 27 came from US institutes, 13 from Europe, 3 from Korea, and 2 from Japan. The remaining participants came from local institutes in China. Second, we arranged a full slate of physics presentations, and the program covered all the recent progress made in the last few years. In total, we had 47 physics presentations and two round-table discussions. The discussion sessions were especially useful and much appreciated by all participants. In addition, we scheduled ample time for discussion within each presentation, and the lively discussions impressed and benefited all participants.

  3. Performance Analysis and Scaling Behavior of the Terrestrial Systems Modeling Platform TerrSysMP in Large-Scale Supercomputing Environments

    Science.gov (United States)

    Kollet, S. J.; Goergen, K.; Gasper, F.; Shresta, P.; Sulis, M.; Rihani, J.; Simmer, C.; Vereecken, H.

    2013-12-01

    In studies of the terrestrial hydrologic, energy and biogeochemical cycles, integrated multi-physics simulation platforms take a central role in characterizing non-linear interactions, variances and uncertainties of system states and fluxes in reciprocity with observations. Recently developed integrated simulation platforms attempt to honor the complexity of the terrestrial system across multiple time and space scales, from the deeper subsurface including groundwater dynamics up into the atmosphere. Technically, this requires the coupling of atmospheric, land surface, and subsurface-surface flow models in supercomputing environments, while ensuring a high degree of efficiency in the utilization of, e.g., standard Linux clusters and massively parallel resources. A systematic performance analysis including profiling and tracing in such an application is crucial for understanding the runtime behavior and identifying optimum model settings, and is an efficient way to spot potential parallel deficiencies. On sophisticated leadership-class supercomputers, such as the 28-rack 5.9 petaFLOP IBM Blue Gene/Q 'JUQUEEN' of the Jülich Supercomputing Centre (JSC), this is a challenging but all the more important task when complex coupled component models are to be analysed. Here we present our experience from coupling, application tuning (e.g., a 5-times speedup through compiler optimizations), parallel scaling and performance monitoring of the parallel Terrestrial Systems Modeling Platform TerrSysMP. The modeling platform consists of the weather prediction system COSMO of the German Weather Service; the Community Land Model, CLM, of NCAR; and the variably saturated surface-subsurface flow code ParFlow. The model system relies on the Multiple Program Multiple Data (MPMD) execution model, where the external Ocean-Atmosphere-Sea-Ice-Soil coupler (OASIS3) links the component models. TerrSysMP has been instrumented with the performance analysis tool Scalasca and analyzed

  4. Communication Characterization and Optimization of Applications Using Topology-Aware Task Mapping on Large Supercomputers

    Energy Technology Data Exchange (ETDEWEB)

    Sreepathi, Sarat [ORNL; D' Azevedo, Eduardo [ORNL; Philip, Bobby [ORNL; Worley, Patrick H [ORNL

    2016-01-01

    On large supercomputers, the job scheduling system may assign a non-contiguous node allocation to user applications depending on available resources. With parallel applications using MPI (Message Passing Interface), the default process ordering does not take into account the actual physical node layout available to the application. This contributes to non-locality in terms of the physical network topology and impacts the communication performance of the application. In order to mitigate such performance penalties, this work describes techniques to identify a suitable task mapping that takes into account both the layout of the allocated nodes and the application's communication behavior. During the first phase of this research, we instrumented and collected performance data to characterize the communication behavior of critical US DOE (United States Department of Energy) applications using an augmented version of the mpiP tool. Subsequently, we developed several reordering methods (spectral bisection, neighbor-join tree, etc.) to combine node layout and application communication data for optimized task placement. We developed a tool called mpiAproxy to facilitate detailed evaluation of the various reordering algorithms without requiring full application executions. This work presents a comprehensive performance evaluation (14,000 experiments) of the various task mapping techniques in lowering communication costs on Titan, the leadership-class supercomputer at Oak Ridge National Laboratory.
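
    The objective such reorderings minimize can be illustrated with the hop-bytes metric (bytes exchanged multiplied by network hops under a given rank-to-node assignment). The sketch below evaluates it on an assumed 4x4 mesh with a random communication matrix and applies a naive greedy swap pass; it is not the spectral-bisection or neighbor-join method of the paper.

```python
# Minimal sketch of the objective behind topology-aware task mapping: the
# "hop-bytes" cost (bytes exchanged x network hops) for a given rank-to-node
# assignment, plus a naive pairwise-swap improvement pass. The 4x4 mesh, the
# random communication matrix and the swap heuristic are illustrative
# assumptions, not the reordering methods of the paper.
import itertools
import numpy as np

rng = np.random.default_rng(0)
ntasks = 16
comm = rng.integers(0, 100, size=(ntasks, ntasks))       # bytes exchanged between tasks
comm = np.triu(comm, 1) + np.triu(comm, 1).T              # symmetric, zero diagonal

def node_coords(node, width=4):
    return divmod(node, width)                             # 4x4 mesh of nodes

def hops(a, b):
    (ax, ay), (bx, by) = node_coords(a), node_coords(b)
    return abs(ax - bx) + abs(ay - by)                     # Manhattan distance

def hop_bytes(mapping):
    return sum(comm[i, j] * hops(mapping[i], mapping[j])
               for i, j in itertools.combinations(range(ntasks), 2))

mapping = list(range(ntasks))                              # default (identity) placement
best = hop_bytes(mapping)
for i, j in itertools.combinations(range(ntasks), 2):      # one greedy sweep of swaps
    mapping[i], mapping[j] = mapping[j], mapping[i]
    cost = hop_bytes(mapping)
    if cost < best:
        best = cost
    else:
        mapping[i], mapping[j] = mapping[j], mapping[i]    # undo the swap

print("hop-bytes: identity =", hop_bytes(list(range(ntasks))), " greedy =", best)
```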

  5. Plasma turbulence calculations on supercomputers

    International Nuclear Information System (INIS)

    Carreras, B.A.; Charlton, L.A.; Dominguez, N.; Drake, J.B.; Garcia, L.; Leboeuf, J.N.; Lee, D.K.; Lynch, V.E.; Sidikman, K.

    1991-01-01

    Although the single-particle picture of magnetic confinement is helpful in understanding some basic physics of plasma confinement, it does not give a full description. Collective effects dominate plasma behavior. Any analysis of plasma confinement requires a self-consistent treatment of the particles and fields. The general picture is further complicated because the plasma, in general, is turbulent. The study of fluid turbulence is a rather complex field by itself. In addition to the difficulties of classical fluid turbulence, plasma turbulence studies face the problems caused by the induced magnetic turbulence, which couples back to the fluid. Since the fluid is not a perfect conductor, this turbulence can lead to changes in the topology of the magnetic field structure, causing the magnetic field lines to wander radially. Because the plasma fluid flows along field lines, the wandering field lines carry the particles with them, and this enhances the losses caused by collisions. The changes in topology are critical for plasma confinement. The study of plasma turbulence and the concomitant transport is a challenging problem. Because of the importance of solving the plasma turbulence problem for controlled thermonuclear research, the high complexity of the problem, and the necessity of attacking the problem with supercomputers, the study of plasma turbulence in magnetic confinement devices is a Grand Challenge problem.

  6. Reactive flow simulations in complex geometries with high-performance supercomputing

    International Nuclear Information System (INIS)

    Rehm, W.; Gerndt, M.; Jahn, W.; Vogelsang, R.; Binninger, B.; Herrmann, M.; Olivier, H.; Weber, M.

    2000-01-01

    In this paper, we report on a modern field code cluster consisting of state-of-the-art reactive Navier-Stokes- and reactive Euler solvers that has been developed on vector- and parallel supercomputers at the research center Juelich. This field code cluster is used for hydrogen safety analyses of technical systems, for example, in the field of nuclear reactor safety and conventional hydrogen demonstration plants with fuel cells. Emphasis is put on the assessment of combustion loads, which could result from slow, fast or rapid flames, including transition from deflagration to detonation. As a sample of proof tests, the special tools have been tested for specific tasks, based on the comparison of experimental and numerical results, which are in reasonable agreement. (author)

  7. Parallel Multivariate Spatio-Temporal Clustering of Large Ecological Datasets on Hybrid Supercomputers

    Energy Technology Data Exchange (ETDEWEB)

    Sreepathi, Sarat [ORNL; Kumar, Jitendra [ORNL; Mills, Richard T. [Argonne National Laboratory; Hoffman, Forrest M. [ORNL; Sripathi, Vamsi [Intel Corporation; Hargrove, William Walter [United States Department of Agriculture (USDA), United States Forest Service (USFS)

    2017-09-01

    A proliferation of data from vast networks of remote sensing platforms (satellites, unmanned aircraft systems (UAS), airborne, etc.), observational facilities (meteorological, eddy covariance, etc.), state-of-the-art sensors, and simulation models offers unprecedented opportunities for scientific discovery. Unsupervised classification is a widely applied data mining approach to derive insights from such data. However, classification of very large data sets is a complex computational problem that requires efficient numerical algorithms and implementations on high performance computing (HPC) platforms. Additionally, increasing power, space, cooling and efficiency requirements have led to the deployment of hybrid supercomputing platforms with complex architectures and memory hierarchies, like the Titan system at Oak Ridge National Laboratory. The advent of such accelerated computing architectures offers new challenges and opportunities for big data analytics in general and, specifically, for large-scale cluster analysis in our case. Although there is an existing body of work on parallel cluster analysis, those approaches do not fully meet the needs imposed by the nature and size of our large data sets. Moreover, they had scaling limitations and were mostly limited to traditional distributed memory computing platforms. We present a parallel Multivariate Spatio-Temporal Clustering (MSTC) technique based on k-means cluster analysis that can target hybrid supercomputers like Titan. We developed a hybrid MPI, CUDA and OpenACC implementation that can utilize both CPU and GPU resources on computational nodes. We describe performance results on Titan that demonstrate the scalability and efficacy of our approach in processing large ecological data sets.
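
    The data-parallel structure underlying such k-means-based clustering can be sketched as follows: each worker computes partial centroid sums and counts for its chunk, and a reduction combines them before the centroid update. The sketch simulates the reduction in a single process; the actual MSTC code distributes chunks over MPI ranks and offloads distance computations to GPUs via CUDA/OpenACC.

```python
# Minimal sketch of the data-parallel k-means pattern used in MSTC-style codes:
# each worker computes partial centroid sums/counts for its chunk, and a
# reduction combines them. The reduction is simulated in-process here; the real
# code distributes chunks over MPI ranks and offloads distances to GPUs.
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(size=(10000, 3))            # observations x variables
k, iters, nworkers = 4, 10, 8
centroids = data[rng.choice(len(data), k, replace=False)]
chunks = np.array_split(data, nworkers)       # stand-in for per-rank data

for _ in range(iters):
    partial_sums = np.zeros((nworkers, k, data.shape[1]))
    partial_counts = np.zeros((nworkers, k), dtype=int)
    for w, chunk in enumerate(chunks):        # each pass = one worker/rank
        d = np.linalg.norm(chunk[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            members = chunk[labels == c]
            partial_counts[w, c] = len(members)
            if len(members):
                partial_sums[w, c] = members.sum(axis=0)
    # "Allreduce" step: combine partial results and update the centroids.
    counts = partial_counts.sum(axis=0)
    sums = partial_sums.sum(axis=0)
    centroids = np.where(counts[:, None] > 0,
                         sums / np.maximum(counts, 1)[:, None], centroids)

print("final cluster sizes:", counts)
```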

  8. High temporal resolution mapping of seismic noise sources using heterogeneous supercomputers

    Science.gov (United States)

    Gokhberg, Alexey; Ermert, Laura; Paitz, Patrick; Fichtner, Andreas

    2017-04-01

    The time- and space-dependent distribution of seismic noise sources is becoming a key ingredient of modern real-time monitoring of various geo-systems. Significant interest in seismic noise source maps with high temporal resolution (days) is expected to come from a number of domains, including natural resources exploration, analysis of active earthquake fault zones and volcanoes, as well as geothermal and hydrocarbon reservoir monitoring. Currently, knowledge of noise sources is insufficient for high-resolution subsurface monitoring applications. Near-real-time seismic data, as well as advanced imaging methods to constrain seismic noise sources, have recently become available. These methods are based on the massive cross-correlation of seismic noise records from all available seismic stations in the region of interest and are therefore very computationally intensive. Heterogeneous massively parallel supercomputing systems introduced in recent years combine conventional multi-core CPUs with GPU accelerators and provide an opportunity for a manifold increase in computing performance. Therefore, these systems represent an efficient platform for implementing a noise source mapping solution. We present the first results of an ongoing research project conducted in collaboration with the Swiss National Supercomputing Centre (CSCS). The project aims at building a service that provides seismic noise source maps for Central Europe with high temporal resolution (days to a few weeks, depending on frequency and data availability). The service is hosted on the CSCS computing infrastructure; all computationally intensive processing is performed on the massively parallel heterogeneous supercomputer "Piz Daint". The solution architecture is based on the Application-as-a-Service concept in order to provide interested external researchers regular access to the noise source maps. The solution architecture includes the following sub-systems: (1) data acquisition responsible for
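
    The computational core of such imaging is the pairwise cross-correlation of noise records, typically computed in the frequency domain. The sketch below correlates two synthetic records for a single station pair via FFT and recovers their relative lag; pre-processing (whitening, windowing) and the actual source inversion are omitted.

```python
# Minimal sketch of the core kernel behind noise-source imaging: the
# cross-correlation of two stations' noise records, computed via FFT.
# The synthetic records below are illustrative placeholders.
import numpy as np

fs, n = 20.0, 4096                       # sampling rate (Hz) and record length
rng = np.random.default_rng(2)
noise = rng.normal(size=n)
lag_samples = 37                         # true delay between the two stations
rec_a = noise + 0.1 * rng.normal(size=n)
rec_b = np.roll(noise, lag_samples) + 0.1 * rng.normal(size=n)

# Frequency-domain cross-correlation: C[m] = sum_t a[t] * b[t + m]
nfft = 2 * n
spec = np.conj(np.fft.rfft(rec_a, nfft)) * np.fft.rfft(rec_b, nfft)
xcorr = np.fft.irfft(spec, nfft)
lags = np.concatenate((np.arange(0, n), np.arange(-n, 0)))   # lag of each sample
best = lags[np.argmax(xcorr)]
print(f"estimated lag: {best} samples ({best / fs:.2f} s)")
```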

  9. Harnessing Petaflop-Scale Multi-Core Supercomputing for Problems in Space Science

    Science.gov (United States)

    Albright, B. J.; Yin, L.; Bowers, K. J.; Daughton, W.; Bergen, B.; Kwan, T. J.

    2008-12-01

    The particle-in-cell kinetic plasma code VPIC has been migrated successfully to the world's fastest supercomputer, Roadrunner, a hybrid multi-core platform built by IBM for the Los Alamos National Laboratory. How this was achieved will be described, and examples of state-of-the-art calculations in space science, in particular the study of magnetic reconnection, will be presented. With VPIC on Roadrunner, we have performed, for the first time, plasma PIC calculations with over one trillion particles, >100× larger than calculations considered "heroic" by community standards. This allows examination of physics at unprecedented scale and fidelity. Roadrunner is an example of an emerging paradigm in supercomputing: the trend toward multi-core systems with deep hierarchies, where memory bandwidth optimization is vital to achieving high performance. Getting VPIC to perform well on such systems is a formidable challenge: the core algorithm is memory bandwidth limited, with a low compute-to-data ratio, and requires random access to memory in its inner loop. That we were able to get VPIC to perform and scale well, achieving >0.374 Pflop/s and linear weak scaling on real physics problems on up to the full 12240-core Roadrunner machine, bodes well for harnessing these machines for our community's needs in the future. Many of the design considerations encountered carry over to other multi-core and accelerated (e.g., via GPU) platforms, and we modified VPIC with flexibility in mind. These will be summarized, and strategies for how one might adapt a code for such platforms will be shared. Work performed under the auspices of the U.S. DOE by the LANS LLC Los Alamos National Laboratory. Dr. Bowers is a LANL Guest Scientist; he is presently at D. E. Shaw Research LLC, 120 W 45th Street, 39th Floor, New York, NY 10036.

  10. Visualization at supercomputing centers: the tale of little big iron and the three skinny guys.

    Science.gov (United States)

    Bethel, E W; van Rosendale, J; Southard, D; Gaither, K; Childs, H; Brugger, E; Ahern, S

    2011-01-01

    Supercomputing centers are unique resources that aim to enable scientific knowledge discovery by employing large computational resources, the "Big Iron." Design, acquisition, installation, and management of the Big Iron are carefully planned and monitored. Because these Big Iron systems produce a tsunami of data, it's natural to colocate the visualization and analysis infrastructure. This infrastructure consists of hardware (Little Iron) and staff (Skinny Guys). Our collective experience suggests that design, acquisition, installation, and management of the Little Iron and Skinny Guys do not receive the same level of treatment as that of the Big Iron. This article explores the following questions about the Little Iron: How should we size the Little Iron to adequately support visualization and analysis of data coming off the Big Iron? What sort of capabilities must it have? Related questions concern the size of the visualization support staff: How big should a visualization program be, that is, how many Skinny Guys should it have? What should the staff do? How much of the visualization should be provided as a support service, and how much should applications scientists be expected to do on their own?

  11. Use of QUADRICS supercomputer as embedded simulator in emergency management systems

    International Nuclear Information System (INIS)

    Bove, R.; Di Costanzo, G.; Ziparo, A.

    1996-07-01

    The experience related to the implementation of MRBT, an atmospheric dispersion model for short-duration releases, is reported. The model was implemented on a QUADRICS-Q1 supercomputer. A description of the MRBT model is given first. It is an analytical model for studying the spreading of light gases released into the atmosphere by accidental releases. The solution of the diffusion equation is Gaussian-like and yields the concentration of the released pollutant as a function of space and time. The QUADRICS architecture is then introduced and the implementation of the model is described. Finally, the integration of the QUADRICS-based model as an embedded simulator in an emergency management system is considered.
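
    The abstract describes a Gaussian-type analytical solution of the diffusion equation. As an illustration of that class of models only (MRBT's actual parameterization is not reproduced here), the sketch below evaluates a standard Gaussian plume formula with ground reflection, using assumed power-law dispersion coefficients and source parameters.

```python
# Minimal sketch of a Gaussian plume concentration estimate (not MRBT itself):
# C(x, y, z) = Q / (2*pi*u*sy*sz) * exp(-y^2 / (2 sy^2))
#              * [exp(-(z-H)^2 / (2 sz^2)) + exp(-(z+H)^2 / (2 sz^2))]
# The power-law dispersion coefficients and source parameters are assumptions.
import numpy as np

def plume_concentration(x, y, z, Q=1.0, u=3.0, H=30.0):
    """Q: release rate [kg/s], u: wind speed [m/s], H: effective release height [m]."""
    sy = 0.08 * x ** 0.9          # crude sigma_y(x) growth law (assumed)
    sz = 0.06 * x ** 0.85         # crude sigma_z(x) growth law (assumed)
    lateral = np.exp(-y**2 / (2 * sy**2))
    vertical = (np.exp(-(z - H)**2 / (2 * sz**2)) +
                np.exp(-(z + H)**2 / (2 * sz**2)))      # ground reflection term
    return Q / (2 * np.pi * u * sy * sz) * lateral * vertical

# Ground-level, centerline concentration at a few downwind distances.
for x in (100.0, 500.0, 2000.0):
    print(f"x = {x:6.0f} m : C = {plume_concentration(x, 0.0, 0.0):.3e} kg/m^3")
```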

  12. SUPERCOMPUTER SIMULATION OF CRITICAL PHENOMENA IN COMPLEX SOCIAL SYSTEMS

    Directory of Open Access Journals (Sweden)

    Petrus M.A. Sloot

    2014-09-01

    Full Text Available The paper describes the problem of computer simulation of critical phenomena in complex social systems on petascale computing systems within the framework of the complex networks approach. A three-layer system of nested models of complex networks is proposed, comprising an aggregated analytical model to identify critical phenomena, a detailed model of individualized network dynamics, and a model to adjust the topological structure of a complex network. A scalable parallel algorithm covering all layers of complex network simulation is proposed. The performance of the algorithm is studied on different supercomputing systems. The issues of the software and information infrastructure for complex network simulation are discussed, including the organization of distributed calculations, crawling the data in social networks, and visualization of results. Applications of the developed methods and technologies are considered, including simulation of criminal network disruption, fast rumor spreading in social networks, evolution of financial networks, and epidemic spreading.

  13. Lectures in Supercomputational Neurosciences Dynamics in Complex Brain Networks

    CERN Document Server

    Graben, Peter beim; Thiel, Marco; Kurths, Jürgen

    2008-01-01

    Computational Neuroscience is a burgeoning field of research where only the combined effort of neuroscientists, biologists, psychologists, physicists, mathematicians, computer scientists, engineers and other specialists, e.g. from linguistics and medicine, seems able to expand the limits of our knowledge. The present volume is an introduction, largely from the physicists' perspective, to the subject matter, with in-depth contributions by systems neuroscientists. A conceptual model for complex networks of neurons is introduced that incorporates many important features of the real brain, such as various types of neurons, various brain areas, inhibitory and excitatory coupling and the plasticity of the network. The computational implementation on supercomputers, which is introduced and discussed in detail in this book, will enable readers to modify and adapt the algorithm for their own research. Worked-out examples of applications are presented for networks of Morris-Lecar neurons to model the cortical co...

  14. Parallel supercomputing: Advanced methods, algorithms, and software for large-scale linear and nonlinear problems

    Energy Technology Data Exchange (ETDEWEB)

    Carey, G.F.; Young, D.M.

    1993-12-31

    The program outlined here is directed to research on methods, algorithms, and software for distributed parallel supercomputers. Of particular interest are finite element methods and finite difference methods together with sparse iterative solution schemes for scientific and engineering computations of very large-scale systems. Both linear and nonlinear problems will be investigated. In the nonlinear case, applications with bifurcation to multiple solutions will be considered using continuation strategies. The parallelizable numerical methods of particular interest are a family of partitioning schemes embracing domain decomposition, element-by-element strategies, and multi-level techniques. The methods will be further developed incorporating parallel iterative solution algorithms with associated preconditioners in parallel computer software. The schemes will be implemented on distributed memory parallel architectures such as the CRAY MPP, Intel Paragon, the NCUBE3, and the Connection Machine. We will also consider other new architectures such as the Kendall-Square (KSQ) and proposed machines such as the TERA. The applications will focus on large-scale three-dimensional nonlinear flow and reservoir problems with strong convective transport contributions. These are legitimate grand challenge class computational fluid dynamics (CFD) problems of significant practical interest to DOE. The methods developed and algorithms will, however, be of wider interest.

  15. Large scale simulations of lattice QCD thermodynamics on Columbia Parallel Supercomputers

    International Nuclear Information System (INIS)

    Ohta, Shigemi

    1989-01-01

    The Columbia Parallel Supercomputer project aims at the construction of a parallel-processing, multi-gigaflop computer optimized for numerical simulations of lattice QCD. The project has three stages: a 16-node, 1/4 GF machine completed in April 1985; a 64-node, 1 GF machine completed in August 1987; and a 256-node, 16 GF machine now under construction. The machines all share a common architecture: a two-dimensional torus formed from a rectangular array of N_1 x N_2 independent and identical processors. A processor is capable of operating in a multi-instruction multi-data mode, except for periods of synchronous interprocessor communication with its four nearest neighbors. Here the thermodynamics simulations on the two working machines are reported. (orig./HSI)

  16. High Temporal Resolution Mapping of Seismic Noise Sources Using Heterogeneous Supercomputers

    Science.gov (United States)

    Paitz, P.; Gokhberg, A.; Ermert, L. A.; Fichtner, A.

    2017-12-01

    The time- and space-dependent distribution of seismic noise sources is becoming a key ingredient of modern real-time monitoring of various geo-systems like earthquake fault zones, volcanoes, geothermal and hydrocarbon reservoirs. We present results of an ongoing research project conducted in collaboration with the Swiss National Supercomputing Centre (CSCS). The project aims at building a service providing seismic noise source maps for Central Europe with high temporal resolution. We use source imaging methods based on the cross-correlation of seismic noise records from all seismic stations available in the region of interest. The service is hosted on the CSCS computing infrastructure; all computationally intensive processing is performed on the massively parallel heterogeneous supercomputer "Piz Daint". The solution architecture is based on the Application-as-a-Service concept to provide the interested researchers worldwide with regular access to the noise source maps. The solution architecture includes the following sub-systems: (1) data acquisition responsible for collecting, on a periodic basis, raw seismic records from the European seismic networks, (2) high-performance noise source mapping application responsible for the generation of source maps using cross-correlation of seismic records, (3) back-end infrastructure for the coordination of various tasks and computations, (4) front-end Web interface providing the service to the end-users and (5) data repository. The noise source mapping itself rests on the measurement of logarithmic amplitude ratios in suitably pre-processed noise correlations, and the use of simplified sensitivity kernels. During the implementation we addressed various challenges, in particular, selection of data sources and transfer protocols, automation and monitoring of daily data downloads, ensuring the required data processing performance, design of a general service-oriented architecture for coordination of various sub-systems, and

  17. An Optimized Parallel FDTD Topology for Challenging Electromagnetic Simulations on Supercomputers

    Directory of Open Access Journals (Sweden)

    Shugang Jiang

    2015-01-01

    Full Text Available It may not be a challenge to run a Finite-Difference Time-Domain (FDTD) code for electromagnetic simulations on a supercomputer with more than ten thousand CPU cores; however, making the FDTD code work with the highest efficiency is a challenge. In this paper, the performance of parallel FDTD is optimized through the MPI (message passing interface) virtual topology, based on which a communication model is established. General rules for the optimal topology are presented according to the model. The performance of the method is tested and analyzed on three high performance computing platforms with different architectures in China. Simulations including an airplane with a 700-wavelength wingspan and a complex microstrip antenna array with nearly 2000 elements are performed very efficiently using a maximum of 10240 CPU cores.
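
    The optimization above acts on the MPI virtual topology used to decompose the FDTD grid. As a minimal sketch of that mechanism only (assuming mpi4py is available; the paper's optimal-topology rules are not reproduced), the code below builds a 3-D Cartesian communicator and performs one face exchange.

```python
# Minimal sketch (assuming mpi4py is available) of the MPI virtual-topology
# mechanism an FDTD decomposition builds on; the paper's topology-selection
# rules are not reproduced here. Launch under mpiexec with any rank count.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
dims = MPI.Compute_dims(comm.Get_size(), 3)          # e.g. 8 ranks -> [2, 2, 2]
cart = comm.Create_cart(dims, periods=[False] * 3, reorder=True)
coords = cart.Get_coords(cart.Get_rank())

# One layer of ghost cells per face; exchange along the x direction as an example.
local = np.full((8, 8, 8), float(cart.Get_rank()))
left, right = cart.Shift(0, 1)                        # lower/higher x neighbours
send = np.ascontiguousarray(local[-1, :, :])          # my top x-slab
recv = np.empty_like(send)
cart.Sendrecv(send, dest=right, recvbuf=recv, source=left)

print(f"rank {cart.Get_rank()} at {coords}: x-neighbours (left={left}, right={right})")
```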

  18. Speedup predictions on large scientific parallel programs

    International Nuclear Information System (INIS)

    Williams, E.; Bobrowicz, F.

    1985-01-01

    How much speedup can we expect for large scientific parallel programs running on supercomputers? For insight into this problem we extend the parallel processing environment currently existing on the Cray X-MP (a shared memory multiprocessor with at most four processors) to a simulated N-processor environment, where N ≥ 1. Several large scientific parallel programs from Los Alamos National Laboratory were run in this simulated environment, and speedups were predicted. A speedup of 14.4 on 16 processors was measured for one of the three most used codes at the Laboratory.
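
    The measured speedup can be turned into a prediction for larger processor counts by inverting Amdahl's law for the parallel fraction. The sketch below does exactly that for the 14.4-on-16 figure quoted above; the extrapolation is only Amdahl's model and ignores communication and memory contention.

```python
# Amdahl's-law sketch: invert the measured speedup (14.4 on 16 processors,
# as quoted in the abstract) to estimate the parallel fraction f, then
# extrapolate. Real codes also have communication costs not captured here.

def speedup(f, n):
    """Amdahl's law: S(N) = 1 / ((1 - f) + f / N)."""
    return 1.0 / ((1.0 - f) + f / n)

def parallel_fraction(measured_speedup, n):
    """Solve S = 1 / ((1 - f) + f / N) for f."""
    return (1.0 - 1.0 / measured_speedup) / (1.0 - 1.0 / n)

f = parallel_fraction(14.4, 16)
print(f"estimated parallel fraction: {f:.4f}")
for n in (16, 64, 256, 1024):
    print(f"  predicted speedup on {n:4d} processors: {speedup(f, n):6.1f}")
```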

  19. Development of a high performance eigensolver on the peta-scale next generation supercomputer system

    International Nuclear Information System (INIS)

    Imamura, Toshiyuki; Yamada, Susumu; Machida, Masahiko

    2010-01-01

    For present supercomputer systems, multicore and multisocket processors are necessary to build a system, and the choice of interconnect is essential. In addition, for effective development of a new code, high-performance, scalable, and reliable numerical software is one of the key items. ScaLAPACK and PETSc are well-known software packages for distributed memory parallel computer systems. It is needless to say that highly tuned software targeting new architectures like many-core processors must be chosen for real computation. In this study, we present a high-performance and highly scalable eigenvalue solver for the next-generation supercomputer system, the so-called 'K computer'. We have developed two versions, the standard version (eigen_s) and an enhanced performance version (eigen_sx), which were developed on the T2K cluster system housed at the University of Tokyo. Eigen_s employs the conventional algorithms: Householder tridiagonalization, the divide and conquer (DC) algorithm, and Householder back-transformation. They are carefully implemented with a blocking technique and a flexible two-dimensional data distribution to reduce the overhead of memory traffic and data transfer, respectively. Eigen_s performs excellently on the T2K system with 4096 cores (theoretical peak of 37.6 TFLOPS), showing a fine performance of 3.0 TFLOPS for a matrix of dimension two hundred thousand. The enhanced version, eigen_sx, uses more advanced algorithms: the narrow-band reduction algorithm, DC for band matrices, and the block Householder back-transformation with WY-representation. Even though this version is still at a test stage, it shows 4.7 TFLOPS for a matrix of the same dimension as used for eigen_s. (author)

  20. Watson will see you now: a supercomputer to help clinicians make informed treatment decisions.

    Science.gov (United States)

    Doyle-Lindrud, Susan

    2015-02-01

    IBM has collaborated with several cancer care providers to develop and train the IBM supercomputer Watson to help clinicians make informed treatment decisions. When a patient is seen in clinic, the oncologist can input all of the clinical information into the computer system. Watson will then review all of the data and recommend treatment options based on the latest evidence and guidelines. Once the oncologist makes the treatment decision, this information can be sent directly to the insurance company for approval. Watson has the ability to standardize care and accelerate the approval process, a benefit to the healthcare provider and the patient.

  1. Affordable and accurate large-scale hybrid-functional calculations on GPU-accelerated supercomputers

    Science.gov (United States)

    Ratcliff, Laura E.; Degomme, A.; Flores-Livas, José A.; Goedecker, Stefan; Genovese, Luigi

    2018-03-01

    Performing high accuracy hybrid functional calculations for condensed matter systems containing a large number of atoms is at present computationally very demanding or even out of reach if high quality basis sets are used. We present a highly optimized multiple graphics processing unit implementation of the exact exchange operator which allows one to perform fast hybrid functional density-functional theory (DFT) calculations with systematic basis sets without additional approximations for up to a thousand atoms. With this method hybrid DFT calculations of high quality become accessible on state-of-the-art supercomputers within a time-to-solution that is of the same order of magnitude as traditional semilocal-GGA functionals. The method is implemented in a portable open-source library.

  2. Semi-infinite fractional programming

    CERN Document Server

    Verma, Ram U

    2017-01-01

    This book presents a smooth and unified transitional framework from generalised fractional programming, with a finite number of variables and a finite number of constraints, to semi-infinite fractional programming, where a number of variables are finite but with infinite constraints. It focuses on empowering graduate students, faculty and other research enthusiasts to pursue more accelerated research advances with significant interdisciplinary applications without borders. In terms of developing general frameworks for theoretical foundations and real-world applications, it discusses a number of new classes of generalised second-order invex functions and second-order univex functions, new sets of second-order necessary optimality conditions, second-order sufficient optimality conditions, and second-order duality models for establishing numerous duality theorems for discrete minmax (or maxmin) semi-infinite fractional programming problems.   In the current interdisciplinary supercomputer-oriented research envi...

  3. Re-inventing electromagnetics - Supercomputing solution of Maxwell's equations via direct time integration on space grids

    International Nuclear Information System (INIS)

    Taflove, A.

    1992-01-01

    This paper summarizes the present state and future directions of applying finite-difference and finite-volume time-domain techniques for Maxwell's equations on supercomputers to model complex electromagnetic wave interactions with structures. Applications so far have been dominated by radar cross section technology, but by no means are limited to this area. In fact, the gains we have made place us on the threshold of being able to make tremendous contributions to non-defense electronics and optical technology. Some of the most interesting research in these commercial areas is summarized. 47 refs

  4. NASA's Climate in a Box: Desktop Supercomputing for Open Scientific Model Development

    Science.gov (United States)

    Wojcik, G. S.; Seablom, M. S.; Lee, T. J.; McConaughy, G. R.; Syed, R.; Oloso, A.; Kemp, E. M.; Greenseid, J.; Smith, R.

    2009-12-01

    NASA's High Performance Computing Portfolio, in cooperation with its Modeling, Analysis, and Prediction program, intends to make its climate and earth science models more accessible to a larger community. A key goal of this effort is to open the model development and validation process to the scientific community at large such that a natural selection process is enabled and results in a more efficient scientific process. One obstacle to others using NASA models is the complexity of the models and the difficulty in learning how to use them. This situation applies not only to scientists who regularly use these models but also to non-typical users who may want to use the models, such as scientists from different domains, policy makers, and teachers. Another obstacle to the use of these models is that access to high performance computing (HPC) accounts, on which the models are run, can be restrictive, with long wait times in job queues and delays caused by an arduous process of obtaining an account, especially for foreign nationals. This project explores the utility of using desktop supercomputers in providing a complete ready-to-use toolkit of climate research products to investigators and on-demand access to an HPC system. One objective of this work is to pre-package NASA and NOAA models so that new users will not have to spend significant time porting the models. In addition, the prepackaged toolkit will include tools, such as workflow, visualization, social networking web sites, and analysis tools, to assist users in running the models and analyzing the data. The system architecture to be developed will allow for automatic code updates for each user and an effective means with which to deal with the data that are generated. We plan to investigate several desktop systems, but our work to date has focused on a Cray CX1. Currently, we are investigating the potential capabilities of several non-traditional development environments. While most NASA and NOAA models are

  5. Earth and environmental science in the 1980's: Part 1: Environmental data systems, supercomputer facilities and networks

    Science.gov (United States)

    1986-01-01

    Overview descriptions of on-line environmental data systems, supercomputer facilities, and networks are presented. Each description addresses the concepts of content, capability, and user access relevant to the point of view of potential utilization by the Earth and environmental science community. The information on similar systems or facilities is presented in parallel fashion to encourage and facilitate intercomparison. In addition, summary sheets are given for each description, and a summary table precedes each section.

  6. Efficient development of memory bounded geo-applications to scale on modern supercomputers

    Science.gov (United States)

    Räss, Ludovic; Omlin, Samuel; Licul, Aleksandar; Podladchikov, Yuri; Herman, Frédéric

    2016-04-01

    Numerical modeling is a key tool in the area of geosciences. The current challenge is to solve problems that are multi-physics and for which the length scale and the place of occurrence might not be known in advance. Also, the spatial extent of the investigated domain might vary strongly in size, ranging from millimeters for reactive transport to kilometers for glacier erosion dynamics. An efficient way to proceed is to develop simple but robust algorithms that perform well and scale on modern supercomputers, and therefore permit very high-resolution simulations. We propose an efficient approach to solve memory-bounded real-world applications on modern supercomputer architectures. We optimize the software to run on our newly acquired state-of-the-art GPU cluster "octopus". Our approach shows promising preliminary results on important geodynamical and geomechanical problems: we have developed a Stokes solver for glacier flow and a poromechanical solver including complex rheologies for nonlinear waves in stressed porous rocks. We solve the system of partial differential equations on a regular Cartesian grid and use an iterative finite difference scheme with preconditioning of the residuals. The MPI communication happens only locally (point-to-point); this method is known to scale linearly by construction. The "octopus" GPU cluster, which we use for the computations, has been designed to achieve maximal data transfer throughput at minimal hardware cost. It is composed of twenty compute nodes, each hosting four Nvidia Titan X GPU accelerators. These high-density nodes are interconnected with a parallel (dual-rail) FDR InfiniBand network. Our efforts show promising preliminary results for the different physics investigated. The glacier flow solver achieves good accuracy in the relevant benchmarks, and the coupled poromechanical solver makes it possible to explain previously unresolvable focused fluid flow as a natural outcome of the porosity setup. In both cases
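
    The iteration pattern described above, an explicit finite-difference update driven by damped (preconditioned) residuals on a regular Cartesian grid, can be sketched on a stand-in problem. The code below applies it to a 2-D Poisson equation; the damping and pseudo-time-step values are illustrative assumptions, not the glacier or poromechanics solvers themselves.

```python
# Minimal sketch of the iteration pattern described above: an explicit
# pseudo-transient update driven by a damped residual on a regular Cartesian
# grid. A 2-D Poisson problem stands in for the glacier / poromechanics physics;
# the damping and step parameters are illustrative assumptions.
import numpy as np

nx = ny = 128
dx = dy = 1.0 / (nx - 1)
u = np.zeros((nx, ny))
rhs = np.ones((nx, ny))                      # source term
dtau = 0.2 * min(dx, dy) ** 2                # pseudo-time step (stability-limited)
damp = 0.9                                   # residual damping ("preconditioning")
dudtau = np.zeros_like(u)

for it in range(20000):
    lap = ((u[:-2, 1:-1] - 2 * u[1:-1, 1:-1] + u[2:, 1:-1]) / dx**2 +
           (u[1:-1, :-2] - 2 * u[1:-1, 1:-1] + u[1:-1, 2:]) / dy**2)
    res = lap + rhs[1:-1, 1:-1]              # residual of the problem -lap(u) = rhs
    dudtau[1:-1, 1:-1] = res + damp * dudtau[1:-1, 1:-1]   # damped update direction
    u[1:-1, 1:-1] += dtau * dudtau[1:-1, 1:-1]
    if it % 1000 == 0 and np.max(np.abs(res)) < 1e-6:
        break

print("iterations:", it, " max residual:", np.max(np.abs(res)))
```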

  7. Research center Juelich to install Germany's most powerful supercomputer new IBM System for science and research will achieve 5.8 trillion computations per second

    CERN Multimedia

    2002-01-01

    "The Research Center Juelich, Germany, and IBM today announced that they have signed a contract for the delivery and installation of a new IBM supercomputer at the Central Institute for Applied Mathematics" (1/2 page).

  8. MEGADOCK 4.0: an ultra-high-performance protein-protein docking software for heterogeneous supercomputers.

    Science.gov (United States)

    Ohue, Masahito; Shimoda, Takehiro; Suzuki, Shuji; Matsuzaki, Yuri; Ishida, Takashi; Akiyama, Yutaka

    2014-11-15

    The application of protein-protein docking in large-scale interactome analysis is a major challenge in structural bioinformatics and requires huge computing resources. In this work, we present MEGADOCK 4.0, an FFT-based docking software package that makes extensive use of recent heterogeneous supercomputers and shows powerful, scalable performance of >97% strong scaling. MEGADOCK 4.0 is written in C++ with OpenMPI and NVIDIA CUDA 5.0 (or later) and is freely available to all academic and non-profit users at http://www.bi.cs.titech.ac.jp/megadock. Supplementary data are available at Bioinformatics online.
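
    The kernel that makes such docking FFT-friendly is the correlation theorem: all translational placements of a ligand grid against a receptor grid can be scored with three 3-D FFTs. The sketch below shows that kernel on random placeholder grids; MEGADOCK's actual scoring function and rotational search are not reproduced.

```python
# Minimal sketch of the FFT correlation kernel used by FFT-based docking codes:
# all translational placements of a ligand grid against a receptor grid are
# scored with three 3-D FFTs. Real codes (e.g. MEGADOCK) use physics-based
# grids and also search rotations; the random grids here are placeholders.
import numpy as np

rng = np.random.default_rng(3)
n = 32
receptor = rng.random((n, n, n))             # placeholder receptor score grid
ligand = np.zeros((n, n, n))
ligand[:8, :8, :8] = rng.random((8, 8, 8))   # placeholder ligand occupying a corner

# Correlation theorem: score(t) = sum_x R(x) * L(x - t) = IFFT( FFT(R) * conj(FFT(L)) )
scores = np.fft.ifftn(np.fft.fftn(receptor) * np.conj(np.fft.fftn(ligand))).real
best = np.unravel_index(np.argmax(scores), scores.shape)
print("best translation (voxels):", best, " score:", round(float(scores[best]), 2))
```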

  9. A high level language for a high performance computer

    Science.gov (United States)

    Perrott, R. H.

    1978-01-01

    The proposed computational aerodynamic facility will join the ranks of the supercomputers due to its architecture and increased execution speed. At present, the languages used to program these supercomputers have been modifications of programming languages which were designed many years ago for sequential machines. A new programming language should be developed based on the techniques which have proved valuable for sequential programming languages and incorporating the algorithmic techniques required for these supercomputers. The design objectives for such a language are outlined.

  10. Wavelet transform-vector quantization compression of supercomputer ocean model simulation output

    Energy Technology Data Exchange (ETDEWEB)

    Bradley, J N; Brislawn, C M

    1992-11-12

    We describe a new procedure for efficient compression of digital information for storage and transmission purposes. The algorithm involves a discrete wavelet transform subband decomposition of the data set, followed by vector quantization of the wavelet transform coefficients using application-specific vector quantizers. The new vector quantizer design procedure optimizes the assignment of both memory resources and vector dimensions to the transform subbands by minimizing an exponential rate-distortion functional subject to constraints on both overall bit-rate and encoder complexity. The wavelet-vector quantization method, which originates in digital image compression, is applicable to the compression of other multidimensional data sets possessing some degree of smoothness. In this paper we discuss the use of this technique for compressing the output of supercomputer simulations of global climate models. The data presented here come from Semtner-Chervin global ocean models run at the National Center for Atmospheric Research and at the Los Alamos Advanced Computing Laboratory.
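
    The first stage of such a pipeline, the subband decomposition, can be sketched with a one-level 2-D Haar transform followed by per-subband quantization. In the sketch below, uniform scalar quantization with assumed step sizes stands in for the trained vector quantizers described above.

```python
# Minimal sketch of the compression pipeline's first stage: a one-level 2-D
# Haar wavelet subband decomposition followed by per-subband quantization.
# Uniform scalar quantization stands in for the paper's trained vector
# quantizers, and the per-subband step sizes are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(4)
field = rng.normal(size=(256, 256)).cumsum(axis=0).cumsum(axis=1)  # smooth-ish 2-D field

def haar2d(a):
    """One decomposition level: returns (LL, LH, HL, HH) subbands."""
    lo_r = (a[:, 0::2] + a[:, 1::2]) / 2.0   # low-pass along rows
    hi_r = (a[:, 0::2] - a[:, 1::2]) / 2.0   # high-pass along rows
    ll = (lo_r[0::2, :] + lo_r[1::2, :]) / 2.0
    lh = (lo_r[0::2, :] - lo_r[1::2, :]) / 2.0
    hl = (hi_r[0::2, :] + hi_r[1::2, :]) / 2.0
    hh = (hi_r[0::2, :] - hi_r[1::2, :]) / 2.0
    return ll, lh, hl, hh

def quantize(band, step):
    return np.round(band / step).astype(np.int32)        # indices to be entropy-coded

steps = {"LL": 0.25, "LH": 1.0, "HL": 1.0, "HH": 2.0}    # finer step where energy lives
subbands = dict(zip(steps, haar2d(field)))
quantized = {name: quantize(band, steps[name]) for name, band in subbands.items()}

for name, q in quantized.items():
    print(f"{name}: {q.shape}, nonzero coefficients: {np.count_nonzero(q)}")
```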

  11. Scalable geocomputation: evolving an environmental model building platform from single-core to supercomputers

    Science.gov (United States)

    Schmitz, Oliver; de Jong, Kor; Karssenberg, Derek

    2017-04-01

    There is an increasing demand to run environmental models at a large scale: simulations over large areas at high resolution. The heterogeneity of available computing hardware, such as multi-core CPUs, GPUs or supercomputers, potentially provides significant computing power to fulfil this demand. However, this requires detailed knowledge of the underlying hardware, parallel algorithm design and the implementation thereof in an efficient system programming language. Domain scientists such as hydrologists or ecologists often lack this specific software engineering knowledge; their emphasis is (and should be) on exploratory building and analysis of simulation models. As a result, models constructed by domain specialists mostly do not take full advantage of the available hardware. A promising solution is to separate the model building activity from software engineering by offering domain specialists a model building framework with pre-programmed building blocks that they combine to construct a model. The model building framework, consequently, needs to have built-in capabilities to make full use of the available hardware. Developing such a framework that provides understandable code for domain scientists while being runtime-efficient poses several challenges for its developers. For example, optimisations can be performed on individual operations or on the whole model, or tasks need to be generated for a well-balanced execution without explicitly knowing the complexity of the domain problem provided by the modeller. Ideally, a modelling framework supports the optimal use of available hardware whichever combination of model building blocks scientists use. We demonstrate our ongoing work on developing parallel algorithms for spatio-temporal modelling and demonstrate 1) PCRaster, an environmental software framework (http://www.pcraster.eu) providing spatio-temporal model building blocks and 2) parallelisation of about 50 of these building blocks using

  12. Evolution of a minimal parallel programming model

    International Nuclear Information System (INIS)

    Lusk, Ewing; Butler, Ralph; Pieper, Steven C.

    2017-01-01

    Here, we take a historical approach to our presentation of self-scheduled task parallelism, a programming model with its origins in early irregular and nondeterministic computations encountered in automated theorem proving and logic programming. We show how an extremely simple task model has evolved into a system, asynchronous dynamic load balancing (ADLB), and a scalable implementation capable of supporting sophisticated applications on today’s (and tomorrow’s) largest supercomputers; and we illustrate the use of ADLB with a Green’s function Monte Carlo application, a modern, mature nuclear physics code in production use. Our lesson is that by surrendering a certain amount of generality and thus applicability, a minimal programming model (in terms of its basic concepts and the size of its application programmer interface) can achieve extreme scalability without introducing complexity.
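
    The self-scheduling idea can be illustrated with a single-node analogue: workers pull tasks from a shared queue whenever they are idle, so irregular task costs balance out without any static partition. The sketch below uses Python threads and a queue; ADLB itself is an MPI library and its API is not reproduced here.

```python
# Minimal single-node analogue of self-scheduled task parallelism (ADLB itself
# is an MPI library): workers pull tasks from a shared queue whenever they are
# idle, so irregular task costs balance out without static partitioning.
import queue
import random
import threading
import time

work = queue.Queue()
results = queue.Queue()

def worker(wid):
    while True:
        task = work.get()
        if task is None:                  # poison pill: no more work
            work.task_done()
            return
        time.sleep(task)                  # stand-in for an irregular computation
        results.put((wid, task))
        work.task_done()

tasks = [random.uniform(0.001, 0.02) for _ in range(200)]   # irregular task costs
for t in tasks:
    work.put(t)

nworkers = 4
threads = [threading.Thread(target=worker, args=(w,)) for w in range(nworkers)]
for th in threads:
    th.start()
for _ in threads:
    work.put(None)                        # one poison pill per worker
for th in threads:
    th.join()

done = [results.get() for _ in range(len(tasks))]
per_worker = {w: sum(1 for wid, _ in done if wid == w) for w in range(nworkers)}
print("tasks completed per worker:", per_worker)
```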

  13. Computational fluid dynamics: complex flows requiring supercomputers. January 1975-July 1988 (Citations from the INSPEC: Information Services for the Physics and Engineering Communities data base). Report for January 1975-July 1988

    International Nuclear Information System (INIS)

    1988-08-01

    This bibliography contains citations concerning computational fluid dynamics (CFD), a new method in computational science to perform complex flow simulations in three dimensions. Applications include aerodynamic design and analysis for aircraft, rockets, and missiles, and automobiles; heat-transfer studies; and combustion processes. Included are references to supercomputers, array processors, and parallel processors where needed for complete, integrated design. Also included are software packages and grid-generation techniques required to apply CFD numerical solutions. Numerical methods for fluid dynamics, not requiring supercomputers, are found in a separate published search. (Contains 83 citations fully indexed and including a title list.)

  14. Performance Characteristics of Hybrid MPI/OpenMP Scientific Applications on a Large-Scale Multithreaded BlueGene/Q Supercomputer

    KAUST Repository

    Wu, Xingfu; Taylor, Valerie

    2013-01-01

    In this paper, we investigate the performance characteristics of five hybrid MPI/OpenMP scientific applications (two NAS Parallel Benchmarks Multi-Zone SP-MZ and BT-MZ, an earthquake simulation PEQdyna, an aerospace application PMLB and a 3D particle-in-cell application GTC) on a large-scale multithreaded Blue Gene/Q supercomputer at Argonne National Laboratory, and quantify the performance gap resulting from using different numbers of threads per node. We use performance tools and MPI profile and trace libraries available on the supercomputer to analyze and compare the performance of these hybrid scientific applications with increasing numbers of OpenMP threads per node, and find that increasing the number of threads beyond some point saturates or worsens the performance of these hybrid applications. For the strong-scaling hybrid scientific applications such as SP-MZ, BT-MZ, PEQdyna and PMLB, using 32 threads per node results in much better application efficiency than using 64 threads per node, and as the number of threads per node increases, the FPU (Floating Point Unit) percentage decreases, and the MPI percentage (except for PMLB) and IPC (Instructions Per Cycle) per core (except for BT-MZ) increase. For the weak-scaling hybrid scientific application GTC, the performance trend (relative speedup) is very similar with increasing numbers of threads per node no matter how many nodes (32, 128, 512) are used. © 2013 IEEE.

  15. Performance Characteristics of Hybrid MPI/OpenMP Scientific Applications on a Large-Scale Multithreaded BlueGene/Q Supercomputer

    KAUST Repository

    Wu, Xingfu

    2013-07-01

    In this paper, we investigate the performance characteristics of five hybrid MPI/OpenMP scientific applications (two NAS Parallel Benchmarks Multi-Zone SP-MZ and BT-MZ, an earthquake simulation PEQdyna, an aerospace application PMLB and a 3D particle-in-cell application GTC) on a large-scale multithreaded Blue Gene/Q supercomputer at Argonne National Laboratory, and quantify the performance gap resulting from using different numbers of threads per node. We use performance tools and MPI profile and trace libraries available on the supercomputer to analyze and compare the performance of these hybrid scientific applications with increasing numbers of OpenMP threads per node, and find that increasing the number of threads beyond some point saturates or worsens the performance of these hybrid applications. For the strong-scaling hybrid scientific applications such as SP-MZ, BT-MZ, PEQdyna and PMLB, using 32 threads per node results in much better application efficiency than using 64 threads per node, and as the number of threads per node increases, the FPU (Floating Point Unit) percentage decreases, and the MPI percentage (except for PMLB) and IPC (Instructions Per Cycle) per core (except for BT-MZ) increase. For the weak-scaling hybrid scientific application GTC, the performance trend (relative speedup) is very similar with increasing numbers of threads per node no matter how many nodes (32, 128, 512) are used. © 2013 IEEE.

  16. Research to application: Supercomputing trends for the 90's - Opportunities for interdisciplinary computations

    International Nuclear Information System (INIS)

    Shankar, V.

    1991-01-01

    The progression of supercomputing is reviewed from the point of view of computational fluid dynamics (CFD), and multidisciplinary problems impacting the design of advanced aerospace configurations are addressed. The application of full potential and Euler equations to transonic and supersonic problems in the 70s and early 80s is outlined, along with the Navier-Stokes computations that became widespread during the late 80s and early 90s. Multidisciplinary computations currently in progress are discussed, including CFD and aeroelastic coupling for both static and dynamic flexible computations; CFD, aeroelastic, and controls coupling for flutter suppression and active control; and the development of a computational electromagnetics technology based on CFD methods. Attention is given to the computational challenges standing in the way of establishing a computational environment encompassing many technologies. 40 refs

  17. Performance evaluation of scientific programs on advanced architecture computers

    International Nuclear Information System (INIS)

    Walker, D.W.; Messina, P.; Baille, C.F.

    1988-01-01

    Recently a number of advanced architecture machines have become commercially available. These new machines promise better cost-performance than traditional computers, and some of them have the potential of competing with current supercomputers, such as the Cray X-MP, in terms of maximum performance. This paper describes an on-going project to evaluate a broad range of advanced architecture computers using a number of complete scientific application programs. The computers to be evaluated include distributed-memory machines such as the NCUBE, INTEL and Caltech/JPL hypercubes and the MEIKO computing surface; shared-memory, bus architecture machines such as the Sequent Balance and the Alliant; very long instruction word machines such as the Multiflow Trace 7/200 computer; traditional supercomputers such as the Cray X-MP and Cray-2; and SIMD machines such as the Connection Machine. Currently 11 application codes from a number of scientific disciplines have been selected, although it is not intended to run all codes on all machines. Results are presented for two of the codes (QCD and missile tracking), and future work is proposed

  18. Car2x with software defined networks, network functions virtualization and supercomputers technical and scientific preparations for the Amsterdam Arena telecoms fieldlab

    NARCIS (Netherlands)

    Meijer R.J.; Cushing R.; De Laat C.; Jackson P.; Klous S.; Koning R.; Makkes M.X.; Meerwijk A.

    2015-01-01

    In the invited talk 'Car2x with SDN, NFV and supercomputers' we report on how our past work with SDN [1, 2] allows the design of a smart mobility fieldlab in the huge parking lot of the Amsterdam Arena. We explain how we can engineer and test software that handles the complex conditions of the Car2X

  19. Programs Lucky and LuckyC - 3D parallel transport codes for the multi-group transport equation solution for XYZ geometry by Pm Sn method

    International Nuclear Information System (INIS)

    Moriakov, A.; Vasyukhno, V.; Netecha, M.; Khacheresov, G.

    2003-01-01

    Powerful supercomputers are available today. MBC-1000M is one of the Russian supercomputers that may be used through remote access. The programs LUCKY and LUCKY C were created to work on multi-processor systems. These programs have algorithms created especially for such computers and use the MPI (message passing interface) service for exchanges between processors. LUCKY solves shielding tasks by the multigroup discrete ordinates method. LUCKY C solves criticality tasks by the same method. Only XYZ orthogonal geometry is available. With sufficiently small space steps for approximating the discrete operator, this geometry may be used as a universal one to describe complex geometrical structures. Cross section libraries are used up to the P8 approximation by Legendre polynomials, with nuclear data in GIT format. The programming language is Fortran-90. 'Vector' processors may be used, giving a time gain of up to 30 times, but unfortunately the MBC-1000M does not have these processors. Nevertheless, good efficiency of parallel calculations was obtained with 'space' (LUCKY) and 'space and energy' (LUCKY C) parallelization. The AUTOCAD program is used to check the geometry after processing of the input data. The programs have a powerful geometry module, a convenient tool for describing any geometry. Output results may be processed by graphic programs on a personal computer. (authors)

  20. A user-friendly web portal for T-Coffee on supercomputers

    Directory of Open Access Journals (Sweden)

    Koetsier Jos

    2011-05-01

    Full Text Available Abstract Background Parallel T-Coffee (PTC) was the first parallel implementation of the T-Coffee multiple sequence alignment tool. It is based on MPI and RMA mechanisms. Its purpose is to reduce the execution time of large-scale sequence alignments. It can be run on distributed memory clusters, allowing users to align data sets consisting of hundreds of proteins within a reasonable time. However, most of the potential users of this tool are not familiar with the use of grids or supercomputers. Results In this paper we show how PTC can be easily deployed and controlled on a supercomputer architecture using a web portal developed using Rapid. Rapid is a tool for efficiently generating standardized portlets for a wide range of applications, and the approach described here is generic enough to be applied to other applications or to deploy PTC on different HPC environments. Conclusions The PTC portal allows users to upload a large number of sequences to be aligned by the parallel version of T-Coffee that cannot be aligned by a single machine due to memory and execution time constraints. The web portal provides a user-friendly solution.

  1. Portable implementation model for CFD simulations. Application to hybrid CPU/GPU supercomputers

    Science.gov (United States)

    Oyarzun, Guillermo; Borrell, Ricard; Gorobets, Andrey; Oliva, Assensi

    2017-10-01

    Nowadays, high performance computing (HPC) systems experience a disruptive moment with a variety of novel architectures and frameworks, without any clarity of which one is going to prevail. In this context, the portability of codes across different architectures is of major importance. This paper presents a portable implementation model based on an algebraic operational approach for direct numerical simulation (DNS) and large eddy simulation (LES) of incompressible turbulent flows using unstructured hybrid meshes. The strategy proposed consists in representing the whole time-integration algorithm using only three basic algebraic operations: sparse matrix-vector product, a linear combination of vectors and dot product. The main idea is based on decomposing the nonlinear operators into a concatenation of two SpMV operations. This provides high modularity and portability. An exhaustive analysis of the proposed implementation for hybrid CPU/GPU supercomputers has been conducted with tests using up to 128 GPUs. The main objective consists in understanding the challenges of implementing CFD codes on new architectures.
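
    The algebraic formulation described above reduces one explicit time step to three kernels: a sparse matrix-vector product, vector linear combinations (axpy) and a dot product. The sketch below spells this out with a hand-written CSR SpMV; a 1-D diffusion operator stands in for the paper's CFD operators.

```python
# Minimal sketch of the "three kernels" formulation: with the discrete operator
# stored in CSR form, an explicit time step reduces to a sparse matrix-vector
# product (SpMV), vector linear combinations (axpy) and a dot product.
# A 1-D diffusion operator stands in for the paper's CFD operators.
import numpy as np

n, dx, nu, dt = 200, 1.0 / 199, 0.001, 1e-4

# Build the 1-D Laplacian in CSR form (data, column indices, row pointers).
data, cols, rowptr = [], [], [0]
for i in range(n):
    for j, v in ((i - 1, 1.0), (i, -2.0), (i + 1, 1.0)):
        if 0 <= j < n:
            data.append(v / dx**2)
            cols.append(j)
    rowptr.append(len(data))
data, cols, rowptr = map(np.array, (data, cols, rowptr))

def spmv(x):
    """Kernel 1: y = A x for the CSR matrix above."""
    y = np.zeros_like(x)
    for i in range(n):
        lo, hi = rowptr[i], rowptr[i + 1]
        y[i] = np.dot(data[lo:hi], x[cols[lo:hi]])
    return y

u = np.exp(-((np.linspace(0, 1, n) - 0.5) ** 2) / 0.01)   # initial condition
for _ in range(100):
    rhs = spmv(u)                  # kernel 1: SpMV
    u = u + (nu * dt) * rhs        # kernel 2: axpy (linear combination of vectors)
print("energy =", float(np.dot(u, u)))                     # kernel 3: dot product
```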

  2. EDF's experience with supercomputing and challenges ahead - towards multi-physics and multi-scale approaches

    International Nuclear Information System (INIS)

    Delbecq, J.M.; Banner, D.

    2003-01-01

    Nuclear power plants are a major asset of the EDF company. For them to remain so, particularly in a context of deregulation, three conditions must be met: competitiveness, safety and public acceptance. These stakes apply both to existing plants and to future reactors. The purpose of the presentation is to explain how supercomputing can help EDF satisfy these requirements. Three examples are described in detail: ensuring optimal use of nuclear fuel under wholly safe conditions, understanding and simulating the material deterioration mechanisms, and moving forward with numerical simulation for the performance of EDF's activities. In conclusion, a broader vision of EDF's long-term R and D in the field of numerical simulation is given, and especially of five challenges taken up by EDF together with its industrial and scientific partners. (author)

  3. SOFTWARE FOR SUPERCOMPUTER SKIF “ProLit-lC” and “ProNRS-lC” FOR FOUNDRY AND METALLURGICAL PRODUCTIONS

    Directory of Open Access Journals (Sweden)

    A. N. Chichko

    2008-01-01

    Full Text Available The data of modeling on the supercomputer system SKIF of the technological process of mold filling by means of the computer system 'ProLIT-lc', as well as data of modeling of the steel pouring process by means of 'ProNRS-lc', are presented. The influence of the number of processors of the multi-core computer system SKIF on the acceleration and time of modeling of the technological processes connected with the production of castings and slugs is shown.

  4. Palacios and Kitten : high performance operating systems for scalable virtualized and native supercomputing.

    Energy Technology Data Exchange (ETDEWEB)

    Widener, Patrick (University of New Mexico); Jaconette, Steven (Northwestern University); Bridges, Patrick G. (University of New Mexico); Xia, Lei (Northwestern University); Dinda, Peter (Northwestern University); Cui, Zheng.; Lange, John (Northwestern University); Hudson, Trammell B.; Levenhagen, Michael J.; Pedretti, Kevin Thomas Tauke; Brightwell, Ronald Brian

    2009-09-01

    Palacios and Kitten are new open source tools that enable applications, whether ported or not, to achieve scalable high performance on large machines. They provide a thin layer over the hardware to support both full-featured virtualized environments and native code bases. Kitten is an OS under development at Sandia that implements a lightweight kernel architecture to provide predictable behavior and increased flexibility on large machines, while also providing Linux binary compatibility. Palacios is a VMM that is under development at Northwestern University and the University of New Mexico. Palacios, which can be embedded into Kitten and other OSes, supports existing, unmodified applications and operating systems by using virtualization that leverages hardware technologies. We describe the design and implementation of both Kitten and Palacios. Our benchmarks show that they provide near native, scalable performance. Palacios and Kitten provide an incremental path to using supercomputer resources that is not performance-compromised.

  5. ANSTO - program of research 1991-1992

    International Nuclear Information System (INIS)

    1991-01-01

    The direction and priorities of the Australian Nuclear Science and Technology Organisation (ANSTO) research program are outlined. During the period under review, many of the initiatives of previous years came to fruition, adding significant strength and dimension to the Organisation's research capabilities. The advent of Australian Supercomputing Technology, a joint venture between Fujitsu Australia and ANSTO, will enable the grand challenges of computational science to underpin ANSTO research generally, and environmental science specifically. The development of the accelerator mass spectrometry facilities on the tandem accelerator supported new initiatives in environmental research and management. The National Medical Cyclotron opens a new era in radiopharmaceutical research and development. Finally, the recently commissioned hot isostatic press provides a unique national resource for the development of new ceramics and their applications. The direction and priorities of ANSTO's research program are determined through a combination of external and internal review. The Program Advisory Committees provide external evaluation against national objectives. New committees have been formed and their membership reflects the national and international nature of the ANSTO research programs.

  6. The BlueGene/L Supercomputer and Quantum ChromoDynamics

    International Nuclear Information System (INIS)

    Vranas, P; Soltz, R

    2006-01-01

    In summary, our update contains: (1) perfect speedup sustaining 19.3% of peak for the Wilson D-slash Dirac operator; (2) measurements of the full Conjugate Gradient (CG) inverter that inverts the Dirac operator. The CG inverter contains two global sums over the entire machine; nevertheless, our measurements retain perfect speedup scaling, demonstrating the robustness of our methods. (3) We ran on the largest BG/L system, the LLNL 64-rack BG/L supercomputer, and obtained a sustained speed of 59.1 TFlops. Furthermore, the speedup scaling of the Dirac operator and of the CG inverter is perfect all the way up to the full size of the machine, 131,072 cores (see Figure II). The local lattice is rather small (4 x 4 x 4 x 16), while the total lattice of 128 x 128 x 256 x 32 sites has long been a goal of lattice QCD thermodynamic studies. This speed is about five times the speed we quoted in our submission. As we pointed out in our paper, QCD is notoriously sensitive to network and memory latencies, has a relatively high communication-to-computation ratio which cannot be overlapped on BG/L in virtual node mode, and as an application is in a class of its own. The above results are thrilling to us and realize a 30-year-long dream for lattice QCD.
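
    The remark about the two global sums inside the Conjugate Gradient inverter is easiest to see in code. The sketch below is a generic CG loop (not the BG/L QCD implementation) with comments marking the two dot products per iteration that become machine-wide reductions when the vectors are distributed over the full machine.

```python
# Minimal conjugate-gradient sketch (generic, not the BG/L QCD code) that marks
# the two dot products which become machine-wide global sums when the vectors
# are distributed across nodes.
import numpy as np

def cg(apply_A, b, tol=1e-10, max_iter=200):
    x = np.zeros_like(b)
    r = b - apply_A(x)
    p = r.copy()
    rr = np.dot(r, r)                    # dot product -> one initial global sum
    for _ in range(max_iter):
        Ap = apply_A(p)
        alpha = rr / np.dot(p, Ap)       # global sum 1 of this iteration
        x += alpha * p
        r -= alpha * Ap
        rr_new = np.dot(r, r)            # global sum 2 of this iteration
        if rr_new < tol**2:
            break
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x

# Small SPD test problem standing in for the (much larger) Dirac normal equations.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(cg(lambda v: A @ v, b))            # approx [0.0909, 0.6364]
```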

  7. Supercomputer debugging workshop `92

    Energy Technology Data Exchange (ETDEWEB)

    Brown, J.S.

    1993-02-01

    This report contains papers or viewgraphs on the following topics: The ABCs of Debugging in the 1990s; Cray Computer Corporation; Thinking Machines Corporation; Cray Research, Incorporated; Sun Microsystems, Inc; Kendall Square Research; The Effects of Register Allocation and Instruction Scheduling on Symbolic Debugging; Debugging Optimized Code: Currency Determination with Data Flow; A Debugging Tool for Parallel and Distributed Programs; Analyzing Traces of Parallel Programs Containing Semaphore Synchronization; Compile-time Support for Efficient Data Race Detection in Shared-Memory Parallel Programs; Direct Manipulation Techniques for Parallel Debuggers; Transparent Observation of XENOOPS Objects; A Parallel Software Monitor for Debugging and Performance Tools on Distributed Memory Multicomputers; Profiling Performance of Inter-Processor Communications in an iWarp Torus; The Application of Code Instrumentation Technology in the Los Alamos Debugger; and CXdb: The Road to Remote Debugging.

  8. Teacher enhancement at Supercomputing `96

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1998-02-13

    The SC`96 Education Program provided a three-day professional development experience for middle and high school science, mathematics, and computer technology teachers. The program theme was Computers at Work in the Classroom, and a majority of the sessions were presented by classroom teachers who have had several years experience in using these technologies with their students. The teachers who attended the program were introduced to classroom applications of computing and networking technologies and were provided to the greatest extent possible with lesson plans, sample problems, and other resources that could immediately be used in their own classrooms. The attached At a Glance Schedule and Session Abstracts describes in detail the three-day SC`96 Education Program. Also included is the SC`96 Education Program evaluation report and the financial report.

  9. Solving sparse linear least squares problems on some supercomputers by using large dense blocks

    DEFF Research Database (Denmark)

    Hansen, Per Christian; Ostromsky, T; Sameh, A

    1997-01-01

    Efficient subroutines for dense matrix computations have recently been developed and are available on many high-speed computers. On some computers the speed of many dense matrix operations is near to the peak performance. For sparse matrices, storage and operations can be saved by operating on and storing only the nonzero elements. However, the price is a great degradation of the speed of computations on supercomputers (due to the use of indirect addresses, to the need to insert new nonzeros in the sparse storage scheme, to the lack of data locality, etc.). On many high-speed computers a dense matrix technique is preferable to a sparse matrix technique when the matrices are not large, because the high computational speed compensates fully for the disadvantages of using more arithmetic operations and more storage. For very large matrices the computations must be organized as a sequence of tasks in each ...

  10. HEP Computing Tools, Grid and Supercomputers for Genome Sequencing Studies

    Science.gov (United States)

    De, K.; Klimentov, A.; Maeno, T.; Mashinistov, R.; Novikov, A.; Poyda, A.; Tertychnyy, I.; Wenaus, T.

    2017-10-01

    PanDA, the Production and Distributed Analysis workload management system, has been developed to address the data processing and analysis challenges of the ATLAS experiment at the LHC. Recently PanDA has been extended to run HEP scientific applications on Leadership Class Facilities and supercomputers. The success of the projects using PanDA beyond HEP and the Grid has drawn attention from other compute-intensive sciences such as bioinformatics. Recent advances in Next Generation Genome Sequencing (NGS) technology have led to increasing streams of sequencing data that need to be processed, analysed and made available for bioinformaticians worldwide. Analysis of genome sequencing data using the popular software pipeline PALEOMIX can take a month even when run on a powerful computing resource. In this paper we describe the adaptation of the PALEOMIX pipeline to a distributed computing environment powered by PanDA. To run the pipeline we split the input files into chunks which are processed separately on different nodes as independent PALEOMIX inputs, and finally merge the output files; this is very similar to the way ATLAS processes and simulates data. We dramatically decreased the total walltime thanks to automated job (re)submission and brokering within PanDA. Using software tools developed initially for HEP and the Grid can reduce the payload execution time for mammoth DNA samples from weeks to days.
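
    The split/process/merge strategy described above can be summarized in a few lines. The sketch below uses placeholder read records and a local process pool in place of PanDA's job brokering; none of the names correspond to the real PALEOMIX or PanDA interfaces.

```python
# Sketch of the split / process-in-parallel / merge pattern described above.
# Record contents and the per-chunk "process" step are placeholders, not the
# real PALEOMIX or PanDA interfaces.
from concurrent.futures import ProcessPoolExecutor

def split_reads(records, n_chunks):
    """Split a list of read records into n_chunks roughly equal pieces."""
    size = (len(records) + n_chunks - 1) // n_chunks
    return [records[i:i + size] for i in range(0, len(records), size)]

def process_chunk(chunk):
    """Placeholder for running the pipeline on one chunk on one node."""
    return [rec.upper() for rec in chunk]      # pretend 'alignment' result

def merge(results):
    """Concatenate per-chunk outputs back into a single result."""
    return [rec for part in results for rec in part]

if __name__ == "__main__":
    reads = [f"read_{i}:acgt" for i in range(10)]
    chunks = split_reads(reads, n_chunks=3)
    with ProcessPoolExecutor() as pool:        # stands in for PanDA job brokering
        merged = merge(pool.map(process_chunk, chunks))
    print(len(merged), "records after merge")
```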

  11. How to program 122,400 heterogeneous cores and retain your sanity

    International Nuclear Information System (INIS)

    Pakin, Scott

    2010-01-01

    Current technology trends favor hybrid architectures, typically with each node in a cluster containing both general-purpose and specialized 'accelerator' processors. The typical model for programming such systems is host-centric: The general-purpose processor orchestrates the computation, offloading performance-critical work to the accelerator, and data is communicated only among general-purpose processors. In this talk we propose a radically different hybrid-programming approach, which we call the 'reverse-acceleration model'. In this model the accelerators orchestrate the computation, offloading unacceleratable work to the general-purpose processors. Data is communicated among accelerators, not among general-purpose processors. We present the Cell Messaging Layer (CML), an implementation of the reverse-acceleration model for Los Alamos National Laboratory's Roadrunner supercomputer, a complex conglomerate of 122,400 processor cores of various types, multiple memory domains, and multiple network types, all with radically different performance characteristics but which together make Roadrunner the world's second-fastest supercomputer. CML demonstrates a new messaging-layer implementation technique called 'receiver-initiated message passing', which reduces communication latency by up to a third. Our thesis is that the reverse-acceleration model simplifies porting codes to heterogeneous systems and facilitates performance optimization. We present a case study of a legacy neutron-transport code that we modified to use reverse acceleration. Performance results from running this code across the full Roadrunner system indicate a substantial performance improvement over the unaccelerated version of the code.

  12. EDF's experience with supercomputing and challenges ahead - towards multi-physics and multi-scale approaches

    Energy Technology Data Exchange (ETDEWEB)

    Delbecq, J.M.; Banner, D. [Electricite de France (EDF)- R and D Division, 92 - Clamart (France)

    2003-07-01

    Nuclear power plants are a major asset of the EDF company. For them to remain so, particularly in a context of deregulation, three conditions must be met: competitiveness, safety and public acceptance. These stakes apply both to existing plants and to future reactors. The purpose of the presentation is to explain how supercomputing can help EDF satisfy these requirements. Three examples are described in detail: ensuring optimal use of nuclear fuel under wholly safe conditions, understanding and simulating the material deterioration mechanisms, and moving forward with numerical simulation for the performance of EDF's activities. In conclusion, a broader vision of EDF's long-term R and D in the field of numerical simulation is given, and especially of five challenges taken up by EDF together with its industrial and scientific partners. (author)

  13. Performance Evaluation of an Intel Haswell- and Ivy Bridge-Based Supercomputer Using Scientific and Engineering Applications

    Science.gov (United States)

    Saini, Subhash; Hood, Robert T.; Chang, Johnny; Baron, John

    2016-01-01

    We present a performance evaluation conducted on a production supercomputer of the Intel Xeon Processor E5-2680v3, a twelve-core implementation of the fourth-generation Haswell architecture, and compare it with the Intel Xeon Processor E5-2680v2, an Ivy Bridge implementation of the third-generation Sandy Bridge architecture. Several new architectural features have been incorporated in Haswell, including improvements in all levels of the memory hierarchy as well as improvements to vector instructions and power management. We critically evaluate these new features of Haswell and compare with Ivy Bridge using several low-level benchmarks, including a subset of HPCC, HPCG, and four full-scale scientific and engineering applications. We also present a model that predicts the performance of HPCG and Cart3D to within 5% accuracy, and of Overflow to within 10%.

  14. 369 TFlop/s molecular dynamics simulations on the Roadrunner general-purpose heterogeneous supercomputer

    Energy Technology Data Exchange (ETDEWEB)

    Swaminarayan, Sriram [Los Alamos National Laboratory; Germann, Timothy C [Los Alamos National Laboratory; Kadau, Kai [Los Alamos National Laboratory; Fossum, Gordon C [IBM CORPORATION

    2008-01-01

    The authors present timing and performance numbers for a short-range parallel molecular dynamics (MD) code, SPaSM, that has been rewritten for the heterogeneous Roadrunner supercomputer. Each Roadrunner compute node consists of two AMD Opteron dual-core microprocessors and four PowerXCell 8i enhanced Cell microprocessors, so that there are four MPI ranks per node, each with one Opteron and one Cell. The interatomic forces are computed on the Cells (each with one PPU and eight SPU cores), while the Opterons are used to direct inter-rank communication and perform I/O-heavy periodic analysis, visualization, and checkpointing tasks. The performance measured for our initial implementation of a standard Lennard-Jones pair potential benchmark reached a peak of 369 Tflop/s double-precision floating-point performance on the full Roadrunner system (27.7% of peak), corresponding to 124 MFlop/Watt/s at a price of approximately 3.69 MFlops/dollar. They demonstrate an initial target application, the jetting and ejection of material from a shocked surface.
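
    For readers unfamiliar with the benchmark, the following is a minimal Lennard-Jones energy and force kernel (all pairs, no cutoff or cell lists). It only illustrates the arithmetic being timed; the actual SPaSM kernels on the Cell processors are far more elaborate.

```python
# Minimal Lennard-Jones energy/force kernel (all pairs, no cell lists) to show
# the arithmetic behind the benchmark; the real SPaSM Cell kernels use cutoffs,
# neighbor lists and SPU-local memory management.
import numpy as np

def lj_forces(pos, epsilon=1.0, sigma=1.0):
    n = len(pos)
    forces = np.zeros_like(pos)
    energy = 0.0
    for i in range(n - 1):
        d = pos[i + 1:] - pos[i]                # vectors to all later particles
        r2 = np.sum(d * d, axis=1)
        inv_r6 = (sigma ** 2 / r2) ** 3
        energy += np.sum(4.0 * epsilon * (inv_r6 ** 2 - inv_r6))
        # force magnitude expressed through r^2 to avoid square roots
        f_scalar = 24.0 * epsilon * (2.0 * inv_r6 ** 2 - inv_r6) / r2
        fij = f_scalar[:, None] * d
        forces[i] -= np.sum(fij, axis=0)        # Newton's third law
        forces[i + 1:] += fij
    return energy, forces

rng = np.random.default_rng(0)
e, f = lj_forces(rng.uniform(0.0, 5.0, size=(32, 3)))
print("potential energy:", e)
```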

  15. Parallelization for X-ray crystal structural analysis program

    Energy Technology Data Exchange (ETDEWEB)

    Watanabe, Hiroshi [Japan Atomic Energy Research Inst., Tokyo (Japan); Minami, Masayuki; Yamamoto, Akiji

    1997-10-01

    In this report we study vectorization and parallelization of an X-ray crystal structural analysis program. The target machine is the NEC SX-4, a distributed/shared memory vector-parallel supercomputer. X-ray crystal structural analysis is surveyed, and a new multi-dimensional discrete Fourier transform method is proposed. The new method is designed to have a very long vector length, which yields 12.0 times higher performance than the original code. In addition to this vectorization, parallelization with micro-task functions on the SX-4 achieves a 13.7-fold acceleration of the multi-dimensional discrete Fourier transform part with 14 CPUs, and a 3.0-fold acceleration of the whole program. In total, a 35.9-fold acceleration over the original single-CPU scalar version is achieved with vectorization and parallelization on the SX-4. (author)
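
    The long vector lengths mentioned above come from the standard decomposition of a multi-dimensional DFT into many independent one-dimensional transforms. The sketch below demonstrates that decomposition with numpy standing in for the hand-vectorized SX-4 routines; it illustrates the principle only and is not the reported code.

```python
# Sketch: a multi-dimensional DFT computed as successive 1D transforms along
# each axis, the standard decomposition that exposes long vector loops.
import numpy as np

def dft_3d(x):
    """3D DFT via three passes of 1D FFTs, one axis at a time."""
    out = x.astype(complex)
    for axis in range(3):
        # Each pass applies many independent 1D transforms, which is what
        # gives the long vector length on a vector machine.
        out = np.fft.fft(out, axis=axis)
    return out

rho = np.random.default_rng(1).random((8, 8, 8))
assert np.allclose(dft_3d(rho), np.fft.fftn(rho))   # matches the library 3D FFT
print("3D DFT by successive 1D passes verified")
```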

  16. Distributed interactive graphics applications in computational fluid dynamics

    International Nuclear Information System (INIS)

    Rogers, S.E.; Buning, P.G.; Merritt, F.J.

    1987-01-01

    Implementation of two distributed graphics programs used in computational fluid dynamics is discussed. Both programs are interactive in nature. They run on a CRAY-2 supercomputer and use a Silicon Graphics Iris workstation as the front-end machine. The hardware and supporting software are from the Numerical Aerodynamic Simulation project. The supercomputer does all numerically intensive work and the workstation, as the front-end machine, allows the user to perform real-time interactive transformations on the displayed data. The first program was written as a distributed program that computes particle traces for fluid flow solutions existing on the supercomputer. The second is an older post-processing and plotting program modified to run in a distributed mode. Both programs have realized a large increase in speed over that obtained using a single machine. By using these programs, one can learn quickly about complex features of a three-dimensional flow field. Some color results are presented

  17. Argonne Leadership Computing Facility 2011 annual report : Shaping future supercomputing.

    Energy Technology Data Exchange (ETDEWEB)

    Papka, M.; Messina, P.; Coffey, R.; Drugan, C. (LCF)

    2012-08-16

    The ALCF's Early Science Program aims to prepare key applications for the architecture and scale of Mira and to solidify libraries and infrastructure that will pave the way for other future production applications. Two billion core-hours have been allocated to 16 Early Science projects on Mira. The projects, in addition to promising delivery of exciting new science, are all based on state-of-the-art, petascale, parallel applications. The project teams, in collaboration with ALCF staff and IBM, have undertaken intensive efforts to adapt their software to take advantage of Mira's Blue Gene/Q architecture, which, in a number of ways, is a precursor to future high-performance-computing architecture. The Argonne Leadership Computing Facility (ALCF) enables transformative science that solves some of the most difficult challenges in biology, chemistry, energy, climate, materials, physics, and other scientific realms. Users partnering with ALCF staff have reached research milestones previously unattainable, due to the ALCF's world-class supercomputing resources and expertise in computation science. In 2011, the ALCF's commitment to providing outstanding science and leadership-class resources was honored with several prestigious awards. Research on multiscale brain blood flow simulations was named a Gordon Bell Prize finalist. Intrepid, the ALCF's BG/P system, ranked No. 1 on the Graph 500 list for the second consecutive year. The next-generation BG/Q prototype again topped the Green500 list. Skilled experts at the ALCF enable researchers to conduct breakthrough science on the Blue Gene system in key ways. The Catalyst Team matches project PIs with experienced computational scientists to maximize and accelerate research in their specific scientific domains. The Performance Engineering Team facilitates the effective use of applications on the Blue Gene system by assessing and improving the algorithms used by applications and the techniques used to

  18. Parallel simulation of tsunami inundation on a large-scale supercomputer

    Science.gov (United States)

    Oishi, Y.; Imamura, F.; Sugawara, D.

    2013-12-01

    An accurate prediction of tsunami inundation is important for disaster mitigation purposes. One approach is to approximate the tsunami wave source through an instant inversion analysis using real-time observation data (e.g., Tsushima et al., 2009) and then use the resulting wave source data in an instant tsunami inundation simulation. However, a bottleneck of this approach is the large computational cost of the non-linear inundation simulation and the computational power of recent massively parallel supercomputers is helpful to enable faster than real-time execution of a tsunami inundation simulation. Parallel computers have become approximately 1000 times faster in 10 years (www.top500.org), and so it is expected that very fast parallel computers will be more and more prevalent in the near future. Therefore, it is important to investigate how to efficiently conduct a tsunami simulation on parallel computers. In this study, we are targeting very fast tsunami inundation simulations on the K computer, currently the fastest Japanese supercomputer, which has a theoretical peak performance of 11.2 PFLOPS. One computing node of the K computer consists of 1 CPU with 8 cores that share memory, and the nodes are connected through a high-performance torus-mesh network. The K computer is designed for distributed-memory parallel computation, so we have developed a parallel tsunami model. Our model is based on TUNAMI-N2 model of Tohoku University, which is based on a leap-frog finite difference method. A grid nesting scheme is employed to apply high-resolution grids only at the coastal regions. To balance the computation load of each CPU in the parallelization, CPUs are first allocated to each nested layer in proportion to the number of grid points of the nested layer. Using CPUs allocated to each layer, 1-D domain decomposition is performed on each layer. In the parallel computation, three types of communication are necessary: (1) communication to adjacent neighbours for the
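
    The load-balancing rule described above (CPUs allocated to each nested layer in proportion to its number of grid points, followed by a 1-D decomposition within each layer) can be sketched as follows. The layer names, resolutions and grid sizes are invented for illustration.

```python
# Sketch of the load-balancing rule described above: distribute CPUs over the
# nested grid layers in proportion to their number of grid points, then apply
# a 1-D domain decomposition inside each layer. Layer sizes are made up.
def allocate_cpus(grid_points, total_cpus):
    """Largest-remainder apportionment of CPUs, at least one per layer."""
    total = sum(grid_points)
    shares = [g * total_cpus / total for g in grid_points]
    cpus = [max(1, int(s)) for s in shares]
    # Hand out any CPUs left over to the layers with the largest remainders.
    while sum(cpus) < total_cpus:
        remainders = [s - c for s, c in zip(shares, cpus)]
        cpus[remainders.index(max(remainders))] += 1
    return cpus

layers = {"offshore layer": 4_000_000, "coastal layer": 9_000_000,
          "inundation layer": 27_000_000}
for (name, n), c in zip(layers.items(), allocate_cpus(list(layers.values()), 512)):
    rows_per_cpu = n // c          # 1-D decomposition: strip of rows per CPU
    print(f"{name}: {c} CPUs, ~{rows_per_cpu} grid points each")
```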

  19. Automatic program generation: future of software engineering

    Energy Technology Data Exchange (ETDEWEB)

    Robinson, J.H.

    1979-01-01

    At this moment software development is still more of an art than an engineering discipline. Each piece of software is lovingly engineered, nurtured, and presented to the world as a tribute to the writer's skill. When will this change? When will the craftsmanship be removed and programs be turned out like so many automobiles from an assembly line? Sooner or later it will happen: economic necessity will demand it. With the advent of cheap microcomputers and ever more powerful supercomputers doubling capacity, much more software must be produced. The choices are to double the number of programmers, double the efficiency of each programmer, or find a way to produce the needed software automatically. Producing software automatically is the only logical choice. How will automatic programming come about? Some of the preliminary steps which need to be taken, and are being taken, are to encourage programmer plagiarism of existing software through public library mechanisms, to produce well-understood packages such as compilers automatically, to develop languages capable of producing software as output, and to learn enough about the whole process of programming to be able to automate it. Clearly, the emphasis must not be on efficiency or size, since ever larger and faster hardware is coming.

  20. Reliability Lessons Learned From GPU Experience With The Titan Supercomputer at Oak Ridge Leadership Computing Facility

    Energy Technology Data Exchange (ETDEWEB)

    Gallarno, George [Christian Brothers University; Rogers, James H [ORNL; Maxwell, Don E [ORNL

    2015-01-01

    The high computational capability of graphics processing units (GPUs) is enabling and driving the scientific discovery process at large-scale. The world's second fastest supercomputer for open science, Titan, has more than 18,000 GPUs that computational scientists use to perform scientific simulations and data analysis. Understanding of GPU reliability characteristics, however, is still in its nascent stage since GPUs have only recently been deployed at large-scale. This paper presents a detailed study of GPU errors and their impact on system operations and applications, describing experiences with the 18,688 GPUs on the Titan supercomputer as well as lessons learned in the process of efficient operation of GPUs at scale. These experiences are helpful to HPC sites which already have large-scale GPU clusters or plan to deploy GPUs in the future.

  1. A criticality safety analysis code using a vectorized Monte Carlo method on the HITAC S-810 supercomputer

    International Nuclear Information System (INIS)

    Morimoto, Y.; Maruyama, H.

    1987-01-01

    A vectorized Monte Carlo criticality safety analysis code has been developed on the vector supercomputer HITAC S-810. In this code, a multi-particle tracking algorithm was adopted for effective utilization of the vector processor. A flight analysis with pseudo-scattering was developed to reduce the computational time needed for flight analysis, which represents the bulk of the computational time. This new algorithm realized a speed-up by a factor of 1.5 over the conventional flight analysis. The code also adopted a multigroup cross section constants library of the Bondarenko type with 190 groups, 132 of them for the fast and epithermal regions and 58 for the thermal region. Evaluation work showed that this code reproduces the experimental results to an accuracy of about 1% for the effective neutron multiplication factor. (author)
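
    Pseudo-scattering is commonly understood as delta (Woodcock) tracking: all particles sample flight lengths against a single majorant cross section, and collisions are accepted or rejected afterwards, which keeps the flight analysis identical across a whole batch of particles and therefore vectorizable. The sketch below illustrates that reading on a toy one-dimensional slab; it is an assumption about the method, not the HITAC S-810 code.

```python
# Sketch of batched flight analysis with pseudo-scattering (Woodcock/delta
# tracking): all particles sample against one majorant cross section, and a
# "null" collision simply re-samples the flight. This is an assumed reading of
# the abstract, not the HITAC S-810 code.
import numpy as np

rng = np.random.default_rng(2)
n = 100_000
sigma_majorant = 2.0                       # majorant total cross section (1/cm)
slab = 3.0                                 # slab thickness (cm)

x = np.zeros(n)                            # positions along a 1D ray
alive = np.ones(n, dtype=bool)
escaped = np.zeros(n, dtype=bool)

def sigma_total(x):
    """Spatially varying true cross section (toy two-region slab)."""
    return np.where(x < 1.5, 1.5, 0.5)

while alive.any():
    idx = np.flatnonzero(alive)
    # Every live particle samples a flight length against the same majorant.
    x[idx] += rng.exponential(1.0 / sigma_majorant, size=idx.size)
    out = x[idx] >= slab
    escaped[idx[out]] = True
    alive[idx[out]] = False
    idx = idx[~out]
    # Accept a real collision with probability sigma_t/sigma_majorant; otherwise
    # it is a pseudo-scattering (null) event and the flight simply continues.
    real = rng.random(idx.size) < sigma_total(x[idx]) / sigma_majorant
    alive[idx[real]] = False               # treat real collisions as absorption (toy)

print("transmission through the slab:", escaped.mean())
```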

  2. Supercomputer debugging workshop '92

    Energy Technology Data Exchange (ETDEWEB)

    Brown, J.S.

    1993-01-01

    This report contains papers or viewgraphs on the following topics: The ABCs of Debugging in the 1990s; Cray Computer Corporation; Thinking Machines Corporation; Cray Research, Incorporated; Sun Microsystems, Inc; Kendall Square Research; The Effects of Register Allocation and Instruction Scheduling on Symbolic Debugging; Debugging Optimized Code: Currency Determination with Data Flow; A Debugging Tool for Parallel and Distributed Programs; Analyzing Traces of Parallel Programs Containing Semaphore Synchronization; Compile-time Support for Efficient Data Race Detection in Shared-Memory Parallel Programs; Direct Manipulation Techniques for Parallel Debuggers; Transparent Observation of XENOOPS Objects; A Parallel Software Monitor for Debugging and Performance Tools on Distributed Memory Multicomputers; Profiling Performance of Inter-Processor Communications in an iWarp Torus; The Application of Code Instrumentation Technology in the Los Alamos Debugger; and CXdb: The Road to Remote Debugging.

  3. Benchmarking Further Single Board Computers for Building a Mini Supercomputer for Simulation of Telecommunication Systems

    Directory of Open Access Journals (Sweden)

    Gábor Lencse

    2016-01-01

    Full Text Available Parallel Discrete Event Simulation (PDES) with the conservative synchronization method can be used efficiently for the performance analysis of telecommunication systems because of their good lookahead properties. For PDES, a cost-effective execution platform may be built by using single board computers (SBCs), which offer relatively high computation capacity compared to their price or power consumption and especially to the space they take up. A benchmarking method is proposed and its operation is demonstrated by benchmarking ten different SBCs, namely Banana Pi, Beaglebone Black, Cubieboard2, Odroid-C1+, Odroid-U3+, Odroid-XU3 Lite, Orange Pi Plus, Radxa Rock Lite, Raspberry Pi Model B+, and Raspberry Pi 2 Model B+. Their benchmarking results are compared to find out which one should be used for building a mini supercomputer for parallel discrete-event simulation of telecommunication systems. The SBCs are also used to build a heterogeneous cluster and the performance of the cluster is tested, too.

  4. Fast and Accurate Simulation of the Cray XMT Multithreaded Supercomputer

    Energy Technology Data Exchange (ETDEWEB)

    Villa, Oreste; Tumeo, Antonino; Secchi, Simone; Manzano Franco, Joseph B.

    2012-12-31

    Irregular applications, such as data mining and analysis or graph-based computations, show unpredictable memory/network access patterns and control structures. Highly multithreaded architectures with large processor counts, like the Cray MTA-1, MTA-2 and XMT, appear to address their requirements better than commodity clusters. However, the research on highly multithreaded systems is currently limited by the lack of adequate architectural simulation infrastructures due to issues such as size of the machines, memory footprint, simulation speed, accuracy and customization. At the same time, Shared-memory MultiProcessors (SMPs) with multi-core processors have become an attractive platform to simulate large scale machines. In this paper, we introduce a cycle-level simulator of the highly multithreaded Cray XMT supercomputer. The simulator runs unmodified XMT applications. We discuss how we tackled the challenges posed by its development, detailing the techniques introduced to make the simulation as fast as possible while maintaining a high accuracy. By mapping XMT processors (ThreadStorm with 128 hardware threads) to host computing cores, the simulation speed remains constant as the number of simulated processors increases, up to the number of available host cores. The simulator supports zero-overhead switching among different accuracy levels at run-time and includes a network model that takes into account contention. On a modern 48-core SMP host, our infrastructure simulates a large set of irregular applications 500 to 2000 times slower than real time when compared to a 128-processor XMT, while remaining within 10% of accuracy. Emulation is only from 25 to 200 times slower than real time.

  5. PFLOTRAN: Reactive Flow & Transport Code for Use on Laptops to Leadership-Class Supercomputers

    Energy Technology Data Exchange (ETDEWEB)

    Hammond, Glenn E.; Lichtner, Peter C.; Lu, Chuan; Mills, Richard T.

    2012-04-18

    PFLOTRAN, a next-generation reactive flow and transport code for modeling subsurface processes, has been designed from the ground up to run efficiently on machines ranging from leadership-class supercomputers to laptops. Based on an object-oriented design, the code is easily extensible to incorporate additional processes. It can interface seamlessly with Fortran 9X, C and C++ codes. Domain decomposition parallelism is employed, with the PETSc parallel framework used to manage parallel solvers, data structures and communication. Features of the code include a modular input file, implementation of high-performance I/O using parallel HDF5, ability to perform multiple realization simulations with multiple processors per realization in a seamless manner, and multiple modes for multiphase flow and multicomponent geochemical transport. Chemical reactions currently implemented in the code include homogeneous aqueous complexing reactions and heterogeneous mineral precipitation/dissolution, ion exchange, surface complexation and a multirate kinetic sorption model. PFLOTRAN has demonstrated petascale performance using 2^17 processor cores with over 2 billion degrees of freedom. Accomplishments achieved to date include applications to the Hanford 300 Area and modeling CO2 sequestration in deep geologic formations.

  6. Environmental Systems Research Candidates Program--FY2000 Annual report

    Energy Technology Data Exchange (ETDEWEB)

    Piet, Steven James

    2001-01-01

    The Environmental Systems Research Candidates (ESRC) Program, which is scheduled to end September 2001, was established in April 2000 as part of the Environmental Systems Research and Analysis Program at the Idaho National Engineering and Environmental Laboratory (INEEL) to provide key science and technology to meet the clean-up mission of the U.S. Department of Energy Office of Environmental Management, and perform research and development that will help solve current legacy problems and enhance the INEEL’s scientific and technical capability for solving longer-term challenges. This report documents the progress and accomplishments of the ESRC Program from April through September 2000. The ESRC Program consists of 24 tasks subdivided within four research areas:
    A. Environmental Characterization Science and Technology. This research explores new data acquisition, processing, and interpretation methods that support cleanup and long-term stewardship decisions.
    B. Subsurface Understanding. This research expands understanding of the biology, chemistry, physics, hydrology, and geology needed to improve models of contamination problems in the earth’s subsurface.
    C. Environmental Computational Modeling. This research develops INEEL computing capability for modeling subsurface contaminants and contaminated facilities.
    D. Environmental Systems Science and Technology. This research explores novel processes to treat waste and decontaminate facilities.
    Our accomplishments during FY 2000 include the following:
    • We determined, through analysis of samples taken in and around the INEEL site, that mercury emissions from the INEEL calciner have not raised regional off-INEEL mercury contamination levels above normal background.
    • We have initially demonstrated the use of x-ray fluorescence to image uranium and heavy metal concentrations in soil samples.
    • We increased our understanding of the subsurface environment; applying mathematical complexity theory to the problem of

  7. ASCI's Vision for supercomputing future

    International Nuclear Information System (INIS)

    Nowak, N.D.

    2003-01-01

    Advanced Simulation and Computing (ASC, formerly Accelerated Strategic Computing Initiative [ASCI]) was established in 1995 to help Defense Programs shift from test-based confidence to simulation-based confidence. Specifically, ASC is a focused and balanced program that is accelerating the development of simulation capabilities needed to analyze and predict the performance, safety, and reliability of nuclear weapons and certify their functionality - far exceeding what might have been achieved in the absence of a focused initiative. To realize its vision, ASC is creating simulation and prototyping capabilities, based on advanced weapon codes and high-performance computing

  8. A uniform approach for programming distributed heterogeneous computing systems.

    Science.gov (United States)

    Grasso, Ivan; Pellegrini, Simone; Cosenza, Biagio; Fahringer, Thomas

    2014-12-01

    Large-scale compute clusters of heterogeneous nodes equipped with multi-core CPUs and GPUs are getting increasingly popular in the scientific community. However, such systems require a combination of different programming paradigms making application development very challenging. In this article we introduce libWater, a library-based extension of the OpenCL programming model that simplifies the development of heterogeneous distributed applications. libWater consists of a simple interface, which is a transparent abstraction of the underlying distributed architecture, offering advanced features such as inter-context and inter-node device synchronization. It provides a runtime system which tracks dependency information enforced by event synchronization to dynamically build a DAG of commands, on which we automatically apply two optimizations: collective communication pattern detection and device-host-device copy removal. We assess libWater's performance in three compute clusters available from the Vienna Scientific Cluster, the Barcelona Supercomputing Center and the University of Innsbruck, demonstrating improved performance and scaling with different test applications and configurations.
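
    The runtime behaviour described above (tracking event dependencies to build a DAG of commands) can be illustrated with a small scheduler. The class and method names below are invented for the sketch and are not the libWater API.

```python
# Generic sketch of tracking event dependencies to build a DAG of commands and
# then scheduling them in dependency order. This is not the libWater API, just
# an illustration of the bookkeeping involved.
from collections import defaultdict, deque

class CommandDAG:
    def __init__(self):
        self.deps = defaultdict(set)       # command -> set of prerequisites
        self.actions = {}

    def enqueue(self, name, action, wait_for=()):
        """Register a command together with the events it must wait for."""
        self.actions[name] = action
        self.deps[name] |= set(wait_for)

    def run(self):
        """Execute commands in a topological order (Kahn's algorithm)."""
        indeg = {c: len(d) for c, d in self.deps.items()}
        users = defaultdict(list)
        for c, d in self.deps.items():
            for p in d:
                users[p].append(c)
        ready = deque(c for c, k in indeg.items() if k == 0)
        while ready:
            c = ready.popleft()
            self.actions[c]()
            for u in users[c]:
                indeg[u] -= 1
                if indeg[u] == 0:
                    ready.append(u)

dag = CommandDAG()
dag.enqueue("write_A", lambda: print("copy input to device 0"))
dag.enqueue("write_B", lambda: print("copy input to device 1"))
dag.enqueue("kernel", lambda: print("launch kernel on both devices"),
            wait_for=("write_A", "write_B"))
dag.enqueue("read", lambda: print("copy result back"), wait_for=("kernel",))
dag.run()
```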

  9. The design and implementation of cost-effective algorithms for direct solution of banded linear systems on the vector processor system 32 supercomputer

    Science.gov (United States)

    Samba, A. S.

    1985-01-01

    The problem of solving banded linear systems by direct (non-iterative) techniques on the Vector Processor System (VPS) 32 supercomputer is considered. Two efficient direct methods for solving banded linear systems on the VPS 32 are described. The vector cyclic reduction (VCR) algorithm is discussed in detail. The performance of the VCR on a three-parameter model problem is also illustrated. The VCR is an adaptation of the conventional point cyclic reduction algorithm. The second direct method is the 'Customized Reduction of Augmented Triangles' (CRAT). CRAT has the dominant characteristics of an efficient VPS 32 algorithm. CRAT is tailored to the pipeline architecture of the VPS 32 and as a consequence the algorithm is implicitly vectorizable.
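
    Since the VCR is described as an adaptation of point cyclic reduction, a one-level sketch of that base algorithm may help: the even-indexed unknowns of a tridiagonal system are eliminated, the half-size system in the odd unknowns is solved, and the even unknowns are recovered by back-substitution. The sketch is generic and not the VPS 32 implementation; a real cyclic reduction applies the reduction recursively instead of calling a dense solver.

```python
import numpy as np

def cyclic_reduction_step(a, b, c, d):
    """
    One level of point cyclic reduction for a tridiagonal system
    a[i] x[i-1] + b[i] x[i] + c[i] x[i+1] = d[i]  (a[0] = c[-1] = 0, n odd).
    Even-indexed unknowns are eliminated, the reduced tridiagonal system in the
    odd unknowns is solved, and the even unknowns come from back-substitution.
    """
    n = len(b)
    odd = np.arange(1, n, 2)
    alpha = -a[odd] / b[odd - 1]
    gamma = -c[odd] / b[odd + 1]
    ra = alpha * a[odd - 1]                 # couples x[i] to x[i-2]
    rb = b[odd] + alpha * c[odd - 1] + gamma * a[odd + 1]
    rc = gamma * c[odd + 1]                 # couples x[i] to x[i+2]
    rd = d[odd] + alpha * d[odd - 1] + gamma * d[odd + 1]

    # Solve the reduced tridiagonal system (a dense solve keeps the sketch short;
    # the real algorithm applies the reduction recursively).
    R = np.diag(rb) + np.diag(ra[1:], -1) + np.diag(rc[:-1], 1)
    x = np.empty(n)
    x[odd] = np.linalg.solve(R, rd)

    # Back-substitute the even unknowns from the original equations.
    even = np.arange(0, n, 2)
    left = np.where(even > 0, x[np.maximum(even - 1, 0)] * a[even], 0.0)
    right = np.where(even < n - 1, x[np.minimum(even + 1, n - 1)] * c[even], 0.0)
    x[even] = (d[even] - left - right) / b[even]
    return x

# Random diagonally dominant test system, checked against a dense solve.
rng = np.random.default_rng(3)
n = 9
a = rng.random(n); a[0] = 0.0
c = rng.random(n); c[-1] = 0.0
b = 4.0 + rng.random(n)
d = rng.random(n)
A = np.diag(b) + np.diag(a[1:], -1) + np.diag(c[:-1], 1)
assert np.allclose(cyclic_reduction_step(a, b, c, d), np.linalg.solve(A, d))
print("one-level cyclic reduction matches the dense solve")
```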

  10. Sandia`s network for Supercomputing `94: Linking the Los Alamos, Lawrence Livermore, and Sandia National Laboratories using switched multimegabit data service

    Energy Technology Data Exchange (ETDEWEB)

    Vahle, M.O.; Gossage, S.A.; Brenkosh, J.P. [Sandia National Labs., Albuquerque, NM (United States). Advanced Networking Integration Dept.

    1995-01-01

    Supercomputing `94, a high-performance computing and communications conference, was held November 14th through 18th, 1994 in Washington DC. For the past four years, Sandia National Laboratories has used this conference to showcase and focus its communications and networking endeavors. At the 1994 conference, Sandia built a Switched Multimegabit Data Service (SMDS) network running at 44.736 megabits per second linking its private SMDS network between its facilities in Albuquerque, New Mexico and Livermore, California to the convention center in Washington, D.C. For the show, the network was also extended from Sandia, New Mexico to Los Alamos National Laboratory and from Sandia, California to Lawrence Livermore National Laboratory. This paper documents and describes this network and how it was used at the conference.

  11. MECCA coordinated research program: analysis of climate models uncertainties used for climatic changes study

    International Nuclear Information System (INIS)

    Caneill, J.Y.; Hakkarinen, C.

    1992-01-01

    An international consortium called MECCA (Model Evaluation Consortium for Climate Assessment) was created in 1991 by different partners, including electric utilities, government and academic groups, to make a supercomputer facility for climate evolution studies available to the international scientific community. The first phase of the program consists of assessing the uncertainties of climate model simulations in the framework of global climate change studies. Fourteen scientific projects have been accepted on an international basis in this first phase. The second phase of the program will consist of the evaluation of a set of long climate simulations realized with coupled ocean/atmosphere models, in order to study the transient aspects of climate changes and the associated uncertainties. Particular attention will be devoted to the consequences of these assessments for climate impact studies, and to the regional aspects of climate changes

  12. MaMiCo: Transient multi-instance molecular-continuum flow simulation on supercomputers

    Science.gov (United States)

    Neumann, Philipp; Bian, Xin

    2017-11-01

    We present extensions of the macro-micro-coupling tool MaMiCo, which was designed to couple continuum fluid dynamics solvers with discrete particle dynamics. To enable local extraction of smooth flow field quantities especially on rather short time scales, sampling over an ensemble of molecular dynamics simulations is introduced. We provide details on these extensions including the transient coupling algorithm, open boundary forcing, and multi-instance sampling. Furthermore, we validate the coupling in Couette flow using different particle simulation software packages and particle models, i.e. molecular dynamics and dissipative particle dynamics. Finally, we demonstrate the parallel scalability of the molecular-continuum simulations by using up to 65 536 compute cores of the supercomputer Shaheen II located at KAUST. Program Files doi:http://dx.doi.org/10.17632/w7rgdrhb85.1 Licensing provisions: BSD 3-clause Programming language: C, C++ External routines/libraries: For compiling: SCons, MPI (optional) Subprograms used: ESPResSo, LAMMPS, ls1 mardyn, waLBerla For installation procedures of the MaMiCo interfaces, see the README files in the respective code directories located in coupling/interface/impl. Journal reference of previous version: P. Neumann, H. Flohr, R. Arora, P. Jarmatz, N. Tchipev, H.-J. Bungartz. MaMiCo: Software design for parallel molecular-continuum flow simulations, Computer Physics Communications 200: 324-335, 2016 Does the new version supersede the previous version?: Yes. The functionality of the previous version is completely retained in the new version. Nature of problem: Coupled molecular-continuum simulation for multi-resolution fluid dynamics: parts of the domain are resolved by molecular dynamics or another particle-based solver whereas large parts are covered by a mesh-based CFD solver, e.g. a lattice Boltzmann automaton. Solution method: We couple existing MD and CFD solvers via MaMiCo (macro-micro coupling tool). Data exchange and
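
    The idea of multi-instance sampling (averaging a flow quantity over an ensemble of independent particle simulations to suppress thermal noise on short time scales) can be illustrated with a toy example; the numbers below are arbitrary and the code does not use MaMiCo's interfaces.

```python
# Toy illustration of multi-instance sampling: the flow quantity extracted from
# one particle simulation is noisy, so an ensemble of independent instances is
# averaged cell by cell. This mimics the idea only; it is not MaMiCo's interface.
import numpy as np

rng = np.random.default_rng(4)
n_cells, n_instances, thermal_noise = 20, 64, 0.5
true_profile = np.linspace(0.0, 1.0, n_cells)          # Couette-like velocity ramp

# Each "instance" returns the true cell velocities plus thermal fluctuations.
instances = true_profile + thermal_noise * rng.standard_normal((n_instances, n_cells))

single = instances[0]                 # what one MD instance would hand to the CFD solver
ensemble = instances.mean(axis=0)     # ensemble-averaged, much smoother

print("RMS error, single instance :", np.sqrt(np.mean((single - true_profile) ** 2)))
print("RMS error, ensemble average:", np.sqrt(np.mean((ensemble - true_profile) ** 2)))
```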

  13. Publisher Correction

    DEFF Research Database (Denmark)

    Bonàs-Guarch, Sílvia; Guindo-Martínez, Marta; Miguel-Escalada, Irene

    2018-01-01

    In the originally published version of this Article, the affiliation details for Santi González, Jian'an Luan and Claudia Langenberg were inadvertently omitted. Santi González should have been affiliated with 'Barcelona Supercomputing Center (BSC), Joint BSC-CRG-IRB Research Program in Computatio...

  14. A portable grid-enabled computing system for a nuclear material study

    International Nuclear Information System (INIS)

    Tsujita, Yuichi; Arima, Tatsumi; Takekawa, Takayuki; Suzuki, Yoshio

    2010-01-01

    We have built a portable grid-enabled computing system specialized for our molecular dynamics (MD) simulation program to study Pu materials easily. The experimental approach to revealing the properties of Pu materials is often accompanied by difficulties such as the radiotoxicity of actinides. Since a computational approach reveals new aspects to researchers without such radioactive facilities, we address an MD computation. In order to obtain more realistic results about, e.g., the melting point or thermal conductivity, we need large-scale parallel computations. Most application users who do not have supercomputers at their institutes must use a remote supercomputer. For such users, we have developed a portable and secured grid-enabled computing system that utilizes the grid computing infrastructure provided by the Information Technology Based Laboratory (ITBL). This system enables us to access remote supercomputers in the ITBL system seamlessly from a client PC through its graphical user interface (GUI). Typically it enables seamless file access from the GUI. Furthermore, monitoring of standard output or standard error is available to follow the progress of an executed program. Since the system provides fruitful functionalities which are useful for parallel computing on a remote supercomputer, application users can concentrate on their research. (author)

  15. Combining density functional theory calculations, supercomputing, and data-driven methods to design new materials (Conference Presentation)

    Science.gov (United States)

    Jain, Anubhav

    2017-04-01

    Density functional theory (DFT) simulations solve for the electronic structure of materials starting from the Schrödinger equation. Many case studies have now demonstrated that researchers can often use DFT to design new compounds in the computer (e.g., for batteries, catalysts, and hydrogen storage) before synthesis and characterization in the lab. In this talk, I will focus on how DFT calculations can be executed on large supercomputing resources in order to generate very large data sets on new materials for functional applications. First, I will briefly describe the Materials Project, an effort at LBNL that has virtually characterized over 60,000 materials using DFT and has shared the results with over 17,000 registered users. Next, I will talk about how such data can help discover new materials, describing how preliminary computational screening led to the identification and confirmation of a new family of bulk AMX2 thermoelectric compounds with measured zT reaching 0.8. I will outline future plans for how such data-driven methods can be used to better understand the factors that control thermoelectric behavior, e.g., for the rational design of electronic band structures, in ways that are different from conventional approaches.

  16. A Parallel Supercomputer Implementation of a Biological Inspired Neural Network and its use for Pattern Recognition

    International Nuclear Information System (INIS)

    De Ladurantaye, Vincent; Lavoie, Jean; Bergeron, Jocelyn; Parenteau, Maxime; Lu Huizhong; Pichevar, Ramin; Rouat, Jean

    2012-01-01

    A parallel implementation of a large spiking neural network is proposed and evaluated. The neural network implements the binding by synchrony process using the Oscillatory Dynamic Link Matcher (ODLM). Scalability, speed and performance are compared for 2 implementations: Message Passing Interface (MPI) and Compute Unified Device Architecture (CUDA) running on clusters of multicore supercomputers and NVIDIA graphical processing units respectively. A global spiking list that represents at each instant the state of the neural network is described. This list indexes each neuron that fires during the current simulation time so that the influence of their spikes is simultaneously processed on all computing units. Our implementation shows good scalability for very large networks. A complex and large spiking neural network has been implemented in parallel with success, thus paving the road towards real-life applications based on networks of spiking neurons. MPI offers a better scalability than CUDA, while the CUDA implementation on a GeForce GTX 285 gives the best cost to performance ratio. When running the neural network on the GTX 285, the processing speed is comparable to the MPI implementation on RQCHP's Mammouth parallel with 64 nodes (128 cores).

  17. High Performance Simulation of Large-Scale Red Sea Ocean Bottom Seismic Data on the Supercomputer Shaheen II

    KAUST Repository

    Tonellot, Thierry

    2017-02-27

    A combination of both shallow and deepwater, plus islands and coral reefs, are some of the main features contributing to the complexity of subsalt seismic exploration in the Red Sea transition zone. These features often result in degrading effects on seismic images. State-of-the-art ocean bottom acquisition technologies are therefore required to record seismic data with optimal fold and offset, as well as advanced processing and imaging techniques. Numerical simulations of such complex seismic data can help improve acquisition design and also help in customizing, validating and benchmarking the processing and imaging workflows that will be applied on the field data. Subsequently, realistic simulation of wave propagation is a computationally intensive process requiring a realistic model and an efficient 3D wave equation solver. Large-scale computing resources are also required to meet turnaround time compatible with a production time frame. In this work, we present the numerical simulation of an ocean bottom seismic survey to be acquired in the Red Sea transition zone starting in summer 2016. The survey's acquisition geometry comprises nearly 300,000 unique shot locations and 21,000 unique receiver locations, covering about 760 km2. Using well log measurements and legacy 2D seismic lines in this area, a 3D P-wave velocity model was built, with a maximum depth of 7 km. The model was sampled at 10 m in each direction, resulting in more than 5 billion cells. Wave propagation in this model was performed using a 3D finite difference solver in the time domain based on a staggered grid velocity-pressure formulation of acoustodynamics. To ensure that the resulting data could be generated sufficiently fast, the King Abdullah University of Science and Technology (KAUST) supercomputer Shaheen II Cray XC40 was used. A total of 21,000 three-component (pressure and vertical and horizontal velocity) common receiver gathers with a 50 Hz maximum frequency were computed in less
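
    The staggered-grid velocity-pressure formulation mentioned above is illustrated below with a one-dimensional sketch (the production solver is three-dimensional, high order and parallel); the material parameters and source wavelet are arbitrary.

```python
# 1D sketch of a staggered-grid velocity-pressure (acoustic) time step, the
# formulation named above; the production solver is 3D, high order and runs in
# parallel on Shaheen II. Material values here are arbitrary.
import numpy as np

nx, dx, dt, nt = 400, 10.0, 1e-3, 800
rho = np.full(nx, 2000.0)              # density (kg/m^3)
vp = np.full(nx, 3000.0)               # P-wave velocity (m/s)
kappa = rho * vp**2                    # bulk modulus

p = np.zeros(nx)                       # pressure at cell centers
v = np.zeros(nx + 1)                   # particle velocity at cell faces

src = 100                              # source cell index
for it in range(nt):
    # velocity update from the pressure gradient (interior faces only)
    v[1:-1] -= dt / (0.5 * (rho[1:] + rho[:-1])) * (p[1:] - p[:-1]) / dx
    # pressure update from the velocity divergence
    p -= dt * kappa * (v[1:] - v[:-1]) / dx
    # Ricker-like source injected into the pressure field
    t = it * dt
    p[src] += (1.0 - 2.0 * (np.pi * 25.0 * (t - 0.04))**2) \
              * np.exp(-(np.pi * 25.0 * (t - 0.04))**2)

print("peak |p| on the grid after", nt, "steps:", np.abs(p).max())
```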

  18. High Performance Simulation of Large-Scale Red Sea Ocean Bottom Seismic Data on the Supercomputer Shaheen II

    KAUST Repository

    Tonellot, Thierry; Etienne, Vincent; Gashawbeza, Ewenet; Curiel, Emesto Sandoval; Khan, Azizur; Feki, Saber; Kortas, Samuel

    2017-01-01

    A combination of both shallow and deepwater, plus islands and coral reefs, are some of the main features contributing to the complexity of subsalt seismic exploration in the Red Sea transition zone. These features often result in degrading effects on seismic images. State-of-the-art ocean bottom acquisition technologies are therefore required to record seismic data with optimal fold and offset, as well as advanced processing and imaging techniques. Numerical simulations of such complex seismic data can help improve acquisition design and also help in customizing, validating and benchmarking the processing and imaging workflows that will be applied on the field data. Subsequently, realistic simulation of wave propagation is a computationally intensive process requiring a realistic model and an efficient 3D wave equation solver. Large-scale computing resources are also required to meet turnaround time compatible with a production time frame. In this work, we present the numerical simulation of an ocean bottom seismic survey to be acquired in the Red Sea transition zone starting in summer 2016. The survey's acquisition geometry comprises nearly 300,000 unique shot locations and 21,000 unique receiver locations, covering about 760 km2. Using well log measurements and legacy 2D seismic lines in this area, a 3D P-wave velocity model was built, with a maximum depth of 7 km. The model was sampled at 10 m in each direction, resulting in more than 5 billion cells. Wave propagation in this model was performed using a 3D finite difference solver in the time domain based on a staggered grid velocity-pressure formulation of acoustodynamics. To ensure that the resulting data could be generated sufficiently fast, the King Abdullah University of Science and Technology (KAUST) supercomputer Shaheen II Cray XC40 was used. A total of 21,000 three-component (pressure and vertical and horizontal velocity) common receiver gathers with a 50 Hz maximum frequency were computed in less than

  19. Lattice gauge theory using parallel processors

    International Nuclear Information System (INIS)

    Lee, T.D.; Chou, K.C.; Zichichi, A.

    1987-01-01

    The book's contents include: Lattice Gauge Theory Lectures: Introduction and Current Fermion Simulations; Monte Carlo Algorithms for Lattice Gauge Theory; Specialized Computers for Lattice Gauge Theory; Lattice Gauge Theory at Finite Temperature: A Monte Carlo Study; Computational Method - An Elementary Introduction to the Langevin Equation, Present Status of Numerical Quantum Chromodynamics; Random Lattice Field Theory; The GF11 Processor and Compiler; and The APE Computer and First Physics Results; Columbia Supercomputer Project: Parallel Supercomputer for Lattice QCD; Statistical and Systematic Errors in Numerical Simulations; Monte Carlo Simulation for LGT and Programming Techniques on the Columbia Supercomputer; Food for Thought: Five Lectures on Lattice Gauge Theory

  20. Collaborating CPU and GPU for large-scale high-order CFD simulations with complex grids on the TianHe-1A supercomputer

    Energy Technology Data Exchange (ETDEWEB)

    Xu, Chuanfu, E-mail: xuchuanfu@nudt.edu.cn [College of Computer Science, National University of Defense Technology, Changsha 410073 (China); Deng, Xiaogang; Zhang, Lilun [College of Computer Science, National University of Defense Technology, Changsha 410073 (China); Fang, Jianbin [Parallel and Distributed Systems Group, Delft University of Technology, Delft 2628CD (Netherlands); Wang, Guangxue; Jiang, Yi [State Key Laboratory of Aerodynamics, P.O. Box 211, Mianyang 621000 (China); Cao, Wei; Che, Yonggang; Wang, Yongxian; Wang, Zhenghua; Liu, Wei; Cheng, Xinghua [College of Computer Science, National University of Defense Technology, Changsha 410073 (China)

    2014-12-01

    Programming and optimizing complex, real-world CFD codes on current many-core accelerated HPC systems is very challenging, especially when collaborating CPUs and accelerators to fully tap the potential of heterogeneous systems. In this paper, with a tri-level hybrid and heterogeneous programming model using MPI + OpenMP + CUDA, we port and optimize our high-order multi-block structured CFD software HOSTA on the GPU-accelerated TianHe-1A supercomputer. HOSTA adopts two self-developed high-order compact finite difference schemes WCNS and HDCS that can simulate flows with complex geometries. We present a dual-level parallelization scheme for efficient multi-block computation on GPUs and perform particular kernel optimizations for high-order CFD schemes. The GPU-only approach achieves a speedup of about 1.3 when comparing one Tesla M2050 GPU with two Xeon X5670 CPUs. To achieve a greater speedup, we collaborate CPU and GPU for HOSTA instead of using a naive GPU-only approach. We present a novel scheme to balance the loads between the store-poor GPU and the store-rich CPU. Taking CPU and GPU load balance into account, we improve the maximum simulation problem size per TianHe-1A node for HOSTA by 2.3×, meanwhile the collaborative approach can improve the performance by around 45% compared to the GPU-only approach. Further, to scale HOSTA on TianHe-1A, we propose a gather/scatter optimization to minimize PCI-e data transfer times for ghost and singularity data of 3D grid blocks, and overlap the collaborative computation and communication as far as possible using some advanced CUDA and MPI features. Scalability tests show that HOSTA can achieve a parallel efficiency of above 60% on 1024 TianHe-1A nodes. With our method, we have successfully simulated an EET high-lift airfoil configuration containing 800M cells and China's large civil airplane configuration containing 150M cells. To our best knowledge, those are the largest-scale CPU–GPU collaborative simulations
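
    The CPU/GPU load-balancing idea (give each device a share of the work proportional to its measured throughput, capped by the 'store-poor' GPU's memory) can be sketched in a few lines; the throughput and capacity figures below are invented, not measurements of TianHe-1A.

```python
# Sketch of the CPU/GPU load-balancing idea: give each device a share of the
# cells proportional to its measured throughput, but never more than its memory
# can hold (the GPU is "store-poor"). All numbers below are made up.
def split_cells(total_cells, gpu_rate, cpu_rate, gpu_capacity):
    """Return (gpu_cells, cpu_cells) for one node."""
    gpu_share = gpu_rate / (gpu_rate + cpu_rate)
    gpu_cells = min(int(total_cells * gpu_share), gpu_capacity)
    return gpu_cells, total_cells - gpu_cells

# Hypothetical per-node numbers: GPU ~1.3x faster than the two CPUs combined,
# but its memory only holds 60M cells.
gpu_cells, cpu_cells = split_cells(total_cells=100_000_000,
                                   gpu_rate=1.3, cpu_rate=1.0,
                                   gpu_capacity=60_000_000)
print(f"GPU gets {gpu_cells:,} cells, CPUs get {cpu_cells:,} cells")
```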

  1. Collaborating CPU and GPU for large-scale high-order CFD simulations with complex grids on the TianHe-1A supercomputer

    International Nuclear Information System (INIS)

    Xu, Chuanfu; Deng, Xiaogang; Zhang, Lilun; Fang, Jianbin; Wang, Guangxue; Jiang, Yi; Cao, Wei; Che, Yonggang; Wang, Yongxian; Wang, Zhenghua; Liu, Wei; Cheng, Xinghua

    2014-01-01

    Programming and optimizing complex, real-world CFD codes on current many-core accelerated HPC systems is very challenging, especially when CPUs and accelerators must collaborate to fully tap the potential of heterogeneous systems. In this paper, with a tri-level hybrid and heterogeneous programming model using MPI + OpenMP + CUDA, we port and optimize our high-order multi-block structured CFD software HOSTA on the GPU-accelerated TianHe-1A supercomputer. HOSTA adopts two self-developed high-order compact finite difference schemes, WCNS and HDCS, that can simulate flows with complex geometries. We present a dual-level parallelization scheme for efficient multi-block computation on GPUs and perform particular kernel optimizations for high-order CFD schemes. The GPU-only approach achieves a speedup of about 1.3 when comparing one Tesla M2050 GPU with two Xeon X5670 CPUs. To achieve a greater speedup, we collaborate CPU and GPU for HOSTA instead of using a naive GPU-only approach. We present a novel scheme to balance the loads between the store-poor GPU and the store-rich CPU. Taking CPU and GPU load balance into account, we improve the maximum simulation problem size per TianHe-1A node for HOSTA by 2.3×; meanwhile, the collaborative approach improves performance by around 45% compared to the GPU-only approach. Further, to scale HOSTA on TianHe-1A, we propose a gather/scatter optimization to minimize PCI-e data transfer times for ghost and singularity data of 3D grid blocks, and overlap the collaborative computation and communication as far as possible using advanced CUDA and MPI features. Scalability tests show that HOSTA can achieve a parallel efficiency of above 60% on 1024 TianHe-1A nodes. With our method, we have successfully simulated an EET high-lift airfoil configuration containing 800M cells and China's large civil airplane configuration containing 150M cells. To the best of our knowledge, these are the largest-scale CPU–GPU collaborative simulations
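
    The collaborative scheme described above hinges on dividing grid blocks between a memory-limited ("store-poor") GPU and a memory-rich CPU. The paper's balancing code is not reproduced here, so the following C++ fragment is only a minimal sketch, under assumed block sizes, memory capacities and speedup, of how such a capacity-aware split could be computed; none of the names come from HOSTA.

      // Hypothetical sketch (not HOSTA code): assign grid blocks to the GPU in
      // proportion to its relative throughput, but never beyond what its device
      // memory can hold; the remainder stays on the host CPU.
      #include <algorithm>
      #include <cstddef>
      #include <iostream>

      struct Split {
          std::size_t gpu_blocks;
          std::size_t cpu_blocks;
      };

      Split balance_blocks(std::size_t total_blocks,
                           std::size_t bytes_per_block,
                           std::size_t gpu_mem_bytes,   // small device memory
                           std::size_t cpu_mem_bytes,   // large host memory
                           double gpu_speedup)          // GPU/CPU throughput ratio
      {
          // Ideal split by relative throughput: GPU share = s / (s + 1).
          std::size_t ideal_gpu = static_cast<std::size_t>(
              total_blocks * gpu_speedup / (gpu_speedup + 1.0));

          // Capacity limits on both sides.
          std::size_t gpu_cap = gpu_mem_bytes / bytes_per_block;
          std::size_t cpu_cap = cpu_mem_bytes / bytes_per_block;

          std::size_t gpu_blocks = std::min(ideal_gpu, gpu_cap);
          std::size_t cpu_blocks = std::min(total_blocks - gpu_blocks, cpu_cap);
          return {gpu_blocks, cpu_blocks};
      }

      int main() {
          // Illustrative numbers: 96 blocks of 64 MiB, 3 GiB of GPU memory,
          // 24 GiB of host memory, and a GPU about 1.3x faster than the CPUs.
          Split s = balance_blocks(96, 64ull << 20, 3ull << 30, 24ull << 30, 1.3);
          std::cout << "GPU blocks: " << s.gpu_blocks
                    << ", CPU blocks: " << s.cpu_blocks << "\n";
      }

    In this toy setting the throughput-ideal share of 54 blocks is clipped to the 48 blocks that fit in device memory, which is exactly the kind of capacity constraint the load-balancing scheme in the abstract has to respect.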

  2. Mantle Convection on Modern Supercomputers

    Science.gov (United States)

    Weismüller, J.; Gmeiner, B.; Huber, M.; John, L.; Mohr, M.; Rüde, U.; Wohlmuth, B.; Bunge, H. P.

    2015-12-01

    Mantle convection is the cause for plate tectonics, the formation of mountains and oceans, and the main driving mechanism behind earthquakes. The convection process is modeled by a system of partial differential equations describing the conservation of mass, momentum and energy. Characteristic to mantle flow is the vast disparity of length scales from global to microscopic, turning mantle convection simulations into a challenging application for high-performance computing. As system size and technical complexity of the simulations continue to increase, design and implementation of simulation models for next generation large-scale architectures is handled successfully only in an interdisciplinary context. A new priority program - named SPPEXA - by the German Research Foundation (DFG) addresses this issue, and brings together computer scientists, mathematicians and application scientists around grand challenges in HPC. Here we report from the TERRA-NEO project, which is part of the high visibility SPPEXA program, and a joint effort of four research groups. TERRA-NEO develops algorithms for future HPC infrastructures, focusing on high computational efficiency and resilience in next generation mantle convection models. We present software that can resolve the Earth's mantle with up to 10¹² grid points and scales efficiently to massively parallel hardware with more than 50,000 processors. We use our simulations to explore the dynamic regime of mantle convection and assess the impact of small scale processes on global mantle flow.
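
    The abstract refers to the conservation of mass, momentum and energy without writing the governing equations. For orientation only, a commonly used non-dimensional Boussinesq form of such a mantle-flow system (an assumption here, not necessarily the exact TERRA-NEO formulation) is

      \begin{aligned}
      \nabla\cdot\mathbf{u} &= 0, \\
      -\nabla\cdot\bigl(2\eta\,\dot{\varepsilon}(\mathbf{u})\bigr) + \nabla p &= \mathrm{Ra}\,T\,\hat{\mathbf{e}}_r, \\
      \partial_t T + \mathbf{u}\cdot\nabla T &= \nabla^{2} T + H,
      \end{aligned}

    where $\mathbf{u}$ is the velocity, $p$ the pressure, $T$ the temperature, $\eta$ the viscosity, $\dot{\varepsilon}$ the strain-rate tensor, $\mathrm{Ra}$ the Rayleigh number and $H$ an internal heating term; inertia is neglected because the mantle creeps at extremely high viscosity.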

  3. RIKEN accelerator progress report, vol. 36. January - December 2002

    International Nuclear Information System (INIS)

    Asahi, K.; Abe, T.; Ichihara, T.

    2003-03-01

    This issue of RIKEN Accelerator Progress Report reports research activities of the RIKEN Accelerator Research Facility (RARF) during the calendar year of 2002. The research programs have been coordinated in the framework of the project entitled Multidisciplinary Researches on Heavy Ion Science. The project involves a variety of fields such as: nuclear physics, nuclear astrophysics, atomic physics, nuclear chemistry, radiation biology, condensed matter physics in terms of accelerator or radiation application, plant mutation, material characterization, application to space science, accelerator physics and engineering, laser technology, and computational technology. These activities involved ten laboratories, five Centers involving seven divisions, the RIKEN-RAL (Rutherford-Appleton Laboratory) Center, and the RBRC (RIKEN-Brookhaven Research Center at Brookhaven National Laboratory), and more than 350 researchers from domestic and foreign institutions. Thirty-six universities and institutes from within Japan and 33 institutes from 10 countries are involved. (J.P.N.)

  4. VR system CompleXcope programming guide

    International Nuclear Information System (INIS)

    Kageyama, Akira; Sato, Tetsuya

    1998-09-01

    A CAVE virtual reality system CompleXcope is installed in Theory and Computer Center, National Institute for Fusion Science, for the purpose of the interactive analysis/visualization of 3-dimensional complex data of supercomputer simulations. This guide explains how to make a CompleXcope application with Open GL and CAVE library. (author)

  5. A highly scalable particle tracking algorithm using partitioned global address space (PGAS) programming for extreme-scale turbulence simulations

    Science.gov (United States)

    Buaria, D.; Yeung, P. K.

    2017-12-01

    A new parallel algorithm utilizing a partitioned global address space (PGAS) programming model to achieve high scalability is reported for particle tracking in direct numerical simulations of turbulent fluid flow. The work is motivated by the desire to obtain Lagrangian information necessary for the study of turbulent dispersion at the largest problem sizes feasible on current and next-generation multi-petaflop supercomputers. A large population of fluid particles is distributed among parallel processes dynamically, based on instantaneous particle positions such that all of the interpolation information needed for each particle is available either locally on its host process or neighboring processes holding adjacent sub-domains of the velocity field. With cubic splines as the preferred interpolation method, the new algorithm is designed to minimize the need for communication, by transferring between adjacent processes only those spline coefficients determined to be necessary for specific particles. This transfer is implemented very efficiently as a one-sided communication, using Co-Array Fortran (CAF) features which facilitate small data movements between different local partitions of a large global array. The cost of monitoring transfer of particle properties between adjacent processes for particles migrating across sub-domain boundaries is found to be small. Detailed benchmarks are obtained on the Cray petascale supercomputer Blue Waters at the University of Illinois, Urbana-Champaign. For operations on the particles in an 8192³ simulation (0.55 trillion grid points) on 262,144 Cray XE6 cores, the new algorithm is found to be orders of magnitude faster relative to a prior algorithm in which each particle is tracked by the same parallel process at all times. This large speedup reduces the additional cost of tracking of order 300 million particles to just over 50% of the cost of computing the Eulerian velocity field at this scale. Improving support of PGAS models on
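
    The algorithm itself relies on Co-Array Fortran one-sided transfers, which are not shown in the abstract. As a rough analogue in a different programming model, the C++/MPI sketch below performs a one-sided fetch (MPI_Get) of a few "spline coefficients" from a neighbouring rank's exposed memory window; the buffer names and sizes are illustrative assumptions only.

      // Illustrative analogue (not the authors' CAF code): one-sided retrieval
      // of a small slice of coefficients from a neighbour's memory via MPI RMA.
      #include <mpi.h>
      #include <cstdio>
      #include <vector>

      int main(int argc, char** argv) {
          MPI_Init(&argc, &argv);
          int rank, size;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          // Each rank exposes its local block of (dummy) spline coefficients.
          const int ncoef = 64;
          std::vector<double> coef(ncoef, static_cast<double>(rank));
          MPI_Win win;
          MPI_Win_create(coef.data(), ncoef * sizeof(double), sizeof(double),
                         MPI_INFO_NULL, MPI_COMM_WORLD, &win);

          // Fetch 4 coefficients (starting at offset 8) from the next rank;
          // the target does not take part in this particular transfer.
          std::vector<double> recv(4, -1.0);
          int neighbour = (rank + 1) % size;
          MPI_Win_fence(0, win);
          MPI_Get(recv.data(), 4, MPI_DOUBLE, neighbour, /*target_disp=*/8,
                  4, MPI_DOUBLE, win);
          MPI_Win_fence(0, win);

          std::printf("rank %d fetched coefficient %g from rank %d\n",
                      rank, recv[0], neighbour);
          MPI_Win_free(&win);
          MPI_Finalize();
      }

    The point of the one-sided style, in CAF as in MPI RMA, is that the rank owning the coefficients does not have to post a matching receive, which keeps the many small, irregular transfers generated by migrating particles cheap to orchestrate.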

  6. Unified, Cross-Platform, Open-Source Library Package for High-Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Kozacik, Stephen [EM Photonics, Inc., Newark, DE (United States)

    2017-05-15

    Compute power is continually increasing, but this increased performance is largely found in sophisticated computing devices and supercomputer resources that are difficult to use, resulting in under-utilization. We developed a unified set of programming tools that will allow users to take full advantage of the new technology by allowing them to work at a level abstracted away from the platform specifics, encouraging the use of modern computing systems, including government-funded supercomputer facilities.

  7. Massively Parallel QCD

    International Nuclear Information System (INIS)

    Soltz, R; Vranas, P; Blumrich, M; Chen, D; Gara, A; Giampap, M; Heidelberger, P; Salapura, V; Sexton, J; Bhanot, G

    2007-01-01

    The theory of the strong nuclear force, Quantum Chromodynamics (QCD), can be numerically simulated from first principles on massively-parallel supercomputers using the method of Lattice Gauge Theory. We describe the special programming requirements of lattice QCD (LQCD) as well as the optimal supercomputer hardware architectures that it suggests. We demonstrate these methods on the BlueGene massively-parallel supercomputer and argue that LQCD and the BlueGene architecture are a natural match. This can be traced to the simple fact that LQCD is a regular lattice discretization of space into lattice sites while the BlueGene supercomputer is a discretization of space into compute nodes, and that both are constrained by requirements of locality. This simple relation is both technologically important and theoretically intriguing. The main result of this paper is the speedup of LQCD using up to 131,072 CPUs on the largest BlueGene/L supercomputer. The speedup is perfect with sustained performance of about 20% of peak. This corresponds to a maximum of 70.5 sustained TFlop/s. At these speeds LQCD and BlueGene are poised to produce the next generation of strong interaction physics theoretical results
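
    The locality argument (a 4D lattice of sites mapped onto a torus of compute nodes) can be made concrete with a small, generic MPI sketch that builds a periodic 4D Cartesian communicator and looks up the nearest-neighbour ranks with which halo faces would be exchanged. This is an illustration of the mapping, not the BlueGene/L production code.

      // Generic illustration (not the production LQCD code): map ranks onto a
      // periodic 4D torus and find the neighbours used for halo exchange.
      #include <mpi.h>
      #include <cstdio>

      int main(int argc, char** argv) {
          MPI_Init(&argc, &argv);
          int nranks;
          MPI_Comm_size(MPI_COMM_WORLD, &nranks);

          // Let MPI factor the machine into a 4D grid (t, x, y, z).
          int dims[4] = {0, 0, 0, 0};
          int periods[4] = {1, 1, 1, 1};   // torus: periodic in every direction
          MPI_Dims_create(nranks, 4, dims);

          MPI_Comm torus;
          MPI_Cart_create(MPI_COMM_WORLD, 4, dims, periods, /*reorder=*/1, &torus);

          int trank;
          MPI_Comm_rank(torus, &trank);
          for (int dir = 0; dir < 4; ++dir) {
              int minus, plus;
              MPI_Cart_shift(torus, dir, 1, &minus, &plus);
              std::printf("rank %d, direction %d: neighbours %d (-) and %d (+)\n",
                          trank, dir, minus, plus);
          }

          MPI_Comm_free(&torus);
          MPI_Finalize();
      }

    Because every lattice site couples only to its nearest neighbours, all off-node traffic in such a decomposition flows between adjacent ranks of the torus, which is what makes a mesh/torus machine like BlueGene such a natural fit.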

  8. PHENIX Spinfest School 2009 at BNL

    Energy Technology Data Exchange (ETDEWEB)

    Foster,S.P.; Foster,S.; Seidl, R.; Goto, Y.; Okada, K.

    2009-08-07

    Since 2005, the PHENIX Spin Physics Working Group has set aside several weeks each summer for the purposes of training and integrating recent members of the working group as well as coordinating and making rapid progress on support tasks and data analysis. One week is dedicated to more formal didactic lectures by outside speakers. The location has so far alternated between BNL and the RIKEN campus in Wako, Japan, with support provided by RBRC and LANL.

  9. Use of QUADRICS supercomputer as embedded simulator in emergency management systems; Utilizzo del calcolatore QUADRICS come simulatore in linea in un sistema di gestione delle emergenze

    Energy Technology Data Exchange (ETDEWEB)

    Bove, R.; Di Costanzo, G.; Ziparo, A. [ENEA, Centro Ricerche Casaccia, Rome (Italy). Dip. Energia

    1996-07-01

    The experience gained in implementing MRBT, an atmospheric dispersion model for short-duration releases, is reported. The model was implemented on a QUADRICS-Q1 supercomputer. A description of the MRBT model is given first: it is an analytical model for studying the spreading of light gases released into the atmosphere by accidental releases. The solution of the diffusion equation is Gaussian-like and yields the concentration of the released pollutant as a function of space and time. The QUADRICS architecture is then introduced and the implementation of the model is described. Finally, the integration of the QUADRICS-based model as an embedded simulator in an emergency management system is considered.
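
    The abstract states only that the diffusion equation has a Gaussian-like solution giving concentration as a function of space and time; it does not give the formula. For orientation, a standard Gaussian plume expression of the kind described (an assumption, not necessarily the exact MRBT form) for a release of strength $Q$ at effective height $H$ in a wind of speed $u$ along $x$ is

      C(x,y,z) \;=\; \frac{Q}{2\pi u\,\sigma_y(x)\,\sigma_z(x)}
      \exp\!\left(-\frac{y^{2}}{2\sigma_y^{2}}\right)
      \left[\exp\!\left(-\frac{(z-H)^{2}}{2\sigma_z^{2}}\right)
      + \exp\!\left(-\frac{(z+H)^{2}}{2\sigma_z^{2}}\right)\right],

    where $\sigma_y$ and $\sigma_z$ are the crosswind and vertical dispersion parameters; a short-duration release is handled with the analogous time-dependent Gaussian puff. Evaluating such closed-form expressions over a dense spatial grid is what makes the model well suited to a massively parallel machine like QUADRICS.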

  10. Explanation of the Rb-Rc discrepancy using new physics

    International Nuclear Information System (INIS)

    Bhattacharyya, G.; Branco, G.C.; Hou, W.

    1996-01-01

    The experimental values of R_b and R_c are the only data which do not seem to agree with standard model predictions. Although it is still premature to draw any definite conclusions, it is timely to look for new physics which could explain the excess in R_b and deficit in R_c. We investigate this problem in a simple extension of the standard model, where a charge +2/3 isosinglet quark is added to the standard spectrum. Upon the further introduction of an extra scalar doublet, one finds a solution with interesting consequences. copyright 1996 The American Physical Society

  11. Summaries of research and development activities by using supercomputer system of JAEA in FY2015. April 1, 2015 - March 31, 2016

    International Nuclear Information System (INIS)

    2017-01-01

    Japan Atomic Energy Agency (JAEA) conducts research and development (R and D) in various fields related to nuclear power as a comprehensive institution of nuclear energy R and Ds, and utilizes computational science and technology in many activities. As shown in the fact that about 20 percent of papers published by JAEA are concerned with R and D using computational science, the supercomputer system of JAEA has become an important infrastructure to support computational science and technology. In FY2015, the system was used for R and D aiming to restore Fukushima (nuclear plant decommissioning and environmental restoration) as a priority issue, as well as for JAEA's major projects such as Fast Reactor Cycle System, Fusion R and D and Quantum Beam Science. This report presents a great number of R and D results accomplished by using the system in FY2015, as well as user support, operational records and overviews of the system, and so on. (author)

  12. Summaries of research and development activities by using supercomputer system of JAEA in FY2014. April 1, 2014 - March 31, 2015

    International Nuclear Information System (INIS)

    2016-02-01

    Japan Atomic Energy Agency (JAEA) conducts research and development (R and D) in various fields related to nuclear power as a comprehensive institution of nuclear energy R and Ds, and utilizes computational science and technology in many activities. As shown in the fact that about 20 percent of papers published by JAEA are concerned with R and D using computational science, the supercomputer system of JAEA has become an important infrastructure to support computational science and technology. In FY2014, the system was used for R and D aiming to restore Fukushima (nuclear plant decommissioning and environmental restoration) as a priority issue, as well as for JAEA's major projects such as Fast Reactor Cycle System, Fusion R and D and Quantum Beam Science. This report presents a great number of R and D results accomplished by using the system in FY2014, as well as user support, operational records and overviews of the system, and so on. (author)

  13. Summaries of research and development activities by using supercomputer system of JAEA in FY2013. April 1, 2013 - March 31, 2014

    International Nuclear Information System (INIS)

    2015-02-01

    Japan Atomic Energy Agency (JAEA) conducts research and development (R and D) in various fields related to nuclear power as a comprehensive institution of nuclear energy R and Ds, and utilizes computational science and technology in many activities. Since about 20 percent of the papers published by JAEA are concerned with R and D using computational science, the supercomputer system of JAEA has become an important infrastructure to support computational science and technology utilization. In FY2013, the system was used not only for JAEA's major projects such as Fast Reactor Cycle System, Fusion R and D and Quantum Beam Science, but also for R and D aiming to restore Fukushima (nuclear plant decommissioning and environmental restoration) as a priority issue. This report presents a great amount of R and D results accomplished by using the system in FY2013, as well as user support, operational records and overviews of the system, and so on. (author)

  14. Summaries of research and development activities by using supercomputer system of JAEA in FY2012. April 1, 2012 - March 31, 2013

    International Nuclear Information System (INIS)

    2014-01-01

    Japan Atomic Energy Agency (JAEA) conducts research and development (R and D) in various fields related to nuclear power as a comprehensive institution of nuclear energy R and Ds, and utilizes computational science and technology in many activities. As more than 20 percent of papers published by JAEA are concerned with R and D using computational science, the supercomputer system of JAEA has become an important infrastructure to support computational science and technology utilization. In FY2012, the system was used not only for JAEA's major projects such as Fast Reactor Cycle System, Fusion R and D and Quantum Beam Science, but also for R and D aiming to restore Fukushima (nuclear plant decommissioning and environmental restoration) as a priority issue. This report presents a great amount of R and D results accomplished by using the system in FY2012, as well as user support, operational records and overviews of the system, and so on. (author)

  15. Summaries of research and development activities by using supercomputer system of JAEA in FY2011. April 1, 2011 - March 31, 2012

    International Nuclear Information System (INIS)

    2013-01-01

    Japan Atomic Energy Agency (JAEA) conducts research and development (R and D) in various fields related to nuclear power as a comprehensive institution of nuclear energy R and Ds, and utilizes computational science and technology in many activities. As more than 20 percent of papers published by JAEA are concerned with R and D using computational science, the supercomputer system of JAEA has become an important infrastructure to support computational science and technology utilization. In FY2011, the system was used for analyses of the accident at the Fukushima Daiichi Nuclear Power Station and establishment of radioactive decontamination plan, as well as the JAEA's major projects such as Fast Reactor Cycle System, Fusion R and D and Quantum Beam Science. This report presents a great amount of R and D results accomplished by using the system in FY2011, as well as user support structure, operational records and overviews of the system, and so on. (author)

  16. Evaluation of existing and proposed computer architectures for future ground-based systems

    Science.gov (United States)

    Schulbach, C.

    1985-01-01

    Parallel processing architectures and techniques used in current supercomputers are described and projections are made of future advances. Presently, the von Neumann sequential processing pattern has been accelerated by having separate I/O processors, interleaved memories, wide memories, independent functional units and pipelining. Recent supercomputers have featured single-instruction, multiple-data stream architectures, which have different processors for performing various operations (vector or pipeline processors). Multiple-instruction, multiple-data stream machines have also been developed. Data flow techniques, wherein program instructions are activated only when data are available, are expected to play a large role in future supercomputers, along with increased parallel processor arrays. The enhanced operational speeds are essential for adequately treating data from future spacecraft remote sensing instruments such as the Thematic Mapper.

  17. Study on the climate system and mass transport by a climate model

    International Nuclear Information System (INIS)

    Numaguti, A.; Sugata, S.; Takahashi, M.; Nakajima, T.; Sumi, A.

    1997-01-01

    The Center for Global Environmental Research (CGER), an organ of the National Institute for Environmental Studies of the Environment Agency of Japan, was established in October 1990 to contribute broadly to the scientific understanding of global change, and to the elucidation of and solution for our pressing environmental problems. CGER conducts environmental research from an interdisciplinary, multiagency, and international perspective, provides research support facilities such as a supercomputer and databases, and offers its own data from long-term monitoring of the global environment. In March 1992, CGER installed a supercomputer system (NEC SX-3, Model 14) to facilitate research on global change. The system is open to environmental researchers worldwide. Proposed research programs are evaluated by the Supercomputer Steering Committee which consists of leading scientists in climate modeling, atmospheric chemistry, oceanic circulation, and computer science. After project approval, authorization for system usage is provided. In 1995 and 1996, several research proposals were designated as priority research and allocated larger shares of computer resources. This CGER Supercomputer Monograph Report (Vol. 3) presents priority research carried out on CGER's supercomputer. The report covers the description of the CCSR-NIES atmospheric general circulation model, Lagrangian general circulation based on the time-scale of particle motion, and the ability of the CCSR-NIES atmospheric general circulation model in the stratosphere. The results obtained from these three studies are described in three chapters. We hope this report provides you with useful information on the global environmental research conducted on our supercomputer

  18. Web interface for plasma analysis codes

    Energy Technology Data Exchange (ETDEWEB)

    Emoto, M. [National Institute for Fusion Science, 322-6 Oroshi, Toki, Gifu 509-5292 (Japan)], E-mail: emo@nifs.ac.jp; Murakami, S. [Kyoto University, Yoshida-Honmachi, Sakyo-ku, Kyoto 606-8501 (Japan); Yoshida, M.; Funaba, H.; Nagayama, Y. [National Institute for Fusion Science, 322-6 Oroshi, Toki, Gifu 509-5292 (Japan)

    2008-04-15

    There are many analysis codes that analyze various aspects of plasma physics. However, most of them are FORTRAN programs written to be run on supercomputers. On the other hand, many scientists use GUI (graphical user interface)-based operating systems. For those who are not familiar with supercomputers, it is a difficult task to run analysis codes on them, and they often hesitate to use these programs to substantiate their ideas. Furthermore, these analysis codes are written for personal use, and the programmers do not expect them to be run by other users. To make these programs widely usable, the authors developed user-friendly Web-based interfaces. Since the Web browser is one of the most common applications, it is convenient for both users and developers. To realize an interactive Web interface, the AJAX technique is widely used, and the authors adopted it as well. In building such an AJAX-based Web system, Ruby on Rails plays an important role. Since this application framework, which is written in Ruby, abstracts the Web interfaces necessary to implement AJAX and database functions, it enables programmers to develop the Web-based application efficiently. In this paper, the authors introduce the system and demonstrate the usefulness of this approach.

  19. Web interface for plasma analysis codes

    International Nuclear Information System (INIS)

    Emoto, M.; Murakami, S.; Yoshida, M.; Funaba, H.; Nagayama, Y.

    2008-01-01

    There are many analysis codes that analyze various aspects of plasma physics. However, most of them are FORTRAN programs written to be run on supercomputers. On the other hand, many scientists use GUI (graphical user interface)-based operating systems. For those who are not familiar with supercomputers, it is a difficult task to run analysis codes on them, and they often hesitate to use these programs to substantiate their ideas. Furthermore, these analysis codes are written for personal use, and the programmers do not expect them to be run by other users. To make these programs widely usable, the authors developed user-friendly Web-based interfaces. Since the Web browser is one of the most common applications, it is convenient for both users and developers. To realize an interactive Web interface, the AJAX technique is widely used, and the authors adopted it as well. In building such an AJAX-based Web system, Ruby on Rails plays an important role. Since this application framework, which is written in Ruby, abstracts the Web interfaces necessary to implement AJAX and database functions, it enables programmers to develop the Web-based application efficiently. In this paper, the authors introduce the system and demonstrate the usefulness of this approach

  20. Experimental HEP supercomputing at F.S.U

    International Nuclear Information System (INIS)

    Levinthal, D.; Goldman, H.; Hodous, M.F.

    1987-01-01

    We have developed a track reconstruction algorithm that will work with any 2-dimensional detector as long as the 2-dimensional projections of that detector are ''true'' projections. The program is implemented on a 2-pipe CDC CYBER-205 with 4 million words of memory. (orig./HSI)

  1. Fault Tolerance Assistant (FTA): An Exception Handling Programming Model for MPI Applications

    Energy Technology Data Exchange (ETDEWEB)

    Fang, Aiman [Univ. of Chicago, IL (United States). Dept. of Computer Science; Laguna, Ignacio [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Sato, Kento [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Islam, Tanzima [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Mohror, Kathryn [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2016-05-23

    Future high-performance computing systems may face frequent failures with their rapid increase in scale and complexity. Resilience to faults has become a major challenge for large-scale applications running on supercomputers, which demands fault tolerance support for prevalent MPI applications. Among failure scenarios, process failures are one of the most severe issues as they usually lead to termination of applications. However, the widely used MPI implementations do not provide mechanisms for fault tolerance. We propose FTA-MPI (Fault Tolerance Assistant MPI), a programming model that provides support for failure detection, failure notification and recovery. Specifically, FTA-MPI exploits a try/catch model that enables failure localization and transparent recovery of process failures in MPI applications. We demonstrate FTA-MPI with synthetic applications and a molecular dynamics code CoMD, and show that FTA-MPI provides high programmability for users and enables convenient and flexible recovery of process failures.
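
    FTA-MPI's actual API is not spelled out in the abstract, so the sketch below only illustrates the try/catch style of failure handling it describes. The names fta::ProcessFailure and fta::recover are hypothetical stand-ins, not the real FTA-MPI interface.

      // Hypothetical sketch of try/catch-style fault handling around MPI calls.
      // fta::ProcessFailure and fta::recover are invented stand-ins that only
      // mimic the programming model described in the abstract.
      #include <mpi.h>
      #include <cstdio>
      #include <stdexcept>

      namespace fta {
          struct ProcessFailure : std::runtime_error {
              using std::runtime_error::runtime_error;
          };
          // Placeholder recovery: a real system would rebuild the communicator,
          // restore checkpointed state, and so on.
          void recover(MPI_Comm& /*comm*/) { std::puts("recovering from failure"); }
      }

      void exchange_step(MPI_Comm comm) {
          int rank;
          MPI_Comm_rank(comm, &rank);
          double local = rank, sum = 0.0;
          // With MPI_ERRORS_RETURN set, a failed collective reports an error
          // code instead of aborting; we surface it as an exception.
          if (MPI_Allreduce(&local, &sum, 1, MPI_DOUBLE, MPI_SUM, comm) != MPI_SUCCESS)
              throw fta::ProcessFailure("peer failure detected during Allreduce");
      }

      int main(int argc, char** argv) {
          MPI_Init(&argc, &argv);
          MPI_Comm comm = MPI_COMM_WORLD;
          MPI_Comm_set_errhandler(comm, MPI_ERRORS_RETURN);
          for (int step = 0; step < 10; ++step) {
              try {
                  exchange_step(comm);   // normal path
              } catch (const fta::ProcessFailure&) {
                  fta::recover(comm);    // failure handling is localized here
              }
          }
          MPI_Finalize();
      }

    The appeal of the model is visible even in this toy: the failure path is syntactically separate from the normal path, so recovery logic does not have to be threaded through every communication call.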

  2. Metabolomics Workbench (MetWB)

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Metabolomics Program's Data Repository and Coordinating Center (DRCC), housed at the San Diego Supercomputer Center (SDSC), University of California, San Diego,...

  3. Methodologies and Tools for Tuning Parallel Programs: 80% Art, 20% Science, and 10% Luck

    Science.gov (United States)

    Yan, Jerry C.; Bailey, David (Technical Monitor)

    1996-01-01

    The need for computing power has forced a migration from serial computation on a single processor to parallel processing on multiprocessors. However, without effective means to monitor (and analyze) program execution, tuning the performance of parallel programs becomes exponentially difficult as program complexity and machine size increase. In the past few years, the ubiquitous introduction of performance tuning tools from various supercomputer vendors (Intel's ParAide, TMC's PRISM, CRI's Apprentice, and Convex's CXtrace) seems to indicate the maturity of performance instrumentation/monitor/tuning technologies and vendors'/customers' recognition of their importance. However, a few important questions remain: What kind of performance bottlenecks can these tools detect (or correct)? How time consuming is the performance tuning process? What are some important technical issues that remain to be tackled in this area? This workshop reviews the fundamental concepts involved in analyzing and improving the performance of parallel and heterogeneous message-passing programs. Several alternative strategies will be contrasted, and for each we will describe how currently available tuning tools (e.g. AIMS, ParAide, PRISM, Apprentice, CXtrace, ATExpert, Pablo, IPS-2) can be used to facilitate the process. We will characterize the effectiveness of the tools and methodologies based on actual user experiences at NASA Ames Research Center. Finally, we will discuss their limitations and outline recent approaches taken by vendors and the research community to address them.

  4. Report on the achievements in fiscal 1999 of research and development of fusion zones. Calculation science; 1999 nendo yugo ryoiki kenkyu kaihatsu seika hokokusho. Keisan kagaku

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2000-03-01

    In order to elucidate the mechanism of specific phenomena from the atomic scale to the meso-scale, developmental research has been performed on a fusion-type calculation method. This paper reports the achievements in fiscal 1999. In planning the policy for structuring the program package TACPAC and its application method, the materials program STATE, which uses the classical molecular dynamics method, and the biological program Peach were selected as the programs to act as the nuclei. While performance improvements continued in each of the computer programs to be fused, and in the software fused with them, the programs were ported to a supercomputer. As a result, it was found that STATE, a classical and first-principles quantum molecular dynamics program, can be executed with high efficiency on the supercomputer. In executing challenging computer simulations and developing a calculation method for large-scale systems, research was performed on the alcohol-synthesizing reaction on a copper surface and on the association of gene DNA molecules with enzymes, and a matrix generation method was developed to calculate non-empirical molecular trajectories of large-scale systems. (NEDO)

  5. Status of the Fermilab lattice supercomputer project

    International Nuclear Information System (INIS)

    Mackenzie, P.; Eichten, E.; Hockney, G.

    1988-10-01

    Fermilab has completed construction of a sixteen node (320 megaflop peak speed) parallel computer for lattice gauge theory calculations. The architecture was designed to provide the highest possible cost effectiveness while maintaining a high level of programmability and constraining as little as possible the types of lattice problems which can be done on it. The machine is programmed in C. It is a prototype for a 256 node (5 gigaflop peak speed) computer which will be assembled this winter. 6 refs

  6. The ASCI Network for SC '99: A Step on the Path to a 100 Gigabit Per Second Supercomputing Network

    Energy Technology Data Exchange (ETDEWEB)

    PRATT,THOMAS J.; TARMAN,THOMAS D.; MARTINEZ,LUIS M.; MILLER,MARC M.; ADAMS,ROGER L.; CHEN,HELEN Y.; BRANDT,JAMES M.; WYCKOFF,PETER S.

    2000-07-24

    This document highlights the DISCOM² Distance Computing and Communication team's activities at the 1999 Supercomputing conference in Portland, Oregon. This conference is sponsored by the IEEE and ACM. Sandia, Lawrence Livermore and Los Alamos National Laboratories have participated in this conference for eleven years. For the last four years the three laboratories have come together at the conference under the DOE's ASCI (Accelerated Strategic Computing Initiative) rubric. Communication support for the ASCI exhibit is provided by the ASCI DISCOM² project. The DISCOM² communication team uses this forum to demonstrate and focus communication and networking developments within the community. At SC 99, DISCOM built a prototype of the next-generation ASCI network, demonstrated remote clustering techniques, demonstrated the capabilities of the emerging terabit router products, demonstrated the latest technologies for delivering visualization data to scientific users, and demonstrated the latest in encryption methods, including IP VPN technologies and ATM encryption research. The authors also coordinated the other production networking activities within the booth and between their demonstration partners on the exhibit floor. This paper documents those accomplishments, discusses the details of their implementation, and describes how these demonstrations support Sandia's overall strategies in ASCI networking.

  7. Accelerating Science with the NERSC Burst Buffer Early User Program

    Energy Technology Data Exchange (ETDEWEB)

    Bhimji, Wahid [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Bard, Debbie [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Romanus, Melissa [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Rutgers Univ., New Brunswick, NJ (United States); Paul, David [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Ovsyannikov, Andrey [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Friesen, Brian [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Bryson, Matt [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Correa, Joaquin [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Lockwood, Glenn K. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Tsulaia, Vakho [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Byna, Suren [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Farrell, Steve [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Gursoy, Doga [Argonne National Lab. (ANL), Argonne, IL (United States). Advanced Photon Source (APS); Daley, Chris [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Beckner, Vince [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Van Straalen, Brian [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Trebotich, David [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Tull, Craig [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Weber, Gunther H. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Wright, Nicholas J. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Antypas, Katie [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Prabhat, none [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2016-01-01

    NVRAM-based Burst Buffers are an important part of the emerging HPC storage landscape. The National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory recently installed one of the first Burst Buffer systems as part of its new Cori supercomputer, collaborating with Cray on the development of the DataWarp software. NERSC has a diverse user base comprised of over 6500 users in 700 different projects spanning a wide variety of scientific computing applications. The use-cases of the Burst Buffer at NERSC are therefore also considerable and diverse. We describe here performance measurements and lessons learned from the Burst Buffer Early User Program at NERSC, which selected a number of research projects to gain early access to the Burst Buffer and exercise its capability to enable new scientific advancements. To the best of our knowledge this is the first time a Burst Buffer has been stressed at scale by diverse, real user workloads and therefore these lessons will be of considerable benefit to shaping the developing use of Burst Buffers at HPC centers.

  8. The computational physics program of the National MFE Computer Center

    International Nuclear Information System (INIS)

    Mirin, A.A.

    1988-01-01

    The principal objective of the Computational Physics Group is to develop advanced numerical models for the investigation of plasma phenomena and the simulation of present and future magnetic confinement devices. Another major objective of the group is to develop efficient algorithms and programming techniques for current and future generations of supercomputers. The computational physics group is involved in several areas of fusion research. One main area is the application of Fokker-Planck/quasilinear codes to tokamaks. Another major area is the investigation of resistive magnetohydrodynamics in three dimensions, with applications to compact toroids. Another major area is the investigation of kinetic instabilities using a 3-D particle code. This work is often coupled with the task of numerically generating equilibria which model experimental devices. Ways to apply statistical closure approximations to study tokamak-edge plasma turbulence are being examined. In addition to these computational physics studies, the group has developed a number of linear systems solvers for general classes of physics problems and has been making a major effort at ascertaining how to efficiently utilize multiprocessor computers

  9. Large scale computing in the Energy Research Programs

    International Nuclear Information System (INIS)

    1991-05-01

    The Energy Research Supercomputer Users Group (ERSUG) comprises all investigators using resources of the Department of Energy Office of Energy Research supercomputers. At the December 1989 meeting held at Florida State University (FSU), the ERSUG executive committee determined that the continuing rapid advances in computational sciences and computer technology demanded a reassessment of the role computational science should play in meeting DOE's commitments. Initial studies were to be performed for four subdivisions: (1) Basic Energy Sciences (BES) and Applied Mathematical Sciences (AMS), (2) Fusion Energy, (3) High Energy and Nuclear Physics, and (4) Health and Environmental Research. The first two subgroups produced formal subreports that provided a basis for several sections of this report. Additional information provided in the AMS/BES is included as Appendix C in an abridged form that eliminates most duplication. Additionally, each member of the executive committee was asked to contribute area-specific assessments; these assessments are included in the next section. In the following sections, brief assessments are given for specific areas, a conceptual model is proposed that the entire computational effort for energy research is best viewed as one giant nation-wide computer, and then specific recommendations are made for the appropriate evolution of the system

  10. The QCDOC Project

    International Nuclear Information System (INIS)

    Boyle, P.; Chen, D.; Christ, N.; Clark, M.; Cohen, S.; Cristian, C.; Dong, Z.; Gara, A.; Joo, B.; Jung, C.; Kim, C.; Levkova, L.; Liao, X.; Liu, G.; Li, S.; Lin, H.; Mawhinney, R.; Ohta, S.; Petrov, K.; Wettig, T.; Yamaguchi, A.

    2005-01-01

    The QCDOC project has developed a supercomputer optimised for the needs of Lattice QCD simulations. It provides a very competitive price-to-sustained-performance ratio of around 1 USD per sustained Megaflop/s in combination with outstanding scalability. Thus very large systems delivering over 5 TFlop/s of performance on the evolution of a single lattice are possible. Large prototypes have been built and are functioning correctly. The software environment raises the state of the art in such custom supercomputers. It is based on a lean custom node operating system that eliminates many unnecessary overheads that plague other systems. Despite the custom nature, the operating system implements a standards-compliant UNIX-like programming environment easing the porting of software from other systems. The SciDAC QMP interface adds internode communication in a fashion that provides a uniform cross-platform programming environment

  11. Scientific visualization and radiology

    International Nuclear Information System (INIS)

    Lawrance, D.P.; Hoyer, C.E.; Wrestler, F.A.; Kuhn, M.J.; Moore, W.D.; Anderson, D.R.

    1989-01-01

    Scientific visualization is the visual presentation of numerical data. The National Center for Supercomputing Applications (NCSA) has developed methods for visualizing computer-based simulations of digital imaging data. The applicability of these various tools for unique and potentially medically beneficial display of MR images is investigated. Raw data are obtained from MR images of the brain, neck, spine, and brachial plexus obtained on a 1.5-T imager with multiple pulse sequences. A supercomputer and other mainframe resources run a variety of graphic and imaging programs using these data. An interdisciplinary team of imaging scientists, computer graphics programmers, and physicians works together to extract useful information

  12. Resonance – Journal of Science Education | News

    Indian Academy of Sciences (India)

    Programming Languages - A Brief Review. V Rajaraman, IBM Professor of Information Technology, Jawaharlal Nehru Centre for Advanced Scientific Research, Bangalore 560012, India; Hon. Professor, Supercomputer Education & Research Centre, Indian Institute of Science, Bangalore 560012, India.

  13. Compiler and Runtime Support for Programming in Adaptive Parallel Environments

    Science.gov (United States)

    1998-10-15

    ... no other job is waiting for resources, and use a smaller number of processors when other jobs need resources. Setia et al. [15, 20] have shown that such ... [15] Vijay K. Naik, Sanjeev Setia, and Mark Squillante. Performance analysis of job scheduling policies in parallel supercomputing environments. In ... on networks of heterogeneous workstations. Technical Report CSE-94-012, Oregon Graduate Institute of Science and Technology, 1994. [20] Sanjeev Setia ...

  14. Know Your Personal Computer

    Indian Academy of Sciences (India)

    Resonance – Journal of Science Education, Volume 1, Issue 11. Know Your Personal Computer 5. The CPU Base Instruction Set and Assembly Language Programming. Siddhartha Kumar Ghoshal, Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore 560 012, India.

  15. Evaluating computer program performance on the CRAY-1

    International Nuclear Information System (INIS)

    Rudsinski, L.; Pieper, G.W.

    1979-01-01

    The Advanced Scientific Computers Project of Argonne's Applied Mathematics Division has two objectives: to evaluate supercomputers and to determine their effect on Argonne's computing workload. Initial efforts have focused on the CRAY-1, which is the only advanced computer currently available. Users from seven Argonne divisions executed test programs on the CRAY and made performance comparisons with the IBM 370/195 at Argonne. This report describes these experiences and discusses various techniques for improving run times on the CRAY. Direct translations of code from scalar to vector processor reduced running times as much as two-fold, and this reduction will become more pronounced as the CRAY compiler is developed. Further improvement (two- to ten-fold) was realized by making minor code changes to facilitate compiler recognition of the parallel and vector structure within the programs. Finally, extensive rewriting of the FORTRAN code structure reduced execution times dramatically, in three cases by a factor of more than 20; and even greater reduction should be possible by changing algorithms within a production code. It is concluded that the CRAY-1 would be of great benefit to Argonne researchers. Existing codes could be modified with relative ease to run significantly faster than on the 370/195. More important, the CRAY would permit scientists to investigate complex problems currently deemed infeasible on traditional scalar machines. Finally, an interface between the CRAY-1 and IBM computers such as the 370/195, scheduled by Cray Research for the first quarter of 1979, would considerably facilitate the task of integrating the CRAY into Argonne's Central Computing Facility. 13 tables
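
    The "minor code changes to facilitate compiler recognition of the parallel and vector structure" can be illustrated with a modern analogue (a hedged sketch, not code from the 1979 report): hoisting a data-dependent branch out of an inner loop and declaring the reduction explicitly so the compiler can vectorize it.

      // Modern analogue (not from the report): restructuring a loop so the
      // compiler can vectorize it.
      #include <cstdio>
      #include <vector>

      // Scalar-style version: the branch in the loop body often blocks
      // vectorization.
      double sum_positive_scalar(const std::vector<double>& x) {
          double s = 0.0;
          for (std::size_t i = 0; i < x.size(); ++i) {
              if (x[i] > 0.0) s += x[i];
          }
          return s;
      }

      // Vector-friendly version: branchless body plus an explicit SIMD
      // reduction hint.
      double sum_positive_vector(const std::vector<double>& x) {
          double s = 0.0;
          #pragma omp simd reduction(+ : s)
          for (std::size_t i = 0; i < x.size(); ++i) {
              s += (x[i] > 0.0) ? x[i] : 0.0;   // predication instead of a branch
          }
          return s;
      }

      int main() {
          std::vector<double> x = {1.0, -2.0, 3.5, -0.5, 4.0};
          std::printf("%g %g\n", sum_positive_scalar(x), sum_positive_vector(x));
      }

    On the CRAY-1 the corresponding changes were made in FORTRAN, but the principle is the same: give the compiler loop bodies whose iterations it can prove independent.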

  16. The Computational Physics Program of the national MFE Computer Center

    International Nuclear Information System (INIS)

    Mirin, A.A.

    1989-01-01

    Since June 1974, the MFE Computer Center has been engaged in a significant computational physics effort. The principal objective of the Computational Physics Group is to develop advanced numerical models for the investigation of plasma phenomena and the simulation of present and future magnetic confinement devices. Another major objective of the group is to develop efficient algorithms and programming techniques for current and future generations of supercomputers. The Computational Physics Group has been involved in several areas of fusion research. One main area is the application of Fokker-Planck/quasilinear codes to tokamaks. Another major area is the investigation of resistive magnetohydrodynamics in three dimensions, with applications to tokamaks and compact toroids. A third area is the investigation of kinetic instabilities using a 3-D particle code; this work is often coupled with the task of numerically generating equilibria which model experimental devices. Ways to apply statistical closure approximations to study tokamak-edge plasma turbulence have been under examination, with the hope of being able to explain anomalous transport. Also, we are collaborating in an international effort to evaluate fully three-dimensional linear stability of toroidal devices. In addition to these computational physics studies, the group has developed a number of linear systems solvers for general classes of physics problems and has been making a major effort at ascertaining how to efficiently utilize multiprocessor computers. A summary of these programs is included in this paper. 6 tabs

  17. Tracking and computing

    International Nuclear Information System (INIS)

    Niederer, J.

    1983-01-01

    This note outlines several ways in which large scale simulation computing and programming support may be provided to the SSC design community. One aspect of the problem is getting supercomputer power without the high cost and long lead times of large scale institutional computing. Another aspect is the blending of modern programming practices with more conventional accelerator design programs in ways that do not also swamp designers with the details of complicated computer technology

  18. Energy and technology review

    Energy Technology Data Exchange (ETDEWEB)

    1984-03-01

    The Lawrence Livermore National Laboratory publishes the Energy and Technology Review monthly. This periodical reviews progress made in selected programs at the laboratory. This issue includes articles on in-situ coal gasification, on chromosomal aberrations in human sperm, on high-speed cell sorting, and on supercomputers.

  19. Energy and technology review

    International Nuclear Information System (INIS)

    1984-03-01

    The Lawrence Livermore National Laboratory publishes the Energy and Technology Review monthly. This periodical reviews progress made in selected programs at the laboratory. This issue includes articles on in-situ coal gasification, on chromosomal aberrations in human sperm, on high-speed cell sorting, and on supercomputers

  20. High-performance computing — an overview

    Science.gov (United States)

    Marksteiner, Peter

    1996-08-01

    An overview of high-performance computing (HPC) is given. Different types of computer architectures used in HPC are discussed: vector supercomputers, high-performance RISC processors, various parallel computers like symmetric multiprocessors, workstation clusters, massively parallel processors. Software tools and programming techniques used in HPC are reviewed: vectorizing compilers, optimization and vector tuning, optimization for RISC processors; parallel programming techniques like shared-memory parallelism, message passing and data parallelism; and numerical libraries.

  1. DCA++: A case for science driven application development for leadership computing platforms

    International Nuclear Information System (INIS)

    Summers, Michael S; Alvarez, Gonzalo; Meredith, Jeremy; Maier, Thomas A; Schulthess, Thomas C

    2009-01-01

    The DCA++ code was one of the early science applications that ran on jaguar at the National Center for Computational Sciences, and the first application code to sustain a petaflop/s under production conditions on a general-purpose supercomputer. The code implements a quantum cluster method with a Quantum Monte Carlo kernel to solve the 2D Hubbard model for high-temperature superconductivity. It is implemented in C++, making heavy use of the generic programming model. In this paper, we discuss how this code was developed, reaching scalability and high efficiency on the world's fastest supercomputer in only a few years. We show how the use of generic concepts combined with systematic refactoring of codes is a better strategy for computational sciences than a comprehensive upfront design.

  2. AUTODYN - an interactive non-linear dynamic analysis program for microcomputers through supercomputers

    International Nuclear Information System (INIS)

    Birnbaum, N.K.; Cowler, M.S.; Itoh, M.; Katayama, M.; Obata, H.

    1987-01-01

    AUTODYN uses a two dimensional coupled finite difference approach similar to the one described by Cowler and Hancock (1979). Both translational and axial symmetry are treated. The scheme allows alternative numerical processors to be selectively used to model different components/regions of a problem. Finite difference grids operated on by these processors can be coupled together in space and time to efficiently compute structural (or fluid-structure) interactions. AUTODYN currently includes a Lagrange processor for modeling solid continua and structures, an Euler processor for modeling fluids and the large distortion of solids, an ALE (Arbitrary Lagrange Euler) processor for specialized flow models and a shell processor for modeling thin structures. At present, all four processors use explicit time integration but implicit options will be added to the Lagrange and ALE processors in the near future. Material models are included for solids, liquids and gases (including HE detonation products). (orig.)

  3. Grassroots Supercomputing

    CERN Multimedia

    Buchanan, Mark

    2005-01-01

    What started out as a way for SETI to plow through its piles of radio-signal data from deep space has turned into a powerful research tool as computer users across the globe donate their screen-saver time to projects as diverse as climate-change prediction, gravitational-wave searches, and protein folding (4 pages)

  4. Proceedings of RIKEN BNL Research Center Workshop, RHIC Spin Physics V, Volume 32, February 21, 2001

    International Nuclear Information System (INIS)

    BUNCE, G.; SAITO, N.; VIGDOR, S.; ROSER, T.; SPINKA, H.; ENYO, H.; BLAND, L.C.; GURYN, W.

    2001-01-01

    The RIKEN BNL Research Center (RBRC) was established in April 1997 at Brookhaven National Laboratory. It is funded by the ''Rikagaku Kenkysho'' (RIKEN, The Institute of Physical and Chemical Research) of Japan. The Center is dedicated to the study of strong interactions, including spin physics, lattice QCD and RHIC physics through the nurturing of a new generation of young physicists. During the first year, the Center had only a Theory Group. In the second year, an Experimental Group was also established at the Center. At present, there are seven Fellows and nine postdocs in these two groups. During the third year, we started a new Tenure Track Strong Interaction Theory RHIC Physics Fellow Program, with six positions in the academic year 1999-2000; this program will increase to include eleven theorists in the next academic year, and, in the year after, also be extended to experimental physics. In addition, the Center has an active workshop program on strong interaction physics, about ten workshops a year, with each workshop focused on a specific physics problem. Each workshop speaker is encouraged to select a few of the most important transparencies from his or her presentation, accompanied by a page of explanation. This material is collected at the end of the workshop by the organizer to form proceedings, which can therefore be available within a short time. The construction of a 0.6 teraflop parallel processor, which was begun at the Center on February 19, 1998, was completed on August 28, 1998

  5. EX6AFS: A data acquisition system for high-speed dispersive EXAFS measurements implemented using object-oriented programming techniques

    International Nuclear Information System (INIS)

    Jennings, G.; Lee, P.L.

    1995-01-01

    In this paper we describe the design and implementation of a computerized data-acquisition system for high-speed energy-dispersive EXAFS experiments on the X6A beamline at the National Synchrotron Light Source. The acquisition system drives the stepper motors used to move the components of the experimental setup and controls the readout of the EXAFS spectra. The system runs on a Macintosh IIfx computer and is written entirely in the object-oriented language C++. Large segments of the system are implemented by means of commercial class libraries, specifically the MacApp application framework from Apple, the Rogue Wave class library, and the Hierarchical Data Format datafile format library from the National Center for Supercomputing Applications. This reduces the amount of code that must be written and enhances reliability. The system makes use of several advanced features of C++: Multiple inheritance allows the code to be decomposed into independent software components and the use of exception handling allows the system to be much more reliable in the event of unexpected errors. Object-oriented techniques allow the program to be extended easily as new requirements develop. All sections of the program related to a particular concept are located in a small set of source files. The program will also be used as a prototype for future software development plans for the Basic Energy Science Synchrotron Radiation Center Collaborative Access Team beamlines being designed and built at the Advanced Photon Source
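
    The abstract credits two C++ features in particular: multiple inheritance, used to compose independent software components, and exception handling, used to keep the system reliable after unexpected errors. The toy classes below (hypothetical, not the EX6AFS source) show that pattern in miniature.

      // Toy illustration (not EX6AFS source): multiple inheritance composes two
      // independent capabilities, and exceptions keep the scan loop alive after
      // an unexpected error.
      #include <iostream>
      #include <stdexcept>

      class MotorDriver {                      // independent component: motion
      public:
          void moveTo(double mm) {
              if (mm < 0.0 || mm > 100.0)
                  throw std::out_of_range("motor position out of range");
              std::cout << "moving to " << mm << " mm\n";
          }
      };

      class SpectrumReader {                   // independent component: readout
      public:
          void readSpectrum() { std::cout << "reading dispersive EXAFS spectrum\n"; }
      };

      // The acquisition object inherits both capabilities.
      class Acquisition : public MotorDriver, public SpectrumReader {
      public:
          void scanPoint(double mm) {
              moveTo(mm);
              readSpectrum();
          }
      };

      int main() {
          Acquisition acq;
          for (double pos : {10.0, 250.0, 20.0}) {   // 250.0 triggers the error path
              try {
                  acq.scanPoint(pos);
              } catch (const std::exception& e) {
                  std::cerr << "skipping point (" << e.what() << ")\n";
              }
          }
      }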

  6. Vector and parallel processors in computational science

    International Nuclear Information System (INIS)

    Duff, I.S.; Reid, J.K.

    1985-01-01

    This book presents the papers given at a conference which reviewed the new developments in parallel and vector processing. Topics considered at the conference included hardware (array processors, supercomputers), programming languages, software aids, numerical methods (e.g., Monte Carlo algorithms, iterative methods, finite elements, optimization), and applications (e.g., neutron transport theory, meteorology, image processing)

  7. New frontiers in nuclear structure studies

    International Nuclear Information System (INIS)

    Zwarts, D.; Walet, N.R.; Wolters, A.A.; Glaudemans, P.W.M.; VandeGraff, R.J.

    1985-01-01

    The need to go to larger model spaces for more detailed studies of the atomic nucleus has led to the introduction of the supercomputer to nuclear physics. In this report a brief survey of the nuclear shell model is presented and the performance of some of the relevant programs on different computer systems is compared

  8. DCA++: A case for science driven application development for leadership computing platforms

    Energy Technology Data Exchange (ETDEWEB)

    Summers, Michael S; Alvarez, Gonzalo; Meredith, Jeremy; Maier, Thomas A [Computer Science and Mathematics Division, Oak Ridge National Laboratory, P. O. Box 2008, Mail Stop 6164, Oak Ridge, TN 37831 (United States); Schulthess, Thomas C, E-mail: schulthess@cscs.c [Swiss National Supercomputer Center and Institute for Theoretical Physics, ETH Zurich, CSCS MAN E 133, Galeria 2, CH-9628 Manno (Switzerland)

    2009-07-01

    The DCA++ code was one of the early science applications that ran on jaguar at the National Center for Computational Sciences, and the first application code to sustain a petaflop/s under production conditions on a general-purpose supercomputer. The code implements a quantum cluster method with a Quantum Monte Carlo kernel to solve the 2D Hubbard model for high-temperature superconductivity. It is implemented in C++, making heavy use of the generic programming model. In this paper, we discuss how this code was developed, reaching scalability and high efficiency on the world's fastest supercomputer in only a few years. We show how the use of generic concepts combined with systematic refactoring of codes is a better strategy for computational sciences than a comprehensive upfront design.

  9. Partial Overhaul and Initial Parallel Optimization of KINETICS, a Coupled Dynamics and Chemistry Atmosphere Model

    Science.gov (United States)

    Nguyen, Howard; Willacy, Karen; Allen, Mark

    2012-01-01

    KINETICS is a coupled dynamics and chemistry atmosphere model that is data intensive and computationally demanding. The potential performance gain from using a supercomputer motivates the adaptation from a serial version to a parallelized one. Although the initial parallelization had been done, bottlenecks caused by an abundance of communication calls between processors led to an unfavorable drop in performance. Before starting on the parallel optimization process, a partial overhaul was required because a large emphasis was placed on streamlining the code for user convenience and revising the program to accommodate the new supercomputers at Caltech and JPL. After the first round of optimizations, the partial runtime was reduced by a factor of 23; however, performance gains are dependent on the size of the data, the number of processors requested, and the computer used.
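
    The communication-call bottleneck described above is commonly relieved by aggregating many small messages into fewer large ones; the sketch below contrasts the two patterns in MPI. It is purely illustrative and is not code from KINETICS.

```cpp
#include <mpi.h>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank = 0;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    const int nLevels = 1000;                    // e.g. one value per model level
    std::vector<double> column(nLevels, rank);

    // Slow pattern: one tiny message per level (nLevels latency-bound calls).
    // for (int k = 0; k < nLevels; ++k)
    //     MPI_Bcast(&column[k], 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    // Faster pattern: aggregate the whole column into a single broadcast.
    MPI_Bcast(column.data(), nLevels, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}
```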

  10. Proposals now being accepted for 'INCITE'

    CERN Multimedia

    2003-01-01

    Secretary of Energy, Spencer Abraham, has announced that proposals are now being accepted for a new DOE Office of Science program to support innovative, large-scale computational science projects. The program, entitled "Innovative and Novel Computational Impact on Theory and Experiment" will award a total of 4.5 million supercomputer processor hours and 100 trillion bytes of data storage space at the National Energy Research Scientific Computing Center at DOE's Lawrence Berkeley National Laboratory (1 page).

  11. A Study of Acceleration Methods for Stencil Programs on Low-Memory-Bandwidth CPUs

    OpenAIRE

    高木, 亮治; 杉崎, 由典; 鈴木, 清文; Takaki, Ryoji; Sugisaki, Yoshinori; Suzuki, Kiyofumi

    2016-01-01

    Stencil programs, which are mainly used for numerical simulations of continuum dynamics such as fluid mechanics, require relatively high memory bandwidth from CPUs. Current supercomputers, on the other hand, have relatively low memory bandwidth compared with the high computational performance of their CPUs. This is known as the memory wall problem, i.e., a low B/F ratio (bytes per second per floating-point operation per second). This paper studies how to increase the computational performance of stencil programs on current CPUs...
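
    A standard way to raise the effective B/F ratio of a stencil kernel is loop blocking, which keeps a small tile of the grid in cache so neighbouring rows are reused instead of being re-streamed from memory. The sketch below is a generic illustration of that technique, not code from the paper.

```cpp
#include <algorithm>
#include <vector>

// 5-point Jacobi sweep with spatial loop blocking: each (bi, bj) tile is small
// enough to stay in cache, so the three rows touched by the stencil are reused
// from cache instead of being re-fetched from main memory.
void jacobiBlocked(const std::vector<double>& in, std::vector<double>& out,
                   int n, int blockSize) {
    for (int bi = 1; bi < n - 1; bi += blockSize)
        for (int bj = 1; bj < n - 1; bj += blockSize)
            for (int i = bi; i < std::min(bi + blockSize, n - 1); ++i)
                for (int j = bj; j < std::min(bj + blockSize, n - 1); ++j)
                    out[i * n + j] = 0.25 * (in[(i - 1) * n + j] + in[(i + 1) * n + j] +
                                             in[i * n + j - 1] + in[i * n + j + 1]);
}

int main() {
    const int n = 1024;
    std::vector<double> a(n * n, 1.0), b(n * n, 0.0);
    jacobiBlocked(a, b, n, 64);   // 64x64 tiles of doubles fit comfortably in cache
    return 0;
}
```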

  12. Analysis of Parallel Algorithms on SMP Node and Cluster of Workstations Using Parallel Programming Models with New Tile-based Method for Large Biological Datasets.

    Science.gov (United States)

    Shrimankar, D D; Sathe, S R

    2016-01-01

    Sequence alignment is an important tool for describing the relationships between DNA sequences. Many sequence alignment algorithms exist, differing in efficiency, in their models of the sequences, and in the relationship between sequences. The focus of this study is to obtain an optimal alignment between two sequences of biological data, particularly DNA sequences. The algorithm is discussed with particular emphasis on time, speedup, and efficiency optimizations. Parallel programming presents a number of critical challenges to application developers. Today's supercomputers often consist of clusters of SMP nodes, and programming paradigms such as OpenMP and MPI are used to write parallel codes for such architectures. OpenMP programs, however, cannot scale beyond a single SMP node, whereas MPI programs can span multiple SMP nodes at the cost of internode communication overhead. In this work, we explore the tradeoffs between using OpenMP and MPI. We demonstrate that communication overhead is incurred even in OpenMP loop execution and increases with the number of participating cores. We also present a communication model to approximate the overhead of communication in OpenMP loops. Our results are striking and hold for a large variety of input data files. We have also developed our own load-balancing and cache-optimization techniques for the message-passing model. Our experimental results show that these techniques give optimal performance of our parallel algorithm for various input parameters, such as sequence size and tile size, on a wide variety of multicore architectures.
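
    Pairwise alignment scoring is typically parallelized along anti-diagonals of the dynamic-programming matrix, since cells on the same anti-diagonal are independent; a minimal OpenMP sketch of that wavefront pattern follows. It illustrates the general technique only, not the authors' tiled implementation, and the scoring parameters are arbitrary. The short early diagonals also make the OpenMP fork/join overhead discussed above visible in practice.

```cpp
// Compile with OpenMP support (e.g. -fopenmp); the pragma is ignored otherwise.
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

// Wavefront fill of a Needleman-Wunsch-style scoring matrix: cells on the same
// anti-diagonal (i + j = d) have no mutual dependencies, so each diagonal can be
// computed by a parallel loop.
int alignScore(const std::string& a, const std::string& b,
               int match = 2, int mismatch = -1, int gap = -2) {
    const int m = static_cast<int>(a.size()), n = static_cast<int>(b.size());
    std::vector<int> H((m + 1) * (n + 1), 0);
    for (int i = 0; i <= m; ++i) H[i * (n + 1)] = i * gap;   // first column
    for (int j = 0; j <= n; ++j) H[j] = j * gap;             // first row

    for (int d = 2; d <= m + n; ++d) {
        int iLo = std::max(1, d - n), iHi = std::min(m, d - 1);
        #pragma omp parallel for schedule(static)
        for (int i = iLo; i <= iHi; ++i) {
            int j = d - i;
            int s = (a[i - 1] == b[j - 1]) ? match : mismatch;
            H[i * (n + 1) + j] = std::max({H[(i - 1) * (n + 1) + j - 1] + s,
                                           H[(i - 1) * (n + 1) + j] + gap,
                                           H[i * (n + 1) + j - 1] + gap});
        }
    }
    return H[m * (n + 1) + n];
}

int main() {
    std::cout << "score: " << alignScore("GATTACA", "GCATGCU") << '\n';
    return 0;
}
```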

  13. Analysis of Parallel Algorithms on SMP Node and Cluster of Workstations Using Parallel Programming Models with New Tile-based Method for Large Biological Datasets

    Science.gov (United States)

    Shrimankar, D. D.; Sathe, S. R.

    2016-01-01

    Sequence alignment is an important tool for describing the relationships between DNA sequences. Many sequence alignment algorithms exist, differing in efficiency, in their models of the sequences, and in the relationship between sequences. The focus of this study is to obtain an optimal alignment between two sequences of biological data, particularly DNA sequences. The algorithm is discussed with particular emphasis on time, speedup, and efficiency optimizations. Parallel programming presents a number of critical challenges to application developers. Today’s supercomputer often consists of clusters of SMP nodes. Programming paradigms such as OpenMP and MPI are used to write parallel codes for such architectures. However, the OpenMP programs cannot be scaled for more than a single SMP node. However, programs written in MPI can have more than single SMP nodes. But such a programming paradigm has an overhead of internode communication. In this work, we explore the tradeoffs between using OpenMP and MPI. We demonstrate that the communication overhead incurs significantly even in OpenMP loop execution and increases with the number of cores participating. We also demonstrate a communication model to approximate the overhead from communication in OpenMP loops. Our results are astonishing and interesting to a large variety of input data files. We have developed our own load balancing and cache optimization technique for message passing model. Our experimental results show that our own developed techniques give optimum performance of our parallel algorithm for various sizes of input parameter, such as sequence size and tile size, on a wide variety of multicore architectures. PMID:27932868

  14. SERS internship fall 1995 abstracts and research papers

    Energy Technology Data Exchange (ETDEWEB)

    Davis, Beverly

    1996-05-01

    This report is a compilation of twenty abstracts and their corresponding full papers of research projects done under the US Department of Energy Science and Engineering Research Semester (SERS) program. Papers cover a broad range of topics, for example, environmental transport, supercomputers, databases, and biology. Selected papers were indexed separately for inclusion in the Energy Science and Technology Database.

  15. Proceedings of the ninth annual conference of the IEEE Engineering in Medicine and Biology Society

    International Nuclear Information System (INIS)

    Anon.

    1987-01-01

    This book contains over 100 papers. Some of the titles are: Angular integrations and inter-projections correlation effects in CT reconstruction; Supercomputing environment for biomedical research; Program towards a computational molecular biology; Current problems in molecular biology computing; Signal averaging applied to positron emission tomography; First experimental results from a high spatial resolution PET prototype; and A coherent approach in computer-aided radiotherapy

  16. FooPar

    DEFF Research Database (Denmark)

    Hargreaves, F. P.; Merkle, D.

    2013-01-01

    We present FooPar, an extension for highly efficient parallel computing in the multi-paradigm programming language Scala. Scala offers concise and clean syntax and integrates functional programming features. Our framework FooPar combines these features with parallel computing techniques. ... Results based on an empirical analysis on two supercomputers are given. We achieve close-to-optimal performance with respect to theoretical peak performance. Based on this result we conclude that FooPar allows programmers to fully access Scala's design features without suffering from performance drops when compared...

  17. Contribution to the algorithmic and efficient programming of new parallel architectures including accelerators for neutron physics and shielding computations

    International Nuclear Information System (INIS)

    Dubois, J.

    2011-01-01

    In science, simulation is a key process for research or validation. Modern computer technology allows faster numerical experiments, which are cheaper than real models. In the field of neutron simulation, the calculation of eigenvalues is one of the key challenges. The complexity of these problems is such that a lot of computing power may be necessary. The work of this thesis is first the evaluation of new computing hardware such as graphics cards or massively multi-core chips, and their application to eigenvalue problems for neutron simulation. Then, in order to address the massive parallelism of national supercomputers, we also study the use of asynchronous hybrid methods for solving eigenvalue problems at this very high level of parallelism. We then test this work on several national supercomputers, such as the Titane hybrid machine of the Computing Center for Research and Technology (CCRT), the Curie machine of the Very Large Computing Centre (TGCC), currently being installed, and the Hopper machine at the Lawrence Berkeley National Laboratory (LBNL). We also run our experiments on local workstations to illustrate the value of this research for everyday use with local computing resources. (author) [fr
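
    For context, the simplest eigenvalue iteration used in this class of problems is the power method, which repeatedly applies the operator and renormalizes until the dominant eigenvalue converges. The dense-matrix sketch below is generic and is not taken from the thesis.

```cpp
#include <cmath>
#include <iostream>
#include <vector>

// Power iteration for the dominant eigenvalue of a dense matrix A (row-major).
// Criticality-style neutron solvers apply the same idea with the fission
// operator in place of an explicit matrix.
double powerIteration(const std::vector<double>& A, int n,
                      int maxIters = 1000, double tol = 1e-10) {
    std::vector<double> x(n, 1.0), y(n);
    double lambda = 0.0;
    for (int it = 0; it < maxIters; ++it) {
        for (int i = 0; i < n; ++i) {                   // y = A * x
            y[i] = 0.0;
            for (int j = 0; j < n; ++j) y[i] += A[i * n + j] * x[j];
        }
        double norm = 0.0;
        for (double v : y) norm += v * v;
        norm = std::sqrt(norm);
        for (int i = 0; i < n; ++i) x[i] = y[i] / norm; // normalize the iterate
        if (std::fabs(norm - lambda) < tol) return norm;
        lambda = norm;                                   // current eigenvalue estimate
    }
    return lambda;
}

int main() {
    std::vector<double> A{4, 1, 0,
                          1, 3, 1,
                          0, 1, 2};                      // symmetric; dominant eigenvalue about 4.73
    std::cout << "dominant eigenvalue: " << powerIteration(A, 3) << '\n';
    return 0;
}
```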

  18. Plasma physics modeling and the Cray-2 multiprocessor

    International Nuclear Information System (INIS)

    Killeen, J.

    1985-01-01

    The importance of computer modeling in the magnetic fusion energy research program is discussed. The need for the most advanced supercomputers is described. To meet the demand for more powerful scientific computers to solve larger and more complicated problems, the computer industry is developing multiprocessors. The role of the Cray-2 in plasma physics modeling is discussed with some examples. 28 refs., 2 figs., 1 tab

  19. Searching for ET with Help from Three Million Volunteers: The SETI@home, SERENDIP, SEVENDIP and SPOCK SETI Programs

    Science.gov (United States)

    Werthimer, Dan; Anderson, David; Bowyer, Stuart; Cobb, Jeff; Demorest, Paul

    2002-01-01

    We summarize results from two radio and two optical SETI programs based at the University of California, Berkeley. We discuss the most promising candidate signals from these searches and present plans for future SETI searches, including SERENDIP V and SETI@home II. The ongoing SERENDIP sky survey searches for radio signals at the 300 meter Arecibo Observatory. SERENDIP IV uses a 168 million channel spectrum analyser and a dedicated receiver to take data 24 hours a day, year round. The sky survey covers a 100 MHz band centered at the 21 cm line (1420 MHz) and declinations from -2 to +38 degrees. SETI@home uses the desktop computers of 3.5 million volunteers to analyse 50 Terabytes of data taken at Arecibo. The SETI@home sky survey is 10 times more sensitive and searches a much wider variety of signal types than SERENDIP IV but covers only a 2.5 MHz band. SETI@home is the planet's largest supercomputer, averaging 25 Tflops. SETI@home participants have contributed over a million years of computing time so far. The SEVENDIP optical pulse search looks for ns time scale pulses at optical wavelengths. It utilizes an automated 30 inch telescope, three ultra-fast photomultiplier tubes and a coincidence detector. The target list includes F, G, K and M stars, globular clusters and galaxies. The SPOCK optical SETI program searches for narrow band continuous signals using spectra taken by Marcy and his colleagues in their planet search at Keck observatory.

  20. Three-dimensional kinetic simulations of whistler turbulence in solar wind on parallel supercomputers

    Science.gov (United States)

    Chang, Ouliang

    The objective of this dissertation is to study the physics of whistler turbulence evolution and its role in energy transport and dissipation in the solar wind plasmas through computational and theoretical investigations. This dissertation presents the first fully three-dimensional (3D) particle-in-cell (PIC) simulations of whistler turbulence forward cascade in a homogeneous, collisionless plasma with a uniform background magnetic field B_o, and the first 3D PIC simulation of whistler turbulence with both forward and inverse cascades. Such computationally demanding research is made possible through the use of massively parallel, high performance electromagnetic PIC simulations on state-of-the-art supercomputers. Simulations are carried out to study characteristic properties of whistler turbulence under variable solar wind fluctuation amplitude (epsilon_e) and electron beta (beta_e), relative contributions to energy dissipation and electron heating in whistler turbulence from the quasilinear scenario and the intermittency scenario, and whistler turbulence preferential cascading direction and wavevector anisotropy. The 3D simulations of whistler turbulence exhibit a forward cascade of fluctuations into a broadband, anisotropic, turbulent spectrum at shorter wavelengths with wavevectors preferentially quasi-perpendicular to B_o. The overall electron heating yields T_parallel > T_perp for all epsilon_e and beta_e values, indicating the primary linear wave-particle interaction is Landau damping. But linear wave-particle interactions play a minor role in shaping the wavevector spectrum, whereas nonlinear wave-wave interactions are overall stronger and faster processes, and ultimately determine the wavevector anisotropy. Simulated magnetic energy spectra as a function of wavenumber show a spectral break to steeper slopes, which scales as k_perp * lambda_e ≃ 1 independent of beta_e values, where lambda_e is the electron inertial length, qualitatively similar to solar wind observations. Specific

  1. Algorithms for supercomputers

    International Nuclear Information System (INIS)

    Alder, B.J.

    1986-01-01

    Better numerical procedures, improved computational power and additional physical insights have contributed significantly to progress in dealing with classical and quantum statistical mechanics problems. Past developments are discussed and future possibilities outlined

  2. Super-computer architecture

    CERN Document Server

    Hockney, R W

    1977-01-01

    This paper examines the design of the top-of-the-range, scientific, number-crunching computers. The market for such computers is not as large as that for smaller machines, but on the other hand it is by no means negligible. The present work-horse machines in this category are the CDC 7600 and IBM 360/195, and over fifty of the former machines have been sold. The types of installation that form the market for such machines are not only the major scientific research laboratories in the major countries-such as Los Alamos, CERN, Rutherford laboratory-but also major universities or university networks. It is also true that, as with sports cars, innovations made to satisfy the top of the market today often become the standard for the medium-scale computer of tomorrow. Hence there is considerable interest in examining present developments in this area. (0 refs).

  3. Algorithms for supercomputers

    International Nuclear Information System (INIS)

    Alder, B.J.

    1985-12-01

    Better numerical procedures, improved computational power and additional physical insights have contributed significantly to progress in dealing with classical and quantum statistical mechanics problems. Past developments are discussed and future possibilities outlined

  4. Multiscale Hy3S: Hybrid stochastic simulation for supercomputers

    Directory of Open Access Journals (Sweden)

    Kaznessis Yiannis N

    2006-02-01

    Full Text Available. Abstract. Background: Stochastic simulation has become a useful tool to both study natural biological systems and design new synthetic ones. By capturing the intrinsic molecular fluctuations of "small" systems, these simulations produce a more accurate picture of single cell dynamics, including interesting phenomena missed by deterministic methods, such as noise-induced oscillations and transitions between stable states. However, the computational cost of the original stochastic simulation algorithm can be high, motivating the use of hybrid stochastic methods. Hybrid stochastic methods partition the system into multiple subsets and describe each subset as a different representation, such as a jump Markov, Poisson, continuous Markov, or deterministic process. By applying valid approximations and self-consistently merging disparate descriptions, a method can be considerably faster, while retaining accuracy. In this paper, we describe Hy3S, a collection of multiscale simulation programs. Results: Building on our previous work on developing novel hybrid stochastic algorithms, we have created the Hy3S software package to enable scientists and engineers to both study and design extremely large well-mixed biological systems with many thousands of reactions and chemical species. We have added adaptive stochastic numerical integrators to permit the robust simulation of dynamically stiff biological systems. In addition, Hy3S has many useful features, including embarrassingly parallelized simulations with MPI; special discrete events, such as transcriptional and translation elongation and cell division; mid-simulation perturbations in both the number of molecules of species and reaction kinetic parameters; combinatorial variation of both initial conditions and kinetic parameters to enable sensitivity analysis; use of the NetCDF optimized binary format to quickly read and write large datasets; and a simple graphical user interface, written in Matlab, to help users
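
    The jump-Markov subsets in such hybrid schemes are usually handled with Gillespie's direct-method stochastic simulation algorithm; the sketch below shows that method for a single reversible reaction. It illustrates the algorithm itself and is not Hy3S code; the rate constants and molecule counts are arbitrary.

```cpp
#include <cmath>
#include <iostream>
#include <random>

// Gillespie direct-method SSA for the reversible isomerization A <-> B.
// Hybrid methods keep this exact treatment only for the slow, low-copy-number
// reactions and approximate the rest.
int main() {
    std::mt19937 rng(42);
    std::uniform_real_distribution<double> uni(0.0, 1.0);

    double k1 = 1.0, k2 = 0.5;       // rate constants for A->B and B->A
    long a = 100, b = 0;             // initial molecule counts
    double t = 0.0, tEnd = 10.0;

    while (t < tEnd) {
        double p1 = k1 * a, p2 = k2 * b;         // reaction propensities
        double ptot = p1 + p2;
        if (ptot <= 0.0) break;                  // no reaction can fire
        t += -std::log(uni(rng)) / ptot;         // exponentially distributed waiting time
        if (uni(rng) * ptot < p1) { --a; ++b; }  // fire A->B
        else                      { ++a; --b; }  // fire B->A
    }
    std::cout << "t = " << t << "  A = " << a << "  B = " << b << '\n';
    return 0;
}
```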

  5. Supercomputing for molecular dynamics simulations handling multi-trillion particles in nanofluidics

    CERN Document Server

    Heinecke, Alexander; Horsch, Martin; Bungartz, Hans-Joachim

    2015-01-01

    This work presents modern implementations of relevant molecular dynamics algorithms using ls1 mardyn, a simulation program for engineering applications. The text focuses strictly on HPC-related aspects, covering implementation on HPC architectures, taking Intel Xeon and Intel Xeon Phi clusters as representatives of current platforms. The work describes distributed and shared-memory parallelization on these platforms, including load balancing, with a particular focus on the efficient implementation of the compute kernels. The text also discusses the software-architecture of the resulting code.

  6. Modeling of the charge-state separation at ITEP experimental facility for material science based on a Bernas ion source.

    Science.gov (United States)

    Barminova, H Y; Saratovskyh, M S

    2016-02-01

    An experiment automation system is to be developed for the materials science experimental facility at ITEP, which is based on a Bernas ion source. The CAMFT program is expected to be incorporated into the experiment automation software. CAMFT is developed to simulate the motion of intense charged-particle bunches in external magnetic fields of arbitrary geometry by means of the accurate solution of the particle equations of motion. The program allows bunch intensities of up to 10^10 particles per bunch to be considered. Preliminary calculations are performed on the ITEP supercomputer. The results of the simulation of the beam pre-acceleration and the following turn in the magnetic field are presented for different initial conditions.
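
    A common way to integrate the equation of motion of a charged particle in a static magnetic field is the Boris rotation, which conserves kinetic energy exactly. The single-particle sketch below illustrates that scheme only; it makes no claim to reproduce CAMFT's integrator, and the field and step size are arbitrary.

```cpp
#include <array>
#include <iostream>

using Vec3 = std::array<double, 3>;

Vec3 cross(const Vec3& a, const Vec3& b) {
    return {a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]};
}

// One Boris step for a charged particle in a static magnetic field B (E = 0):
// the velocity is rotated about B, which conserves kinetic energy exactly.
void borisStep(Vec3& x, Vec3& v, const Vec3& B, double qOverM, double dt) {
    Vec3 t{}, s{};
    double t2 = 0.0;
    for (int i = 0; i < 3; ++i) { t[i] = 0.5 * qOverM * dt * B[i]; t2 += t[i] * t[i]; }
    for (int i = 0; i < 3; ++i) s[i] = 2.0 * t[i] / (1.0 + t2);
    Vec3 vPrime = v;
    Vec3 c1 = cross(v, t);
    for (int i = 0; i < 3; ++i) vPrime[i] += c1[i];       // v' = v + v x t
    Vec3 c2 = cross(vPrime, s);
    for (int i = 0; i < 3; ++i) v[i] += c2[i];            // v+ = v + v' x s
    for (int i = 0; i < 3; ++i) x[i] += v[i] * dt;        // drift with updated velocity
}

int main() {
    Vec3 x{0, 0, 0}, v{1.0, 0, 0}, B{0, 0, 1.0};          // gyration in the x-y plane
    for (int n = 0; n < 1000; ++n) borisStep(x, v, B, 1.0, 0.01);
    std::cout << "x = (" << x[0] << ", " << x[1] << ", " << x[2] << ")\n";
    return 0;
}
```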

  7. SOME PARADIGMS OF ARTIFICIAL INTELLIGENCE IN FINANCIAL COMPUTER SYSTEMS

    OpenAIRE

    Jerzy Balicki

    2015-01-01

    The article discusses some paradigms of artificial intelligence in the context of their applications in computer financial systems. The proposed approach has a significant potential to increase the competitiveness of enterprises, including financial institutions. However, it requires the effective use of supercomputers, grids and cloud computing. A reference is made to the computing environment for Bitcoin. In addition, we characterized genetic programming and artificial neural networks to p...

  8. Development of the real time monitor system

    Energy Technology Data Exchange (ETDEWEB)

    Kato, Katsumi [Research Organization for Information Science and Technology, Tokai, Ibaraki (Japan); Watanabe, Tadashi; Kaburaki, Hideo

    1996-10-01

    Large-scale simulation techniques are studied at the Center for Promotion of Computational Science and Engineering (CCSE) for computational science research in nuclear fields. Visualization and animation processing techniques are studied and developed for efficient understanding of simulation results. The real-time monitor system, in which on-going simulation results are transferred from a supercomputer or workstation to a graphics workstation and are visualized and recorded, is described in this report. The system is composed of a graphics workstation and video equipment connected to the network. The control shell programs are a job-execution shell for simulations on supercomputers, a file-transfer shell for output files for visualization, and a shell for starting visualization tools. No special image processing techniques or hardware are necessary in this system; the standard visualization tool AVS and standard UNIX commands are used, so the system can be implemented and applied in various computer environments. (author)

  9. Leveraging HPC resources for High Energy Physics

    International Nuclear Information System (INIS)

    O'Brien, B; Washbrook, A; Walker, R

    2014-01-01

    High Performance Computing (HPC) supercomputers provide unprecedented computing power for a diverse range of scientific applications. The most powerful supercomputers now deliver petaflop peak performance with the expectation of 'exascale' technologies available in the next five years. More recent HPC facilities use x86-based architectures managed by Linux-based operating systems which could potentially allow unmodified HEP software to be run on supercomputers. There is now a renewed interest from both the LHC experiments and the HPC community to accommodate data analysis and event simulation production on HPC facilities. This study provides an outline of the challenges faced when incorporating HPC resources for HEP software by using the HECToR supercomputer as a demonstrator.

  10. High Performance Computing in Science and Engineering '02 : Transactions of the High Performance Computing Center

    CERN Document Server

    Jäger, Willi

    2003-01-01

    This book presents the state-of-the-art in modeling and simulation on supercomputers. Leading German research groups present their results achieved on high-end systems of the High Performance Computing Center Stuttgart (HLRS) for the year 2002. Reports cover all fields of supercomputing simulation ranging from computational fluid dynamics to computer science. Special emphasis is given to industrially relevant applications. Moreover, by presenting results for both vector systems and microprocessor-based systems the book makes it possible to compare performance levels and usability of a variety of supercomputer architectures. It therefore becomes an indispensable guidebook to assess the impact of the Japanese Earth Simulator project on supercomputing in the years to come.

  11. Installation of a new Fortran compiler and effective programming method on the vector supercomputer

    International Nuclear Information System (INIS)

    Nemoto, Toshiyuki; Suzuki, Koichiro; Watanabe, Kenji; Machida, Masahiko; Osanai, Seiji; Isobe, Nobuo; Harada, Hiroo; Yokokawa, Mitsuo

    1992-07-01

    The Fortran compiler, version 10, was replaced with the new version 12 (V12) on the Fujitsu computer system at JAERI in May 1992. A benchmark test of the V12 compiler's performance was carried out with 16 representative nuclear codes before the installation of the compiler. On average, the compiler improved performance by a factor of 1.13. The effect of the compiler's enhanced functions and its compatibility with the nuclear codes were also examined. The assistant tool for vectorization, TOP10EX, was developed. In this report, the results of the evaluation of the V12 compiler and the usage of the tools for vectorization are presented. (author)

  12. Performance Analysis, Modeling and Scaling of HPC Applications and Tools

    Energy Technology Data Exchange (ETDEWEB)

    Bhatele, Abhinav [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2016-01-13

    Efficient use of supercomputers at DOE centers is vital for maximizing system throughput, minimizing energy costs and enabling science breakthroughs faster. This requires complementary efforts along several directions to optimize the performance of scientific simulation codes and the underlying runtimes and software stacks. This in turn requires providing scalable performance analysis tools and modeling techniques that can provide feedback to physicists and computer scientists developing the simulation codes and runtimes respectively. The PAMS project is using time allocations on supercomputers at ALCF, NERSC and OLCF to further the goals described above by performing research along the following fronts: 1. Scaling Study of HPC applications; 2. Evaluation of Programming Models; 3. Hardening of Performance Tools; 4. Performance Modeling of Irregular Codes; and 5. Statistical Analysis of Historical Performance Data. We are a team of computer and computational scientists funded by both DOE/NNSA and DOE/ASCR programs such as ECRP, XStack (Traleika Glacier, PIPER), ExaOSR (ARGO), SDMAV II (MONA) and PSAAP II (XPACC). This allocation will enable us to study big data issues when analyzing performance on leadership computing class systems and to assist the HPC community in making the most effective use of these resources.

  13. Computational chemistry

    Science.gov (United States)

    Arnold, J. O.

    1987-01-01

    With the advent of supercomputers and modern computational chemistry algorithms and codes, a powerful tool was created to help fill NASA's continuing need for information on the properties of matter in hostile or unusual environments. Computational resources provided under the National Aerodynamics Simulator (NAS) program were a cornerstone for recent advancements in this field. Properties of gases, materials, and their interactions can be determined from solutions of the governing equations. In the case of gases, for example, radiative transition probabilities per particle, bond-dissociation energies, and rates of simple chemical reactions can be determined computationally as reliably as from experiment. The data are proving to be quite valuable in providing inputs to real-gas flow simulation codes used to compute aerothermodynamic loads on NASA's aeroassist orbital transfer vehicles and a host of problems related to the National Aerospace Plane Program. Although more approximate, similar solutions can be obtained for ensembles of atoms simulating small particles of materials with and without the presence of gases. Computational chemistry has applications in studying catalysis and the properties of polymers, all of interest to various NASA missions, including those previously mentioned. In addition to discussing these applications of computational chemistry within NASA, the governing equations and the need for supercomputers for their solution are outlined.

  14. Oak Ridge National Laboratory Review

    Energy Technology Data Exchange (ETDEWEB)

    Krause, C.; Pearce, J.; Zucker, A. (eds.)

    1992-01-01

    This report presents brief descriptions of the following programs at Oak Ridge National Laboratory: The effects of pollution and climate change on forests; automation to improve the safety and efficiency of rearming battle tanks; new technologies for DNA sequencing; ORNL probes the human genome; ORNL as a supercomputer research center; paving the way to superconcrete made with polystyrene; a new look at supercritical water used in waste treatment; and small mammals as environmental monitors.

  15. Oak Ridge National Laboratory Review. Volume 25, No. 1, 1992

    Energy Technology Data Exchange (ETDEWEB)

    Krause, C.; Pearce, J.; Zucker, A. [eds.

    1992-10-01

    This report presents brief descriptions of the following programs at Oak Ridge National Laboratory: The effects of pollution and climate change on forests; automation to improve the safety and efficiency of rearming battle tanks; new technologies for DNA sequencing; ORNL probes the human genome; ORNL as a supercomputer research center; paving the way to superconcrete made with polystyrene; a new look at supercritical water used in waste treatment; and small mammals as environmental monitors.

  16. Survey and research on applications of parallel compiler; Heiretsu compiler tekiyorei ni kansuru chosa kenkyu hokokusho

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2001-03-31

    This report argues for the urgent establishment of a system for developing and maintaining advanced computing software and investigates strategies and guidelines for setting up such a system. Recognizing the importance of developing software such as operating systems for supercomputers, it surveys software technology development strategies, with particular attention to application software. It proposes that active efforts be made to develop strategic software for advanced computing, concentrating on strengthening the development work already under way. The strategic software named for development includes a next-generation semiconductor TCAD (technology computer-aided design) system, a protein structure/function analysis system, a fatigue simulation system, a next-generation fluid analysis system, a chemical reaction simulator, grid computing, and a nanodevice surface analysis system. (NEDO)

  17. A complete implementation of the conjugate gradient algorithm on a reconfigurable supercomputer

    International Nuclear Information System (INIS)

    Dubois, David H.; Dubois, Andrew J.; Connor, Carolyn M.; Boorman, Thomas M.; Poole, Stephen W.

    2008-01-01

    The conjugate gradient is a prominent iterative method for solving systems of sparse linear equations. Large-scale scientific applications often utilize a conjugate gradient solver at their computational core. In this paper we present a field programmable gate array (FPGA) based implementation of a double precision, non-preconditioned, conjugate gradient solver for finite-element or finite-difference methods. Our work utilizes the SRC Computers, Inc. MAPStation hardware platform along with the 'Carte' software programming environment to ease the programming workload when working with the hybrid (CPU/FPGA) environment. The implementation is designed to handle large sparse matrices of up to order N x N where N <= 116,394, with up to 7 non-zero, 64-bit elements per sparse row. This implementation utilizes an optimized sparse matrix-vector multiply operation which is critical for obtaining high performance. Direct parallel implementations of loop unrolling and loop fusion are utilized to extract performance from the various vector/matrix operations. Rather than utilize the FPGA devices as function off-load accelerators, our implementation uses the FPGAs to implement the core conjugate gradient algorithm. Measured run-time performance data is presented comparing the FPGA implementation to a software-only version, showing that the FPGA can outperform processors running at up to 30x the clock rate. In conclusion we take a look at the new SRC-7 system and estimate the performance of this algorithm on that architecture.
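
    For reference, the structure of a non-preconditioned conjugate gradient iteration built around a compressed-sparse-row matrix-vector product looks roughly as follows in plain C++. This is a generic software sketch of the standard algorithm, not the MAPStation/Carte implementation described in the paper.

```cpp
#include <cmath>
#include <iostream>
#include <vector>

// Compressed-sparse-row matrix-vector product y = A * x: the kernel whose
// optimization dominates conjugate-gradient performance.
struct Csr {
    int n;
    std::vector<int> rowPtr, col;
    std::vector<double> val;
    void multiply(const std::vector<double>& x, std::vector<double>& y) const {
        for (int i = 0; i < n; ++i) {
            double s = 0.0;
            for (int k = rowPtr[i]; k < rowPtr[i + 1]; ++k) s += val[k] * x[col[k]];
            y[i] = s;
        }
    }
};

double dot(const std::vector<double>& a, const std::vector<double>& b) {
    double s = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Non-preconditioned conjugate gradient for a symmetric positive-definite matrix.
std::vector<double> conjugateGradient(const Csr& A, const std::vector<double>& b,
                                      int maxIters = 1000, double tol = 1e-10) {
    std::vector<double> x(A.n, 0.0), r = b, p = b, Ap(A.n);
    double rr = dot(r, r);
    for (int it = 0; it < maxIters && std::sqrt(rr) > tol; ++it) {
        A.multiply(p, Ap);
        double alpha = rr / dot(p, Ap);
        for (int i = 0; i < A.n; ++i) { x[i] += alpha * p[i]; r[i] -= alpha * Ap[i]; }
        double rrNew = dot(r, r);
        for (int i = 0; i < A.n; ++i) p[i] = r[i] + (rrNew / rr) * p[i];
        rr = rrNew;
    }
    return x;
}

int main() {
    // 3x3 SPD tridiagonal [2 -1 0; -1 2 -1; 0 -1 2] with right-hand side (1,1,1).
    Csr A{3, {0, 2, 5, 7}, {0, 1, 0, 1, 2, 1, 2}, {2, -1, -1, 2, -1, -1, 2}};
    std::vector<double> b{1, 1, 1};
    auto x = conjugateGradient(A, b);
    std::cout << "x = " << x[0] << ", " << x[1] << ", " << x[2] << '\n';  // expect 1.5, 2, 1.5
    return 0;
}
```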

  18. From CERN, a data flow averaging 600 megabytes per second for ten consecutive days

    CERN Multimedia

    2005-01-01

    The supercomputer Grid successfully took up its first technological challenge. Eight supercomputing centers sustained a continuous flow of data over the Internet from CERN in Geneva and directed it to seven centers in Europe and the United States

  19. SOME PARADIGMS OF ARTIFICIAL INTELLIGENCE IN FINANCIAL COMPUTER SYSTEMS

    Directory of Open Access Journals (Sweden)

    Jerzy Balicki

    2015-12-01

    Full Text Available. The article discusses some paradigms of artificial intelligence in the context of their applications in computer financial systems. The proposed approach has a significant potential to increase the competitiveness of enterprises, including financial institutions. However, it requires the effective use of supercomputers, grids and cloud computing. A reference is made to the computing environment for Bitcoin. In addition, we characterized genetic programming and artificial neural networks to prepare investment strategies on the stock exchange market.

  20. Performance Analysis of FEM Algorithmson GPU and Many-Core Architectures

    KAUST Repository

    Khurram, Rooh; Kortas, Samuel

    2015-01-01

    -only Exascale systems will be unsustainable, thus accelerators such as graphic processing units (GPUs) and many-integrated-core (MIC) will likely be the integral part of the TOP500 (http://www.top500.org/) supercomputers, beyond 2020. The emerging supercomputer

  1. Hierarchical approach to optimization of parallel matrix multiplication on large-scale platforms

    KAUST Repository

    Hasanov, Khalid; Quintin, Jean-Noë l; Lastovetsky, Alexey

    2014-01-01

    -scale parallelism in mind. Indeed, while in 1990s a system with few hundred cores was considered a powerful supercomputer, modern top supercomputers have millions of cores. In this paper, we present a hierarchical approach to optimization of message-passing parallel

  2. Massively parallel self-consistent-field calculations

    International Nuclear Information System (INIS)

    Tilson, J.L.

    1994-01-01

    The advent of supercomputers with many computational nodes, each with its own independent memory, makes possible extremely fast computations. The author's work, as part of the US High Performance Computing and Communications Program (HPCCP), is focused on the development of electronic structure techniques for the solution of Grand Challenge-size molecules containing hundreds of atoms. These efforts have resulted in a fully scalable Direct-SCF program that is portable and efficient. This code, named NWCHEM, is built around a distributed-data model. The distributed data is managed by a software package called Global Arrays, developed within the HPCCP. Performance results are presented for Direct-SCF calculations of interest to the consortium

  3. Nevada may lose nuclear waste funds

    International Nuclear Information System (INIS)

    Marshall, E.

    1988-01-01

    The people of Nevada are concerned that a cut in DOE funding for a nuclear waste repository at Yucca Mountain, Nevada will result in cuts in the state monitoring program, e.g. dropping a seismic monitoring network and a sophisticated drilling program. Economic and social impact studies will be curtailed. Even though a provision to curtail local research forbids duplication of DOE's work and would limit the ability of Nevada to go out and collect its own data, Nevada State University at Las Vegas would receive a nice plum, a top-of-the-line supercomputer known as the ETA-10 costing almost $30 million financed by DOE

  4. Computational Approaches to Simulation and Optimization of Global Aircraft Trajectories

    Science.gov (United States)

    Ng, Hok Kwan; Sridhar, Banavar

    2016-01-01

    This study examines three possible approaches to improving the speed in generating wind-optimal routes for air traffic at the national or global level. They are: (a) using the resources of a supercomputer, (b) running the computations on multiple commercially available computers and (c) implementing those same algorithms into NASA's Future ATM Concepts Evaluation Tool (FACET), and compares those to a standard implementation run on a single CPU. Wind-optimal aircraft trajectories are computed using global air traffic schedules. The run time and wait time on the supercomputer for trajectory optimization using various numbers of CPUs ranging from 80 to 10,240 units are compared with the total computational time for running the same computation on a single desktop computer and on multiple commercially available computers for potential computational enhancement through parallel processing on the computer clusters. This study also re-implements the trajectory optimization algorithm for further reduction of computational time through algorithm modifications and integrates that with FACET to facilitate the use of the new features which calculate time-optimal routes between worldwide airport pairs in a wind field for use with existing FACET applications. The implementations of trajectory optimization algorithms use the MATLAB, Python, and Java programming languages. The performance evaluations are done by comparing their computational efficiencies and based on the potential application of optimized trajectories. The paper shows that in the absence of special privileges on a supercomputer, a cluster of commercially available computers provides a feasible approach for national and global air traffic system studies.

  5. Migration of vectorized iterative solvers to distributed memory architectures

    Energy Technology Data Exchange (ETDEWEB)

    Pommerell, C. [AT& T Bell Labs., Murray Hill, NJ (United States); Ruehl, R. [CSCS-ETH, Manno (Switzerland)

    1994-12-31

    Both necessity and opportunity motivate the use of high-performance computers for iterative linear solvers. Necessity results from the size of the problems being solved: smaller problems are often better handled by direct methods. Opportunity arises from the formulation of the iterative methods in terms of simple linear algebra operations, even if this "natural" parallelism is not easy to exploit in irregularly structured sparse matrices and with good preconditioners. As a result, high-performance implementations of iterative solvers have attracted a lot of interest in recent years. Most efforts are geared to vectorizing or parallelizing the dominating operation (structured or unstructured sparse matrix-vector multiplication), or to increasing locality and parallelism by reformulating the algorithm (reducing global synchronization in inner products or local data exchange in preconditioners). Target architectures for iterative solvers currently include mostly vector supercomputers and architectures with one or few optimized (e.g., super-scalar and/or super-pipelined RISC) processors and hierarchical memory systems. More recently, parallel computers with physically distributed memory and a better price/performance ratio have been offered by vendors as a very interesting alternative to vector supercomputers. However, programming comfort on such distributed memory parallel processors (DMPPs) still lags behind. Here the authors are concerned with iterative solvers and their changing computing environment. In particular, they are considering migration from traditional vector supercomputers to DMPPs. Application requirements force one to use flexible and portable libraries. They want to extend the portability of iterative solvers rather than reimplementing everything for each new machine, or even for each new architecture.

  6. Parallel implementation of the PHOENIX generalized stellar atmosphere program. II. Wavelength parallelization

    International Nuclear Information System (INIS)

    Baron, E.; Hauschildt, Peter H.

    1998-01-01

    We describe an important addition to the parallel implementation of our generalized nonlocal thermodynamic equilibrium (NLTE) stellar atmosphere and radiative transfer computer program PHOENIX. In a previous paper in this series we described data and task parallel algorithms we have developed for radiative transfer, spectral line opacity, and NLTE opacity and rate calculations. These algorithms divided the work spatially or by spectral lines, that is, distributing the radial zones, individual spectral lines, or characteristic rays among different processors and employ, in addition, task parallelism for logically independent functions (such as atomic and molecular line opacities). For finite, monotonic velocity fields, the radiative transfer equation is an initial value problem in wavelength, and hence each wavelength point depends upon the previous one. However, for sophisticated NLTE models of both static and moving atmospheres needed to accurately describe, e.g., novae and supernovae, the number of wavelength points is very large (200,000 - 300,000) and hence parallelization over wavelength can lead both to considerable speedup in calculation time and the ability to make use of the aggregate memory available on massively parallel supercomputers. Here, we describe an implementation of a pipelined design for the wavelength parallelization of PHOENIX, where the necessary data from the processor working on a previous wavelength point is sent to the processor working on the succeeding wavelength point as soon as it is known. Our implementation uses a MIMD design based on a relatively small number of standard message passing interface (MPI) library calls and is fully portable between serial and parallel computers. copyright 1998 The American Astronomical Society
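
    The pipelined wavelength parallelization described above amounts to each rank receiving the running state from the rank handling the previous wavelength point, doing its own work, and forwarding its result downstream as soon as it is known. The minimal MPI sketch below shows only that communication pattern; it is not PHOENIX code, whose transferred data structures are far richer than a single scalar.

```cpp
#include <mpi.h>
#include <cstdio>

// Minimal pipeline over "wavelength points": rank r waits for the state from
// rank r-1, processes its own point, and forwards the state to rank r+1,
// mirroring the dependency of each wavelength point on the previous one.
int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank = 0, size = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double state = 0.0;                                      // stand-in for the transfer state
    if (rank > 0)
        MPI_Recv(&state, 1, MPI_DOUBLE, rank - 1, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    state += 1.0;                                            // "solve" this wavelength point
    std::printf("rank %d processed its wavelength point, state = %.1f\n", rank, state);

    if (rank < size - 1)
        MPI_Send(&state, 1, MPI_DOUBLE, rank + 1, 0, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}
```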

  7. The GF11 supercomputer

    International Nuclear Information System (INIS)

    Beetem, J.; Weingarten, D.

    1986-01-01

    GF11 is a parallel computer currently under construction at the IBM Yorktown Research Center. The machine incorporates 576 floating-point processors arranged in a modified SIMD architecture. Each has space for 2 Mbytes of memory and is capable of 20 Mflops, giving the total machine a peak of 1.125 Gbytes of memory and 11.52 Gflops. The floating-point processors are interconnected by a dynamically reconfigurable non-blocking switching network. At each machine cycle any of 1024 pre-selected permutations of data can be realized among the processors. The main intended application of GF11 is a class of calculations arising from quantum chromodynamics

  8. The GF11 supercomputer

    International Nuclear Information System (INIS)

    Beetem, J.; Denneau, M.; Weingarten, D.

    1985-01-01

    GF11 is a parallel computer currently under construction at the IBM Yorktown Research Center. The machine incorporates 576 floating- point processors arranged in a modified SIMD architecture. Each has space for 2 Mbytes of memory and is capable of 20 Mflops, giving the total machine a peak of 1.125 Gbytes of memory and 11.52 Gflops. The floating-point processors are interconnected by a dynamically reconfigurable nonblocking switching network. At each machine cycle any of 1024 pre-selected permutations of data can be realized among the processors. The main intended application of GF11 is a class of calculations arising from quantum chromodynamics

  9. The GF11 supercomputer

    International Nuclear Information System (INIS)

    Beetem, J.; Denneau, M.; Weingarten, D.

    1985-01-01

    GF11 is a parallel computer currently under construction at the Yorktown Research Center. The machine incorporates 576 floating-point processors arranged in a modified SIMD architecture. Each processor has space for 2 Mbytes of memory and is capable of 20 MFLOPS, giving the total machine a peak of 1.125 Gbytes of memory and 11.52 GFLOPS. The floating-point processors are interconnected by a dynamically reconfigurable non-blocking switching network. At each machine cycle any of 1024 pre-selected permutations of data can be realized among the processors. The main intended application of GF11 is a class of calculations arising from quantum chromodynamics, a proposed theory of the elementary particles which participate in nuclear interactions

  10. Easy Access to HPC Resources through the Application GUI

    KAUST Repository

    van Waveren, Matthijs

    2016-11-01

    The computing environment at the King Abdullah University of Science and Technology (KAUST) is growing in size and complexity. KAUST hosts the tenth fastest supercomputer in the world (Shaheen II) and several HPC clusters. Researchers can be inhibited by the complexity, as they need to learn new languages and execute many tasks in order to access the HPC clusters and the supercomputer. In order to simplify the access, we have developed an interface between the applications and the clusters and supercomputer that automates the transfer of input data and job submission and also the retrieval of results to the researcher’s local workstation. The innovation is that the user now submits his jobs from within the application GUI on his workstation, and does not have to directly log into the clusters or supercomputer anymore. This article details the solution and its benefits to the researchers.

  11. Computational mechanics - Advances and trends; Proceedings of the Session - Future directions of Computational Mechanics of the ASME Winter Annual Meeting, Anaheim, CA, Dec. 7-12, 1986

    Science.gov (United States)

    Noor, Ahmed K. (Editor)

    1986-01-01

    The papers contained in this volume provide an overview of the advances made in a number of aspects of computational mechanics, identify some of the anticipated industry needs in this area, discuss the opportunities provided by new hardware and parallel algorithms, and outline some of the current government programs in computational mechanics. Papers are included on advances and trends in parallel algorithms, supercomputers for engineering analysis, material modeling in nonlinear finite-element analysis, the Navier-Stokes computer, and future finite-element software systems.

  12. Can Complex Collective Behaviour Be Generated Through Randomness, Memory and a Pinch of Luck?

    OpenAIRE

    Pereira, Pedro M. F.

    2017-01-01

    Machine Learning techniques have been used to teach computer programs how to play games as complicated as Chess and Go. These were achieved using powerful tools such as Neural Networks and Parallel Computing on Supercomputers. In this paper, we define a model of populational growth and evolution based on the idea of Reinforcement Learning, but using only the 3 sources stated in the title processed on a low-tier laptop. The model correctly predicts the development of a population around food s...

  13. Building the Teraflops/Petabytes Production Computing Center

    International Nuclear Information System (INIS)

    Kramer, William T.C.; Lucas, Don; Simon, Horst D.

    1999-01-01

    In just one decade, the 1990s, supercomputer centers have undergone two fundamental transitions which require rethinking their operation and their role in high performance computing. The first transition in the early to mid-1990s resulted from a technology change in high performance computing architecture. Highly parallel distributed memory machines built from commodity parts increased the operational complexity of the supercomputer center, and required the introduction of intellectual services as equally important components of the center. The second transition is happening in the late 1990s as centers are introducing loosely coupled clusters of SMPs as their premier high performance computing platforms, while dealing with an ever-increasing volume of data. In addition, increasing network bandwidth enables new modes of use of a supercomputer center, in particular, computational grid applications. In this paper we describe what steps NERSC is taking to address these issues and stay at the leading edge of supercomputing centers.

  14. Exascale Data Analysis

    CERN Multimedia

    CERN. Geneva; Fitch, Blake

    2011-01-01

    Traditionally, the primary role of supercomputers was to create data, primarily for simulation applications. Due to usage and technology trends, supercomputers are increasingly also used for data analysis. Some of this data is from simulations, but there is also a rapidly increasing amount of real-world science and business data to be analyzed. We briefly overview Blue Gene and other current supercomputer architectures. We outline future architectures, up to the Exascale supercomputers expected in the 2020 time frame. We focus on the data analysis challenges and opportunities, especially those concerning Flash and other up-and-coming storage class memory. About the speakers: Blake G. Fitch has been with IBM Research, Yorktown Heights, NY since 1987, mainly pursuing interests in parallel systems. He joined the Scalable Parallel Systems Group in 1990, contributing to research and development that culminated in the IBM scalable parallel system (SP*) product. His research interests have focused on applicatio...

  15. Los Alamos Nuclear Plant Analyzer: an interactive power-plant simulation program

    International Nuclear Information System (INIS)

    Steinke, R.; Booker, C.; Giguere, P.; Liles, D.R.; Mahaffy, J.H.; Turner, M.R.

    1984-01-01

    The Nuclear Plant Analyzer (NPA) is a computer-software interface for executing the TRAC or RELAP5 power-plant systems codes. The NPA is designed to use advanced supercomputers, long-distance data communications, and a remote workstation terminal with interactive computer graphics to analyze power-plant thermal-hydraulic behavior. The NPA interface simplifies the running of these codes through automated procedures and dialog interaction. User understanding of simulated-plant behavior is enhanced through graphics displays of calculational results. These results are displayed concurrently with the calculation. The user has the capability to override the plant's modeled control system with hardware-adjustment commands. This gives the NPA the utility of a simulator, and at the same time, the accuracy of an advanced, best-estimate, power-plant systems code for plant operation and safety analysis

  16. Nuclear Plant Analyzer: an interactive TRAC/RELAP Power-Plant Simulation Program

    International Nuclear Information System (INIS)

    Steinke, R.; Booker, C.; Giguere, P.; Liles, D.; Mahaffy, J.; Turner, M.; Wiley, R.

    1984-01-01

    The Nuclear Plant Analyzer (NPA) is a computer-software interface for executing the TRAC or RELAP5 power-plant systems codes. The NPA is designed to use advanced supercomputers, long-distance data communications, and a remote workstation terminal with interactive computer graphics to analyze power-plant thermal-hydraulic behavior. The NPA interface simplifies the running of these codes through automated procedures and dialog interaction. User understanding of simulated-plant behavior is enhanced through graphics displays of calculational results. These results are displayed concurrently with the calculation. The user has the capability to override the plant's modeled control system with hardware adjustment commands. This gives the NPA the utility of a simulator, and at the same time, the accuracy of an advanced, best-estimate, power-plant systems code for plant operation and safety analysis

  17. GRID and FMPhI-UNIBA

    International Nuclear Information System (INIS)

    Babik, M.; Daranyi, T.; Fekete, V.; Stavina, P.; Zagiba, M.; Zenis, T.

    2008-01-01

    The word GRID is not an abbreviation and has several meanings, all of which describe GRID as a combined hardware and software solution for distributed computing. In addition, the word GRID is also used for distributed computing across many computers rather than on a single supercomputer with several processors; this, of course, does not mean that such a supercomputer cannot be part of a GRID. Typical GRID tasks are the execution of computer programs and data storage. (Authors)

  18. Plasma physics on the TI-85 calculator or Down with supercomputers

    International Nuclear Information System (INIS)

    Sedlacek, Z.

    1998-10-01

    In the Fourier transformed velocity space the Vlasov plasma oscillations may be interpreted as a wave propagation process corresponding to an imperfectly trapped (leaking) wave. The Landau damped solutions of the Vlasov-Poisson equation then become genuine Eigenmodes corresponding to complex eigenvalues. To illustrate this new interpretation we solve numerically the Fourier transformed Vlasov-Poisson equation, essentially a perturbed advective equation, on the TI-85 pocket graphics calculator. A program is described, based on the method of lines: A finite-difference scheme is utilized to discretize the transformed equation and the resulting set of ordinary differential equations is then solved in time. The user can choose from several possible finite-difference differentiation schemes differing in the total number of points and the number of downwind points. The resulting evolution of the electric field showing the Landau damped plasma oscillations is displayed on the screen of the calculator. In addition, calculation of the eigenvalues of the Fourier transformed Vlasov-Poisson operator is possible. The user can also experiment with the numerical solution of the advective equation which describes free streaming. (author)
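
    The numerical core described above, a finite-difference discretization of an advective equation advanced in time by the method of lines, can be sketched in a few lines. The example below uses a first-order upwind difference with forward Euler time stepping on a periodic grid; it is a generic illustration, not the TI-85 program, and the grid size and CFL number are arbitrary.

```cpp
#include <algorithm>
#include <cmath>
#include <iostream>
#include <vector>

// Method of lines for the advection equation u_t + c u_x = 0 on a periodic grid:
// space is discretized with a first-order upwind difference and the resulting
// system of ODEs is advanced with forward Euler.
int main() {
    const int n = 200;
    const double c = 1.0, dx = 1.0 / n, dt = 0.4 * dx / c, tEnd = 0.5;  // CFL = 0.4

    std::vector<double> u(n), uNew(n);
    for (int i = 0; i < n; ++i)                       // initial Gaussian pulse
        u[i] = std::exp(-200.0 * std::pow(i * dx - 0.25, 2));

    for (double t = 0.0; t < tEnd; t += dt) {
        for (int i = 0; i < n; ++i) {
            int im1 = (i + n - 1) % n;                // periodic upwind neighbour
            uNew[i] = u[i] - c * dt / dx * (u[i] - u[im1]);
        }
        u.swap(uNew);
    }
    std::cout << "max of u after t = " << tEnd << " : "
              << *std::max_element(u.begin(), u.end()) << '\n';
    return 0;
}
```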

  19. [Supercomputer investigation of the protein-ligand system low-energy minima].

    Science.gov (United States)

    Oferkin, I V; Sulimov, A V; Katkova, E V; Kutov, D K; Grigoriev, F V; Kondakova, O A; Sulimov, V B

    2015-01-01

    The accuracy of protein-ligand binding energy calculations and ligand positioning is strongly influenced by the choice of the docking target function. This work evaluates five different target functions used in docking: functions based on the MMFF94 force field and functions based on the PM7 quantum-chemical method, with or without an implicit solvent model (PCM, COSMO or SGB). For these purposes the ligand positions corresponding to the minima of the target function and the experimentally known ligand positions in the protein active site (crystal ligand positions) were compared. Each function was examined on the same test set of 16 protein-ligand complexes. The new parallelized docking program FLM, based on a Monte Carlo search algorithm, was developed to perform a comprehensive low-energy minima search and to calculate the protein-ligand binding energy. This study demonstrates that the docking target function based on the MMFF94 force field can be used to detect the crystal or near-crystal positions of the ligand by finding the low-energy local-minima spectrum of the target function. The importance of accounting for solvent in the docking process for accurate ligand positioning is also shown. The accuracy of the ligand positioning as well as the correlation between the calculated and experimentally determined protein-ligand binding energies are improved when the MMFF94 force field is replaced by the new PM7 method with implicit solvent accounting.
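
    The low-energy-minima search can be illustrated generically with a random-restart Monte Carlo minimization of a toy scoring function. Everything in the sketch below (the two-variable function, step sizes, restart counts) is invented for illustration; FLM itself evaluates a real force-field or quantum-chemical energy of a ligand pose and uses its own search strategy.

```cpp
#include <cmath>
#include <iostream>
#include <random>

// Toy "scoring function" with several local minima, standing in for the
// protein-ligand target function evaluated by a real docking code.
double score(double x, double y) {
    return std::sin(3 * x) * std::cos(3 * y) + 0.1 * (x * x + y * y);
}

int main() {
    std::mt19937 rng(7);
    std::uniform_real_distribution<double> start(-3.0, 3.0);
    std::normal_distribution<double> step(0.0, 0.1);

    double bestX = 0, bestY = 0, bestE = 1e300;
    for (int restart = 0; restart < 64; ++restart) {         // independent random starts
        double x = start(rng), y = start(rng), e = score(x, y);
        for (int it = 0; it < 2000; ++it) {                  // greedy Monte Carlo descent
            double xt = x + step(rng), yt = y + step(rng), et = score(xt, yt);
            if (et < e) { x = xt; y = yt; e = et; }          // accept only improvements
        }
        if (e < bestE) { bestE = e; bestX = x; bestY = y; }  // keep the lowest minimum found
    }
    std::cout << "lowest minimum found: E = " << bestE
              << " at (" << bestX << ", " << bestY << ")\n";
    return 0;
}
```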

  20. Fireballs in the Sky: An Augmented Reality Citizen Science Program

    Science.gov (United States)

    Day, Brian

    2017-01-01

    Fireballs in the Sky is an innovative Australian citizen science program that connects the public with the research of the Desert Fireball Network (DFN). This research aims to understand the early workings of the solar system, and Fireballs in the Sky invites people around the world to learn about this science, contributing fireball sightings via a user-friendly augmented reality mobile app. Tens of thousands of people have downloaded the app world-wide and participated in the science of meteoritics. The Fireballs in the Sky app allows users to get involved with the Desert Fireball Network research, supplementing DFN observations and providing enhanced coverage by reporting their own meteor sightings to DFN scientists. Fireballs in the Sky reports are used to track the trajectories of meteors - from their orbit in space to where they might have landed on Earth. Led by Phil Bland at Curtin University in Australia, the Desert Fireball Network (DFN) uses automated observatories across Australia to triangulate trajectories of meteorites entering the atmosphere, determine pre-entry orbits, and pinpoint their fall positions. Each observatory is an autonomous intelligent imaging system, taking 1000 by 36 megapixel all-sky images throughout the night, using neural network algorithms to recognize events. They are capable of operating for 12 months in a harsh environment, and store all imagery collected. We developed a completely automated software pipeline for data reduction, and built a supercomputer database for storage, allowing us to process our entire archive. The DFN currently stands at 50 stations distributed across the Australian continent, covering an area of 2.5 million square kilometers. Working with DFN's partners at NASA's Solar System Exploration Research Virtual Institute, the team is expanding the network beyond Australia to locations around the world. Fireballs in the Sky allows a growing public base to learn about and participate in this exciting research.

  1. Fireballs in the Sky: an Augmented Reality Citizen Science Program

    Science.gov (United States)

    Day, B. H.; Bland, P.; Sayers, R.

    2017-12-01

    Fireballs in the Sky is an innovative Australian citizen science program that connects the public with the research of the Desert Fireball Network (DFN). This research aims to understand the early workings of the solar system, and Fireballs in the Sky invites people around the world to learn about this science, contributing fireball sightings via a user-friendly augmented reality mobile app. Tens of thousands of people have downloaded the app worldwide and participated in the science of meteoritics. The Fireballs in the Sky app allows users to get involved with the Desert Fireball Network research, supplementing DFN observations and providing enhanced coverage by reporting their own meteor sightings to DFN scientists. Fireballs in the Sky reports are used to track the trajectories of meteors - from their orbit in space to where they might have landed on Earth. Led by Phil Bland at Curtin University in Australia, the Desert Fireball Network (DFN) uses automated observatories across Australia to triangulate trajectories of meteorites entering the atmosphere, determine pre-entry orbits, and pinpoint their fall positions. Each observatory is an autonomous intelligent imaging system, taking one thousand 36-megapixel all-sky images throughout the night and using neural network algorithms to recognize events. The observatories are capable of operating for 12 months in a harsh environment, and store all imagery collected. We developed a completely automated software pipeline for data reduction, and built a supercomputer database for storage, allowing us to process our entire archive. The DFN currently stands at 50 stations distributed across the Australian continent, covering an area of 2.5 million km². Working with DFN's partners at NASA's Solar System Exploration Research Virtual Institute, the team is expanding the network beyond Australia to locations around the world. Fireballs in the Sky allows a growing public base to learn about and participate in this exciting research.

  2. ASTEC: Controls analysis for personal computers

    Science.gov (United States)

    Downing, John P.; Bauer, Frank H.; Thorpe, Christopher J.

    1989-01-01

    The ASTEC (Analysis and Simulation Tools for Engineering Controls) software is under development at Goddard Space Flight Center (GSFC). The design goal is to provide a wide selection of controls analysis tools at the personal computer level, as well as the capability to upload compute-intensive jobs to a mainframe or supercomputer. The project is a follow-on to the INCA (INteractive Controls Analysis) program that has been developed at GSFC over the past five years. While ASTEC makes use of the algorithms and expertise developed for the INCA program, the user interface was redesigned to take advantage of the capabilities of the personal computer. The design philosophy and the current capabilities of the ASTEC software are described.

  3. Institutional Research and Development: (Annual report), FY 1986

    Energy Technology Data Exchange (ETDEWEB)

    Strack, B. (ed.)

    1987-01-01

    The Institutional Research and Development (IR and D) program was established at the Lawrence Livermore National Laboratory (LLNL) by the Director in October 1984. The IR and D program fosters exploratory work to advance science and technology; disciplinary research to create varied, innovative approaches to selected scientific fields; and long-term research in support of the defense and energy missions at LLNL. Each project in the IR and D program was selected after personal interviews by the Director and his delegates and was deemed to show unusual promise. These projects include research in the following fields: chemistry and materials science, computation, earth sciences, engineering, nuclear chemistry, biotechnology, environmental consequences of nuclear war, geophysics and planetary physics, and supercomputer research and development. A separate section of the report is devoted to research projects receiving individual awards.

  4. Institutional Research and Development: [Annual report], FY 1986

    International Nuclear Information System (INIS)

    Strack, B.

    1987-01-01

    The Institutional Research and Development (IR and D) program was established at the Lawrence Livermore National Laboratory (LLNL) by the Director in October 1984. The IR and D program fosters exploratory work to advance science and technology; disciplinary research to create varied, innovative approaches to selected scientific fields; and long-term research in support of the defense and energy missions at LLNL. Each project in the IR and D program was selected after personal interviews by the Director and his delegates and was deemed to show unusual promise. These projects include research in the following fields: chemistry and materials science, computation, earth sciences, engineering, nuclear chemistry, biotechnology, environmental consequences of nuclear war, geophysics and planetary physics, and supercomputer research and development. A separate section of the report is devoted to research projects receiving individual awards

  5. Designing a Scalable Fault Tolerance Model for High Performance Computational Chemistry: A Case Study with Coupled Cluster Perturbative Triples.

    Science.gov (United States)

    van Dam, Hubertus J J; Vishnu, Abhinav; de Jong, Wibe A

    2011-01-11

    In the past couple of decades, the massive computational power provided by the most modern supercomputers has enabled the simulation of higher-order computational chemistry methods previously considered intractable. As system sizes continue to increase, the computational chemistry domain continues to escalate this trend using parallel computing with programming models such as the Message Passing Interface (MPI) and Partitioned Global Address Space (PGAS) models such as Global Arrays. The ever increasing scale of these supercomputers comes at the cost of reduced Mean Time Between Failures (MTBF), currently on the order of days and projected to be on the order of hours for upcoming extreme scale systems. While traditional disk-based checkpointing methods are ubiquitous for storing intermediate solutions, they suffer from the high overhead of writing and recovering from checkpoints. In practice, checkpointing itself often brings the system down. Clearly, methods beyond checkpointing are imperative for coping with the steadily decreasing MTBF. In this paper, we address this challenge by designing and implementing an efficient fault-tolerant version of the Coupled Cluster (CC) method in NWChem, using in-memory data redundancy. We present the challenges associated with our design, including an efficient data storage model, maintenance of at least one consistent data copy, and the recovery process. Our performance evaluation without faults shows that the current design exhibits a small overhead. In the presence of a simulated fault, the proposed design incurs negligible overhead in comparison to the state-of-the-art implementation without faults.
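
    A minimal sketch (using mpi4py, an assumption) of the in-memory data-redundancy idea described above: after each update, every rank mirrors its block of data onto a partner rank, so a consistent copy survives the loss of either one. The real NWChem implementation works on Global Arrays tiles and a full recovery protocol, not raw NumPy buffers.

```python
# Pairwise in-memory redundancy: each rank keeps a backup of its partner's data,
# refreshed after every update, as an alternative to disk checkpointing.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
partner = rank ^ 1                       # pair ranks (0,1), (2,3), ...; assumes an even rank count

local = np.full(1024, float(rank))       # this rank's block of intermediate CC data
backup = np.empty_like(local)            # partner's block, mirrored here

def update_and_mirror(local, backup):
    local += 1.0                         # stand-in for one CC iteration updating the block
    # Exchange copies so each rank holds a redundant image of its partner's data.
    comm.Sendrecv(sendbuf=local, dest=partner, recvbuf=backup, source=partner)
    return local, backup

local, backup = update_and_mirror(local, backup)
# If the partner rank were lost, its block could be restored from `backup` on a
# replacement process instead of rolling the whole job back to a disk checkpoint.
```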

  6. Proton decay: Numerical simulations confront grand unification

    International Nuclear Information System (INIS)

    Brower, R.C.; Maturana, G.; Giles, R.C.; Moriarty, K.J.M.; Samuel, S.

    1985-01-01

    The Grand Unified Theories of the electromagnetic, weak and strong interactions constitute a far reaching attempt to synthesize our knowledge of theoretical particle physics into a consistent and compelling whole. Unfortunately, many quantitative predictions of such unified theories are sensitive to the analytically intractable effects of the strong subnuclear theory (Quantum Chromodynamics or QCD). The consequence is that even ambitious experimental programs exploring weak and super-weak interaction effects often fail to give definitive theoretical tests. This paper describes large-scale calculations on a supercomputer which can help to overcome this gap between theoretical predictions and experimental results. Our focus here is on proton decay, though the methods described are useful for many weak processes. The basic algorithms for the numerical simulation of QCD are well known. We will discuss the advantages and challenges of applying these methods to weak transitions. The algorithms require a very large data base with regular data flow and are natural candidates for vectorization. Also, 32-bit floating point arithmetic is adequate. Thus they are most naturally approached using a supercomputer alone or in combination with a dedicated special purpose processor. (orig.)

  7. Virtual laboratory for fusion research in Japan

    International Nuclear Information System (INIS)

    Tsuda, K.; Nagayama, Y.; Yamamoto, T.; Horiuchi, R.; Ishiguro, S.; Takami, S.

    2008-01-01

    A virtual laboratory system for nuclear fusion research in Japan has been developed using SuperSINET, a super high-speed network operated by the National Institute of Informatics. Sixteen sites, including major Japanese universities, the Japan Atomic Energy Agency and the National Institute for Fusion Science (NIFS), were mutually connected to SuperSINET at a speed of 1 Gbps by the end of the 2006 fiscal year. Collaboration categories in this virtual laboratory are as follows: the large helical device (LHD) remote participation; the remote use of the supercomputer system; and the all-Japan ST (Spherical Tokamak) research program. This virtual laboratory is a closed network system, connected to the Internet through the NIFS firewall in order to maintain a high level of security. Collaborators at a remote station can control their diagnostic devices at LHD and analyze the LHD data as if they were in the LHD control room. Researchers at a remote station can use the NIFS supercomputer in the same environment as at NIFS. In this paper, we describe the technologies in detail and the present status of the virtual laboratory. Furthermore, the items that should be developed in the near future are also described.

  8. Advances in petascale kinetic plasma simulation with VPIC and Roadrunner

    Energy Technology Data Exchange (ETDEWEB)

    Bowers, Kevin J [Los Alamos National Laboratory; Albright, Brian J [Los Alamos National Laboratory; Yin, Lin [Los Alamos National Laboratory; Daughton, William S [Los Alamos National Laboratory; Roytershteyn, Vadim [Los Alamos National Laboratory; Kwan, Thomas J T [Los Alamos National Laboratory

    2009-01-01

    VPIC, a first-principles 3d electromagnetic charge-conserving relativistic kinetic particle-in-cell (PIC) code, was recently adapted to run on Los Alamos's Roadrunner, the first supercomputer to break a petaflop (10^15 floating point operations per second) in the TOP500 supercomputer performance rankings. The authors give a brief overview of the modeling capabilities and optimization techniques used in VPIC and the computational characteristics of petascale supercomputers like Roadrunner. They then discuss three applications enabled by VPIC's unprecedented performance on Roadrunner: modeling laser-plasma interaction in upcoming inertial confinement fusion experiments at the National Ignition Facility (NIF), modeling short-pulse laser GeV ion acceleration, and modeling reconnection in magnetic confinement fusion experiments.

  9. Patterns for Parallel Software Design

    CERN Document Server

    Ortega-Arjona, Jorge Luis

    2010-01-01

    Essential reading to understand patterns for parallel programming. Software patterns have revolutionized the way we think about how software is designed, built, and documented, and the design of parallel software requires you to consider other particular design aspects and special skills. From clusters to supercomputers, success heavily depends on the design skills of software developers. Patterns for Parallel Software Design presents a pattern-oriented software architecture approach to parallel software design. This approach is not a design method in the classic sense, but a new way of managing ...

  10. Computational approach to large quantum dynamical problems

    International Nuclear Information System (INIS)

    Friesner, R.A.; Brunet, J.P.; Wyatt, R.E.; Leforestier, C.; Binkley, S.

    1987-01-01

    The organizational structure is described for a new program that permits computations on a variety of quantum mechanical problems in chemical dynamics and spectroscopy. Particular attention is devoted to developing and using algorithms that exploit the capabilities of current vector supercomputers. A key component in this procedure is the recursive transformation of the large sparse Hamiltonian matrix into a much smaller tridiagonal matrix. An application to time-dependent laser-molecule energy transfer is presented. Emphasis is placed on the rate of energy deposition in the multimode molecule as the molecular intermode coupling parameters are varied systematically.
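
    The recursive transformation of a large sparse Hamiltonian into a much smaller tridiagonal matrix is a Lanczos-type recursion; a minimal sketch is given below. The tight-binding "Hamiltonian" and the matrix sizes are illustrative assumptions, not the systems treated in the paper.

```python
# Lanczos recursion: build an m-by-m tridiagonal matrix T whose extremal
# eigenvalues approximate those of the large sparse matrix H.
import numpy as np
import scipy.sparse as sp

def lanczos_tridiagonalize(H, v0, m):
    """Reduce H to a small tridiagonal matrix using m Lanczos recursion steps."""
    n = H.shape[0]
    alphas, betas = [], []
    q_prev = np.zeros(n)
    q = v0 / np.linalg.norm(v0)
    beta = 0.0
    for _ in range(m):
        w = H @ q - beta * q_prev
        alpha = float(np.dot(q, w))
        w -= alpha * q
        alphas.append(alpha)
        beta = float(np.linalg.norm(w))
        if beta < 1e-12:                     # invariant subspace reached; stop early
            break
        betas.append(beta)
        q_prev, q = q, w / beta
    k = len(alphas)
    return np.diag(alphas) + np.diag(betas[: k - 1], 1) + np.diag(betas[: k - 1], -1)

# Toy sparse "Hamiltonian": a 1-D tight-binding chain (illustrative only).
n = 2000
H = sp.diags([np.full(n - 1, -1.0), np.linspace(0.0, 1.0, n), np.full(n - 1, -1.0)], [-1, 0, 1])
T = lanczos_tridiagonalize(H, np.random.default_rng(0).normal(size=n), m=50)
print(np.linalg.eigvalsh(T)[:3])             # extremal eigenvalues converge first
```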

  11. Advances in software science and technology

    CERN Document Server

    Hikita, Teruo; Kakuda, Hiroyasu

    1993-01-01

    Advances in Software Science and Technology, Volume 4 provides information pertinent to the advancement of the science and technology of computer software. This book discusses the various applications for computer systems.Organized into two parts encompassing 10 chapters, this volume begins with an overview of the historical survey of programming languages for vector/parallel computers in Japan and describes compiling methods for supercomputers in Japan. This text then explains the model of a Japanese software factory, which is presented by the logical configuration that has been satisfied by

  12. Lawrence Livermore National Laboratory selects Intel Itanium 2 processors for world's most powerful Linux cluster

    CERN Multimedia

    2003-01-01

    "Intel Corporation, system manufacturer California Digital and the University of California at Lawrence Livermore National Laboratory (LLNL) today announced they are building one of the world's most powerful supercomputers. The supercomputer project, codenamed "Thunder," uses nearly 4,000 Intel® Itanium® 2 processors... is expected to be complete in January 2004" (1 page).

  13. Molecular dynamics calculation of shear viscosity for molten salt

    International Nuclear Information System (INIS)

    Okamoto, Yoshihiro; Yokokawa, Mitsuo; Ogawa, Toru

    1993-12-01

    A molecular dynamics simulation program has been developed to calculate the shear viscosity of molten salts. The correlation function of an off-diagonal component of the stress tensor is obtained as a result of the calculation. The shear viscosity is then calculated by integrating this correlation function according to the Kubo-type formula. Shear viscosities of molten KCl at temperatures from 1047 K to 1273 K were calculated using the program. A calculation of 10^5 steps (1 step corresponds to 5 x 10^-15 s) was performed for each temperature in a 216-ion system. The results obtained were in good agreement with the reported experimental values. The program has been vectorized to achieve faster computation on a supercomputer. This makes it possible to calculate the viscosity with statistics amounting to several million MD steps. (author)
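
    A minimal sketch of the Green-Kubo (Kubo-type) evaluation described above: the shear viscosity follows from the time integral of the autocorrelation function of an off-diagonal stress component. The stress trace, volume and temperature below are synthetic stand-ins for actual MD output.

```python
# Shear viscosity from the Green-Kubo relation:
#   eta = V / (k_B T) * integral over t of <P_xy(0) P_xy(t)>
import numpy as np

def autocorrelation(x, max_lag):
    x = x - x.mean()
    return np.array([np.mean(x[: len(x) - k] * x[k:]) for k in range(max_lag)])

def green_kubo_viscosity(p_xy, dt, volume, temperature, max_lag, k_B=1.380649e-23):
    acf = autocorrelation(p_xy, max_lag)
    return volume / (k_B * temperature) * np.trapz(acf, dx=dt)

# Synthetic stress trace standing in for MD output (SI units assumed throughout).
rng = np.random.default_rng(1)
p_xy = rng.normal(scale=1.0e6, size=100_000)                    # off-diagonal stress, Pa
eta = green_kubo_viscosity(p_xy, dt=5.0e-15, volume=1.0e-26,
                           temperature=1100.0, max_lag=2000)
print(f"estimated shear viscosity: {eta:.3e} Pa*s")
```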

  14. Workstations take over conceptual design

    Science.gov (United States)

    Kidwell, George H.

    1987-01-01

    Workstations provide sufficient computing memory and speed for early evaluations of aircraft design alternatives to identify those worthy of further study. It is recommended that the programming of such machines permit integrated calculations of the configuration and performance analysis of new concepts, along with the capability of changing up to 100 variables at a time and swiftly viewing the results. Computations can be augmented through links to mainframes and supercomputers. Programming, particularly debugging, is enhanced by the capability of working with one program line at a time and having on-screen error indices available. Workstation networks permit on-line communication among users and with persons and computers outside the facility. Application of these capabilities is illustrated through a description of NASA-Ames design efforts, performed on a MicroVAX network, for an oblique wing for a jet.

  15. Federated data storage system prototype for LHC experiments and data intensive science

    Science.gov (United States)

    Kiryanov, A.; Klimentov, A.; Krasnopevtsev, D.; Ryabinkin, E.; Zarochentsev, A.

    2017-10-01

    The rapid increase of data volume from the experiments running at the Large Hadron Collider (LHC) has prompted the physics computing community to evaluate new data handling and processing solutions. Russian grid sites and university clusters scattered over a large area aim to unite their resources for future productive work, at the same time providing an opportunity to support large physics collaborations. In our project we address the fundamental problem of designing a computing architecture to integrate distributed storage resources for LHC experiments and other data-intensive science applications and to provide access to data from heterogeneous computing facilities. Studies include the development and implementation of a federated data storage prototype for Worldwide LHC Computing Grid (WLCG) centres of different levels and university clusters within one National Cloud. The prototype is based on computing resources located in Moscow, Dubna, Saint Petersburg, Gatchina and Geneva. The project intends to implement federated distributed storage for all kinds of operations, such as read/write/transfer and access via WAN, from Grid centres, university clusters, supercomputers, and academic and commercial clouds. The efficiency and performance of the system are demonstrated using synthetic and experiment-specific tests, including real data processing and analysis workflows from the ATLAS and ALICE experiments, as well as compute-intensive bioinformatics applications (PALEOMIX) running on supercomputers. We present the topology and architecture of the designed system, report performance and statistics for different access patterns, and show how federated data storage can be used efficiently by physicists and biologists. We also describe how sharing data on a widely distributed storage system can lead to a new computing model and changes in computing style, for instance allowing a bioinformatics program running on a supercomputer to read and write data directly from the federated storage.

  16. Advanced Architectures for Astrophysical Supercomputing

    Science.gov (United States)

    Barsdell, B. R.; Barnes, D. G.; Fluke, C. J.

    2010-12-01

    Astronomers have come to rely on the increasing performance of computers to reduce, analyze, simulate and visualize their data. In this environment, faster computation can mean more science outcomes or the opening up of new parameter spaces for investigation. If we are to avoid major issues when implementing codes on advanced architectures, it is important that we have a solid understanding of our algorithms. A recent addition to the high-performance computing scene that highlights this point is the graphics processing unit (GPU). The hardware originally designed for speeding-up graphics rendering in video games is now achieving speed-ups of O(100×) in general-purpose computation - performance that cannot be ignored. We are using a generalized approach, based on the analysis of astronomy algorithms, to identify the optimal problem-types and techniques for taking advantage of both current GPU hardware and future developments in computing architectures.

  17. Supercomputer requirements for theoretical chemistry

    International Nuclear Information System (INIS)

    Walker, R.B.; Hay, P.J.; Galbraith, H.W.

    1980-01-01

    Many problems important to the theoretical chemist would, if implemented in their full complexity, strain the capabilities of today's most powerful computers. Several such problems are now being implemented on the CRAY-1 computer at Los Alamos. Examples of these problems are taken from the fields of molecular electronic structure calculations, quantum reactive scattering calculations, and quantum optics. 12 figures

  18. THE LOS ALAMOS NATIONAL LABORATORY ATMOSPHERIC TRANSPORT AND DIFFUSION MODELS

    Energy Technology Data Exchange (ETDEWEB)

    M. WILLIAMS [and others]

    1999-08-01

    The LANL atmospheric transport and diffusion models are composed of two state-of-the-art computer codes. The first is an atmospheric wind model called HOTMAC, Higher Order Turbulence Model for Atmospheric Circulations. HOTMAC generates wind and turbulence fields by solving a set of atmospheric dynamic equations. The second is an atmospheric diffusion model called RAPTAD, Random Particle Transport And Diffusion. RAPTAD uses the wind and turbulence output from HOTMAC to compute particle trajectories and concentration at any location downwind from a source. Both of these models, originally developed as research codes on supercomputers, have been modified to run on microcomputers. Because the capability of microcomputers is advancing so rapidly, the expectation is that they will eventually become as good as today's supercomputers. Now both models run on desktop or deskside computers, such as an IBM PC/AT with an Opus Pm 350-32 bit coprocessor board and a SUN workstation. The codes have also been modified so that high-level graphics (NCAR Graphics) of the output from both models are displayed on the desktop computer monitors and plotted on a laser printer. Two programs, HOTPLT and RAPLOT, produce wind vector plots of the output from HOTMAC and particle trajectory plots of the output from RAPTAD, respectively. A third program, CONPLT, provides concentration contour plots. Section II describes step-by-step operational procedures, specifically for a SUN-4 deskside computer, on how to run the main programs HOTMAC and RAPTAD and the graphics programs to display the results. Governing equations, boundary conditions and initial values of HOTMAC and RAPTAD are discussed in Section III. Finite-difference representations of the governing equations, numerical solution procedures, and a grid system are given in Section IV.

  19. Development of in-situ visualization tool for PIC simulation

    International Nuclear Information System (INIS)

    Ohno, Nobuaki; Ohtani, Hiroaki

    2014-01-01

    As supercomputer capability improves, the size of simulations and of their output data also becomes larger and larger. Visualization is usually carried out on a researcher's PC with interactive visualization software after the computer simulation has been performed. However, the data size is now becoming too large for this approach. A promising answer is in-situ visualization, in which the simulation code is coupled with the visualization code and visualization is performed alongside the simulation on the same supercomputer. We developed an in-situ visualization tool for particle-in-cell (PIC) simulation, provided as a Fortran module. We coupled it with a PIC simulation code, tested the coupled code on the Plasma Simulator supercomputer, and confirmed that it works. (author)

  20. Tracking of macroscopic particle motions generated by a turbulent wind via digital image analysis

    Science.gov (United States)

    Ciccone, A. D.; Kawall, J. G.; Keffer, J. F.

    A novel technique utilizing the basic principles of two-dimensional signal analysis and artificial intelligence/computer vision to reconstruct the Lagrangian particle trajectories from flow visualization images of macroparticle motions in a turbulent boundary layer is presented. Since, in most cases, the entire trajectory of a particle could not be viewed in one photographic frame (the particles were moving at a high velocity over a small field of view), a stochastic model was developed to complete the trajectories and obtain statistical data on particle velocities. The associated programs were implemented on a Cray supercomputer to optimize computational costs and time.

  1. Radio Astronomy at the Centre for High Performance Computing in South Africa

    Science.gov (United States)

    Catherine Cress; UWC Simulation Team

    2014-04-01

    I will present results on galaxy evolution and cosmology which we obtained using the supercomputing facilities at the CHPC. These include cosmological-scale N-body simulations modelling neutral hydrogen as well as the study of the clustering of radio galaxies to probe the relationship between dark and luminous matter in the universe. I will also discuss the various roles that the CHPC is playing in Astronomy in SA, including the provision of HPC for a variety of Astronomical applications, the provision of storage for radio data, our educational programs and our participation in planning for the SKA.

  2. Performance Assessment Institute-NV

    Energy Technology Data Exchange (ETDEWEB)

    Lombardo, Joesph [Univ. of Nevada, Las Vegas, NV (United States)

    2012-12-31

    The National Supercomputing Center for Energy and the Environment’s intention is to purchase a multi-purpose computer cluster in support of the Performance Assessment Institute (PA Institute). The PA Institute will serve as a research consortium located in Las Vegas, Nevada, with membership that includes national laboratories, universities, industry partners, and domestic and international governments. This center will provide a one-of-a-kind centralized facility for the accumulation of information for use by Institutions of Higher Learning, the U.S. Government, Regulatory Agencies and approved users. This initiative will enhance and extend High Performance Computing (HPC) resources in Nevada to support critical national and international needs in "scientific confirmation". The PA Institute will be promoted as the leading Modeling, Learning and Research Center worldwide. The program proposes to utilize the existing supercomputing capabilities and alliances of the University of Nevada Las Vegas as a base, and to extend these resources and capabilities through a collaborative relationship with its membership. The PA Institute will provide an academic setting for interactive sharing, learning, mentoring and monitoring of multi-disciplinary performance assessment and performance confirmation information. The role of the PA Institute is to facilitate research, knowledge increase, and knowledge sharing among users.

  3. Energy consumption optimization of the total-FETI solver by changing the CPU frequency

    Science.gov (United States)

    Horak, David; Riha, Lubomir; Sojka, Radim; Kruzik, Jakub; Beseda, Martin; Cermak, Martin; Schuchart, Joseph

    2017-07-01

    The energy consumption of supercomputers is one of the critical problems for the upcoming Exascale supercomputing era. Awareness of power and energy consumption is required on both the software and hardware sides. This paper deals with the energy consumption evaluation of Finite Element Tearing and Interconnect (FETI) based solvers of linear systems, an established method for solving real-world engineering problems. We have evaluated the effect of the CPU frequency on the energy consumption of the FETI solver using a linear elasticity 3D cube synthetic benchmark. For this problem, we have evaluated the effect of frequency tuning on the energy consumption of the essential processing kernels of the FETI method. The paper provides results for two types of frequency tuning: (1) static tuning and (2) dynamic tuning. For static tuning experiments, the frequency is set before execution and kept constant during the runtime. For dynamic tuning, the frequency is changed during the program execution to adapt the system to the actual needs of the application. The paper shows that static tuning brings up to 12% energy savings when compared to the default CPU setting (the highest clock rate). Dynamic tuning improves this further by up to 3%.
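
    A minimal sketch of static frequency tuning on Linux through the cpufreq sysfs interface, under the assumptions that such files are exposed by the kernel driver, that the process has permission to write them, and that energy is measured separately (e.g. from RAPL counters, not shown here). Production HPC systems typically apply such settings through the resource manager or a dedicated tuning runtime rather than raw sysfs.

```python
# Cap the CPU frequency before running a compute kernel, then time the kernel.
from pathlib import Path
import time

CPUFREQ = Path("/sys/devices/system/cpu/cpu0/cpufreq")   # CPU 0 only, for illustration

def set_max_khz(khz: int) -> None:
    # Requires root and a cpufreq-capable kernel; an assumption of this sketch.
    (CPUFREQ / "scaling_max_freq").write_text(str(khz))

def run_kernel() -> float:
    # Stand-in for a FETI processing kernel (e.g. a sparse factorization).
    s = 0.0
    for i in range(1, 2_000_000):
        s += 1.0 / i
    return s

for khz in (1_200_000, 2_400_000):            # hypothetical frequency levels in kHz
    set_max_khz(khz)
    t0 = time.perf_counter()
    run_kernel()
    print(f"{khz / 1e6:.1f} GHz cap: {time.perf_counter() - t0:.2f} s")
```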

  4. Computing element evolution towards Exascale and its impact on legacy simulation codes

    International Nuclear Information System (INIS)

    Colin de Verdiere, Guillaume J.L.

    2015-01-01

    In the light of the current race towards the Exascale, this article highlights the main features of the forthcoming computing elements that will be at the core of the next generations of supercomputers. The market analysis underlying this work shows that computers are facing a major evolution in terms of architecture. As a consequence, it is important to understand the impact of those evolutions on legacy codes and programming methods. The problems of dissipated power and memory access are discussed and lead to a vision of what an exascale system should be. To survive, programming languages have had to respond to the hardware evolutions, either by evolving or through the creation of new ones. From these elements, we elaborate on why vectorization, multithreading, data locality awareness and hybrid programming will be the key to reaching the exascale, implying that it is time to start rewriting codes. (orig.)
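
    A small, self-contained illustration (in Python/NumPy, purely for exposition) of why vectorization and contiguous, locality-friendly memory access are singled out above: the scalar loop and the vectorized expression compute the same result, but the latter maps onto streaming, SIMD-friendly operations over contiguous memory.

```python
# Scalar loop vs. vectorized expression over the same data.
import time
import numpy as np

n = 2_000_000
rng = np.random.default_rng(0)
a, b = rng.random(n), rng.random(n)

# Element-by-element loop (stands in for a legacy non-vectorized kernel).
t0 = time.perf_counter()
c_loop = np.empty(n)
for i in range(n):
    c_loop[i] = 2.0 * a[i] + b[i]
t_loop = time.perf_counter() - t0

# Vectorized form: one pass over contiguous memory.
t0 = time.perf_counter()
c_vec = 2.0 * a + b
t_vec = time.perf_counter() - t0

assert np.allclose(c_loop, c_vec)
print(f"loop: {t_loop:.2f} s, vectorized: {t_vec:.4f} s")
```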

  5. Automation of Data Traffic Control on DSM Architecture

    Science.gov (United States)

    Frumkin, Michael; Jin, Hao-Qiang; Yan, Jerry

    2001-01-01

    The design of distributed shared memory (DSM) computers liberates users from the duty of distributing data across processors and allows for the incremental development of parallel programs using, for example, OpenMP or Java threads. DSM architecture greatly simplifies the development of parallel programs that perform well on a few processors. However, achieving good program scalability on DSM computers requires that the user understand the data flow in the application and use various techniques to avoid data traffic congestion. In this paper we discuss a number of such techniques, including data blocking, data placement, data transposition and page size control, and evaluate their efficiency on the NAS (NASA Advanced Supercomputing) Parallel Benchmarks. We also present a tool which automates the detection of constructs causing data congestion in Fortran array-oriented codes and advises the user on code transformations for improving data traffic in the application.
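
    A minimal sketch of the data-blocking technique mentioned above, using a blocked transpose as the example: working on cache-sized tiles keeps memory references local instead of striding across the whole array, which is the kind of transformation that reduces data traffic on DSM (and ccNUMA) machines. The block size is an assumption to be tuned per platform.

```python
# Blocked (tiled) transpose: process the array one cache-friendly tile at a time.
import numpy as np

def blocked_transpose(a, block=64):
    n, m = a.shape
    out = np.empty((m, n), dtype=a.dtype)
    for i in range(0, n, block):
        for j in range(0, m, block):
            tile = a[i:i + block, j:j + block]
            out[j:j + block, i:i + block] = tile.T    # work on one tile at a time
    return out

a = np.arange(1024 * 1024, dtype=np.float64).reshape(1024, 1024)
assert np.array_equal(blocked_transpose(a), a.T)
```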

  6. Computing element evolution towards Exascale and its impact on legacy simulation codes

    Science.gov (United States)

    Colin de Verdière, Guillaume J. L.

    2015-12-01

    In the light of the current race towards the Exascale, this article highlights the main features of the forthcoming computing elements that will be at the core of the next generations of supercomputers. The market analysis underlying this work shows that computers are facing a major evolution in terms of architecture. As a consequence, it is important to understand the impact of those evolutions on legacy codes and programming methods. The problems of dissipated power and memory access are discussed and lead to a vision of what an exascale system should be. To survive, programming languages have had to respond to the hardware evolutions, either by evolving or through the creation of new ones. From these elements, we elaborate on why vectorization, multithreading, data locality awareness and hybrid programming will be the key to reaching the exascale, implying that it is time to start rewriting codes.

  7. Aviation Research and the Internet

    Science.gov (United States)

    Scott, Antoinette M.

    1995-01-01

    The Internet is a network of networks. It was originally funded by the Defense Advanced Research Projects Agency or DOD/DARPA and evolved in part from the connection of supercomputer sites across the United States. The National Science Foundation (NSF) made the most of their supercomputers by connecting the sites to each other. This made the supercomputers more efficient and now allows scientists, engineers and researchers to access the supercomputers from their own labs and offices. The high speed networks that connect the NSF supercomputers form the backbone of the Internet. The World Wide Web (WWW) is a menu system. It gathers Internet resources from all over the world into a series of screens that appear on your computer. The WWW is also a distributed system: it stores data and information on many computers (servers). These servers can go out and get data when you ask for it. Hypermedia is the base of the WWW. One can 'click' on a section and visit other hypermedia (pages). Our approach to demonstrating the importance of aviation research through the Internet began with learning how to put pages on the Internet (on-line) ourselves. We were assigned two aviation companies: Vision Micro Systems Inc. and Innovative Aerodynamic Technologies (IAT). We developed home pages for these SBIR companies. The equipment used to create the pages was UNIX and Macintosh machines. HTML Supertext software was used to write the pages and the Sharp JX600S scanner to scan the images. As a result, with the use of the UNIX, Macintosh, Sun, PC, and AXIL machines, we were able to present our home pages to over 800,000 visitors.

  8. PGHPF – An Optimizing High Performance Fortran Compiler for Distributed Memory Machines

    Directory of Open Access Journals (Sweden)

    Zeki Bozkus

    1997-01-01

    Full Text Available High Performance Fortran (HPF) is the first widely supported, efficient, and portable parallel programming language for shared and distributed memory systems. HPF is realized through a set of directive-based extensions to Fortran 90. It enables application developers and Fortran end-users to write compact, portable, and efficient software that will compile and execute on workstations, shared memory servers, clusters, traditional supercomputers, or massively parallel processors. This article describes a production-quality HPF compiler for a set of parallel machines. Compilation techniques such as data and computation distribution, communication generation, run-time support, and optimization issues are elaborated as the basis for an HPF compiler implementation on distributed memory machines. The performance of this compiler on benchmark programs demonstrates that high efficiency can be achieved executing HPF code on parallel architectures.

  9. Fast methods for long-range interactions in complex systems. Lecture notes

    Energy Technology Data Exchange (ETDEWEB)

    Sutmann, Godehard; Gibbon, Paul; Lippert, Thomas (eds.)

    2011-10-13

    Parallel computing and computer simulations of complex particle systems including charges have an ever increasing impact in a broad range of fields in the physical sciences, e.g. in astrophysics, statistical physics, plasma physics, material sciences, physical chemistry, and biophysics. The present summer school, funded by the German Heraeus-Foundation, took place at the Juelich Supercomputing Centre from 6 - 10 September 2010. The focus was on providing an introduction to and overview of different methods, algorithms and new trends for the computational treatment of long-range interactions in particle systems. The Lecture Notes contain an introduction to particle simulation, as well as five different fast methods, i.e. the Fast Multipole Method, Barnes-Hut Tree Method, Multigrid, FFT based methods, and Fast Summation using the non-equidistant FFT. In addition to introducing the methods, efficient parallelization of the methods is presented in detail. This publication was edited at the Juelich Supercomputing Centre (JSC) which is an integral part of the Institute for Advanced Simulation (IAS). The IAS combines the Juelich simulation sciences and the supercomputer facility in one organizational unit. It includes those parts of the scientific institutes at Forschungszentrum Juelich which use simulation on supercomputers as their main research methodology. (orig.)

  10. Fast methods for long-range interactions in complex systems. Lecture notes

    International Nuclear Information System (INIS)

    Sutmann, Godehard; Gibbon, Paul; Lippert, Thomas

    2011-01-01

    Parallel computing and computer simulations of complex particle systems including charges have an ever increasing impact in a broad range of fields in the physical sciences, e.g. in astrophysics, statistical physics, plasma physics, material sciences, physical chemistry, and biophysics. The present summer school, funded by the German Heraeus-Foundation, took place at the Juelich Supercomputing Centre from 6 - 10 September 2010. The focus was on providing an introduction to and overview of different methods, algorithms and new trends for the computational treatment of long-range interactions in particle systems. The Lecture Notes contain an introduction to particle simulation, as well as five different fast methods, i.e. the Fast Multipole Method, Barnes-Hut Tree Method, Multigrid, FFT based methods, and Fast Summation using the non-equidistant FFT. In addition to introducing the methods, efficient parallelization of the methods is presented in detail. This publication was edited at the Juelich Supercomputing Centre (JSC) which is an integral part of the Institute for Advanced Simulation (IAS). The IAS combines the Juelich simulation sciences and the supercomputer facility in one organizational unit. It includes those parts of the scientific institutes at Forschungszentrum Juelich which use simulation on supercomputers as their main research methodology. (orig.)

  11. The Fermilab Advanced Computer Program multi-array processor system (ACPMAPS): A site oriented supercomputer for theoretical physics

    International Nuclear Information System (INIS)

    Nash, T.; Areti, H.; Atac, R.

    1988-08-01

    The ACP Multi-Array Processor System (ACPMAPS) is a highly cost effective, local memory parallel computer designed for floating point intensive grid based problems. The processing nodes of the system are single board array processors based on the FORTRAN and C programmable Weitek XL chip set. The nodes are connected by a network of very high bandwidth 16 port crossbar switches. The architecture is designed to achieve the highest possible cost effectiveness while maintaining a high level of programmability. The primary application of the machine at Fermilab will be lattice gauge theory. The hardware is supported by a transparent site oriented software system called CANOPY which shields theorist users from the underlying node structure. 4 refs., 2 figs

  12. High performance computing, supercomputing, náročné počítání

    Czech Academy of Sciences Publication Activity Database

    Okrouhlík, Miloslav

    2003-01-01

    Vol. 10, No. 5 (2003), pp. 429-438, ISSN 1210-2717 R&D Projects: GA ČR GA101/02/0072 Institutional research plan: CEZ:AV0Z2076919 Keywords: high performance computing * vector and parallel computers * programming tools for parallelization Subject RIV: BI - Acoustics

  13. Algorithm comparison and benchmarking using a parallel spectral transform shallow water model

    Energy Technology Data Exchange (ETDEWEB)

    Worley, P.H. [Oak Ridge National Lab., TN (United States); Foster, I.T.; Toonen, B. [Argonne National Lab., IL (United States)

    1995-04-01

    In recent years, a number of computer vendors have produced supercomputers based on a massively parallel processing (MPP) architecture. These computers have been shown to be competitive in performance with conventional vector supercomputers for some applications. As spectral weather and climate models are heavy users of vector supercomputers, it is interesting to determine how these models perform on MPPs, and which MPPs are best suited to the execution of spectral models. The benchmarking of MPPs is complicated by the fact that different algorithms may be more efficient on different architectures. Hence, a comprehensive benchmarking effort must answer two related questions: which algorithm is most efficient on each computer and how do the most efficient algorithms compare on different computers. In general, these are difficult questions to answer because of the high cost associated with implementing and evaluating a range of different parallel algorithms on each MPP platform.

  14. Parallelism in computations in quantum and statistical mechanics

    International Nuclear Information System (INIS)

    Clementi, E.; Corongiu, G.; Detrich, J.H.

    1985-01-01

    Often very fundamental biochemical and biophysical problems defy simulation because of limitations in today's computers. We present and discuss a distributed system composed of two IBM 4341s and/or an IBM 4381 as front-end processors and ten FPS-164 attached array processors. This parallel system - called LCAP - currently has a peak performance of about 110 Mflops; extensions to higher performance are discussed. Presently, the system applications use a modified version of VM/SP as the operating system; a description of the modifications is given. Three application programs have been migrated from sequential to parallel: a molecular quantum mechanics program, a Metropolis Monte Carlo program and a molecular dynamics program. The parallel codes are briefly described. Use of these parallel codes has already opened up new capabilities for our research. The very positive performance comparisons with today's supercomputers allow us to conclude that parallel computers and programming, of the type we have considered, represent a pragmatic answer to many computationally intensive problems. (orig.)

  15. Studying Turbulence Using Numerical Simulation Databases. No. 7; Proceedings of the Summer Program

    Science.gov (United States)

    1998-01-01

    The Seventh Summer Program of the Center for Turbulence Research took place in the four-week period, July 5 to July 31, 1998. This was the largest CTR Summer Program to date, involving thirty-six participants from the U. S. and nine other countries. Thirty-one Stanford and NASA-Ames staff members facilitated and contributed to most of the Summer projects. A new feature, and perhaps a preview of future programs, was that many of the projects were executed on non-NASA computers. These included supercomputers located in Europe as well as those operated by the Departments of Defense and Energy in the United States. In addition, several simulation programs developed by the visiting participants at their home institutions were used. Another new feature was the prevalence of lap-top personal computers, which were used by several participants to carry out some of the work that in the past was performed on desk-top workstations. We expect these trends to continue as computing power is enhanced and as more researchers (many of whom are CTR alumni) use numerical simulations to study turbulent flows. CTR's main role continues to be in providing a forum for the study of turbulence for engineering analysis and in facilitating intellectual exchange among the leading researchers in the field. Once again the combustion group was the largest. Turbulent combustion has enjoyed remarkable progress in using simulations to address increasingly complex and practically more relevant questions. The combustion group's studies included such challenging topics as fuel evaporation, soot chemistry, and thermonuclear reactions. The latter study was one of three projects related to the Department of Energy's ASCI Program (www.llnl.gov/asci); the other two (rocket propulsion and fire safety) were carried out in the turbulence modeling group. The flow control and acoustics group demonstrated a successful application of the so-called evolution algorithms which actually led to a previously unknown ...

  16. Parallel-Vector Algorithm For Rapid Structural Analysis

    Science.gov (United States)

    Agarwal, Tarun R.; Nguyen, Duc T.; Storaasli, Olaf O.

    1993-01-01

    New algorithm developed to overcome deficiency of skyline storage scheme by use of variable-band storage scheme. Exploits both parallel and vector capabilities of modern high-performance computers. Gives engineers and designers opportunity to include more design variables and constraints during optimization of structures. Enables use of more refined finite-element meshes to obtain improved understanding of complex behaviors of aerospace structures leading to better, safer designs. Not only attractive for current supercomputers but also for next generation of shared-memory supercomputers.

  17. Scalability of DL_POLY on High Performance Computing Platform

    CSIR Research Space (South Africa)

    Mabakane, Mabule S

    2017-12-01

    Full Text Available SACJ 29(3) December... when using many processors within the compute nodes of the supercomputer. The type of processors in the compute nodes and their memory also play an important role in the overall performance of a parallel application running on a supercomputer. DL...

  18. The new landscape of parallel computer architecture

    International Nuclear Information System (INIS)

    Shalf, John

    2007-01-01

    The past few years have seen a sea change in computer architecture that will impact every facet of our society as every electronic device from cell phone to supercomputer will need to confront parallelism of unprecedented scale. Whereas the conventional multicore approach (2, 4, and even 8 cores) adopted by the computing industry will eventually hit a performance plateau, the highest performance per watt and per chip area is achieved using manycore technology (hundreds or even thousands of cores). However, fully unleashing the potential of the manycore approach to ensure future advances in sustained computational performance will require fundamental advances in computer architecture and programming models that are nothing short of reinventing computing. In this paper we examine the reasons behind the movement to exponentially increasing parallelism, and its ramifications for system design, applications and programming models

  19. The new landscape of parallel computer architecture

    Energy Technology Data Exchange (ETDEWEB)

    Shalf, John [NERSC Division, Lawrence Berkeley National Laboratory 1 Cyclotron Road, Berkeley California, 94720 (United States)

    2007-07-15

    The past few years have seen a sea change in computer architecture that will impact every facet of our society as every electronic device from cell phone to supercomputer will need to confront parallelism of unprecedented scale. Whereas the conventional multicore approach (2, 4, and even 8 cores) adopted by the computing industry will eventually hit a performance plateau, the highest performance per watt and per chip area is achieved using manycore technology (hundreds or even thousands of cores). However, fully unleashing the potential of the manycore approach to ensure future advances in sustained computational performance will require fundamental advances in computer architecture and programming models that are nothing short of reinventing computing. In this paper we examine the reasons behind the movement to exponentially increasing parallelism, and its ramifications for system design, applications and programming models.

  20. Heuristic simulation of nuclear systems on a supercomputer using the HAL-1987 general-purpose production-rule analysis system

    International Nuclear Information System (INIS)

    Ragheb, M.; Gvillo, D.; Makowitz, H.

    1987-01-01

    HAL-1987 is a general-purpose tool for the construction of production-rule analysis systems. It uses the rule-based paradigm from the part of artificial intelligence concerned with knowledge engineering. It uses backward-chaining and forward-chaining in an antecedent-consequent logic, and is programmed in Portable Standard Lisp (PSL). The inference engine is flexible and accommodates general additions and modifications to the knowledge base. The system is used in coupled symbolic-procedural programming adaptive methodologies for stochastic simulations. In Monte Carlo simulations of particle transport, the system considers the pre-processing of the input data to the simulation and adaptively controls the variance reduction process as the simulation progresses. This is accomplished through the use of a knowledge base of rules which encompass the user's expertise in the variance reduction process. It is also applied to the construction of model-based systems for monitoring, fault-diagnosis and crisis-alert in engineering devices, particularly in the field of nuclear reactor safety analysis
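
    A minimal forward-chaining sketch of the production-rule paradigm described above; the rule and fact names are hypothetical, and the real HAL-1987 system (written in Portable Standard Lisp) also supports backward chaining and a far richer knowledge base and inference engine.

```python
# Forward chaining: a rule fires when all of its antecedents are present in
# working memory, adding its consequent as a new fact, until nothing changes.
rules = [
    ({"deep_penetration_needed", "thick_shield"}, "increase_splitting"),
    ({"increase_splitting", "low_weight_particles"}, "tighten_weight_windows"),
]

def forward_chain(facts, rules):
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedents, consequent in rules:
            if antecedents <= facts and consequent not in facts:
                facts.add(consequent)          # fire the rule
                changed = True
    return facts

print(forward_chain({"deep_penetration_needed", "thick_shield", "low_weight_particles"}, rules))
```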

  1. Strategic research field no.4, industrial innovations

    International Nuclear Information System (INIS)

    Kato, Chisachi

    2011-01-01

    The 'Kei' supercomputer is planned to start full-scale operation in about a year and a half. With this, High Performance Computing (HPC) is most likely to contribute not only to further progress in basic and applied sciences, but also to bringing about innovations in various fields of industry. It is expected to substantially shorten design time, drastically improve the performance and/or reliability of various industrial products, and greatly enhance the safety of large-scale power plants. In this article, six research themes currently being prepared in this strategic research field, 'industrial innovations', so that the 'Kei' supercomputer can be used as soon as it starts operation, are briefly described with regard to their specific goals and the breakthroughs they are expected to bring about in industry. How these themes were determined is also explained. We are also planning several measures to promote the widespread use of HPC, including the 'Kei' supercomputer, in industry; these are also elaborated in this article. (author)

  2. An evaluation of current high-performance networks

    Energy Technology Data Exchange (ETDEWEB)

    Bell, Christian; Bonachea, Dan; Cote, Yannick; Duell, Jason; Hargrove, Paul; Husbands, Parry; Iancu, Costin; Welcome, Michael; Yelick, Katherine

    2003-01-25

    High-end supercomputers are increasingly built out of commodity components, and lack tight integration between the processor and network. This often results in inefficiencies in the communication subsystem, such as high software overheads and/or message latencies. In this paper we use a set of microbenchmarks to quantify the cost of this commoditization, measuring software overhead, latency, and bandwidth on five contemporary supercomputing networks. We compare the performance of the ubiquitous MPI layer to that of lower-level communication layers, and quantify the advantages of the latter for small message performance. We also provide data on the potential for various communication-related optimizations, such as overlapping communication with computation or other communication. Finally, we determine the minimum size needed for a message to be considered 'large' (i.e., bandwidth-bound) on these platforms, and provide historical data on the software overheads of a number of supercomputers over the past decade.
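
    A minimal sketch of the ping-pong style microbenchmark used to measure latency and bandwidth, written here with mpi4py (an assumption; the study also measured lower-level communication layers). Run with two MPI ranks, e.g. `mpirun -np 2 python pingpong.py`, where the file name is illustrative.

```python
# Ping-pong between ranks 0 and 1: half the round-trip time gives the one-way
# latency for small messages and the effective bandwidth for large ones.
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
reps = 1000

for nbytes in (8, 1 << 20):                      # a small and a large message
    buf = np.zeros(nbytes, dtype=np.uint8)
    comm.Barrier()
    t0 = MPI.Wtime()
    for _ in range(reps):
        if rank == 0:
            comm.Send(buf, dest=1, tag=0)
            comm.Recv(buf, source=1, tag=0)
        elif rank == 1:
            comm.Recv(buf, source=0, tag=0)
            comm.Send(buf, dest=0, tag=0)
    dt = (MPI.Wtime() - t0) / reps / 2           # one-way time per message
    if rank == 0:
        print(f"{nbytes:>8} B: {dt * 1e6:8.2f} us, {nbytes / dt / 1e6:10.1f} MB/s")
```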

  3. Real science at the petascale.

    Science.gov (United States)

    Saksena, Radhika S; Boghosian, Bruce; Fazendeiro, Luis; Kenway, Owain A; Manos, Steven; Mazzeo, Marco D; Sadiq, S Kashif; Suter, James L; Wright, David; Coveney, Peter V

    2009-06-28

    We describe computational science research that uses petascale resources to achieve scientific results at unprecedented scales and resolution. The applications span a wide range of domains, from investigation of fundamental problems in turbulence through computational materials science research to biomedical applications at the forefront of HIV/AIDS research and cerebrovascular haemodynamics. This work was mainly performed on the US TeraGrid 'petascale' resource, Ranger, at Texas Advanced Computing Center, in the first half of 2008 when it was the largest computing system in the world available for open scientific research. We have sought to use this petascale supercomputer optimally across application domains and scales, exploiting the excellent parallel scaling performance found on up to at least 32 768 cores for certain of our codes in the so-called 'capability computing' category as well as high-throughput intermediate-scale jobs for ensemble simulations in the 32-512 core range. Furthermore, this activity provides evidence that conventional parallel programming with MPI should be successful at the petascale in the short to medium term. We also report on the parallel performance of some of our codes on up to 65 536 cores on the IBM Blue Gene/P system at the Argonne Leadership Computing Facility, which has recently been named the fastest supercomputer in the world for open science.

  4. Supercomputer modeling of volcanic eruption dynamics

    Energy Technology Data Exchange (ETDEWEB)

    Kieffer, S.W. [Arizona State Univ., Tempe, AZ (United States); Valentine, G.A. [Los Alamos National Lab., NM (United States); Woo, Mahn-Ling [Arizona State Univ., Tempe, AZ (United States)

    1995-06-01

    Our specific goals are to: (1) provide a set of models based on well-defined assumptions about initial and boundary conditions to constrain interpretations of observations of active volcanic eruptions--including movies of flow front velocities, satellite observations of temperature in plumes vs. time, and still photographs of the dimensions of erupting plumes and flows on Earth and other planets; (2) to examine the influence of subsurface conditions on exit plane conditions and plume characteristics, and to compare the models of subsurface fluid flow with seismic constraints where possible; (3) to relate equations-of-state for magma-gas mixtures to flow dynamics; (4) to examine, in some detail, the interaction of the flowing fluid with the conduit walls and ground topography through boundary layer theory so that field observations of erosion and deposition can be related to fluid processes; and (5) to test the applicability of existing two-phase flow codes for problems related to the generation of volcanic long-period seismic signals; (6) to extend our understanding and simulation capability to problems associated with emplacement of fragmental ejecta from large meteorite impacts.

  5. Trends in supercomputers and computational physics

    International Nuclear Information System (INIS)

    Bloch, T.

    1985-01-01

    Today, scientists using numerical models explore the basic mechanisms of semiconductors, apply global circulation models to climatic and oceanographic problems, probe into the behaviour of galaxies and try to verify basic theories of matter, such as Quantum Chromodynamics, by simulating the constitution of elementary particles. Chemists, crystallographers and molecular dynamics researchers develop models for chemical reactions and the formation of crystals, and try to deduce the chemical properties of molecules as a function of the shapes of their states. Chaotic systems are studied extensively in turbulence (combustion included) and the design of the next generation of controlled fusion devices relies heavily on computational physics. (orig./HSI)

  6. A supercomputer for parallel data analysis

    International Nuclear Information System (INIS)

    Kolpakov, I.F.; Senner, A.E.; Smirnov, V.A.

    1987-01-01

    The project of a powerful multiprocessor system is proposed. The main purpose of the project is to develop a low-cost computer system with a processing rate of a few tens of millions of operations per second. The system solves many problems of data analysis from high-energy physics spectrometers. It includes about 70 powerful slave microprocessor boards, based on the MOTOROLA 68020, linked through VME crates to a host VAX microcomputer. Each microprocessor board performs the same algorithm requiring large computing time. The host computer distributes data over the microprocessor boards, then collects and combines the obtained results. The architecture of the system easily allows it to be used in real-time mode.

  7. Towards Automatic Learning of Heuristics for Mechanical Transformations of Procedural Code

    Directory of Open Access Journals (Sweden)

    Guillermo Vigueras

    2017-01-01

    Full Text Available The current trends in next-generation exascale systems go towards integrating a wide range of specialized (co-)processors into traditional supercomputers. Due to the efficiency of heterogeneous systems in terms of watts and FLOPS per unit area, opening access to heterogeneous platforms to a wider range of users is an important problem to be tackled. However, heterogeneous platforms limit the portability of applications and increase development complexity due to the programming skills required. Program transformation can help make programming heterogeneous systems easier by defining a step-wise transformation process that translates a given initial code into a semantically equivalent final code adapted to a specific platform. Program transformation systems require the definition of efficient transformation strategies to tackle the combinatorial problem that emerges due to the large set of transformations applicable at each step of the process. In this paper we propose a machine learning-based approach to learning heuristics that define program transformation strategies. Our approach proposes a novel combination of reinforcement learning and classification methods to efficiently tackle the problems inherent to this type of system. Preliminary results demonstrate the suitability of this approach.
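
    A minimal sketch of one way a transformation-selection heuristic can be learned with a simple reinforcement-learning rule (epsilon-greedy value estimation); the transformation names and the reward function are purely illustrative, and the paper's actual approach additionally combines this with classification over program features.

```python
# Epsilon-greedy learning of which code transformation tends to pay off:
# explore occasionally, otherwise pick the transformation with the best
# running estimate of its reward (here, a simulated measured speedup).
import random

transformations = ["loop_fusion", "loop_tiling", "offload_to_accelerator", "inline"]
q = {t: 0.0 for t in transformations}       # estimated value of each action
counts = {t: 0 for t in transformations}

def measured_speedup(t):
    # Stand-in for compiling and benchmarking the transformed program.
    base = {"loop_fusion": 1.1, "loop_tiling": 1.4, "offload_to_accelerator": 2.0, "inline": 1.05}
    return random.gauss(base[t], 0.1)

epsilon = 0.2
for step in range(200):
    if random.random() < epsilon:
        t = random.choice(transformations)   # explore
    else:
        t = max(q, key=q.get)                # exploit the current estimate
    r = measured_speedup(t)
    counts[t] += 1
    q[t] += (r - q[t]) / counts[t]           # incremental mean update

print(max(q, key=q.get), q)
```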

  8. Behavioral program synthesis with genetic programming

    CERN Document Server

    Krawiec, Krzysztof

    2016-01-01

    Genetic programming (GP) is a popular heuristic methodology of program synthesis with origins in evolutionary computation. In this generate-and-test approach, candidate programs are iteratively produced and evaluated. The latter involves running programs on tests, where they exhibit complex behaviors reflected in changes of variables, registers, or memory. That behavior not only ultimately determines program output, but may also reveal its `hidden qualities' and important characteristics of the considered synthesis problem. However, conventional GP is oblivious to most of that information and usually cares only about the number of tests passed by a program. This `evaluation bottleneck' leaves the search algorithm underinformed about the actual and potential qualities of candidate programs. This book proposes behavioral program synthesis, a conceptual framework that opens GP to detailed information on program behavior in order to make program synthesis more efficient. Several existing and novel mechanisms subs...

  9. Large scale visualization on the Cray XT3 using ParaView.

    Energy Technology Data Exchange (ETDEWEB)

    Rogers, David; Geveci, Berk (Kitware, Inc.); Eschenbert, Kent (Pittsburgh Supercomputing Center); Neundorf, Alexander (Technical University of Kaiserslautern); Marion, Patrick (Kitware, Inc.); Moreland, Kenneth D.; Greenfield, John

    2008-05-01

    Post-processing and visualization are key components to understanding any simulation. Porting ParaView, a scalable visualization tool, to the Cray XT3 allows our analysts to leverage the same supercomputer they use for simulation to perform post-processing. Visualization tools traditionally rely on a variety of rendering, scripting, and networking resources; the challenge of running ParaView on the Lightweight Kernel is to provide and use the visualization and post-processing features in the absence of many OS resources. We have successfully accomplished this at Sandia National Laboratories and the Pittsburgh Supercomputing Center.

  10. Magnetic fusion energy and computers. The role of computing in magnetic fusion energy research and development (second edition)

    International Nuclear Information System (INIS)

    1983-01-01

    This report documents the structure and uses of the MFE Network and presents a compilation of future computing requirements. Its primary emphasis is on the role of supercomputers in fusion research. One of its key findings is that with the introduction of each successive class of supercomputer, qualitatively improved understanding of fusion processes has been gained. At the same time, even the current Class VI machines severely limit the attainable realism of computer models. Many important problems will require the introduction of Class VII or even larger machines before they can be successfully attacked

  11. A numerical method for the solution of three-dimensional incompressible viscous flow using the boundary-fitted curvilinear coordinate transformation and domain decomposition technique

    International Nuclear Information System (INIS)

    Umegaki, Kikuo; Miki, Kazuyoshi

    1990-01-01

    A numerical method is developed to solve three-dimensional incompressible viscous flow in complicated geometry using curvilinear coordinate transformation and a domain decomposition technique. In this approach, a complicated flow domain is decomposed into several subdomains, each of which has an overlapping region with neighboring subdomains. Curvilinear coordinates are numerically generated in each subdomain using the boundary-fitted coordinate transformation technique. The modified SMAC scheme is developed to solve the Navier-Stokes equations, in which the convective terms are discretized by the QUICK method. A fully vectorized computer program is developed on the basis of the proposed method. The program is applied to flow analysis in semicircular curved, 90° elbow and T-shaped branched pipes. Computational time with the vector processor of the HITAC S-810/20 supercomputer system is reduced to 1/10∼1/20 of that with a scalar processor. (author)
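
    Since the abstract names the QUICK scheme for the convective terms, a minimal one-dimensional sketch of that interpolation (uniform grid, positive velocity) is given below. It only shows the spatial operator and is an illustration of the scheme, not the vectorized production code described above.

```python
# Hedged 1-D sketch of the QUICK interpolation used for convective terms,
# assuming a uniform grid and a positive advection velocity.
import numpy as np

def quick_face_values(phi):
    """Face value at i+1/2: 6/8*phi[i] + 3/8*phi[i+1] - 1/8*phi[i-1] (u > 0)."""
    phi_f = np.empty(len(phi) - 1)
    phi_f[0] = 0.5 * (phi[0] + phi[1])      # central fallback at the first face
    for i in range(1, len(phi) - 1):
        phi_f[i] = 0.75 * phi[i] + 0.375 * phi[i + 1] - 0.125 * phi[i - 1]
    return phi_f

x = np.linspace(0.0, 1.0, 101)
phi = np.sin(2.0 * np.pi * x)
phi_f = quick_face_values(phi)
# convective tendency -d(u*phi)/dx on interior cells, for u = 1
dphi_dt = -(phi_f[1:] - phi_f[:-1]) / (x[1] - x[0])
print(dphi_dt[:3])
```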

  12. IBM PC enhances the world's future

    Science.gov (United States)

    Cox, Jozelle

    1988-01-01

    Although the purpose of this research is to illustrate the importance of computers to the public, particularly the IBM PC, the present examination also covers computers developed before the IBM PC was brought into use. IBM, as well as other computing facilities, began serving the public years ago and is continuing to find ways to enhance human existence. With new developments in supercomputers like the Cray-2, and the recent advances in artificial intelligence programming, the human race is gaining knowledge at a rapid pace. All have benefited from the development of computers; not only have they brought new assets to life, but they have also made life more and more of a challenge every day.

  13. Development of an advanced fluid-dynamic analysis code: α-flow

    International Nuclear Information System (INIS)

    Akiyama, Mamoru

    1990-01-01

    A project for the development of a large-scale three-dimensional fluid-dynamic analysis code, α-FLOW, keeping pace with the recent advancement of supercomputers and workstations, has been in progress. This project is called the α-Project, and it has been promoted by the Association for Large Scale Fluid Dynamics Analysis Code, comprising private companies and research institutions such as universities. The development period for α-FLOW is four years, March 1989 to March 1992. To date, the major portions of basic design and program preparation have been completed and the project is in the stage of testing each module. In this paper, the present status of the α-Project, its design policy and an outline of α-FLOW are described. (author)

  14. Importance of databases of nucleic acids for bioinformatic analysis focused to genomics

    Science.gov (United States)

    Jimenez-Gutierrez, L. R.; Barrios-Hernández, C. J.; Pedraza-Ferreira, G. R.; Vera-Cala, L.; Martinez-Perez, F.

    2016-08-01

    Recently, bioinformatics has become a new field of science, indispensable in the analysis of the millions of nucleic acid sequences currently deposited in international databases (public or private); these databases contain information on genes, RNA, ORFs, proteins and intergenic regions, including entire genomes of some species. The analysis of this information requires computer programs, which have been renewed through the use of new mathematical methods and the introduction of artificial intelligence, in addition to the constant creation of supercomputing units able to withstand the heavy workload of sequence analysis. However, innovation is still needed in platforms that allow genomic analyses to be performed faster and more effectively, with a technological understanding of all the biological processes involved.

  15. ANSTO. Annual report 1991-1992

    International Nuclear Information System (INIS)

    1992-01-01

    Development and investment at the Australian Nuclear Science and Technology Organization (ANSTO) have continued to be focussed on providing facilities of world standard. The opening of the tandem accelerator and the supercomputing facility are examples of this commitment. The opening of the National Medical Cyclotron (NMC) in March 1992 provides Australia with new capabilities in nuclear medicine, since it produces radiopharmaceuticals which complement those produced by the HIFAR research reactor. The objectives and research projects of the Advanced Materials, Applications of Nuclear Physics, Biomedicine and Health, Environmental Science, Industrial Technology, NMC, Nuclear Technology and ANSTO Engineering programs are presented. The financial statement for the year under review is also presented. tabs. ills

  16. Parallelization of a numerical simulation code for isotropic turbulence

    International Nuclear Information System (INIS)

    Sato, Shigeru; Yokokawa, Mitsuo; Watanabe, Tadashi; Kaburaki, Hideo.

    1996-03-01

    A parallel pseudospectral code which solves the three-dimensional Navier-Stokes equation by direct numerical simulation is developed, and its execution time, parallelization efficiency, load balance and scalability are evaluated. A vector parallel supercomputer, a Fujitsu VPP500 with up to 16 processors, is used for this calculation, for Fourier modes up to 256x256x256 on 16 processors. Good scalability with the number of processors is achieved when the number of Fourier modes is fixed. For small numbers of Fourier modes, the calculation time of the program is proportional to NlogN, which is the ideal complexity of a 3D-FFT on vector parallel processors. It is found that the calculation performance decreases as the number of Fourier modes increases. (author)
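
    As a toy illustration of the pseudospectral kernel mentioned above (the NlogN three-dimensional FFT), the snippet below differentiates a periodic field spectrally with NumPy; it is not the VPP500 production code and ignores the parallel decomposition entirely.

```python
# Hedged sketch: spectral x-derivative of a periodic field via a 3-D FFT.
import numpy as np

n = 32
x = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
X, Y, Z = np.meshgrid(x, x, x, indexing="ij")
u = np.sin(X) * np.cos(Y) * np.cos(Z)

k = np.fft.fftfreq(n, d=1.0 / n)            # integer wavenumbers on a 2*pi-periodic box
kx = k[:, None, None]

u_hat = np.fft.fftn(u)                      # the N log N step
dudx = np.real(np.fft.ifftn(1j * kx * u_hat))

print(np.allclose(dudx, np.cos(X) * np.cos(Y) * np.cos(Z)))   # True
```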

  17. Challenges in scaling NLO generators to leadership computers

    Science.gov (United States)

    Benjamin, D.; Childers, JT; Hoeche, S.; LeCompte, T.; Uram, T.

    2017-10-01

    Exascale computing resources are roughly a decade away and will be capable of 100 times more computing than current supercomputers. In the last year, Energy Frontier experiments crossed a milestone of 100 million core-hours used at the Argonne Leadership Computing Facility, Oak Ridge Leadership Computing Facility, and NERSC. The Fortran-based leading-order parton generator called Alpgen was successfully scaled to millions of threads to achieve this level of usage on Mira. Sherpa and MadGraph are next-to-leading order generators used heavily by LHC experiments for simulation. Integration times for high-multiplicity or rare processes can take a week or more on standard Grid machines, even when using all 16 cores. We will describe our ongoing work to scale the Sherpa generator to thousands of threads on leadership-class machines and reduce run-times to less than a day. This work allows the experiments to leverage large-scale parallel supercomputers for event generation today, freeing tens of millions of grid hours for other work, and paving the way for future applications (simulation, reconstruction) on these and future supercomputers.

  18. Object-Oriented Support for Adaptive Methods on Parallel Machines

    Directory of Open Access Journals (Sweden)

    Sandeep Bhatt

    1993-01-01

    Full Text Available This article reports on experiments from our ongoing project whose goal is to develop a C++ library which supports adaptive and irregular data structures on distributed memory supercomputers. We demonstrate the use of our abstractions in implementing "tree codes" for large-scale N-body simulations. These algorithms require dynamically evolving treelike data structures, as well as load-balancing, both of which are widely believed to make the application difficult and cumbersome to program for distributed-memory machines. The ease of writing the application code on top of our C++ library abstractions (which themselves are application independent) and the low overhead of the resulting C++ code (over hand-crafted C code) support our belief that object-oriented approaches are eminently suited to programming distributed-memory machines in a manner that (to the applications programmer) is architecture-independent. Our contribution in parallel programming methodology is to identify and encapsulate general classes of communication and load-balancing strategies useful across applications and MIMD architectures. This article reports experimental results from simulations of half a million particles using multiple methods.
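
    To make the "tree codes" idea concrete, here is a stripped-down, serial 2-D quadtree that accumulates cell masses and centres of mass, Barnes-Hut style. It is only a sketch of the data structure the abstract refers to; the distribution and load-balancing machinery of the C++ library is not represented, and coincident particles are not handled.

```python
# Hedged sketch of a Barnes-Hut-style quadtree node (serial, 2-D).
import random

class Node:
    """A square cell holding either one body or four child cells."""
    def __init__(self, cx, cy, half):
        self.cx, self.cy, self.half = cx, cy, half   # cell centre and half-width
        self.mass = 0.0
        self.com = (0.0, 0.0)                        # centre of mass of the cell
        self.body = None
        self.children = None

    def _quadrant(self, x, y):
        i = (1 if x >= self.cx else 0) + (2 if y >= self.cy else 0)
        return self.children[i]

    def insert(self, x, y, m):
        total = self.mass + m                        # update cell aggregates on the way down
        self.com = ((self.com[0] * self.mass + x * m) / total,
                    (self.com[1] * self.mass + y * m) / total)
        self.mass = total
        if self.children is None and self.body is None:
            self.body = (x, y, m)                    # empty leaf takes the body
            return
        if self.children is None:                    # occupied leaf: split into quadrants
            h = self.half / 2.0
            self.children = [Node(self.cx + sx * h, self.cy + sy * h, h)
                             for sy in (-1, 1) for sx in (-1, 1)]
            bx, by, bm = self.body
            self.body = None
            self._quadrant(bx, by).insert(bx, by, bm)
        self._quadrant(x, y).insert(x, y, m)

root = Node(0.5, 0.5, 0.5)
for _ in range(1000):
    root.insert(random.random(), random.random(), 1.0)
print(root.mass, root.com)
```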

  19. Application experiences with the Globus toolkit.

    Energy Technology Data Exchange (ETDEWEB)

    Brunett, S.

    1998-06-09

    The Globus grid toolkit is a collection of software components designed to support the development of applications for high-performance distributed computing environments, or "computational grids" [14]. The Globus toolkit is an implementation of a "bag of services" architecture, which provides application and tool developers not with a monolithic system but rather with a set of stand-alone services. Each Globus component provides a basic service, such as authentication, resource allocation, information, communication, fault detection, and remote data access. Different applications and tools can combine these services in different ways to construct "grid-enabled" systems. The Globus toolkit has been used to construct the Globus Ubiquitous Supercomputing Testbed, or GUSTO: a large-scale testbed spanning 20 sites and including over 4000 compute nodes for a total compute power of over 2 TFLOPS. Over the past six months, we and others have used this testbed to conduct a variety of application experiments, including multi-user collaborative environments (tele-immersion), computational steering, distributed supercomputing, and high throughput computing. The goal of this paper is to review what has been learned from these experiments regarding the effectiveness of the toolkit approach. To this end, we describe two of the application experiments in detail, noting what worked well and what worked less well. The two applications are a distributed supercomputing application, SF-Express, in which multiple supercomputers are harnessed to perform large distributed interactive simulations; and a tele-immersion application, CAVERNsoft, in which the focus is on connecting multiple people to a distributed simulated world.

  20. Functional Programming

    OpenAIRE

    Chitil, Olaf

    2009-01-01

    Functional programming is a programming paradigm like object-oriented programming and logic programming. Functional programming comprises both a specific programming style and a class of programming languages that encourage and support this programming style. Functional programming enables the programmer to describe an algorithm on a high-level, in terms of the problem domain, without having to deal with machine-related details. A program is constructed from functions that only map inputs to ...
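
    For readers unfamiliar with the style, a tiny Python fragment in a functional flavour (pure functions, no mutable state) is shown below; the abstract itself is language-agnostic, so this is merely illustrative.

```python
# Hedged illustration of a functional style: the computation is expressed as a
# composition of higher-order functions rather than an explicit loop with state.
from functools import reduce

def sum_of_squares_of_evens(numbers):
    evens = filter(lambda n: n % 2 == 0, numbers)
    squares = map(lambda n: n * n, evens)
    return reduce(lambda acc, n: acc + n, squares, 0)

print(sum_of_squares_of_evens(range(10)))   # 0 + 4 + 16 + 36 + 64 = 120
```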

  1. Programming language for computations in the Interkosmos program

    Science.gov (United States)

    Schmidt, K.

    1975-01-01

    The programming system for Intercosmos data processing, based on structured programming theory, which considers a program as an ordered set of standardized elementary parts from which the user programs are automatically generated, is described. The programs comprise several modules, which are briefly summarized. The general structure of the programming system is presented in a block diagram. A programming control language, developed to formulate the problem quickly and completely, is presented along with the basic symbols which are characteristic of the Intercosmos programming system.

  2. Evaluating Satellite and Supercomputing Technologies for Improved Coastal Ecosystem Assessments

    Science.gov (United States)

    McCarthy, Matthew James

    Water quality and wetlands represent two vital elements of a healthy coastal ecosystem. Both experienced substantial declines in the U.S. during the 20th century. Overall coastal wetland cover decreased over 50% in the 20th century due to coastal development and water pollution. Management and legislative efforts have successfully addressed some of the problems and threats, but recent research indicates that the diffuse impacts of climate change and non-point source pollution may be the primary drivers of current and future water-quality and wetland stress. In order to respond to these pervasive threats, traditional management approaches need to adopt modern technological tools for more synoptic, frequent and fine-scale monitoring and assessment. In this dissertation, I explored some of the applications possible with new, commercial satellite imagery to better assess the status of coastal ecosystems. Large-scale land-cover change influences the quality of adjacent coastal water. Satellite imagery has been used to derive land-cover maps since the 1960s. It provides multiple data points with which to evaluate the effects of land-cover change on water quality. The objective of the first chapter of this research was to determine how 40 years of land-cover change in the Tampa Bay watershed (6,500 km2) may have affected turbidity and chlorophyll concentration - two proxies for coastal water quality. Land cover classes were evaluated along with precipitation and wind stress as explanatory variables. Results varied between analyses for the entire estuary and those of segments within the bay. Changes in developed land percent cover best explained the turbidity and chlorophyll-concentration time series for the entire bay (R2 > 0.75). Ocean-color satellite imagery was used to derive proxies for coastal water quality with near-daily satellite observations since 2000. The goal of chapter two was to identify drivers of turbidity variability for 11 National Estuary Program water bodies

  3. A special purpose computer for the calculation of the electric conductivity of a random resistor network

    International Nuclear Information System (INIS)

    Hajjar, Mansour

    1987-01-01

    The special purpose computer PERCOLA is designed for long numerical simulations of a percolation problem in the statistical mechanics of disordered media. Our aim is to improve the known values of the critical exponents characterizing the behaviour of random resistor networks at the percolation threshold. The architecture of PERCOLA is based on an efficient iterative algorithm used to compute the electric conductivity of such networks. The calculator has the characteristics of a general purpose 64-bit floating point micro-programmable computer that can run programs for various types of problems with a peak performance of 25 Mflops. This high computing speed is a result of a pipeline architecture based on internal parallelism and separately micro-code controlled units such as: data memories, a micro-code memory, ALUs and multipliers (both WEITEK components), various data paths, a sequencer (ANALOG DEVICES component), address generators and a random number generator. Thus, the special purpose computer runs the percolation program 10 percent faster than the CRAY XMP supercomputer. (author) [fr]
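
    The iterative algorithm the abstract alludes to can be pictured with a small relaxation solver for the node voltages of a two-dimensional random resistor lattice; the sketch below is illustrative only (tiny lattice, simple Gauss-Seidel sweep), not PERCOLA's micro-coded algorithm.

```python
# Hedged sketch: Gauss-Seidel relaxation for node voltages on a random resistor
# lattice, with unit voltage applied across the left and right electrodes.
import numpy as np

rng = np.random.default_rng(0)
L, p = 16, 0.7                                  # lattice size and bond occupation probability

gx = (rng.random((L, L - 1)) < p).astype(float) # horizontal bond (i,j)-(i,j+1) conductances
gy = (rng.random((L - 1, L)) < p).astype(float) # vertical bond (i,j)-(i+1,j) conductances

V = np.tile(np.linspace(1.0, 0.0, L), (L, 1))   # columns 0 and L-1 act as fixed electrodes

for sweep in range(5000):
    V_old = V.copy()
    for i in range(L):
        for j in range(1, L - 1):               # interior columns only
            g = gx[i, j - 1] + gx[i, j]
            s = gx[i, j - 1] * V[i, j - 1] + gx[i, j] * V[i, j + 1]
            if i > 0:
                g += gy[i - 1, j]; s += gy[i - 1, j] * V[i - 1, j]
            if i < L - 1:
                g += gy[i, j]; s += gy[i, j] * V[i + 1, j]
            if g > 0.0:
                V[i, j] = s / g                 # Kirchhoff current balance at node (i, j)
    if np.max(np.abs(V - V_old)) < 1e-8:
        break

conductance = float(np.sum(gx[:, 0] * (V[:, 0] - V[:, 1])))  # current out of the V=1 electrode
print("sweeps:", sweep + 1, "conductance:", conductance)
```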

  4. ASTEC and MODEL: Controls software development at Goddard Space Flight Center

    Science.gov (United States)

    Downing, John P.; Bauer, Frank H.; Surber, Jeffrey L.

    1993-01-01

    The ASTEC (Analysis and Simulation Tools for Engineering Controls) software is under development at the Goddard Space Flight Center (GSFC). The design goal is to provide a wide selection of controls analysis tools at the personal computer level, as well as the capability to upload compute-intensive jobs to a mainframe or supercomputer. ASTEC has been under development for the last three years and is meant to be an integrated collection of controls analysis tools for use at the desktop level. MODEL (Multi-Optimal Differential Equation Language) is a translator that converts programs written in the MODEL language to FORTRAN. An upgraded version of the MODEL program will be merged into ASTEC. MODEL has not been modified since 1981 and has not kept up with changes in computers or user interface techniques. This paper describes the changes made to MODEL in order to make it useful in the 90's and how it relates to ASTEC.

  5. Experiences developing ALEGRA: A C++ coupled physics framework

    Energy Technology Data Exchange (ETDEWEB)

    Budge, K.G.; Peery, J.S.

    1998-11-01

    ALEGRA is a coupled physics framework originally written to simulate inertial confinement fusion (ICF) experiments being conducted at the PBFA-II facility at Sandia National Laboratories. It has since grown into a large software development project supporting a number of computational programs at Sandia. As the project has grown, so has the development team, from the original two authors to a group of over fifteen programmers crossing several departments. In addition, ALEGRA now runs on a wide variety of platforms, from large PCs to the ASCI Teraflops massively parallel supercomputer. The authors discuss the reasons for ALEGRA's success, which include the intelligent use of object-oriented techniques and the choice of C++ as the programming language. They argue that the intelligent use of development tools, such as build tools (e.g. make), compiler, debugging environment (e.g. dbx), version control system (e.g. cvs), and bug management software (e.g. ClearDDTS), is nearly as important as the choice of language and paradigm.

  6. Parallel Computing Strategies for Irregular Algorithms

    Science.gov (United States)

    Biswas, Rupak; Oliker, Leonid; Shan, Hongzhang; Biegel, Bryan (Technical Monitor)

    2002-01-01

    Parallel computing promises several orders of magnitude increase in our ability to solve realistic computationally-intensive problems, but relies on their efficient mapping and execution on large-scale multiprocessor architectures. Unfortunately, many important applications are irregular and dynamic in nature, making their effective parallel implementation a daunting task. Moreover, with the proliferation of parallel architectures and programming paradigms, the typical scientist is faced with a plethora of questions that must be answered in order to obtain an acceptable parallel implementation of the solution algorithm. In this paper, we consider three representative irregular applications: unstructured remeshing, sparse matrix computations, and N-body problems, and parallelize them using various popular programming paradigms on a wide spectrum of computer platforms ranging from state-of-the-art supercomputers to PC clusters. We present the underlying problems, the solution algorithms, and the parallel implementation strategies. Smart load-balancing, partitioning, and ordering techniques are used to enhance parallel performance. Overall results demonstrate the complexity of efficiently parallelizing irregular algorithms.
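
    Sparse matrix computations are one of the three irregular applications named above; a bare-bones compressed sparse row (CSR) matrix-vector product, which is where the irregular, indirection-heavy memory access pattern comes from, is sketched below (serial, no partitioning or load balancing).

```python
# Hedged sketch: serial CSR sparse matrix-vector product y = A @ x.
import numpy as np

def csr_matvec(indptr, indices, data, x):
    y = np.zeros(len(indptr) - 1)
    for row in range(len(y)):
        start, end = indptr[row], indptr[row + 1]
        # indirect gather of x through the column index array is the irregular part
        y[row] = np.dot(data[start:end], x[indices[start:end]])
    return y

# 3x3 example matrix [[4, 0, 1], [0, 2, 0], [1, 0, 3]] in CSR form
indptr  = np.array([0, 2, 3, 5])
indices = np.array([0, 2, 1, 0, 2])
data    = np.array([4.0, 1.0, 2.0, 1.0, 3.0])
print(csr_matvec(indptr, indices, data, np.ones(3)))   # -> [5. 2. 4.]
```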

  7. Parallel Evolutionary Optimization for Neuromorphic Network Training

    Energy Technology Data Exchange (ETDEWEB)

    Schuman, Catherine D [ORNL]; Disney, Adam [University of Tennessee (UT)]; Singh, Susheela [North Carolina State University (NCSU), Raleigh]; Bruer, Grant [University of Tennessee (UT)]; Mitchell, John Parker [University of Tennessee (UT)]; Klibisz, Aleksander [University of Tennessee (UT)]; Plank, James [University of Tennessee (UT)]

    2016-01-01

    One of the key impediments to the success of current neuromorphic computing architectures is the issue of how best to program them. Evolutionary optimization (EO) is one promising programming technique; in particular, its wide applicability makes it especially attractive for neuromorphic architectures, which can have many different characteristics. In this paper, we explore different facets of EO on a spiking neuromorphic computing model called DANNA. We focus on the performance of EO in the design of our DANNA simulator, and on how to structure EO on both multicore and massively parallel computing systems. We evaluate how our parallel methods impact the performance of EO on Titan, the U.S.'s largest open science supercomputer, and BOB, a Beowulf-style cluster of Raspberry Pis. We also focus on how to improve the EO by evaluating commonality in higher performing neural networks, and present the result of a study that evaluates the EO performed by Titan.
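
    The evolutionary optimization loop itself is simple to sketch; the toy below mutates a parameter vector and keeps the best offspring, standing in for the DANNA network and the fitness evaluation that the paper actually parallelizes on Titan and BOB. All names and values here are placeholders.

```python
# Hedged sketch of a (1+lambda)-style evolutionary optimization loop.
import random

def fitness(genome):
    target = [0.3, -0.7, 0.5]                 # placeholder objective, not a DANNA evaluation
    return -sum((g - t) ** 2 for g, t in zip(genome, target))

def mutate(genome, sigma=0.1):
    return [g + random.gauss(0.0, sigma) for g in genome]

parent = [random.uniform(-1.0, 1.0) for _ in range(3)]
for generation in range(200):
    offspring = [mutate(parent) for _ in range(8)]   # these evaluations are what gets parallelized
    best = max(offspring, key=fitness)
    if fitness(best) > fitness(parent):
        parent = best

print("best fitness found:", fitness(parent))
```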

  8. Experiences developing ALEGRA: A C++ coupled physics framework

    International Nuclear Information System (INIS)

    Budge, K.G.; Peery, J.S.

    1998-01-01

    ALEGRA is a coupled physics framework originally written to simulate inertial confinement fusion (ICF) experiments being conducted at the PBFA-II facility at Sandia National Laboratories. It has since grown into a large software development project supporting a number of computational programs at Sandia. As the project has grown, so has the development team, from the original two authors to a group of over fifteen programmers crossing several departments. In addition, ALEGRA now runs on a wide variety of platforms, from large PCs to the ASCI Teraflops massively parallel supercomputer. The authors discuss the reasons for ALEGRA's success, which include the intelligent use of object-oriented techniques and the choice of C++ as the programming language. They argue that the intelligent use of development tools, such as build tools (e.g. make), compiler, debugging environment (e.g. dbx), version control system (e.g. cvs), and bug management software (e.g. ClearDDTS), is nearly as important as the choice of language and paradigm

  9. OPTICON: Pro-Matlab software for large order controlled structure design

    Science.gov (United States)

    Peterson, Lee D.

    1989-01-01

    A software package for large order controlled structure design is described and demonstrated. The primary program, called OPTICON, uses both Pro-Matlab M-file routines and selected compiled FORTRAN routines linked into the Pro-Matlab structure. The program accepts structural model information in the form of state-space matrices and performs three basic design functions on the model: (1) open loop analyses; (2) closed loop reduced order controller synthesis; and (3) closed loop stability and performance assessment. The current controller synthesis methods implemented in this software are based on the Generalized Linear Quadratic Gaussian theory of Bernstein. In particular, a reduced order Optimal Projection synthesis algorithm based on a homotopy solution method was successfully applied to an experimental truss structure using a 58-state dynamic model. These results are presented and discussed. Current plans to expand the practical size of the design model to several hundred states and the intention to interface Pro-Matlab to a supercomputing environment are discussed.
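
    The abstract's controller synthesis starts from a state-space model; as a much simpler stand-in for the Optimal Projection/homotopy algorithm it describes, the snippet below computes a plain full-order LQR gain from a toy two-state model with SciPy. The model matrices are invented for illustration.

```python
# Hedged sketch: full-order LQR gain from a toy state-space model (not the
# reduced-order Optimal Projection synthesis implemented in OPTICON).
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [-4.0, -0.2]])   # assumed lightly damped oscillator
B = np.array([[0.0], [1.0]])
Q = np.eye(2)                               # state weighting
R = np.array([[1.0]])                       # control weighting

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)             # optimal feedback u = -K x
print("closed-loop eigenvalues:", np.linalg.eigvals(A - B @ K))
```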

  10. Fiscal 2000 report on advanced parallelized compiler technology. Outlines; 2000 nendo advanced heiretsuka compiler gijutsu hokokusho (Gaiyo hen)

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2001-03-01

    Research and development was carried out concerning automatic parallelizing compiler technology, which improves the practical performance, cost/performance ratio, and ease of operation of the multiprocessor systems now used for constructing supercomputers and expected to provide a fundamental architecture for microprocessors in the 21st century. Efforts were made to develop an automatic multigrain parallelization technology for extracting multigrain parallelism from a program and making full use of it, and a parallelization tuning technology for accelerating parallelization by feeding back to the compiler the dynamic information and user knowledge acquired during execution. Moreover, a benchmark program was selected and studies were made to set execution rules and evaluation indexes for the establishment of technologies for objectively evaluating the performance of parallelizing compilers on the existing commercial parallel processing computers. This was achieved through the implementation and evaluation of the 'Advanced parallelizing compiler technology research and development project.' (NEDO)

  11. Parallel R-matrix computation

    International Nuclear Information System (INIS)

    Heggarty, J.W.

    1999-06-01

    For almost thirty years, sequential R-matrix computation has been used by atomic physics research groups from around the world to model collision phenomena involving the scattering of electrons or positrons with atomic or molecular targets. As considerable progress has been made in the understanding of fundamental scattering processes, new data, obtained from more complex calculations, is of current interest to experimentalists. Performing such calculations, however, places considerable demands on the computational resources to be provided by the target machine, in terms of both processor speed and memory requirement. Indeed, in some instances the computational requirements are so great that the proposed R-matrix calculations are intractable, even when utilising contemporary classic supercomputers. Historically, increases in the computational requirements of R-matrix computation were accommodated by porting the problem codes to a more powerful classic supercomputer. Although this approach has been successful in the past, it is no longer considered to be a satisfactory solution due to the limitations of current (and future) von Neumann machines. As a consequence, there has been considerable interest in the high-performance multicomputers that have emerged over the last decade, which appear to offer the computational resources required by contemporary R-matrix research. Unfortunately, developing codes for these machines is not as simple a task as it was to develop codes for successive classic supercomputers. The difficulty arises from the considerable differences in the computing models that exist between the two types of machine, and results in the programming of multicomputers being widely acknowledged as a difficult, time-consuming and error-prone task. Nevertheless, unless parallel R-matrix computation is realised, important theoretical and experimental atomic physics research will continue to be hindered. This thesis describes work that was undertaken in

  12. Maine Migrant Program: 1997-1998 Program Evaluation.

    Science.gov (United States)

    Bazinet, Suzanne C., Ed.

    The Maine Department of Education contracts with local educational agencies to administer the Maine Migrant Education Program. The program's overall mission is to provide the support necessary for migrant children to achieve Maine's academic standards. In 1997-98, 73 local migrant programs served 9,838 students, and 63 summer programs served 1,769…

  13. Program Leadership from a Nordic Perspective - Program Leaders' Power to Influence Their Program

    DEFF Research Database (Denmark)

    Högfeldt, Anna-Karin; Strömberg, Emma; Jerbrant, Anna

    2013-01-01

    research demonstrated that program leaders have quite different positions, strategies and methods when it comes to monitoring and developing their programs. In this paper, a deeper investigation is carried out of the (im-)possibilities to make real influence on the study courses that constitute...... the respective Engineering study programs. Eight program leaders from the five N5T universities have been interviewed, and the analysis of these studies has culminated in a model for the analysis of program leadership for Engineering education development....

  14. Multi-Year Program Plan - Building Regulatory Programs

    Energy Technology Data Exchange (ETDEWEB)

    none,

    2010-10-01

    This document presents DOE’s multi-year plan for the three components of the Buildings Regulatory Program: Appliance and Equipment Efficiency Standards, ENERGY STAR, and the Building Energy Codes Program. This document summarizes the history of these programs, the mission and goals of the programs, pertinent statutory requirements, and DOE’s 5-year plan for moving forward.

  15. Data Mining Supercomputing with SAS JMP® Genomics

    Directory of Open Access Journals (Sweden)

    Richard S. Segall

    2011-02-01

    Full Text Available JMP® Genomics is statistical discovery software that can uncover meaningful patterns in high-throughput genomics and proteomics data. JMP® Genomics is designed for biologists, biostatisticians, statistical geneticists, and those engaged in analyzing the vast stores of data that are common in genomic research (SAS, 2009). Data mining was performed using JMP® Genomics on two collections of microarray databases available from the National Center for Biotechnology Information (NCBI) for lung cancer and breast cancer. The Gene Expression Omnibus (GEO) of NCBI serves as a public repository for a wide range of high-throughput experimental data, including the two collections of lung cancer and breast cancer data that were used for this research. The results of applying data mining using the JMP® Genomics software are shown in this paper with numerous screen shots.

  16. Program summary for the Civilian Reactor Development Program

    International Nuclear Information System (INIS)

    1982-07-01

    This Civilian Reactor Development Program document has the prime purpose of summarizing the technical programs supported by the FY 1983 budget request. This section provides a statement of the overall program objectives and a general program overview. Section II presents the technical programs in a format intended to show logical technical interrelationships, and does not necessarily follow the structure of the formal budget presentation. Section III presents the technical organization and management structure of the program

  17. WOMBAT: A Scalable and High-performance Astrophysical Magnetohydrodynamics Code

    Energy Technology Data Exchange (ETDEWEB)

    Mendygral, P. J.; Radcliffe, N.; Kandalla, K. [Cray Inc., St. Paul, MN 55101 (United States)]; Porter, D. [Minnesota Supercomputing Institute for Advanced Computational Research, Minneapolis, MN (United States)]; O’Neill, B. J.; Nolting, C.; Donnert, J. M. F.; Jones, T. W. [School of Physics and Astronomy, University of Minnesota, Minneapolis, MN 55455 (United States)]; Edmon, P., E-mail: pjm@cray.com, E-mail: nradclif@cray.com, E-mail: kkandalla@cray.com, E-mail: oneill@astro.umn.edu, E-mail: nolt0040@umn.edu, E-mail: donnert@ira.inaf.it, E-mail: twj@umn.edu, E-mail: dhp@umn.edu, E-mail: pedmon@cfa.harvard.edu [Institute for Theory and Computation, Center for Astrophysics, Harvard University, Cambridge, MA 02138 (United States)]

    2017-02-01

    We present a new code for astrophysical magnetohydrodynamics specifically designed and optimized for high performance and scaling on modern and future supercomputers. We describe a novel hybrid OpenMP/MPI programming model that emerged from a collaboration between Cray, Inc. and the University of Minnesota. This design utilizes MPI-RMA optimized for thread scaling, which allows the code to run extremely efficiently at very high thread counts ideal for the latest generation of multi-core and many-core architectures. Such performance characteristics are needed in the era of “exascale” computing. We describe and demonstrate our high-performance design in detail with the intent that it may be used as a model for other, future astrophysical codes intended for applications demanding exceptional performance.
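
    The one-sided (MPI-RMA) communication style mentioned above can be illustrated with a few lines of mpi4py; this is generic MPI put/fence usage under assumed buffer sizes, not WOMBAT's actual hybrid OpenMP/MPI update loop. Run with, e.g., `mpiexec -n 4 python rma_demo.py`.

```python
# Hedged sketch of one-sided MPI (RMA): each rank puts data directly into its
# neighbour's exposed window, with fences delimiting the access epoch.
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()

local = np.zeros(4, dtype='d')              # memory exposed for remote access
win = MPI.Win.Create(local, comm=comm)

payload = np.full(4, float(rank), dtype='d')
target = (rank + 1) % size                  # write to the next rank, ring-style

win.Fence()                                 # open the RMA epoch
win.Put(payload, target)                    # no matching receive needed on the target
win.Fence()                                 # close the epoch; remote data now visible

print(f"rank {rank} window now holds {local}")
win.Free()
```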

  18. Development of computer aided engineering system for TRAC applications

    International Nuclear Information System (INIS)

    Arai, Kenji; Itoya, Seihiro; Uematsu, Hitoshi; Tsunoyama, Shigeaki

    1990-01-01

    An advanced best-estimate computer program for nuclear reactor transient analysis, TRAC, has been extensively used to carry out various thermal hydraulic calculations in the nuclear engineering field because of its versatility. To perform a wide variety of TRAC calculations efficiently, the efficient utilization of computers and a convenient environment for input and output processing are necessary. We have applied a computer network comprising a supercomputer, engineering workstations and personal computers to TRAC calculations and have assigned the appropriate functions to each computer. We have also been developing an interactive graphics system for input and output processing on an EWS. This hardware and software environment can improve the effectiveness of TRAC utilization for various thermal hydraulic calculations. (author)

  19. On The Export Control Of High Speed Imaging For Nuclear Weapons Applications

    Energy Technology Data Exchange (ETDEWEB)

    Watson, Scott Avery [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)]; Altherr, Michael Robert [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)]

    2017-09-15

    Since the Manhattan Project, the use of high-speed photography and its cousins, flash radiography [1] and schlieren photography, has been a technological proliferation concern. Indeed, like the supercomputer, the development of high-speed photography as we now know it essentially grew out of the nuclear weapons program at Los Alamos [2,3,4]. Naturally, during the course of the last 75 years the technology associated with computers and cameras has been export controlled by the United States and others to prevent both proliferation among non-P5 nations and technological parity among potential adversaries within the P5 nations. Here we revisit these issues as they relate to high-speed photographic technologies and make recommendations about how future restrictions, if any, should be guided.

  20. WOMBAT: A Scalable and High-performance Astrophysical Magnetohydrodynamics Code

    International Nuclear Information System (INIS)

    Mendygral, P. J.; Radcliffe, N.; Kandalla, K.; Porter, D.; O’Neill, B. J.; Nolting, C.; Donnert, J. M. F.; Jones, T. W.; Edmon, P.

    2017-01-01

    We present a new code for astrophysical magnetohydrodynamics specifically designed and optimized for high performance and scaling on modern and future supercomputers. We describe a novel hybrid OpenMP/MPI programming model that emerged from a collaboration between Cray, Inc. and the University of Minnesota. This design utilizes MPI-RMA optimized for thread scaling, which allows the code to run extremely efficiently at very high thread counts ideal for the latest generation of multi-core and many-core architectures. Such performance characteristics are needed in the era of “exascale” computing. We describe and demonstrate our high-performance design in detail with the intent that it may be used as a model for other, future astrophysical codes intended for applications demanding exceptional performance.

  1. 6th International Parallel Tools Workshop

    CERN Document Server

    Brinkmann, Steffen; Gracia, José; Resch, Michael; Nagel, Wolfgang

    2013-01-01

    The latest advances in High Performance Computing hardware have significantly raised the level of available compute performance. At the same time, the growing hardware capabilities of modern supercomputing architectures have caused an increasing complexity of parallel application development. Despite numerous efforts to improve and simplify parallel programming, there is still a lot of manual debugging and tuning work required. This process is supported by special software tools facilitating debugging, performance analysis, and optimization, and thus making a major contribution to the development of robust and efficient parallel software. This book introduces a selection of the tools which were presented and discussed at the 6th International Parallel Tools Workshop, held in Stuttgart, Germany, 25-26 September 2012.

  2. Performance Analysis of FEM Algorithms on GPU and Many-Core Architectures

    KAUST Repository

    Khurram, Rooh

    2015-04-27

    The roadmaps of the leading supercomputer manufacturers are based on hybrid systems, which consist of a mix of conventional processors and accelerators. This trend is mainly due to the fact that the power consumption cost of future CPU-only exascale systems would be unsustainable, thus accelerators such as graphics processing units (GPUs) and many-integrated-core (MIC) processors will likely be an integral part of the TOP500 (http://www.top500.org/) supercomputers beyond 2020. The emerging supercomputer architecture will bring new challenges for code developers. Continuum mechanics codes will be particularly affected, because the traditional synchronous implicit solvers will probably not scale on hybrid exascale machines. In a previous study [1], we reported on the performance of a conjugate gradient based mesh motion algorithm [2] on Sandy Bridge, Xeon Phi, and K20c. In the present study we report on a comparative study of finite element codes, using PETSc and AmgX solvers on CPUs and GPUs, respectively [3,4]. We believe this study will be a good starting point for FEM code developers who are contemplating a CPU to accelerator transition.
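
    The mesh-motion solver benchmarked in [1,2] is conjugate gradient based; a textbook serial CG iteration for a symmetric positive definite system is sketched below for reference (pure NumPy, no GPU or MIC offload, and not the PETSc/AmgX code paths compared in the study).

```python
# Hedged sketch: textbook conjugate gradient for a small SPD system.
import numpy as np

def conjugate_gradient(A, b, tol=1e-10, max_iter=1000):
    x = np.zeros_like(b)
    r = b - A @ x
    p = r.copy()
    rs_old = r @ r
    for _ in range(max_iter):
        Ap = A @ p
        alpha = rs_old / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
print(conjugate_gradient(A, b))     # approx [0.0909, 0.6364]
```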

  3. SiGN-SSM: open source parallel software for estimating gene networks with state space models.

    Science.gov (United States)

    Tamada, Yoshinori; Yamaguchi, Rui; Imoto, Seiya; Hirose, Osamu; Yoshida, Ryo; Nagasaki, Masao; Miyano, Satoru

    2011-04-15

    SiGN-SSM is an open-source gene network estimation software package able to run in parallel on PCs and massively parallel supercomputers. The software estimates a state space model (SSM), that is, a statistical dynamic model suitable for analyzing short time and/or replicated time series gene expression profiles. SiGN-SSM implements a novel parameter constraint effective in stabilizing the estimated models. Also, by using a supercomputer, it is able to determine the gene network structure by a statistical permutation test in a practical time. SiGN-SSM is applicable not only to analyzing temporal regulatory dependencies between genes, but also to extracting the differentially regulated genes from time series expression profiles. SiGN-SSM is distributed under the GNU Affero General Public Licence (GNU AGPL) version 3 and can be downloaded at http://sign.hgc.jp/signssm/. The pre-compiled binaries for some architectures are available in addition to the source code. The pre-installed binaries are also available on the Human Genome Center supercomputer system. The online manual and the supplementary information for SiGN-SSM are available on our web site. tamada@ims.u-tokyo.ac.jp.
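
    A state space model of the kind SiGN-SSM estimates can be written as x_t = F x_{t-1} + w_t (hidden dynamics) and y_t = H x_t + v_t (observed expression); the toy simulation below is only meant to show that structure, with arbitrary matrices and noise levels rather than anything estimated by the software.

```python
# Hedged sketch: simulate a small linear-Gaussian state space model.
import numpy as np

rng = np.random.default_rng(1)
F = np.array([[0.9, 0.1],
              [0.0, 0.8]])                     # hidden-state dynamics (assumed values)
H = np.array([[1.0, 0.0],
              [0.5, 1.0],
              [1.0, 1.0]])                     # maps 2 hidden states to 3 observed "genes"

T = 50
x = np.zeros(2)
Y = np.zeros((T, 3))
for t in range(T):
    x = F @ x + rng.normal(0.0, 0.1, size=2)        # system noise w_t
    Y[t] = H @ x + rng.normal(0.0, 0.05, size=3)    # observation noise v_t

print(Y.shape)                                  # T x genes expression matrix
```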

  4. KNBD: A Remote Kernel Block Server for Linux

    Science.gov (United States)

    Becker, Jeff

    1999-01-01

    I am developing a prototype of a Linux remote disk block server whose purpose is to serve as a lower level component of a parallel file system. Parallel file systems are an important component of high performance supercomputers and clusters. Although supercomputer vendors such as SGI and IBM have their own custom solutions, there has been a void and hence a demand for such a system on Beowulf-type PC clusters. Recently, the Parallel Virtual File System (PVFS) project at Clemson University has begun to address this need [1]. Although their system provides much of the functionality of (and indeed was inspired by) the equivalent file systems in the commercial supercomputer market, their system runs entirely in user space. Migrating their I/O services to the kernel could provide a performance boost by obviating the need for expensive system calls. Thanks to Pavel Machek, the Linux kernel has provided the network block device [2] since kernel 2.1.101. You can configure this block device to redirect reads and writes to a remote machine's disk. This can be used as a building block for constructing a striped file system across several nodes.

  5. Geothermal Technologies Program Overview - Peer Review Program

    Energy Technology Data Exchange (ETDEWEB)

    Milliken, JoAnn [Office of Energy Efficiency and Renewable Energy (EERE), Washington, DC (United States)]

    2011-06-06

    This Geothermal Technologies Program presentation was delivered on June 6, 2011 at a Program Peer Review meeting. It contains annual budget, Recovery Act, funding opportunities, upcoming program activities, and more.

  6. How General-Purpose can a GPU be?

    Directory of Open Access Journals (Sweden)

    Philip Machanick

    2015-12-01

    Full Text Available The use of graphics processing units (GPUs) in general-purpose computation (GPGPU) is a growing field. GPU instruction sets, while implementing a graphics pipeline, draw from a range of single instruction multiple datastream (SIMD) architectures characteristic of the heyday of supercomputers. Yet only one of these SIMD instruction sets has been applicable to a wide enough range of problems to survive the era when the full range of supercomputer design variants was being explored: vector instructions. This paper proposes a reconceptualization of the GPU as a multicore design with minimal exotic modes of parallelism so as to make GPGPU truly general.
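
    The surviving SIMD style the article singles out, vector instructions, is easy to mimic at a high level: one whole-array expression replaces an element-by-element loop. The NumPy comparison below is only an analogy for that programming model, not GPU code.

```python
# Hedged illustration: scalar loop vs. a single vector (data-parallel) expression.
import numpy as np

a = np.random.rand(100_000)
b = np.random.rand(100_000)

c_loop = np.empty_like(a)
for i in range(len(a)):                 # scalar style: one element per "instruction"
    c_loop[i] = 2.0 * a[i] + b[i]

c_vec = 2.0 * a + b                     # vector style: the whole axpy in one expression

print(np.allclose(c_loop, c_vec))       # True
```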

  7. A Program Transformation for Backwards Analysis of Logic Programs

    DEFF Research Database (Denmark)

    Gallagher, John Patrick

    2003-01-01

    The input to backwards analysis is a program together with properties that are required to hold at given program points. The purpose of the analysis is to derive initial goals or pre-conditions that guarantee that, when the program is executed, the given properties hold. The solution for logic programs presented here is based on a transformation of the input program, which makes explicit the dependencies of the given program points on the initial goals. The transformation is derived from the resultants semantics of logic programs. The transformed program is then analysed using a standard

  8. Object-Oriented Programming in the Beta Programming Language

    DEFF Research Database (Denmark)

    Madsen, Ole Lehrmann; Møller-Pedersen, Birger; Nygaard, Kristen

    This is a book on object-oriented programming and the BETA programming language. Object-oriented programming originated with the Simula languages developed at the Norwegian Computing Center, Oslo, in the 1960s. The first Simula language, Simula I, was intended for writing simulation programs. Simula I was later used as a basis for defining a general purpose programming language, Simula 67. In addition to being a programming language, Simula I was also designed as a language for describing and communicating about systems in general. Simula has been used by a relatively small community for many years, although it has had a major impact on research in computer science. The real breakthrough for object-oriented programming came with the development of Smalltalk. Since then, a large number of programming languages based on Simula concepts have appeared. C++ is the language that has had...

  9. Repository-Based Software Engineering Program: Working Program Management Plan

    Science.gov (United States)

    1993-01-01

    Repository-Based Software Engineering Program (RBSE) is a National Aeronautics and Space Administration (NASA) sponsored program dedicated to introducing and supporting common, effective approaches to software engineering practices. The process of conceiving, designing, building, and maintaining software systems by using existing software assets that are stored in a specialized operational reuse library or repository, accessible to system designers, is the foundation of the program. In addition to operating a software repository, RBSE promotes (1) software engineering technology transfer, (2) academic and instructional support of reuse programs, (3) the use of common software engineering standards and practices, (4) software reuse technology research, and (5) interoperability between reuse libraries. This Program Management Plan (PMP) is intended to communicate program goals and objectives, describe major work areas, and define a management report and control process. This process will assist the Program Manager, University of Houston at Clear Lake (UHCL) in tracking work progress and describing major program activities to NASA management. The goal of this PMP is to make managing the RBSE program a relatively easy process that improves the work of all team members. The PMP describes work areas addressed and work efforts being accomplished by the program; however, it is not intended as a complete description of the program. Its focus is on providing management tools and management processes for monitoring, evaluating, and administering the program; and it includes schedules for charting milestones and deliveries of program products. The PMP was developed by soliciting and obtaining guidance from appropriate program participants, analyzing program management guidance, and reviewing related program management documents.

  10. Program auto

    International Nuclear Information System (INIS)

    Rawool-Sullivan, M.W.; Plagnol, E.

    1990-01-01

    The program AUTO was developed to be used in the analysis of dE vs E type spectra. This program is written in FORTRAN and calculates dE vs E lines in MeV. Provision is also made in the program to convert these lines from MeV to ADC channel numbers to facilitate comparison with the raw data from the experiments. Currently the output of this program can be plotted with the display program called VISU, but it can also be used independently of VISU, with little or no modification to the actual FORTRAN code. The program AUTO has many useful applications. In this article the program AUTO is described along with its applications.

  11. A Playful Programming Products Vs. Programming Concepts Matrix

    DEFF Research Database (Denmark)

    Allsopp, Benjamin Brink

    2017-01-01

    A number of Danish primary schools are involved in pilot studies where 1st to 9th grade students work with Scratch and Lego MindStorms in STEM subjects. These games may become part of the curriculum at these schools. Recent research identifies a category of games and toys that support learning to computer program: playful programming. This research also describes a project to bring together different stakeholders (developers, educators, parents and researchers) with a common vocabulary for describing, developing, teaching with and comparing these playful programming products, and develops a model to provide educators and researchers involved in pilot studies with an overview of which programming concepts various playful programming products exercise (a playful programming products vs. programming concepts matrix). We also add additional concept specializations and expand on the descriptions...

  12. Programming by Numbers -- A Programming Method for Complete Novices

    NARCIS (Netherlands)

    Glaser, Hugh; Hartel, Pieter H.

    2000-01-01

    Students often have difficulty with the minutiae of program construction. We introduce the idea of `Programming by Numbers', which breaks some of the programming process down into smaller steps, giving such students a way into the process of Programming in the Small. Programming by Numbers does not

  13. Clean Coal Technology Demonstration Program: Program Update 1998

    Energy Technology Data Exchange (ETDEWEB)

    Assistant Secretary for Fossil Energy

    1999-03-01

    Annual report on the Clean Coal Technology Demonstration Program (CCT Program). The report addresses the role of the CCT Program, implementation, funding and costs, accomplishments, project descriptions, legislative history, program history, environmental aspects, and project contacts. The project descriptions describe the technology and provide a brief summary of the demonstration results.

  14. Program specialization

    CERN Document Server

    Marlet, Renaud

    2013-01-01

    This book presents the principles and techniques of program specialization - a general method to make programs faster (and possibly smaller) when some inputs can be known in advance. As an illustration, it describes the architecture of Tempo, an offline program specializer for C that can also specialize code at runtime, and provides figures for concrete applications in various domains. Technical details address issues related to program analysis precision, value reification, incomplete program specialization, strategies to exploit specialized program, incremental specialization, and data speci

  15. UNDERSTANDING THE RELATIONSHIPS OF PROGRAM SATISFACTION, PROGRAM LOYALTY AND STORE LOYALTY AMONG CARDHOLDERS OF LOYALTY PROGRAMS

    Directory of Open Access Journals (Sweden)

    Nor Asiah Omar

    2011-01-01

    Full Text Available Loyalty programs have increasingly attracted interest in both academic marketing research and practice. One major factor that has been increasingly discussed is loyalty. In this study we examine the influence of cardholders' satisfaction on loyalty (program loyalty and store loyalty in a retail context, namely, in department stores and superstores. Data were collected from 400 cardholders of a retail loyalty program in Klang Valley, Malaysia via the drop-off-and-collect technique. Structural modelling techniques were applied to analyze the data. The results indicated that program satisfaction is not related to store loyalty (share-of-wallet, share-of-visit and store preference. However, loyalty to the program (program loyalty plays a crucial intervening role in the relationship between program satisfaction and store loyalty. The study underscores the principal importance of program loyalty in the retail loyalty program.

  16. Final Project Report "Advanced Concept Exploration For Fast Ignition Science Program"

    Energy Technology Data Exchange (ETDEWEB)

    STEPHENS, Richard B.; McLEAN, Harry M.; THEOBALD, Wolfgang; AKLI, Kramer; BEG, Farhat N.; SENTOKU, Yasuiko; SCHUMACHER, Douglas; WEI, Mingsheng S.

    2014-01-31

    and x-ray line radiation from K-shell fluorescence. Integrated experiments, which combine target compression with short-pulse laser heating, yield additional information on target heating efficiency. This indirect way of studying the underlying behavior of the electrons must be validated with computational modeling to understand the physics and improve the design. This program execution required a large, well-organized team and it was managed by a joint Collaboration between General Atomics (GA), Lawrence Livermore National Laboratory (LLNL), and the Laboratory for Laser Energetics (LLE). The Collaboration was formed 8 years ago to understand the physics issues of the Fast Ignition concept, building on the strengths of each partner. GA fulfills its responsibilities jointly with the University of California, San Diego (UCSD), The Ohio State University (OSU) and the University of Nevada at Reno (UNR). Since RHED physics is pursued vigorously in many countries, international researchers have been an important part of our efforts to make progress. The division of responsibility was as follows: (1) LLE had primary leadership for channeling studies and the integrated energy transfer, (2) LLNL led the development of measurement methods, analysis, and deployment of diagnostics, and (3) GA together with UCSD, OSU and UNR studied the detailed energy-transfer physics. The experimental program was carried out using the Titan laser at the Jupiter Laser Facility at LLNL, the OMEGA and OMEGA EP lasers at LLE and the Texas Petawatt laser (TPW) at UT Austin. Modeling has been pursued on large computing facilities at LLNL, OSU, and UCSD using codes developed (by us and others) within the HEDLP program, commercial codes, and by leveraging existing supercomputer codes developed by the NNSA ICF program. This Consortium brought together all the components—resources, facilities, and personnel—necessary to accomplish its aggressive goals. The ACE Program has been strongly collaborative

  17. Collectively loading programs in a multiple program multiple data environment

    Science.gov (United States)

    Aho, Michael E.; Attinella, John E.; Gooding, Thomas M.; Miller, Samuel J.

    2016-11-08

    Techniques are disclosed for loading programs efficiently in a parallel computing system. In one embodiment, nodes of the parallel computing system receive a load description file which indicates, for each program of a multiple program multiple data (MPMD) job, nodes which are to load the program. The nodes determine, using collective operations, a total number of programs to load and a number of programs to load in parallel. The nodes further generate a class route for each program to be loaded in parallel, where the class route generated for a particular program includes only those nodes on which the program needs to be loaded. For each class route, a node is selected using a collective operation to be a load leader which accesses a file system to load the program associated with a class route and broadcasts the program via the class route to other nodes which require the program.
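
    A hedged mpi4py sketch of the scheme described above is given below: ranks are grouped per program (playing the role of a class route), the rank chosen as load leader for each group reads from the file system, and the image is broadcast within the group. The file names and the rank-to-program mapping are invented, and real class routes on Blue Gene-style systems are a lower-level mechanism than communicators.

```python
# Hedged sketch: per-program groups with one load leader broadcasting the image.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

programs = ["sim.exe", "analysis.exe"]          # hypothetical MPMD job description
my_program = rank % len(programs)               # hypothetical rank -> program mapping

route = comm.Split(color=my_program, key=rank)  # one sub-communicator per program

image = None
if route.Get_rank() == 0:                       # this rank acts as the load leader
    with open(programs[my_program], "rb") as f: # only the leader touches the file system
        image = f.read()                        # assumes the (hypothetical) file exists

image = route.bcast(image, root=0)              # everyone else receives it over the network
print(f"rank {rank}: {programs[my_program]} ({len(image)} bytes)")
```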

  18. Clean Coal Technology Demonstration Program: Program Update 2001

    Energy Technology Data Exchange (ETDEWEB)

    Assistant Secretary for Fossil Energy

    2002-07-30

    Annual report on the Clean Coal Technology Demonstration Program (CCT Program). The report addresses the role of the CCT Program, implementation, funding and costs, accomplishments, project descriptions, legislative history, program history, environmental aspects, and project contacts. The project descriptions describe the technology and provide a brief summary of the demonstration results. Also includes Power Plant Improvement Initiative Projects.

  19. An Analysis of Programming Beginners' Source Programs

    Science.gov (United States)

    Matsuyama, Chieko; Nakashima, Toyoshiro; Ishii, Naohiro

    The production of animations was made the subject of a university programming course in order to help students understand the process of program creation and to get them to tackle programming with interest. In this paper, the formats and composition of the programs which the students produced were investigated. It was found that there were many problems related to matters such as how to use indentation and how to apply comments and functions in the format and composition of the source code.

  20. Learners Programming Language a Helping System for Introductory Programming Courses

    Directory of Open Access Journals (Sweden)

    MUHAMMAD SHUMAIL NAVEED

    2016-07-01

    Programming is the core of computer science, and because of this importance special care is taken in designing the curriculum of programming courses. A substantial amount of work has been conducted on the definition of programming courses, yet introductory programming courses still face high attrition, low retention and a lack of motivation. This paper introduces a tiny pre-programming language called LPL (Learners Programming Language) as a ZPL (Zeroth Programming Language) to acquaint novice students with the elementary concepts of introductory programming before the first imperative programming course. The overall objective and design philosophy of LPL are based on the hypothesis that a soft introduction to a simple, paradigm-specific textual programming language can increase the motivation of novice students, reduce the inherent complexity and difficulty of the first programming course, and eventually improve the retention rate and may help reduce dropout and failure levels. LPL also generates equivalent high-level programs from the user's source program, which is very helpful for understanding the syntax of introductory programming languages. To overcome the inherent complexity of the unusual and rigid syntax of introductory programming languages, LPL provides elementary programming concepts in the form of algorithmic, plain natural-language-based computational statements. The initial results obtained after the introduction of LPL are very encouraging with respect to motivating novice students and improving the retention rate.
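
    As a rough illustration of one feature mentioned above, the generation of an equivalent high-level program from simple, natural-language-like statements, here is a toy translator written in Python. The statement forms, the template table and the Python target are my own assumptions for the sketch, not the actual LPL syntax.

        import re

        # Toy statement templates mapping plain, algorithmic statements to Python code.
        TEMPLATES = {
            "set {var} to {value}": "{var} = {value}",
            "add {value} to {var}": "{var} += {value}",
            "print {var}": "print({var})",
        }

        def translate(lines):
            """Translate a list of plain statements into equivalent Python source."""
            out = []
            for line in lines:
                for pattern, target in TEMPLATES.items():
                    regex = "^" + re.sub(r"\{(\w+)\}", r"(?P<\1>\\w+)", pattern) + "$"
                    m = re.match(regex, line.strip())
                    if m:
                        out.append(target.format(**m.groupdict()))
                        break
                else:
                    raise ValueError(f"unrecognised statement: {line!r}")
            return "\n".join(out)

        program = ["set total to 0", "add 5 to total", "print total"]
        print(translate(program))   # total = 0 / total += 5 / print(total)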

  1. NIC symposium 2012. 25 years HLRZ/NIC. Proceedings

    International Nuclear Information System (INIS)

    Binder, Kurt

    2012-01-01

    For 25 years the John von Neumann Institute for Computing (NIC), the former ''Hoechstleistungsrechenzentrum'', has played a pioneering role in supporting research in computational science at the forefront, by giving large grants of computer time to carefully selected research projects. The scope of these projects ranges from fundamental aspects of physics, such as the physics of elementary particles and nuclear physics, astrophysics, statistical physics and the physics of condensed matter, computational chemistry and the life sciences, to more applied areas of research, such as the modelling of processes in the atmosphere, materials science, fluid dynamics applications in engineering, etc. These research projects make use of the supercomputer resources that the Juelich Supercomputing Centre (JSC) provides. The present book, which appears in the framework of the biannual NIC Symposia series, continues a tradition started 10 years ago of presenting selected highlights of this research to a broader audience. Due to space restrictions, only a small number of the research projects that are carried out at the NIC can be presented in this way. Projects that stand out as particularly excellent are nominated as ''John von Neumann Excellence Project'' by the review board. In 2010 this award was given to A. Muramatsu (Stuttgart) for his project on ''Quantum Monte Carlo studies of strongly correlated systems''. In 2011, two such awards were given: one to C. Hoelbling (Wuppertal) for his project ''Computing B K with 2+1 flavours at the physical mass point'', and another to W. Paul (Halle) for ''Long range correlations at polymer-solid interfaces''. The procedures adopted by the NIC to identify the scientifically best projects for the allocation of computer time are of the same character as those used by organisations founded more recently, such as (in Germany) the Gauss Centre for Supercomputing (GCS), an alliance of the three German national supercomputing centres in Juelich, Garching and

  2. NIC symposium 2012. 25 years HLRZ/NIC. Proceedings

    Energy Technology Data Exchange (ETDEWEB)

    Binder, Kurt [Mainz Univ. (Germany). Inst. fuer Physik; Muenster, Gernot [Muenster Univ. (Germany). Inst. fuer Theoretische Physik 1; Kremer, Manfred [Forschungszentrum Juelich GmbH (DE). Juelich Supercomputing Centre (JSC)

    2012-08-07

    For 25 years the John von Neumann Institute for Computing (NIC), the former ''Hoechstleistungsrechenzentrum'', has played a pioneering role in supporting research in computational science at the forefront, by giving large grants of computer time to carefully selected research projects. The scope of these projects ranges from fundamental aspects of physics, such as the physics of elementary particles and nuclear physics, astrophysics, statistical physics and the physics of condensed matter, computational chemistry and the life sciences, to more applied areas of research, such as the modelling of processes in the atmosphere, materials science, fluid dynamics applications in engineering, etc. These research projects make use of the supercomputer resources that the Juelich Supercomputing Centre (JSC) provides. The present book, which appears in the framework of the biannual NIC Symposia series, continues a tradition started 10 years ago of presenting selected highlights of this research to a broader audience. Due to space restrictions, only a small number of the research projects that are carried out at the NIC can be presented in this way. Projects that stand out as particularly excellent are nominated as ''John von Neumann Excellence Project'' by the review board. In 2010 this award was given to A. Muramatsu (Stuttgart) for his project on ''Quantum Monte Carlo studies of strongly correlated systems''. In 2011, two such awards were given: one to C. Hoelbling (Wuppertal) for his project ''Computing B{sub K} with 2+1 flavours at the physical mass point'', and another to W. Paul (Halle) for ''Long range correlations at polymer-solid interfaces''. The procedures adopted by the NIC to identify the scientifically best projects for the allocation of computer time are of the same character as those used by organisations founded more recently, such as (in Germany) the Gauss Centre for Supercomputing (GCS), an alliance of the three German national supercomputing centres in Juelich, Garching

  3. NIC symposium 2012. 25 years HLRZ/NIC. Proceedings

    Energy Technology Data Exchange (ETDEWEB)

    Binder, Kurt [Mainz Univ. (Germany). Inst. fuer Physik; Muenster, Gernot [Muenster Univ. (Germany). Inst. fuer Theoretische Physik 1; Kremer, Manfred (eds.) [Forschungszentrum Juelich GmbH (DE). Juelich Supercomputing Centre (JSC)

    2012-08-07

    For 25 years the John von Neumann Institute for Computing (NIC), the former ''Hoechstleistungsrechenzentrum'', has played a pioneering role in supporting research in computational science at the forefront, by giving large grants of computer time to carefully selected research projects. The scope of these projects ranges from fundamental aspects of physics, such as the physics of elementary particles and nuclear physics, astrophysics, statistical physics and the physics of condensed matter, computational chemistry and the life sciences, to more applied areas of research, such as the modelling of processes in the atmosphere, materials science, fluid dynamics applications in engineering, etc. These research projects make use of the supercomputer resources that the Juelich Supercomputing Centre (JSC) provides. The present book, which appears in the framework of the biannual NIC Symposia series, continues a tradition started 10 years ago of presenting selected highlights of this research to a broader audience. Due to space restrictions, only a small number of the research projects that are carried out at the NIC can be presented in this way. Projects that stand out as particularly excellent are nominated as ''John von Neumann Excellence Project'' by the review board. In 2010 this award was given to A. Muramatsu (Stuttgart) for his project on ''Quantum Monte Carlo studies of strongly correlated systems''. In 2011, two such awards were given: one to C. Hoelbling (Wuppertal) for his project ''Computing B{sub K} with 2+1 flavours at the physical mass point'', and another to W. Paul (Halle) for ''Long range correlations at polymer-solid interfaces''. The procedures adopted by the NIC to identify the scientifically best projects for the allocation of computer time are of the same character as those used by organisations founded more recently, such as (in Germany) the Gauss Centre for

  4. 2016 ALCF Science Highlights

    Energy Technology Data Exchange (ETDEWEB)

    Collins, James R. [Argonne National Lab. (ANL), Argonne, IL (United States); Cerny, Beth A. [Argonne National Lab. (ANL), Argonne, IL (United States); Wolf, Laura [Argonne National Lab. (ANL), Argonne, IL (United States); Coffey, Richard M. [Argonne National Lab. (ANL), Argonne, IL (United States); Papka, Michael E. [Argonne National Lab. (ANL), Argonne, IL (United States)

    2016-01-01

    The Argonne Leadership Computing Facility provides supercomputing capabilities to the scientific and engineering community to advance fundamental discovery and understanding in a broad range of disciplines.

  5. 2015 Annual Report - Argonne Leadership Computing Facility

    Energy Technology Data Exchange (ETDEWEB)

    Collins, James R. [Argonne National Lab. (ANL), Argonne, IL (United States); Papka, Michael E. [Argonne National Lab. (ANL), Argonne, IL (United States); Cerny, Beth A. [Argonne National Lab. (ANL), Argonne, IL (United States); Coffey, Richard M. [Argonne National Lab. (ANL), Argonne, IL (United States)

    2015-01-01

    The Argonne Leadership Computing Facility provides supercomputing capabilities to the scientific and engineering community to advance fundamental discovery and understanding in a broad range of disciplines.

  6. 2014 Annual Report - Argonne Leadership Computing Facility

    Energy Technology Data Exchange (ETDEWEB)

    Collins, James R. [Argonne National Lab. (ANL), Argonne, IL (United States); Papka, Michael E. [Argonne National Lab. (ANL), Argonne, IL (United States); Cerny, Beth A. [Argonne National Lab. (ANL), Argonne, IL (United States); Coffey, Richard M. [Argonne National Lab. (ANL), Argonne, IL (United States)

    2014-01-01

    The Argonne Leadership Computing Facility provides supercomputing capabilities to the scientific and engineering community to advance fundamental discovery and understanding in a broad range of disciplines.

  7. Analyzing Array Manipulating Programs by Program Transformation

    Science.gov (United States)

    Cornish, J. Robert M.; Gange, Graeme; Navas, Jorge A.; Schachte, Peter; Sondergaard, Harald; Stuckey, Peter J.

    2014-01-01

    We explore a transformational approach to the problem of verifying simple array-manipulating programs. Traditionally, verification of such programs requires intricate analysis machinery to reason with universally quantified statements about symbolic array segments, such as "every data item stored in the segment A[i] to A[j] is equal to the corresponding item stored in the segment B[i] to B[j]." We define a simple abstract machine which allows for set-valued variables and we show how to translate programs with array operations to array-free code for this machine. For the purpose of program analysis, the translated program remains faithful to the semantics of array manipulation. Based on our implementation in LLVM, we evaluate the approach with respect to its ability to extract useful invariants and the cost in terms of code size.
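
    A minimal sketch of the flavour of this translation, using Python and assumptions of my own (not the authors' LLVM-based tool): an array-copy loop is paired with an "array-free" abstract version in which the copied segment is tracked only as a set of values, so the universally quantified segment property becomes a simple set inclusion.

        def concrete_copy(b):
            """The original array-manipulating program: copy b into a."""
            a = [None] * len(b)
            for i in range(len(b)):
                a[i] = b[i]
            return a

        def abstract_copy(b):
            """Array-free counterpart over set-valued variables (my own toy abstraction)."""
            written, read = set(), set()     # values stored in a[0..i-1] / read from b[0..i-1]
            for i in range(len(b)):
                read.add(b[i])
                written.add(b[i])            # models a[i] := b[i] as a set update
                assert written <= read       # array-free invariant: copied values ⊆ source values
            return written

        if __name__ == "__main__":
            src = [3, 1, 4, 1, 5]
            assert concrete_copy(src) == src
            assert abstract_copy(src) == set(src)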

  8. Ab initio molecular dynamics simulations for the role of hydrogen in catalytic reactions of furfural on Pd(111)

    Science.gov (United States)

    Xue, Wenhua; Dang, Hongli; Liu, Yingdi; Jentoft, Friederike; Resasco, Daniel; Wang, Sanwu

    2014-03-01

    In the study of catalytic reactions of biomass, furfural conversion over metal catalysts with the presence of hydrogen has attracted wide attention. We report ab initio molecular dynamics simulations for furfural and hydrogen on the Pd(111) surface at finite temperatures. The simulations demonstrate that the presence of hydrogen is important in promoting furfural conversion. In particular, hydrogen molecules dissociate rapidly on the Pd(111) surface. As a result of such dissociation, atomic hydrogen participates in the reactions with furfural. The simulations also provide detailed information about the possible reactions of hydrogen with furfural. Supported by DOE (DE-SC0004600). This research used the supercomputer resources of the XSEDE, the NERSC Center, and the Tandy Supercomputing Center.

  9. Scalability of DL_POLY on High Performance Computing Platform

    Directory of Open Access Journals (Sweden)

    Mabule Samuel Mabakane

    2017-12-01

    This paper presents a case study on the scalability of several versions of the molecular dynamics code (DL_POLY) performed on South Africa's Centre for High Performance Computing e1350 IBM Linux cluster, Sun system and Lengau supercomputers. Within this study different problem sizes were designed and the same chosen systems were employed in order to test the performance of DL_POLY using weak and strong scaling. It was found that the speed-up results for the small systems were better than those for the large systems on both the Ethernet and Infiniband networks. However, simulations of large systems in DL_POLY performed well using the Infiniband network on the Lengau cluster compared to the e1350 and Sun supercomputers.
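
    For readers unfamiliar with the two scaling measures used above, the following small Python example (with hypothetical timings, not data from the paper) spells out how strong-scaling speed-up and parallel efficiency are computed; in weak scaling the per-core problem size is fixed, so the ideal efficiency T(1)/T(p) stays near 1.

        def strong_speedup(t1, tp):
            return t1 / tp                     # ideal value: p

        def strong_efficiency(t1, tp, p):
            return t1 / (p * tp)               # ideal value: 1.0

        def weak_efficiency(t1, tp):
            return t1 / tp                     # per-core problem size fixed; ideal: 1.0

        # Hypothetical wall-clock times (seconds) for a fixed-size MD run.
        strong = {1: 3600.0, 16: 260.0, 128: 45.0}
        for p, tp in strong.items():
            print(f"p={p:4d}  speed-up={strong_speedup(strong[1], tp):6.1f}  "
                  f"efficiency={strong_efficiency(strong[1], tp, p):.2f}")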

  10. The Cure for HPC Neurosis: Multiple, Virtual Personalities!

    Energy Technology Data Exchange (ETDEWEB)

    Farber, Rob

    2007-06-30

    The selection of a new supercomputer for a scientific data center represents an interesting neurotic condition stemming from the conflict between a compulsion to acquire the best of the latest generation computer hardware, and unresolved issues as users seek validation from legacy scientific software - sometimes euphemistically called “research quality code”. Virtualization technology, now a mainstream feature on modern processors, permits multiple operating systems to efficiently and simultaneously run on each node of a supercomputer (or even your laptop and workstation). The benefits of this technology are many, ranging from supporting legacy software to paving the way towards robust petascale (10^15 floating point operations per second) and eventually exascale (10^18 floating point operations per second) computing.

  11. 1984 CERN school of computing

    International Nuclear Information System (INIS)

    1985-01-01

    The eighth CERN School of Computing covered subjects mainly related to computing for elementary-particle physics. These proceedings contain written versions of most of the lectures delivered at the School. Notes on the following topics are included: trigger and data-acquisition plans for the LEP experiments; unfolding methods in high-energy physics experiments; Monte Carlo techniques; relational data bases; data networks and open systems; the Newcastle connection; portable operating systems; expert systems; microprocessors - from basic chips to complete systems; algorithms for parallel computers; trends in supercomputers and computational physics; supercomputing and related national projects in Japan; application of VLSI in high-energy physics, and single-user systems. See hints under the relevant topics. (orig./HSI)

  12. Subseabed-disposal program: systems-analysis program plan

    International Nuclear Information System (INIS)

    Klett, R.D.

    1981-03-01

    This report contains an overview of the Subseabed Nuclear Waste Disposal Program systems analysis program plan, and includes sensitivity, safety, optimization, and cost/benefit analyses. Details of the primary barrier sensitivity analysis and the data acquisition and modeling cost/benefit studies are given, as well as the schedule through the technical, environmental, and engineering feasibility phases of the program

  13. Federal Wind Energy Program. Program summary. [USA

    Energy Technology Data Exchange (ETDEWEB)

    None

    1978-01-01

    The objective of the Federal Wind Energy Program is to accelerate the development of reliable and economically viable wind energy systems and enable the earliest possible commercialization of wind power. To achieve this objective for small and large wind systems requires advancing the technology, developing a sound industrial technology base, and addressing the non-technological issues which could deter the use of wind energy. This summary report outlines the projects being supported by the program through FY 1977 toward the achievement of these goals. It also outlines the program's general organization and specific program elements.

  14. Automatic Construction of Java Programs from Functional Program Specifications

    OpenAIRE

    Md. Humayun Kabir

    2015-01-01

    This paper presents a novel approach to constructing Java programs automatically from input functional program specifications on natural numbers, using the constructive proofs of the input specifications produced by an inductive theorem prover called Poiti'n. The construction of a Java program from an input functional program specification involves two phases. The theorem prover is used to construct a higher order functional (HOF) program from the input specification expressed as an existential the...

  15. Environmental radioactive intercomparison program and radioactive standards program

    Energy Technology Data Exchange (ETDEWEB)

    Dilbeck, G. [Environmental Monitoring Systems Laboratory, Las Vegas, NV (United States)

    1993-12-31

    The Environmental Radioactivity Intercomparison Program described herein provides quality assurance support for laboratories involved in analyzing public drinking water under the Safe Drinking Water Act (SDWA) Regulations, and to the environmental radiation monitoring activities of various agencies. More than 300 federal and state nuclear facilities and private laboratories participate in some phase of the program. This presentation describes the Intercomparison Program studies and matrices involved, summarizes the precision and accuracy requirements of various radioactive analytes, and describes the traceability determinations involved with radioactive calibration standards distributed to the participants. A summary of program participants, sample and report distributions, and additional responsibilities of this program are discussed.

  16. Bringing ATLAS production to HPC resources. A case study with SuperMuc and Hydra

    Energy Technology Data Exchange (ETDEWEB)

    Duckeck, Guenter; Walker, Rodney [LMU Muenchen (Germany); Kennedy, John; Mazzaferro, Luca [RZG Garching (Germany); Kluth, Stefan [Max-Planck-Institut fuer Physik, Muenchen (Germany); Collaboration: ATLAS-Collaboration

    2015-07-01

    The use of supercomputer systems or HPC resources by ATLAS is now becoming viable due to the changing nature of these systems, and it is also very attractive given the need for increasing amounts of simulated data. The ATLAS experiment at CERN will begin a period of high-luminosity data taking in 2015. The corresponding need for simulated data might potentially exceed the capabilities of the current Grid infrastructure. ATLAS aims to address this need by opportunistically accessing resources such as cloud and HPC systems. This contribution presents the results of two projects undertaken by LMU/LRZ and MPP/RZG to use the supercomputer facilities SuperMuc (LRZ) and Hydra (RZG). Both are Linux-based supercomputers in the 100 k CPU-core category. The integration of such HPC resources into the ATLAS production system poses many challenges. Firstly, established techniques and features of standard WLCG operation are prohibited or much restricted on HPC systems, e.g. Grid middleware, software installation, outside connectivity, etc. Secondly, efficient use of the available resources requires massive multi-core jobs, back-fill submission and check-pointing. We discuss the customization of these components and the strategies for HPC usage as well as possibilities for future directions.

  17. Veterinary Technician Program Director Leadership Style and Program Success

    Science.gov (United States)

    Renda-Francis, Lori A.

    2012-01-01

    Program directors of American Veterinary Medical Association (AVMA) accredited veterinary technician programs may have little or no training in leadership. The need for program directors of AVMA-accredited veterinary technician programs to understand how leadership traits may have an impact on student success is often overlooked. The purpose of…

  18. Large-scale computation in solid state physics - Recent developments and prospects

    International Nuclear Information System (INIS)

    DeVreese, J.T.

    1985-01-01

    During the past few years an increasing interest in large-scale computation is developing. Several initiatives were taken to evaluate and exploit the potential of ''supercomputers'' like the CRAY-1 (or XMP) or the CYBER-205. In the U.S.A., there first appeared the Lax report in 1982 and subsequently (1984) the National Science Foundation in the U.S.A. announced a program to promote large-scale computation at the universities. Also, in Europe several CRAY- and CYBER-205 systems have been installed. Although the presently available mainframes are the result of a continuous growth in speed and memory, they might have induced a discontinuous transition in the evolution of the scientific method; between theory and experiment a third methodology, ''computational science'', has become or is becoming operational

  19. Highly parallel machines and future of scientific computing

    International Nuclear Information System (INIS)

    Singh, G.S.

    1992-01-01

    The computing requirements of large-scale scientific computing have always been ahead of what state-of-the-art hardware could supply in the form of the supercomputers of the day. For any single-processor system, the limit to the increase in computing power was recognized a few years back. Now, with the advent of parallel computing systems, the availability of machines with the required computing power seems a reality. In this paper the author tries to visualize the future of large-scale scientific computing in the penultimate decade of the present century. The author summarizes trends in parallel computers and emphasizes the need for a better programming environment and software tools for optimal performance, and concludes with a critique of parallel architectures, software tools and algorithms. (author). 10 refs., 2 tabs

  20. GPAW - massively parallel electronic structure calculations with Python-based software

    DEFF Research Database (Denmark)

    Enkovaara, Jussi; Romero, Nichols A.; Shende, Sameer

    2011-01-01

    of the productivity enhancing features together with a good numerical performance. We have used this approach in implementing an electronic structure simulation software GPAW using the combination of Python and C programming languages. While the chosen approach works well in standard workstations and Unix...... popular choice. While dynamic, interpreted languages, such as Python, can increase the effciency of programmer, they cannot compete directly with the raw performance of compiled languages. However, by using an interpreted language together with a compiled language, it is possible to have most...... environments, massively parallel supercomputing systems can present some challenges in porting, debugging and profiling the software. In this paper we describe some details of the implementation and discuss the advantages and challenges of the combined Python/C approach. We show that despite the challenges...
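
    The Python/C combination described above can be illustrated, in a heavily simplified and hypothetical form (this is not GPAW's actual code or build setup; the library name and kernel are my own), by driving a small compiled C kernel from Python via ctypes and NumPy:

        # Assumes a C file like the following has been compiled into libkernel.so:
        #   /* kernel.c */
        #   void scale(double *x, int n, double a) {
        #       for (int i = 0; i < n; i++) x[i] *= a;
        #   }
        #   cc -shared -fPIC -o libkernel.so kernel.c

        import ctypes
        import numpy as np

        lib = ctypes.CDLL("./libkernel.so")   # hypothetical library name
        lib.scale.argtypes = [ctypes.POINTER(ctypes.c_double), ctypes.c_int, ctypes.c_double]

        x = np.arange(5, dtype=np.float64)
        # Hand the NumPy buffer to the compiled kernel: Python for orchestration, C for speed.
        lib.scale(x.ctypes.data_as(ctypes.POINTER(ctypes.c_double)), x.size, 2.0)
        print(x)   # [0. 2. 4. 6. 8.]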

  1. Programming

    International Nuclear Information System (INIS)

    Jackson, M.A.

    1982-01-01

    The programmer's task is often taken to be the construction of algorithms, expressed in hierarchical structures of procedures: this view underlies the majority of traditional programming languages, such as Fortran. A different view is appropriate to a wide class of problem, perhaps including some problems in High Energy Physics. The programmer's task is regarded as having three main stages: first, an explicit model is constructed of the reality with which the program is concerned; second, this model is elaborated to produce the required program outputs; third, the resulting program is transformed to run efficiently in the execution environment. The first two stages deal in network structures of sequential processes; only the third is concerned with procedure hierarchies. (orig.)

  2. Knowledge, programming, and programming cultures: LISP, C, and Ada

    Science.gov (United States)

    Rochowiak, Daniel

    1990-01-01

    The results of research 'Ada as an implementation language for knowledge based systems' are presented. The purpose of the research was to compare Ada to other programming languages. The report focuses on the programming languages Ada, C, and Lisp, the programming cultures that surround them, and the programming paradigms they support.

  3. Los Alamos National Laboratory Science Education Programs. Progress report, October 1, 1994--December 31, 1994

    Energy Technology Data Exchange (ETDEWEB)

    Gill, D.H.

    1995-02-01

    During the 1994 summer institute NTEP teachers worked in coordination with LANL and the Los Alamos Middle School and Mountain Elementary School to gain experience in communicating on-line, in gaining further information from the Internet, and in using electronic Bulletin Board Systems (BBSs) to exchange ideas with other teachers. To build on their telecommunications skills, NTEP teachers participated in the International Telecommunications In Education Conference (Tel*ED '94) at the Albuquerque Convention Center on November 11 & 12, 1994. They attended the multimedia keynote address, various workshops highlighting many aspects of educational telecommunications skills, and the Telecomm Rodeo sponsored by Los Alamos National Laboratory. The Rodeo featured many presentations by Laboratory personnel and educational institutions on ways in which telecommunications technologies can be used in the classroom. Many were of the 'hands-on' type, so that teachers were able to try out methods and equipment and evaluate their usefulness in their own schools and classrooms. Some of the presentations featured were the Geonet educational BBS system, the Supercomputing Challenge, and the Sunrise Project, all sponsored by LANL; the 'CU-seeMe' live video software, various simulation software packages, networking help, and many other interesting and useful exhibits.

  4. System programming languages

    OpenAIRE

    Šmit, Matej

    2016-01-01

    Most operating systems are written in the C programming language. The same holds for system software, for example device drivers, compilers, debuggers, disk checkers, etc. Recently some new programming languages have emerged which are supposed to be suitable for system programming. In this thesis we present the programming languages D, Go, Nim and Rust. We define the criteria which are important for deciding whether a programming language is suitable for system programming. We examine programming langua...

  5. Equipment qualification research program: program plan

    International Nuclear Information System (INIS)

    Dong, R.G.; Smith, P.D.

    1982-01-01

    The Lawrence Livermore National Laboratory (LLNL) under the sponsorship of the US Nuclear Regulatory Commission (NRC) has developed this program plan for research in equipment qualification (EQA). In this report the research program which will be executed in accordance with this plan will be referred to as the Equipment Qualification Research Program (EQRP). Covered are electrical and mechanical equipment under the conditions described in the OBJECTIVE section of this report. The EQRP has two phases; Phase I is primarily to produce early results and to develop information for Phase II. Phase I will last 18 months and consists of six projects. The first project is program management. The second project is responsible for in-depth evaluation and review of EQ issues and EQ processes. The third project is responsible for detailed planning to initiate Phase II. The remaining three projects address specific equipment; i.e., valves, electrical equipment, and a pump

  6. 40 CFR 68.175 - Prevention program/Program 3.

    Science.gov (United States)

    2010-07-01

    ... (CONTINUED) CHEMICAL ACCIDENT PREVENTION PROVISIONS Risk Management Plan § 68.175 Prevention program/Program 3. (a) For each Program 3 process, the owner or operator shall provide the information indicated in paragraphs (b) through (p) of this section. If the same information applies to more than one covered process...

  7. NETL Super Computer

    Data.gov (United States)

    Federal Laboratory Consortium — The NETL Super Computer was designed for performing engineering calculations that apply to fossil energy research. It is one of the world’s larger supercomputers,...

  8. Sadhana | Indian Academy of Sciences

    Indian Academy of Sciences (India)

    ... VLSI clock interconnects; delay variability; PDF; process variation; Gaussian random ... Supercomputer Education and Research Centre, Indian Institute of Science, ... Manuscript received: 27 February 2009; Manuscript revised: 9 February ...

  9. A Scheduling-Based Framework for Efficient Massively Parallel Execution, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — The barrier to entry creating efficient, scalable applications for heterogeneous supercomputing environments is too high. EM Photonics has found that the majority of...

  10. A strategy for automatically generating programs in the lucid programming language

    Science.gov (United States)

    Johnson, Sally C.

    1987-01-01

    A strategy for automatically generating and verifying simple computer programs is described. The programs are specified by a precondition and a postcondition in predicate calculus. The programs generated are in the Lucid programming language, a high-level, data-flow language known for its attractive mathematical properties and ease of program verification. The Lucid programming language is described, and the automatic program generation strategy is described and applied to several example problems.
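
    The general generate-and-verify idea (though not the paper's Lucid-based, proof-oriented strategy) can be sketched in Python as a toy enumerative synthesizer: candidate expressions are drawn from a tiny grammar, and the first one whose outputs satisfy the postcondition on all test inputs meeting the precondition is returned. The grammar, the specification and the testing-based check are my own assumptions.

        def synthesize(pre, post, tests, candidates):
            """Return the first candidate expression consistent with the specification."""
            for expr, fn in candidates.items():
                if all(post(x, y, fn(x, y)) for x, y in tests if pre(x, y)):
                    return expr
            return None

        CANDIDATES = {
            "x + y": lambda x, y: x + y,
            "x - y": lambda x, y: x - y,
            "max(x, y)": lambda x, y: max(x, y),
            "min(x, y)": lambda x, y: min(x, y),
        }

        # Specification: for non-negative inputs, the result is the larger of the two.
        pre = lambda x, y: x >= 0 and y >= 0
        post = lambda x, y, r: r >= x and r >= y and (r == x or r == y)
        tests = [(0, 0), (1, 2), (5, 3), (7, 7)]
        print(synthesize(pre, post, tests, CANDIDATES))   # -> "max(x, y)"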

  11. Stochastic integer programming by dynamic programming

    NARCIS (Netherlands)

    Lageweg, B.J.; Lenstra, J.K.; Rinnooy Kan, A.H.G.; Stougie, L.; Ermoliev, Yu.; Wets, R.J.B.

    1988-01-01

    Stochastic integer programming is a suitable tool for modeling hierarchical decision situations with combinatorial features. In continuation of our work on the design and analysis of heuristics for such problems, we now try to find optimal solutions. Dynamic programming techniques can be used to

  12. Applied Energy Program

    Science.gov (United States)

    Los Alamos is using its world-class scientific capabilities to enhance national energy security by developing energy sources with limited environmental impact

  13. Computer-Assisted Program Reasoning Based on a Relational Semantics of Programs

    Directory of Open Access Journals (Sweden)

    Wolfgang Schreiner

    2012-02-01

    We present an approach to program reasoning which inserts between a program and its verification conditions an additional layer, the denotation of the program expressed in a declarative form. The program is first translated into its denotation, from which the verification conditions are subsequently generated. However, even before (and independently of) any verification attempt, one may investigate the denotation itself to get insight into the "semantic essence" of the program, in particular to see whether the denotation indeed gives reason to believe that the program has the expected behavior. Errors in the program and in the meta-information may thus be detected and fixed prior to actually performing the formal verification. More concretely, following the relational approach to program semantics, we model the effect of a program as a binary relation on program states. A formal calculus is devised to derive from a program a logic formula that describes this relation and is subject to inspection and manipulation. We have implemented this idea in a comprehensive form in the RISC ProgramExplorer, a new program reasoning environment for educational purposes which encompasses the previously developed RISC ProofNavigator as an interactive proving assistant.
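
    As a minimal, hedged sketch of the relational view described above (my own toy example, not the RISC ProgramExplorer), the following Python fragment treats the denotation of a command as a binary relation on states, sequential composition as relational composition, and a partial correctness claim as inclusion of the program relation in a specification relation.

        def compose(r, s):
            """Relational composition: (a, c) in r;s  iff  (a, b) in r and (b, c) in s."""
            return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

        # States are just values of a single integer variable x.
        states = range(-3, 4)
        inc = {(x, x + 1) for x in states}           # denotation of  x := x + 1
        dbl = {(x, 2 * x) for x in range(-10, 11)}   # denotation of  x := 2 * x

        program = compose(inc, dbl)                  # x := x + 1; x := 2 * x

        # Specification relation: whatever the input, the final value of x is even.
        spec = {(x, y) for x in states for y in range(-100, 101) if y % 2 == 0}
        print(program <= spec)                       # True: the denotation meets the spec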

  14. Abstract Interpretation of PIC programs through Logic Programming

    DEFF Research Database (Denmark)

    Henriksen, Kim Steen; Gallagher, John Patrick

    2006-01-01

    , are applied to the logic based model of the machine. A small PIC microcontroller is used as a case study. An emulator for this microcontroller is written in Prolog, and standard programming transformations and analysis techniques are used to specialise this emulator with respect to a given PIC program....... The specialised emulator can now be further analysed to gain insight into the given program for the PIC microcontroller. The method describes a general framework for applying abstractions, illustrated here by linear constraints and convex hull analysis, to logic programs. Using these techniques on the specialised...

  15. Playing by Programming: Making Gameplay a Programming Activity

    Science.gov (United States)

    Weintrop, David; Wilensky, Uri

    2016-01-01

    Video games are an oft-cited reason for young learners getting interested in programming and computer science. As such, many learning opportunities build on this interest by having kids program their own video games. This approach, while sometimes successful, has its drawbacks stemming from the fact that the challenge of programming and game…

  16. 40 CFR 68.170 - Prevention program/Program 2.

    Science.gov (United States)

    2010-07-01

    ... (CONTINUED) CHEMICAL ACCIDENT PREVENTION PROVISIONS Risk Management Plan § 68.170 Prevention program/Program 2. (a) For each Program 2 process, the owner or operator shall provide in the RMP the information... the process. (c) The name(s) of the chemical(s) covered. (d) The date of the most recent review or...

  17. Verification of Imperative Programs by Constraint Logic Program Transformation

    Directory of Open Access Journals (Sweden)

    Emanuele De Angelis

    2013-09-01

    We present a method for verifying partial correctness properties of imperative programs that manipulate integers and arrays, using techniques based on the transformation of constraint logic programs (CLP). We use CLP as a metalanguage for representing imperative programs, their executions, and their properties. First, we encode the correctness of an imperative program, say prog, as the negation of a predicate 'incorrect' defined by a CLP program T. By construction, 'incorrect' holds in the least model of T if and only if the execution of prog from an initial configuration eventually halts in an error configuration. Then, we apply to program T a sequence of transformations that preserve its least model semantics. These transformations are based on well-known transformation rules, such as unfolding and folding, guided by suitable transformation strategies, such as specialization and generalization. The objective of the transformations is to derive a new CLP program TransfT where the predicate 'incorrect' is defined either by (i) the fact 'incorrect.' (and in this case prog is not correct), or by (ii) the empty set of clauses (and in this case prog is correct). In the case where we derive a CLP program such that neither (i) nor (ii) holds, we iterate the transformation. Since the problem is undecidable, this process may not terminate. We show through examples that our method can be applied in a rather systematic way, and is amenable to automation by transferring to the field of program verification many techniques developed in the field of program transformation.
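
    The essence of the encoding, that 'incorrect' holds exactly when an execution from an initial configuration reaches an error configuration, can be mimicked by a toy reachability check; the Python sketch below is my own illustration (no CLP, no unfold/fold transformations) with a made-up two-instruction program and an artificial notion of error.

        def step(cfg):
            pc, x = cfg
            if pc == 0:                      # while x < 5:
                return {(1, x)} if x < 5 else {(2, x)}
            if pc == 1:                      #     x := x + 2
                return {(0, x + 2)}
            return set()                     # pc == 2: halted

        ERROR = {(2, x) for x in range(20) if x % 2 == 1}   # "error": halting with odd x

        def incorrect(initial):
            reached, frontier = set(initial), set(initial)
            while frontier:                  # least-fixpoint reachability computation
                frontier = {c for cfg in frontier for c in step(cfg)} - reached
                reached |= frontier
            return bool(reached & ERROR)

        print(incorrect({(0, 0)}))   # False: from x=0 the loop halts with x=6 (even)
        print(incorrect({(0, 1)}))   # True: from x=1 it halts with x=5, an "error" here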

  18. Knowledge based systems for nuclear applications in Germany

    International Nuclear Information System (INIS)

    Schmidt, F.

    1987-01-01

    Several national and international research programs dealing with artificial intelligence and other innovative computer applications are in progress in Germany. However, in contrast to the development of computer applications in the past, the new research programs are not very much determined by the needs of the nuclear industry. Thus, applications of AI techniques in the German nuclear industry are not very innovative in the sense of artificial intelligence. They may be divided into two categories: (1) projects which aim to explore the new technologies, and (2) projects which aim to open new areas of work. This situation is changing due to the fact that supercomputers with large memory, workstations with cheap disc devices, and fast networks are becoming available. These hardware devices allow the connection of locally available knowledge and data bases with powerful central computer capacity. Using such hardware tools, new applications can be developed in nuclear engineering even with existing software tools. These new applications may be characterized as integrated systems. The Integral Planning Simulation System IPSS, which is under development at the University of Stuttgart, is such a system

  19. Building Strong Geoscience Programs: Perspectives From Three New Programs

    Science.gov (United States)

    Flood, T. P.; Munk, L.; Anderson, S. W.

    2005-12-01

    During the past decade, at least sixteen geoscience departments in the U.S. that offer a B.S. degree or higher have been eliminated or dispersed. During that same time, three new geoscience departments with degree-granting programs have been developed. Each program has unique student demographics, affiliation (i.e. public institution versus private liberal arts college), geoscience curricula and reasons for initiation. Some of the common themes for each program include; 1) strong devotion to providing field experiences, 2) commitment to student-faculty collaborative research, 3) maintaining traditional geology program elements in the core curriculum and 4) placing students into high quality graduate programs and geoscience careers. Although the metrics for each school vary, each program can claim success in the area of maintaining solid enrollments. This metric is critical because programs are successful only if they have enough students, either in the major and/or general education courses, to convince administrators that continued support of faculty, including space and funding is warranted. Some perspectives gained through the establishment of these new programs may also be applicable to established programs. The success and personality of a program can be greatly affected by the personality of a single faculty member. Therefore, it may not be in the best interest of a program to distribute programmatic work equally among all faculty. For example, critical responsibilities such as teaching core and introductory courses should be the responsibility of faculty who are fully committed to these pursuits. However, if these responsibilities reduce scholarly output, well-articulated arguments should be developed in order to promote program quality and sustainability rather than individual productivity. Field and undergraduate research experiences should be valued as much as high-quality classroom and laboratory instruction. To gain the support of the administration

  20. Interfacing ANSYS to user's programs using UNIX shell program

    Energy Technology Data Exchange (ETDEWEB)

    Kim, In Yong; Kim, Beom Shig [Korea Atomic Energy Research Institute, Taejon (Korea, Republic of)

    1994-01-01

    It has been considered impossible to interface ANSYS, a commercial finite element code whose source is not open to the public, to other user programs. When an analysis needs to be iterated, the user has to wait until the analysis is finished and read the ANSYS results to prepare the input data for every iteration. In this report, direct interfacing techniques between ANSYS and other user programs using UNIX shell programming are proposed. Detailed program listings and an application example are also provided. (Author) 19 refs., 6 figs., 7 tabs.
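
    A rough sketch of such an iteration driver, written here in Python rather than as the report's UNIX shell scripts, is given below; the solver command line, file names, output format and update rule are placeholders of my own, not actual ANSYS invocation syntax from the report.

        import subprocess

        SOLVER_CMD = ["run_fem_solver", "-i", "model.inp", "-o", "model.out"]  # placeholder command

        def write_input(path, parameter):
            # Placeholder input writer for the next iteration.
            with open(path, "w") as f:
                f.write(f"PARAM,LOAD,{parameter}\n")

        def read_result(path):
            # Placeholder parser: take the last number in the output file as the result.
            with open(path) as f:
                return float(f.read().split()[-1])

        parameter, target, tol = 1.0, 100.0, 1e-3
        for iteration in range(20):
            write_input("model.inp", parameter)
            subprocess.run(SOLVER_CMD, check=True)    # run the solver in batch mode
            result = read_result("model.out")
            if abs(result - target) < tol:
                break
            parameter *= target / result              # simple proportional update of the input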