high-performance embedded computing: Topics by WorldWideScience.org

Sample records for high-performance embedded computing

Embedded High Performance Scalable Computing Systems

National Research Council Canada - National Science Library

Ngo, David

2003-01-01

The Embedded High Performance Scalable Computing Systems (EHPSCS) program is a cooperative agreement between Sanders, A Lockheed Martin Company and DARPA that ran for three years, from Apr 1995 - Apr 1998...
Micromagnetics on high-performance workstation and mobile computational platforms

Science.gov (United States)

Fu, S.; Chang, R.; Couture, S.; Menarini, M.; Escobar, M. A.; Kuteifan, M.; Lubarda, M.; Gabay, D.; Lomakin, V.

2015-05-01

The feasibility of using high-performance desktop and embedded mobile computational platforms is presented, including multi-core Intel central processing unit, Nvidia desktop graphics processing units, and Nvidia Jetson TK1 Platform. FastMag finite element method-based micromagnetic simulator is used as a testbed, showing high efficiency on all the platforms. Optimization aspects of improving the performance of the mobile systems are discussed. The high performance, low cost, low power consumption, and rapid performance increase of the embedded mobile systems make them a promising candidate for micromagnetic simulations. Such architectures can be used as standalone systems or can be built as low-power computing clusters.
Nested Interrupt Analysis of Low Cost and High Performance Embedded Systems Using GSPN Framework

Science.gov (United States)

Lin, Cheng-Min

Interrupt service routines are a key technology for embedded systems. In this paper, we introduce the standard approach for using Generalized Stochastic Petri Nets (GSPNs) as a high-level model for generating Continuous-Time Markov Chains (CTMCs) and then use Markov Reward Models (MRMs) to compute the performance of embedded systems. This framework is employed to analyze two low-cost, high-performance embedded controllers, ARM7 and Cortex-M3. Cortex-M3 is designed with a tail-chaining mechanism to improve on the performance of ARM7 when a nested interrupt occurs on an embedded controller. The Platform Independent Petri net Editor 2 (PIPE2) tool is used to model and evaluate the controllers in terms of power consumption and interrupt overhead performance. The numerical results show that Cortex-M3 performs better than ARM7 in terms of both power consumption and interrupt overhead.
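
The CTMC and reward-model step mentioned in the abstract above can be illustrated with a small worked sketch. It is not the paper's model: the three states (idle, servicing an interrupt, servicing a nested interrupt), all transition rates, and the per-state power figures are hypothetical values chosen only to show how a steady-state distribution and an expected reward are computed.

```python
def steady_state(Q):
    """Solve pi * Q = 0 with sum(pi) = 1 for a small irreducible CTMC."""
    n = len(Q)
    # Transpose Q so the unknowns form a column vector, then replace the
    # last balance equation with the normalization constraint sum(pi) = 1.
    A = [[Q[j][i] for j in range(n)] for i in range(n)]
    A[-1] = [1.0] * n
    b = [0.0] * (n - 1) + [1.0]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(n):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
                b[r] -= f * b[col]
    return [b[i] / A[i][i] for i in range(n)]


def expected_reward(pi, rewards):
    """Steady-state expected reward: sum of pi[i] * rewards[i]."""
    return sum(p * r for p, r in zip(pi, rewards))


# Hypothetical generator matrix (rates per millisecond, assumed values).
# State 0: idle, state 1: servicing an ISR, state 2: servicing a nested ISR.
Q = [
    [-2.0, 2.0, 0.0],     # idle -> ISR
    [10.0, -11.0, 1.0],   # ISR -> idle, ISR -> nested ISR
    [0.0, 20.0, -20.0],   # nested ISR -> ISR
]
# Hypothetical power draw per state (arbitrary units), used as the reward.
POWER = [1.0, 3.0, 4.0]

if __name__ == "__main__":
    pi = steady_state(Q)
    print("steady-state probabilities:", pi)
    print("expected power:", expected_reward(pi, POWER))
```

In a GSPN tool the generator matrix would be derived from the reachability graph of the net rather than written by hand; the sketch starts from the matrix to keep the reward computation visible.
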
Proceedings of the High Performance Embedded Computing Workshop (HPEC 2006) (10th). Held in Lexington, Massachusetts on September 19-21, 2006 (CD-ROM)

National Research Council Canada - National Science Library

Kepner, Jeremy

2007-01-01

...: 1 CD-ROM; 4 3/4 in.; 78.3 MB. ABSTRACT: The High-Performance Embedded Computing (HPEC) technical committee announced the tenth annual HPEC Workshop held in September 2006 at MIT Lincoln Laboratory in Lexington, MA...
Modern Embedded Computing Designing Connected, Pervasive, Media-Rich Systems

CERN Document Server

Barry, Peter

2012-01-01

Modern embedded systems are used for connected, media-rich, and highly integrated handheld devices such as mobile phones, digital cameras, and MP3 players. All of these embedded systems require networking, graphic user interfaces, and integration with PCs, as opposed to traditional embedded processors that can perform only limited functions for industrial applications. While most books focus on these controllers, Modern Embedded Computing provides a thorough understanding of the platform architecture of modern embedded computing systems that drive mobile devices. The book offers a comprehen
A low-cost high-performance embedded platform for accelerator controls

International Nuclear Information System (INIS)

Cleva, Stefano; Bogani, Alessio Igor; Pivetta, Lorenzo

2012-01-01

Over the last few years the mobile and hand-held device market has seen a dramatic performance improvement in the microprocessors employed for these systems. As an interesting side effect, this brings the opportunity of adopting these microprocessors to build small low-cost embedded boards, featuring lots of processing power and input/output capabilities. Moreover, being capable of running a full-featured operating system such as GNU/Linux, and even a control system toolkit such as Tango, these boards can also be used in control systems as front-end or embedded computers. In order to evaluate the feasibility of this idea, an activity has started at Elettra to select, evaluate and validate a commercial embedded device able to guarantee production-grade reliability, competitive costs and an open source platform. The preliminary results of this work are presented. (author)
Integrating Embedded Computing Systems into High School and Early Undergraduate Education

Science.gov (United States)

Benson, B.; Arfaee, A.; Choon Kim; Kastner, R.; Gupta, R. K.

2011-01-01

Early exposure to embedded computing systems is crucial for students to be prepared for the embedded computing demands of today's world. However, exposure to systems knowledge often comes too late in the curriculum to stimulate students' interests and to provide a meaningful difference in how they direct their choice of electives for future…
Embedded, everywhere: a research agenda for networked systems of embedded computers

National Research Council Canada - National Science Library

Committee on Networked Systems of Embedded Computers; National Research Council Staff; Division on Engineering and Physical Sciences; Computer Science and Telecommunications Board; National Academy of Sciences

2001-01-01

.... Embedded, Everywhere explores the potential of networked systems of embedded computers and the research challenges arising from embedding computation and communications technology into a wide variety of applications...
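Perbandingan Kemampuan Embedded Computer dengan General Purpose Computer untuk Pengolahan Citra [Comparison of the Capabilities of an Embedded Computer and a General Purpose Computer for Image Processing]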

Directory of Open Access Journals (Sweden)

Herryawan Pujiharsono

2017-08-01

Advances in computer technology have led to image processing being widely developed to assist people in many fields of work. However, not every field can adopt image processing, because some do not support the use of computers, which has encouraged the development of image processing on microcontrollers or dedicated microprocessors. Advances in microcontrollers and microprocessors now allow image processing to be developed on an embedded computer or single board computer (SBC). This study aims to test the ability of an embedded computer to process images and to compare the results with those of an ordinary (general purpose) computer. Testing was carried out by measuring the execution time of four image processing operations applied to ten image sizes. The results show that the execution-time optimization of the embedded computer is better than that of the general purpose computer, with the average execution time of the embedded computer being 4-5 times that of the general purpose computer. The maximum image size that does not place too heavy a load on the CPU is 256x256 pixels for the embedded computer and 400x300 pixels for the general purpose computer.
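
The measurement procedure described in the abstract above (timing image processing operations across several image sizes) can be sketched as follows. This is only an illustration: the study does not name its four operations or its tooling here, so the two operations (`invert`, `threshold`), the synthetic test image, and the sizes used are all assumptions.

```python
import time


def make_image(width, height):
    """Build a synthetic grayscale image (values 0..255) as a list of rows."""
    return [[(x * 7 + y * 13) % 256 for x in range(width)] for y in range(height)]


def invert(img):
    """Photographic negative of a grayscale image."""
    return [[255 - p for p in row] for row in img]


def threshold(img, level=128):
    """Binarize: pixels at or above `level` become 255, the rest 0."""
    return [[255 if p >= level else 0 for p in row] for row in img]


def time_operation(op, img, repeats=3):
    """Return the best wall-clock time in seconds over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        op(img)
        best = min(best, time.perf_counter() - start)
    return best


def benchmark(sizes, operations):
    """Time every operation on every image size; keys are (name, w, h)."""
    results = {}
    for width, height in sizes:
        img = make_image(width, height)
        for name, op in operations.items():
            results[(name, width, height)] = time_operation(op, img)
    return results


if __name__ == "__main__":
    # Assumed sizes; the study used ten sizes that are not listed here.
    sizes = [(64, 64), (256, 256), (400, 300)]
    ops = {"invert": invert, "threshold": threshold}
    for key, seconds in sorted(benchmark(sizes, ops).items()):
        print(key, f"{seconds * 1e3:.3f} ms")
```

Running the same script on both platforms and dividing the timings gives the kind of execution-time ratio the abstract reports.
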
Computers as components principles of embedded computing system design

CERN Document Server

Wolf, Marilyn

2012-01-01

Computers as Components: Principles of Embedded Computing System Design, 3e, presents essential knowledge on embedded systems technology and techniques. Updated for today's embedded systems design methods, this edition features new examples including digital signal processing, multimedia, and cyber-physical systems. Author Marilyn Wolf covers the latest processors from Texas Instruments, ARM, and Microchip Technology plus software, operating systems, networks, consumer devices, and more. Like the previous editions, this textbook: Uses real processors to demonstrate both technology and tec
Implementation of an embedded computer

OpenAIRE

Pikl, Bojan

2011-01-01

The goal of this thesis is to describe the production of an embedded computer. The thesis describes the development and production of an embedded computer for the medical diode laser DL30 that is being developed at Robomed d.o.o. The first part of the thesis describes the choice of hardware devices. I mostly describe the technologies that one can buy on the market. Moreover, for every part of the computer installed and developed, there is an argument for why we selected that exact part. The second part ...
High-performance computing using FPGAs

CERN Document Server

Benkrid, Khaled

2013-01-01

This book is concerned with the emerging field of High Performance Reconfigurable Computing (HPRC), which aims to harness the high performance and relatively low power of reconfigurable hardware–in the form of Field Programmable Gate Arrays (FPGAs)–in High Performance Computing (HPC) applications. It presents the latest developments in this field from applications, architecture, and tools and methodologies points of view. We hope that this work will form a reference for existing researchers in the field, and entice new researchers and developers to join the HPRC community. The book includes: Thirteen application chapters which present the most important application areas tackled by high performance reconfigurable computers, namely: financial computing, bioinformatics and computational biology, data search and processing, stencil computation e.g. computational fluid dynamics and seismic modeling, cryptanalysis, astronomical N-body simulation, and circuit simulation. Seven architecture chapters which...
Trusted computing for embedded systems

CERN Document Server

Soudris, Dimitrios; Anagnostopoulos, Iraklis

2015-01-01

This book describes the state-of-the-art in trusted computing for embedded systems. It shows how a variety of security and trusted computing problems are addressed currently and what solutions are expected to emerge in the coming years. The discussion focuses on attacks aimed at hardware and software for embedded systems, and the authors describe specific solutions to create security features. Case studies are used to present new techniques designed as industrial security solutions. Coverage includes development of tamper resistant hardware and firmware mechanisms for lightweight embedded devices, as well as those serving as security anchors for embedded platforms required by applications such as smart power grids, smart networked and home appliances, environmental and infrastructure sensor networks, etc. · Enables readers to address a variety of security threats to embedded hardware and software; · Describes design of secure wireless sensor networks, to address secure authen...
Advanced Technologies, Embedded and Multimedia for Human-Centric Computing

CERN Document Server

Chao, Han-Chieh; Deng, Der-Jiunn; Park, James; HumanCom and EMC 2013

2014-01-01

The theme of HumanCom and EMC is focused on the various aspects of human-centric computing for advances in computer science and its applications and in embedded and multimedia computing, and the events provide an opportunity for academic and industry professionals to discuss the latest issues and progress in the area of human-centric computing. The theme of EMC (Advanced in Embedded and Multimedia Computing) is focused on the various aspects of embedded systems, smart grid, cloud and multimedia computing, and it provides an opportunity for academic and industry professionals to discuss the latest issues and progress in the area of embedded and multimedia computing. This book therefore includes the various theories and practical applications in human-centric computing and embedded and multimedia computing.
Advances in embedded computer vision

CERN Document Server

Kisacanin, Branislav

2014-01-01

This illuminating collection offers a fresh look at the very latest advances in the field of embedded computer vision. Emerging areas covered by this comprehensive text/reference include the embedded realization of 3D vision technologies for a variety of applications, such as stereo cameras on mobile devices. Recent trends towards the development of small unmanned aerial vehicles (UAVs) with embedded image and video processing algorithms are also examined. The authoritative insights range from historical perspectives to future developments, reviewing embedded implementation, tools, technolog
Rad-hard embedded computers for nuclear robotics

International Nuclear Information System (INIS)

Giraud, A.; Joffre, F.; Marceau, M.; Robiolle, M.; Brunet, J.P.; Mijuin, D.

1993-01-01

To meet the requirements of the nuclear industries, it is necessary to use robots with embedded rad-hard electronics and a high level of safety. The computer developed for the French research program SYROCO is presented in this paper. (authors). 8 refs., 5 figs
High Performance Computing in Science and Engineering '15 : Transactions of the High Performance Computing Center

CERN Document Server

Kröner, Dietmar; Resch, Michael

2016-01-01

This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS) in 2015. The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.
High Performance Computing in Science and Engineering '17 : Transactions of the High Performance Computing Center

CERN Document Server

Kröner, Dietmar; Resch, Michael; HLRS 2017

2018-01-01

This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS) in 2017. The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.
Design of massively parallel hardware multi-processors for highly-demanding embedded applications

NARCIS (Netherlands)

Jozwiak, L.; Jan, Y.

2013-01-01

Many new embedded applications require complex computations to be performed to tight schedules, while at the same time demanding low energy consumption and low cost. For implementation of these highly-demanding applications, highly-optimized application-specific multi-processor system-on-a-chip
High performance embedded system for real-time pattern matching

Energy Technology Data Exchange (ETDEWEB)

Sotiropoulou, C.-L., E-mail: c.sotiropoulou@cern.ch [University of Pisa, Largo B. Pontecorvo 3, 56127 Pisa (Italy); INFN-Pisa Section, Largo B. Pontecorvo 3, 56127 Pisa (Italy); Luciano, P. [University of Cassino and Southern Lazio, Gaetano di Biasio 43, Cassino 03043 (Italy); INFN-Pisa Section, Largo B. Pontecorvo 3, 56127 Pisa (Italy); Gkaitatzis, S. [Aristotle University of Thessaloniki, 54124 Thessaloniki (Greece); Citraro, S. [University of Pisa, Largo B. Pontecorvo 3, 56127 Pisa (Italy); INFN-Pisa Section, Largo B. Pontecorvo 3, 56127 Pisa (Italy); Giannetti, P. [INFN-Pisa Section, Largo B. Pontecorvo 3, 56127 Pisa (Italy); Dell'Orso, M. [University of Pisa, Largo B. Pontecorvo 3, 56127 Pisa (Italy); INFN-Pisa Section, Largo B. Pontecorvo 3, 56127 Pisa (Italy)

2017-02-11

In this paper we present an innovative and high performance embedded system for real-time pattern matching. This system is based on the evolution of hardware and algorithms developed for the field of High Energy Physics and more specifically for the execution of extremely fast pattern matching for tracking of particles produced by proton–proton collisions in hadron collider experiments. A miniaturized version of this complex system is being developed for pattern matching in generic image processing applications. The system works as a contour identifier able to extract the salient features of an image. It is based on the principles of cognitive image processing, which means that it executes fast pattern matching and data reduction mimicking the operation of the human brain. The pattern matching can be executed by a custom designed Associative Memory chip. The reference patterns are chosen by a complex training algorithm implemented on an FPGA device. Post processing algorithms (e.g. pixel clustering) are also implemented on the FPGA. The pattern matching can be executed on a 2D or 3D space, on black and white or grayscale images, depending on the application and thus increasing exponentially the processing requirements of the system. We present the firmware implementation of the training and pattern matching algorithm, performance and results on a latest generation Xilinx Kintex Ultrascale FPGA device. - Highlights: • A high performance embedded system for real-time pattern matching is proposed. • It is based on a system developed for High Energy Physics experiment triggers. • It mimics the operation of the human brain (cognitive image processing). • The process can be executed on 2D and 3D, black and white or grayscale images. • The implementation uses FPGAs and custom designed associative memory (AM) chips.
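
The pattern-matching idea in the abstract above (compare each image region against a bank of stored reference patterns, with the bank chosen by a training step) can be emulated in software as a rough sketch. This is not the authors' firmware or their training algorithm: the 3x3 window size, the frequency-based `train` function, and the use of a Python set in place of the Associative Memory chip are all assumptions made for illustration.

```python
from collections import Counter


def windows(img, k=3):
    """Yield (row, col, code) for every k x k window of a binary image.

    `code` packs the window's bits, row by row, into a single integer so
    that a window can be looked up by content.
    """
    h, w = len(img), len(img[0])
    for r in range(h - k + 1):
        for c in range(w - k + 1):
            code = 0
            for dr in range(k):
                for dc in range(k):
                    code = (code << 1) | img[r + dr][c + dc]
            yield r, c, code


def train(images, bank_size):
    """Stand-in training step (assumed): keep the most frequent patterns."""
    counts = Counter(code for img in images for _, _, code in windows(img))
    return {code for code, _ in counts.most_common(bank_size)}


def match(img, bank):
    """Return window positions whose content is stored in the pattern bank.

    The set lookup plays the role of the associative memory: the window
    content itself is the key, so no stored pattern is scanned one by one.
    """
    return [(r, c) for r, c, code in windows(img) if code in bank]


if __name__ == "__main__":
    square = [
        [0, 0, 0, 0],
        [0, 1, 1, 0],
        [0, 1, 1, 0],
        [0, 0, 0, 0],
    ]
    bank = train([square], bank_size=2)
    print("bank:", sorted(bank))
    print("matches:", match(square, bank))
```

A hardware associative memory compares the input against all stored patterns simultaneously; the hash-based set is only a convenient software analogue of that content-addressed lookup.
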

High performance embedded system for real-time pattern matching

International Nuclear Information System (INIS)

Sotiropoulou, C.-L.; Luciano, P.; Gkaitatzis, S.; Citraro, S.; Giannetti, P.; Dell'Orso, M.

2017-01-01

In this paper we present an innovative and high performance embedded system for real-time pattern matching. This system is based on the evolution of hardware and algorithms developed for the field of High Energy Physics and more specifically for the execution of extremely fast pattern matching for tracking of particles produced by proton–proton collisions in hadron collider experiments. A miniaturized version of this complex system is being developed for pattern matching in generic image processing applications. The system works as a contour identifier able to extract the salient features of an image. It is based on the principles of cognitive image processing, which means that it executes fast pattern matching and data reduction mimicking the operation of the human brain. The pattern matching can be executed by a custom designed Associative Memory chip. The reference patterns are chosen by a complex training algorithm implemented on an FPGA device. Post processing algorithms (e.g. pixel clustering) are also implemented on the FPGA. The pattern matching can be executed on a 2D or 3D space, on black and white or grayscale images, depending on the application and thus increasing exponentially the processing requirements of the system. We present the firmware implementation of the training and pattern matching algorithm, performance and results on a latest generation Xilinx Kintex Ultrascale FPGA device. - Highlights: • A high performance embedded system for real-time pattern matching is proposed. • It is based on a system developed for High Energy Physics experiment triggers. • It mimics the operation of the human brain (cognitive image processing). • The process can be executed on 2D and 3D, black and white or grayscale images. • The implementation uses FPGAs and custom designed associative memory (AM) chips.
Computational Biology and High Performance Computing 2000

Energy Technology Data Exchange (ETDEWEB)

Simon, Horst D.; Zorn, Manfred D.; Spengler, Sylvia J.; Shoichet, Brian K.; Stewart, Craig; Dubchak, Inna L.; Arkin, Adam P.

2000-10-19

The pace of extraordinary advances in molecular biology has accelerated in the past decade due in large part to discoveries coming from genome projects on human and model organisms. The advances in the genome project so far, happening well ahead of schedule and under budget, have exceeded any dreams by its protagonists, let alone formal expectations. Biologists expect the next phase of the genome project to be even more startling in terms of dramatic breakthroughs in our understanding of human biology, the biology of health and of disease. Only today can biologists begin to envision the necessary experimental, computational and theoretical steps necessary to exploit genome sequence information for its medical impact, its contribution to biotechnology and economic competitiveness, and its ultimate contribution to environmental quality. High performance computing has become one of the critical enabling technologies, which will help to translate this vision of future advances in biology into reality. Biologists are increasingly becoming aware of the potential of high performance computing. The goal of this tutorial is to introduce the exciting new developments in computational biology and genomics to the high performance computing community.
Embedded computer systems for control applications in EBR-II

International Nuclear Information System (INIS)

Carlson, R.B.; Start, S.E.

1993-01-01

The purpose of this paper is to describe the embedded computer systems approach taken at Experimental Breeder Reactor II (EBR-II) for non-safety related systems. The hardware and software structures for typical embedded systems are presented. The embedded systems development process is described. Three examples are given which illustrate typical embedded computer applications in EBR-II.
Hard Real-Time Performances in Multiprocessor-Embedded Systems Using ASMP-Linux

Directory of Open Access Journals (Sweden)

Daniel Pierre Bovet

2008-01-01

Multiprocessor systems, especially those based on multicore or multithreaded processors, and new operating system architectures can satisfy the ever increasing computational requirements of embedded systems. ASMP-LINUX is a modified, high-responsiveness, open-source hard real-time operating system for multiprocessor systems, capable of providing high real-time performance while keeping the code simple and not affecting the performance of the rest of the system. Moreover, ASMP-LINUX does not require code changes or application recompiling/relinking. In order to assess the performance of ASMP-LINUX, benchmarks have been performed on several hardware platforms and configurations.
Hard Real-Time Performances in Multiprocessor-Embedded Systems Using ASMP-Linux

Directory of Open Access Journals (Sweden)

Betti Emiliano

2008-01-01

Multiprocessor systems, especially those based on multicore or multithreaded processors, and new operating system architectures can satisfy the ever increasing computational requirements of embedded systems. ASMP-LINUX is a modified, high-responsiveness, open-source hard real-time operating system for multiprocessor systems, capable of providing high real-time performance while keeping the code simple and not affecting the performance of the rest of the system. Moreover, ASMP-LINUX does not require code changes or application recompiling/relinking. In order to assess the performance of ASMP-LINUX, benchmarks have been performed on several hardware platforms and configurations.
Embedded Volttron specification - benchmarking small footprint compute device for Volttron

Energy Technology Data Exchange (ETDEWEB)

Sanyal, Jibonananda [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Fugate, David L. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Woodworth, Ken [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Nutaro, James J. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Kuruganti, Teja [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

2015-08-17

An embedded system is a small footprint computing unit that typically serves a specific purpose closely associated with measurements and control of hardware devices. These units are designed for reasonable durability and operations in a wide range of operating conditions. Some embedded systems support real-time operations and can demonstrate high levels of reliability. Many have failsafe mechanisms built to handle graceful shutdown of the device in exception conditions. The available memory, processing power, and network connectivity of these devices are limited due to the nature of their specific-purpose design and intended application. Industry practice is to carefully design the software for the available hardware capability to suit desired deployment needs. Volttron is an open source agent development and deployment platform designed to enable researchers to interact with devices and appliances without having to write drivers themselves. Hosting Volttron on small footprint embeddable devices enables its demonstration for embedded use. This report details the steps required and the experience in setting up and running Volttron applications on three small footprint devices: the Intel Next Unit of Computing (NUC), the Raspberry Pi 2, and the BeagleBone Black. In addition, the report also details preliminary investigation of the execution performance of Volttron on these devices.
The selection of embedded computer using in the nuclear physics instruments

International Nuclear Information System (INIS)

Zhang Jianchuan; Nan Gangyang; Wang Yanyu; Su Hong

2010-01-01

This paper introduces the requirements for an embedded PC and the benefits of using one in a project to develop and improve experimental nuclear physics instruments. According to the specific requirements of the laboratory instrument improvement project, several kinds of embedded computers are compared and specifically tested. As a result, an x86-architecture embedded computer with ultra-low power consumption and a small size is selected as the main component of the controller used in the nuclear physics instrument, and it will be used in the high-speed data acquisition and electronic control system. (authors)
High Performance Embedded System for Real-Time Pattern Matching

CERN Document Server

Sotiropoulou, Calliope Louisa; The ATLAS collaboration; Gkaitatzis, Stamatios; Citraro, Saverio; Giannetti, Paola; Dell'Orso, Mauro

2016-01-01

We present an innovative and high performance embedded system for real-time pattern matching. This system is based on the evolution of hardware and algorithms developed for the field of High Energy Physics (HEP) and more specifically for the execution of extremely fast pattern matching for tracking of particles produced by proton-proton collisions in hadron collider experiments. A miniaturized version of this complex system is being developed for pattern matching in generic image processing applications. The design uses the flexibility of Field Programmable Gate Arrays (FPGAs) and the powerful Associative Memory Chip (ASIC) to achieve real-time performance. The system works as a contour identifier able to extract the salient features of an image. It is based on the principles of cognitive image processing, which means that it executes fast pattern matching and data reduction mimicking the operation of the human brain.
Computer vision camera with embedded FPGA processing

Science.gov (United States)

Lecerf, Antoine; Ouellet, Denis; Arias-Estrada, Miguel

2000-03-01

Traditional computer vision is based on a camera-computer system in which the image understanding algorithms are embedded in the computer. To circumvent the computational load of vision algorithms, low-level processing and imaging hardware can be integrated in a single compact module where a dedicated architecture is implemented. This paper presents a Computer Vision Camera based on an open architecture implemented in an FPGA. The system is targeted to real-time computer vision tasks where low level processing and feature extraction tasks can be implemented in the FPGA device. The camera integrates a CMOS image sensor, an FPGA device, two memory banks, and an embedded PC for communication and control tasks. The FPGA device is a medium size one equivalent to 25,000 logic gates. The device is connected to two high speed memory banks, an IS interface, and an imager interface. The camera can be accessed for architecture programming, data transfer, and control through an Ethernet link from a remote computer. A hardware architecture can be defined in a Hardware Description Language (like VHDL), simulated and synthesized into digital structures that can be programmed into the FPGA and tested on the camera. The architecture of a classical multi-scale edge detection algorithm based on a Laplacian of Gaussian convolution has been developed to show the capabilities of the system.
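
As a software reference for the edge detector named in the abstract above, the following sketch builds a Laplacian of Gaussian (LoG) kernel, filters an image with it, and marks zero crossings of the response as edges at one or more scales. The paper implements its architecture in FPGA hardware; this pure-Python version, its function names, the kernel radius of three sigma, and the zero-mean correction are assumptions made for illustration only.

```python
import math


def log_kernel(sigma, radius=None):
    """Sampled Laplacian of Gaussian kernel, shifted to sum to zero."""
    if radius is None:
        radius = int(math.ceil(3 * sigma))  # assumed truncation radius
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            q = (x * x + y * y) / (2.0 * sigma * sigma)
            row.append(-(1.0 - q) * math.exp(-q) / (math.pi * sigma ** 4))
        kernel.append(row)
    # Subtract the mean so that uniform regions produce no response.
    mean = sum(map(sum, kernel)) / (len(kernel) ** 2)
    return [[v - mean for v in row] for row in kernel]


def convolve(img, kernel):
    """Valid-region 2D filtering; the kernel is symmetric, so correlation
    and convolution coincide."""
    k = len(kernel)
    h, w = len(img), len(img[0])
    return [
        [
            sum(kernel[i][j] * img[r + i][c + j] for i in range(k) for j in range(k))
            for c in range(w - k + 1)
        ]
        for r in range(h - k + 1)
    ]


def zero_crossings(resp, eps=1e-9):
    """Positions where the response changes sign toward the right or
    lower neighbour; differences below `eps` are ignored."""
    edges = set()
    if not resp or not resp[0]:
        return edges
    rows, cols = len(resp), len(resp[0])
    for r in range(rows):
        for c in range(cols):
            for nr, nc in ((r, c + 1), (r + 1, c)):
                if nr < rows and nc < cols:
                    a, b = resp[r][c], resp[nr][nc]
                    if a * b < 0 and abs(a - b) > eps:
                        edges.add((r, c))
    return edges


def multiscale_edges(img, sigmas):
    """Run the detector at each scale; returns {sigma: set of positions}."""
    return {s: zero_crossings(convolve(img, log_kernel(s))) for s in sigmas}


if __name__ == "__main__":
    # Vertical step edge: left half dark, right half bright.
    step = [[0.0] * 6 + [1.0] * 6 for _ in range(12)]
    print(multiscale_edges(step, [1.0]))
```

Positions are reported in the coordinates of the filtered (valid-region) output, so they are offset from image coordinates by the kernel radius. Larger sigmas need larger images, because the kernel grows with the scale.
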
7th International Conference on Embedded and Multimedia Computing (EMC-12)

CERN Document Server

Jeong, Young-Sik; Park, Sang; Chen, Hsing-Chung; Embedded and Multimedia Computing Technology and Service

2012-01-01

The 7th International Conference on Embedded and Multimedia Computing (EMC-12) will be held in Gwangju, Korea on September 6 - 8, 2012. EMC-12 will be the most comprehensive conference focused on the various aspects of advances in Embedded and Multimedia (EM) Computing. EMC-12 will provide an opportunity for academic and industry professionals to discuss the latest issues and progress in the area of EM. In addition, the conference will publish high quality papers which are closely related to the various theories and practical applications in EM. Furthermore, we expect that the conference and its publications will be a trigger for further related research and technology improvements in this important subject. EMC-12 is the next event in a series of highly successful International Conferences on Embedded and Multimedia Computing, previously held as EMC 2011 (China, Aug. 2011), EMC 2010 (Philippines, Aug. 2010), EM-Com 2009 (Korea, Dec. 2009), UMC-08 (Australia, Oct. 2008), ESO-08 (China, Dec. 2008), UMS-08 ...
Air Force Science & Technology Issues & Opportunities Regarding High Performance Embedded Computing

Science.gov (United States)

2009-09-23

price-performance advantage include: large scale simulations of neuromorphic computing models GOTCHA radar video SAR for wide area persistent...the handcuffs were not for me and that the military had so far got … Neuromorphic example: Robust recognition of occluded text Gotcha SAR PCID Image...Architecture 16 cores / chip 10 x 10 stacks / board50 chips / stack EDRAM AFPGA EDRAM AFPGA EDRAM AFPGA EDRAM AFPGA EDRAM AFPGA EDRAM AFPGA EDRAM AFPGA EDRAM
Cluster Computing for Embedded/Real-Time Systems

Science.gov (United States)

Katz, D.; Kepner, J.

1999-01-01

Embedded and real-time systems, like other computing systems, seek to maximize computing power for a given price, and thus can significantly benefit from the advancing capabilities of cluster computing.
DOE research in utilization of high-performance computers

International Nuclear Information System (INIS)

Buzbee, B.L.; Worlton, W.J.; Michael, G.; Rodrigue, G.

1980-12-01

Department of Energy (DOE) and other Government research laboratories depend on high-performance computer systems to accomplish their programmatic goals. As the most powerful computer systems become available, they are acquired by these laboratories so that advances can be made in their disciplines. These advances are often the result of added sophistication to numerical models whose execution is made possible by high-performance computer systems. However, high-performance computer systems have become increasingly complex; consequently, it has become increasingly difficult to realize their potential performance. The result is a need for research on issues related to the utilization of these systems. This report gives a brief description of high-performance computers, and then addresses the use of and future needs for high-performance computers within DOE, the growing complexity of applications within DOE, and areas of high-performance computer systems warranting research. 1 figure
High-performance computing — an overview

Science.gov (United States)

Marksteiner, Peter

1996-08-01

An overview of high-performance computing (HPC) is given. Different types of computer architectures used in HPC are discussed: vector supercomputers, high-performance RISC processors, various parallel computers like symmetric multiprocessors, workstation clusters, massively parallel processors. Software tools and programming techniques used in HPC are reviewed: vectorizing compilers, optimization and vector tuning, optimization for RISC processors; parallel programming techniques like shared-memory parallelism, message passing and data parallelism; and numerical libraries.
High Performance Computing in Science and Engineering '16 : Transactions of the High Performance Computing Center, Stuttgart (HLRS) 2016

CERN Document Server

Kröner, Dietmar; Resch, Michael

2016-01-01

This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS) in 2016. The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.
High-performance computing in seismology

Energy Technology Data Exchange (ETDEWEB)

NONE

1996-09-01

The scientific, technical, and economic importance of the issues discussed here presents a clear agenda for future research in computational seismology. In this way these problems will drive advances in high-performance computing in the field of seismology. There is a broad community that will benefit from this work, including the petroleum industry, research geophysicists, engineers concerned with seismic hazard mitigation, and governments charged with enforcing a comprehensive test ban treaty. These advances may also lead to new applications for seismological research. The recent application of high-resolution seismic imaging of the shallow subsurface for the environmental remediation industry is an example of this activity. This report makes the following recommendations: (1) focused efforts to develop validated documented software for seismological computations should be supported, with special emphasis on scalable algorithms for parallel processors; (2) the education of seismologists in high-performance computing technologies and methodologies should be improved; (3) collaborations between seismologists and computational scientists and engineers should be increased; (4) the infrastructure for archiving, disseminating, and processing large volumes of seismological data should be improved.
High performance computing in Windows Azure cloud

OpenAIRE

Ambruš, Dejan

2013-01-01

High performance, security, availability, scalability, flexibility and lower costs of maintenance have essentially contributed to the growing popularity of cloud computing in all spheres of life, especially in business. In fact cloud computing offers even more than this. With usage of virtual computing clusters a runtime environment for high performance computing can be efficiently implemented also in a cloud. There are many advantages but also some disadvantages of cloud computing, some ...
High Performance Computing in Science and Engineering '02 : Transactions of the High Performance Computing Center

CERN Document Server

Jäger, Willi

2003-01-01

This book presents the state-of-the-art in modeling and simulation on supercomputers. Leading German research groups present their results achieved on high-end systems of the High Performance Computing Center Stuttgart (HLRS) for the year 2002. Reports cover all fields of supercomputing simulation ranging from computational fluid dynamics to computer science. Special emphasis is given to industrially relevant applications. Moreover, by presenting results for both vector systems and microprocessor-based systems, the book allows readers to compare the performance levels and usability of a variety of supercomputer architectures. It therefore becomes an indispensable guidebook for assessing the impact of the Japanese Earth Simulator project on supercomputing in the years to come.
Highly conductive porous Na-embedded carbon nanowalls for high-performance capacitive deionization

Science.gov (United States)

Chang, Liang; Hu, Yun Hang

2018-05-01

Highly conductive porous Na-embedded carbon nanowalls (Na@C), which were recently invented, have exhibited excellent performance for dye-sensitized solar cells and electric double-layer capacitors. In this work, Na@C was demonstrated as an excellent electrode material for capacitive deionization (CDI). In a three-electrode configuration system, the specific capacity of the Na@C electrodes can achieve 306.4 F/g at current density of 0.2 A/g in 1 M NaCl, which is higher than that (235.2 F/g) of activated carbon (AC) electrodes. Furthermore, a high electrosorption capacity of 8.75 mg g-1 in 100 mg/L NaCl was obtained with the Na@C electrodes in a batch-mode capacitive deionization cell. It exceeds the electrosorption capacity (4.08 mg g-1) of AC electrodes. The Na@C electrode also showed a promising cycle stability. The excellent performance of Na@C electrode for capacitive deionization (CDI) can be attributed to its high electrical conductivity and large accessible surface area.
A Heterogeneous High-Performance System for Computational and Computer Science

Science.gov (United States)

2016-11-15

expand the research infrastructure at the institution but also to enhance the high-performance computing training provided to both undergraduate and... cloud computing, supercomputing, and the availability of cheap memory and storage led to enormous amounts of data to be sifted through in forensic... High-Performance Computing (HPC) tools that can be integrated with existing curricula and support our research to modernize and dramatically advance

An integrated compact airborne multispectral imaging system using embedded computer

Science.gov (United States)

Zhang, Yuedong; Wang, Li; Zhang, Xuguo

2015-08-01

An integrated compact airborne multispectral imaging system with an embedded-computer-based control system was developed for multispectral imaging from small aircraft. The system integrates a CMOS camera, a filter wheel with eight filters, a two-axis stabilized platform, a miniature POS (position and orientation system), and an embedded computer. The embedded computer is versatile and expandable, and its small volume and weight suit an airborne platform, so it can meet the control-system requirements of the integrated airborne multispectral imaging system. The embedded computer sets the camera parameters, drives the filter wheel and the stabilized platform, acquires the image and POS data, and stores the images and data. Peripheral devices can be connected through the ports of the embedded computer, so operating the system and managing the stored image data are easy. This airborne multispectral imaging system has the advantages of small volume, multiple functions, and good expandability. The imaging experiment results show that this system has potential for multispectral remote sensing in applications such as resource investigation and environmental monitoring.
Quantum Accelerators for High-performance Computing Systems

Energy Technology Data Exchange (ETDEWEB)

Humble, Travis S. [ORNL]; Britt, Keith A. [ORNL]; Mohiyaddin, Fahd A. [ORNL]

2017-11-01

We define some of the programming and system-level challenges facing the application of quantum processing to high-performance computing. Alongside barriers to physical integration, prominent differences in the execution of quantum and conventional programs challenge the intersection of these computational models. Following a brief overview of the state of the art, we discuss recent advances in programming and execution models for hybrid quantum-classical computing. We discuss a novel quantum-accelerator framework that uses specialized kernels to offload select workloads while integrating with existing computing infrastructure. We elaborate on the role of the host operating system to manage these unique accelerator resources, the prospects for deploying quantum modules, and the requirements placed on the language hierarchy connecting these different system components. We draw on recent advances in the modeling and simulation of quantum computing systems with the development of architectures for hybrid high-performance computing systems and the realization of software stacks for controlling quantum devices. Finally, we present simulation results that describe the expected system-level behavior of high-performance computing systems composed from compute nodes with quantum processing units. We describe performance for these hybrid systems in terms of time-to-solution, accuracy, and energy consumption, and we use simple application examples to estimate the performance advantage of quantum acceleration.
High-performance zig-zag and meander inductors embedded in ferrite material

International Nuclear Information System (INIS)

Stojanovic, Goran; Damnjanovic, Mirjana; Desnica, Vladan; Zivanov, Ljiljana; Raghavendra, Ramesh; Bellew, Pat; Mcloughlin, Neil

2006-01-01

This paper describes the design, modeling, simulation and fabrication of zig-zag and meander inductors embedded in low- or high-permeability soft ferrite material. These microinductors have been developed with ceramic coprocessing technology. We compare the electrical properties of zig-zag and meander inductor structures installed as surface-mount devices. An equivalent model of the new structures is presented, suitable for design, circuit simulations and prediction of the performance of the proposed inductors. The relatively high impedance values allow these microinductors to be used in high-frequency suppressors. The components were tested in the frequency range of 1 MHz-3 GHz using an Agilent 4287A RF LCR meter. The measurements confirm the validity of the analytical model.
Contemporary high performance computing from petascale toward exascale

CERN Document Server

Vetter, Jeffrey S

2013-01-01

Contemporary High Performance Computing: From Petascale toward Exascale focuses on the ecosystems surrounding the world's leading centers for high performance computing (HPC). It covers many of the important factors involved in each ecosystem: computer architectures, software, applications, facilities, and sponsors. The first part of the book examines significant trends in HPC systems, including computer architectures, applications, performance, and software. It discusses the growth from terascale to petascale computing and the influence of the TOP500 and Green500 lists. The second part of the
Multiple Embedded Processors for Fault-Tolerant Computing

Science.gov (United States)

Bolotin, Gary; Watson, Robert; Katanyoutanant, Sunant; Burke, Gary; Wang, Mandy

2005-01-01

A fault-tolerant computer architecture has been conceived in an effort to reduce vulnerability to single-event upsets (spurious bit flips caused by impingement of energetic ionizing particles or photons). As in some prior fault-tolerant architectures, the redundancy needed for fault tolerance is obtained by use of multiple processors in one computer. Unlike prior architectures, the multiple processors are embedded in a single field-programmable gate array (FPGA). What makes this new approach practical is the recent commercial availability of FPGAs that are capable of having multiple embedded processors. A working prototype (see figure) consists of two embedded IBM PowerPC 405 processor cores and a comparator built on a Xilinx Virtex-II Pro FPGA. This relatively simple instantiation of the architecture implements an error-detection scheme. A planned future version, incorporating four processors and two comparators, would correct some errors in addition to detecting them.
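The detect-only behaviour of the two-processor prototype can be illustrated with a minimal sketch. It is not the authors' design: the function names `flip_bit` and `duplex_execute` are hypothetical, the two "cores" are simply two calls to the same Python function, and a single-event upset is modelled as one inverted bit in the second result.

```python
from typing import Callable, Optional, Tuple


def flip_bit(value: int, bit: int) -> int:
    """Model a single-event upset by inverting one bit of an integer."""
    return value ^ (1 << bit)


def duplex_execute(
    task: Callable[[int], int],
    x: int,
    upset_bit: Optional[int] = None,
) -> Tuple[int, bool]:
    """Run the same task on two redundant 'cores' and compare the results.

    Returns (result_from_core_a, mismatch_detected). With only two copies
    the comparator can flag a disagreement but cannot tell which copy is
    wrong, so this scheme detects errors without correcting them.
    """
    result_a = task(x)
    result_b = task(x)
    if upset_bit is not None:
        # Inject a simulated bit flip into the second core's result.
        result_b = flip_bit(result_b, upset_bit)
    return result_a, result_a != result_b


def triple(v: int) -> int:
    """Example workload, chosen arbitrarily for the illustration."""
    return v * 3


clean = duplex_execute(triple, 7)
upset = duplex_execute(triple, 7, upset_bit=2)
```

Here `clean` reports no mismatch and `upset` reports one. Correcting an error rather than only detecting it needs additional redundancy, which is consistent with the planned four-processor, two-comparator version mentioned in the abstract.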
High performance computing in linear control

International Nuclear Information System (INIS)

Datta, B.N.

1993-01-01

Remarkable progress has been made in both theory and applications of all important areas of control. The theory is rich and very sophisticated. Some beautiful applications of control theory are presently being made in aerospace, biomedical engineering, industrial engineering, robotics, economics, power systems, etc. Unfortunately, the same assessment of progress does not hold in general for computations in control theory. Control Theory is lagging behind other areas of science and engineering in this respect. Nowadays there is a revolution going on in the world of high performance scientific computing. Many powerful computers with vector and parallel processing have been built and have been available in recent years. These supercomputers offer very high speed in computations. Highly efficient software, based on powerful algorithms, has been developed to use on these advanced computers, and has also contributed to increased performance. While workers in many areas of science and engineering have taken great advantage of these hardware and software developments, control scientists and engineers, unfortunately, have not been able to take much advantage of these developments
High Performance Embedded System for Real-Time Pattern Matching

CERN Document Server

Sotiropoulou, Calliope Louisa; The ATLAS collaboration; Gkaitatzis, Stamatios; Citraro, Saverio; Giannetti, Paola; Dell'Orso, Mauro

2016-01-01

In this paper we present an innovative and high performance embedded system for real-time pattern matching. This system is based on the evolution of hardware and algorithms developed for the field of High Energy Physics (HEP) and more specifically for the execution of extremely fast pattern matching for tracking of particles produced by proton-proton collisions in hadron collider experiments. A miniaturised version of this complex system is being developed for pattern matching in generic image processing applications. The system works as a contour identifier able to extract the salient features of an image. It is based on the principles of cognitive image processing, which means that it executes fast pattern matching and data reduction mimicking the operation of the human brain. The pattern matching can be executed by a custom designed Associative Memory (AM) chip. The reference patterns are chosen by a complex training algorithm implemented on an FPGA device. Post processing algorithms (e.g. pixel clustering...
Messaging Performance of FIPA Interaction Protocols in Networked Embedded Controllers

Directory of Open Access Journals (Sweden)

García, José A. Pérez

2008-01-01

Agent-based technologies in production control systems could facilitate seamless reconfiguration and integration of mechatronic devices/modules into systems. Advances in embedded controllers, whose computational capabilities are continuously improving, allow for software modularization and distribution of decisions. Agent platforms running on embedded controllers could hide the complexity of bootstrap and communication. Therefore, it is important to investigate the messaging performance of agents whose main motivation is resource allocation in manufacturing systems (i.e., conveyor systems). The tests were implemented using the FIPA-compliant JADE-LEAP agent platform. Agent containers were distributed through networked embedded controllers, and agents communicated using the request and contract-net FIPA interaction protocols. The test scenarios are organized into intercontainer and intracontainer communications. The work shows the messaging performance for the different test scenarios using both interaction protocols.
Messaging Performance of FIPA Interaction Protocols in Networked Embedded Controllers

Directory of Open Access Journals (Sweden)

Omar Jehovani López Orozco

2007-12-01

Agent-based technologies in production control systems could facilitate seamless reconfiguration and integration of mechatronic devices/modules into systems. Advances in embedded controllers, whose computational capabilities are continuously improving, allow for software modularization and distribution of decisions. Agent platforms running on embedded controllers could hide the complexity of bootstrap and communication. Therefore, it is important to investigate the messaging performance of agents whose main motivation is resource allocation in manufacturing systems (i.e., conveyor systems). The tests were implemented using the FIPA-compliant JADE-LEAP agent platform. Agent containers were distributed through networked embedded controllers, and agents communicated using the request and contract-net FIPA interaction protocols. The test scenarios are organized into intercontainer and intracontainer communications. The work shows the messaging performance for the different test scenarios using both interaction protocols.
High Performance Computing in Science and Engineering '14

CERN Document Server

Kröner, Dietmar; Resch, Michael

2015-01-01

This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS). The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.
A Trusted Computing Architecture of Embedded System Based on Improved TPM

Directory of Open Access Journals (Sweden)

Wang Xiaosheng

2017-01-01

The Trusted Platform Module (TPM) currently used by PCs is not suitable for embedded systems, so the existing TPM needs to be improved. The paper proposes a trusted computing architecture for embedded systems that combines a new TPM with the cryptographic system developed by China. The improved TPM consists of the Embedded System Trusted Cryptography Module (eTCM) and the Embedded System Trusted Platform Control Module (eTPCM). Together they implement the TPM's autonomous control, active defense, high-speed encryption/decryption and other functions through an internal bus arbitration module and symmetric and asymmetric cryptographic engines, effectively protecting the security of the embedded system. The improved TPM uses a trusted measurement method that combines a chain model and a star model. Finally, the improved TPM is implemented on an FPGA and applied to a trusted PDA for experimental verification. Experiments show that the trusted architecture of the embedded system based on the improved TPM is efficient, reliable and secure.
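As a rough illustration of the two measurement models named in the abstract, the sketch below contrasts a chained measurement with a star-shaped one. Everything in it is an assumption made for illustration: the abstract does not give the algorithms, SHA-256 stands in for the unspecified Chinese cryptographic primitives, and the names `extend`, `measure_chain` and `measure_star` are hypothetical. The chain follows the familiar "new register = H(old register || H(component))" pattern; the star variant assumes a single root that hashes every component independently.

```python
import hashlib
from typing import Dict, List, Tuple


def extend(register: bytes, component: bytes) -> bytes:
    """Fold one component into the register: H(register || H(component))."""
    digest = hashlib.sha256(component).digest()
    return hashlib.sha256(register + digest).digest()


def measure_chain(components: List[bytes]) -> bytes:
    """Chain model: each stage is measured on top of the previous result,
    so the final value depends on every component and on their order."""
    register = bytes(32)  # all-zero start value, as an assumption
    for component in components:
        register = extend(register, component)
    return register


def measure_star(components: List[Tuple[str, bytes]]) -> Dict[str, str]:
    """Star model (assumed reading): one root measures each component
    independently, so a changed component affects only its own entry."""
    return {name: hashlib.sha256(data).hexdigest() for name, data in components}


good = measure_chain([b"bootloader", b"kernel", b"app"])
tampered = measure_chain([b"bootloader", b"kernel-patched", b"app"])
star = measure_star([("bootloader", b"bootloader"), ("kernel", b"kernel")])
```

In the chain, `good` and `tampered` differ even though only the middle component changed, which is what lets a verifier detect tampering from a single final value. The star result instead shows which individual component changed.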
Embedding Moodle into Ubiquitous Computing Environments

NARCIS (Netherlands)

Glahn, Christian; Specht, Marcus

2010-01-01

Glahn, C., & Specht, M. (2010). Embedding Moodle into Ubiquitous Computing Environments. In M. Montebello, et al. (Eds.), 9th World Conference on Mobile and Contextual Learning (MLearn2010) (pp. 100-107). October, 19-22, 2010, Valletta, Malta.
High-performance computing for airborne applications

International Nuclear Information System (INIS)

Quinn, Heather M.; Manuzatto, Andrea; Fairbanks, Tom; Dallmann, Nicholas; Desgeorges, Rose

2010-01-01

Recently, there have been attempts to move common satellite tasks to unmanned aerial vehicles (UAVs). UAVs are significantly cheaper to buy than satellites and easier to deploy on an as-needed basis. The more benign radiation environment also allows for an aggressive adoption of state-of-the-art commercial computational devices, which increases the amount of data that can be collected. There are a number of commercial computing devices currently available that are well-suited to high-performance computing. These devices range from specialized computational devices, such as field-programmable gate arrays (FPGAs) and digital signal processors (DSPs), to traditional computing platforms, such as microprocessors. Even though the radiation environment is relatively benign, these devices could be susceptible to single-event effects. In this paper, we will present radiation data for high-performance computing devices in an accelerated neutron environment. These devices include a multi-core digital signal processor, two field-programmable gate arrays, and a microprocessor. From these results, we found that all of these devices are suitable for many airplane environments without reliability problems.
Department of Energy research in utilization of high-performance computers

International Nuclear Information System (INIS)

Buzbee, B.L.; Worlton, W.J.; Michael, G.; Rodrigue, G.

1980-08-01

Department of Energy (DOE) and other Government research laboratories depend on high-performance computer systems to accomplish their programmatic goals. As the most powerful computer systems become available, they are acquired by these laboratories so that advances can be made in their disciplines. These advances are often the result of added sophistication to numerical models, the execution of which is made possible by high-performance computer systems. However, high-performance computer systems have become increasingly complex, and consequently it has become increasingly difficult to realize their potential performance. The result is a need for research on issues related to the utilization of these systems. This report gives a brief description of high-performance computers, and then addresses the use of and future needs for high-performance computers within DOE, the growing complexity of applications within DOE, and areas of high-performance computer systems warranting research. 1 figure
Computing in high energy physics

International Nuclear Information System (INIS)

Hertzberger, L.O.; Hoogland, W.

1986-01-01

This book deals with advanced computing applications in physics, and in particular in high energy physics environments. The main subjects covered are networking; vector and parallel processing; and embedded systems. Also examined are topics such as operating systems, future computer architectures and commercial computer products. The book presents solutions that are expected to cope, in the future, with computing problems in experimental and theoretical High Energy Physics. In the experimental environment the large amounts of data to be processed pose special problems on-line as well as off-line. For on-line data reduction, embedded special-purpose computers, which are often used for trigger applications, are applied. For off-line processing, parallel computers such as emulator farms and the cosmic cube may be employed. The analysis of these topics is therefore a main feature of this volume.
Optical interconnection networks for high-performance computing systems

International Nuclear Information System (INIS)

Biberman, Aleksandr; Bergman, Keren

2012-01-01

Enabled by silicon photonic technology, optical interconnection networks have the potential to be a key disruptive technology in computing and communication industries. The enduring pursuit of performance gains in computing, combined with stringent power constraints, has fostered the ever-growing computational parallelism associated with chip multiprocessors, memory systems, high-performance computing systems and data centers. Sustaining this growth in parallelism introduces unique challenges for on- and off-chip communications, shifting the focus toward novel and fundamentally different communication approaches. Chip-scale photonic interconnection networks, enabled by high-performance silicon photonic devices, offer unprecedented bandwidth scalability with reduced power consumption. We demonstrate that the silicon photonic platforms have already produced all the high-performance photonic devices required to realize these types of networks. Through extensive empirical characterization in much of our work, we demonstrate the feasibility of waveguides, modulators, switches and photodetectors. We also demonstrate systems that simultaneously combine many functionalities to achieve more complex building blocks. We propose novel silicon photonic devices, subsystems, network topologies and architectures to enable unprecedented performance of these photonic interconnection networks. Furthermore, the advantages of photonic interconnection networks extend far beyond the chip, offering advanced communication environments for memory systems, high-performance computing systems, and data centers. (review article)
High-performance liquid chromatography separation of unsaturated organic compounds by a monolithic silica column embedded with silver nanoparticles.

Science.gov (United States)

Zhu, Yang; Morisato, Kei; Hasegawa, George; Moitra, Nirmalya; Kiyomura, Tsutomu; Kurata, Hiroki; Kanamori, Kazuyoshi; Nakanishi, Kazuki

2015-08-01

The optimization of a porous structure to ensure good separation performance is always a significant issue in high-performance liquid chromatography column design. Recently we reported the homogeneous embedment of Ag nanoparticles in a periodic mesoporous silica monolith and the application of this Ag-nanoparticle-embedded silica monolith to the high-performance liquid chromatography separation of polyaromatic hydrocarbons. However, the separation performance remains to be improved and the retention mechanism, as compared with the Ag ion high-performance liquid chromatography technique, still needs to be clarified. In this research, Ag nanoparticles were introduced into a macro/mesoporous silica monolith with optimized pore parameters for high-performance liquid chromatography separations. Baseline separation of benzene, naphthalene, anthracene, and pyrene was achieved with a theoretical plate number of 36,000 m⁻¹ for naphthalene. The separation function was further extended to cis/trans isomers of aromatic compounds, with cis/trans stilbenes chosen as a benchmark. Good separation of cis/trans-stilbene was obtained, with a separation factor of 7 and a theoretical plate number of 76,000 m⁻¹ for cis-stilbene. The trans isomer, however, is retained more strongly, which contradicts the long-established retention rule of Ag ion chromatography. Such behavior of Ag nanoparticles embedded in a silica column can be attributed to the differences in the molecular geometric configuration of cis/trans stilbenes. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Tools for Embedded Computing Systems Software

Science.gov (United States)

1978-01-01

A workshop was held to assess the state of tools for embedded systems software and to determine directions for tool development. A synopsis of the talk and the key figures of each workshop presentation, together with chairmen summaries, are presented. The presentations covered four major areas: (1) tools and the software environment (development and testing); (2) tools and software requirements, design, and specification; (3) tools and language processors; and (4) tools and verification and validation (analysis and testing). The utility and contribution of existing tools and research results for the development and testing of embedded computing systems software are described and assessed.
Debugging a high performance computing program

Science.gov (United States)

Gooding, Thomas M.

2013-08-20

Methods, apparatus, and computer program products are disclosed for debugging a high performance computing program by gathering lists of addresses of calling instructions for a plurality of threads of execution of the program, assigning the threads to groups in dependence upon the addresses, and displaying the groups to identify defective threads.
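The grouping step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the patented method: the per-thread address lists are supplied as plain tuples rather than gathered from a running program, the names `group_threads` and `outlier_groups` are hypothetical, and treating small groups as suspects is a heuristic added here.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

CallPath = Tuple[int, ...]


def group_threads(call_paths: Dict[int, CallPath]) -> Dict[CallPath, List[int]]:
    """Group thread IDs by their list of calling-instruction addresses."""
    groups: Dict[CallPath, List[int]] = defaultdict(list)
    for thread_id, path in sorted(call_paths.items()):
        groups[tuple(path)].append(thread_id)
    return dict(groups)


def outlier_groups(
    groups: Dict[CallPath, List[int]], max_size: int = 1
) -> Dict[CallPath, List[int]]:
    """Assumed heuristic: groups with very few threads are the ones whose
    call path differs from the bulk of the job, so display them first."""
    return {path: ids for path, ids in groups.items() if len(ids) <= max_size}


# Made-up addresses: three threads wait at the same place, one is elsewhere.
sample = {
    0: (0x4005D0, 0x400A10),
    1: (0x4005D0, 0x400A10),
    2: (0x4005D0, 0x400B44),
    3: (0x4005D0, 0x400A10),
}
groups = group_threads(sample)
suspects = outlier_groups(groups)

for path, ids in groups.items():
    print([hex(address) for address in path], "threads:", ids)
```

With thousands of threads, collapsing identical call paths into groups reduces the display to a handful of lines, and the thread that took a different path stands out.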
High-performance scientific computing in the cloud

Science.gov (United States)

Jorissen, Kevin; Vila, Fernando; Rehr, John

2011-03-01

Cloud computing has the potential to open up high-performance computational science to a much broader class of researchers, owing to its ability to provide on-demand, virtualized computational resources. However, before such approaches can become commonplace, user-friendly tools must be developed that hide the unfamiliar cloud environment and streamline the management of cloud resources for many scientific applications. We have recently shown that high-performance cloud computing is feasible for parallelized x-ray spectroscopy calculations. We now present benchmark results for a wider selection of scientific applications focusing on electronic structure and spectroscopic simulation software in condensed matter physics. These applications are driven by an improved portable interface that can manage virtual clusters and run various applications in the cloud. We also describe a next generation of cluster tools, aimed at improved performance and a more robust cluster deployment. Supported by NSF grant OCI-1048052.

Rad-hard embedded computers for nuclear robotics

International Nuclear Information System (INIS)

Giraud, A.; Joffre, F.; Marceau, M.; Robiolle, M.; Brunet, J.P.; Mijuin, D.

1994-01-01

Nuclear industries require robots with embedded rad-hard electronics and high reliability. The SYROCO research program made it possible to produce efficient industrial prototypes, built according to the MICADO architecture, and to design the CADMOS architecture. The MICADO architecture uses the self-healing property that CMOS circuits exhibit when they are switched off during irradiation. (D.L.). 8 refs., 5 figs
Research Activity in Computational Physics utilizing High Performance Computing: Co-authorship Network Analysis

Science.gov (United States)

Ahn, Sul-Ah; Jung, Youngim

2016-10-01

The research activities of computational physicists who use high-performance computing are analyzed by bibliometric approaches. This study aims to provide these computational physicists and policy planners with useful bibliometric results for an assessment of research activities. To this end, we carried out a co-authorship network analysis of journal articles to assess the research activities of researchers in high-performance computational physics as a case study. For this study, we used journal articles from Elsevier's Scopus database covering the period 2004-2013. We ranked authors in the physics field who use high-performance computing by the number of papers published during the ten years from 2004. Finally, we drew the co-authorship network for the 45 top authors and their coauthors, and described some features of the co-authorship network in relation to the author rank. Suggestions for further studies are discussed.
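The two quantities the study relies on, an author rank by paper count and a co-authorship network, can be computed as in the sketch below. It is a minimal illustration under assumptions: the input is a hand-made list of author lists rather than Scopus records, the author names are placeholders, and the function names `author_rank` and `coauthor_edges` are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import List, Tuple


def author_rank(papers: List[List[str]]) -> Counter:
    """Count papers per author; each author counts once per paper."""
    counts: Counter = Counter()
    for authors in papers:
        counts.update(set(authors))
    return counts


def coauthor_edges(papers: List[List[str]]) -> Counter:
    """Build weighted, undirected co-authorship edges.

    The weight of an edge is the number of papers two authors share.
    Each pair is stored in sorted order so (A, B) and (B, A) coincide.
    """
    edges: Counter = Counter()
    for authors in papers:
        for pair in combinations(sorted(set(authors)), 2):
            edges[pair] += 1
    return edges


# Placeholder data: three papers by three made-up authors.
papers = [["A", "B"], ["A", "C"], ["A", "B", "C"]]
rank = author_rank(papers)
edges = coauthor_edges(papers)
top_author: Tuple[str, int] = rank.most_common(1)[0]
```

Restricting the edge list to the highest-ranked authors and their coauthors gives the kind of sub-network the abstract describes for its 45 top authors.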
Disclosive Computer Ethics: The Exposure and Evaluation of Embedded Normativity in Computer Technology.

NARCIS (Netherlands)

Brey, Philip A.E.

2000-01-01

This essay provides a critique of mainstream computer ethics and argues for the importance of a complementary approach called disclosive computer ethics, which is concerned with the moral deciphering of embedded values and norms in computer systems, applications and practices. Also, four key values
A high-performance, flexible and robust metal nanotrough-embedded transparent conducting film for wearable touch screen panels

Science.gov (United States)

Im, Hyeon-Gyun; An, Byeong Wan; Jin, Jungho; Jang, Junho; Park, Young-Geun; Park, Jang-Ung; Bae, Byeong-Soo

2016-02-01

We report a high-performance, flexible and robust metal nanotrough-embedded transparent conducting hybrid film (metal nanotrough-GFRHybrimer). Using an electro-spun polymer nanofiber web as a template and vacuum-deposited gold as a conductor, a junction resistance-free continuous metal nanotrough network is formed. Subsequently, the metal nanotrough is embedded on the surface of a glass-fabric reinforced composite substrate (GFRHybrimer). The monolithic composite structure of our transparent conducting film allows simultaneously high thermal stability (24 h at 250 °C in air), a smooth surface topography (Rrms touch screen panel (TSP) is fabricated using the transparent conducting films. The flexible TSP device stably operates on the back of a human hand and on a wristband. Electronic supplementary information (ESI) available. See DOI: 10.1039/c5nr07657a
Implementing an Affordable High-Performance Computing for Teaching-Oriented Computer Science Curriculum

Science.gov (United States)

Abuzaghleh, Omar; Goldschmidt, Kathleen; Elleithy, Yasser; Lee, Jeongkyu

2013-01-01

With the advances in computing power, high-performance computing (HPC) platforms have had an impact on not only scientific research in advanced organizations but also computer science curriculum in the educational community. For example, multicore programming and parallel systems are highly desired courses in the computer science major. However,…
High performance parallel computers for science

International Nuclear Information System (INIS)

Nash, T.; Areti, H.; Atac, R.; Biel, J.; Cook, A.; Deppe, J.; Edel, M.; Fischler, M.; Gaines, I.; Hance, R.

1989-01-01

This paper reports that Fermilab's Advanced Computer Program (ACP) has been developing cost-effective, yet practical, parallel computers for high energy physics since 1984. The ACP's latest developments are proceeding in two directions. A Second Generation ACP Multiprocessor System for experiments will include $3500 RISC processors, each with performance over 15 VAX MIPS. To support such high performance, the new system allows parallel I/O, parallel interprocess communication, and parallel host processes. The ACP Multi-Array Processor has been developed for theoretical physics. Each $4000 node is a FORTRAN or C programmable pipelined 20 Mflops (peak), 10 MByte single board computer. These are plugged into a 16 port crossbar switch crate which handles both inter- and intra-crate communication. The crates are connected in a hypercube. Site oriented applications like lattice gauge theory are supported by system software called CANOPY, which makes the hardware virtually transparent to users. A 256 node, 5 GFlop system is under construction.
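The aggregate figure quoted in this abstract can be checked from the per-node numbers. The sketch below uses only values stated in the abstract (256 nodes, 20 Mflops peak and $4000 per node); the variable names are illustrative, and the cost figure covers nodes only, since the abstract gives no price for crates or switches.

```python
# Values taken from the abstract.
nodes = 256
peak_mflops_per_node = 20
node_cost_usd = 4000

# Aggregate peak rate: 256 * 20 Mflops = 5120 Mflops, i.e. about 5 GFlops.
total_peak_gflops = nodes * peak_mflops_per_node / 1000

# Node-only cost per unit of peak performance (crates and switches excluded).
usd_per_peak_mflops = node_cost_usd / peak_mflops_per_node

print(total_peak_gflops)
print(usd_per_peak_mflops)
```

The result, 5.12 GFlops peak, is consistent with the "5 GFlop" system size given in the abstract.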
Fabrication of highly dispersed ZnO nanoparticles embedded in graphene nanosheets for high performance supercapacitors

International Nuclear Information System (INIS)

Fang, Linxia; Zhang, Baoliang; Li, Wei; Zhang, Jizhong; Huang, Kejing; Zhang, Qiuyu

2014-01-01

We report a facile strategy to synthesize ZnO-graphene nanocomposites as an advanced electrode material for high-performance supercapacitors. The ZnO-graphene nanocomposites have been fabricated via a facile, low-temperature in situ wet chemistry process. During this process, highly dispersed ZnO nanoparticles are embedded in graphene nanosheets, leading to sandwich-structured ZnO-graphene nanocomposites. Thus, intimate interfacial contact between ZnO nanoparticles and graphene nanosheets is achieved, which facilitates electrochemical activity and enhances electrochemical properties due to fast electron transfer. The as-prepared ZnO-graphene nanocomposites exhibit a maximum specific capacitance of 786 F g⁻¹ and excellent cycle life with capacity retention of about 92% after 500 cycles. This facile design and rational synthesis offer an effective strategy to enhance the electrochemical performance of supercapacitors and show promising potential for large-scale application in energy storage.
High Performance Computing in Science and Engineering '99 : Transactions of the High Performance Computing Center

CERN Document Server

Jäger, Willi

2000-01-01

The book contains reports about the most significant projects from science and engineering of the Federal High Performance Computing Center Stuttgart (HLRS). They were carefully selected in a peer-review process and are showcases of an innovative combination of state-of-the-art modeling, novel algorithms and the use of leading-edge parallel computer technology. The projects of HLRS are using supercomputer systems operated jointly by university and industry and therefore a special emphasis has been put on the industrial relevance of results and methods.
Fault tolerant embedded computers and power electronics for nuclear robotics

International Nuclear Information System (INIS)

Giraud, A.; Robiolle, M.

1995-01-01

For requirements of nuclear industries, it is necessary to use embedded rad-tolerant electronics and high-level safety. In this paper, we first describe a computer architecture called MICADO designed for French nuclear industry. We then present outgoing projects on our industry. A special point is made on power electronics for remote-operated and legged robots. (authors). 7 refs., 2 figs
Fault tolerant embedded computers and power electronics for nuclear robotics

Energy Technology Data Exchange (ETDEWEB)

Giraud, A.; Robiolle, M.

1995-12-31

For requirements of nuclear industries, it is necessary to use embedded rad-tolerant electronics and high-level safety. In this paper, we first describe a computer architecture called MICADO designed for French nuclear industry. We then present outgoing projects on our industry. A special point is made on power electronics for remote-operated and legged robots. (authors). 7 refs., 2 figs.
A High-Efficiency Wind Energy Harvester for Autonomous Embedded Systems.

Science.gov (United States)

Brunelli, Davide

2016-03-04

Energy harvesting is currently a hot research topic, mainly as a consequence of the increasing attractiveness of computing and sensing solutions based on small, low-power distributed embedded systems. Harvesting may enable systems to operate in a deploy-and-forget mode, particularly when a power grid is absent and the use of rechargeable batteries is unattractive due to their limited lifetime and maintenance requirements. This paper focuses on wind flow as an energy source feasible to meet the energy needs of a small autonomous embedded system. In particular, the contribution is on the electrical converter and system integration. We characterize the micro-wind turbine, we define a detailed model of its behaviour, and then we focus on a highly efficient circuit to convert wind energy into electrical energy. The optimized design features an overall volume smaller than 64 cm³. The core of the harvester is a high-efficiency buck-boost converter which performs optimal power point tracking. Experimental results show that the wind generator boosts efficiency over a wide range of operating conditions.
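The abstract does not say which tracking algorithm the converter uses. As an illustration of the general idea of power point tracking only, the sketch below uses perturb-and-observe, a common textbook technique swapped in here as an assumption, not the author's circuit. The function names, the duty-cycle control variable, and the toy power curve are all hypothetical.

```python
def perturb_and_observe(power_at, duty=0.2, step=0.02, iterations=100):
    """Hill-climb a converter duty cycle toward the maximum-power point.

    `power_at(duty)` returns the harvested power at a given duty cycle.
    The duty cycle is nudged by `step`; the direction is reversed whenever
    the last nudge reduced the measured power.
    """
    prev_power = power_at(duty)
    direction = 1
    for _ in range(iterations):
        # Perturb, keeping the duty cycle inside its physical range.
        duty = min(max(duty + direction * step, 0.0), 1.0)
        power = power_at(duty)
        if power < prev_power:
            direction = -direction  # last move lost power: turn around
        prev_power = power
    return duty


def toy_power_curve(duty):
    """Hypothetical single-peak power curve with its maximum at duty = 0.6."""
    return 1.0 - (duty - 0.6) ** 2


best = perturb_and_observe(toy_power_curve)
print(round(best, 2))
```

With a fixed step the tracker never settles exactly; it oscillates within a couple of steps of the peak, which is the usual trade-off between tracking speed and steady-state ripple in this technique.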
A High-Efficiency Wind Energy Harvester for Autonomous Embedded Systems

Science.gov (United States)

Brunelli, Davide

2016-01-01

Energy harvesting is currently a hot research topic, mainly as a consequence of the increasing attractiveness of computing and sensing solutions based on small, low-power distributed embedded systems. Harvesting may enable systems to operate in a deploy-and-forget mode, particularly when a power grid is absent and the use of rechargeable batteries is unattractive due to their limited lifetime and maintenance requirements. This paper focuses on wind flow as an energy source feasible to meet the energy needs of a small autonomous embedded system. In particular, the contribution is on the electrical converter and system integration. We characterize the micro-wind turbine, we define a detailed model of its behaviour, and then we focus on a highly efficient circuit to convert wind energy into electrical energy. The optimized design features an overall volume smaller than 64 cm³. The core of the harvester is a high-efficiency buck-boost converter which performs optimal power point tracking. Experimental results show that the wind generator boosts efficiency over a wide range of operating conditions. PMID:26959018
A High-Efficiency Wind Energy Harvester for Autonomous Embedded Systems

Directory of Open Access Journals (Sweden)

Davide Brunelli

2016-03-01

Energy harvesting is currently a hot research topic, mainly as a consequence of the increasing attractiveness of computing and sensing solutions based on small, low-power distributed embedded systems. Harvesting may enable systems to operate in a deploy-and-forget mode, particularly when a power grid is absent and the use of rechargeable batteries is unattractive due to their limited lifetime and maintenance requirements. This paper focuses on wind flow as an energy source feasible to meet the energy needs of a small autonomous embedded system. In particular, the contribution is on the electrical converter and system integration. We characterize the micro-wind turbine, we define a detailed model of its behaviour, and then we focus on a highly efficient circuit to convert wind energy into electrical energy. The optimized design features an overall volume smaller than 64 cm³. The core of the harvester is a high-efficiency buck-boost converter which performs optimal power point tracking. Experimental results show that the wind generator boosts efficiency over a wide range of operating conditions.
Design of embedded control system for high-power tetrode modulator

International Nuclear Information System (INIS)

Tu Rui; Yao Lieying; Xuan Weimin

2010-01-01

The design of embedded control system for the high-power tetrode modulator and its test results are given. This control system is a closed-loop feedback system based on the DSP and embedded into the high-voltage modulator. A new modified method of VF fiber transmission is used in the embedded control system. The new method improves the speed of the transmission of feedback system. The results of the experiment demonstrate that the embedded feedback control system greatly increases the response speed of the whole system and improves the performance of the high-power tetrode on the HL-2A tokamak. This embedded feedback control system greatly simplifies the complexity of the original centralized control system. The operation of the control system is reliable. (authors)
GPU-based high-performance computing for radiation therapy

International Nuclear Information System (INIS)

Jia, Xun; Jiang, Steve B; Ziegenhein, Peter

2014-01-01

Recent developments in radiotherapy demand high computational power to solve challenging problems in a timely fashion in a clinical environment. The graphics processing unit (GPU), as an emerging high-performance computing platform, has been introduced to radiotherapy. It is particularly attractive due to its high computational power, small size, and low cost for facility deployment and maintenance. Over the past few years, GPU-based high-performance computing in radiotherapy has experienced rapid developments. A tremendous number of studies have been conducted, in which large acceleration factors compared with the conventional CPU platform have been observed. In this paper, we will first give a brief introduction to the GPU hardware structure and programming model. We will then review the current applications of GPU in major imaging-related and therapy-related problems encountered in radiotherapy. A comparison of GPU with other platforms will also be presented. (topical review)
Performance Aspects of Synthesizable Computing Systems

DEFF Research Database (Denmark)

Schleuniger, Pascal

Embedded systems are used in a broad range of applications that demand high performance within severely constrained mechanical, power, and cost requirements. Embedded systems implemented in ASIC technology tend to provide the highest performance, lowest power consumption and lowest unit cost. How...
High Performance Computing in Science and Engineering '98 : Transactions of the High Performance Computing Center

CERN Document Server

Jäger, Willi

1999-01-01

The book contains reports about the most significant projects from science and industry that are using the supercomputers of the Federal High Performance Computing Center Stuttgart (HLRS). These projects are from different scientific disciplines, with a focus on engineering, physics and chemistry. They were carefully selected in a peer-review process and are showcases for an innovative combination of state-of-the-art physical modeling, novel algorithms and the use of leading-edge parallel computer technology. As HLRS is in close cooperation with industrial companies, special emphasis has been put on the industrial relevance of results and methods.
High-Performance Java Codes for Computational Fluid Dynamics

Science.gov (United States)

Riley, Christopher; Chatterjee, Siddhartha; Biswas, Rupak; Biegel, Bryan (Technical Monitor)

2001-01-01

The computational science community is reluctant to write large-scale computationally intensive applications in Java due to concerns over Java's poor performance, despite the claimed software engineering advantages of its object-oriented features. Naive Java implementations of numerical algorithms can perform poorly compared to corresponding Fortran or C implementations. To achieve high performance, Java applications must be designed with good performance as a primary goal. This paper presents the object-oriented design and implementation of two real-world applications from the field of Computational Fluid Dynamics (CFD): a finite-volume fluid flow solver (LAURA, from NASA Langley Research Center), and an unstructured mesh adaptation algorithm (2D_TAG, from NASA Ames Research Center). This work builds on our previous experience with the design of high-performance numerical libraries in Java. We examine the performance of the applications using the currently available Java infrastructure and show that the Java version of the flow solver LAURA performs almost within a factor of 2 of the original procedural version. Our Java version of the mesh adaptation algorithm 2D_TAG performs within a factor of 1.5 of its original procedural version on certain platforms. Our results demonstrate that object-oriented software design principles are not necessarily inimical to high performance.
Open source acceleration of wave optics simulations on energy efficient high-performance computing platforms

Science.gov (United States)

Beck, Jeffrey; Bos, Jeremy P.

2017-05-01

We compare several modifications to the open-source wave optics package, WavePy, intended to improve execution time. Specifically, we compare the relative performance of the Intel MKL, a CPU-based OpenCV distribution, and a GPU-based version. Performance is compared between distributions both on the same compute platform and between a fully-featured computing workstation and the NVIDIA Jetson TX1 platform. Comparisons are drawn in terms of both execution time and power consumption. We have found that substituting the Fast Fourier Transform operation from OpenCV provides a marked improvement on all platforms. In addition, we show that embedded platforms offer some possibility for extensive improvement in terms of efficiency compared to a fully featured workstation.
Soft-error tolerance and energy consumption evaluation of embedded computer with magnetic random access memory in practical systems using computer simulations

Science.gov (United States)

Nebashi, Ryusuke; Sakimura, Noboru; Sugibayashi, Tadahiko

2017-08-01

We evaluated the soft-error tolerance and energy consumption of an embedded computer with magnetic random access memory (MRAM) using two computer simulators. One is a central processing unit (CPU) simulator of a typical embedded computer system. We simulated the radiation-induced single-event-upset (SEU) probability in a spin-transfer-torque MRAM cell and also the failure rate of a typical embedded computer due to its main memory SEU error. The other is a delay tolerant network (DTN) system simulator. It simulates the power dissipation of wireless sensor network nodes of the system using a revised CPU simulator and a network simulator. We demonstrated that the SEU effect on the embedded computer with 1 Gbit MRAM-based working memory is less than 1 failure in time (FIT). We also demonstrated that the energy consumption of the DTN sensor node with MRAM-based working memory can be reduced to 1/11. These results indicate that MRAM-based working memory enhances the disaster tolerance of embedded computers.
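For readers unfamiliar with the unit used in this abstract, 1 FIT is one failure per 10^9 device-hours. The sketch below converts a FIT rate into a mean time between failures and an expected failure count; it assumes a constant failure rate, and the function names and the fleet size in the example are hypothetical, not figures from the paper.

```python
DEVICE_HOURS_PER_FIT = 1e9  # 1 FIT = 1 failure per 10**9 device-hours


def fit_to_mtbf_hours(fit):
    """Mean time between failures, in hours, for a constant failure rate."""
    return DEVICE_HOURS_PER_FIT / fit


def expected_failures(fit, devices, hours):
    """Expected number of failures across `devices` units run for `hours`."""
    return fit * devices * hours / DEVICE_HOURS_PER_FIT


# Example: the 1 FIT bound from the abstract, applied to a hypothetical
# fleet of 1000 devices running for one year (8760 hours).
print(fit_to_mtbf_hours(1.0))
print(expected_failures(1.0, 1000, 8760))
```

At 1 FIT, such a fleet would be expected to see fewer than 0.01 failures per year, which gives a sense of scale for the bound reported above.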

Quantum Accelerators for High-Performance Computing Systems

OpenAIRE

Britt, Keith A.; Mohiyaddin, Fahd A.; Humble, Travis S.

2017-01-01

We define some of the programming and system-level challenges facing the application of quantum processing to high-performance computing. Alongside barriers to physical integration, prominent differences in the execution of quantum and conventional programs challenge the intersection of these computational models. Following a brief overview of the state of the art, we discuss recent advances in programming and execution models for hybrid quantum-classical computing. We discuss a novel quantu...
Visualization and Data Analysis for High-Performance Computing

Energy Technology Data Exchange (ETDEWEB)

Sewell, Christopher Meyer [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

2016-09-27

This is a set of slides from a guest lecture for a class at the University of Texas, El Paso on visualization and data analysis for high-performance computing. The topics covered are the following: trends in high-performance computing; scientific visualization, such as OpenGL, ray tracing and volume rendering, VTK, and ParaView; data science at scale, such as in-situ visualization, image databases, distributed memory parallelism, shared memory parallelism, VTK-m, "big data", and then an analysis example.
Very High-Performance Embedded Computing Will Allow Ambitious Space Science Investigation

National Research Council Canada - National Science Library

Pignol, Michel

2005-01-01

.... developed on radiation tolerant technologies. Unfortunately, the microprocessors today available on such technologies have the computing throughput which was available about 10 years ago on the commercial market...
High performance computing on vector systems

CERN Document Server

Roller, Sabine

2008-01-01

Presents the developments in high-performance computing and simulation on modern supercomputer architectures. This book covers trends in hardware and software development in general and specifically the vector-based systems and heterogeneous architectures. It presents innovative fields like coupled multi-physics or multi-scale simulations.
High Performance Computing Modernization Program Kerberos Throughput Test Report

Science.gov (United States)

2017-10-26

Naval Research Laboratory, Washington, DC 20375-5320. NRL/MR/5524--17-9751. High Performance Computing Modernization Program Kerberos Throughput Test Report. Daniel G. Gdula* and
CUDA/GPU Technology : Parallel Programming For High Performance Scientific Computing

OpenAIRE

YUHENDRA; KUZE, Hiroaki; JOSAPHAT, Tetuko Sri Sumantyo

2009-01-01

Graphics processing units (GPUs), originally designed for computer video cards, have emerged as the most powerful chip in a high-performance workstation. In high-performance computation, GPUs deliver much more powerful performance than conventional CPUs by means of parallel processing. In 2007, the birth of Compute Unified Device Architecture (CUDA) and CUDA-enabled GPUs by NVIDIA Corporation brought a revolution in the general purpose GPU a...
Software Systems for High-performance Quantum Computing

Energy Technology Data Exchange (ETDEWEB)

Humble, Travis S [ORNL; Britt, Keith A [ORNL

2016-01-01

Quantum computing promises new opportunities for solving hard computational problems, but harnessing this novelty requires breakthrough concepts in the design, operation, and application of computing systems. We define some of the challenges facing the development of quantum computing systems as well as software-based approaches that can be used to overcome these challenges. Following a brief overview of the state of the art, we present models for quantum programming and execution, the development of architectures for hybrid high-performance computing systems, and the realization of software stacks for quantum networking. This leads to a discussion of the role that conventional computing plays in the quantum paradigm and how some of the current challenges for exascale computing overlap with those facing quantum computing.
Rad-hard embedded computers for nuclear robotics; Calculateurs durcis embarques pour la robotique nucleaire

Energy Technology Data Exchange (ETDEWEB)

Giraud, A; Joffre, F; Marceau, M; Robiolle, M; Brunet, J P; Mijuin, D

1994-12-31

To meet the requirements of nuclear industries, it is necessary to use robots with embedded rad-hard electronics and a high level of safety. The computer developed for the French research program SYROCO is presented in this paper. (authors). 8 refs., 5 figs.
An embedded implementation based on adaptive filter bank for brain-computer interface systems.

Science.gov (United States)

Belwafi, Kais; Romain, Olivier; Gannouni, Sofien; Ghaffari, Fakhreddine; Djemal, Ridha; Ouni, Bouraoui

2018-07-15

Brain-computer interface (BCI) is a new communication pathway for users with neurological deficiencies. The implementation of a BCI system requires complex electroencephalography (EEG) signal processing including filtering, feature extraction and classification algorithms. Most current BCI systems are implemented on personal computers. Therefore, there is great interest in implementing BCI on embedded platforms to meet system specifications in terms of time response, cost effectiveness, power consumption, and accuracy. This article presents an embedded-BCI (EBCI) system based on a Stratix-IV field programmable gate array. The proposed system relies on the weighted overlap-add (WOLA) algorithm to perform dynamic filtering of EEG signals by analyzing the event-related desynchronization/synchronization (ERD/ERS). The EEG signals are classified, using the linear discriminant analysis algorithm, based on their spatial features. The proposed system performs fast classification within a time delay of 0.430 s/trial, achieving an average accuracy of 76.80% according to an offline approach and 80.25% using our own recording. The estimated power consumption of the prototype is approximately 0.7 W. Results show that the proposed EBCI system reduces the overall classification error rate for the three datasets of the BCI-competition by 5% compared to other similar implementations. Moreover, experiments show that the proposed system maintains a high accuracy rate with a short processing time, low power consumption, and low cost. Performing dynamic filtering of EEG signals using WOLA increases the recognition rate of ERD/ERS patterns of motor imagery brain activity. This approach allows the development of a complete prototype of an EBCI system that achieves excellent accuracy rates. Copyright © 2018 Elsevier B.V. All rights reserved.
Verification and Performance Analysis for Embedded Systems

DEFF Research Database (Denmark)

Larsen, Kim Guldstrand

2009-01-01

This talk provides a thorough tutorial of the UPPAAL tool suite for modeling, simulation, verification, optimal scheduling, synthesis, testing and performance analysis of embedded and real-time systems.
Monitoring SLAC High Performance UNIX Computing Systems

International Nuclear Information System (INIS)

Lettsome, Annette K.

2005-01-01

Knowledge of the effectiveness and efficiency of computers is important when working with high performance systems. Monitoring such systems makes it possible to foresee potential problems or system failures. Ganglia is a software system designed for high performance computing systems to retrieve specific monitoring information. An alternative storage facility for Ganglia's collected data is needed since its default storage system, the round-robin database (RRD), struggles with data integrity. The creation of a script-driven MySQL database solves this dilemma. This paper describes the process taken in the creation and implementation of the MySQL database for use by Ganglia. Comparisons between data storage by both databases are made using gnuplot and Ganglia's real-time graphical user interface.
High Performance Computing Software Applications for Space Situational Awareness

Science.gov (United States)

Giuliano, C.; Schumacher, P.; Matson, C.; Chun, F.; Duncan, B.; Borelli, K.; Desonia, R.; Gusciora, G.; Roe, K.

The High Performance Computing Software Applications Institute for Space Situational Awareness (HSAI-SSA) has completed its first full year of applications development. The emphasis of our work in this first year was on improving space surveillance sensor models and image enhancement software. These applications are the Space Surveillance Network Analysis Model (SSNAM), the Air Force Space Fence simulation (SimFence), and the physically constrained iterative de-convolution (PCID) image enhancement software tool. Specifically, we have demonstrated order-of-magnitude speed-up in those codes running on the latest Cray XD-1 Linux supercomputer (Hoku) at the Maui High Performance Computing Center. The software application improvements that HSAI-SSA has made have had a significant impact on the warfighter and have fundamentally changed the role of high performance computing in SSA.
Development of embedded real-time and high-speed vision platform

Science.gov (United States)

Ouyang, Zhenxing; Dong, Yimin; Yang, Hua

2015-12-01

Currently, high-speed vision platforms are widely used in many applications, such as robotics and the automation industry. However, traditional high-speed vision platforms depend on a personal computer (PC) for human-computer interaction, and its large size makes it unsuitable for compact systems. Therefore, this paper develops an embedded real-time and high-speed vision platform, ER-HVP Vision, which is able to work completely without a PC. In this new platform, an embedded CPU-based board is designed as a substitute for the PC, and a DSP and FPGA board is developed for implementing parallel image algorithms in the FPGA and sequential image algorithms in the DSP. Hence, ER-HVP Vision, with a size of 320 mm x 250 mm x 87 mm, offers this capability in a more compact form. Experimental results are also given to indicate that real-time detection and counting of a moving target at a frame rate of 200 fps at 512 x 512 pixels is feasible on this newly developed vision platform.
DURIP: High Performance Computing in Biomathematics Applications

Science.gov (United States)

2017-05-10

The goal of this award was to enhance the capabilities of the Department of Applied Mathematics and Statistics (AMS) at the University of California, Santa Cruz (UCSC) to conduct research and research-related education in areas of...
AHPCRC - Army High Performance Computing Research Center

Science.gov (United States)

2010-01-01

computing. Of particular interest is the ability of a distributed jamming network (DJN) to jam signals in all or part of a sensor or communications net...and reasoning, assistive technologies. FRIEDRICH (FRITZ) PRINZ Finmeccanica Professor of Engineering, Robert Bosch Chair, Department of Engineering...High Performance Computing Research Center www.ahpcrc.org BARBARA BRYAN AHPCRC Research and Outreach Manager, HPTi (650) 604-3732 bbryan@hpti.com Ms
Enabling high performance computational science through combinatorial algorithms

International Nuclear Information System (INIS)

Boman, Erik G; Bozdag, Doruk; Catalyurek, Umit V; Devine, Karen D; Gebremedhin, Assefaw H; Hovland, Paul D; Pothen, Alex; Strout, Michelle Mills

2007-01-01

The Combinatorial Scientific Computing and Petascale Simulations (CSCAPES) Institute is developing algorithms and software for combinatorial problems that play an enabling role in scientific and engineering computations. Discrete algorithms will be increasingly critical for achieving high performance for irregular problems on petascale architectures. This paper describes recent contributions by researchers at the CSCAPES Institute in the areas of load balancing, parallel graph coloring, performance improvement, and parallel automatic differentiation
Enabling high performance computational science through combinatorial algorithms

Energy Technology Data Exchange (ETDEWEB)

Boman, Erik G [Discrete Algorithms and Math Department, Sandia National Laboratories (United States); Bozdag, Doruk [Biomedical Informatics, and Electrical and Computer Engineering, Ohio State University (United States); Catalyurek, Umit V [Biomedical Informatics, and Electrical and Computer Engineering, Ohio State University (United States); Devine, Karen D [Discrete Algorithms and Math Department, Sandia National Laboratories (United States); Gebremedhin, Assefaw H [Computer Science and Center for Computational Science, Old Dominion University (United States); Hovland, Paul D [Mathematics and Computer Science Division, Argonne National Laboratory (United States); Pothen, Alex [Computer Science and Center for Computational Science, Old Dominion University (United States); Strout, Michelle Mills [Computer Science, Colorado State University (United States)

2007-07-15

The Combinatorial Scientific Computing and Petascale Simulations (CSCAPES) Institute is developing algorithms and software for combinatorial problems that play an enabling role in scientific and engineering computations. Discrete algorithms will be increasingly critical for achieving high performance for irregular problems on petascale architectures. This paper describes recent contributions by researchers at the CSCAPES Institute in the areas of load balancing, parallel graph coloring, performance improvement, and parallel automatic differentiation.
Computer Game Play as an Imaginary Stage for Reading: Implicit Spatial Effects of Computer Games Embedded in Hard Copy Books

Science.gov (United States)

Smith, Glenn Gordon

2012-01-01

This study compared books with embedded computer games (via pentop computers with microdot paper and audio feedback) with regular books with maps, in terms of fifth graders' comprehension and retention of spatial details from stories. One group read a story in hard copy with embedded computer games, the other group read it in regular book format…
COMPUTERS: Teraflops for Europe; EEC Working Group on High Performance Computing

Energy Technology Data Exchange (ETDEWEB)

Anon.

1991-03-15

In little more than a decade, simulation on high performance computers has become an essential tool for theoretical physics, capable of solving a vast range of crucial problems inaccessible to conventional analytic mathematics. In many ways, computer simulation has become the calculus for interacting many-body systems, a key to the study of transitions from isolated to collective behaviour.
COMPUTERS: Teraflops for Europe; EEC Working Group on High Performance Computing

International Nuclear Information System (INIS)

Anon.

1991-01-01

In little more than a decade, simulation on high performance computers has become an essential tool for theoretical physics, capable of solving a vast range of crucial problems inaccessible to conventional analytic mathematics. In many ways, computer simulation has become the calculus for interacting many-body systems, a key to the study of transitions from isolated to collective behaviour

Enabling High-Performance Computing as a Service

KAUST Repository

AbdelBaky, Moustafa; Parashar, Manish; Kim, Hyunjoo; Jordan, Kirk E.; Sachdeva, Vipin; Sexton, James; Jamjoom, Hani; Shae, Zon-Yin; Pencheva, Gergina; Tavakoli, Reza; Wheeler, Mary F.

2012-01-01

With the right software infrastructure, clouds can provide scientists with as-a-service access to high-performance computing resources. An award-winning prototype framework transforms the Blue Gene/P system into an elastic cloud to run a
Effect of foundation embedment on the seismic response of a high-temperature gas-cooled reactor plant

International Nuclear Information System (INIS)

Lee, T.H.; Thompson, R.W.; Charman, C.M.

1983-01-01

The effects of soil-structure interaction during seismic events upon the dynamic response of a High Temperature Gas-Cooled Reactor plant (HTGR) have been investigated for both surface-founded and embedded basemats. The influence from foundation embedment has been quantitatively assessed through a series of theoretical studies on plants of various sizes. The surface-founded analyses were performed using frequency-independent soil impedance parameters, while the embedded plant analyses utilized finite element models simulated on the FLUSH computer program. The seismic response of the surface-founded plants has been used to establish the standard-site design in-structure response spectra. These analyses were performed by using the linear modal formulation based on conventional soil stiffness and damping values. They serve as reference solutions to which the response data of the corresponding embedded plants are compared. In these comparison studies the responses of embedded plants were generally found to be lower than those of the corresponding surface-founded plants. Additional studies on the surface-founded plants have recently been performed by considering inelastic soil behavior. These inelastic solutions, which treat the soil as an elasto-plastic medium exhibiting hysteretic unloading-reloading characteristics in time, have reduced the response of surface-founded plants. Numerical results are presented in terms of in-structure response spectra along with other pertinent seismic load data at key levels of the plant. Analysis techniques for future studies using viscoelastic halfspace representation and inelastic finite element modeling for soil are also discussed
Embedded Systems

Indian Academy of Sciences (India)

Embedded system, micro-controller ... Embedded systems differ from general purpose computers in many ... Low cost: As embedded systems are extensively used in con- .... operating systems for the desktop computers where scheduling.
High-performance computing in accelerating structure design and analysis

International Nuclear Information System (INIS)

Li Zenghai; Folwell, Nathan; Ge Lixin; Guetz, Adam; Ivanov, Valentin; Kowalski, Marc; Lee, Lie-Quan; Ng, Cho-Kuen; Schussman, Greg; Stingelin, Lukas; Uplenchwar, Ravindra; Wolf, Michael; Xiao, Liling; Ko, Kwok

2006-01-01

Future high-energy accelerators such as the Next Linear Collider (NLC) will accelerate multi-bunch beams of high current and low emittance to obtain high luminosity, which put stringent requirements on the accelerating structures for efficiency and beam stability. While numerical modeling has been quite standard in accelerator R and D, designing the NLC accelerating structure required a new simulation capability because of the geometric complexity and level of accuracy involved. Under the US DOE Advanced Computing initiatives (first the Grand Challenge and now SciDAC), SLAC has developed a suite of electromagnetic codes based on unstructured grids and utilizing high-performance computing to provide an advanced tool for modeling structures at accuracies and scales previously not possible. This paper will discuss the code development and computational science research (e.g. domain decomposition, scalable eigensolvers, adaptive mesh refinement) that have enabled the large-scale simulations needed for meeting the computational challenges posed by the NLC as well as projects such as the PEP-II and RIA. Numerical results will be presented to show how high-performance computing has made a qualitative improvement in accelerator structure modeling for these accelerators, either at the component level (single cell optimization), or on the scale of an entire structure (beam heating and long-range wakefields)
Enabling High-Performance Computing as a Service

KAUST Repository

AbdelBaky, Moustafa

2012-10-01

With the right software infrastructure, clouds can provide scientists with as-a-service access to high-performance computing resources. An award-winning prototype framework transforms the Blue Gene/P system into an elastic cloud to run a representative HPC application. © 2012 IEEE.
Embedded systems for supporting computer accessibility.

Science.gov (United States)

Mulfari, Davide; Celesti, Antonio; Fazio, Maria; Villari, Massimo; Puliafito, Antonio

2015-01-01

Nowadays, customized AT software solutions allow their users to interact with various kinds of computer systems. Such tools are generally available on personal devices (e.g., smartphones, laptops and so on) commonly used by a person with a disability. In this paper, we investigate a way of using the aforementioned AT equipments in order to access many different devices without assistive preferences. The solution takes advantage of open source hardware and its core component consists of an affordable Linux embedded system: it grabs data coming from the assistive software, which runs on the user's personal device, then, after processing, it generates native keyboard and mouse HID commands for the target computing device controlled by the end user. This process supports any operating system available on the target machine and it requires no specialized software installation; therefore the user with a disability can rely on a single assistive tool to control a wide range of computing platforms, including conventional computers and many kinds of mobile devices, which receive input commands through the USB HID protocol.
The Future of Software Engineering for High Performance Computing

Energy Technology Data Exchange (ETDEWEB)

Pope, G [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

2015-07-16

DOE ASCR requested that from May through mid-July 2015 a study group identify issues and recommend solutions from a software engineering perspective transitioning into the next generation of High Performance Computing. The approach used was to ask some of the DOE complex experts who will be responsible for doing this work to contribute to the study group. The technique used was to solicit elevator speeches: a short and concise write up done as if the author was a speaker with only a few minutes to convince a decision maker of their top issues. Pages 2-18 contain the original texts of the contributed elevator speeches and end notes identifying the 20 contributors. The study group also ranked the importance of each topic, and those scores are displayed with each topic heading. A perfect score (and highest priority) is three, two is medium priority, and one is lowest priority. The highest scoring topic areas were software engineering and testing resources; the lowest scoring area was compliance to DOE standards. The following two paragraphs are an elevator speech summarizing the contributed elevator speeches. Each sentence or phrase in the summary is hyperlinked to its source via a numeral embedded in the text. A risk one liner has also been added to each topic to allow future risk tracking and mitigation.
High Performance Networks From Supercomputing to Cloud Computing

CERN Document Server

Abts, Dennis

2011-01-01

Datacenter networks provide the communication substrate for large parallel computer systems that form the ecosystem for high performance computing (HPC) systems and modern Internet applications. The design of new datacenter networks is motivated by an array of applications ranging from communication intensive climatology, complex material simulations and molecular dynamics to such Internet applications as Web search, language translation, collaborative Internet applications, streaming video and voice-over-IP. For both Supercomputing and Cloud Computing the network enables distributed applicati
Hilbert-Twin – A Novel Hilbert Transform-Based Method To Compute Envelope Of Free Decaying Oscillations Embedded In Noise, And The Logarithmic Decrement In High-Resolution Mechanical Spectroscopy HRMS

Directory of Open Access Journals (Sweden)

Magalas L.B.

2015-06-01

In this work, we present a novel Hilbert-twin method to compute an envelope and the logarithmic decrement, δ, from exponentially damped time-invariant harmonic strain signals embedded in noise. The results obtained from five computing methods are analyzed and compared: (1) the parametric OMI (Optimization in Multiple Intervals) method; two interpolated discrete Fourier transform-based (IpDFT) methods, namely (2) the Yoshida-Magalas (YM) method and (3) the classic Yoshida (Y) method; (4) the novel Hilbert-twin (H-twin) method based on the Hilbert transform; and (5) the conventional Hilbert transform (HT) method. The fundamental feature of the Hilbert-twin method is the efficient elimination of intrinsic asymmetrical oscillations of the envelope, a_HT(t), obtained from the discrete Hilbert transform of analyzed signals. The performance of the Hilbert-twin method in estimating the logarithmic decrement is excellent and comparable to that of the OMI and YM methods for both low- and high-damping levels. The Hilbert-twin method proved to be robust and effective in computing the logarithmic decrement and the resonant frequency of exponentially damped free decaying signals embedded in experimental noise. The Hilbert-twin method is also appropriate for detecting nonlinearities in mechanical loss measurements of metals and alloys.
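As a rough illustration of the conventional Hilbert transform (HT) approach named in the abstract above (not the Hilbert-twin method itself, whose details are not given here), the following Python sketch builds the discrete analytic signal with a naive DFT, takes its magnitude as the envelope, and estimates the logarithmic decrement from a straight-line fit to the logarithm of the envelope. It assumes a noise-free signal of the form exp(-δ·f0·t)·cos(2π·f0·t), so that the envelope decays by a factor exp(-δ) per period. The function names, the test signal, and the choice to discard the outer quarters of the record (to limit end effects) are all illustrative assumptions.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform; adequate for short records."""
    n = len(values)
    sign = 1 if inverse else -1
    # Precompute the n-th roots of unity so each term is a table lookup.
    roots = [cmath.exp(sign * 2j * math.pi * m / n) for m in range(n)]
    out = []
    for k in range(n):
        acc = sum(v * roots[(k * j) % n] for j, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out


def hilbert_envelope(signal):
    """Magnitude of the discrete analytic signal (even record length assumed)."""
    n = len(signal)
    if n % 2:
        raise ValueError("this sketch assumes an even number of samples")
    spectrum = dft(signal)
    for k in range(n):
        if 0 < k < n // 2:
            spectrum[k] *= 2      # double the positive frequencies
        elif k > n // 2:
            spectrum[k] = 0       # discard the negative frequencies
    return [abs(z) for z in dft(spectrum, inverse=True)]


def log_decrement(signal, fs, f0, trim=0.25):
    """Estimate delta from the slope of ln(envelope) over the central samples."""
    env = hilbert_envelope(signal)
    n = len(env)
    lo, hi = int(n * trim), int(n * (1 - trim))
    ts = [i / fs for i in range(lo, hi)]
    ys = [math.log(env[i]) for i in range(lo, hi)]
    t_mean = sum(ts) / len(ts)
    y_mean = sum(ys) / len(ys)
    slope = (sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
             / sum((t - t_mean) ** 2 for t in ts))
    # Envelope ~ exp(-delta * f0 * t), so slope = -delta * f0.
    return -slope / f0


# Hypothetical test signal: 25 Hz oscillation sampled at 200 Hz for 2 s.
fs, f0, delta_true = 200.0, 25.0, 0.02
signal = [math.exp(-delta_true * f0 * i / fs) * math.cos(2 * math.pi * f0 * i / fs)
          for i in range(400)]
delta_est = log_decrement(signal, fs, f0)
print(f"true delta = {delta_true}, estimated delta = {delta_est:.4f}")
```

The trimming step is a crude way of sidestepping the envelope oscillations near the record ends that the abstract attributes to the discrete Hilbert transform; it is not a substitute for the method the authors propose.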
Evaluation of high-performance computing software

Energy Technology Data Exchange (ETDEWEB)

Browne, S.; Dongarra, J. [Univ. of Tennessee, Knoxville, TN (United States); Rowan, T. [Oak Ridge National Lab., TN (United States)

1996-12-31

The absence of unbiased and up-to-date comparative evaluations of high-performance computing software complicates a user's search for the appropriate software package. The National HPCC Software Exchange (NHSE) is attacking this problem using an approach that includes independent evaluations of software, incorporation of author and user feedback into the evaluations, and Web access to the evaluations. We are applying this approach to the Parallel Tools Library (PTLIB), a new software repository for parallel systems software and tools, and HPC-Netlib, a high performance branch of the Netlib mathematical software repository. Updating the evaluations with feedback and making them available via the Web helps ensure accuracy and timeliness, and using independent reviewers produces unbiased comparative evaluations difficult to find elsewhere.
A real-time spike sorting method based on the embedded GPU.

Science.gov (United States)

Zelan Yang; Kedi Xu; Xiang Tian; Shaomin Zhang; Xiaoxiang Zheng

2017-07-01

Microelectrode arrays with hundreds of channels have been widely used to acquire neuron population signals in neuroscience studies. Online spike sorting is becoming one of the most important challenges for high-throughput neural signal acquisition systems. Graphics processing units (GPUs), with their high parallel computing capability, might provide an alternative solution for the increasing real-time computational demands of spike sorting. This study reports a method of real-time spike sorting using the compute unified device architecture (CUDA), implemented on an embedded GPU (NVIDIA JETSON Tegra K1, TK1). The sorting approach is based on principal component analysis (PCA) and K-means. By analyzing the parallelism of each process, the method was further optimized for the thread and memory model of the GPU. Our results showed that the GPU-based classifier on the TK1 is 37.92 times faster than the MATLAB-based classifier on a PC, while their accuracies were the same. The high-performance computing features of the embedded GPU demonstrated in our studies suggest that embedded GPUs provide a promising platform for real-time neural signal processing.
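The PCA-plus-K-means pipeline named in the abstract above can be sketched on a CPU in a few lines of Python. This is only a simplified illustration under stated assumptions, not the authors' CUDA implementation: it projects each waveform onto the first principal component only (found by power iteration), clusters the projections into two groups with a one-dimensional K-means, and runs on synthetic spikes. All function names, the waveform shape, the amplitudes, and the noise level are hypothetical.

```python
import math
import random


def first_pc_scores(rows, iterations=50):
    """Project rows onto their first principal component (power iteration)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    vec = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        # Apply X^T X to vec without forming the covariance matrix.
        scores = [sum(c[j] * vec[j] for j in range(d)) for c in centered]
        new = [sum(scores[i] * centered[i][j] for i in range(n)) for j in range(d)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    return [sum(c[j] * vec[j] for j in range(d)) for c in centered]


def kmeans_1d(values, iterations=20):
    """Two-cluster Lloyd iteration on scalars, seeded at the min and max."""
    centers = [min(values), max(values)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [0 if abs(v - centers[0]) <= abs(v - centers[1]) else 1
                  for v in values]
        for c in (0, 1):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centers[c] = sum(members) / len(members)
    return labels


def sort_spikes(waveforms):
    """Assign each waveform to one of two putative units."""
    return kmeans_1d(first_pc_scores(waveforms))


# Synthetic data: two units sharing one spike shape but differing in amplitude.
rng = random.Random(0)
shape = [math.exp(-((i - 5) ** 2) / 2.0) for i in range(16)]


def make_spike(amplitude):
    return [amplitude * s + rng.gauss(0.0, 0.1) for s in shape]


true_labels = [0] * 30 + [1] * 30
spikes = ([make_spike(-1.0) for _ in range(30)]
          + [make_spike(-2.5) for _ in range(30)])
labels = sort_spikes(spikes)
matches = sum(a == b for a, b in zip(labels, true_labels))
# Cluster numbering is arbitrary, so score the better of the two matchings.
agreement = max(matches, len(labels) - matches) / len(labels)
print(f"agreement with ground truth: {agreement:.2f}")
```

Both stages are dominated by independent per-waveform dot products and distance evaluations, which is the kind of data parallelism that a GPU implementation such as the one described above can exploit.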
High Performance Computing Operations Review Report

Energy Technology Data Exchange (ETDEWEB)

Cupps, Kimberly C. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

2013-12-19

The High Performance Computing Operations Review (HPCOR) meeting—requested by the ASC and ASCR program headquarters at DOE—was held November 5 and 6, 2013, at the Marriott Hotel in San Francisco, CA. The purpose of the review was to discuss the processes and practices for HPC integration and its related software and facilities. Experiences and lessons learned from the most recent systems deployed were covered in order to benefit the deployment of new systems.
High Performance Computing in Science and Engineering '08 : Transactions of the High Performance Computing Center

CERN Document Server

Kröner, Dietmar; Resch, Michael

2009-01-01

The discussions and plans on all scientific, advisory, and political levels to realize an even larger “European Supercomputer” in Germany, where the hardware costs alone will be hundreds of millions of euros – much more than in the past – are getting closer to realization. As part of the strategy, the three national supercomputing centres HLRS (Stuttgart), NIC/JSC (Jülich) and LRZ (Munich) have formed the Gauss Centre for Supercomputing (GCS) as a new virtual organization enabled by an agreement between the Federal Ministry of Education and Research (BMBF) and the state ministries for research of Baden-Württemberg, Bayern, and Nordrhein-Westfalen. Already today, the GCS provides the most powerful high-performance computing infrastructure in Europe. Through GCS, HLRS participates in the European project PRACE (Partnership for Advanced Computing in Europe) and extends its reach to all European member countries. These activities align well with the activities of HLRS in the European HPC infrastructur...
High performance computing in science and engineering '09: transactions of the High Performance Computing Center, Stuttgart (HLRS) 2009

National Research Council Canada - National Science Library

Nagel, Wolfgang E; Kröner, Dietmar; Resch, Michael

2010-01-01

...), NIC/JSC (Jülich), and LRZ (Munich). As part of that strategic initiative, in May 2009 already NIC/JSC has installed the first phase of the GCS HPC Tier-0 resources, an IBM Blue Gene/P with roughly 300,000 cores, this time in Jülich. With that, the GCS provides the most powerful high-performance computing infrastructure in Europe alread...
NCI's High Performance Computing (HPC) and High Performance Data (HPD) Computing Platform for Environmental and Earth System Data Science

Science.gov (United States)

Evans, Ben; Allen, Chris; Antony, Joseph; Bastrakova, Irina; Gohar, Kashif; Porter, David; Pugh, Tim; Santana, Fabiana; Smillie, Jon; Trenham, Claire; Wang, Jingbo; Wyborn, Lesley

2015-04-01

The National Computational Infrastructure (NCI) has established a powerful and flexible in-situ petascale computational environment to enable both high performance computing and Data-intensive Science across a wide spectrum of national environmental and earth science data collections - in particular climate, observational data and geoscientific assets. This paper examines 1) the computational environments that support the modelling and data processing pipelines, 2) the analysis environments and methods to support data analysis, and 3) the progress so far to harmonise the underlying data collections for future interdisciplinary research across these large volume data collections. NCI has established 10+ PBytes of major national and international data collections from both the government and research sectors based on six themes: 1) weather, climate, and earth system science model simulations, 2) marine and earth observations, 3) geosciences, 4) terrestrial ecosystems, 5) water and hydrology, and 6) astronomy, social and biosciences. Collectively they span the lithosphere, crust, biosphere, hydrosphere, troposphere, and stratosphere. The data is largely sourced from NCI's partners (which include the custodians of many of the major Australian national-scale scientific collections), leading research communities, and collaborating overseas organisations. New infrastructures created at NCI mean the data collections are now accessible within an integrated High Performance Computing and Data (HPC-HPD) environment - a 1.2 PFlop supercomputer (Raijin), a HPC class 3000 core OpenStack cloud system and several highly connected large-scale high-bandwidth Lustre filesystems. The hardware was designed at inception to ensure that it would allow the layered software environment to flexibly accommodate the advancement of future data science. New approaches to software technology and data models have also had to be developed to enable access to these large and exponentially
The contribution of high-performance computing and modelling for industrial development

CSIR Research Space (South Africa)

Sithole, Happy

2017-10-01

Performance Computing and Modelling for Industrial Development Dr Happy Sithole and Dr Onno Ubbink 2 Strategic context • High-performance computing (HPC) combined with machine Learning and artificial intelligence present opportunities to non...
The path toward HEP High Performance Computing

International Nuclear Information System (INIS)

Apostolakis, John; Brun, René; Gheata, Andrei; Wenzel, Sandro; Carminati, Federico

2014-01-01

High Energy Physics code has been known for making poor use of high performance computing architectures. Efforts in optimising HEP code on vector and RISC architectures have yielded limited results and recent studies have shown that, on modern architectures, it achieves a performance between 10% and 50% of the peak one. Although several successful attempts have been made to port selected codes on GPUs, no major HEP code suite has a 'High Performance' implementation. With LHC undergoing a major upgrade and a number of challenging experiments on the drawing board, HEP cannot any longer neglect the less-than-optimal performance of its code and it has to try making the best usage of the hardware. This activity is one of the foci of the SFT group at CERN, which hosts, among others, the Root and Geant4 project. The activity of the experiments is shared and coordinated via a Concurrency Forum, where the experience in optimising HEP code is presented and discussed. Another activity is the Geant-V project, centred on the development of a high-performance prototype for particle transport. Achieving a good concurrency level on the emerging parallel architectures without a complete redesign of the framework can only be done by parallelizing at event level, or with a much larger effort at track level. Apart from the shareable data structures, this typically implies a multiplication factor in terms of memory consumption compared to the single threaded version, together with sub-optimal handling of event processing tails. Besides this, the low level instruction pipelining of modern processors cannot be used efficiently to speed up the program. We have implemented a framework that allows scheduling vectors of particles to an arbitrary number of computing resources in a fine grain parallel approach. The talk will review the current optimisation activities within the SFT group with a particular emphasis on the development perspectives towards a simulation framework able to profit
High performance computing and communications: FY 1997 implementation plan

Energy Technology Data Exchange (ETDEWEB)

NONE

1996-12-01

The High Performance Computing and Communications (HPCC) Program was formally authorized by passage, with bipartisan support, of the High-Performance Computing Act of 1991, signed on December 9, 1991. The original Program, in which eight Federal agencies participated, has now grown to twelve agencies. This Plan provides a detailed description of the agencies' FY 1996 HPCC accomplishments and FY 1997 HPCC plans. Section 3 of this Plan provides an overview of the HPCC Program. Section 4 contains more detailed definitions of the Program Component Areas, with an emphasis on the overall directions and milestones planned for each PCA. Appendix A provides a detailed look at HPCC Program activities within each agency.
High performance computing and communications: FY 1996 implementation plan

Energy Technology Data Exchange (ETDEWEB)

NONE

1995-05-16

The High Performance Computing and Communications (HPCC) Program was formally authorized by passage of the High Performance Computing Act of 1991, signed on December 9, 1991. Twelve federal agencies, in collaboration with scientists and managers from US industry, universities, and research laboratories, have developed the Program to meet the challenges of advancing computing and associated communications technologies and practices. This plan provides a detailed description of the agencies' HPCC implementation plans for FY 1995 and FY 1996. This Implementation Plan contains three additional sections. Section 3 provides an overview of the HPCC Program definition and organization. Section 4 contains a breakdown of the five major components of the HPCC Program, with an emphasis on the overall directions and milestones planned for each one. Section 5 provides a detailed look at HPCC Program activities within each agency.
High performance computing in science and engineering Garching/Munich 2016

Energy Technology Data Exchange (ETDEWEB)

Wagner, Siegfried; Bode, Arndt; Bruechle, Helmut; Brehm, Matthias (eds.)

2016-11-01

Computer simulations are the well-established third pillar of natural sciences along with theory and experimentation. Particularly high performance computing is growing fast and constantly demands more and more powerful machines. To keep pace with this development, in spring 2015, the Leibniz Supercomputing Centre installed the high performance computing system SuperMUC Phase 2, only three years after the inauguration of its sibling SuperMUC Phase 1. Thereby, the compute capabilities were more than doubled. This book covers the time-frame June 2014 until June 2016. Readers will find many examples of outstanding research in the more than 130 projects that are covered in this book, with each one of these projects using at least 4 million core-hours on SuperMUC. The largest scientific communities using SuperMUC in the last two years were computational fluid dynamics simulations, chemistry and material sciences, astrophysics, and life sciences.

Benchmarking high performance computing architectures with CMS’ skeleton framework

Science.gov (United States)

Sexton-Kennedy, E.; Gartung, P.; Jones, C. D.

2017-10-01

In 2012 CMS evaluated which underlying concurrency technology would be the best to use for its multi-threaded framework. The available technologies were evaluated on the high throughput computing systems dominating the resources in use at that time. A skeleton framework benchmarking suite that emulates the tasks performed within a CMSSW application was used to select Intel’s Thread Building Block library, based on the measured overheads in both memory and CPU on the different technologies benchmarked. In 2016 CMS will get access to high performance computing resources that use new many core architectures; machines such as Cori Phase 1&2, Theta, Mira. Because of this we have revived the 2012 benchmark to test its performance and conclusions on these new architectures. This talk will discuss the results of this exercise.
Contemporary high performance computing from petascale toward exascale

CERN Document Server

Vetter, Jeffrey S

2015-01-01

A continuation of Contemporary High Performance Computing: From Petascale toward Exascale, this second volume continues the discussion of HPC flagship systems, major application workloads, facilities, and sponsors. The book includes figures and pictures that capture the state of existing systems: pictures of buildings, systems in production, floorplans, and many block diagrams and charts to illustrate system design and performance.
Resilient and Robust High Performance Computing Platforms for Scientific Computing Integrity

Energy Technology Data Exchange (ETDEWEB)

Jin, Yier [Univ. of Central Florida, Orlando, FL (United States)

2017-07-14

As technology advances, computer systems are subject to increasingly sophisticated cyber-attacks that compromise both their security and integrity. High performance computing platforms used in commercial and scientific applications involving sensitive, or even classified data, are frequently targeted by powerful adversaries. This situation is made worse by a lack of fundamental security solutions that both perform efficiently and are effective at preventing threats. Current security solutions fail to address the threat landscape and ensure the integrity of sensitive data. As challenges rise, both private and public sectors will require robust technologies to protect their computing infrastructure. The research outcomes from this project try to address all these challenges. For example, we present LAZARUS, a novel technique to harden kernel Address Space Layout Randomization (KASLR) against paging-based side-channel attacks. In particular, our scheme allows for fine-grained protection of the virtual memory mappings that implement the randomization. We demonstrate the effectiveness of our approach by hardening a recent Linux kernel with LAZARUS, mitigating all of the previously presented side-channel attacks on KASLR. Our extensive evaluation shows that LAZARUS incurs only 0.943% overhead for standard benchmarks, and is therefore highly practical. We also introduced HA2lloc, a hardware-assisted allocator that is capable of leveraging an extended memory management unit to detect memory errors in the heap. We also perform testing using HA2lloc in a simulation environment and find that the approach is capable of preventing common memory vulnerabilities.
A high performance scientific cloud computing environment for materials simulations

Science.gov (United States)

Jorissen, K.; Vila, F. D.; Rehr, J. J.

2012-09-01

We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including tools for execution and monitoring performance, as well as efficient I/O utilities that enable seamless connections to and from the cloud. Our SCC platform is optimized for the Amazon Elastic Compute Cloud (EC2). We present benchmarks for prototypical scientific applications and demonstrate performance comparable to local compute clusters. To facilitate code execution and provide user-friendly access, we have also integrated cloud computing capability in a JAVA-based GUI. Our SCC platform may be an alternative to traditional HPC resources for materials science or quantum chemistry applications.
Inclusive vision for high performance computing at the CSIR

CSIR Research Space (South Africa)

Gazendam, A

2006-02-01

and computationally intensive applications. A number of different technologies and standards were identified as core to the open and distributed high-performance infrastructure envisaged...
High Performance Computing Multicast

Science.gov (United States)

2012-02-01

A History of the Virtual Synchrony Replication Model,” in Replication: Theory and Practice, Charron-Bost, B., Pedone, F., and Schiper, A. (Eds...Performance Computing IP / IPv4 Internet Protocol (version 4.0) IPMC Internet Protocol MultiCast LAN Local Area Network MCMD Dr. Multicast MPI
A high performance scientific cloud computing environment for materials simulations

OpenAIRE

Jorissen, Kevin; Vila, Fernando D.; Rehr, John J.

2011-01-01

We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including...
The path toward HEP High Performance Computing

CERN Document Server

Apostolakis, John; Carminati, Federico; Gheata, Andrei; Wenzel, Sandro

2014-01-01

High Energy Physics code has been known for making poor use of high performance computing architectures. Efforts in optimising HEP code on vector and RISC architectures have yielded limited results and recent studies have shown that, on modern architectures, it achieves a performance between 10% and 50% of the peak one. Although several successful attempts have been made to port selected codes on GPUs, no major HEP code suite has a 'High Performance' implementation. With LHC undergoing a major upgrade and a number of challenging experiments on the drawing board, HEP cannot any longer neglect the less-than-optimal performance of its code and it has to try making the best usage of the hardware. This activity is one of the foci of the SFT group at CERN, which hosts, among others, the Root and Geant4 project. The activity of the experiments is shared and coordinated via a Concurrency Forum, where the experience in optimising HEP code is presented and discussed. Another activity is the Geant-V project, centred on th...
Algorithms and Methods for High-Performance Model Predictive Control

DEFF Research Database (Denmark)

Frison, Gianluca

routines employed in the numerical tests. The main focus of this thesis is on linear MPC problems. In this thesis, both the algorithms and their implementation are equally important. About the implementation, a novel implementation strategy for the dense linear algebra routines in embedded optimization...... is proposed, aiming at improving the computational performance in case of small matrices. About the algorithms, they are built on top of the proposed linear algebra, and they are tailored to exploit the high-level structure of the MPC problems, with special care on reducing the computational complexity....
High-performance computing for structural mechanics and earthquake/tsunami engineering

CERN Document Server

Hori, Muneo; Ohsaki, Makoto

2016-01-01

Huge earthquakes and tsunamis have caused serious damage to important structures such as civil infrastructure elements, buildings and power plants around the globe. To quantitatively evaluate such damage processes and to design effective prevention and mitigation measures, the latest high-performance computational mechanics technologies, which include terascale to petascale computers, can offer powerful tools. The phenomena covered in this book include seismic wave propagation in the crust and soil, seismic response of infrastructure elements such as tunnels considering soil-structure interactions, seismic response of high-rise buildings, seismic response of nuclear power plants, tsunami run-up over coastal towns and tsunami inundation considering fluid-structure interactions. The book provides all necessary information for addressing these phenomena, ranging from the fundamentals of high-performance computing for finite element methods, key algorithms of accurate dynamic structural analysis, fluid flows ...
High performance computing system in the framework of the Higgs boson studies

CERN Document Server

Belyaev, Nikita; The ATLAS collaboration

2017-01-01

Higgs boson physics is one of the most important and promising fields of study in modern High Energy Physics. To perform precision measurements of the Higgs boson properties, the use of fast and efficient instruments of Monte Carlo event simulation is required. Due to the increasing amount of data and to the growing complexity of the simulation software tools, the computing resources currently available for Monte Carlo simulation on the LHC GRID are not sufficient. One of the possibilities to address this shortfall of computing resources is the usage of institutes' computer clusters, commercial computing resources and supercomputers. In this paper, a brief description of Higgs boson physics and of the Monte Carlo generation and event simulation techniques is presented. A description of modern high performance computing systems and tests of their performance are also discussed. These studies have been performed on the Worldwide LHC Computing Grid and Kurchatov Institute Data Processing Center, including Tier...
Scientific Grand Challenges: Forefront Questions in Nuclear Science and the Role of High Performance Computing

International Nuclear Information System (INIS)

Khaleel, Mohammad A.

2009-01-01

This report is an account of the deliberations and conclusions of the workshop on 'Forefront Questions in Nuclear Science and the Role of High Performance Computing' held January 26-28, 2009, co-sponsored by the U.S. Department of Energy (DOE) Office of Nuclear Physics (ONP) and the DOE Office of Advanced Scientific Computing (ASCR). Representatives from the national and international nuclear physics communities, as well as from the high performance computing community, participated. The purpose of this workshop was to (1) identify forefront scientific challenges in nuclear physics and then determine which, if any, of these could be aided by high performance computing at the extreme scale; (2) establish how and why new high performance computing capabilities could address issues at the frontiers of nuclear science; (3) provide nuclear physicists the opportunity to influence the development of high performance computing; and (4) provide the nuclear physics community with plans for development of future high performance computing capability by DOE ASCR.
Scientific Grand Challenges: Forefront Questions in Nuclear Science and the Role of High Performance Computing

Energy Technology Data Exchange (ETDEWEB)

Khaleel, Mohammad A.

2009-10-01

This report is an account of the deliberations and conclusions of the workshop on "Forefront Questions in Nuclear Science and the Role of High Performance Computing" held January 26-28, 2009, co-sponsored by the U.S. Department of Energy (DOE) Office of Nuclear Physics (ONP) and the DOE Office of Advanced Scientific Computing (ASCR). Representatives from the national and international nuclear physics communities, as well as from the high performance computing community, participated. The purpose of this workshop was to 1) identify forefront scientific challenges in nuclear physics and then determine which, if any, of these could be aided by high performance computing at the extreme scale; 2) establish how and why new high performance computing capabilities could address issues at the frontiers of nuclear science; 3) provide nuclear physicists the opportunity to influence the development of high performance computing; and 4) provide the nuclear physics community with plans for development of future high performance computing capability by DOE ASCR.
High performance parallel computers for science: New developments at the Fermilab advanced computer program

International Nuclear Information System (INIS)

Nash, T.; Areti, H.; Atac, R.

1988-08-01

Fermilab's Advanced Computer Program (ACP) has been developing highly cost effective, yet practical, parallel computers for high energy physics since 1984. The ACP's latest developments are proceeding in two directions. A Second Generation ACP Multiprocessor System for experiments will include $3500 RISC processors, each with performance over 15 VAX MIPS. To support such high performance, the new system allows parallel I/O, parallel interprocess communication, and parallel host processes. The ACP Multi-Array Processor has been developed for theoretical physics. Each $4000 node is a FORTRAN or C programmable pipelined 20 MFlops (peak), 10 MByte single board computer. These are plugged into a 16 port crossbar switch crate which handles both inter- and intra-crate communication. The crates are connected in a hypercube. Site-oriented applications like lattice gauge theory are supported by system software called CANOPY, which makes the hardware virtually transparent to users. A 256 node, 5 GFlop system is under construction. 10 refs., 7 figs
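The hypercube interconnect and the quoted aggregate peak can be illustrated with a short sketch. It assumes the conventional binary labelling of hypercube vertices, which the abstract does not specify; the function name `hypercube_neighbors` and the example dimension are hypothetical and say nothing about how many crates the ACP system actually used.

```python
from typing import List


def hypercube_neighbors(node: int, dimension: int) -> List[int]:
    """Return the vertices whose binary label differs from `node` in exactly one bit."""
    if not 0 <= node < (1 << dimension):
        raise ValueError("node label out of range for this dimension")
    # Flipping bit k of the label gives the neighbor along axis k.
    return [node ^ (1 << bit) for bit in range(dimension)]


# Illustrative only: vertex 5 (0b0101) in a 4-dimensional, 16-vertex hypercube.
neighbors = hypercube_neighbors(5, 4)

# Aggregate peak from the figures in the abstract: 256 nodes at 20 MFlops (peak) each.
peak_gflops = 256 * 20 / 1000
```

The product of the two quoted figures is 5.12 GFlops, consistent with the "5 GFlop" system size given in the abstract.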
Simple, parallel, high-performance virtual machines for extreme computations

International Nuclear Information System (INIS)

Chokoufe Nejad, Bijan; Ohl, Thorsten; Reuter, Jurgen

2014-11-01

We introduce a high-performance virtual machine (VM) written in a numerically fast language like Fortran or C to evaluate very large expressions. We discuss the general concept of how to perform computations in terms of a VM and present specifically a VM that is able to compute tree-level cross sections for any number of external legs, given the corresponding byte code from the optimal matrix element generator, O'Mega. Furthermore, this approach allows the parallel computation of a single phase space point to be formulated in a simple and obvious way. We analyze the scaling behaviour with multiple threads as well as the benefits and drawbacks that are introduced with this method. Our implementation of a VM can run faster than the corresponding native, compiled code for certain processes and compilers, especially for very high multiplicities, and has in general runtimes of the same order of magnitude. By avoiding the tedious compile and link steps, which may fail for source code files of gigabyte sizes, new processes or complex higher order corrections that are currently out of reach could be evaluated with a VM given enough computing power.
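The general idea of evaluating a large expression from byte code rather than from compiled source can be sketched as follows. This is a minimal illustration only: the opcodes (`LOAD`, `ADD`, `MUL`), the register layout, and the function `run_bytecode` are hypothetical and do not reproduce the O'Mega byte code format or the authors' Fortran/C implementation.

```python
from typing import List, Tuple

# Hypothetical three-address instruction: (opcode, destination, operand_a, operand_b).
Instruction = Tuple[str, int, int, int]


def run_bytecode(program: List[Instruction], inputs: List[float],
                 n_registers: int) -> List[float]:
    """Interpret a flat instruction list over a register file and return the registers."""
    registers = [0.0] * n_registers
    for opcode, dest, a, b in program:
        if opcode == "LOAD":
            # Copy input number `a` into a register; `b` is unused.
            registers[dest] = inputs[a]
        elif opcode == "ADD":
            registers[dest] = registers[a] + registers[b]
        elif opcode == "MUL":
            registers[dest] = registers[a] * registers[b]
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return registers


# Byte code for (x + y) * x, evaluated at x = 2, y = 3.
program = [
    ("LOAD", 0, 0, 0),
    ("LOAD", 1, 1, 0),
    ("ADD", 2, 0, 1),
    ("MUL", 3, 2, 0),
]
result = run_bytecode(program, [2.0, 3.0], 4)[3]
```

Because the program is plain data, a new expression needs no compile or link step; only the instruction list changes.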
Improving developer productivity with C++ embedded domain specific languages

Science.gov (United States)

Kozacik, Stephen; Chao, Evenie; Paolini, Aaron; Bonnett, James; Kelmelis, Eric

2017-05-01

Domain-specific languages are a useful tool for productivity, allowing domain experts to program using familiar concepts and vocabulary while benefiting from performance choices made by computing experts. Embedding the domain specific language into an existing language allows easy interoperability with non-domain-specific code and use of standard compilers and build systems. In C++, this is enabled through the template and preprocessor features. C++ embedded domain specific languages (EDSLs) allow the user to write simple, safe, performant, domain specific code that has access to all the low-level functionality that C and C++ offer as well as the diverse set of libraries available in the C/C++ ecosystem. In this paper, we will discuss several tools available for building EDSLs in C++ and show examples of projects successfully leveraging EDSLs. Modern C++ has added many useful new features to the language which we have leveraged to further extend the capability of EDSLs. At EM Photonics, we have used EDSLs to allow developers to transparently benefit from using high performance computing (HPC) hardware. We will show ways EDSLs combine with existing technologies and EM Photonics' high performance tools and libraries to produce clean, short, high performance code in ways that were not previously possible.
Polarizable Density Embedding

DEFF Research Database (Denmark)

Reinholdt, Peter; Kongsted, Jacob; Olsen, Jógvan Magnus Haugaard

2017-01-01

We analyze the performance of the polarizable density embedding (PDE) model, a new multiscale computational approach designed for prediction and rationalization of general molecular properties of large and complex systems. We showcase how the PDE model very effectively handles the use of large...
A High Performance VLSI Computer Architecture For Computer Graphics

Science.gov (United States)

Chin, Chi-Yuan; Lin, Wen-Tai

1988-10-01

A VLSI computer architecture, consisting of multiple processors, is presented in this paper to satisfy modern computer graphics demands, e.g. high resolution, realistic animation, real-time display, etc. All processors share a global memory which is partitioned into multiple banks. Through a crossbar network, data from one memory bank can be broadcast to many processors. Processors are physically interconnected through a hyper-crossbar network (a crossbar-like network). By programming the network, the topology of communication links among processors can be reconfigured to satisfy specific dataflows of different applications. Each processor consists of a controller, arithmetic operators, local memory, a local crossbar network, and I/O ports to communicate with other processors, memory banks, and a system controller. Operations in each processor are characterized into two modes, i.e. object domain and space domain, to fully utilize the data-independency characteristics of graphics processing. Special graphics features such as 3D-to-2D conversion, shadow generation, texturing, and reflection can be easily handled. With the current high density interconnection (MI) technology, it is feasible to implement a 64-processor system to achieve 2.5 billion operations per second, a performance needed in most advanced graphics applications.
High Performance Numerical Computing for High Energy Physics: A New Challenge for Big Data Science

International Nuclear Information System (INIS)

Pop, Florin

2014-01-01

Modern physics is based on both theoretical analysis and experimental validation. Complex scenarios like subatomic dimensions, high energy, and lower absolute temperature are frontiers for many theoretical models. Simulation with stable numerical methods represents an excellent instrument for high accuracy analysis, experimental validation, and visualization. High performance computing support makes it possible to run simulations at large scale and in parallel, but the volume of data generated by these experiments creates a new challenge for Big Data Science. This paper presents existing computational methods for high energy physics (HEP) analyzed from two perspectives: numerical methods and high performance computing. The computational methods presented are Monte Carlo methods and simulations of HEP processes, Markovian Monte Carlo, unfolding methods in particle physics, kernel estimation in HEP, and Random Matrix Theory used in the analysis of particle spectra. All of these methods produce data-intensive applications, which introduce new challenges and requirements for ICT systems architecture, programming paradigms, and storage capabilities.
FY 1992 Blue Book: Grand Challenges: High Performance Computing and Communications

Data.gov (United States)

Networking and Information Technology Research and Development, Executive Office of the President — High performance computing and computer communications networks are becoming increasingly important to scientific advancement, economic competition, and national...

An embedded single-board computer for BPM of SSRF

International Nuclear Information System (INIS)

Chen Kai; Liu Shubin; Yan Han; Wu Weihao; Zhao Lei; An Qi; Leng Yongbin; Yi Xing; Yan Yingbing; Lai Longwei

2011-01-01

An embedded single-board computer (SBC) system based on the AT91RM9200 was designed for monitoring and controlling the digital beam position monitor system of the Shanghai Synchrotron Radiation Facility (SSRF) through the Virtex-4 FPGA in the digital processing board. The SBC transfers the configuration commands from the remote EPICS to the FPGA, and calculates the beam position data. The interface between the FPGA and the SBC is the Static Memory Controller (SMC), with a peak transfer speed of up to 349 Mbps. The 100 Mb Ethernet is used for data transfer between the EPICS and the SBC board, and a serial port is used to monitor the status of the embedded system. Test results indicate that the SBC board functions well. (authors)
High performance computing and communications: Advancing the frontiers of information technology

Energy Technology Data Exchange (ETDEWEB)

NONE

1997-12-31

This report, which supplements the President's Fiscal Year 1997 Budget, describes the interagency High Performance Computing and Communications (HPCC) Program. The HPCC Program will celebrate its fifth anniversary in October 1996 with an impressive array of accomplishments to its credit. Over its five-year history, the HPCC Program has focused on developing high performance computing and communications technologies that can be applied to computation-intensive applications. Major highlights for FY 1996: (1) High performance computing systems enable practical solutions to complex problems with accuracies not possible five years ago; (2) HPCC-funded research in very large scale networking techniques has been instrumental in the evolution of the Internet, which continues exponential growth in size, speed, and availability of information; (3) The combination of hardware capability measured in gigaflop/s, networking technology measured in gigabit/s, and new computational science techniques for modeling phenomena has demonstrated that very large scale accurate scientific calculations can be executed across heterogeneous parallel processing systems located thousands of miles apart; (4) Federal investments in HPCC software R&D support researchers who pioneered the development of parallel languages and compilers, high performance mathematical, engineering, and scientific libraries, and software tools, technologies that allow scientists to use powerful parallel systems to focus on Federal agency mission applications; and (5) HPCC support for virtual environments has enabled the development of immersive technologies, where researchers can explore and manipulate multi-dimensional scientific and engineering problems. Educational programs fostered by the HPCC Program have brought into classrooms new science and engineering curricula designed to teach computational science. This document contains a small sample of the significant HPCC Program accomplishments in FY 1996.
Enhanced performance of microfluidic soft pressure sensors with embedded solid microspheres

Science.gov (United States)

Shin, Hee-Sup; Ryu, Jaiyoung; Majidi, Carmel; Park, Yong-Lae

2016-02-01

The cross-sectional geometry of an embedded microchannel influences the electromechanical response of a soft microfluidic sensor to applied surface pressure. When a pressure is exerted on the surface of the sensor, deforming the soft structure, the cross-sectional area of the embedded channel filled with a conductive fluid decreases, increasing the channel’s electrical resistance. This electromechanical coupling can be tuned by adding solid microspheres into the channel. In order to determine the influence of microspheres, we use both analytic and computational methods to predict the pressure responses of soft microfluidic sensors with two different channel cross-sections: a square and an equilateral triangle. The analytical models were derived from contact mechanics in which microspheres were regarded as spherical indenters, and finite element analysis (FEA) was used for simulation. For experimental validation, sensor samples with the two different channel cross-sections were prepared and tested. For comparison, the sensor samples were tested both with and without microspheres. The results from the analytical models, the FEA simulations, and the experiments showed reasonable agreement, confirming that the multi-material soft structure significantly improved its pressure response in terms of both linearity and sensitivity. The embedded solid particles enhanced the performance of soft sensors while maintaining their flexible and stretchable mechanical characteristic. We also provide analytical and experimental analyses of hysteresis of microfluidic soft sensors considering a resistive force to the shape recovery of the polymer structure by the embedded viscous fluid.
Computational Fluid Dynamics (CFD) Computations With Zonal Navier-Stokes Flow Solver (ZNSFLOW) Common High Performance Computing Scalable Software Initiative (CHSSI) Software

National Research Council Canada - National Science Library

Edge, Harris

1999-01-01

...), computational fluid dynamics (CFD) 6 project. Under the project, a proven zonal Navier-Stokes solver was rewritten for scalable parallel performance on both shared memory and distributed memory high performance computers...
Component-based software for high-performance scientific computing

Energy Technology Data Exchange (ETDEWEB)

Alexeev, Yuri; Allan, Benjamin A; Armstrong, Robert C; Bernholdt, David E; Dahlgren, Tamara L; Gannon, Dennis; Janssen, Curtis L; Kenny, Joseph P; Krishnan, Manojkumar; Kohl, James A; Kumfert, Gary; McInnes, Lois Curfman; Nieplocha, Jarek; Parker, Steven G; Rasmussen, Craig; Windus, Theresa L

2005-01-01

Recent advances in both computational hardware and multidisciplinary science have given rise to an unprecedented level of complexity in scientific simulation software. This paper describes an ongoing grass roots effort aimed at addressing complexity in high-performance computing through the use of Component-Based Software Engineering (CBSE). Highlights of the benefits and accomplishments of the Common Component Architecture (CCA) Forum and SciDAC ISIC are given, followed by an illustrative example of how the CCA has been applied to drive scientific discovery in quantum chemistry. Thrusts for future research are also described briefly.
Component-based software for high-performance scientific computing

International Nuclear Information System (INIS)

Alexeev, Yuri; Allan, Benjamin A; Armstrong, Robert C; Bernholdt, David E; Dahlgren, Tamara L; Gannon, Dennis; Janssen, Curtis L; Kenny, Joseph P; Krishnan, Manojkumar; Kohl, James A; Kumfert, Gary; McInnes, Lois Curfman; Nieplocha, Jarek; Parker, Steven G; Rasmussen, Craig; Windus, Theresa L

2005-01-01

Recent advances in both computational hardware and multidisciplinary science have given rise to an unprecedented level of complexity in scientific simulation software. This paper describes an ongoing grass roots effort aimed at addressing complexity in high-performance computing through the use of Component-Based Software Engineering (CBSE). Highlights of the benefits and accomplishments of the Common Component Architecture (CCA) Forum and SciDAC ISIC are given, followed by an illustrative example of how the CCA has been applied to drive scientific discovery in quantum chemistry. Thrusts for future research are also described briefly
Bootstrap embedding: An internally consistent fragment-based method

Energy Technology Data Exchange (ETDEWEB)

Welborn, Matthew; Tsuchimochi, Takashi; Van Voorhis, Troy [Department of Chemistry, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, Massachusetts 02139 (United States)

2016-08-21

Strong correlation poses a difficult problem for electronic structure theory, with computational cost scaling quickly with system size. Fragment embedding is an attractive approach to this problem. By dividing a large complicated system into smaller manageable fragments “embedded” in an approximate description of the rest of the system, we can hope to ameliorate the steep cost of correlated calculations. While appealing, these methods often converge slowly with fragment size because of small errors at the boundary between fragment and bath. We describe a new electronic embedding method, dubbed “Bootstrap Embedding,” a self-consistent wavefunction-in-wavefunction embedding theory that uses overlapping fragments to improve the description of fragment edges. We apply this method to the one dimensional Hubbard model and a translationally asymmetric variant, and find that it performs very well for energies and populations. We find Bootstrap Embedding converges rapidly with embedded fragment size, overcoming the surface-area-to-volume-ratio error typical of many embedding methods. We anticipate that this method may lead to a low-scaling, high accuracy treatment of electron correlation in large molecular systems.
Research on Face Recognition Based on Embedded System

Directory of Open Access Journals (Sweden)

Hong Zhao

2013-01-01

Because face recognition requires storing a large amount of image feature data and executing complex calculations, it has so far been realized only on high-performance PCs. In this paper, the OpenCV facial Haar-like features were used to identify the face region; Principal Component Analysis (PCA) was employed for quick extraction of face features, and the Euclidean distance was adopted for face recognition. In this way, the data amount and computational complexity of face recognition are reduced effectively, and face recognition can be carried out on an embedded platform. Finally, an embedded face recognition system was constructed on the Tiny6410 embedded platform. The test results showed that the system operates stably, has a high recognition rate, and can be used in portable and mobile identification and authentication.
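The Euclidean-distance matching step mentioned in this abstract can be sketched in a few lines. The sketch assumes the feature vectors have already been projected by PCA, which is not shown; the function `nearest_identity`, the gallery layout, and the sample values are hypothetical and are not taken from the paper.

```python
import math
from typing import Dict, List, Tuple


def nearest_identity(probe: List[float],
                     gallery: Dict[str, List[float]]) -> Tuple[str, float]:
    """Return the gallery label closest to `probe` and the Euclidean distance to it."""
    if not gallery:
        raise ValueError("gallery is empty")
    best_name, best_dist = "", math.inf
    for name, features in gallery.items():
        dist = math.dist(probe, features)  # Euclidean distance (Python 3.8+)
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name, best_dist


# Made-up two-component feature vectors standing in for PCA projections.
gallery = {"person_a": [0.0, 0.0], "person_b": [3.0, 4.0]}
name, distance = nearest_identity([2.9, 4.1], gallery)
```

In practice a distance threshold would usually be added so that unknown faces are rejected rather than assigned to the nearest enrolled identity; the abstract does not say whether the authors do this.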
High-performance computing on GPUs for resistivity logging of oil and gas wells

Science.gov (United States)

Glinskikh, V.; Dudaev, A.; Nechaev, O.; Surodina, I.

2017-10-01

We developed and implemented in software an algorithm for high-performance simulation of electrical logs from oil and gas wells using high-performance heterogeneous computing. The numerical solution of the 2D forward problem is based on the finite-element method and the Cholesky decomposition for solving a system of linear algebraic equations (SLAE). Software implementations of the algorithm were made using the NVIDIA CUDA technology and computing libraries, allowing us to perform the decomposition of the SLAE and find its solution on the central processing unit (CPU) and the graphics processing unit (GPU). The calculation time is analyzed depending on the matrix size and the number of its non-zero elements. We estimated the computing speed on CPU and GPU, including high-performance heterogeneous CPU-GPU computing. Using the developed algorithm, we simulated resistivity data in realistic models.
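Solving a linear system by Cholesky decomposition, as named in this abstract, follows a factor-then-substitute pattern that can be sketched as below. This is a dense, CPU-only, pure-Python illustration under the assumption of a symmetric positive-definite matrix; it is not the authors' CUDA implementation, and the function names and the 2x2 example are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def cholesky(a: Matrix) -> Matrix:
    """Return lower-triangular L with L * L^T == a (a symmetric positive definite)."""
    n = len(a)
    lower = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(lower[i][k] * lower[j][k] for k in range(j))
            if i == j:
                d = a[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                lower[i][j] = math.sqrt(d)
            else:
                lower[i][j] = (a[i][j] - s) / lower[j][j]
    return lower


def solve(a: Matrix, b: List[float]) -> List[float]:
    """Solve a x = b via a = L L^T: forward substitution, then back substitution."""
    lower = cholesky(a)
    n = len(b)
    y = [0.0] * n
    for i in range(n):  # L y = b
        y[i] = (b[i] - sum(lower[i][k] * y[k] for k in range(i))) / lower[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):  # L^T x = y
        x[i] = (y[i] - sum(lower[k][i] * x[k] for k in range(i + 1, n))) / lower[i][i]
    return x


# Small made-up system: 4x + 2y = 10, 2x + 3y = 8.
x = solve([[4.0, 2.0], [2.0, 3.0]], [10.0, 8.0])
```

Finite-element matrices are typically sparse, so a production solver would use a sparse factorization; the dense loops here only show the structure of the method.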
High performance computing and communications: FY 1995 implementation plan

Energy Technology Data Exchange (ETDEWEB)

NONE

1994-04-01

The High Performance Computing and Communications (HPCC) Program was formally established following passage of the High Performance Computing Act of 1991, signed on December 9, 1991. Ten federal agencies in collaboration with scientists and managers from US industry, universities, and laboratories have developed the HPCC Program to meet the challenges of advancing computing and associated communications technologies and practices. This plan provides a detailed description of the agencies' HPCC implementation plans for FY 1994 and FY 1995. This Implementation Plan contains three additional sections. Section 3 provides an overview of the HPCC Program definition and organization. Section 4 contains a breakdown of the five major components of the HPCC Program, with an emphasis on the overall directions and milestones planned for each one. Section 5 provides a detailed look at HPCC Program activities within each agency. Although the Department of Education is an official HPCC agency, its current funding and reporting of crosscut activities goes through the Committee on Education and Health Resources, not the HPCC Program. For this reason the Implementation Plan covers nine HPCC agencies.
Computational Environments and Analysis methods available on the NCI High Performance Computing (HPC) and High Performance Data (HPD) Platform

Science.gov (United States)

Evans, B. J. K.; Foster, C.; Minchin, S. A.; Pugh, T.; Lewis, A.; Wyborn, L. A.; Evans, B. J.; Uhlherr, A.

2014-12-01

The National Computational Infrastructure (NCI) has established a powerful in-situ computational environment to enable both high performance computing and data-intensive science across a wide spectrum of national environmental data collections, in particular climate, observational data and geoscientific assets. This paper examines 1) the computational environments that support the modelling and data processing pipelines, 2) the analysis environments and methods to support data analysis, and 3) the progress in addressing harmonisation of the underlying data collections for future transdisciplinary research that enables accurate climate projections. NCI makes available 10+ PB of major data collections from both the government and research sectors based on six themes: 1) weather, climate, and earth system science model simulations, 2) marine and earth observations, 3) geosciences, 4) terrestrial ecosystems, 5) water and hydrology, and 6) astronomy, social and biosciences. Collectively they span the lithosphere, crust, biosphere, hydrosphere, troposphere, and stratosphere. The data is largely sourced from NCI's partners (which include the custodians of many of the national scientific records), major research communities, and collaborating overseas organisations. The data is accessible within an integrated HPC-HPD environment: a 1.2 PFlop supercomputer (Raijin), an HPC-class 3000-core OpenStack cloud system and several highly connected large scale and high-bandwidth Lustre filesystems. This computational environment supports a catalogue of integrated reusable software and workflows from earth system and ecosystem modelling, weather research, satellite and other observed data processing and analysis. To enable transdisciplinary research on this scale, data needs to be harmonised so that researchers can readily apply techniques and software across the corpus of data available and not be constrained to work within artificial disciplinary boundaries. Future challenges will
A Middleware Platform for Providing Mobile and Embedded Computing Instruction to Software Engineering Students

Science.gov (United States)

Mattmann, C. A.; Medvidovic, N.; Malek, S.; Edwards, G.; Banerjee, S.

2012-01-01

As embedded software systems have grown in number, complexity, and importance in the modern world, a corresponding need to teach computer science students how to effectively engineer such systems has arisen. Embedded software systems, such as those that control cell phones, aircraft, and medical equipment, are subject to requirements and…
High-performance simulation-based algorithms for an alpine ski racer’s trajectory optimization in heterogeneous computer systems

Directory of Open Access Journals (Sweden)

Dębski Roman

2014-09-01

Effective, simulation-based trajectory optimization algorithms adapted to heterogeneous computers are studied with reference to a problem taken from alpine ski racing (the presented solution is probably the most general one published so far). The key idea behind these algorithms is to use a grid-based discretization scheme to transform the continuous optimization problem into a search problem over a specially constructed finite graph, and then to apply dynamic programming to find an approximation of the global solution. In the analyzed example it is the minimum-time ski line, represented as a piecewise-linear function (a method of elimination of infeasible solutions is proposed). Serial and parallel versions of the basic optimization algorithm are presented in detail (pseudo-code, time and memory complexity). Possible extensions of the basic algorithm are also described. The implementation of these algorithms is based on OpenCL. The included experimental results show that contemporary heterogeneous computers can be treated as μ-HPC platforms: they offer high performance (the best speedup was equal to 128) while remaining energy and cost efficient (which is crucial in embedded systems, e.g., trajectory planners of autonomous robots). The presented algorithms can be applied to many trajectory optimization problems, including those having a black-box represented performance measure.
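The combination of grid-based discretization and dynamic programming described in this abstract can be sketched as a shortest-path search over a layered graph. This is a serial, simplified illustration: the function `min_time_path`, the layer layout, and the straight-line cost used in place of a simulated segment time are all assumptions, not the paper's OpenCL algorithm or its skier model.

```python
import math
from typing import Callable, List, Tuple


def min_time_path(layers: List[List[float]],
                  segment_time: Callable[[int, float, float], float]
                  ) -> Tuple[float, List[float]]:
    """Dynamic programming over a layered graph.

    layers[i] holds the candidate lateral positions in layer i, and
    segment_time(i, a, b) is the cost of going from position a in layer i
    to position b in layer i + 1. Returns (minimum total cost, positions).
    """
    best = [0.0] * len(layers[0])   # best cost to reach each node of the current layer
    back: List[List[int]] = []      # back[i][k]: predecessor index in layer i
    for i in range(len(layers) - 1):
        next_best, choice = [], []
        for b in layers[i + 1]:
            costs = [best[k] + segment_time(i, a, b)
                     for k, a in enumerate(layers[i])]
            k_min = min(range(len(costs)), key=costs.__getitem__)
            next_best.append(costs[k_min])
            choice.append(k_min)
        best = next_best
        back.append(choice)
    # Walk the predecessor table backwards to recover the piecewise-linear line.
    k = min(range(len(best)), key=best.__getitem__)
    total = best[k]
    path = [layers[-1][k]]
    for i in range(len(back) - 1, -1, -1):
        k = back[i][k]
        path.append(layers[i][k])
    path.reverse()
    return total, path


# Stand-in cost: straight-line length between layers spaced 1 unit apart.
layers = [[0.0], [-1.0, 0.0, 1.0], [0.0]]
total, path = min_time_path(layers, lambda i, a, b: math.hypot(1.0, b - a))
```

Each layer's costs depend only on the previous layer, so the inner loop over candidate positions is independent per node, which is the kind of structure a parallel version can exploit.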
Unravelling the structure of matter on high-performance computers

International Nuclear Information System (INIS)

Kieu, T.D.; McKellar, B.H.J.

1992-11-01

The various phenomena and the different forms of matter in nature are believed to be the manifestation of only a handful of fundamental building blocks, the elementary particles, which interact through the four fundamental forces. In the study of the structure of matter at this level one has to consider forces which are not sufficiently weak to be treated as small perturbations to the system, an example of which is the strong force that binds the nucleons together. High-performance computers, both vector and parallel machines, have facilitated the necessary non-perturbative treatments. The principles and the techniques of computer simulations applied to Quantum Chromodynamics are explained; examples include the strong interactions, the calculation of the mass of nucleons and their decay rates. Some commercial and special-purpose high-performance machines for such calculations are also mentioned. 3 refs., 2 tabs
Embedded Web Technology: Applying World Wide Web Standards to Embedded Systems

Science.gov (United States)

Ponyik, Joseph G.; York, David W.

2002-01-01

Embedded Systems have traditionally been developed in a highly customized manner. The user interface hardware and software along with the interface to the embedded system are typically unique to the system for which they are built, resulting in extra cost to the system in terms of development time and maintenance effort. World Wide Web standards have been developed in the past ten years with the goal of allowing servers and clients to interoperate seamlessly. The client and server systems can consist of differing hardware and software platforms but the World Wide Web standards allow them to interface without knowing about the details of the system at the other end of the interface. Embedded Web Technology is the merging of Embedded Systems with the World Wide Web. Embedded Web Technology decreases the cost of developing and maintaining the user interface by allowing the user to interface to the embedded system through a web browser running on a standard personal computer. Embedded Web Technology can also be used to simplify an Embedded System's internal network.
Parameters that affect parallel processing for computational electromagnetic simulation codes on high performance computing clusters

Science.gov (United States)

Moon, Hongsik

What is the impact of multicore and associated advanced technologies on computational software for science? Most researchers and students have multicore laptops or desktops for their research and they need computing power to run computational software packages. Computing power was initially derived from Central Processing Unit (CPU) clock speed. That changed when increases in clock speed became constrained by power requirements. Chip manufacturers turned to multicore CPU architectures and associated technological advancements to create the CPUs for the future. Most software applications benefited from the increased computing power in the same way that increases in clock speed helped applications run faster. However, for Computational ElectroMagnetics (CEM) software developers, this change was not an obvious benefit - it appeared to be a detriment. Developers were challenged to find a way to correctly utilize the advancements in hardware so that their codes could benefit. The solution was parallelization and this dissertation details the investigation to address these challenges. Prior to multicore CPUs, advanced computer technologies were compared on performance using benchmark software, and the metric was FLoating-point Operations Per Second (FLOPS), which indicates system performance for scientific applications that make heavy use of floating-point calculations. Is FLOPS an effective metric for parallelized CEM simulation tools on new multicore systems? Parallel CEM software needs to be benchmarked not only by FLOPS but also by the performance of other parameters related to type and utilization of the hardware, such as CPU, Random Access Memory (RAM), hard disk, network, etc. The codes need to be optimized for more than just FLOPS and new parameters must be included in benchmarking. In this dissertation, the parallel CEM software named High Order Basis Based Integral Equation Solver (HOBBIES) is introduced. This code was developed to address the needs of the
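As a rough illustration of the FLOPS metric discussed in this abstract, the sketch below times a naive matrix product and divides its known operation count by the elapsed time. It is not taken from the dissertation or from HOBBIES; the function names are hypothetical, and pure Python is used only to keep the example self-contained.

```python
import time


def matmul_flop_count(n):
    """Floating-point operations in a naive n x n matrix product:
    n*n output entries, each built from n multiplies and n adds."""
    return 2 * n ** 3


def naive_matmul(a, b):
    """Triple-loop matrix product of two square lists-of-lists."""
    n = len(a)
    c = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            aik = a[i][k]
            for j in range(n):
                c[i][j] += aik * b[k][j]
    return c


def measured_flops(n):
    """Achieved FLOPS = operation count / wall-clock time."""
    a = [[float(i + j) for j in range(n)] for i in range(n)]
    b = [[float(i - j) for j in range(n)] for i in range(n)]
    start = time.perf_counter()
    naive_matmul(a, b)
    # Guard against a zero reading on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return matmul_flop_count(n) / elapsed
```

A single number like this says nothing about memory, disk, or network behaviour, which is the limitation the abstract raises for parallel CEM codes.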
Performance of particle in cell methods on highly concurrent computational architectures

International Nuclear Information System (INIS)

Adams, M.F.; Ethier, S.; Wichmann, N.

2009-01-01

Particle in cell (PIC) methods are effective in computing the Vlasov-Poisson system of equations used in simulations of magnetic fusion plasmas. PIC methods use grid based computations, for solving Poisson's equation or more generally Maxwell's equations, as well as Monte-Carlo type methods to sample the Vlasov equation. The presence of two types of discretizations, deterministic field solves and Monte-Carlo methods for the Vlasov equation, poses challenges in understanding and optimizing performance on today's large scale computers, which require high levels of concurrency. These challenges arise from the need to optimize two very different types of processes and the interactions between them. Modern cache based high-end computers have very deep memory hierarchies and high degrees of concurrency which must be utilized effectively to achieve good performance. The effective use of these machines requires maximizing concurrency by eliminating serial or redundant work and minimizing global communication. A related issue is minimizing the memory traffic between levels of the memory hierarchy because performance is often limited by the bandwidths and latencies of the memory system. This paper discusses some of the performance issues, particularly in regard to parallelism, of PIC methods. The gyrokinetic toroidal code (GTC) is used for these studies and a new radial grid decomposition is presented and evaluated. Scaling of the code is demonstrated on ITER sized plasmas with up to 16K Cray XT3/4 cores.
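To make the two kinds of work in a PIC step concrete (particle operations and a grid-based field solve), here is a minimal one-dimensional electrostatic sketch. It is a toy under stated assumptions, not GTC: the domain is periodic, units are chosen so that the permittivity is 1, a uniform neutralising background is assumed, and all function names are hypothetical.

```python
import math


def deposit_charge(xs, q, ng, length):
    """Scatter particle charge to grid nodes with linear (cloud-in-cell) weights."""
    dx = length / ng
    rho = [0.0] * ng
    for x in xs:
        g = (x % length) / dx
        i = int(math.floor(g)) % ng
        f = g - math.floor(g)          # fractional distance past node i
        rho[i] += q * (1.0 - f) / dx
        rho[(i + 1) % ng] += q * f / dx
    return rho


def solve_field(rho, length):
    """Integrate dE/dx = rho on the periodic grid (trapezoid rule)."""
    ng = len(rho)
    dx = length / ng
    mean_rho = sum(rho) / ng
    net = [r - mean_rho for r in rho]  # assumed neutralising background
    e = [0.0] * ng
    for i in range(1, ng):
        e[i] = e[i - 1] + 0.5 * (net[i - 1] + net[i]) * dx
    mean_e = sum(e) / ng               # remove the arbitrary constant
    return [ei - mean_e for ei in e]


def pic_step(xs, vs, q, m, ng, length, dt):
    """One step: deposit, field solve, gather, then push the particles."""
    dx = length / ng
    e = solve_field(deposit_charge(xs, q, ng, length), length)
    new_xs, new_vs = [], []
    for x, v in zip(xs, vs):
        g = (x % length) / dx
        i = int(math.floor(g)) % ng
        f = g - math.floor(g)
        e_p = (1.0 - f) * e[i] + f * e[(i + 1) % ng]  # gather field at particle
        v_new = v + (q / m) * e_p * dt
        new_vs.append(v_new)
        new_xs.append((x + v_new * dt) % length)
    return new_xs, new_vs
```

The deposit and gather loops touch grid memory at particle-dependent locations, while the field solve is a regular sweep over the grid; that contrast is the source of the memory-traffic and concurrency issues the abstract describes.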
Performance of particle in cell methods on highly concurrent computational architectures

International Nuclear Information System (INIS)

Adams, M F; Ethier, S; Wichmann, N

2007-01-01

Particle in cell (PIC) methods are effective in computing the Vlasov-Poisson system of equations used in simulations of magnetic fusion plasmas. PIC methods use grid based computations, for solving Poisson's equation or more generally Maxwell's equations, as well as Monte-Carlo type methods to sample the Vlasov equation. The presence of two types of discretizations, deterministic field solves and Monte-Carlo methods for the Vlasov equation, poses challenges in understanding and optimizing performance on today's large scale computers, which require high levels of concurrency. These challenges arise from the need to optimize two very different types of processes and the interactions between them. Modern cache based high-end computers have very deep memory hierarchies and high degrees of concurrency which must be utilized effectively to achieve good performance. The effective use of these machines requires maximizing concurrency by eliminating serial or redundant work and minimizing global communication. A related issue is minimizing the memory traffic between levels of the memory hierarchy because performance is often limited by the bandwidths and latencies of the memory system. This paper discusses some of the performance issues, particularly in regard to parallelism, of PIC methods. The gyrokinetic toroidal code (GTC) is used for these studies and a new radial grid decomposition is presented and evaluated. Scaling of the code is demonstrated on ITER sized plasmas with up to 16K Cray XT3/4 cores.
5th International Conference on High Performance Scientific Computing

CERN Document Server

Hoang, Xuan; Rannacher, Rolf; Schlöder, Johannes

2014-01-01

This proceedings volume gathers a selection of papers presented at the Fifth International Conference on High Performance Scientific Computing, which took place in Hanoi on March 5-9, 2012. The conference was organized by the Institute of Mathematics of the Vietnam Academy of Science and Technology (VAST), the Interdisciplinary Center for Scientific Computing (IWR) of Heidelberg University, Ho Chi Minh City University of Technology, and the Vietnam Institute for Advanced Study in Mathematics. The contributions cover the broad interdisciplinary spectrum of scientific computing and present recent advances in theory, development of methods, and practical applications. Subjects covered include mathematical modeling; numerical simulation; methods for optimization and control; parallel computing; software development; and applications of scientific computing in physics, mechanics and biomechanics, material science, hydrology, chemistry, biology, biotechnology, medicine, sports, psychology, transport, logistics, com...
3rd International Conference on High Performance Scientific Computing

CERN Document Server

Kostina, Ekaterina; Phu, Hoang; Rannacher, Rolf

2008-01-01

This proceedings volume contains a selection of papers presented at the Third International Conference on High Performance Scientific Computing held at the Hanoi Institute of Mathematics, Vietnamese Academy of Science and Technology (VAST), March 6-10, 2006. The conference has been organized by the Hanoi Institute of Mathematics, Interdisciplinary Center for Scientific Computing (IWR), Heidelberg, and its International PhD Program "Complex Processes: Modeling, Simulation and Optimization", and Ho Chi Minh City University of Technology. The contributions cover the broad interdisciplinary spectrum of scientific computing and present recent advances in theory, development of methods, and applications in practice. Subjects covered are mathematical modelling, numerical simulation, methods for optimization and control, parallel computing, software development, applications of scientific computing in physics, chemistry, biology and mechanics, environmental and hydrology problems, transport, logistics and site loca...

6th International Conference on High Performance Scientific Computing

CERN Document Server

Phu, Hoang; Rannacher, Rolf; Schlöder, Johannes

2017-01-01

This proceedings volume highlights a selection of papers presented at the Sixth International Conference on High Performance Scientific Computing, which took place in Hanoi, Vietnam on March 16-20, 2015. The conference was jointly organized by the Heidelberg Institute of Theoretical Studies (HITS), the Institute of Mathematics of the Vietnam Academy of Science and Technology (VAST), the Interdisciplinary Center for Scientific Computing (IWR) at Heidelberg University, and the Vietnam Institute for Advanced Study in Mathematics, Ministry of Education. The contributions cover a broad, interdisciplinary spectrum of scientific computing and showcase recent advances in theory, methods, and practical applications. Subjects covered include numerical simulation, methods for optimization and control, parallel computing, and software development, as well as the applications of scientific computing in physics, mechanics, biomechanics and robotics, material science, hydrology, biotechnology, medicine, transport, scheduling, and in...
High performance computing system in the framework of the Higgs boson studies

CERN Document Server

Belyaev, Nikita; The ATLAS collaboration; Velikhov, Vasily; Konoplich, Rostislav

2017-01-01

Higgs boson physics is one of the most important and promising fields of study in modern high energy physics. It is important to note that GRID computing resources are becoming strictly limited due to the increasing amount of statistics required for physics analyses and the unprecedented LHC performance. One of the possibilities to address the shortfall of computing resources is the usage of institutes' computer clusters, commercial computing resources and supercomputers. To perform precision measurements of the Higgs boson properties under these conditions, it is also essential to have effective instruments to simulate kinematic distributions of signal events. In this talk we give a brief description of the modern distribution reconstruction method called Morphing and perform a few efficiency tests to demonstrate its potential. These studies have been performed on the WLCG and Kurchatov Institute’s Data Processing Center, including a Tier-1 GRID site and a supercomputer as well. We also analyze the CPU efficienc...
8th International Workshop on Parallel Tools for High Performance Computing

CERN Document Server

Gracia, José; Knüpfer, Andreas; Resch, Michael; Nagel, Wolfgang

2015-01-01

Numerical simulation and modelling using High Performance Computing has evolved into an established technique in academic and industrial research. At the same time, the High Performance Computing infrastructure is becoming ever more complex. For instance, most of the current top systems around the world use thousands of nodes in which classical CPUs are combined with accelerator cards in order to enhance their compute power and energy efficiency. This complexity can only be mastered with adequate development and optimization tools. Key topics addressed by these tools include parallelization on heterogeneous systems, performance optimization for CPUs and accelerators, debugging of increasingly complex scientific applications, and optimization of energy usage in the spirit of green IT. This book represents the proceedings of the 8th International Parallel Tools Workshop, held October 1-2, 2014 in Stuttgart, Germany, which is a forum to discuss the latest advancements in parallel tools.
Numerical simulation of a hovering rotor using embedded grids

Science.gov (United States)

Duque, Earl-Peter N.; Srinivasan, Ganapathi R.

1992-01-01

The flow field for a rotor blade in hover was computed by numerically solving the compressible thin-layer Navier-Stokes equations on embedded grids. In this work, three embedded grids were used to discretize the flow field - one for the rotor blade and two to convect the rotor wake. The computations were performed at two hovering test conditions, for a two-bladed rectangular rotor of aspect ratio six. The results compare fairly with experiment and illustrate the use of embedded grids in solving helicopter type flow fields.
A High Performance COTS Based Computer Architecture

Science.gov (United States)

Patte, Mathieu; Grimoldi, Raoul; Trautner, Roland

2014-08-01

Using Commercial Off The Shelf (COTS) electronic components for space applications is a long standing idea. Indeed the difference in processing performance and energy efficiency between radiation hardened components and COTS components is so large that COTS components are very attractive for use in mass and power constrained systems. However using COTS components in space is not straightforward, as one must account for the effects of the space environment on the behavior of the COTS components. In the frame of the ESA funded activity called High Performance COTS Based Computer, Airbus Defense and Space and its subcontractor OHB CGS have developed and prototyped a versatile COTS based architecture for high performance processing. The rest of the paper is organized as follows: in a first section we will start by recapitulating the interests and constraints of using COTS components for space applications; then we will briefly describe existing fault mitigation architectures and present our solution for fault mitigation based on a component called the SmartIO; in the last part of the paper we will describe the prototyping activities executed during the HiP CBC project.
PCB Embedded Inductor for High-Frequency ZVS SEPIC Converter

DEFF Research Database (Denmark)

Dou, Yi; Ouyang, Ziwei; Thummala, Prasanth

2018-01-01

The volume and temperature rise of passive components, especially inductors, limit the momentum toward high power density in high-frequency power converters. To address the limitations, PCB integration of passive components should be considered with the benefit of low profile, excellent thermal...... characteristic and cost reduction. This paper investigates an embedded structure of inductors to further increase the power density of a low power DC-DC converter. A pair of coupling inductors have been embedded into the PCB. The detailed embedded process has been described and the characteristics of embedded...
High-performance computational fluid dynamics: a custom-code approach

International Nuclear Information System (INIS)

Fannon, James; Náraigh, Lennon Ó; Loiseau, Jean-Christophe; Valluri, Prashant; Bethune, Iain

2016-01-01

We introduce a modified and simplified version of the pre-existing fully parallelized three-dimensional Navier–Stokes flow solver known as TPLS. We demonstrate how the simplified version can be used as a pedagogical tool for the study of computational fluid dynamics (CFDs) and parallel computing. TPLS is at its heart a two-phase flow solver, and uses calls to a range of external libraries to accelerate its performance. However, in the present context we narrow the focus of the study to basic hydrodynamics and parallel computing techniques, and the code is therefore simplified and modified to simulate pressure-driven single-phase flow in a channel, using only relatively simple Fortran 90 code with MPI parallelization, but no calls to any other external libraries. The modified code is analysed in order to both validate its accuracy and investigate its scalability up to 1000 CPU cores. Simulations are performed for several benchmark cases in pressure-driven channel flow, including a turbulent simulation, wherein the turbulence is incorporated via the large-eddy simulation technique. The work may be of use to advanced undergraduate and graduate students as an introductory study in CFDs, while also providing insight for those interested in more general aspects of high-performance computing. (paper)
High-performance computational fluid dynamics: a custom-code approach

Science.gov (United States)

Fannon, James; Loiseau, Jean-Christophe; Valluri, Prashant; Bethune, Iain; Náraigh, Lennon Ó.

2016-07-01

We introduce a modified and simplified version of the pre-existing fully parallelized three-dimensional Navier-Stokes flow solver known as TPLS. We demonstrate how the simplified version can be used as a pedagogical tool for the study of computational fluid dynamics (CFDs) and parallel computing. TPLS is at its heart a two-phase flow solver, and uses calls to a range of external libraries to accelerate its performance. However, in the present context we narrow the focus of the study to basic hydrodynamics and parallel computing techniques, and the code is therefore simplified and modified to simulate pressure-driven single-phase flow in a channel, using only relatively simple Fortran 90 code with MPI parallelization, but no calls to any other external libraries. The modified code is analysed in order to both validate its accuracy and investigate its scalability up to 1000 CPU cores. Simulations are performed for several benchmark cases in pressure-driven channel flow, including a turbulent simulation, wherein the turbulence is incorporated via the large-eddy simulation technique. The work may be of use to advanced undergraduate and graduate students as an introductory study in CFDs, while also providing insight for those interested in more general aspects of high-performance computing.
Embedded Systems Design with FPGAs

CERN Document Server

Pnevmatikatos, Dionisios; Sklavos, Nicolas

2013-01-01

This book presents methodologies for modern applications of embedded systems design, using field programmable gate array (FPGA) devices. Coverage includes state-of-the-art research from academia and industry on a wide range of topics, including advanced electronic design automation (EDA), novel system architectures, embedded processors, arithmetic, dynamic reconfiguration and applications. Describes a variety of methodologies for modern embedded systems design; Implements methodologies presented on FPGAs; Covers a wide variety of applications for reconfigurable embedded systems, including Bioinformatics, Communications and networking, Application acceleration, Medical solutions, Experiments for high energy physics, Astronomy, Aerospace, Biologically inspired systems and Computational fluid dynamics (CFD).
Diamond High Assurance Security Program: Trusted Computing Exemplar

Science.gov (United States)

2002-09-01

computing component, the Embedded MicroKernel Prototype. A third-party evaluation of the component will be initiated during development (e.g., once...target technologies and larger projects is a topic for future research. Trusted Computing Reference Component – The Embedded MicroKernel Prototype We...Kernel The primary security function of the Embedded MicroKernel will be to enforce process and data-domain separation, while providing primitive
SCEAPI: A unified Restful Web API for High-Performance Computing

Science.gov (United States)

Rongqiang, Cao; Haili, Xiao; Shasha, Lu; Yining, Zhao; Xiaoning, Wang; Xuebin, Chi

2017-10-01

The development of scientific computing is increasingly moving to collaborative web and mobile applications. All these applications need a high-quality programming interface for accessing heterogeneous computing resources consisting of clusters, grid computing or cloud computing. In this paper, we introduce our high-performance computing environment that integrates computing resources from 16 HPC centers across China. Then we present a bundle of web services called SCEAPI and describe how it can be used to access HPC resources with the HTTP or HTTPS protocols. We discuss SCEAPI from several aspects including architecture, implementation and security, and address specific challenges in designing compatible interfaces and protecting sensitive data. We describe the functions of SCEAPI including authentication, file transfer and job management for creating, submitting and monitoring, and how to use SCEAPI in an easy-to-use way. Finally, we discuss how to exploit more HPC resources quickly for the ATLAS experiment by implementing the custom ARC compute element based on SCEAPI, and our work shows that SCEAPI is an easy-to-use and effective solution to extend opportunistic HPC resources.
Multi-Language Programming Environments for High Performance Java Computing

OpenAIRE

Vladimir Getov; Paul Gray; Sava Mintchev; Vaidy Sunderam

1999-01-01

Recent developments in processor capabilities, software tools, programming languages and programming paradigms have brought about new approaches to high performance computing. A steadfast component of this dynamic evolution has been the scientific community’s reliance on established scientific packages. As a consequence, programmers of high‐performance applications are reluctant to embrace evolving languages such as Java. This paper describes the Java‐to‐C Interface (JCI) tool which provides ...
FY 1993 Blue Book: Grand Challenges 1993: High Performance Computing and Communications

Data.gov (United States)

Networking and Information Technology Research and Development, Executive Office of the President — High performance computing and computer communications networks are becoming increasingly important to scientific advancement, economic competition, and national...
Embedded Sensors and Controls to Improve Component Performance and Reliability: Conceptual Design Report

Energy Technology Data Exchange (ETDEWEB)

Kisner, Roger A [ORNL; Melin, Alexander M [ORNL; Burress, Timothy A [ORNL; Fugate, David L [ORNL; Holcomb, David Eugene [ORNL; Wilgen, John B [ORNL; Miller, John M [ORNL; Wilson, Dane F [ORNL; Silva, Pamela C [ORNL; Whitlow, Lynsie J [ORNL; Peretz, Fred J [ORNL

2012-10-01

The overall project objective is to demonstrate improved reliability and increased performance made possible by deeply embedding instrumentation and controls (I&C) in nuclear power plant components. The project is employing a highly instrumented canned rotor, magnetic bearing, fluoride salt pump as its I&C technology demonstration vehicle. The project's focus is not primarily on pump design, but instead is on methods to deeply embed I&C within a pump system. However, because the I&C is intimately part of the basic millisecond-by-millisecond functioning of the pump, the I&C design cannot proceed in isolation from the other aspects of the pump. The pump will not function if the characteristics of the I&C are not embedded within the design because the I&C enables performance of the basic function rather than merely monitoring quasi-stable performance. Traditionally, I&C has been incorporated in nuclear power plant (NPP) components after their design is nearly complete; adequate performance was obtained through over-design. This report describes the progress and status of the project and provides a conceptual design overview for the embedded I&C pump.
High-Performance Computing Paradigm and Infrastructure

CERN Document Server

Yang, Laurence T

2006-01-01

With hyperthreading in Intel processors, hypertransport links in next generation AMD processors, multi-core silicon in today's high-end microprocessors from IBM and emerging grid computing, parallel and distributed computers have moved into the mainstream
Computers as Components Principles of Embedded Computing System Design

CERN Document Server

Wolf, Wayne

2008-01-01

This book was the first to bring essential knowledge on embedded systems technology and techniques under a single cover. This second edition has been updated to the state-of-the-art by reworking and expanding performance analysis with more examples and exercises, and coverage of electronic systems now focuses on the latest applications. Researchers, students, and savvy professionals schooled in hardware or software design will value Wayne Wolf's integrated engineering design approach. The second edition gives a more comprehensive view of multiprocessors including VLIW and superscalar archite
Real-Time Video Convolutional Face Finder on Embedded Platforms

Directory of Open Access Journals (Sweden)

Mamalet Franck

2007-01-01

A high-level optimization methodology is applied for implementing the well-known convolutional face finder (CFF) algorithm for real-time applications on mobile phones, such as teleconferencing, advanced user interfaces, image indexing, and security access control. CFF is based on a feature extraction and classification technique which consists of a pipeline of convolutions and subsampling operations. The design of embedded systems requires a good trade-off between performance and code size due to the limited amount of available resources. The followed methodology copes with the main drawbacks of the original implementation of CFF such as floating-point computation and memory allocation, in order to allow parallelism exploitation and perform algorithm optimizations. Experimental results show that our embedded face detection system can accurately locate faces with less computational load and memory cost. It runs on a 275 MHz Starcore DSP at 35 QCIF images/s with state-of-the-art detection rates and very low false alarm rates.
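The abstract above describes a pipeline of convolutions and subsampling, and the replacement of floating-point computation. The sketch below shows one convolution stage followed by one 2x2 subsampling stage in integer fixed-point arithmetic. It is a simplified illustration, not the authors' implementation: the Q-format with 8 fractional bits, the plain 2x2 averaging, and all function names are assumptions, and the trainable coefficients and activation functions of a real network are left out.

```python
FRAC_BITS = 8  # hypothetical Q-format: 8 fractional bits


def to_fixed(x):
    """Convert a float to its integer fixed-point representation."""
    return int(round(x * (1 << FRAC_BITS)))


def conv2d_fixed(image, kernel):
    """'Valid' 2D convolution (correlation form) on fixed-point integers."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            # A product of two Q8 values carries 16 fractional bits;
            # shift once to return the accumulated sum to Q8.
            row.append(acc >> FRAC_BITS)
        out.append(row)
    return out


def subsample2x2(fmap):
    """Average non-overlapping 2x2 blocks using integer arithmetic only."""
    out = []
    for r in range(0, len(fmap) - 1, 2):
        row = []
        for c in range(0, len(fmap[0]) - 1, 2):
            s = (fmap[r][c] + fmap[r][c + 1]
                 + fmap[r + 1][c] + fmap[r + 1][c + 1])
            row.append(s >> 2)
        out.append(row)
    return out
```

Accumulating at full precision and shifting once per output keeps rounding error lower than shifting after every multiply, at the cost of a wider accumulator.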
Real-Time Video Convolutional Face Finder on Embedded Platforms

Directory of Open Access Journals (Sweden)

Franck Mamalet

2007-03-01

A high-level optimization methodology is applied for implementing the well-known convolutional face finder (CFF) algorithm for real-time applications on mobile phones, such as teleconferencing, advanced user interfaces, image indexing, and security access control. CFF is based on a feature extraction and classification technique which consists of a pipeline of convolutions and subsampling operations. The design of embedded systems requires a good trade-off between performance and code size due to the limited amount of available resources. The followed methodology copes with the main drawbacks of the original implementation of CFF such as floating-point computation and memory allocation, in order to allow parallelism exploitation and perform algorithm optimizations. Experimental results show that our embedded face detection system can accurately locate faces with less computational load and memory cost. It runs on a 275 MHz Starcore DSP at 35 QCIF images/s with state-of-the-art detection rates and very low false alarm rates.
High-Level Synthesis: Productivity, Performance, and Software Constraints

Directory of Open Access Journals (Sweden)

Yun Liang

2012-01-01

FPGAs are an attractive platform for applications with high computation demand and low energy consumption requirements. However, design effort for FPGA implementations remains high, often an order of magnitude larger than design effort using high-level languages. Instead of this time-consuming process, high-level synthesis (HLS) tools generate hardware implementations from algorithm descriptions in languages such as C/C++ and SystemC. Such tools reduce design effort: high-level descriptions are more compact and less error prone. HLS tools promise hardware development abstracted from software designer knowledge of the implementation platform. In this paper, we present an unbiased study of the performance, usability and productivity of HLS using AutoPilot (a state-of-the-art HLS tool). In particular, we first evaluate AutoPilot using the popular embedded benchmark kernels. Then, to evaluate the suitability of HLS on real-world applications, we perform a case study of stereo matching, an active area of computer vision research that uses techniques also common for image denoising, image retrieval, feature matching, and face recognition. Based on our study, we provide insights on current limitations of mapping general-purpose software to hardware using HLS and some future directions for HLS tool development. We also offer several guidelines for hardware-friendly software design. For popular embedded benchmark kernels, the designs produced by HLS achieve 4X to 126X speedup over the software version. The stereo matching algorithms achieve between 3.5X and 67.9X speedup over software (but still less than manual RTL design) with a fivefold reduction in design effort versus manual RTL design.
Autonomous Multicamera Tracking on Embedded Smart Cameras

Directory of Open Access Journals (Sweden)

Bischof Horst

2007-01-01

There is currently a strong trend towards the deployment of advanced computer vision methods on embedded systems. This deployment is very challenging since embedded platforms often provide limited resources such as computing performance, memory, and power. In this paper we present a multicamera tracking method on distributed, embedded smart cameras. Smart cameras combine video sensing, processing, and communication on a single embedded device which is equipped with a multiprocessor computation and communication infrastructure. Our multicamera tracking approach focuses on a fully decentralized handover procedure between adjacent cameras. The basic idea is to initiate a single tracking instance in the multicamera system for each object of interest. The tracker follows the supervised object over the camera network, migrating to the camera which observes the object. Thus, no central coordination is required, resulting in an autonomous and scalable tracking approach. We have fully implemented this novel multicamera tracking approach on our embedded smart cameras. Tracking is achieved by the well-known CamShift algorithm; the handover procedure is realized using a mobile agent system available on the smart camera network. Our approach has been successfully evaluated on tracking persons at our campus.
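CamShift, named in the abstract above, is built around a mean-shift iteration that moves a search window to the centroid of a per-pixel probability map. The sketch below shows only that core iteration; the window resizing that CamShift adds on top, the colour-histogram back-projection, and the camera handover are all omitted. The function name and data layout are assumptions for illustration, not the authors' code.

```python
def mean_shift(prob, window, max_iter=20):
    """Shift a fixed-size window toward the centroid of the weights it covers.

    prob:   2D list (rows of columns) of non-negative weights.
    window: (x, y, w, h), top-left corner and size in pixels.
    """
    rows, cols = len(prob), len(prob[0])
    x, y, w, h = window
    for _ in range(max_iter):
        m00 = m10 = m01 = 0.0
        # Zeroth and first image moments inside the (clipped) window.
        for r in range(max(0, y), min(rows, y + h)):
            for c in range(max(0, x), min(cols, x + w)):
                p = prob[r][c]
                m00 += p
                m10 += p * c
                m01 += p * r
        if m00 == 0.0:
            break  # nothing to track inside the window
        cx, cy = m10 / m00, m01 / m00
        # Re-centre the window on the centroid.
        new_x = int(round(cx - (w - 1) / 2.0))
        new_y = int(round(cy - (h - 1) / 2.0))
        if new_x == x and new_y == y:
            break  # converged
        x, y = new_x, new_y
    return (x, y, w, h)
```

Because each iteration touches only the pixels inside the window, the cost per frame stays small, which suits the limited computing resources the abstract mentions.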

STFTP: Secure TFTP Protocol for Embedded Multi-Agent Systems Communication

Directory of Open Access Journals (Sweden)

ZAGAR, D.

2013-05-01

Today's embedded systems have evolved into multipurpose devices moving towards an embedded multi-agent system (MAS) infrastructure. With the involvement of MAS in embedded systems, one remaining issue is establishing communication between agents in low computational power and low memory embedded systems without an Embedded Operating System (EOS). One solution is the extension of the outdated Trivial File Transfer Protocol (TFTP). The main advantage of using TFTP in embedded systems is the easy implementation. However, the problem at hand is the overall lack of security mechanisms in TFTP. This paper proposes an extension to the existing TFTP in the form of added security mechanisms: STFTP. Authentication is proposed using the Digest Access Authentication process, whereas the data encryption can be performed by various cryptographic algorithms. The proposal is experimentally tested using two embedded systems based on micro-controller architecture. Communication is analyzed for authentication, data rate and transfer time versus various data encryption ciphers and file sizes. STFTP results in an expected drop in performance, which is in the range of similar encryption algorithms. The system could be improved by using embedded systems of higher computational power or by the use of hardware encryption modules.
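The abstract above names Digest Access Authentication as the authentication mechanism. As a hedged sketch of how such a challenge-response exchange can work, the code below follows the basic MD5 form of HTTP digest authentication (RFC 2617, without the optional qop extension). How STFTP actually maps this onto TFTP packets is not stated in the abstract; using the TFTP opcode name and the file name in place of the HTTP method and URI is an assumption made here for illustration, as are all function names.

```python
import hashlib
import hmac
import os


def _md5_hex(text):
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def make_nonce():
    """Server side: a fresh random challenge for each request."""
    return os.urandom(16).hex()


def digest_response(username, realm, password, nonce, opcode, filename):
    """Client side: prove knowledge of the password without sending it."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")
    # Assumption: opcode and filename stand in for HTTP method and URI.
    ha2 = _md5_hex(f"{opcode}:{filename}")
    return _md5_hex(f"{ha1}:{nonce}:{ha2}")


def verify(response, username, realm, password, nonce, opcode, filename):
    """Server side: recompute the digest and compare in constant time."""
    expected = digest_response(username, realm, password, nonce,
                               opcode, filename)
    return hmac.compare_digest(response, expected)
```

A typical exchange under these assumptions: the server sends `make_nonce()`, the client answers with `digest_response(...)` for its read or write request, and the server calls `verify(...)` before transferring any data. The per-request nonce is what prevents a captured response from being replayed.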
High Performance Computing - Power Application Programming Interface Specification.

Energy Technology Data Exchange (ETDEWEB)

Laros, James H.,; Kelly, Suzanne M.; Pedretti, Kevin; Grant, Ryan; Olivier, Stephen Lecler; Levenhagen, Michael J.; DeBonis, David

2014-08-01

Measuring and controlling the power and energy consumption of high performance computing systems by various components in the software stack is an active research area [13, 3, 5, 10, 4, 21, 19, 16, 7, 17, 20, 18, 11, 1, 6, 14, 12]. Implementations in lower level software layers are beginning to emerge in some production systems, which is very welcome. To be most effective, a portable interface to measurement and control features would significantly facilitate participation by all levels of the software stack. We present a proposal for a standard power Application Programming Interface (API) that endeavors to cover the entire software space, from generic hardware interfaces to the input from the computer facility manager.
Nuclear forces and high-performance computing: The perfect match

International Nuclear Information System (INIS)

Luu, T; Walker-Loud, A

2009-01-01

High-performance computing is now enabling the calculation of certain hadronic interaction parameters directly from Quantum Chromodynamics, the quantum field theory that governs the behavior of quarks and gluons and is ultimately responsible for the nuclear strong force. In this paper we briefly describe the state of the field and show how other aspects of hadronic interactions will be ascertained in the near future. We give estimates of computational requirements needed to obtain these goals, and outline a procedure for incorporating these results into the broader nuclear physics community.
Integrated Optical Interconnect Architectures for Embedded Systems

CERN Document Server

Nicolescu, Gabriela

2013-01-01

This book provides a broad overview of current research in optical interconnect technologies and architectures. Introductory chapters on high-performance computing and the associated issues in conventional interconnect architectures, and on the fundamental building blocks for integrated optical interconnect, provide the foundations for the bulk of the book which brings together leading experts in the field of optical interconnect architectures for data communication. Particular emphasis is given to the ways in which the photonic components are assembled into architectures to address the needs of data-intensive on-chip communication, and to the performance evaluation of such architectures for specific applications. Provides state-of-the-art research on the use of optical interconnects in Embedded Systems; Begins with coverage of the basics for high-performance computing and optical interconnect; Includes a variety of on-chip optical communication topologies; Features coverage of system integration and opti...
Adaptive GDDA-BLAST: fast and efficient algorithm for protein sequence embedding.

Directory of Open Access Journals (Sweden)

Yoojin Hong

2010-10-01

A major computational challenge in the genomic era is annotating structure/function to the vast quantities of sequence information that are now available. This problem is illustrated by the fact that most proteins lack comprehensive annotations, even when experimental evidence exists. We previously theorized that embedded-alignment profiles (simply "alignment profiles" hereafter) provide a quantitative method that is capable of relating the structural and functional properties of proteins, as well as their evolutionary relationships. A key feature of alignment profiles lies in the interoperability of data format (e.g., alignment information, physio-chemical information, genomic information, etc.). Indeed, we have demonstrated that the Position Specific Scoring Matrices (PSSMs) are an informative M-dimension that is scored by quantitatively measuring the embedded or unmodified sequence alignments. Moreover, the information obtained from these alignments is informative, and remains so even in the "twilight zone" of sequence similarity (<25% identity). Although our previous embedding strategy was powerful, it suffered from contaminating alignments (embedded AND unmodified) and high computational costs. Herein, we describe the logic and algorithmic process for a heuristic embedding strategy named "Adaptive GDDA-BLAST." Adaptive GDDA-BLAST is, on average, up to 19 times faster than, but has similar sensitivity to, our previous method. Further, data are provided to demonstrate the benefits of embedded-alignment measurements in terms of detecting structural homology in highly divergent protein sequences and isolating secondary structural elements of transmembrane and ankyrin-repeat domains. Together, these advances allow further exploration of the embedded alignment data space within sufficiently large data sets to eventually induce relevant statistical inferences. We show that sequence embedding could serve as one of the vehicles for measurement of low
NINJA: Java for High Performance Numerical Computing

Directory of Open Access Journals (Sweden)

José E. Moreira

2002-01-01

When Java was first introduced, there was a perception that its many benefits came at a significant performance cost. In the particularly performance-sensitive field of numerical computing, initial measurements indicated a hundred-fold performance disadvantage between Java and more established languages such as Fortran and C. Although much progress has been made, and Java now can be competitive with C/C++ in many important situations, significant performance challenges remain. Existing Java virtual machines are not yet capable of performing the advanced loop transformations and automatic parallelization that are now common in state-of-the-art Fortran compilers. Java also has difficulties in implementing complex arithmetic efficiently. These performance deficiencies can be attacked with a combination of class libraries (packages, in Java) that implement truly multidimensional arrays and complex numbers, and new compiler techniques that exploit the properties of these class libraries to enable other, more conventional, optimizations. Two compiler techniques, versioning and semantic expansion, can be leveraged to allow fully automatic optimization and parallelization of Java code. Our measurements with the NINJA prototype Java environment show that Java can be competitive in performance with highly optimized and tuned Fortran code.
High performance parallel computing of flows in complex geometries: II. Applications

International Nuclear Information System (INIS)

Gourdain, N; Gicquel, L; Staffelbach, G; Vermorel, O; Duchaine, F; Boussuge, J-F; Poinsot, T

2009-01-01

Present regulations on pollutant emissions and noise, together with economic constraints, require new approaches and designs in the fields of energy supply and transportation. It is now well established that the next breakthrough will come from a better understanding of unsteady flow effects and from considering the entire system rather than isolated components. However, these aspects are still not well taken into account by numerical approaches, nor well understood, whatever the design stage considered. The main challenge is essentially due to the computational requirements implied by such complex systems if they are to be simulated on supercomputers. This paper shows how new challenges can be addressed by using parallel computing platforms for distinct elements of a more complex system as encountered in aeronautical applications. Based on numerical simulations performed with modern aerodynamic and reactive flow solvers, this work underlines the interest of high-performance computing for solving flow in complex industrial configurations such as aircraft, combustion chambers and turbomachines. Performance indicators related to parallel computing efficiency are presented, showing that establishing fair criteria is a difficult task for complex industrial applications. Examples of numerical simulations performed in industrial systems are also described, with particular attention to the computational time and the potential design improvements obtained with high-fidelity and multi-physics computing methods. These simulations use either unsteady Reynolds-averaged Navier-Stokes methods or large eddy simulation and deal with turbulent unsteady flows, such as coupled flow phenomena (thermo-acoustic instabilities, buffet, etc.). Some examples of the difficulties with grid generation and data analysis encountered in these complex industrial applications are also presented.
High frequency characterization of conductive inks embedded within a structural composite

Science.gov (United States)

Pa, Peter; McCauley, Raymond; Larimore, Zachary; Mills, Matthew; Yarlaggada, Shridhar; Mirotznik, Mark S.

2015-06-01

Woven fabric composites provide an attractive platform for integrating electromagnetic functionality—such as conformal load-bearing antennas and frequency selective surfaces—into a structural platform. One practical fabrication method for integrating conductive elements within a woven fabric composite system involves using additive manufacturing systems such as screen printing. While screen printing is an inherently scalable, flexible and cost effective method, little is known about the high frequency electrical properties of its conductive inks when they are embedded within the woven fabric composite. Thus, we have completed numerical and experimental studies to determine the electrical conductivity of screen printable conductive inks that are embedded within this composite. We have also performed mechanical studies to evaluate how printing affects the structural performance of the composite.
Embedded Thermal Control for Spacecraft Subsystems Miniaturization

Science.gov (United States)

Didion, Jeffrey R.

2014-01-01

Optimization of spacecraft size, weight and power (SWaP) resources is an explicit technical priority at Goddard Space Flight Center. Embedded Thermal Control Subsystems are a promising technology with many cross-cutting NASA, DoD and commercial applications: (1) CubeSat/SmallSat spacecraft architecture, (2) high performance computing, (3) on-board spacecraft electronics, and (4) power electronics and RF arrays. The Embedded Thermal Control Subsystem technology development efforts focus on component, board and enclosure level devices that will ultimately include intelligent capabilities. The presentation will discuss electric, capillary and hybrid based hardware research and development efforts at Goddard Space Flight Center. The Embedded Thermal Control Subsystem development program consists of interrelated sub-initiatives, e.g., chip component level thermal control devices, self-sensing thermal management, and advanced manufactured structures. This presentation includes technical status and progress on each of these investigations. Future sub-initiatives, technical milestones and program goals will be presented.
RAPPORT: running scientific high-performance computing applications on the cloud.

Science.gov (United States)

Cohen, Jeremy; Filippis, Ioannis; Woodbridge, Mark; Bauer, Daniela; Hong, Neil Chue; Jackson, Mike; Butcher, Sarah; Colling, David; Darlington, John; Fuchs, Brian; Harvey, Matt

2013-01-28

Cloud computing infrastructure is now widely used in many domains, but one area where there has been more limited adoption is research computing, in particular for running scientific high-performance computing (HPC) software. The Robust Application Porting for HPC in the Cloud (RAPPORT) project took advantage of existing links between computing researchers and application scientists in the fields of bioinformatics, high-energy physics (HEP) and digital humanities, to investigate running a set of scientific HPC applications from these domains on cloud infrastructure. In this paper, we focus on the bioinformatics and HEP domains, describing the applications and target cloud platforms. We conclude that, while there are many factors that need consideration, there is no fundamental impediment to the use of cloud infrastructure for running many types of HPC applications and, in some cases, there is potential for researchers to benefit significantly from the flexibility offered by cloud platforms.
FY 1995 Blue Book: High Performance Computing and Communications: Technology for the National Information Infrastructure

Data.gov (United States)

Networking and Information Technology Research and Development, Executive Office of the President — The Federal High Performance Computing and Communications HPCC Program was created to accelerate the development of future generations of high performance computers...
User interfaces for computational science: A domain specific language for OOMMF embedded in Python

Science.gov (United States)

Beg, Marijan; Pepper, Ryan A.; Fangohr, Hans

2017-05-01

Computer simulations are used widely across the engineering and science disciplines, including in the research and development of magnetic devices using computational micromagnetics. In this work, we identify and review different approaches to configuring simulation runs: (i) the re-compilation of source code, (ii) the use of configuration files, (iii) the graphical user interface, and (iv) embedding the simulation specification in an existing programming language to express the computational problem. We identify the advantages and disadvantages of different approaches and discuss their implications on effectiveness and reproducibility of computational studies and results. Following on from this, we design and describe a domain specific language for micromagnetics that is embedded in the Python language, and allows users to define the micromagnetic simulations they want to carry out in a flexible way. We have implemented this micromagnetic simulation description language together with a computational backend that executes the simulation task using the Object Oriented MicroMagnetic Framework (OOMMF). We illustrate the use of this Python interface for OOMMF by solving the micromagnetic standard problem 4. All the code is publicly available and is open source.
Human Computer Music Performance

OpenAIRE

Dannenberg, Roger B.

2012-01-01

Human Computer Music Performance (HCMP) is the study of music performance by live human performers and real-time computer-based performers. One goal of HCMP is to create a highly autonomous artificial performer that can fill the role of a human, especially in a popular music setting. This will require advances in automated music listening and understanding, new representations for music, techniques for music synchronization, real-time human-computer communication, music generation, sound synt...
The high performance cluster computing system for BES offline data analysis

International Nuclear Information System (INIS)

Sun Yongzhao; Xu Dong; Zhang Shaoqiang; Yang Ting

2004-01-01

A high performance cluster computing system (EPCfarm), which is used for BES offline data analysis, is introduced. The setup and the characteristics of the hardware and software of EPCfarm are described. PBS, a queue management package, and the performance of EPCfarm are also presented. (authors)
HIGH-PERFORMANCE COMPUTING FOR THE STUDY OF EARTH AND ENVIRONMENTAL SCIENCE MATERIALS USING SYNCHROTRON X-RAY COMPUTED MICROTOMOGRAPHY

International Nuclear Information System (INIS)

FENG, H.; JONES, K.W.; MCGUIGAN, M.; SMITH, G.J.; SPILETIC, J.

2001-01-01

Synchrotron x-ray computed microtomography (CMT) is a non-destructive method for examination of rock, soil, and other types of samples studied in the earth and environmental sciences. The high x-ray intensities of the synchrotron source make possible the acquisition of tomographic volumes at a high rate that requires the application of high-performance computing techniques for data reconstruction to produce the three-dimensional volumes, for their visualization, and for data analysis. These problems are exacerbated by the need to share information between collaborators at widely separated locations over both local and wide-area networks. A summary of the CMT technique and examples of applications are given here together with a discussion of the applications of high-performance computing methods to improve the experimental techniques and analysis of the data.
HIGH-PERFORMANCE COMPUTING FOR THE STUDY OF EARTH AND ENVIRONMENTAL SCIENCE MATERIALS USING SYNCHROTRON X-RAY COMPUTED MICROTOMOGRAPHY.

Energy Technology Data Exchange (ETDEWEB)

FENG,H.; JONES,K.W.; MCGUIGAN,M.; SMITH,G.J.; SPILETIC,J.

2001-10-12

Synchrotron x-ray computed microtomography (CMT) is a non-destructive method for examination of rock, soil, and other types of samples studied in the earth and environmental sciences. The high x-ray intensities of the synchrotron source make possible the acquisition of tomographic volumes at a high rate that requires the application of high-performance computing techniques for data reconstruction to produce the three-dimensional volumes, for their visualization, and for data analysis. These problems are exacerbated by the need to share information between collaborators at widely separated locations over both local and wide-area networks. A summary of the CMT technique and examples of applications are given here together with a discussion of the applications of high-performance computing methods to improve the experimental techniques and analysis of the data.
A Micro-Computed Tomography Technique to Study the Quality of Fibre Optics Embedded in Composite Materials

Directory of Open Access Journals (Sweden)

Gabriele Chiesura

2015-05-01

Quality of embedment of optical fibre sensors in carbon fibre-reinforced polymers plays an important role in the resultant properties of the composite, as well as for the correct monitoring of the structure. Therefore, availability of a tool able to check the optical fibre sensor-composite interaction becomes essential. High-resolution 3D X-ray Micro-Computed Tomography, or Micro-CT, is a relatively new non-destructive inspection technique which enables investigations of the internal structure of a sample without actually compromising its integrity. In this work the feasibility of inspecting the position, the orientation and, more generally, the quality of the embedment of an optical fibre sensor in a carbon fibre reinforced laminate at unit cell level has been proven.
Thinking processes used by high-performing students in a computer programming task

Directory of Open Access Journals (Sweden)

Marietjie Havenga

2011-07-01

Computer programmers must be able to understand programming source code and write programs that execute complex tasks to solve real-world problems. This article is a transdisciplinary study at the intersection of computer programming, education and psychology. It outlines the role of mental processes in the process of programming and indicates how successful thinking processes can support computer science students in writing correct and well-defined programs. A mixed methods approach was used to better understand the thinking activities and programming processes of participating students. Data collection involved both computer programs and students’ reflective thinking processes recorded in their journals. This enabled analysis of psychological dimensions of participants’ thinking processes and their problem-solving activities as they considered a programming problem. Findings indicate that the cognitive, reflective and psychological processes used by high-performing programmers contributed to their success in solving a complex programming problem. Based on the thinking processes of high performers, we propose a model of integrated thinking processes, which can support computer programming students. Keywords: Computer programming, education, mixed methods research, thinking processes. Disciplines: Computer programming, education, psychology
A high-performance Riccati based solver for tree-structured quadratic programs

DEFF Research Database (Denmark)

Frison, Gianluca; Kouzoupis, Dimitris; Diehl, Moritz

2017-01-01

the online solution of such problems challenging and the development of tailored solvers crucial. In this paper, an interior point method is presented that can solve Quadratic Programs (QPs) arising in multi-stage MPC efficiently by means of a tree-structured Riccati recursion and a high-performance linear...... algebra library. A performance comparison with code-generated and general purpose sparse QP solvers shows that the computation times can be significantly reduced for all problem sizes that are practically relevant in embedded MPC applications. The presented implementation is freely available as part...
Spatial Processing of Urban Acoustic Wave Fields from High-Performance Computations

National Research Council Canada - National Science Library

Ketcham, Stephen A; Wilson, D. K; Cudney, Harley H; Parker, Michael W

2007-01-01

.... The objective of this work is to develop spatial processing techniques for acoustic wave propagation data from three-dimensional high-performance computations to quantify scattering due to urban...

Smart device definition and application on embedded system: performance and optimization on a RGBD sensor

Directory of Open Access Journals (Sweden)

Jose-Luis JIMÉNEZ-GARCÍA

2014-10-01

Embedded control systems are usually characterized by their limitations in computational power and memory. Nevertheless, these systems must handle the adaptation of perception and actuation signals and calculate control actions while ensuring reliability and providing a certain degree of fault tolerance. Allocating these tasks among different embedded nodes that form a distributed control system solves many of these issues. For that reason, we propose the use of smart devices, which perform the data processing tasks related to perception and actuation and offer a simple interface through which other nodes can configure them in order to share processed information and raise QoS-based alarms. This work introduces the procedure for implementing a smart device, acting as a sensor, as an embedded node in a distributed control system. To analyze its benefits, an application based on an RGBD sensor implemented as a smart device is proposed.
DEISA2: supporting and developing a European high-performance computing ecosystem

International Nuclear Information System (INIS)

Lederer, H

2008-01-01

The DEISA Consortium has deployed and operated the Distributed European Infrastructure for Supercomputing Applications. Through the EU FP7 DEISA2 project (funded for three years as of May 2008), the consortium is continuing to support and enhance the distributed high-performance computing infrastructure and its activities and services relevant for applications enabling, operation, and technologies, as these are indispensable for the effective support of computational sciences for high-performance computing (HPC). The service-provisioning model will be extended from one that supports single projects to one supporting virtual European communities. Collaborative activities will also be carried out with new European and other international initiatives. Of strategic importance is cooperation with the PRACE project, which is preparing for the installation of a limited number of leadership-class Tier-0 supercomputers in Europe. The key role and aim of DEISA will be to deliver a turnkey operational solution for a persistent European HPC ecosystem that will integrate national Tier-1 centers and the new Tier-0 centers
OPERATIONAL PERFORMANCES DEMONSTRATION OF POLYMER-CERAMIC EMBEDDED CAPACITORS FOR MMIC APPLICATIONS

OpenAIRE

Bord-Majek, Isabelle; Kertesz, Philippe; Mazeau, Julie; Caban-Chastas, Daniel; Levrier, Bruno; Bechou, Laurent; Ousten, Yves

2011-01-01

Embedded passives are becoming increasingly important for the manufacture of highly integrated electronic boards and packages. The need for embedded passives emerges from the growing consumer demand for product miniaturization, which requires smaller components and space-efficient packaging. This can be realized by replacing discrete components, which demand a higher volume than embedded passives. Embedded passives have already been investigated in the last few years. Ho...
Embedded Processor Laboratory

Data.gov (United States)

Federal Laboratory Consortium — The Embedded Processor Laboratory provides the means to design, develop, fabricate, and test embedded computers for missile guidance electronics systems in support...
High performance simulation for the Silva project using the tera computer

International Nuclear Information System (INIS)

Bergeaud, V.; La Hargue, J.P.; Mougery, F.; Boulet, M.; Scheurer, B.; Le Fur, J.F.; Comte, M.; Benisti, D.; Lamare, J. de; Petit, A.

2003-01-01

In the context of the SILVA Project (Atomic Vapor Laser Isotope Separation), numerical simulation of the plant scale propagation of laser beams through uranium vapour was a great challenge. The PRODIGE code has been developed to achieve this goal. Here we focus on the task of achieving high performance simulation on the TERA computer. We describe the main issues for optimizing the parallelization of the PRODIGE code on TERA. Thus, we discuss advantages and drawbacks of the implemented diagonal parallelization scheme. As a consequence, it proved fruitful to tune the code in three respects: memory allocation, MPI communications and interconnection network bandwidth usage. We stress the value of MPI/IO in this context and the benefit obtained for production computations on TERA. Finally, we illustrate our developments and report some performance measurements reflecting the good parallelization properties of PRODIGE on the TERA computer. The code is currently used for demonstrating the feasibility of the laser propagation at a plant enrichment level and for preparing the 2003 Menphis experiment. We conclude by emphasizing the contribution of high performance TERA simulation to the project. (authors)
High performance simulation for the Silva project using the tera computer

Energy Technology Data Exchange (ETDEWEB)

Bergeaud, V.; La Hargue, J.P.; Mougery, F. [CS Communication and Systemes, 92 - Clamart (France); Boulet, M.; Scheurer, B. [CEA Bruyeres-le-Chatel, 91 - Bruyeres-le-Chatel (France); Le Fur, J.F.; Comte, M.; Benisti, D.; Lamare, J. de; Petit, A. [CEA Saclay, 91 - Gif sur Yvette (France)

2003-07-01

In the context of the SILVA Project (Atomic Vapor Laser Isotope Separation), numerical simulation of the plant scale propagation of laser beams through uranium vapour was a great challenge. The PRODIGE code has been developed to achieve this goal. Here we focus on the task of achieving high performance simulation on the TERA computer. We describe the main issues for optimizing the parallelization of the PRODIGE code on TERA. Thus, we discuss advantages and drawbacks of the implemented diagonal parallelization scheme. As a consequence, it proved fruitful to tune the code in three respects: memory allocation, MPI communications and interconnection network bandwidth usage. We stress the value of MPI/IO in this context and the benefit obtained for production computations on TERA. Finally, we illustrate our developments and report some performance measurements reflecting the good parallelization properties of PRODIGE on the TERA computer. The code is currently used for demonstrating the feasibility of the laser propagation at a plant enrichment level and for preparing the 2003 Menphis experiment. We conclude by emphasizing the contribution of high performance TERA simulation to the project. (authors)
Scalability of DL_POLY on High Performance Computing Platform

Directory of Open Access Journals (Sweden)

Mabule Samuel Mabakane

2017-12-01

This paper presents a case study on the scalability of several versions of the molecular dynamics code (DL_POLY) performed on the e1350 IBM Linux cluster, Sun system and Lengau supercomputers of South Africa's Centre for High Performance Computing. Within this study, different problem sizes were designed and the same chosen systems were employed in order to test the performance of DL_POLY using weak and strong scalability. It was found that the speed-up results for the small systems were better than those for large systems on both Ethernet and InfiniBand networks. However, simulations of large systems in DL_POLY performed well using the InfiniBand network on the Lengau cluster compared with the e1350 and Sun supercomputers.
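The two scalability measures named in this abstract can be illustrated with a short calculation. The sketch below uses the standard textbook definitions (strong scaling: fixed total problem size; weak scaling: fixed problem size per process); the timings are hypothetical placeholders and are not taken from the DL_POLY study.

```python
def strong_scaling(t1: float, tp: float, p: int) -> tuple[float, float]:
    """Speed-up and parallel efficiency for a fixed total problem size.

    t1: wall time on one process, tp: wall time on p processes.
    """
    speedup = t1 / tp
    efficiency = speedup / p
    return speedup, efficiency


def weak_scaling_efficiency(t1: float, tp: float) -> float:
    """Efficiency when the problem size grows in proportion to p.

    Ideal weak scaling keeps the wall time constant, so the ratio is 1.0.
    """
    return t1 / tp


# Hypothetical wall times in seconds (illustrative only).
speedup, eff = strong_scaling(t1=100.0, tp=25.0, p=8)
print(f"strong scaling: speed-up {speedup:.1f}, efficiency {eff:.0%}")
print(f"weak scaling efficiency: {weak_scaling_efficiency(100.0, 125.0):.0%}")
```

A strong-scaling efficiency well below 1.0 on a slower interconnect is the usual sign that communication cost, rather than computation, dominates the run time.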
Embedded control system for high power RF amplifiers

International Nuclear Information System (INIS)

Sharma, Deepak Kumar; Gupta, Alok Kumar; Jain, Akhilesh; Hannurkar, P.R.

2011-01-01

RF power devices are usually very sensitive to overheating and reflected RF power; hence a protective interlock system must be embedded in high power solid state RF amplifiers. Solid state RF amplifiers have the salient features of graceful degradation and very low mean time to repair (MTTR). In order to exploit these features to minimize system downtime, a real-time control system is embedded in high power RF amplifiers. The control system provides monitoring, measurement and network publishing of various parameters, historical data logging, alarm generation, display of data to the operator and tripping of the system in case of any interlock failure. This paper discusses the design philosophy, features, functions and implementation details of the embedded control system. (author)
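The protective behaviour described here, monitoring parameters, raising alarms and tripping on an interlock failure, can be sketched as a latching interlock. This is a generic illustration and not the authors' implementation: the class name, the two monitored quantities and the limit values are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Interlock:
    """Latching interlock: once tripped, RF stays disabled until reset."""
    max_temp_c: float          # hypothetical over-temperature limit
    max_reflected_w: float     # hypothetical reflected-power limit
    tripped: bool = False
    alarms: list = field(default_factory=list)

    def update(self, temp_c: float, reflected_w: float) -> bool:
        """Check one set of readings; return True if RF may stay enabled."""
        if temp_c > self.max_temp_c:
            self.alarms.append(f"over-temperature: {temp_c} C")
            self.tripped = True
        if reflected_w > self.max_reflected_w:
            self.alarms.append(f"reflected power: {reflected_w} W")
            self.tripped = True
        return not self.tripped

    def reset(self) -> None:
        """Operator acknowledgement clears the latch but keeps the log."""
        self.tripped = False


interlock = Interlock(max_temp_c=70.0, max_reflected_w=50.0)
print(interlock.update(40.0, 10.0))   # normal readings
print(interlock.update(80.0, 10.0))   # over-temperature trips the latch
print(interlock.update(40.0, 10.0))   # still disabled until reset
```

Latching the trip state means a fault that clears by itself cannot silently re-enable the amplifier; an explicit reset is required.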
User interfaces for computational science: A domain specific language for OOMMF embedded in Python

Directory of Open Access Journals (Sweden)

Marijan Beg

2017-05-01

Computer simulations are used widely across the engineering and science disciplines, including in the research and development of magnetic devices using computational micromagnetics. In this work, we identify and review different approaches to configuring simulation runs: (i) the re-compilation of source code, (ii) the use of configuration files, (iii) the graphical user interface, and (iv) embedding the simulation specification in an existing programming language to express the computational problem. We identify the advantages and disadvantages of different approaches and discuss their implications on effectiveness and reproducibility of computational studies and results. Following on from this, we design and describe a domain specific language for micromagnetics that is embedded in the Python language, and allows users to define the micromagnetic simulations they want to carry out in a flexible way. We have implemented this micromagnetic simulation description language together with a computational backend that executes the simulation task using the Object Oriented MicroMagnetic Framework (OOMMF). We illustrate the use of this Python interface for OOMMF by solving the micromagnetic standard problem 4. All the code is publicly available and is open source.
Can We Build a Truly High Performance Computer Which is Flexible and Transparent?

KAUST Repository

Rojas, Jhonathan Prieto

2013-09-10

State-of-the-art computers need high performance transistors, which consume ultra-low power resulting in longer battery lifetime. Billions of transistors are integrated neatly using a matured silicon fabrication process to maintain the performance per cost advantage. In that context, low-cost mono-crystalline bulk silicon (100) based high performance transistors are considered the heart of today's computers. One limitation is silicon's rigidity and brittleness. Here we show a generic batch process to convert high performance silicon electronics into flexible and semi-transparent ones while retaining their performance, process compatibility, integration density and cost. We demonstrate high-k/metal gate stack based p-type metal oxide semiconductor field effect transistors on 4 inch silicon fabric released from bulk silicon (100) wafers with a sub-threshold swing of 80 mV/dec and an on/off ratio of nearly 10^4 within 10% device uniformity, with a minimum bending radius of 5 mm and an average transmittance of approximately 7% in the visible spectrum.
High performance SONOS flash memory with in-situ silicon nanocrystals embedded in silicon nitride charge trapping layer

Science.gov (United States)

Lim, Jae-Gab; Yang, Seung-Dong; Yun, Ho-Jin; Jung, Jun-Kyo; Park, Jung-Hyun; Lim, Chan; Cho, Gyu-seok; Park, Seong-gye; Huh, Chul; Lee, Hi-Deok; Lee, Ga-Won

2018-02-01

In this paper, a SONOS-type flash memory device with highly improved charge-trapping efficiency is proposed, using silicon nanocrystals (Si-NCs) embedded in a silicon nitride (SiNx) charge trapping layer. The Si-NCs were grown in situ by PECVD without an additional post-annealing process. The fabricated device shows high program/erase speed and retention properties that are suitable for multi-level cell (MLC) application. Excellent performance and reliability for MLC are demonstrated with a large memory window of ∼8.5 V and superior retention characteristics of 7% charge loss for 10 years. A high resolution transmission electron microscopy image confirms the Si-NC formation, and the size is around 1-2 nm, which is verified again by X-ray photoelectron spectroscopy (XPS), where pure Si bonds increase. In addition, XPS analysis implies that more nitrogen atoms make stable bonds at the regular lattice point. Photoluminescence spectra also illustrate that Si-NC formation in SiNx is an effective method to form deep trap states.
Analysis and modeling of social influence in high performance computing workloads

KAUST Repository

Zheng, Shuai; Shae, Zon Yin; Zhang, Xiangliang; Jamjoom, Hani T.; Fong, Liana

2011-01-01

Social influence among users (e.g., collaboration on a project) creates bursty behavior in the underlying high performance computing (HPC) workloads. Using representative HPC and cluster workload logs, this paper identifies, analyzes, and quantifies
Topic 14+16: High-performance and scientific applications and extreme-scale computing (Introduction)

KAUST Repository

Downes, Turlough P.

2013-01-01

As our understanding of the world around us increases it becomes more challenging to make use of what we already know, and to increase our understanding still further. Computational modeling and simulation have become critical tools in addressing this challenge. The requirements of high-resolution, accurate modeling have outstripped the ability of desktop computers and even small clusters to provide the necessary compute power. Many applications in the scientific and engineering domains now need very large amounts of compute time, while other applications, particularly in the life sciences, frequently have large data I/O requirements. There is thus a growing need for a range of high performance applications which can utilize parallel compute systems effectively, which have efficient data handling strategies and which have the capacity to utilise current and future systems. The High Performance and Scientific Applications topic aims to highlight recent progress in the use of advanced computing and algorithms to address the varied, complex and increasing challenges of modern research throughout both the "hard" and "soft" sciences. This necessitates being able to use large numbers of compute nodes, many of which are equipped with accelerators, and to deal with difficult I/O requirements. © 2013 Springer-Verlag.
Bringing high-performance computing to the biologist's workbench: approaches, applications, and challenges

International Nuclear Information System (INIS)

Oehmen, C S; Cannon, W R

2008-01-01

Data-intensive and high-performance computing are poised to significantly impact the future of biological research, which is increasingly driven by the prevalence of high-throughput experimental methodologies for genome sequencing, transcriptomics, proteomics, and other areas. Large centers (such as NIH's National Center for Biotechnology Information, The Institute for Genomic Research, and the DOE's Joint Genome Institute) have made extensive use of multiprocessor architectures to deal with some of the challenges of processing, storing and curating exponentially growing genomic and proteomic datasets, thus enabling users to rapidly access a growing public data source, as well as use analysis tools transparently on high-performance computing resources. Applying this computational power to single-investigator analysis, however, often relies on users to provide their own computational resources, forcing them to endure the learning curve of porting, building, and running software on multiprocessor architectures. Solving the next generation of large-scale biology challenges using multiprocessor machines, from small clusters to emerging petascale machines, can most practically be realized if this learning curve can be minimized through a combination of workflow management, data management and resource allocation as well as intuitive interfaces and compatibility with existing common data formats.
High-Performance Compute Infrastructure in Astronomy: 2020 Is Only Months Away

Science.gov (United States)

Berriman, B.; Deelman, E.; Juve, G.; Rynge, M.; Vöckler, J. S.

2012-09-01

By 2020, astronomy will be awash with as much as 60 PB of public data. Full scientific exploitation of such massive volumes of data will require high-performance computing on server farms co-located with the data. Development of this computing model will be a community-wide enterprise that has profound cultural and technical implications. Astronomers must be prepared to develop environment-agnostic applications that support parallel processing. The community must investigate the applicability and cost-benefit of emerging technologies such as cloud computing to astronomy, and must engage the Computer Science community to develop science-driven cyberinfrastructure such as workflow schedulers and optimizers. We report here the results of collaborations between a science center, IPAC, and a Computer Science research institute, ISI. These collaborations may be considered pathfinders in developing a high-performance compute infrastructure in astronomy. These collaborations investigated two exemplar large-scale science-driver workflow applications: 1) Calculation of an infrared atlas of the Galactic Plane at 18 different wavelengths by placing data from multiple surveys on a common plate scale and co-registering all the pixels; 2) Calculation of an atlas of periodicities present in the public Kepler data sets, which currently contain 380,000 light curves. These products have been generated with two workflow applications, written in C for performance and designed to support parallel processing on multiple environments and platforms, but with different compute resource needs: the Montage image mosaic engine is I/O-bound, and the NASA Star and Exoplanet Database periodogram code is CPU-bound. Our presentation will report cost and performance metrics and lessons-learned for continuing development. Applicability of Cloud Computing: Commercial Cloud providers generally charge for all operations, including processing, transfer of input and output data, and for storage of data
A performance model for the communication in fast multipole methods on high-performance computing platforms

KAUST Repository

Ibeid, Huda; Yokota, Rio; Keyes, David E.

2016-01-01

model and the actual communication time on four high-performance computing (HPC) systems, when latency, bandwidth, network topology, and multicore penalties are all taken into account. To our knowledge, this is the first formal characterization
Ultra-fine CuO Nanoparticles Embedded in Three-dimensional Graphene Network Nano-structure for High-performance Flexible Supercapacitors

International Nuclear Information System (INIS)

Li, Yanrong; Wang, Xue; Yang, Qi; Javed, Muhammad Sufyan; Liu, Qipeng; Xu, Weina; Hu, Chenguo; Wei, Dapeng

2017-01-01

High conductivity, large specific surface area and excellent performance redox materials are urgently desired for improving electrochemical energy storage. However, with single redox material it is hard to achieve these properties. Herein, we develop ultra-fine CuO nanoparticles embedded in three-dimensional graphene network grown on carbon cloth (CuO/3DGN/CC) to construct a novel electrode material with advantages of high conductivity, large specific area and excellent redox activity for supercapacitor application. The CuO/3DGN/CC with different CuO mass ratios are utilized to fabricate supercapacitors and the optimized mass loading achieves the high areal capacitance of 2787 mF cm^-2 and specific capacitance of 1539.8 F g^-1 at current density of 6 mA cm^-2 with good stability. In addition, a high-flexible solid-state symmetric supercapacitor is also fabricated by using this CuO/3DGN/CC composite. The device shows excellent electrochemical performance even at various bending angles indicating a promising application for wearable electronic devices, and two devices with area 2 × 4 cm^2 in series can light nine light emitting diodes for more than 3 minutes.
Amorphous Silicon-Germanium Films with Embedded Nanocrystals for Thermal Detectors with Very High Sensitivity

Directory of Open Access Journals (Sweden)

Cesar Calleja

2016-01-01

We have optimized the deposition conditions of amorphous silicon-germanium films with embedded nanocrystals in a plasma enhanced chemical vapor deposition (PECVD) reactor, working at a standard frequency of 13.56 MHz. The objective was to produce films with very large Temperature Coefficient of Resistance (TCR), which is a signature of the sensitivity in thermal detectors (microbolometers). Morphological, electrical, and optical characterization were performed in the films, and we found optimal conditions for obtaining films with very high values of thermal coefficient of resistance (TCR = 7.9% K^-1). Our results show that amorphous silicon-germanium films with embedded nanocrystals can be used as thermosensitive films in high performance infrared focal plane arrays (IRFPAs) used in commercial thermal cameras.
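For readers unfamiliar with the figure of merit quoted in the abstract above, the TCR is conventionally defined as the fractional change in resistance per kelvin. The block below is a brief sketch using that standard definition; the symbols R, T and alpha are notation introduced here, not taken from the paper, and the abstract does not state the sign of the coefficient, so the worked number uses its magnitude only.

```latex
% Standard definition of the temperature coefficient of resistance
\alpha \;=\; \frac{1}{R}\,\frac{\mathrm{d}R}{\mathrm{d}T}

% For a small temperature change \Delta T the fractional resistance change is
\frac{\Delta R}{R} \;\approx\; \alpha\,\Delta T

% Worked example with the reported magnitude |\alpha| = 7.9\%\,\mathrm{K}^{-1}
% and an assumed (illustrative) \Delta T = 0.1\,\mathrm{K}:
\left|\frac{\Delta R}{R}\right| \;\approx\; 0.079\,\mathrm{K}^{-1} \times 0.1\,\mathrm{K} \;=\; 0.79\%
```

A larger |alpha| means a larger resistance change for the same temperature rise of the sensing film, which is why the abstract treats TCR as a proxy for microbolometer sensitivity.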
Securing Embedded Smart Cameras with Trusted Computing

Directory of Open Access Journals (Sweden)

Winkler Thomas

2011-01-01

Camera systems are used in many applications including video surveillance for crime prevention and investigation, traffic monitoring on highways or building monitoring and automation. With the shift from analog towards digital systems, the capabilities of cameras are constantly increasing. Today's smart camera systems come with considerable computing power, large memory, and wired or wireless communication interfaces. With onboard image processing and analysis capabilities, cameras not only open new possibilities but also raise new challenges. Often overlooked are potential security issues of the camera system. The increasing amount of software running on the cameras turns them into attractive targets for attackers. Therefore, the protection of camera devices and delivered data is of critical importance. In this work we present an embedded camera prototype that uses Trusted Computing to provide security guarantees for streamed videos. With a hardware-based security solution, we ensure integrity, authenticity, and confidentiality of videos. Furthermore, we incorporate image timestamping, detection of platform reboots, and reporting of the system status. This work is not limited to theoretical considerations but also describes the implementation of a prototype system. Extensive evaluation results illustrate the practical feasibility of the approach.
Overview of Parallel Platforms for Common High Performance Computing

Directory of Open Access Journals (Sweden)

T. Fryza

2012-04-01

The paper deals with various parallel platforms used for high performance computing in the signal processing domain. More precisely, methods that exploit multicore central processing units, such as the message passing interface and OpenMP, are taken into account. The properties of the programming methods are experimentally verified in the application of a fast Fourier transform and a discrete cosine transform, and they are compared with the capabilities of MATLAB's built-in functions and Texas Instruments digital signal processors with very long instruction word architectures. New FFT and DCT implementations were proposed and tested. The implementation phase was compared with CPU-based computing methods and with the capabilities of the Texas Instruments digital signal processing library on C6747 floating-point DSPs. The optimal combination of computing methods in the signal processing domain and the implementation of new, fast routines is proposed as well.

Graphene-Embedded Co3O4 Rose-Spheres for Enhanced Performance in Lithium Ion Batteries.

Science.gov (United States)

Jing, Mingjun; Zhou, Minjie; Li, Gangyong; Chen, Zhengu; Xu, Wenyuan; Chen, Xiaobo; Hou, Zhaohui

2017-03-22

Co3O4 has been widely studied as a promising candidate as an anode material for lithium ion batteries. However, the huge volume change and structural strain associated with the Li+ insertion and extraction process leads to the pulverization and deterioration of the electrode, resulting in a poor performance in lithium ion batteries. In this paper, Co3O4 rose-spheres obtained via hydrothermal technique are successfully embedded in graphene through an electrostatic self-assembly process. Graphene-embedded Co3O4 rose-spheres (G-Co3O4) show a high reversible capacity, a good cyclic performance, and an excellent rate capability, e.g., a stable capacity of 1110.8 mAh g^-1 at 90 mA g^-1 (0.1 C), and a reversible capacity of 462.3 mAh g^-1 at 1800 mA g^-1 (2 C), benefitted from the novel architecture of graphene-embedded Co3O4 rose-spheres. This work has demonstrated a feasible strategy to improve the performance of Co3O4 for lithium-ion battery application.
10th International Workshop on Parallel Tools for High Performance Computing

CERN Document Server

Gracia, José; Hilbrich, Tobias; Knüpfer, Andreas; Resch, Michael; Nagel, Wolfgang

2017-01-01

This book presents the proceedings of the 10th International Parallel Tools Workshop, held October 4-5, 2016 in Stuttgart, Germany – a forum to discuss the latest advances in parallel tools. High-performance computing plays an increasingly important role for numerical simulation and modelling in academic and industrial research. At the same time, using large-scale parallel systems efficiently is becoming more difficult. A number of tools addressing parallel program development and analysis have emerged from the high-performance computing community over the last decade, and what may have started as a collection of small helper scripts has now matured into production-grade frameworks. Powerful user interfaces and an extensive body of documentation allow easy usage by non-specialists.
Embedded Sensors and Controls to Improve Component Performance and Reliability -- Loop-scale Testbed Design Report

Energy Technology Data Exchange (ETDEWEB)

Melin, Alexander M. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Kisner, Roger A. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

2016-09-01

Embedded instrumentation and control systems that can operate in extreme environments are challenging to design and operate. Extreme environments limit the options for sensors and actuators and degrade their performance. Because sensors and actuators are necessary for feedback control, these limitations mean that designing embedded instrumentation and control systems for the challenging environments of nuclear reactors requires advanced technical solutions that are not available commercially. This report details the development of a testbed that will be used for cross-cutting embedded instrumentation and control research for nuclear power applications. This research is funded by the Department of Energy's Nuclear Energy Enabling Technology program's Advanced Sensors and Instrumentation topic. The design goal of the loop-scale testbed is to build a low-temperature pump that uses magnetic bearings and that will be incorporated into a water loop to test control system performance and self-sensing techniques. Specifically, this testbed will be used to analyze control system performance in response to nonlinear and cross-coupling fluid effects between the shaft axes of motion, rotordynamics and gyroscopic effects, and impeller disturbances. This testbed will also be used to characterize the performance losses when using self-sensing position measurement techniques. Active magnetic bearings are a technology that can reduce failures and maintenance costs in nuclear power plants. They are particularly relevant to liquid salt reactors that operate at high temperatures (700 °C). Pumps used in the extreme environment of liquid salt reactors pose many engineering challenges that can be overcome with magnetic bearings and their associated embedded instrumentation and control. This report will give details of the mechanical design and electromagnetic design of the loop-scale embedded instrumentation and control testbed.
HIGH PERFORMANCE PHOTOGRAMMETRIC PROCESSING ON COMPUTER CLUSTERS

Directory of Open Access Journals (Sweden)

V. N. Adrov

2012-07-01

Most CPU-intensive tasks in photogrammetric processing can be done in parallel. The algorithms take independent bits as input and produce independent bits as output. The independence of bits comes from the nature of such algorithms, since images, stereopairs or small image block parts can be processed independently. Many photogrammetric algorithms are fully automatic and do not require human intervention. Photogrammetric workstations can perform tie point measurements, DTM calculations, orthophoto construction, mosaicing and many other service operations in parallel using distributed calculations. Distributed calculations save time, reducing calculations that take several days to several hours. Modern trends in computer technology show an increase in CPU cores in workstations and a speed increase in local networks, and as a result a drop in the price of supercomputers or computer clusters that can contain hundreds or even thousands of computing nodes. Common distributed processing in DPW is usually targeted at interactive work with a limited number of CPU cores and is not optimized for centralized administration. The bottleneck of common distributed computing in photogrammetry can be the limited LAN throughput and storage performance, since huge amounts of large raster images must be processed.
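The independence property described in the abstract above (image blocks in, image blocks out, no shared state) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `split_into_tiles`, `process_tile` and `process_image_in_parallel` are hypothetical names, the per-tile operation (inverting 8-bit pixel values) is a stand-in for a real photogrammetric step, and a thread pool on one machine stands in for the cluster nodes the paper discusses.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_tiles(image, tile_rows):
    """Cut a row-major image (list of rows) into independent horizontal strips."""
    return [image[i:i + tile_rows] for i in range(0, len(image), tile_rows)]


def process_tile(tile):
    """Hypothetical per-tile work: invert 8-bit pixel values.

    The function reads only its own tile and returns a new one, so tiles
    can be processed in any order and on any worker.
    """
    return [[255 - pixel for pixel in row] for row in tile]


def process_image_in_parallel(image, tile_rows=2, workers=4):
    """Process every tile on a worker pool and stitch the results back together."""
    tiles = split_into_tiles(image, tile_rows)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, so reassembly is trivial.
        processed = list(pool.map(process_tile, tiles))
    return [row for tile in processed for row in tile]


image = [[0, 10], [20, 30], [40, 50]]
result = process_image_in_parallel(image, tile_rows=2, workers=2)
print(result)
```

In CPython, threads do not speed up pure-Python arithmetic; a process pool or a cluster scheduler would take the pool's place for CPU-bound work. The decomposition into independent tiles is the part that carries over, and moving the tiles between nodes is where the LAN and storage bottleneck mentioned in the abstract would appear.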
Embedded engineering education

CERN Document Server

Kaštelan, Ivan; Temerinac, Miodrag; Barak, Moshe; Sruk, Vlado

2016-01-01

This book focuses on the outcome of the European research project “FP7-ICT-2011-8 / 317882: Embedded Engineering Learning Platform” E2LP. Additionally, some experiences and research from outside this project have been included. This book provides information about the achieved results of the E2LP project as well as some broader views about embedded engineering education. It captures project results and applications, methodologies, and evaluations. It covers the history of computer architectures, brings a touch of the future in education tools and provides a valuable resource for anyone interested in embedded engineering education concepts, experiences and material. The book contains 12 original contributions and will open a broader discussion about the necessary knowledge and appropriate learning methods for the new profile of embedded engineers. As a result, the proposed Embedded Computer Engineering Learning Platform will help to educate a sufficient number of future engineers in Europe, capable of d...
Embedded systems design for high-speed data acquisition and control

CERN Document Server

Di Paolo Emilio, Maurizio

2015-01-01

This book serves as a practical guide for practicing engineers who need to design embedded systems for high-speed data acquisition and control systems. A minimum amount of theory is presented, along with a review of analog and digital electronics, followed by detailed explanations of essential topics in hardware design and software development. The discussion of hardware focuses on microcontroller design (ARM microcontrollers and FPGAs), techniques of embedded design, high speed data acquisition (DAQ) and control systems. Coverage of software development includes main programming techniques, culminating in the study of real-time operating systems. All concepts are introduced in a manner to be highly-accessible to practicing engineers and lead to the practical implementation of an embedded board that can be used in various industrial fields as a control system and high speed data acquisition system. • Describes fundamentals of embedded systems design in an accessible manner; • Takes a problem-solving ...
Multicore Challenges and Benefits for High Performance Scientific Computing

Directory of Open Access Journals (Sweden)

Ida M.B. Nielsen

2008-01-01

Until recently, performance gains in processors were achieved largely by improvements in clock speeds and instruction level parallelism. Thus, applications could obtain performance increases with relatively minor changes by upgrading to the latest generation of computing hardware. Currently, however, processor performance improvements are realized by using multicore technology and hardware support for multiple threads within each core, and taking full advantage of this technology to improve the performance of applications requires exposure of extreme levels of software parallelism. We will here discuss the architecture of parallel computers constructed from many multicore chips as well as techniques for managing the complexity of programming such computers, including the hybrid message-passing/multi-threading programming model. We will illustrate these ideas with a hybrid distributed memory matrix multiply and a quantum chemistry algorithm for energy computation using Møller–Plesset perturbation theory.
High Performance Computing Facility Operational Assessment 2015: Oak Ridge Leadership Computing Facility

Energy Technology Data Exchange (ETDEWEB)

Barker, Ashley D. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Bernholdt, David E. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Bland, Arthur S. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Gary, Jeff D. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Hack, James J. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; McNally, Stephen T. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Rogers, James H. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Smith, Brian E. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Straatsma, T. P. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Sukumar, Sreenivas Rangan [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Thach, Kevin G. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Tichenor, Suzy [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Vazhkudai, Sudharshan S. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Wells, Jack C. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility

2016-03-01

Oak Ridge National Laboratory’s (ORNL’s) Leadership Computing Facility (OLCF) continues to surpass its operational target goals: supporting users; delivering fast, reliable systems; creating innovative solutions for high-performance computing (HPC) needs; and managing risks, safety, and security aspects associated with operating one of the most powerful computers in the world. The results can be seen in the cutting-edge science delivered by users and the praise from the research community. Calendar year (CY) 2015 was filled with outstanding operational results and accomplishments: a very high rating from users on overall satisfaction that ties the highest-ever mark set in CY 2014; the greatest number of core-hours delivered to research projects; the largest percentage of capability usage since the OLCF began tracking the metric in 2009; and success in delivering on the allocation of 60, 30, and 10% of core hours offered for the INCITE (Innovative and Novel Computational Impact on Theory and Experiment), ALCC (Advanced Scientific Computing Research Leadership Computing Challenge), and Director’s Discretionary programs, respectively. These accomplishments, coupled with the extremely high utilization rate, represent the fulfillment of the promise of Titan: maximum use by maximum-size simulations. The impact of all of these successes and more is reflected in the accomplishments of OLCF users, with publications this year in notable journals Nature, Nature Materials, Nature Chemistry, Nature Physics, Nature Climate Change, ACS Nano, Journal of the American Chemical Society, and Physical Review Letters, as well as many others. The achievements included in the 2015 OLCF Operational Assessment Report reflect first-ever or largest simulations in their communities; for example Titan enabled engineers in Los Angeles and the surrounding region to design and begin building improved critical infrastructure by enabling the highest-resolution Cybershake map for Southern
Energy-aware memory management for embedded multimedia systems a computer-aided design approach

CERN Document Server

Balasa, Florin

2011-01-01

Energy-Aware Memory Management for Embedded Multimedia Systems: A Computer-Aided Design Approach presents recent computer-aided design (CAD) ideas that address memory management tasks, particularly the optimization of energy consumption in the memory subsystem. It explains how to efficiently implement CAD solutions, including theoretical methods and novel algorithms. The book covers various energy-aware design techniques, including data-dependence analysis techniques, memory size estimation methods, extensions of mapping approaches, and memory banking approaches. It shows how these techniques
The readout performance evaluation of PowerPC

International Nuclear Information System (INIS)

Chu Yuanping; Zhang Hongyu; Zhao Jingwei; Ye Mei; Tao Ning; Zhu Kejun; Tang Suqiu; Guo Yanan

2003-01-01

PowerPC, as a powerful low-cost embedded computer, has been one of the most important research subjects in recent years in the BESIII data acquisition system project. Research on embedded systems and embedded computers has achieved many important results in the field of High Energy Physics, especially in data acquisition systems. One of the key points in designing an acquisition system using PowerPC is to evaluate the readout ability of PowerPC correctly. The paper introduces some tests of the PowerPC readout performance. (authors)
Circuit-Switched Memory Access in Photonic Interconnection Networks for High-Performance Embedded Computing

Science.gov (United States)

2010-07-22

Export Controls: Implementation of the 1998 Legislative Mandate for High Performance Computers

National Research Council Canada - National Science Library

1999-01-01

We found that most of the 938 proposed exports of high performance computers to civilian end users in countries of concern from February 3, 1998, when procedures implementing the 1998 authorization...
Amorphous Silicon-Germanium Films with Embedded Nanocrystals for Thermal Detectors with Very High Sensitivity

International Nuclear Information System (INIS)

Calleja, C.; Torres, A.; Rosales-Quintero, P.; Moreno, M.

2016-01-01

We have optimized the deposition conditions of amorphous silicon-germanium films with embedded nanocrystals in a plasma enhanced chemical vapor deposition (PECVD) reactor, working at a standard frequency of 13.56 MHz. The objective was to produce films with very large Temperature Coefficient of Resistance (TCR), which is a signature of the sensitivity in thermal detectors (microbolometers). Morphological, electrical, and optical characterization were performed in the films, and we found optimal conditions for obtaining films with very high values of thermal coefficient of resistance (TCR = 7.9% K^-1). Our results show that amorphous silicon-germanium films with embedded nanocrystals can be used as thermosensitive films in high performance infrared focal plane arrays (IRFPAs) used in commercial thermal cameras.
Computer science of the high performance; Informatica del alto rendimiento

Energy Technology Data Exchange (ETDEWEB)

Moraleda, A.

2008-07-01

High-performance computing is taking shape as a powerful accelerator of innovation, drastically reducing the waiting times for access to results and findings in a growing number of processes and activities as complex and important as medicine, genetics, pharmacology, environment, natural resources management or the simulation of complex processes in a wide variety of industries. (Author)
Integral Images: Efficient Algorithms for Their Computation and Storage in Resource-Constrained Embedded Vision Systems.

Science.gov (United States)

Ehsan, Shoaib; Clark, Adrian F; Naveed ur Rehman; McDonald-Maier, Klaus D

2015-07-10

The integral image, an intermediate image representation, has found extensive use in multi-scale local feature detection algorithms, such as Speeded-Up Robust Features (SURF), allowing fast computation of rectangular features at constant speed, independent of filter size. For resource-constrained real-time embedded vision systems, computation and storage of integral image presents several design challenges due to strict timing and hardware limitations. Although calculation of the integral image only consists of simple addition operations, the total number of operations is large owing to the generally large size of image data. Recursive equations allow substantial decrease in the number of operations but require calculation in a serial fashion. This paper presents two new hardware algorithms that are based on the decomposition of these recursive equations, allowing calculation of up to four integral image values in a row-parallel way without significantly increasing the number of operations. An efficient design strategy is also proposed for a parallel integral image computation unit to reduce the size of the required internal memory (nearly 35% for common HD video). Addressing the storage problem of integral image in embedded vision systems, the paper presents two algorithms which allow substantial decrease (at least 44.44%) in the memory requirements. Finally, the paper provides a case study that highlights the utility of the proposed architectures in embedded vision systems.
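As background for the abstract above, the following Python sketch shows the classic serial recursion for building an integral image and the constant-time rectangle sum it enables. It is the textbook formulation, not the row-parallel hardware decomposition the paper proposes; the function names `integral_image` and `box_sum` and the nested-list image layout are assumptions made for illustration.

```python
def integral_image(img):
    """Build ii where ii[y][x] is the sum of img over rows 0..y and columns 0..x.

    Uses the usual pair of recursions:
        s(x, y)  = s(x - 1, y) + i(x, y)      (running sum along the row)
        ii(x, y) = ii(x, y - 1) + s(x, y)     (add the value from the row above)
    Each pixel costs two additions, but the rows must be visited in order.
    """
    height = len(img)
    width = len(img[0]) if height else 0
    ii = [[0] * width for _ in range(height)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += img[y][x]
            above = ii[y - 1][x] if y > 0 else 0
            ii[y][x] = above + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of the inclusive rectangle [top..bottom] x [left..right].

    Needs at most four lookups regardless of the rectangle size, which is
    why filter cost is independent of filter size.
    """
    total = ii[bottom][right]
    if top > 0:
        total -= ii[top - 1][right]
    if left > 0:
        total -= ii[bottom][left - 1]
    if top > 0 and left > 0:
        total += ii[top - 1][left - 1]
    return total


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
ii = integral_image(image)
print(ii)                        # [[1, 3, 6], [5, 12, 21], [12, 27, 45]]
print(box_sum(ii, 1, 1, 2, 2))   # 28, i.e. 5 + 6 + 8 + 9
```

The serial dependency is visible in the inner loop: each value needs the running row sum and the value directly above it. The integral image also needs wider words than the input pixels because the sums grow with image size, which is the storage pressure the abstract refers to.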
Integral Images: Efficient Algorithms for Their Computation and Storage in Resource-Constrained Embedded Vision Systems

Directory of Open Access Journals (Sweden)

Shoaib Ehsan

2015-07-01

The integral image, an intermediate image representation, has found extensive use in multi-scale local feature detection algorithms, such as Speeded-Up Robust Features (SURF), allowing fast computation of rectangular features at constant speed, independent of filter size. For resource-constrained real-time embedded vision systems, computation and storage of integral image presents several design challenges due to strict timing and hardware limitations. Although calculation of the integral image only consists of simple addition operations, the total number of operations is large owing to the generally large size of image data. Recursive equations allow substantial decrease in the number of operations but require calculation in a serial fashion. This paper presents two new hardware algorithms that are based on the decomposition of these recursive equations, allowing calculation of up to four integral image values in a row-parallel way without significantly increasing the number of operations. An efficient design strategy is also proposed for a parallel integral image computation unit to reduce the size of the required internal memory (nearly 35% for common HD video). Addressing the storage problem of integral image in embedded vision systems, the paper presents two algorithms which allow substantial decrease (at least 44.44%) in the memory requirements. Finally, the paper provides a case study that highlights the utility of the proposed architectures in embedded vision systems.
Challenges and Opportunities in Gen3 Embedded Cooling with High-Quality Microgap Flow

Science.gov (United States)

Bar-Cohen, Avram; Robinson, Franklin L.; Deisenroth, David C.

2018-01-01

Gen3, Embedded Cooling, promises to revolutionize thermal management of advanced microelectronic systems by eliminating the sequential conductive and interfacial thermal resistances which dominate the present 'remote cooling' paradigm. Single-phase interchip microfluidic flow with high thermal conductivity chips and substrates has been used successfully to cool single transistors dissipating more than 40kW/sq cm, but efficient heat removal from transistor arrays, larger chips, and chip stacks operating at these prodigious heat fluxes would require the use of high vapor fraction (quality), two-phase cooling in intra- and inter-chip microgap channels. The motivation, as well as the challenges and opportunities associated with evaporative embedded cooling in realistic form factors, is the focus of this paper. The paper will begin with a brief review of the history of thermal packaging, reflecting the 70-year 'inward migration' of cooling technology from the computer-room, to the rack, and then to the single chip and multichip module with 'remote' or attached air- and liquid-cooled coldplates. Discussion of the limitations of this approach and recent results from single-phase embedded cooling will follow. This will set the stage for discussion of the development challenges associated with application of this Gen3 thermal management paradigm to commercial semiconductor hardware, including dealing with the effects of channel length, orientation, and manifold-driven centrifugal acceleration on the governing behavior.
Small private key MQPKS on an embedded microprocessor.

Science.gov (United States)

Seo, Hwajeong; Kim, Jihyun; Choi, Jongseok; Park, Taehwan; Liu, Zhe; Kim, Howon

2014-03-19

Multivariate quadratic (MQ) cryptography requires the use of long public and private keys to ensure a sufficient security level, but this is not favorable to embedded systems, which have limited system resources. Recently, various approaches to MQ cryptography using reduced public keys have been studied. As a result, at CHES2011 (Cryptographic Hardware and Embedded Systems, 2011), a small public key MQ scheme was proposed, and its feasible implementation on an embedded microprocessor was reported at CHES2012. However, the implementation of a small private key MQ scheme was not reported. For efficient implementation, random number generators can help reduce the key size, but using a random number generator is much more costly than computing MQ on modern microprocessors. Therefore, no feasible results have been reported on embedded microprocessors. In this paper, we propose a feasible implementation on embedded microprocessors for a small private key MQ scheme using a pseudo-random number generator and hash function based on a block-cipher exploiting a hardware Advanced Encryption Standard (AES) accelerator. To speed up the performance, we apply various implementation methods, including parallel computation, on-the-fly computation, optimized logarithm representation, vinegar monomials and assembly programming. The proposed method reduces the private key size by about 99.9% and speeds up signature generation and verification by 5.78% and 12.19%, respectively, compared with the previous results from CHES2012.
Small Private Key MQPKS on an Embedded Microprocessor

Science.gov (United States)

Seo, Hwajeong; Kim, Jihyun; Choi, Jongseok; Park, Taehwan; Liu, Zhe; Kim, Howon

2014-01-01

Multivariate quadratic (MQ) cryptography requires the use of long public and private keys to ensure a sufficient security level, but this is not favorable to embedded systems, which have limited system resources. Recently, various approaches to MQ cryptography using reduced public keys have been studied. As a result, at CHES2011 (Cryptographic Hardware and Embedded Systems, 2011), a small public key MQ scheme was proposed, and its feasible implementation on an embedded microprocessor was reported at CHES2012. However, the implementation of a small private key MQ scheme was not reported. For efficient implementation, random number generators can help reduce the key size, but using a random number generator is much more costly than computing MQ on modern microprocessors. Therefore, no feasible results have been reported on embedded microprocessors. In this paper, we propose a feasible implementation on embedded microprocessors for a small private key MQ scheme using a pseudo-random number generator and hash function based on a block-cipher exploiting a hardware Advanced Encryption Standard (AES) accelerator. To speed up the performance, we apply various implementation methods, including parallel computation, on-the-fly computation, optimized logarithm representation, vinegar monomials and assembly programming. The proposed method reduces the private key size by about 99.9% and speeds up signature generation and verification by 5.78% and 12.19%, respectively, compared with the previous results from CHES2012. PMID:24651722
Embedding potentials for excited states of embedded species

International Nuclear Information System (INIS)

Wesolowski, Tomasz A.

2014-01-01

Frozen-Density-Embedding Theory (FDET) is a formalism to obtain the upper bound of the ground-state energy of the total system and the corresponding embedded wavefunction by means of Euler-Lagrange equations [T. A. Wesolowski, Phys. Rev. A 77(1), 012504 (2008)]. FDET provides the expression for the embedding potential as a functional of the electron density of the embedded species, the electron density of the environment, and the field generated by other charges in the environment. Under certain conditions, FDET leads to the exact ground-state energy and density of the whole system. Following the Perdew-Levy theorem on stationary states of the ground-state energy functional, the other-than-ground-state stationary states of the FDET energy functional correspond to excited states. In the present work, we analyze such use of other-than-ground-state embedded wavefunctions obtained in practical calculations, i.e., when the FDET embedding potential is approximated. Three computational approaches based on FDET are proposed and discussed; each assures a self-consistent excitation energy and embedded wavefunction, and each deals in a different manner with the issue of orthogonality of embedded wavefunctions for different states.

Embedded high-contrast distributed grating structures

Science.gov (United States)

Zubrzycki, Walter J.; Vawter, Gregory A.; Allerman, Andrew A.

2002-01-01

A new class of fabrication methods for embedded distributed grating structures is claimed, together with optical devices which include such structures. These new methods are the only known approach to making defect-free high-dielectric contrast grating structures, which are smaller and more efficient than are conventional grating structures.
Business Models of High Performance Computing Centres in Higher Education in Europe

Science.gov (United States)

Eurich, Markus; Calleja, Paul; Boutellier, Roman

2013-01-01

High performance computing (HPC) service centres are a vital part of the academic infrastructure of higher education organisations. However, despite their importance for research and the necessary high capital expenditures, business research on HPC service centres is mostly missing. From a business perspective, it is important to find an answer to…
An Overview of Reconfigurable Hardware in Embedded Systems

Directory of Open Access Journals (Sweden)

Wenyin Fu

2006-09-01

Over the past few years, the realm of embedded systems has expanded to include a wide variety of products, ranging from digital cameras, to sensor networks, to medical imaging systems. Consequently, engineers strive to create ever smaller and faster products, many of which have stringent power requirements. Coupled with increasing pressure to decrease costs and time-to-market, the design constraints of embedded systems pose a serious challenge to embedded systems designers. Reconfigurable hardware can provide a flexible and efficient platform for satisfying the area, performance, cost, and power requirements of many embedded systems. This article presents an overview of reconfigurable computing in embedded systems, in terms of benefits it can provide, how it has already been used, design issues, and hurdles that have slowed its adoption.
Confabulation Based Real-time Anomaly Detection for Wide-area Surveillance Using Heterogeneous High Performance Computing Architecture

Science.gov (United States)

2015-06-01

SYRACUSE... Contract number: FA8750-12-1-0251. ...processors including graphic processor units (GPUs) and Intel Xeon Phi processors. Experimental results showed significant speedups, which can enable
High Performance Computing Facility Operational Assessment, FY 2010 Oak Ridge Leadership Computing Facility

Energy Technology Data Exchange (ETDEWEB)

Bland, Arthur S Buddy [ORNL; Hack, James J [ORNL; Baker, Ann E [ORNL; Barker, Ashley D [ORNL; Boudwin, Kathlyn J. [ORNL; Kendall, Ricky A [ORNL; Messer, Bronson [ORNL; Rogers, James H [ORNL; Shipman, Galen M [ORNL; White, Julia C [ORNL

2010-08-01

Oak Ridge National Laboratory's (ORNL's) Cray XT5 supercomputer, Jaguar, kicked off the era of petascale scientific computing in 2008 with applications that sustained more than a thousand trillion floating point calculations per second - or 1 petaflop. Jaguar continues to grow even more powerful as it helps researchers broaden the boundaries of knowledge in virtually every domain of computational science, including weather and climate, nuclear energy, geosciences, combustion, bioenergy, fusion, and materials science. Their insights promise to broaden our knowledge in areas that are vitally important to the Department of Energy (DOE) and the nation as a whole, particularly energy assurance and climate change. The science of the 21st century, however, will demand further revolutions in computing, supercomputers capable of a million trillion calculations a second - 1 exaflop - and beyond. These systems will allow investigators to continue attacking global challenges through modeling and simulation and to unravel longstanding scientific questions. Creating such systems will also require new approaches to daunting challenges. High-performance systems of the future will need to be codesigned for scientific and engineering applications with best-in-class communications networks and data-management infrastructures and teams of skilled researchers able to take full advantage of these new resources. The Oak Ridge Leadership Computing Facility (OLCF) provides the nation's most powerful open resource for capability computing, with a sustainable path that will maintain and extend national leadership for DOE's Office of Science (SC). The OLCF has engaged a world-class team to support petascale science and to take a dramatic step forward, fielding new capabilities for high-end science. This report highlights the successful delivery and operation of a petascale system and shows how the OLCF fosters application development teams, developing cutting-edge tools
Using high performance interconnects in a distributed computing and mass storage environment

International Nuclear Information System (INIS)

Ernst, M.

1994-01-01

Detector Collaborations of the HERA Experiments typically involve more than 500 physicists from a few dozen institutes. These physicists require access to large amounts of data in a fully transparent manner. Important issues include Distributed Mass Storage Management Systems in a Distributed and Heterogeneous Computing Environment. At the very center of a distributed system that includes tens of CPUs and network-attached mass storage peripherals are the communication links. Today scientists are witnessing an integration of computing and communication technology, with the 'network' becoming the computer. This contribution reports on a centrally operated computing facility for the HERA Experiments at DESY, including Symmetric Multiprocessor Machines (84 Processors), presently more than 400 GByte of magnetic disk and 40 TB of automated tape storage, tied together by a HIPPI 'network'. Focussing on the High Performance Interconnect technology, details will be provided about the HIPPI-based 'Backplane' configured around a 20 Gigabit/s Multi Media Router and the performance and efficiency of the related computer interfaces.
Embedded Sensors and Controls to Improve Component Performance and Reliability Conceptual Design Report

Energy Technology Data Exchange (ETDEWEB)

Kisner, R.; Melin, A.; Burress, T.; Fugate, D.; Holcomb, D.; Wilgen, J.; Miller, J.; Wilson, D.; Silva, P.; Whitlow, L.; Peretz, F.

2012-09-15

The objective of this project is to demonstrate improved reliability and increased performance made possible by deeply embedding instrumentation and controls (I&C) in nuclear power plant (NPP) components and systems. The project is employing a highly instrumented canned rotor, magnetic bearing, fluoride salt pump as its I&C technology demonstration platform. I&C is intimately part of the basic millisecond-by-millisecond functioning of the system; treating I&C as an integral part of the system design is innovative and will allow significant improvement in capabilities and performance. As systems become more complex and greater performance is required, traditional I&C design techniques become inadequate and more advanced I&C needs to be applied. New I&C techniques enable optimal and reliable performance and tolerance of noise and uncertainties in the system rather than merely monitoring quasistable performance. Traditionally, I&C has been incorporated in NPP components after the design is nearly complete; adequate performance was obtained through over-design. By incorporating I&C at the beginning of the design phase, the control system can provide superior performance and reliability and enable designs that are otherwise impossible. This report describes the progress and status of the project and provides a conceptual design overview for the platform to demonstrate the performance and reliability improvements enabled by advanced embedded I&C.
Parallel Backprojection: A Case Study in High-Performance Reconfigurable Computing

Directory of Open Access Journals (Sweden)

Cordes Ben

2009-01-01

High-performance reconfigurable computing (HPRC) is a novel approach to provide large-scale computing power to modern scientific applications. Using both general-purpose processors and FPGAs allows application designers to exploit fine-grained and coarse-grained parallelism, achieving high degrees of speedup. One scientific application that benefits from this technique is backprojection, an image formation algorithm that can be used as part of a synthetic aperture radar (SAR) processing system. We present an implementation of backprojection for SAR on an HPRC system. Using simulated data taken at a variety of ranges, our implementation runs over 200 times faster than a similar software program, with an overall application speedup better than 50x. The backprojection application is easily parallelizable, achieving near-linear speedup when run on multiple nodes of a clustered HPRC system. The results presented can be applied to other systems and other algorithms with similar characteristics.
Parallel Backprojection: A Case Study in High-Performance Reconfigurable Computing

Directory of Open Access Journals (Sweden)

2009-03-01

High-performance reconfigurable computing (HPRC) is a novel approach to provide large-scale computing power to modern scientific applications. Using both general-purpose processors and FPGAs allows application designers to exploit fine-grained and coarse-grained parallelism, achieving high degrees of speedup. One scientific application that benefits from this technique is backprojection, an image formation algorithm that can be used as part of a synthetic aperture radar (SAR) processing system. We present an implementation of backprojection for SAR on an HPRC system. Using simulated data taken at a variety of ranges, our implementation runs over 200 times faster than a similar software program, with an overall application speedup better than 50x. The backprojection application is easily parallelizable, achieving near-linear speedup when run on multiple nodes of a clustered HPRC system. The results presented can be applied to other systems and other algorithms with similar characteristics.
High performance computing environment for multidimensional image analysis.

Science.gov (United States)

Rao, A Ravishankar; Cecchi, Guillermo A; Magnasco, Marcelo

2007-07-10

The processing of images acquired through microscopy is a challenging task due to the large size of datasets (several gigabytes) and the fast turnaround time required. If the throughput of the image processing stage is significantly increased, it can have a major impact in microscopy applications. We present a high performance computing (HPC) solution to this problem. This involves decomposing the spatial 3D image into segments that are assigned to unique processors, and matched to the 3D torus architecture of the IBM Blue Gene/L machine. Communication between segments is restricted to the nearest neighbors. When running on a 2 GHz Intel CPU, the task of 3D median filtering on a typical 256 megabyte dataset takes two and a half hours, whereas by using 1024 nodes of Blue Gene, this task can be performed in 18.8 seconds, a 478x speedup. Our parallel solution dramatically improves the performance of image processing, feature extraction and 3D reconstruction tasks. This increased throughput permits biologists to conduct unprecedented large scale experiments with massive datasets.
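The speedup figure quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the abstract; the variable names are assumptions. Because the baseline is an Intel CPU rather than a single Blue Gene node, the speedup-per-node ratio is only a rough indicator and should not be read as a strict parallel efficiency.

```python
# Figures taken from the abstract: 2.5 hours on one CPU, 18.8 s on 1024 nodes.
serial_seconds = 2.5 * 3600   # two and a half hours, in seconds
parallel_seconds = 18.8
nodes = 1024

speedup = serial_seconds / parallel_seconds
speedup_per_node = speedup / nodes  # rough indicator only (different baseline CPU)

print(f"speedup: about {speedup:.1f}x")
print(f"speedup per node: about {speedup_per_node:.2f}")
```

The computed value is consistent with the 478x reported in the abstract.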
A checkpoint compression study for high-performance computing systems

Energy Technology Data Exchange (ETDEWEB)

Ibtesham, Dewan [Univ. of New Mexico, Albuquerque, NM (United States). Dept. of Computer Science; Ferreira, Kurt B. [Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States). Scalable System Software Dept.; Arnold, Dorian [Univ. of New Mexico, Albuquerque, NM (United States). Dept. of Computer Science

2015-02-17

As high-performance computing systems continue to increase in size and complexity, higher failure rates and increased overheads for checkpoint/restart (CR) protocols have raised concerns about the practical viability of CR protocols for future systems. Previously, compression has proven to be a viable approach for reducing checkpoint data volumes and, thereby, reducing CR protocol overhead, leading to improved application performance. In this article, we further explore compression-based CR optimization by exploring its baseline performance and scaling properties, evaluating whether improved compression algorithms might lead to even better application performance and comparing checkpoint compression against and alongside other software- and hardware-based optimizations. The highlights of our results are: (1) compression is a very viable CR optimization; (2) generic, text-based compression algorithms appear to perform near optimally for checkpoint data compression and faster compression algorithms will not lead to better application performance; (3) compression-based optimizations fare well against and alongside other software-based optimizations; and (4) while hardware-based optimizations outperform software-based ones, they are not as cost effective.
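To make the trade-off behind checkpoint compression concrete, here is a minimal sketch of a simple serial cost model: compress first, then write. This model, the function names `checkpoint_time` and `compression_pays_off`, and all bandwidth figures are assumptions for illustration; they are not taken from the article, whose own model and measurements may differ. `zlib` stands in for a generic compressor.

```python
import zlib


def checkpoint_time(size_bytes, write_bw, ratio=1.0, compress_bw=None):
    """Seconds to commit one checkpoint under a compress-then-write model.

    ratio       -- compressed size / original size (1.0 means no compression)
    compress_bw -- compressor throughput in bytes/s, or None for no compression
    """
    compress_time = 0.0 if compress_bw is None else size_bytes / compress_bw
    return compress_time + (size_bytes * ratio) / write_bw


def compression_pays_off(ratio, write_bw, compress_bw):
    """True when compressing and writing is faster than writing uncompressed.

    size/c + ratio*size/w < size/w  is equivalent to  c > w / (1 - ratio).
    """
    return ratio < 1.0 and compress_bw > write_bw / (1.0 - ratio)


# Synthetic, highly repetitive 1 MiB buffer (hypothetical stand-in for checkpoint data).
data = bytes(range(256)) * 4096
ratio = len(zlib.compress(data, 6)) / len(data)

# Hypothetical bandwidths: 100 MB/s storage, 300 MB/s compressor.
print(compression_pays_off(ratio, write_bw=100e6, compress_bw=300e6))
```

Under this model the break-even point depends only on the compression ratio and on the ratio of compressor throughput to storage bandwidth, not on the checkpoint size.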
14th annual Results and Review Workshop on High Performance Computing in Science and Engineering

CERN Document Server

Nagel, Wolfgang E; Resch, Michael M; Transactions of the High Performance Computing Center, Stuttgart (HLRS) 2011; High Performance Computing in Science and Engineering '11

2012-01-01

This book presents the state-of-the-art in simulation on supercomputers. Leading researchers present results achieved on systems of the High Performance Computing Center Stuttgart (HLRS) for the year 2011. The reports cover all fields of computational science and engineering, ranging from CFD to computational physics and chemistry, to computer science, with a special emphasis on industrially relevant applications. Presenting results for both vector systems and microprocessor-based systems, the book allows readers to compare the performance levels and usability of various architectures. As HLRS
Acceleration of FDTD mode solver by high-performance computing techniques.

Science.gov (United States)

Han, Lin; Xi, Yanping; Huang, Wei-Ping

2010-06-21

A two-dimensional (2D) compact finite-difference time-domain (FDTD) mode solver is developed based on wave equation formalism in combination with the matrix pencil method (MPM). The method is validated for calculation of both real guided and complex leaky modes of typical optical waveguides against the benchmark finite-difference (FD) eigenmode solver. By taking advantage of the inherent parallel nature of the FDTD algorithm, the mode solver is implemented on graphics processing units (GPUs) using the compute unified device architecture (CUDA). It is demonstrated that the high-performance computing technique leads to significant acceleration of the FDTD mode solver, with more than 30 times improvement in computational efficiency in comparison with the conventional FDTD mode solver running on the CPU of a standard desktop computer. The computational efficiency of the accelerated FDTD method is of the same order of magnitude as that of the standard finite-difference eigenmode solver, yet it requires much less memory (e.g., less than 10%). Therefore, the new method may serve as an efficient, accurate and robust tool for mode calculation of optical waveguides even when the conventional eigenvalue mode solvers are no longer applicable due to memory limitation.
SISYPHUS: A high performance seismic inversion factory

Science.gov (United States)

Gokhberg, Alexey; Simutė, Saulė; Boehm, Christian; Fichtner, Andreas

2016-04-01

branches for the static process setup, inversion iterations, and solver runs, each branch specifying information at the event, station and channel levels. The workflow management framework is based on an embedded scripting engine that allows definition of various workflow scenarios using a high-level scripting language and provides access to all available inversion components represented as standard library functions. At present the SES3D wave propagation solver is integrated in the solution; the work is in progress for interfacing with SPECFEM3D. A separate framework is designed for interoperability with an optimization module; the workflow manager and optimization process run in parallel and cooperate by exchanging messages according to a specially designed protocol. A library of high-performance modules implementing signal pre-processing, misfit and adjoint computations according to established good practices is included. Monitoring is based on information stored in the inversion state database and at present implements a command line interface; design of a graphical user interface is in progress. The software design fits well into the common massively parallel system architecture featuring a large number of computational nodes running distributed applications under control of batch-oriented resource managers. The solution prototype has been implemented on the "Piz Daint" supercomputer provided by the Swiss Supercomputing Centre (CSCS).
Multi-Language Programming Environments for High Performance Java Computing

Directory of Open Access Journals (Sweden)

Vladimir Getov

1999-01-01

Recent developments in processor capabilities, software tools, programming languages and programming paradigms have brought about new approaches to high performance computing. A steadfast component of this dynamic evolution has been the scientific community’s reliance on established scientific packages. As a consequence, programmers of high‐performance applications are reluctant to embrace evolving languages such as Java. This paper describes the Java‐to‐C Interface (JCI) tool which provides application programmers wishing to use Java with immediate accessibility to existing scientific packages. The JCI tool also facilitates rapid development and reuse of existing code. These benefits are provided at minimal cost to the programmer. While beneficial to the programmer, the additional advantages of mixed‐language programming in terms of application performance and portability are addressed in detail within the context of this paper. In addition, we discuss how the JCI tool is complementing other ongoing projects such as IBM’s High‐Performance Compiler for Java (HPCJ) and IceT’s metacomputing environment.
What Physicists Should Know About High Performance Computing - Circa 2002

Science.gov (United States)

Frederick, Donald

2002-08-01

High Performance Computing (HPC) is a dynamic, cross-disciplinary field that traditionally has involved applied mathematicians, computer scientists, and others primarily from the various disciplines that have been major users of HPC resources - physics, chemistry, engineering, with increasing use by those in the life sciences. There is a technological dynamic that is powered by economic as well as by technical innovations and developments. This talk will discuss practical ideas to be considered when developing numerical applications for research purposes. Even with the rapid pace of development in the field, the author believes that these concepts will not become obsolete for a while, and will be of use to scientists who either are considering, or who have already started down the HPC path. These principles will be applied in particular to current parallel HPC systems, but there will also be references of value to desktop users. The talk will cover such topics as: computing hardware basics, single-cpu optimization, compilers, timing, numerical libraries, debugging and profiling tools and the emergence of Computational Grids.
RISC Processors and High Performance Computing

Science.gov (United States)

Bailey, David H.; Saini, Subhash; Craw, James M. (Technical Monitor)

1995-01-01

This tutorial will discuss the top five RISC microprocessors and the parallel systems in which they are used. It will provide a unique cross-machine comparison not available elsewhere. The effective performance of these processors will be compared by citing standard benchmarks in the context of real applications. The latest NAS Parallel Benchmarks, both absolute performance and performance per dollar, will be listed. The next generation of the NPB will be described. The tutorial will conclude with a discussion of future directions in the field. Technology Transfer Considerations: All of these computer systems are commercially available internationally. Information about these processors is available in the public domain, mostly from the vendors themselves. The NAS Parallel Benchmarks and their results have been previously approved numerous times for public release, beginning back in 1991.
Tensor Train Neighborhood Preserving Embedding

Science.gov (United States)

Wang, Wenqi; Aggarwal, Vaneet; Aeron, Shuchin

2018-05-01

In this paper, we propose a Tensor Train Neighborhood Preserving Embedding (TTNPE) to embed multi-dimensional tensor data into a low-dimensional tensor subspace. Novel approaches to solve the optimization problem in TTNPE are proposed. For this embedding, we evaluate a novel trade-off gain among classification, computation, and dimensionality reduction (storage) for supervised learning. It is shown that, compared to state-of-the-art tensor embedding methods, TTNPE achieves a superior trade-off in classification, computation, and dimensionality reduction on the MNIST handwritten digits and Weizmann face datasets.
Small Private Key MQPKS on an Embedded Microprocessor

Directory of Open Access Journals (Sweden)

Hwajeong Seo

2014-03-01

Multivariate quadratic (MQ) cryptography requires the use of long public and private keys to ensure a sufficient security level, but this is not favorable to embedded systems, which have limited system resources. Recently, various approaches to MQ cryptography using reduced public keys have been studied. As a result, at CHES2011 (Cryptographic Hardware and Embedded Systems, 2011), a small public key MQ scheme was proposed, and its feasible implementation on an embedded microprocessor was reported at CHES2012. However, the implementation of a small private key MQ scheme was not reported. For efficient implementation, random number generators can help reduce the key size, but using a random number generator is much more costly than computing MQ on modern microprocessors. Therefore, no feasible results have been reported on embedded microprocessors. In this paper, we propose a feasible implementation on embedded microprocessors for a small private key MQ scheme using a pseudo-random number generator and hash function based on a block-cipher exploiting a hardware Advanced Encryption Standard (AES) accelerator. To speed up the performance, we apply various implementation methods, including parallel computation, on-the-fly computation, optimized logarithm representation, vinegar monomials and assembly programming. The proposed method reduces the private key size by about 99.9% and speeds up signature generation and verification by 5.78% and 12.19%, respectively, compared with the previous results from CHES2012.
Comprehensive Simulation Lifecycle Management for High Performance Computing Modeling and Simulation, Phase I

Data.gov (United States)

National Aeronautics and Space Administration — There are significant logistical barriers to entry-level high performance computing (HPC) modeling and simulation (M IllinoisRocstar) sets up the infrastructure for...

How to build a high-performance compute cluster for the Grid

CERN Document Server

Reinefeld, A

2001-01-01

The success of large-scale multi-national projects like the forthcoming analysis of the LHC particle collision data at CERN relies to a great extent on the ability to efficiently utilize computing and data-storage resources at geographically distributed sites. Currently, much effort is spent on the design of Grid management software (Datagrid, Globus, etc.), while the effective integration of computing nodes has been largely neglected up to now. This is the focus of our work. We present a framework for a high-performance cluster that can be used as a reliable computing node in the Grid. We outline the cluster architecture, the management of distributed data and the seamless integration of the cluster into the Grid environment. (11 refs).
A Simulation Approach for Performance Validation during Embedded Systems Design

Science.gov (United States)

Wang, Zhonglei; Haberl, Wolfgang; Herkersdorf, Andreas; Wechs, Martin

Due to the time-to-market pressure, it is highly desirable to design hardware and software of embedded systems in parallel. However, hardware and software are developed mostly using very different methods, so that performance evaluation and validation of the whole system is not an easy task. In this paper, we propose a simulation approach to bridge the gap between model-driven software development and simulation based hardware design, by merging hardware and software models into a SystemC based simulation environment. An automated procedure has been established to generate software simulation models from formal models, while the hardware design is originally modeled in SystemC. As the simulation models are annotated with timing information, performance issues are tackled in the same pass as system functionality, rather than in a dedicated approach.
High-Performance Computing in Neuroscience for Data-Driven Discovery, Integration, and Dissemination

International Nuclear Information System (INIS)

Bouchard, Kristofer E.

2016-01-01

A lack of coherent plans to analyze, manage, and understand data threatens the various opportunities offered by new neuro-technologies. High-performance computing will allow exploratory analysis of massive datasets stored in standardized formats, hosted in open repositories, and integrated with simulations.
High Performance Computing - Power Application Programming Interface Specification Version 2.0.

Energy Technology Data Exchange (ETDEWEB)

Laros, James H. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Grant, Ryan [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Levenhagen, Michael J. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Olivier, Stephen Lecler [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Pedretti, Kevin [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Ward, H. Lee [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Younge, Andrew J. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

2017-03-01

Measuring and controlling the power and energy consumption of high performance computing systems by various components in the software stack is an active research area. Implementations in lower level software layers are beginning to emerge in some production systems, which is very welcome. To be most effective, a portable interface to measurement and control features would significantly facilitate participation by all levels of the software stack. We present a proposal for a standard power Application Programming Interface (API) that endeavors to cover the entire software space, from generic hardware interfaces to the input from the computer facility manager.
Bimetallic CoNiSx nanocrystallites embedded in nitrogen-doped carbon anchored on reduced graphene oxide for high-performance supercapacitors.

Science.gov (United States)

Chen, Qidi; Miao, Jinkang; Quan, Liang; Cai, Daoping; Zhan, Hongbing

2018-02-22

Exploring high-performance and low-priced electrode materials for supercapacitors is important but remains challenging. In this work, a unique sandwich-like nanocomposite of reduced graphene oxide (rGO)-supported N-doped carbon embedded with ultrasmall CoNiSx nanocrystallites (rGO/CoNiSx/N-C nanocomposite) has been successfully designed and synthesized by a simple one-step carbonization/sulfurization treatment of the rGO/Co-Ni precursor. The intriguing structural/compositional/morphological advantages endow the as-synthesized rGO/CoNiSx/N-C nanocomposite with excellent electrochemical performance as an advanced electrode material for supercapacitors. Compared with the other two rGO/CoNiOx and rGO/CoNiSx nanocomposites, the rGO/CoNiSx/N-C nanocomposite exhibits much enhanced performance, including a high specific capacitance (1028.2 F g⁻¹ at 1 A g⁻¹), excellent rate capability (89.3% capacitance retention at 10 A g⁻¹) and good cycling stability (93.6% capacitance retention over 2000 cycles). In addition, an asymmetric supercapacitor (ASC) device based on the rGO/CoNiSx/N-C nanocomposite as the cathode and activated carbon (AC) as the anode is also fabricated, which can deliver a high energy density of 32.9 W h kg⁻¹ at a power density of 229.2 W kg⁻¹ with desirable cycling stability. These electrochemical results evidently indicate the great potential of the sandwich-like rGO/CoNiSx/N-C nanocomposite for applications in high-performance supercapacitors.
BEAGLE: an application programming interface and high-performance computing library for statistical phylogenetics.

Science.gov (United States)

Ayres, Daniel L; Darling, Aaron; Zwickl, Derrick J; Beerli, Peter; Holder, Mark T; Lewis, Paul O; Huelsenbeck, John P; Ronquist, Fredrik; Swofford, David L; Cummings, Michael P; Rambaut, Andrew; Suchard, Marc A

2012-01-01

Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software.
Integrated State Estimation and Contingency Analysis Software Implementation using High Performance Computing Techniques

Energy Technology Data Exchange (ETDEWEB)

Chen, Yousu; Glaesemann, Kurt R.; Rice, Mark J.; Huang, Zhenyu

2015-12-31

Power system simulation tools are traditionally developed in sequential mode and codes are optimized for single core computing only. However, the increasing complexity in the power grid models requires more intensive computation. The traditional simulation tools will soon not be able to meet the grid operation requirements. Therefore, power system simulation tools need to evolve accordingly to provide faster and better results for grid operations. This paper presents an integrated state estimation and contingency analysis software implementation using high performance computing techniques. The software is able to solve large size state estimation problems within one second and achieve a near-linear speedup of 9,800 with 10,000 cores for contingency analysis application. The performance evaluation is presented to show its effectiveness.
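The scaling figure quoted in this abstract can be read as a parallel efficiency. The short sketch below only restates that arithmetic; the function name `parallel_efficiency` is a hypothetical helper introduced here, and the definition used (speedup divided by core count) is the conventional one, assumed rather than taken from the paper.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Return speedup divided by core count (1.0 would be ideal linear scaling)."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return speedup / cores


# Figures quoted in the abstract: a speedup of 9,800 on 10,000 cores.
efficiency = parallel_efficiency(9800, 10000)
print(f"parallel efficiency: {efficiency:.2%}")
```

Under that definition, the reported contingency analysis run would correspond to an efficiency of about 98%, which is consistent with the abstract's "near-linear" wording.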
Smart Multicore Embedded Systems

DEFF Research Database (Denmark)

This book provides a single-source reference to the state-of-the-art of high-level programming models and compilation tool-chains for embedded system platforms. The authors address challenges faced by programmers developing software to implement parallel applications in embedded systems, where very...... specificities of various embedded systems from different industries. Parallel programming tool-chains are described that take as input parameters both the application and the platform model, then determine relevant transformations and mapping decisions on the concrete platform, minimizing user intervention...... and hiding the difficulties related to the correct and efficient use of memory hierarchy and low level code generation. Describes tools and programming models for multicore embedded systems Emphasizes throughout performance per watt scalability Discusses realistic limits of software parallelization Enables...
Trends in high-performance computing for engineering calculations.

Science.gov (United States)

Giles, M B; Reguly, I

2014-08-13

High-performance computing has evolved remarkably over the past 20 years, and that progress is likely to continue. However, in recent years, this progress has been achieved through greatly increased hardware complexity with the rise of multicore and manycore processors, and this is affecting the ability of application developers to achieve the full potential of these systems. This article outlines the key developments on the hardware side, both in the recent past and in the near future, with a focus on two key issues: energy efficiency and the cost of moving data. It then discusses the much slower evolution of system software, and the implications of all of this for application developers. © 2014 The Author(s) Published by the Royal Society. All rights reserved.
An Embedded Sensor Node Microcontroller with Crypto-Processors.

Science.gov (United States)

Panić, Goran; Stecklina, Oliver; Stamenković, Zoran

2016-04-27

Wireless sensor network applications range from industrial automation and control, agricultural and environmental protection, to surveillance and medicine. In most applications, data are highly sensitive and must be protected from any type of attack and abuse. Security challenges in wireless sensor networks are mainly defined by the power and computing resources of sensor devices, memory size, quality of radio channels and susceptibility to physical capture. In this article, an embedded sensor node microcontroller designed to support sensor network applications with severe security demands is presented. It features a low-power 16-bit processor core supported by a number of hardware accelerators designed to perform complex operations required by advanced crypto algorithms. The microcontroller integrates an embedded Flash and an 8-channel 12-bit analog-to-digital converter, making it a good solution for low-power sensor nodes. The article discusses the most important security topics in wireless sensor networks and presents the architecture of the proposed hardware solution. Furthermore, it gives details on the chip implementation, verification and hardware evaluation. Finally, the chip power dissipation and performance figures are estimated and analyzed.
Evaluation of Maintenance and EOL Operation Performance of Sensor-Embedded Laptops

Directory of Open Access Journals (Sweden)

Mehmet Talha Dulman

2018-01-01

Sensors are commonly employed to monitor products during their life cycles and to remotely and continuously track their usage patterns. Installing sensors into products can help generate useful data related to the conditions of products and their components, and this information can subsequently be used to inform EOL decision-making. As such, embedded sensors can enhance the performance of EOL product processing operations. The information collected by the sensors can also be used to estimate and predict product failures, thereby helping to improve maintenance operations. This paper describes a study in which system maintenance and EOL processes were combined and closed-loop supply chain systems were constructed to analyze the financial contribution that sensors can make to these procedures by using discrete event simulation to model and compare regular systems and sensor-embedded systems. The factors that had an impact on the performance measures, such as disassembly cost, maintenance cost, inspection cost, sales revenues, and profitability, were determined and a design of experiments study was carried out. The experiment results were compared, and pairwise t-tests were executed. The results reveal that sensor-embedded systems are significantly superior to regular systems in terms of the identified performance measures.
Towards OpenVL: Improving Real-Time Performance of Computer Vision Applications

Science.gov (United States)

Shen, Changsong; Little, James J.; Fels, Sidney

Meeting constraints for real-time performance is a main issue for computer vision, especially for embedded computer vision systems. This chapter presents our progress on our open vision library (OpenVL), a novel software architecture to address efficiency through facilitating hardware acceleration, reusability, and scalability for computer vision systems. A logical image understanding pipeline is introduced to allow parallel processing. We also discuss progress on our middleware, the vision library utility toolkit (VLUT), which enables applications to operate transparently over a heterogeneous collection of hardware implementations. OpenVL works as a state machine, with an event-driven mechanism to provide users with application-level interaction. Various explicit or implicit synchronization and communication methods are supported among distributed processes in the logical pipelines. The intent of OpenVL is to allow users to quickly and easily recover useful information from multiple scenes, in a cross-platform, cross-language manner across various software environments and hardware platforms. To validate the critical underlying concepts of OpenVL, a human tracking system and a local positioning system are implemented and described. The novel architecture separates the specification of algorithmic details from the underlying implementation, allowing for different components to be implemented on an embedded system without recompiling code.
The effects of perceived USB-delay for sensor and embedded system development.

Science.gov (United States)

Du, J; Kade, D; Gerdtman, C; Ozcan, O; Linden, M

2016-08-01

Perceived delay in computer input devices is a problem that becomes even more prominent when the devices are used in healthcare applications and/or in small, embedded systems. Therefore, the amount of delay found acceptable when using computer input devices was investigated in this paper. A device was developed to perform a benchmark test for the perception of delay. The delay can be set from 0 to 999 milliseconds (ms) between a receiving computer and an available USB device. The USB device can be a mouse, a keyboard or some other type of USB-connected input device. Feedback from user tests performed with 36 people forms the basis for determining time limits for USB data processing in microprocessors and embedded systems such that users do not notice the delay. For this paper, tests were performed with a personal computer and a common computer mouse, testing the perception of delays between 0 and 500 ms. The results of our user tests show that perceived delays up to 150 ms were acceptable and delays larger than 300 ms were not acceptable at all.
Optical high-performance computing: introduction to the JOSA A and Applied Optics feature.

Science.gov (United States)

Caulfield, H John; Dolev, Shlomi; Green, William M J

2009-08-01

The feature issues in both Applied Optics and the Journal of the Optical Society of America A focus on topics of immediate relevance to the community working in the area of optical high-performance computing.
Performance Analysis of Ivshmem for High-Performance Computing in Virtual Machines

Science.gov (United States)

Ivanovic, Pavle; Richter, Harald

2018-01-01

High-performance computing (HPC) is rarely accomplished via virtual machines (VMs). In this paper, we present a remake of ivshmem which can change this. Ivshmem was a shared memory (SHM) between virtual machines on the same server, with SHM-access synchronization included, until about 5 years ago when newer versions of Linux and its virtualization library libvirt evolved. We restored that SHM-access synchronization feature because it is indispensable for HPC and made ivshmem runnable with contemporary versions of Linux, libvirt, KVM, QEMU and especially MPICH, which is an implementation of MPI, the standard HPC communication library. Additionally, we transparently modified MPICH to include ivshmem, resulting in a three- to tenfold performance improvement compared to TCP/IP. Furthermore, we have transparently replaced MPI_PUT, a one-sided MPICH communication mechanism, with our own MPI_PUT wrapper. As a result, our ivshmem even surpasses non-virtualized SHM data transfers for block lengths greater than 512 KBytes, showing the benefits of virtualization. All improvements were possible without using SR-IOV.
Use of high performance computing to examine the effectiveness of aquifer remediation

International Nuclear Information System (INIS)

Tompson, A.F.B.; Ashby, S.F.; Falgout, R.D.; Smith, S.G.; Fogwell, T.W.; Loosmore, G.A.

1994-06-01

Large-scale simulation of fluid flow and chemical migration is being used to study the effectiveness of pump-and-treat restoration of a contaminated, saturated aquifer. A three-element approach focusing on geostatistical representations of heterogeneous aquifers, high-performance computing strategies for simulating flow, migration, and reaction processes in large three-dimensional systems, and highly resolved simulations of flow and chemical migration in porous formations will be discussed. Results from a preliminary application of this approach to examine pumping behavior at a real, heterogeneous field site will be presented. Future activities will emphasize parallel computations in larger, dynamic, and nonlinear (two-phase) flow problems as well as improved interpretive methods for defining detailed material property distributions.
High-Reynolds Number Viscous Flow Simulations on Embedded-Boundary Cartesian Grids

Science.gov (United States)

2016-05-05

AFRL-AFOSR-VA-TR-2016-0192. High-Reynolds Number Viscous Flow Simulations on Embedded-Boundary Cartesian Grids. Marsha Berger, New York University. Final report. Report date: 30/04/2016. Grant number: FA9550-13-1...
Implementation of an Embedded Web Server Application for Wireless Control of Brain Computer Interface Based Home Environments.

Science.gov (United States)

Aydın, Eda Akman; Bay, Ömer Faruk; Güler, İnan

2016-01-01

Brain Computer Interface (BCI) based environment control systems could make life easier for people with neuromuscular diseases, reduce their dependence on caregivers, and improve their quality of life. In addition to ease of use, low cost, and robust system performance, mobility is an important functionality expected from a practical BCI system in real life. In this study, in order to enhance users' mobility, we propose internet based wireless communication between the BCI system and the home environment. We designed and implemented a prototype of an embedded low-cost, low-power, easy-to-use web server which is employed in internet based wireless control of a BCI based home environment. The embedded web server provides remote access to the environmental control module through BCI and web interfaces. While the proposed system offers BCI users enhanced mobility, it also provides remote control of the home environment by caregivers as well as by individuals in the initial stages of neuromuscular disease. The input of the BCI system is P300 potentials. We used the Region Based Paradigm (RBP) as the stimulus interface. Performance of the BCI system is evaluated on data recorded from 8 non-disabled subjects. The experimental results indicate that the proposed web server successfully enables internet based wireless control of electrical home appliances through BCIs.
Improving the Eco-Efficiency of High Performance Computing Clusters Using EECluster

Directory of Open Access Journals (Sweden)

Alberto Cocaña-Fernández

2016-03-01

As data and supercomputing centres increase their performance to improve service quality and target more ambitious challenges every day, their carbon footprint also continues to grow, and has already reached the magnitude of the aviation industry. Moreover, high power consumption is becoming a notable economic bottleneck for the expansion of these infrastructures because sufficient energy sources are unavailable. A substantial part of the problem is caused by the current energy consumption of High Performance Computing (HPC) clusters. To alleviate this situation, we present in this work EECluster, a tool that integrates with multiple open-source Resource Management Systems to significantly reduce the carbon footprint of clusters by improving their energy efficiency. EECluster implements a dynamic power management mechanism based on Computational Intelligence techniques by learning a set of rules through multi-criteria evolutionary algorithms. This approach enables cluster operators to find the optimal balance between a reduction in the cluster energy consumption, service quality, and number of reconfigurations. Experimental studies using both synthetic and actual workloads from a real world cluster support the adoption of this tool to reduce the carbon footprint of HPC clusters.
7th International Workshop on Parallel Tools for High Performance Computing

CERN Document Server

Gracia, José; Nagel, Wolfgang; Resch, Michael

2014-01-01

Current advances in High Performance Computing (HPC) increasingly impact efficient software development workflows. Programmers for HPC applications need to consider trends such as increased core counts, multiple levels of parallelism, reduced memory per core, and I/O system challenges in order to derive well performing and highly scalable codes. At the same time, the increasing complexity adds further sources of program defects. While novel programming paradigms and advanced system libraries provide solutions for some of these challenges, appropriate supporting tools are indispensable. Such tools aid application developers in debugging, performance analysis, or code optimization and therefore make a major contribution to the development of robust and efficient parallel software. This book introduces a selection of the tools presented and discussed at the 7th International Parallel Tools Workshop, held in Dresden, Germany, September 3-4, 2013.

Low-cost, high-performance and efficiency computational photometer design

Science.gov (United States)

Siewert, Sam B.; Shihadeh, Jeries; Myers, Randall; Khandhar, Jay; Ivanov, Vitaly

2014-05-01

Researchers at the University of Alaska Anchorage and University of Colorado Boulder have built a low-cost, high-performance and high-efficiency drop-in-place Computational Photometer (CP) to test in field applications ranging from port security and safety monitoring to environmental compliance monitoring and surveying. The CP integrates off-the-shelf visible spectrum cameras with near to long wavelength infrared detectors and high resolution digital snapshots in a single device. The proof of concept combines three or more detectors into a single multichannel imaging system that can time correlate read-out, capture, and image process all of the channels concurrently with high performance and energy efficiency. The dual-channel continuous read-out is combined with a third high definition digital snapshot capability and has been designed using an FPGA (Field Programmable Gate Array) to capture, decimate, down-convert, re-encode, and transform images from two standard definition CCD (Charge Coupled Device) cameras at 30 Hz. The continuous stereo vision can be time correlated to megapixel high definition snapshots. This proof of concept has been fabricated as a four-layer PCB (Printed Circuit Board) suitable for use in education and research for low-cost, high-efficiency field monitoring applications that need multispectral and three dimensional imaging capabilities. Initial testing is in progress and includes field testing in ports, potential test flights in unmanned aerial systems, and future planned missions to image harsh environments in the arctic including volcanic plumes, ice formation, and arctic marine life.
Research of real-time performance based on VxWorks embedded system

International Nuclear Information System (INIS)

Liu Daming; Li Haiming

2011-01-01

Research into the mechanism and heating efficiency of Ion Cyclotron Range of Frequency (ICRF) heating requires a data acquisition system with high real-time performance. Real-time data for the system are obtained by testing with the system logic analyzer, SPY, and other relevant software on the VxWorks embedded operating system. The real-time level achieved balances used time against processor idle time, supports real-time data acquisition, minimizes external interference with the system, and ensures that the system works along its own scheduling trajectory. The interrupt switching time and task context switching time meet the system requirements. (authors)
Real-time Tsunami Inundation Prediction Using High Performance Computers

Science.gov (United States)

Oishi, Y.; Imamura, F.; Sugawara, D.

2014-12-01

Recently, offshore tsunami observation stations based on cabled ocean bottom pressure gauges have been actively deployed, especially in Japan. These cabled systems are designed to provide real-time tsunami data before tsunamis reach coastlines for disaster mitigation purposes. To receive real benefits of these observations, real-time analysis techniques to make effective use of these data are necessary. A representative study was made by Tsushima et al. (2009) that proposed a method to provide instant tsunami source prediction based on achieving tsunami waveform data. As time passes, the prediction is improved by using updated waveform data. After a tsunami source is predicted, tsunami waveforms are synthesized from pre-computed tsunami Green functions of linear long wave equations. Tsushima et al. (2014) updated the method by combining the tsunami waveform inversion with an instant inversion of coseismic crustal deformation and improved the prediction accuracy and speed in the early stages. For disaster mitigation purposes, real-time predictions of tsunami inundation are also important. In this study, we discuss the possibility of real-time tsunami inundation predictions, which require faster-than-real-time tsunami inundation simulation in addition to instant tsunami source analysis. Although solving the non-linear shallow water equations for inundation predictions is computationally expensive, it has become feasible through recent developments in high performance computing technologies. We conducted parallel computations of tsunami inundation and achieved 6.0 TFLOPS by using 19,000 CPU cores. We employed a leap-frog finite difference method with nested staggered grids whose resolutions range from 405 m to 5 m. The resolution ratio of each nested domain was 1/3. The total number of grid points was 13 million, and the time step was 0.1 seconds. Tsunami sources of the 2011 Tohoku-oki earthquake were tested. The inundation prediction up to 2 hours after the
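The numerical ingredients named in this abstract (nested grids refined by a ratio of 1/3, and a leap-frog finite difference scheme on a staggered grid) can be illustrated with a minimal sketch. It is a simplification under stated assumptions, not the authors' code: it is one-dimensional, it solves the linear long-wave equations rather than the non-linear shallow water equations used for inundation, it uses a constant depth and closed boundaries, and the names `nested_resolutions` and `step` are hypothetical. The demo parameters (50 cells, 10 m depth, Gaussian initial hump) are made up for illustration; only the 405 m to 5 m range, the 1/3 ratio, and the 0.1 s time step come from the abstract.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def nested_resolutions(coarsest: float, finest: float, ratio: int = 3) -> list[float]:
    """Grid spacings from coarsest to finest, each 1/ratio of the previous one."""
    levels = [coarsest]
    while levels[-1] > finest:
        levels.append(levels[-1] / ratio)
    return levels


def step(eta, flux, depth, dx, dt):
    """Advance one staggered leap-frog step of the 1D linear long-wave equations.

    eta[i] is the surface elevation at the centre of cell i. flux[i] is the
    discharge at the face between cells i-1 and i, so flux has len(eta) + 1
    entries; the two end faces stay at zero (closed boundaries).
    """
    n = len(eta)
    # Momentum equation on interior faces: dM/dt = -g * h * d(eta)/dx.
    for i in range(1, n):
        flux[i] -= G * depth * dt / dx * (eta[i] - eta[i - 1])
    # Continuity equation in each cell, using the freshly updated fluxes.
    for i in range(n):
        eta[i] -= dt / dx * (flux[i + 1] - flux[i])
    return eta, flux


# Nested grid spacings quoted in the abstract: 405 m down to 5 m, ratio 1/3.
print(nested_resolutions(405, 5))

# Hypothetical demo on the finest grid: 5 m spacing, 0.1 s time step.
n, dx, dt, depth = 50, 5.0, 0.1, 10.0
eta = [math.exp(-((i - n / 2) ** 2) / 20.0) for i in range(n)]
flux = [0.0] * (n + 1)
mass_before = sum(eta) * dx
for _ in range(100):
    step(eta, flux, depth, dx, dt)
mass_after = sum(eta) * dx
courant = math.sqrt(G * depth) * dt / dx  # should stay below 1 for stability
print(f"Courant number: {courant:.3f}")
```

With closed boundaries the flux differences telescope, so the scheme conserves the total water volume up to rounding error; comparing `mass_before` and `mass_after` is a quick sanity check on the update order.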
High performance systems

Energy Technology Data Exchange (ETDEWEB)

Vigil, M.B. [comp.

1995-03-01

This document provides a written compilation of the presentations and viewgraphs from the 1994 Conference on High Speed Computing given at the High Speed Computing Conference, "High Performance Systems," held at Gleneden Beach, Oregon, on April 18 through 21, 1994.
Can We Build a Truly High Performance Computer Which is Flexible and Transparent?

KAUST Repository

Rojas, Jhonathan Prieto; Sevilla, Galo T.; Hussain, Muhammad Mustafa

2013-01-01

cost advantage. In that context, low-cost mono-crystalline bulk silicon (100) based high performance transistors are considered as the heart of today's computers. One limitation is silicon's rigidity and brittleness. Here we show a generic batch process
Department of Energy Mathematical, Information, and Computational Sciences Division: High Performance Computing and Communications Program

Energy Technology Data Exchange (ETDEWEB)

NONE

1996-11-01

This document is intended to serve two purposes. Its first purpose is that of a program status report of the considerable progress that the Department of Energy (DOE) has made since 1993, the time of the last such report (DOE/ER-0536, The DOE Program in HPCC), toward achieving the goals of the High Performance Computing and Communications (HPCC) Program. The second purpose is that of a summary report of the many research programs administered by the Mathematical, Information, and Computational Sciences (MICS) Division of the Office of Energy Research under the auspices of the HPCC Program and to provide, wherever relevant, easy access to pertinent information about MICS-Division activities via universal resource locators (URLs) on the World Wide Web (WWW).
A performance model for the communication in fast multipole methods on high-performance computing platforms

KAUST Repository

Ibeid, Huda

2016-03-04

Exascale systems are predicted to have approximately 1 billion cores, assuming gigahertz cores. Limitations on affordable network topologies for distributed memory systems of such massive scale bring new challenges to the currently dominant parallel programming model. Currently, there are many efforts to evaluate the hardware and software bottlenecks of exascale designs. It is therefore of interest to model application performance and to understand what changes need to be made to ensure extrapolated scalability. The fast multipole method (FMM) was originally developed for accelerating N-body problems in astrophysics and molecular dynamics but has recently been extended to a wider range of problems. Its high arithmetic intensity combined with its linear complexity and asynchronous communication patterns make it a promising algorithm for exascale systems. In this paper, we discuss the challenges for FMM on current parallel computers and future exascale architectures, with a focus on internode communication. We focus on the communication part only; the efficiency of the computational kernels is beyond the scope of the present study. We develop a performance model that considers the communication patterns of the FMM and observe a good match between our model and the actual communication time on four high-performance computing (HPC) systems, when latency, bandwidth, network topology, and multicore penalties are all taken into account. To our knowledge, this is the first formal characterization of internode communication in FMM that validates the model against actual measurements of communication time. The ultimate communication model is predictive in an absolute sense; however, on complex systems, this objective is often out of reach or of a difficulty out of proportion to its benefit when there exists a simpler model that is inexpensive and sufficient to guide coding decisions leading to improved scaling. The current model provides such guidance.
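To make the idea of a "simpler model" of communication time concrete, the sketch below uses the widely known latency-bandwidth (alpha-beta) cost model. This is an assumption chosen for illustration and is not the model developed in the paper, which also accounts for network topology and multicore penalties. The function names and the sample latency, bandwidth, and message sizes are hypothetical.

```python
def message_time(n_bytes: float, latency_s: float, bandwidth_bytes_per_s: float) -> float:
    """Alpha-beta estimate for one message: fixed latency plus size over bandwidth."""
    if bandwidth_bytes_per_s <= 0:
        raise ValueError("bandwidth must be positive")
    return latency_s + n_bytes / bandwidth_bytes_per_s


def exchange_time(message_sizes, latency_s: float, bandwidth_bytes_per_s: float) -> float:
    """Total estimate for a sequence of messages sent one after another."""
    return sum(message_time(m, latency_s, bandwidth_bytes_per_s) for m in message_sizes)


# Hypothetical network: 1 microsecond latency, 10 GB/s bandwidth.
LATENCY = 1e-6
BANDWIDTH = 1e10

small = message_time(8, LATENCY, BANDWIDTH)      # dominated by latency
large = message_time(1e6, LATENCY, BANDWIDTH)    # dominated by bandwidth
total = exchange_time([8, 1e6], LATENCY, BANDWIDTH)
print(f"small: {small:.3e} s, large: {large:.3e} s, total: {total:.3e} s")
```

Even this two-parameter form shows why many small messages are costly: below a few kilobytes the latency term dominates, so aggregating messages tends to help more than raising bandwidth.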
Lightweight Provenance Service for High-Performance Computing

Energy Technology Data Exchange (ETDEWEB)

Dai, Dong; Chen, Yong; Carns, Philip; Jenkins, John; Ross, Robert

2017-09-09

Provenance describes detailed information about the history of a piece of data, containing the relationships among elements such as users, processes, jobs, and workflows that contribute to the existence of data. Provenance is key to supporting many data management functionalities that are increasingly important in operations such as identifying data sources, parameters, or assumptions behind a given result; auditing data usage; or understanding details about how inputs are transformed into outputs. Despite its importance, however, provenance support is largely underdeveloped in highly parallel architectures and systems. One major challenge is the demanding requirements of providing provenance service in situ. The need to remain lightweight and to be always on often conflicts with the need to be transparent and offer an accurate catalog of details regarding the applications and systems. To tackle this challenge, we introduce a lightweight provenance service, called LPS, for high-performance computing (HPC) systems. LPS leverages a kernel instrument mechanism to achieve transparency and introduces representative execution and flexible granularity to capture comprehensive provenance with controllable overhead. Extensive evaluations and use cases have confirmed its efficiency and usability. We believe that LPS can be integrated into current and future HPC systems to support a variety of data management needs.
Embedded Leverage

DEFF Research Database (Denmark)

Frazzini, Andrea; Heje Pedersen, Lasse

find that asset classes with embedded leverage offer low risk-adjusted returns and, in the cross-section, higher embedded leverage is associated with lower returns. A portfolio which is long low-embedded-leverage securities and short high-embedded-leverage securities earns large abnormal returns...
Diverse Power Iteration Embeddings and Its Applications

Energy Technology Data Exchange (ETDEWEB)

Huang H.; Yoo S.; Yu, D.; Qin, H.

2014-12-14

Spectral Embedding is one of the most effective dimension reduction algorithms in data mining. However, its computational complexity has to be mitigated in order to apply it to real-world large-scale data analysis. Much research has focused on developing approximate spectral embeddings, which are more efficient but far less effective. This paper proposes Diverse Power Iteration Embeddings (DPIE), which not only retains efficiency similar to that of power iteration methods but also produces a series of diverse and more effective embedding vectors. We test this novel method by applying it to various data mining applications (e.g. clustering, anomaly detection and feature selection) and evaluating their performance improvements. The experimental results show that our proposed DPIE is more effective than popular spectral approximation methods and achieves quality similar to that of classic spectral embedding derived from eigen-decompositions. Moreover, it is extremely fast on big data applications. For example, in terms of clustering results, DPIE achieves as much as 95% of the quality of classic spectral clustering on complex datasets while running 4000+ times faster in a limited-memory environment.
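For readers unfamiliar with the building block named in the abstract, the following is a minimal sketch of plain power iteration, which repeatedly multiplies a vector by a matrix and renormalizes it until it aligns with the dominant eigenvector. This is not DPIE itself; the function name, the all-ones starting vector, and the fixed iteration count are assumptions made for illustration.

```python
def power_iteration(matrix, num_iters=100):
    """Approximate the dominant eigenpair of a square matrix (list of lists).

    Assumes the dominant eigenvalue is unique in magnitude and that the
    all-ones start vector is not orthogonal to its eigenvector.
    """
    n = len(matrix)
    v = [1.0] * n
    for _ in range(num_iters):
        # Multiply by the matrix, then rescale to unit Euclidean length.
        w = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    # Rayleigh quotient v^T A v gives the eigenvalue estimate for unit v.
    mv = [sum(matrix[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(v[i] * mv[i] for i in range(n))
    return eigenvalue, v


if __name__ == "__main__":
    value, vector = power_iteration([[2.0, 0.0], [0.0, 1.0]])
    print(value, vector)
```

Each iteration costs one matrix-vector product, which is why power-iteration-based embeddings are attractive when a full eigen-decomposition is too expensive.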
Commercialization issues and funding opportunities for high-performance optoelectronic computing modules

Science.gov (United States)

Hessenbruch, John M.; Guilfoyle, Peter S.

1997-01-01

Low power, optoelectronic integrated circuits are being developed for high speed switching and data processing applications. These high performance optoelectronic computing modules consist of three primary components: vertical cavity surface emitting lasers, diffractive optical interconnect elements, and detector/amplifier/laser driver arrays. Following the design and fabrication of an HPOC module prototype, selected commercial funding sources will be evaluated to support a product development stage. These include the formation of a strategic alliance with one or more microprocessor or telecommunications vendors, and/or equity investment from one or more venture capital firms.
Quo vadis: Hydrologic inverse analyses using high-performance computing and a D-Wave quantum annealer

Science.gov (United States)

O'Malley, D.; Vesselinov, V. V.

2017-12-01

Classical microprocessors have had a dramatic impact on hydrology for decades, due largely to the exponential growth in computing power predicted by Moore's law. However, this growth is not expected to continue indefinitely and has already begun to slow. Quantum computing is an emerging alternative to classical microprocessors. Here, we demonstrated cutting edge inverse model analyses utilizing some of the best available resources in both worlds: high-performance classical computing and a D-Wave quantum annealer. The classical high-performance computing resources are utilized to build an advanced numerical model that assimilates data from O(10^5) observations, including water levels, drawdowns, and contaminant concentrations. The developed model accurately reproduces the hydrologic conditions at a Los Alamos National Laboratory contamination site, and can be leveraged to inform decision-making about site remediation. We demonstrate the use of a D-Wave 2X quantum annealer to solve hydrologic inverse problems. This work can be seen as an early step in quantum-computational hydrology. We compare and contrast our results with an early inverse approach in classical-computational hydrology that is comparable to the approach we use with quantum annealing. Our results show that quantum annealing can be useful for identifying regions of high and low permeability within an aquifer. While the problems we consider are small-scale compared to the problems that can be solved with modern classical computers, they are large compared to the problems that could be solved with early classical CPUs. Further, the binary nature of the high/low permeability problem makes it well-suited to quantum annealing, but challenging for classical computers.
Embedded Platforms for Computer Vision-based Advanced Driver Assistance Systems: a Survey

OpenAIRE

Velez, Gorka; Otaegui, Oihana

2015-01-01

Computer Vision, either alone or combined with other technologies such as radar or Lidar, is one of the key technologies used in Advanced Driver Assistance Systems (ADAS). Its role in understanding and analysing the driving scene is of great importance, as can be seen from the number of ADAS applications that use this technology. However, porting a vision algorithm to an embedded automotive system is still very challenging, as there must be a trade-off between several design requisites. Further...
9th International Workshop on Parallel Tools for High Performance Computing

CERN Document Server

Hilbrich, Tobias; Niethammer, Christoph; Gracia, José; Nagel, Wolfgang; Resch, Michael

2016-01-01

High Performance Computing (HPC) remains a driver that offers huge potentials and benefits for science and society. However, a profound understanding of the computational matters and specialized software is needed to arrive at effective and efficient simulations. Dedicated software tools are important parts of the HPC software landscape, and support application developers. Even though a tool is by definition not a part of an application, but rather a supplemental piece of software, it can make a fundamental difference during the development of an application. Such tools aid application developers in the context of debugging, performance analysis, and code optimization, and therefore make a major contribution to the development of robust and efficient parallel software. This book introduces a selection of the tools presented and discussed at the 9th International Parallel Tools Workshop held in Dresden, Germany, September 2-3, 2015, which offered an established forum for discussing the latest advances in paral...
High Performance Computing - Power Application Programming Interface Specification Version 1.4

Energy Technology Data Exchange (ETDEWEB)

Laros III, James H. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); DeBonis, David [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Grant, Ryan [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Kelly, Suzanne M. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Levenhagen, Michael J. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Olivier, Stephen Lecler [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Pedretti, Kevin [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

2016-10-01

Measuring and controlling the power and energy consumption of high performance computing systems by various components in the software stack is an active research area [13, 3, 5, 10, 4, 21, 19, 16, 7, 17, 20, 18, 11, 1, 6, 14, 12]. Implementations in lower level software layers are beginning to emerge in some production systems, which is very welcome. To be most effective, a portable interface to measurement and control features would significantly facilitate participation by all levels of the software stack. We present a proposal for a standard power Application Programming Interface (API) that endeavors to cover the entire software space, from generic hardware interfaces to the input from the computer facility manager.
Enabling the ATLAS Experiment at the LHC for High Performance Computing

CERN Document Server

AUTHOR|(CDS)2091107; Ereditato, Antonio

In this thesis, I studied the feasibility of running computer data analysis programs from the Worldwide LHC Computing Grid, in particular large-scale simulations of the ATLAS experiment at the CERN LHC, on current general purpose High Performance Computing (HPC) systems. An approach for integrating HPC systems into the Grid is proposed, which has been implemented and tested on the „Todi” HPC machine at the Swiss National Supercomputing Centre (CSCS). Over the course of the test, more than 500000 CPU-hours of processing time have been provided to ATLAS, which is roughly equivalent to the combined computing power of the two ATLAS clusters at the University of Bern. This showed that current HPC systems can be used to efficiently run large-scale simulations of the ATLAS detector and of the detected physics processes. As a first conclusion of my work, one can argue that, in perspective, running large-scale tasks on a few large machines might be more cost-effective than running on relatively small dedicated com...
An embedded formula of the Chebyshev collocation method for stiff problems

Science.gov (United States)

Piao, Xiangfan; Bu, Sunyoung; Kim, Dojin; Kim, Philsu

2017-12-01

In this study, we have developed an embedded formula of the Chebyshev collocation method for stiff problems, based on the zeros of the generalized Chebyshev polynomials. A new strategy for the embedded formula, using a pair of methods to estimate the local truncation error, as performed in traditional embedded Runge-Kutta schemes, is proposed. The method is performed in such a way that not only the stability region of the embedded formula can be widened, but by allowing the usage of larger time step sizes, the total computational costs can also be reduced. In terms of concrete convergence and stability analysis, the constructed algorithm turns out to have an 8th order convergence and it exhibits A-stability. Through several numerical experimental results, we have demonstrated that the proposed method is numerically more efficient, compared to several existing implicit methods.
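The abstract refers to the way traditional embedded Runge-Kutta schemes use a pair of methods to estimate the local truncation error. A minimal sketch of that idea follows, using the simple Heun-Euler pair (orders 2 and 1) rather than the Chebyshev collocation formula of the paper. The step-size controller, its safety factor of 0.9, and its growth limits are assumed, conventional choices, and the sketch handles scalar problems only.

```python
def heun_euler_step(f, t, y, h):
    """One step of the Heun-Euler embedded pair.

    Returns the second-order solution and an error estimate formed as the
    difference between the second-order and first-order results.
    """
    k1 = f(t, y)
    k2 = f(t + h, y + h * k1)
    y_low = y + h * k1                # first-order (Euler) result
    y_high = y + 0.5 * h * (k1 + k2)  # second-order (Heun) result
    return y_high, abs(y_high - y_low)


def integrate(f, t0, y0, t_end, h=0.1, tol=1e-6):
    """Adaptive integration of a scalar ODE y' = f(t, y) from t0 to t_end."""
    t, y = t0, y0
    while t < t_end:
        h = min(h, t_end - t)
        y_new, err = heun_euler_step(f, t, y, h)
        if err <= tol:
            t, y = t + h, y_new  # accept the step
        # The estimate scales like h^2, hence the exponent 1/2.
        factor = 0.9 * (tol / err) ** 0.5 if err > 0 else 2.0
        h *= min(2.0, max(0.2, factor))
    return y


if __name__ == "__main__":
    # y' = -y with y(0) = 1; the exact solution at t = 1 is exp(-1).
    print(integrate(lambda t, y: -y, 0.0, 1.0, 1.0))
```

Because the two results share the same function evaluations, the error estimate comes at almost no extra cost, which is the property an embedded formula is designed to provide.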
EOS: A project to investigate the design and construction of real-time distributed Embedded Operating Systems

Science.gov (United States)

Campbell, R. H.; Essick, Ray B.; Johnston, Gary; Kenny, Kevin; Russo, Vince

1987-01-01

Project EOS is studying the problems of building adaptable real-time embedded operating systems for the scientific missions of NASA. Choices (A Class Hierarchical Open Interface for Custom Embedded Systems) is an operating system designed and built by Project EOS to address the following specific issues: the software architecture for adaptable embedded parallel operating systems, the achievement of high-performance and real-time operation, the simplification of interprocess communications, the isolation of operating system mechanisms from one another, and the separation of mechanisms from policy decisions. Choices is written in C++ and runs on a ten processor Encore Multimax. The system is intended for use in constructing specialized computer applications and research on advanced operating system features including fault tolerance and parallelism.
Computing the Dilation of Edge-Augmented Graphs Embedded in Metric Spaces

DEFF Research Database (Denmark)

Wulff-Nilsen, Christian

2008-01-01

Let G = (V,E) be an undirected graph with n vertices embedded in a metric space. We consider the problem of adding a shortcut edge in G that minimizes the dilation of the resulting graph. The fastest algorithm to date for this problem has O(n^4) running time and uses O(n^2) space. We show how to improve the running time to O(n^3*log n) while maintaining the quadratic space requirement. In fact, our algorithm not only determines the best shortcut but computes the dilation of G U {(u,v)} for every pair of distinct vertices u and v.
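To make the problem statement concrete, here is a brute-force sketch for points in the Euclidean plane. It computes the dilation as the largest ratio of shortest-path distance to straight-line distance, then tries every candidate shortcut. It runs in O(n^4) time with O(n^2) space and is not the O(n^3*log n) algorithm of the paper. The function names are hypothetical, and the sketch assumes distinct points and a connected graph.

```python
import math
from itertools import combinations


def all_pairs_shortest_paths(points, edges):
    """Floyd-Warshall on a graph whose edge weights are Euclidean lengths."""
    n = len(points)
    dist = [[math.inf] * n for _ in range(n)]
    for i in range(n):
        dist[i][i] = 0.0
    for u, v in edges:
        w = math.dist(points[u], points[v])
        dist[u][v] = dist[v][u] = min(dist[u][v], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if dist[i][k] + dist[k][j] < dist[i][j]:
                    dist[i][j] = dist[i][k] + dist[k][j]
    return dist


def dilation(points, dist):
    """Largest ratio of graph distance to Euclidean distance over all pairs."""
    return max(dist[a][b] / math.dist(points[a], points[b])
               for a, b in combinations(range(len(points)), 2))


def best_shortcut(points, edges):
    """Try every vertex pair as a shortcut; return (dilation, pair)."""
    n = len(points)
    dist = all_pairs_shortest_paths(points, edges)
    best = (dilation(points, dist), None)
    for u, v in combinations(range(n), 2):
        w = math.dist(points[u], points[v])
        # A shortest path in G U {(u,v)} uses the new edge at most once.
        new = [[min(dist[a][b],
                    dist[a][u] + w + dist[v][b],
                    dist[a][v] + w + dist[u][b])
                for b in range(n)] for a in range(n)]
        d = dilation(points, new)
        if d < best[0]:
            best = (d, (u, v))
    return best


if __name__ == "__main__":
    square = [(0, 0), (1, 0), (1, 1), (0, 1)]
    path = [(0, 1), (1, 2), (2, 3)]
    print(best_shortcut(square, path))
```

On the three-edge path around a unit square, the endpoints are at graph distance 3 but Euclidean distance 1, so the dilation is 3; closing the square with the edge between them reduces it to the square root of 2.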
Uranus: a rapid prototyping tool for FPGA embedded computer vision

Science.gov (United States)

Rosales-Hernández, Victor; Castillo-Jimenez, Liz; Viveros-Velez, Gilberto; Zuñiga-Grajeda, Virgilio; Treviño Torres, Abel; Arias-Estrada, M.

2007-01-01

The starting point for all successful system development is simulation. Performing high-level simulation of a system can help to identify, isolate and fix design problems. This work presents Uranus, a software tool for simulation and evaluation of image processing algorithms, with support for migrating them to an FPGA environment for algorithm acceleration and embedded processing purposes. The tool includes an integrated library of previously coded software operators and provides the necessary support to read and display image sequences as well as video files. The user can use the previously compiled soft-operators in a high-level process chain and code their own operators. In addition to the prototyping tool, Uranus offers an FPGA-based hardware architecture with the same organization as the software prototyping part. The hardware architecture contains a library of FPGA IP cores for image processing that are connected with a PowerPC-based system. The Uranus environment is intended for rapid prototyping of machine vision and migration to an FPGA accelerator platform, and it is distributed for academic purposes.

Embedded-Based Graphics Processing Unit Cluster Platform for Multiple Sequence Alignments

Directory of Open Access Journals (Sweden)

Jyh-Da Wei

2017-08-01

High-end graphics processing units (GPUs), such as NVIDIA Tesla/Fermi/Kepler series cards with thousands of cores per chip, have been widely applied in high-performance computing fields over the past decade. These desktop GPU cards must be installed in personal computers/servers with desktop CPUs, and the cost and power consumption of constructing a GPU cluster platform are very high. In recent years, NVIDIA released an embedded board, called Jetson Tegra K1 (TK1), which contains 4 ARM Cortex-A15 CPUs and 192 Compute Unified Device Architecture cores (belonging to Kepler GPUs). Jetson Tegra K1 has several advantages, such as low cost, low power consumption, and high applicability, and it has been used in several specific applications. In our previous work, a bioinformatics platform with a single TK1 (STK platform) was constructed, and that work also showed that Web and mobile services can be implemented on the STK platform with a good cost-performance ratio by comparing an STK platform with a desktop CPU and GPU. In this work, an embedded-based GPU cluster platform is constructed with multiple TK1s (MTK platform). Complex system installation and setup are necessary procedures at first. Then, 2 job assignment modes are designed for the MTK platform to provide services for users. Finally, ClustalW v2.0.11 and ClustalWtk are ported to the MTK platform. The experimental results showed that the speedup ratios reached 5.5 and 4.8 for ClustalW v2.0.11 and ClustalWtk, respectively, when comparing 6 TK1s with a single TK1. The MTK platform is proven to be useful for multiple sequence alignments.
Embedded-Based Graphics Processing Unit Cluster Platform for Multiple Sequence Alignments.

Science.gov (United States)

Wei, Jyh-Da; Cheng, Hui-Jun; Lin, Chun-Yuan; Ye, Jin; Yeh, Kuan-Yu

2017-01-01

High-end graphics processing units (GPUs), such as NVIDIA Tesla/Fermi/Kepler series cards with thousands of cores per chip, have been widely applied in high-performance computing fields over the past decade. These desktop GPU cards must be installed in personal computers/servers with desktop CPUs, and the cost and power consumption of constructing a GPU cluster platform are very high. In recent years, NVIDIA released an embedded board, called Jetson Tegra K1 (TK1), which contains 4 ARM Cortex-A15 CPUs and 192 Compute Unified Device Architecture cores (belonging to Kepler GPUs). Jetson Tegra K1 has several advantages, such as low cost, low power consumption, and high applicability, and it has been used in several specific applications. In our previous work, a bioinformatics platform with a single TK1 (STK platform) was constructed, and that work also showed that Web and mobile services can be implemented on the STK platform with a good cost-performance ratio by comparing an STK platform with a desktop CPU and GPU. In this work, an embedded-based GPU cluster platform is constructed with multiple TK1s (MTK platform). Complex system installation and setup are necessary procedures at first. Then, 2 job assignment modes are designed for the MTK platform to provide services for users. Finally, ClustalW v2.0.11 and ClustalWtk are ported to the MTK platform. The experimental results showed that the speedup ratios reached 5.5 and 4.8 for ClustalW v2.0.11 and ClustalWtk, respectively, when comparing 6 TK1s with a single TK1. The MTK platform is proven to be useful for multiple sequence alignments.
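The reported speedups can be put in context with the standard definition of parallel efficiency, speedup divided by the number of nodes. The small sketch below applies it to the two figures in the abstract; the function name is an assumption, and the abstract itself does not report efficiency.

```python
def parallel_efficiency(speedup, n_nodes):
    """Fraction of ideal linear speedup achieved on n_nodes."""
    return speedup / n_nodes


if __name__ == "__main__":
    # Speedups reported in the abstract for 6 TK1 boards versus 1.
    for name, speedup in [("ClustalW v2.0.11", 5.5), ("ClustalWtk", 4.8)]:
        print(name, round(parallel_efficiency(speedup, 6), 3))
```

Under this definition, the 6-board cluster reaches roughly 92% of ideal scaling for ClustalW v2.0.11 and 80% for ClustalWtk.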
High performance parallel computing of flows in complex geometries: I. Methods

International Nuclear Information System (INIS)

Gourdain, N; Gicquel, L; Montagnac, M; Vermorel, O; Staffelbach, G; Garcia, M; Boussuge, J-F; Gazaix, M; Poinsot, T

2009-01-01

Efficient numerical tools coupled with high-performance computers have become a key element of the design process in the fields of energy supply and transportation. However, flow phenomena that occur in complex systems such as gas turbines and aircraft are still not understood, mainly because of the models that are needed. In fact, most computational fluid dynamics (CFD) predictions as found today in industry focus on a reduced or simplified version of the real system (such as a periodic sector) and are usually solved with a steady-state assumption. This paper shows how to overcome such barriers and how such a new challenge can be addressed by developing flow solvers running on high-end computing platforms, using thousands of computing cores. Parallel strategies used by modern flow solvers are discussed with particular emphasis on mesh-partitioning, load balancing and communication. Two examples are used to illustrate these concepts: a multi-block structured code and an unstructured code. Parallel computing strategies used with both flow solvers are detailed and compared. This comparison indicates that mesh-partitioning and load balancing are more straightforward with unstructured grids than with multi-block structured meshes. However, the mesh-partitioning stage can be challenging for unstructured grids, mainly due to memory limitations of the newly developed massively parallel architectures. Finally, detailed investigations show that the impact of mesh-partitioning on the numerical CFD solutions, due to rounding errors and block splitting, may be of importance and should be accurately addressed before qualifying massively parallel CFD tools for routine industrial use.
Broadband EM Performance Characteristics of Single Square Loop FSS Embedded Monolithic Radome

Directory of Open Access Journals (Sweden)

Raveendranath U. Nair

2013-01-01

A monolithic half-wave radome panel, centrally loaded with an aperture-type single square loop frequency selective surface (SSL-FSS), is proposed here for broadband airborne radome applications. The equivalent transmission line method in conjunction with an equivalent circuit model (ECM) is used for modeling the SSL-FSS embedded monolithic half-wave radome panel and evaluating radome performance parameters. The design parameters of the SSL-FSS are optimized at different angles of incidence such that the new radome wall configuration offers superior EM performance from L-band to X-band as compared to the conventional monolithic half-wave slab of identical material and thickness. The superior EM performance of the SSL-FSS embedded monolithic radome wall makes it suitable for the design of normal incidence and streamlined airborne radomes.
High performance statistical computing with parallel R: applications to biology and climate modelling

International Nuclear Information System (INIS)

Samatova, Nagiza F; Branstetter, Marcia; Ganguly, Auroop R; Hettich, Robert; Khan, Shiraj; Kora, Guruprasad; Li, Jiangtian; Ma, Xiaosong; Pan, Chongle; Shoshani, Arie; Yoginath, Srikanth

2006-01-01

Ultrascale computing and high-throughput experimental technologies have enabled the production of scientific data about complex natural phenomena. With this opportunity comes a new problem - the massive quantities of data so produced. Answers to fundamental questions about the nature of those phenomena remain largely hidden in the produced data. The goal of this work is to provide a scalable high performance statistical data analysis framework to help scientists perform interactive analyses of these raw data to extract knowledge. Towards this goal we have been developing an open source parallel statistical analysis package, called Parallel R, that lets scientists employ a wide range of statistical analysis routines on high performance shared and distributed memory architectures without having to deal with the intricacies of parallelizing these routines.
Integrated Design Tools for Embedded Control Systems

NARCIS (Netherlands)

Jovanovic, D.S.; Hilderink, G.H.; Broenink, Johannes F.; Karelse, F.

2001-01-01

Currently, computer-based control systems are still being implemented using the same techniques as 10 years ago. The purpose of this project is the development of a design framework, consisting of tools and libraries, which allows the designer to build high reliable heterogeneous real-time embedded
Cloud object store for checkpoints of high performance computing applications using decoupling middleware

Science.gov (United States)

Bent, John M.; Faibish, Sorin; Grider, Gary

2016-04-19

Cloud object storage is enabled for checkpoints of high performance computing applications using a middleware process. A plurality of files, such as checkpoint files, generated by a plurality of processes in a parallel computing system are stored by obtaining said plurality of files from said parallel computing system; converting said plurality of files to objects using a log structured file system middleware process; and providing said objects for storage in a cloud object storage system. The plurality of processes may run, for example, on a plurality of compute nodes. The log structured file system middleware process may be embodied, for example, as a Parallel Log-Structured File System (PLFS). The log structured file system middleware process optionally executes on a burst buffer node.
A Framework for Debugging Geoscience Projects in a High Performance Computing Environment

Science.gov (United States)

Baxter, C.; Matott, L.

2012-12-01

High performance computing (HPC) infrastructure has become ubiquitous in today's world with the emergence of commercial cloud computing and academic supercomputing centers. Teams of geoscientists, hydrologists and engineers can take advantage of this infrastructure to undertake large research projects - for example, linking one or more site-specific environmental models with soft computing algorithms, such as heuristic global search procedures, to perform parameter estimation and predictive uncertainty analysis, and/or design least-cost remediation systems. However, the size, complexity and distributed nature of these projects can make identifying failures in the associated numerical experiments using conventional ad-hoc approaches both time-consuming and ineffective. To address these problems a multi-tiered debugging framework has been developed. The framework allows for quickly isolating and remedying a number of potential experimental failures, including: failures in the HPC scheduler; bugs in the soft computing code; bugs in the modeling code; and permissions and access control errors. The utility of the framework is demonstrated via application to a series of over 200,000 numerical experiments involving a suite of 5 heuristic global search algorithms and 15 mathematical test functions serving as cheap analogues for the simulation-based optimization of pump-and-treat subsurface remediation systems.
SPECT detector system design based on embedded system

International Nuclear Information System (INIS)

Zhang Weizheng; Zhao Shujun; Zhang Lei; Sun Yuanling

2007-01-01

A single-photon emission computed tomography detector system based on embedded Linux is designed. This system is composed of a detector module, data acquisition module, ARM MPU module, network interface communication module and human-machine interface module. Its software uses multithreading technology based on embedded Linux. It can achieve high-speed data acquisition, real-time data correction and network data communication. It can accelerate data acquisition and decrease the dead time. The accuracy and the stability of the system can be improved. (authors)
Modeling and optimization of parallel and distributed embedded systems

CERN Document Server

Munir, Arslan; Ranka, Sanjay

2016-01-01

This book introduces the state-of-the-art in research in parallel and distributed embedded systems, which have been enabled by developments in silicon technology, micro-electro-mechanical systems (MEMS), wireless communications, computer networking, and digital electronics. These systems have diverse applications in domains including military and defense, medical, automotive, and unmanned autonomous vehicles. The emphasis of the book is on the modeling and optimization of emerging parallel and distributed embedded systems in relation to the three key design metrics of performance, power and dependability.
High performance computing in power and energy systems

Energy Technology Data Exchange (ETDEWEB)

Khaitan, Siddhartha Kumar [Iowa State Univ., Ames, IA (United States); Gupta, Anshul (eds.) [IBM Watson Research Center, Yorktown Heights, NY (United States)

2013-07-01

The twin challenge of meeting global energy demands in the face of growing economies and populations and restricting greenhouse gas emissions is one of the most daunting ones that humanity has ever faced. Smart electrical generation and distribution infrastructure will play a crucial role in meeting these challenges. We would need to develop capabilities to handle large volumes of data generated by power system components like PMUs, DFRs and other data acquisition devices, as well as the capacity to process these data at high resolution via multi-scale and multi-period simulations, cascading and security analysis, interaction between hybrid systems (electric, transport, gas, oil, coal, etc.) and so on, to get meaningful information in real time to ensure a secure, reliable and stable power system grid. Advanced research on development and implementation of market-ready leading-edge high-speed enabling technologies and algorithms for solving real-time, dynamic, resource-critical problems will be required for dynamic security analysis targeted towards successful implementation of Smart Grid initiatives. This book aims to bring together some of the latest research developments as well as thoughts on the future research directions of high performance computing applications in electric power systems planning, operations, security, markets, and grid integration of alternate sources of energy, etc.
High performance computer code for molecular dynamics simulations

International Nuclear Information System (INIS)

Levay, I.; Toekesi, K.

2007-01-01

Molecular dynamics (MD) simulation is a widely used technique for modeling complicated physical phenomena. Since 2005 we have been developing an MD simulation code for PCs. The code is written in the object-oriented programming language C++. The aim of our work is twofold: a) to develop a fast computer code for studying the random walk of guest atoms in a Be crystal, and b) to provide three-dimensional (3D) visualization of the particles' motion. In this case we mimic the motion of the guest atoms in the crystal (diffusion-type motion) and the motion of atoms in the crystal lattice (crystal deformation). Nowadays, it is common to use graphics devices for computationally intensive problems. There are several ways to exploit their extreme processing performance, and programming these devices has never been as easy as it is now. The CUDA (Compute Unified Device Architecture) platform, introduced by nVidia Corporation in 2007, is very useful for every processor-hungry application. A unified-architecture GPU includes 96-128 or more stream processors, so the raw calculation performance is 576(!) GFLOPS. It is ten times faster than the fastest dual-core CPU [Fig.1]. Our improved MD simulation software uses this new technology, which makes the code run 10 times faster in the critical calculation segment. Although the GPU is a very powerful tool, it has a strongly parallel structure. This means that we have to create an algorithm that works on several processors without deadlock. Our code currently uses 256 threads and shared and constant on-chip memory instead of global memory, which is 100 times slower than the others. It is possible to implement the entire algorithm on the GPU, so we do not need to download and upload the data in every iteration. For maximal throughput, every thread runs the same instructions.
InfoMall: An Innovative Strategy for High-Performance Computing and Communications Applications Development.

Science.gov (United States)

Mills, Kim; Fox, Geoffrey

1994-01-01

Describes the InfoMall, a program led by the Northeast Parallel Architectures Center (NPAC) at Syracuse University (New York). The InfoMall features a partnership of approximately 24 organizations offering linked programs in High Performance Computing and Communications (HPCC) technology integration, software development, marketing, education and…
A C++11 implementation of arbitrary-rank tensors for high-performance computing

Science.gov (United States)

Aragón, Alejandro M.

2014-11-01

This article discusses an efficient implementation of tensors of arbitrary rank by using some of the idioms introduced by the recently published C++ ISO Standard (C++11). With the aim of providing a basic building block for high-performance computing, a single Array class template is carefully crafted, from which vectors, matrices, and even higher-order tensors can be created. An expression template facility is also built around the array class template to provide convenient mathematical syntax. As a result, by using templates, an extra high-level layer is added to the C++ language when dealing with algebraic objects and their operations, without compromising performance. The implementation is tested running on both CPU and GPU.
Alumina-coated and manganese monoxide embedded 3D carbon derived from avocado as high-performance anode for lithium-ion batteries

Science.gov (United States)

Rehman, Wasif ur; Xu, Youlong; Du, Xianfeng; Sun, Xiaofei; Ullah, Inam; Zhang, Yuan; Jin, Yanling; Zhang, Baofeng; Li, Xifei

2018-07-01

Derived from avocado fruit, a three-dimensional (3D) carbon is prepared via a hydrothermal/pyrolysis process, followed by embedding with MnO nanoparticles by a wet chemical method and coating with Al2O3 through an atomic layer deposition technique. The obtained material presents a hierarchical structure in which MnO nanocrystals are wrapped in 3D carbon and then encapsulated in a uniform Al2O3 layer with a thickness of about 5 nm. In this hierarchical structure, the 3D carbon offers numerous electronic pathways that enhance the conductivity, and the Al2O3 nanolayer provides a shelter that suppresses the dissolution of Mn4+ and volume changes during the charge/discharge process. As a result, this material (marked as C/MnO@Al2O3) exhibits high rate performance and excellent cyclability as an anode for lithium-ion batteries. A high specific capacity of about 600 mA h g^-1 is achieved at a current density of 1000 mA g^-1, and the electrode can still deliver a high specific capacity of about 1165 mA h g^-1 at 150 mA g^-1 after 100 cycles. These results point to a green, high-potential anode material for advanced lithium-ion batteries.
Embedded Sensors and Controls to Improve Component Performance and Reliability -- Bench-scale Testbed Design Report

Energy Technology Data Exchange (ETDEWEB)

Melin, Alexander M. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Kisner, Roger A. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Drira, Anis [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States); Reed, Frederick K. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

2015-09-01

Embedded instrumentation and control systems that can operate in extreme environments are challenging due to restrictions on sensors and materials. As a part of the Department of Energy's Nuclear Energy Enabling Technology cross-cutting technology development programs Advanced Sensors and Instrumentation topic, this report details the design of a bench-scale embedded instrumentation and control testbed. The design goal of the bench-scale testbed is to build a re-configurable system that can rapidly deploy and test advanced control algorithms in a hardware in the loop setup. The bench-scale testbed will be designed as a fluid pump analog that uses active magnetic bearings to support the shaft. The testbed represents an application that would improve the efficiency and performance of high temperature (700 C) pumps for liquid salt reactors that operate in an extreme environment and provide many engineering challenges that can be overcome with embedded instrumentation and control. This report will give details of the mechanical design, electromagnetic design, geometry optimization, power electronics design, and initial control system design.
Human and Robotic Space Mission Use Cases for High-Performance Spaceflight Computing

Science.gov (United States)

Some, Raphael; Doyle, Richard; Bergman, Larry; Whitaker, William; Powell, Wesley; Johnson, Michael; Goforth, Montgomery; Lowry, Michael

2013-01-01

Spaceflight computing is a key resource in NASA space missions and a core determining factor of spacecraft capability, with ripple effects throughout the spacecraft, end-to-end system, and mission. Onboard computing can be aptly viewed as a "technology multiplier" in that advances provide direct dramatic improvements in flight functions and capabilities across the NASA mission classes, and enable new flight capabilities and mission scenarios, increasing science and exploration return. Space-qualified computing technology, however, has not advanced significantly in well over ten years and the current state of the practice fails to meet the near- to mid-term needs of NASA missions. Recognizing this gap, the NASA Game Changing Development Program (GCDP), under the auspices of the NASA Space Technology Mission Directorate, commissioned a study on space-based computing needs, looking out 15-20 years. The study resulted in a recommendation to pursue high-performance spaceflight computing (HPSC) for next-generation missions, and a decision to partner with the Air Force Research Lab (AFRL) in this development.
STEMsalabim: A high-performance computing cluster friendly code for scanning transmission electron microscopy image simulations of thin specimens

International Nuclear Information System (INIS)

Oelerich, Jan Oliver; Duschek, Lennart; Belz, Jürgen; Beyer, Andreas; Baranovskii, Sergei D.; Volz, Kerstin

2017-01-01

Highlights: • We present STEMsalabim, a modern implementation of the multislice algorithm for simulation of STEM images. • Our package is highly parallelizable on high-performance computing clusters, combining shared and distributed memory architectures. • With STEMsalabim, computationally and memory expensive STEM image simulations can be carried out within reasonable time. - Abstract: We present a new multislice code for the computer simulation of scanning transmission electron microscope (STEM) images based on the frozen lattice approximation. Unlike existing software packages, the code is optimized to perform well on highly parallelized computing clusters, combining distributed and shared memory architectures. This enables efficient calculation of large lateral scanning areas of the specimen within the frozen lattice approximation and fine-grained sweeps of parameter space.
STEMsalabim: A high-performance computing cluster friendly code for scanning transmission electron microscopy image simulations of thin specimens

Energy Technology Data Exchange (ETDEWEB)

Oelerich, Jan Oliver, E-mail: jan.oliver.oelerich@physik.uni-marburg.de; Duschek, Lennart; Belz, Jürgen; Beyer, Andreas; Baranovskii, Sergei D.; Volz, Kerstin

2017-06-15

Highlights: • We present STEMsalabim, a modern implementation of the multislice algorithm for simulation of STEM images. • Our package is highly parallelizable on high-performance computing clusters, combining shared and distributed memory architectures. • With STEMsalabim, computationally and memory expensive STEM image simulations can be carried out within reasonable time. - Abstract: We present a new multislice code for the computer simulation of scanning transmission electron microscope (STEM) images based on the frozen lattice approximation. Unlike existing software packages, the code is optimized to perform well on highly parallelized computing clusters, combining distributed and shared memory architectures. This enables efficient calculation of large lateral scanning areas of the specimen within the frozen lattice approximation and fine-grained sweeps of parameter space.
US QCD computational performance studies with PERI

International Nuclear Information System (INIS)

Zhang, Y; Fowler, R; Huck, K; Malony, A; Porterfield, A; Reed, D; Shende, S; Taylor, V; Wu, X

2007-01-01

We report on some of the interactions between two SciDAC projects: The National Computational Infrastructure for Lattice Gauge Theory (USQCD), and the Performance Engineering Research Institute (PERI). Many modern scientific programs consistently report the need for faster computational resources to maintain global competitiveness. However, as the size and complexity of emerging high end computing (HEC) systems continue to rise, achieving good performance on such systems is becoming ever more challenging. In order to take full advantage of the resources, it is crucial to understand the characteristics of relevant scientific applications and the systems these applications are running on. Using tools developed under PERI and by other performance measurement researchers, we studied the performance of two applications, MILC and Chroma, on several high performance computing systems at DOE laboratories. In the case of Chroma, we discuss how the use of C++ and modern software engineering and programming methods is driving the evolution of performance tools.

Towards Portable Large-Scale Image Processing with High-Performance Computing.

Science.gov (United States)

Huo, Yuankai; Blaber, Justin; Damon, Stephen M; Boyd, Brian D; Bao, Shunxing; Parvathaneni, Prasanna; Noguera, Camilo Bermudez; Chaganti, Shikha; Nath, Vishwesh; Greer, Jasmine M; Lyu, Ilwoo; French, William R; Newton, Allen T; Rogers, Baxter P; Landman, Bennett A

2018-05-03

High-throughput, large-scale medical image computing demands tight integration of high-performance computing (HPC) infrastructure for data storage, job distribution, and image processing. The Vanderbilt University Institute for Imaging Science (VUIIS) Center for Computational Imaging (CCI) has constructed a large-scale image storage and processing infrastructure that is composed of (1) a large-scale image database using the eXtensible Neuroimaging Archive Toolkit (XNAT), (2) a content-aware job scheduling platform using the Distributed Automation for XNAT pipeline automation tool (DAX), and (3) a wide variety of encapsulated image processing pipelines called "spiders." The VUIIS CCI medical image data storage and processing infrastructure has housed and processed nearly half a million medical image volumes with the Vanderbilt Advanced Computing Center for Research and Education (ACCRE), which is the HPC facility at Vanderbilt University. The initial deployment was native (i.e., direct installations on a bare-metal server) within the ACCRE hardware and software environments, which led to issues of portability and sustainability. First, it could be laborious to deploy the entire VUIIS CCI medical image data storage and processing infrastructure to another HPC center with varying hardware infrastructure, library availability, and software permission policies. Second, the spiders were not developed in an isolated manner, which has led to software dependency issues during system upgrades or remote software installation. To address such issues, herein, we describe recent innovations using containerization techniques with XNAT/DAX which are used to isolate the VUIIS CCI medical image data storage and processing infrastructure from the underlying hardware and software environments. The newly presented XNAT/DAX solution has the following new features: (1) multi-level portability from system level to the application level, (2) flexible and dynamic software
An embedded domain specific language for general purpose vectorization

CERN Document Server

Karpinski, Przemyslaw

2017-01-01

Portable SIMD code generation is an open problem in modern High Performance Computing systems. Performance portability can already be achieved, however it might fail when user-framework interaction is required. Of all portable vectorization techniques, explicit vectorization, using wrapper-class libraries, is proven to achieve the fastest performance, however it does not exploit optimization opportunities outside the simplest algebraic primitives. A more advanced language is therefore required, but the design of a new independent language is not feasible due to its high costs. This work describes an Embedded Domain Specific Language for solving generalized 1-D vectorization problems. The language is implemented using C++ as a host language and published as a lightweight library. By decoupling expression creation from evaluation a wider range of problems can be solved, without sacrificing runtime efficiency. In this paper we discuss design patterns necessary, but not limited, to efficient EDSL implementatio...
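The idea of decoupling expression creation from evaluation can be illustrated in a few lines. The sketch below is not the library described in the abstract (which is a C++ EDSL); it is a plain Python analogy in which operators build an expression tree and the arithmetic runs only when `evaluate()` is called. The class names `Expr`, `Vec`, and `BinOp` are hypothetical.

```python
import operator


class Expr:
    """Base class: operators build a tree instead of computing a result."""

    def __add__(self, other):
        return BinOp(self, other, operator.add)

    def __mul__(self, other):
        return BinOp(self, other, operator.mul)

    def evaluate(self):
        # Single pass over the indices; no intermediate vectors are allocated.
        return [self.at(i) for i in range(len(self))]


class Vec(Expr):
    """Leaf node wrapping concrete data."""

    def __init__(self, data):
        self.data = list(data)

    def __len__(self):
        return len(self.data)

    def at(self, i):
        return self.data[i]


class BinOp(Expr):
    """Interior node: remembers the operation and defers the arithmetic."""

    def __init__(self, lhs, rhs, op):
        if len(lhs) != len(rhs):
            raise ValueError("operand lengths differ")
        self.lhs, self.rhs, self.op = lhs, rhs, op

    def __len__(self):
        return len(self.lhs)

    def at(self, i):
        return self.op(self.lhs.at(i), self.rhs.at(i))


a = Vec([1, 2, 3])
b = Vec([4, 5, 6])
expr = a + b * a          # builds a tree only; nothing is computed yet
result = expr.evaluate()  # the element-wise arithmetic happens here
```

Because the whole expression is known before evaluation starts, an evaluator of this kind is free to choose how to traverse it, which is the property a vectorizing EDSL would exploit.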
Embedded multi-channel data acquisition system on FPGA for Aditya Tokamak

Energy Technology Data Exchange (ETDEWEB)

Rajpal, Rachana, E-mail: rachana@ipr.res.in [Institute for Plasma Research, Gandhinagar, Gujarat (India); Mandaliya, Hitesh, E-mail: hitesh@ipr.res.in [ITER, Cadarache (France); Patel, Jignesh, E-mail: jjp@ipr.res.in [ITER, Cadarache (France); Kumari, Praveena, E-mail: praveena@ipr.res.in [Institute for Plasma Research, Gandhinagar, Gujarat (India); Gautam, Pramila, E-mail: pramila@ipr.res.in [Institute for Plasma Research, Gandhinagar, Gujarat (India); Raulji, Vismaysinh, E-mail: vismay@ipr.res.in [Institute for Plasma Research, Gandhinagar, Gujarat (India); Edappala, Praveenlal, E-mail: praveen@ipr.res.in [Institute for Plasma Research, Gandhinagar, Gujarat (India); Pujara, H.D, E-mail: pujara@ipr.res [Institute for Plasma Research, Gandhinagar, Gujarat (India); Jha, R., E-mail: jha@ipr.res.in [Institute for Plasma Research, Gandhinagar, Gujarat (India)

2016-11-15

Highlights: • 64 channel data acquisition, interface to PC/104 bus, using single board computer. • Integration of all components in single hardware to make it standalone and portable. • Development of application software in Qt on Linux platform for better performance and low cost compared to Windows. • Explored and utilized FPGA resources for hardware interfacing. - Abstract: The 64 channel data acquisition board is designed to meet the future demand for acquisition channels in plasma diagnostics. The inherent features of the board are 16 bit resolution, a programmable sampling rate up to 200 kS/s/ch, and simultaneous acquisition. To make the system embedded and compact, an 8 analog input ADC chip, 4M × 16 bit RAM, Field Programmable Gate Arrays, the PC/104 platform, and a single board computer (SBC) are used. High speed timing control signals for all ADCs and RAMs are generated by the FPGA. The system is standalone and portable, and it interfaces through Ethernet. The acquisition application is developed in Qt on the Linux platform and runs on the SBC. Due to Ethernet connectivity and onboard processing, the system can be integrated into the Aditya and SST-1 data acquisition systems. The performance of the hardware is tested on Linux and Windows Embedded OS. The paper describes the design, hardware and software architecture, implementation, and results of the 64 channel DAQ system.
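As a rough sanity check on the figures quoted in this abstract, the aggregate raw data rate follows from the channel count, the per-channel sampling rate, and the sample width. The sketch below assumes full 16-bit words with no packing or protocol overhead. The buffer-fill estimate additionally assumes that one 4M-word RAM (taken here as 4 × 2^20 words) is shared by the 8 channels of one ADC; the abstract does not state this, so that pairing and both function names are hypothetical.

```python
def raw_data_rate_bytes_per_s(channels, samples_per_s, bits_per_sample):
    """Aggregate raw data rate in bytes per second, ignoring any overhead."""
    return channels * samples_per_s * bits_per_sample / 8


def buffer_fill_time_s(ram_words, channels_per_ram, samples_per_s):
    """Seconds until a RAM of `ram_words` samples is full (one word per sample)."""
    return ram_words / (channels_per_ram * samples_per_s)


# Figures from the abstract: 64 channels, 200 kS/s per channel, 16-bit samples.
rate = raw_data_rate_bytes_per_s(64, 200_000, 16)

# Assumption, not from the abstract: 8 channels share one 4M-word RAM.
fill_time = buffer_fill_time_s(4 * 2**20, 8, 200_000)
```

Under these assumptions the board produces 25.6 MB/s at the maximum sampling rate, and a shared 4M-word buffer would fill in roughly 2.6 s.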
Embedded multi-channel data acquisition system on FPGA for Aditya Tokamak

International Nuclear Information System (INIS)

Rajpal, Rachana; Mandaliya, Hitesh; Patel, Jignesh; Kumari, Praveena; Gautam, Pramila; Raulji, Vismaysinh; Edappala, Praveenlal; Pujara, H.D; Jha, R.

2016-01-01

Highlights: • 64 channel data acquisition, interface to PC/104 bus, using single board computer. • Integration of all components in single hardware to make it standalone and portable. • Development of application software in Qt on Linux platform for better performance and low cost compared to Windows. • Explored and utilized FPGA resources for hardware interfacing. - Abstract: The 64 channel data acquisition board is designed to meet the future demand for acquisition channels in plasma diagnostics. The inherent features of the board are 16 bit resolution, a programmable sampling rate up to 200 kS/s/ch, and simultaneous acquisition. To make the system embedded and compact, an 8 analog input ADC chip, 4M × 16 bit RAM, Field Programmable Gate Arrays, the PC/104 platform, and a single board computer (SBC) are used. High speed timing control signals for all ADCs and RAMs are generated by the FPGA. The system is standalone and portable, and it interfaces through Ethernet. The acquisition application is developed in Qt on the Linux platform and runs on the SBC. Due to Ethernet connectivity and onboard processing, the system can be integrated into the Aditya and SST-1 data acquisition systems. The performance of the hardware is tested on Linux and Windows Embedded OS. The paper describes the design, hardware and software architecture, implementation, and results of the 64 channel DAQ system.
High Performance Spaceflight Computing (HPSC)

Data.gov (United States)

National Aeronautics and Space Administration — Space-based computing has not kept up with the needs of current and future NASA missions. We are developing a next-generation flight computing system that addresses...
The Artemis workbench for system-level performance evaluation of embedded systems

NARCIS (Netherlands)

Pimentel, A.D.

2008-01-01

In this paper, we present an overview of the Artemis workbench, which provides modelling and simulation methods and tools for efficient performance evaluation and exploration of heterogeneous embedded multimedia systems. More specifically, we describe the Artemis system-level modelling methodology,
Towards the development of run times leveraging virtualization for high performance computing

International Nuclear Information System (INIS)

Diakhate, F.

2010-12-01

In recent years, there has been a growing interest in using virtualization to improve the efficiency of data centers. This success is rooted in virtualization's excellent fault tolerance and isolation properties, in the overall flexibility it brings, and in its ability to exploit multi-core architectures efficiently. These characteristics also make virtualization an ideal candidate to tackle issues found in new compute cluster architectures. However, in spite of recent improvements in virtualization technology, overheads in the execution of parallel applications remain, which prevent its use in the field of high performance computing. In this thesis, we propose a virtual device dedicated to message passing between virtual machines, so as to improve the performance of parallel applications executed in a cluster of virtual machines. We also introduce a set of techniques facilitating the deployment of virtualized parallel applications. These functionalities have been implemented as part of a runtime system that lets users benefit from virtualization's properties in a way that is as transparent as possible while minimizing performance overheads. (author)
Investigation of protein adsorption performance of Ni2+-attached diatomite particles embedded in composite monolithic cryogels.

Science.gov (United States)

Ünlü, Nuri; Ceylan, Şeyda; Erzengin, Mahmut; Odabaşı, Mehmet

2011-08-01

As a low-cost natural adsorbent, diatomite (DA) (2 μm) has several advantages including high surface area, chemical reactivity, hydrophilicity and lack of toxicity. In this study, the protein adsorption performance of supermacroporous composite cryogels embedded with Ni(2+)-attached DA particles (Ni(2+)-ADAPs) was investigated. Supermacroporous poly(2-hydroxyethyl methacrylate) (PHEMA)-based monolithic composite cryogel column embedded with Ni(2+)-ADAPs was prepared by radical cryo-copolymerization of 2-hydroxyethyl methacrylate (HEMA) with N,N'-methylene-bis-acrylamide (MBAAm) as cross-linker directly in a plastic syringe for affinity purification of human serum albumin (HSA) both from aqueous solutions and human serum. The chemical composition and surface area of DA was determined by XRF and BET method, respectively. The characterization of composite cryogel was investigated by SEM. The effect of pH, and embedded Ni(2+)-ADAPs amount, initial HSA concentration, temperature and flow rate on adsorption were studied. The maximum amount of HSA adsorption from aqueous solution at pH 8.0 phosphate buffer was very high (485.15 mg/g DA). It was observed that HSA could be repeatedly adsorbed and desorbed to the embedded Ni(2+)-ADAPs in poly(2-hydroxyethyl methacrylate) composite cryogel without significant loss of adsorption capacity. The efficiency of albumin adsorption from human serum before and after albumin adsorption was also investigated with SDS-PAGE analyses. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
An innovative ultra-capacitor driven shape memory alloy actuator with an embedded control system

International Nuclear Information System (INIS)

Li, Peng; Song, Gangbing

2014-01-01

In this paper, an innovative ultra-capacitor driven shape memory alloy (SMA) actuator with an embedded control system is proposed, targeting high-power, high-duty-cycle SMA applications. The ultra-capacitor, which is capable of delivering massive amounts of instantaneous current in a compact dimension for high power applications, is chosen as the main component of the power supply. A specialized embedded system is designed from the ground up to control the ultra-capacitor driven SMA system. The control of the ultra-capacitor driven SMA is different from that of a regular constant voltage powered SMA system in that the energy and the voltage of the ultra-capacitor decrease as the system load increases. The embedded control system is also different from a computer-based control system in that it has limited computational power, and the control algorithm has to be designed to be simple yet effective so that it can fit into the embedded system environment. The problem of a variable voltage power source induced by the use of the ultra-capacitor is solved by using a fuzzy PID (proportional, integral, and derivative) control. Driving SMA actuators with an ultra-capacitor makes SMA a good candidate for high-power, high-duty-cycle applications. The proposed embedded control system provides a good and ready-to-use solution for SMA high power applications. (paper)
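To make the fuzzy PID idea concrete, here is a minimal sketch of one common variant: a PID loop whose proportional gain is rescaled by a small fuzzy rule base on the error magnitude. It is not the controller from the paper. The membership functions, the rule outputs (0.5, 1.0, 1.5), and all names (`kp_scale`, `FuzzyPID`, `error_max`) are assumptions chosen for illustration.

```python
def kp_scale(error_norm):
    """Fuzzy scaling factor for Kp from the normalized error magnitude in [0, 1]."""
    x = min(max(error_norm, 0.0), 1.0)
    # Memberships for "small", "medium", and "large" error; they sum to 1.
    small = max(0.0, 1.0 - 2.0 * x)
    medium = max(0.0, 1.0 - abs(2.0 * x - 1.0))
    large = max(0.0, 2.0 * x - 1.0)
    weights = (small, medium, large)
    outputs = (0.5, 1.0, 1.5)  # hypothetical rule consequents
    # Weighted-average defuzzification.
    return sum(w * o for w, o in zip(weights, outputs)) / sum(weights)


class FuzzyPID:
    """PID controller whose proportional gain is adapted by kp_scale."""

    def __init__(self, kp, ki, kd, error_max):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.error_max = error_max  # error magnitude mapped to 1.0
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, error, dt):
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        scale = kp_scale(abs(error) / self.error_max)
        return (self.kp * scale * error
                + self.ki * self.integral
                + self.kd * derivative)


controller = FuzzyPID(kp=2.0, ki=0.0, kd=0.0, error_max=10.0)
command = controller.step(error=10.0, dt=0.1)
```

The rule evaluation is a handful of comparisons and multiplications per step, which is the kind of cost an embedded controller with limited computational power can afford.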
International Conference on Modern Mathematical Methods and High Performance Computing in Science and Technology

CERN Document Server

Srivastava, HM; Venturino, Ezio; Resch, Michael; Gupta, Vijay

2016-01-01

The book discusses important results in modern mathematical models and high performance computing, such as applied operations research, simulation of operations, statistical modeling and applications, invisibility regions and regular meta-materials, unmanned vehicles, modern radar techniques/SAR imaging, satellite remote sensing, coding, and robotic systems. Furthermore, it is valuable as a reference work and as a basis for further study and research. All contributing authors are respected academicians, scientists and researchers from around the globe. All the papers were presented at the international conference on Modern Mathematical Methods and High Performance Computing in Science & Technology (M3HPCST 2015), held at Raj Kumar Goel Institute of Technology, Ghaziabad, India, from 27–29 December 2015, and peer-reviewed by international experts. The conference provided an exceptional platform for leading researchers, academicians, developers, engineers and technocrats from a broad range of disciplines ...
A Resource-Aware Component Model for Embedded Systems

OpenAIRE

Vulgarakis, Aneta

2009-01-01

Embedded systems are microprocessor-based systems that cover a large range of computer systems from ultra small computer-based devices to large systems monitoring and controlling complex processes. The particular constraints that must be met by embedded systems, such as timeliness, resource-use efficiency, short time-to-market and low cost, coupled with the increasing complexity of embedded system software, demand technologies and processes that will tackle these issues. An attractive approac...
Embedded Systems Development Tools: A MODUS-oriented Market Overview

Directory of Open Access Journals (Sweden)

Loupis Michalis

2014-03-01

Background: Embedded systems technology has perhaps been the most dominant technology in high-tech industries in the past decade. The industry has correctly identified the potential of this technology and has put its efforts into exploring its full potential. Objectives: The goal of the paper is to explore the versatility of the application in embedded system development based on one FP7-SME project. Methods/Approach: Embedded applications normally demand high resilience and quality, as well as conformity to quality standards and rigid performance. As a result, embedded system developers have adopted software methods that yield high quality. A qualitative approach to examining embedded systems development tools has been applied in this work. Results: This paper presents a MODUS-oriented market analysis in the domains of Formal Verification tools, HW/SW co-simulation tools, Software Performance Optimization tools and Code Generation tools. Conclusions: The versatility of the applications this technology serves is remarkable. With all this performance potential, the technology has brought with it a large number of issues that the industry needs to resolve in order to harness its full potential. The MODUS project toolset addressed four discrete domains of the ESD Software Market, in which corresponding open tools were developed.
Performance Measurements in a High Throughput Computing Environment

CERN Document Server

AUTHOR|(CDS)2145966; Gribaudo, Marco

The IT infrastructures of companies and research centres are implementing new technologies to satisfy the increasing need for computing resources for big data analysis. In this context, resource profiling plays a crucial role in identifying areas where utilisation efficiency needs to improve. In order to deal with the profiling and optimisation of computing resources, two complementary approaches can be adopted: the measurement-based approach and the model-based approach. The measurement-based approach gathers and analyses performance metrics by executing benchmark applications on computing resources. The model-based approach, by contrast, involves the design and implementation of a model as an abstraction of the real system, selecting only those aspects relevant to the study. This thesis originates from a project carried out by the author within the CERN IT department. CERN is an international scientific laboratory that conducts fundamental research in the domain of elementary particle physics. The p...
Implementing Molecular Dynamics for Hybrid High Performance Computers - 1. Short Range Forces

International Nuclear Information System (INIS)

Brown, W. Michael; Wang, Peng; Plimpton, Steven J.; Tharrington, Arnold N.

2011-01-01

The use of accelerators such as general-purpose graphics processing units (GPGPUs) has become popular in scientific computing applications due to their low cost, impressive floating-point capabilities, high memory bandwidth, and low electrical power requirements. Hybrid high performance computers, machines with more than one type of floating-point processor, are now becoming more prevalent due to these advantages. In this work, we discuss several important issues in porting a large molecular dynamics code for use on parallel hybrid machines: (1) choosing a hybrid parallel decomposition that works on central processing units (CPUs) with distributed memory and accelerator cores with shared memory, (2) minimizing the amount of code that must be ported for efficient acceleration, (3) utilizing the available processing power from both many-core CPUs and accelerators, and (4) choosing a programming model for acceleration. We present our solution to each of these issues for short-range force calculation in the molecular dynamics package LAMMPS. We describe algorithms for efficient short range force calculation on hybrid high performance machines. We describe a new approach for dynamic load balancing of work between CPU and accelerator cores. We describe the Geryon library that allows a single code to compile with both CUDA and OpenCL for use on a variety of accelerators. Finally, we present results on a parallel test cluster containing 32 Fermi GPGPUs and 180 CPU cores.
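One simple way to picture dynamic load balancing between CPU and accelerator cores is a feedback rule that adjusts the fraction of work sent to the accelerator from the times measured in the previous step. The sketch below is a generic illustration under that assumption, not the scheme implemented in LAMMPS; the function name `rebalance` and the `damping` parameter are hypothetical.

```python
def rebalance(fraction, t_accel, t_cpu, damping=0.5):
    """Return an updated accelerator work fraction.

    fraction : share of the work given to the accelerator last step (0 < fraction < 1)
    t_accel  : time the accelerator needed for its share
    t_cpu    : time the CPU cores needed for the remaining share
    damping  : 1.0 jumps straight to the estimated optimum, smaller values smooth noise
    """
    # Throughput of each side in "work units per second".
    rate_accel = fraction / t_accel
    rate_cpu = (1.0 - fraction) / t_cpu
    # Split at which both sides would finish at the same time.
    target = rate_accel / (rate_accel + rate_cpu)
    return fraction + damping * (target - fraction)


# The accelerator finished its half three times faster than the CPU did,
# so it should receive a larger share on the next step.
next_fraction = rebalance(0.5, t_accel=1.0, t_cpu=3.0, damping=1.0)
```

When both sides already finish at the same time, the rule leaves the split unchanged, so the balance point is a fixed point of the update.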
Polymer waveguides for electro-optical integration in data centers and high-performance computers.

Science.gov (United States)

Dangel, Roger; Hofrichter, Jens; Horst, Folkert; Jubin, Daniel; La Porta, Antonio; Meier, Norbert; Soganci, Ibrahim Murat; Weiss, Jonas; Offrein, Bert Jan

2015-02-23

To satisfy the intra- and inter-system bandwidth requirements of future data centers and high-performance computers, low-cost low-power high-throughput optical interconnects will become a key enabling technology. To tightly integrate optics with the computing hardware, particularly in the context of CMOS-compatible silicon photonics, optical printed circuit boards using polymer waveguides are considered as a formidable platform. IBM Research has already demonstrated the essential silicon photonics and interconnection building blocks. A remaining challenge is electro-optical packaging, i.e., the connection of the silicon photonics chips with the system. In this paper, we present a new single-mode polymer waveguide technology and a scalable method for building the optical interface between silicon photonics chips and single-mode polymer waveguides.
Power/energy use cases for high performance computing

Energy Technology Data Exchange (ETDEWEB)

Laros, James H. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Kelly, Suzanne M. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Hammond, Steven [National Renewable Energy Lab. (NREL), Golden, CO (United States); Elmore, Ryan [National Renewable Energy Lab. (NREL), Golden, CO (United States); Munch, Kristin [National Renewable Energy Lab. (NREL), Golden, CO (United States)

2013-12-01

Power and energy have been identified as a first-order challenge for future extreme-scale high performance computing (HPC) systems. In practice, the breakthroughs will need to be provided by the hardware vendors. But making the best use of the solutions in an HPC environment will likely require periodic tuning by facility operators and software components. This document describes the actions and interactions needed to maximize power resources. It strives to cover the entire operational space that an HPC system occupies. The descriptions are presented as formal use cases, as documented in the Unified Modeling Language Specification [1]. The document is intended to provide a common understanding to the HPC community of the necessary management and control capabilities. Assuming a common understanding can be achieved, the next step will be to develop a set of Application Programming Interfaces (APIs) that hardware vendors and software developers could use to steer power consumption.
Computer simulation of liquid cesium using embedded atom model

International Nuclear Information System (INIS)

Belashchenko, D K; Nikitin, N Yu

2008-01-01

A new method is presented for constructing an embedded atom model (EAM) potential for liquid metals. The method directly uses the pair correlation function (PCF) of the liquid metal near the melting temperature. Because of the specific analytic form of this EAM potential, the pair term of the potential can be calculated from the pair correlation function using, for example, the Schommers algorithm. Other parameters of the EAM potential may be found from the potential energy, compression modulus, and pressure under selected conditions, mainly near the melting temperature, at very high temperature, or in a strongly compressed state. We used a simple exponential formula for the effective EAM electron density and a polynomial series for the embedding energy. The molecular dynamics method was applied with the L. Verlet algorithm. A series of models with 1968 atoms in the basic cube was constructed in the temperature interval 323-1923 K. The thermodynamic properties, structure data, and self-diffusion coefficients of liquid cesium are calculated. In general, agreement between the model data and known experimental data is reasonable. An estimate is given for the critical temperature of cesium models with the EAM potential.
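The general structure of an EAM energy evaluation (a pair term plus an embedding term that depends on a local effective electron density) can be sketched as follows. This is an illustration of the standard EAM form only: the exponential density `p1 * exp(-p2 * r)` and the polynomial embedding function follow the abstract's description in spirit, but the parameter names, the polynomial being taken directly in the density, the absence of a cutoff and of periodic boundaries, and all numerical values are assumptions, not the fitted cesium potential.

```python
import math
from itertools import combinations


def eam_energy(positions, pair_potential, p1, p2, embed_coeffs):
    """Total EAM energy: sum of pair terms plus sum of embedding energies.

    positions      : list of (x, y, z) tuples
    pair_potential : callable phi(r) for the pair term
    p1, p2         : parameters of the exponential density psi(r) = p1 * exp(-p2 * r)
    embed_coeffs   : coefficients c_k of the embedding polynomial sum_k c_k * rho**k
    """
    n = len(positions)
    rho = [0.0] * n
    pair_energy = 0.0
    for i, j in combinations(range(n), 2):
        r = math.dist(positions[i], positions[j])
        pair_energy += pair_potential(r)
        # Each atom of the pair contributes density at the other atom's site.
        density = p1 * math.exp(-p2 * r)
        rho[i] += density
        rho[j] += density
    embed_energy = sum(
        sum(c * rho_i ** k for k, c in enumerate(embed_coeffs)) for rho_i in rho
    )
    return pair_energy + embed_energy


# Toy two-atom check with made-up parameters: psi(r) = 1, embedding F(rho) = rho,
# pair term phi(r) = -r.
energy = eam_energy(
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    pair_potential=lambda r: -r,
    p1=1.0,
    p2=0.0,
    embed_coeffs=[0.0, 1.0],
)
```

In a molecular dynamics run, forces derived from this energy would be fed to an integrator such as the Verlet scheme mentioned in the abstract.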
High-performance implementation of Chebyshev filter diagonalization for interior eigenvalue computations

Energy Technology Data Exchange (ETDEWEB)

Pieper, Andreas [Ernst-Moritz-Arndt-Universität Greifswald (Germany); Kreutzer, Moritz [Friedrich-Alexander-Universität Erlangen-Nürnberg (Germany); Alvermann, Andreas, E-mail: alvermann@physik.uni-greifswald.de [Ernst-Moritz-Arndt-Universität Greifswald (Germany); Galgon, Martin [Bergische Universität Wuppertal (Germany); Fehske, Holger [Ernst-Moritz-Arndt-Universität Greifswald (Germany); Hager, Georg [Friedrich-Alexander-Universität Erlangen-Nürnberg (Germany); Lang, Bruno [Bergische Universität Wuppertal (Germany); Wellein, Gerhard [Friedrich-Alexander-Universität Erlangen-Nürnberg (Germany)

2016-11-15

We study Chebyshev filter diagonalization as a tool for the computation of many interior eigenvalues of very large sparse symmetric matrices. In this technique the subspace projection onto the target space of wanted eigenvectors is approximated with filter polynomials obtained from Chebyshev expansions of window functions. After the discussion of the conceptual foundations of Chebyshev filter diagonalization we analyze the impact of the choice of the damping kernel, search space size, and filter polynomial degree on the computational accuracy and effort, before we describe the necessary steps towards a parallel high-performance implementation. Because Chebyshev filter diagonalization avoids the need for matrix inversion it can deal with matrices and problem sizes that are presently not accessible with rational function methods based on direct or iterative linear solvers. To demonstrate the potential of Chebyshev filter diagonalization for large-scale problems of this kind we include as an example the computation of the 10^2 innermost eigenpairs of a topological insulator matrix with dimension 10^9 derived from quantum physics applications.
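The core operation behind a Chebyshev filter, applying a polynomial approximation of a window function to a vector with only matrix-vector products, can be sketched in a few lines. The example below is a simplified illustration and not the authors' implementation: it uses a small dense matrix stored as nested lists, assumes the matrix has already been scaled so that its spectrum lies in [-1, 1], and omits the damping kernel discussed in the abstract. All function names are hypothetical.

```python
import math


def window_coefficients(a, b, degree):
    """Chebyshev coefficients of the indicator function of [a, b] inside [-1, 1]."""
    ta, tb = math.acos(a), math.acos(b)
    coeffs = [(ta - tb) / math.pi]
    for k in range(1, degree + 1):
        coeffs.append(2.0 * (math.sin(k * ta) - math.sin(k * tb)) / (k * math.pi))
    return coeffs


def matvec(matrix, vec):
    """Dense matrix-vector product on nested lists."""
    return [sum(m * x for m, x in zip(row, vec)) for row in matrix]


def chebyshev_filter(matrix, vec, coeffs):
    """Compute sum_k coeffs[k] * T_k(matrix) @ vec via the three-term recurrence."""
    t_prev = list(vec)                      # T_0(A) v = v
    y = [coeffs[0] * x for x in t_prev]
    if len(coeffs) == 1:
        return y
    t_curr = matvec(matrix, vec)            # T_1(A) v = A v
    y = [yi + coeffs[1] * t for yi, t in zip(y, t_curr)]
    for c in coeffs[2:]:
        av = matvec(matrix, t_curr)
        # T_{k+1}(A) v = 2 A T_k(A) v - T_{k-1}(A) v
        t_next = [2.0 * p - q for p, q in zip(av, t_prev)]
        y = [yi + c * t for yi, t in zip(y, t_next)]
        t_prev, t_curr = t_curr, t_next
    return y


# Toy example: eigenvalues -0.8, 0.0, 0.8; keep only the one inside [-0.3, 0.3].
A = [[-0.8, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.8]]
v = [1.0, 1.0, 1.0]
filtered = chebyshev_filter(A, v, window_coefficients(-0.3, 0.3, 200))
```

The filtered vector is dominated by the component belonging to the eigenvalue inside the window, while the other components are strongly suppressed. Since only matrix-vector products appear, no linear system has to be solved, which is the property the abstract highlights.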
Compact Acoustic Models for Embedded Speech Recognition

Directory of Open Access Journals (Sweden)

Lévy Christophe

2009-01-01

Speech recognition applications are known to require a significant amount of resources. However, embedded speech recognition allows only a few KB of memory, a few MIPS, and a small amount of training data. In order to fit the resource constraints of embedded applications, an approach based on a semicontinuous HMM system using state-independent acoustic modelling is proposed. A transformation is computed and applied to the global model in order to obtain each HMM state-dependent probability density function, so that only the transformation parameters need to be stored. This approach is evaluated on two tasks: digit and voice-command recognition. A fast adaptation technique for acoustic models is also proposed. In order to significantly reduce computational costs, the adaptation is performed only on the global model (using related speaker recognition adaptation techniques), with no need for state-dependent data. The whole approach results in a relative gain of more than 20% compared to a basic HMM-based system fitting the constraints.
Implementation of the Principal Component Analysis onto High-Performance Computer Facilities for Hyperspectral Dimensionality Reduction: Results and Comparisons

Directory of Open Access Journals (Sweden)

Ernestina Martel

2018-06-01

Dimensionality reduction represents a critical preprocessing step in order to increase the efficiency and the performance of many hyperspectral imaging algorithms. However, dimensionality reduction algorithms, such as Principal Component Analysis (PCA), are computationally demanding, which makes it advisable to implement them on high-performance computer architectures for applications under strict latency constraints. This work presents the implementation of the PCA algorithm on two different high-performance devices, namely, an NVIDIA Graphics Processing Unit (GPU) and a Kalray manycore, uncovering a highly valuable set of tips and tricks in order to take full advantage of the inherent parallelism of these high-performance computing platforms and hence reduce the time that is required to process a given hyperspectral image. Moreover, the results obtained with different hyperspectral images have been compared with the ones that were obtained with a recently published field programmable gate array (FPGA)-based implementation of the PCA algorithm, providing, for the first time in the literature, a comprehensive analysis in order to highlight the pros and cons of each option.
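As a rough illustration of what PCA-based dimensionality reduction computes, here is a minimal serial sketch in plain Python. It is not the GPU or manycore implementation from the paper. It assumes the image is flattened to a list of per-pixel spectra, and it uses power iteration (a technique swapped in here for brevity) to find only the first principal component; the function name `first_principal_component` is hypothetical.

```python
import math

def first_principal_component(pixels, iters=200):
    """pixels: list of spectra, one list of band values per pixel.

    Returns (direction, scores): the leading eigenvector of the band
    covariance matrix and the projection of every centered pixel onto it.
    """
    n, bands = len(pixels), len(pixels[0])
    # Step 1: remove the mean spectrum.
    mean = [sum(p[b] for p in pixels) / n for b in range(bands)]
    centered = [[p[b] - mean[b] for b in range(bands)] for p in pixels]
    # Step 2: band-by-band covariance matrix.
    cov = [[sum(row[i] * row[j] for row in centered) / (n - 1)
            for j in range(bands)] for i in range(bands)]
    # Step 3: power iteration. The all-ones start vector is an assumption;
    # it fails if it happens to be orthogonal to the leading eigenvector.
    v = [1.0] * bands
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(bands)) for i in range(bands)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Step 4: project each pixel onto the component (bands -> 1 value).
    scores = [sum(c * vi for c, vi in zip(row, v)) for row in centered]
    return v, scores

# Toy data: three bands, all variation along the direction (1, 2, 0).
toy = [[float(t), 2.0 * t, 5.0] for t in range(5)]
direction, scores = first_principal_component(toy)
```

The covariance accumulation and the projection are independent across pixels and bands, which is the kind of parallelism that such implementations can exploit.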

High Performance Computing and Storage Requirements for Nuclear Physics: Target 2017

Energy Technology Data Exchange (ETDEWEB)

Gerber, Richard [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Wasserman, Harvey [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

2014-04-30

In April 2014, NERSC, ASCR, and the DOE Office of Nuclear Physics (NP) held a review to characterize high performance computing (HPC) and storage requirements for NP research through 2017. This review is the 12th in a series of reviews held by NERSC and Office of Science program offices that began in 2009. It is the second for NP, and the final in the second round of reviews that covered the six Office of Science program offices. This report is the result of that review.
High level waste containing granules coated and embedded in metal as an alternative to HLW glasses

International Nuclear Information System (INIS)

Neumann, W.

1980-01-01

Granules containing simulated high level waste were overcoated with either pyrocarbon or nickel. The coatings were applied by chemical vapour deposition in a fluidized bed. The coated granules were embedded in an aluminium-silicon alloy to improve the dissipation of radiation-induced heat. The resulting metal-granule composites showed improved product stability compared with glasses containing high level waste. (orig.)
Analysis and modeling of social influence in high performance computing workloads

KAUST Repository

Zheng, Shuai

2011-01-01

Social influence among users (e.g., collaboration on a project) creates bursty behavior in the underlying high performance computing (HPC) workloads. Using representative HPC and cluster workload logs, this paper identifies, analyzes, and quantifies the level of social influence across HPC users. We show the existence of a social graph that is characterized by a pattern of dominant users and followers. This pattern also follows a power-law distribution, which is consistent with those observed in mainstream social networks. Given its potential impact on HPC workloads prediction and scheduling, we propose a fast-converging, computationally-efficient online learning algorithm for identifying social groups. Extensive evaluation shows that our online algorithm can (1) quickly identify the social relationships by using a small portion of incoming jobs and (2) can efficiently track group evolution over time. © 2011 Springer-Verlag.
Peer-to-peer computing for secure high performance data copying

International Nuclear Information System (INIS)

Hanushevsky, A.; Trunov, A.; Cottrell, L.

2001-01-01

The BaBar Copy Program (bbcp) is an excellent representative of peer-to-peer (P2P) computing. It is also a pioneering application of its type in the P2P arena. Built upon the foundation of its predecessor, Secure Fast Copy (sfcp), bbcp incorporates significant improvements in performance and usability. As with sfcp, bbcp uses ssh for authentication, providing an elegant and simple working model: if you can ssh to a location, you can copy files to or from that location. To fully support this notion, bbcp transparently supports third-party copy operations. The program also incorporates several mechanisms to deal with firewall security, the bane of P2P computing. To achieve high performance in a wide area network, bbcp allows a user to independently specify the number of parallel network streams, the TCP window size, and the file I/O blocking factor. Using these parameters, data is pipelined from source to target to provide a uniform traffic pattern that maximizes router efficiency. For improved recoverability, bbcp also keeps track of copy operations so that an operation can be restarted from the point of failure at a later time, minimizing the amount of network traffic in the event of a copy failure. Here, the authors present the bbcp architecture, its various features, and the reasons for their inclusion.
Peer-to-Peer Computing for Secure High Performance Data Copying

International Nuclear Information System (INIS)

2002-01-01

The BaBar Copy Program (bbcp) is an excellent representative of peer-to-peer (P2P) computing. It is also a pioneering application of its type in the P2P arena. Built upon the foundation of its predecessor, Secure Fast Copy (sfcp), bbcp incorporates significant improvements in performance and usability. As with sfcp, bbcp uses ssh for authentication, providing an elegant and simple working model: if you can ssh to a location, you can copy files to or from that location. To fully support this notion, bbcp transparently supports third-party copy operations. The program also incorporates several mechanisms to deal with firewall security, the bane of P2P computing. To achieve high performance in a wide area network, bbcp allows a user to independently specify the number of parallel network streams, the TCP window size, and the file I/O blocking factor. Using these parameters, data is pipelined from source to target to provide a uniform traffic pattern that maximizes router efficiency. For improved recoverability, bbcp also keeps track of copy operations so that an operation can be restarted from the point of failure at a later time, minimizing the amount of network traffic in the event of a copy failure. Here, we present the bbcp architecture, its various features, and the reasons for their inclusion.
Using High Performance Computing to Examine the Processes of Neurogenesis Underlying Pattern Separation/Completion of Episodic Information.

Energy Technology Data Exchange (ETDEWEB)

Aimone, James Bradley [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Betty, Rita [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

2015-03-01

Using High Performance Computing to Examine the Processes of Neurogenesis Underlying Pattern Separation/Completion of Episodic Information - Sandia researchers developed novel methods and metrics for studying the computational function of neurogenesis, thus generating substantial impact to the neuroscience and neural computing communities. This work could benefit applications in machine learning and other analysis activities.
Atrial Fibrillation Screening in Nonmetropolitan Areas Using a Telehealth Surveillance System With an Embedded Cloud-Computing Algorithm: Prospective Pilot Study

Science.gov (United States)

Chen, Ying-Hsien; Hung, Chi-Sheng; Huang, Ching-Chang; Hung, Yu-Chien

2017-01-01

Background: Atrial fibrillation (AF) is a common form of arrhythmia that is associated with increased risk of stroke and mortality. Detecting AF before the first complication occurs is a recognized priority. No previous studies have examined the feasibility of undertaking AF screening using a telehealth surveillance system with an embedded cloud-computing algorithm; we address this issue in this study. Objective: The objective of this study was to evaluate the feasibility of AF screening in nonmetropolitan areas using a telehealth surveillance system with an embedded cloud-computing algorithm. Methods: We conducted a prospective AF screening study in a nonmetropolitan area using a single-lead electrocardiogram (ECG) recorder. All ECG measurements were reviewed on the telehealth surveillance system and interpreted by the cloud-computing algorithm and a cardiologist. The process of AF screening was evaluated with a satisfaction questionnaire. Results: Between March 11, 2016 and August 31, 2016, 967 ECGs were recorded from 922 residents in nonmetropolitan areas. A total of 22 (2.4%, 22/922) residents with AF were identified by the physician's ECG interpretation, and only 0.2% (2/967) of ECGs contained significant artifacts. The novel cloud-computing algorithm for AF detection had a sensitivity of 95.5% (95% CI 77.2%-99.9%) and specificity of 97.7% (95% CI 96.5%-98.5%). The overall satisfaction score for the process of AF screening was 92.1%. Conclusions: AF screening in nonmetropolitan areas using a telehealth surveillance system with an embedded cloud-computing algorithm is feasible. PMID: 28951384
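The reported sensitivity and its confidence interval can be checked with a short calculation. This is a sketch under two assumptions that the abstract does not state: that the sensitivity of 95.5% corresponds to 21 of the 22 AF cases being detected, and that the interval is an exact (Clopper-Pearson) binomial interval. The helper names are hypothetical.

```python
from math import comb

def tail_ge(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p); increasing in p for k >= 1."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def solve_increasing(f, target, iters=100):
    """Bisection on [0, 1] for an increasing function f with f(p) = target."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided binomial confidence interval for k successes in n trials."""
    lower = 0.0 if k == 0 else solve_increasing(lambda p: tail_ge(k, n, p), alpha / 2)
    upper = 1.0 if k == n else solve_increasing(lambda p: tail_ge(k + 1, n, p), 1 - alpha / 2)
    return lower, upper

detected, cases = 21, 22            # assumed counts, not given in the abstract
sensitivity = detected / cases      # about 0.955
ci_low, ci_high = clopper_pearson(detected, cases)
```

Under these assumptions the interval comes out at roughly 77% to 99.9%, in line with the reported values. The wide lower bound reflects how few AF cases the estimate rests on.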
High Performance Molecular Visualization: In-Situ and Parallel Rendering with EGL

Science.gov (United States)

Stone, John E.; Messmer, Peter; Sisneros, Robert; Schulten, Klaus

2016-01-01

Large scale molecular dynamics simulations produce terabytes of data that is impractical to transfer to remote facilities. It is therefore necessary to perform visualization tasks in-situ as the data are generated, or by running interactive remote visualization sessions and batch analyses co-located with direct access to high performance storage systems. A significant challenge for deploying visualization software within clouds, clusters, and supercomputers involves the operating system software required to initialize and manage graphics acceleration hardware. Recently, it has become possible for applications to use the Embedded-system Graphics Library (EGL) to eliminate the requirement for windowing system software on compute nodes, thereby eliminating a significant obstacle to broader use of high performance visualization applications. We outline the potential benefits of this approach in the context of visualization applications used in the cloud, on commodity clusters, and supercomputers. We discuss the implementation of EGL support in VMD, a widely used molecular visualization application, and we outline benefits of the approach for molecular visualization tasks on petascale computers, clouds, and remote visualization servers. We then provide a brief evaluation of the use of EGL in VMD, with tests using developmental graphics drivers on conventional workstations and on Amazon EC2 G2 GPU-accelerated cloud instance types. We expect that the techniques described here will be of broad benefit to many other visualization applications. PMID:27747137
Uniformly embedded silver nanomesh as highly bendable transparent conducting electrode

International Nuclear Information System (INIS)

Choi, Hak-Jong; Choo, Soyoung; Jung, Pil-Hoon; Shin, Ju-Hyeon; Kim, Yang-Doo; Lee, Heon

2015-01-01

Ag-nanomesh-based highly bendable conducting electrodes are developed using a combination of metal nanotransfer printing and embossing for the 6-inch wafer scale. Two Ag nanomeshes, with pitch sizes of 7.5 and 10 μm, are used to obtain highly transparent (approximately 85% transmittance at a wavelength of 550 nm) and electrically conducting properties (below 10 Ω sq⁻¹). The Ag nanomeshes are also distinguished according to the fabrication process, which is called transferred or embedded Ag nanomesh on polyethylene terephthalate (PET) substrate, in order to compare their stability against bending stress. Then the enhancement of bending stability when the Ag nanomesh is embedded in the PET substrate is confirmed. (paper)
Uniformly embedded silver nanomesh as highly bendable transparent conducting electrode

Science.gov (United States)

Choi, Hak-Jong; Choo, Soyoung; Jung, Pil-Hoon; Shin, Ju-Hyeon; Kim, Yang-Doo; Lee, Heon

2015-02-01

Ag-nanomesh-based highly bendable conducting electrodes are developed using a combination of metal nanotransfer printing and embossing for the 6-inch wafer scale. Two Ag nanomeshes, with pitch sizes of 7.5 and 10 μm, are used to obtain highly transparent (approximately 85% transmittance at a wavelength of 550 nm) and electrically conducting properties (below 10 Ω sq⁻¹). The Ag nanomeshes are also distinguished according to the fabrication process, which is called transferred or embedded Ag nanomesh on polyethylene terephthalate (PET) substrate, in order to compare their stability against bending stress. Then the enhancement of bending stability when the Ag nanomesh is embedded in the PET substrate is confirmed.
Rapid Open Source GPS software development for modern embedded systems: using the GPSTk with the Gumstix

OpenAIRE

Salazar Hernández, Dagoberto José; Hernández Pajares, Manuel; Juan Zornoza, José Miguel; Sanz Subirana, Jaume

2006-01-01

This work shows how the combination of GPS Open Source Software (GOSS) and advanced full function miniature computers (FFMC) allows rapid development, implementation and testing of advanced embedded GNSS data processing applications in a flexible way. In this regard, our tools of choice are the “GPS Toolkit” (GPSTk), and a modern, high power embedded platform such as the “Gumstix” computer boards.
An Efficient Connected Component Labeling Architecture for Embedded Systems

Directory of Open Access Journals (Sweden)

Fanny Spagnolo

2018-03-01

Connected component analysis is one of the most fundamental steps used in several image processing systems. This technique allows for distinguishing and detecting different objects in images by assigning a unique label to all pixels that refer to the same object. Most of the previously published algorithms have been designed for implementation in software. However, due to the large number of memory accesses and compare, lookup, and control operations when executed on a general-purpose processor, they do not satisfy the speed performance required by next-generation high performance computer vision systems. In this paper, we present the design of a new Connected Component Labeling hardware architecture suitable for high performance heterogeneous image processing of embedded designs. When implemented on a Zynq All Programmable System on Chip (AP-SOC) 7045 chip, the proposed design allows a throughput rate higher than 220 Mpixels/s to be reached using less than 18,000 LUTs and 5000 FFs, dissipating about 620 μJ.
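For readers unfamiliar with the labeling task itself, the following is a minimal software sketch of the classic two-pass algorithm with union-find, using 4-connectivity. It illustrates the kind of software approach the abstract contrasts with, not the proposed hardware architecture, and the function name `label_components` is hypothetical.

```python
def label_components(image):
    """Two-pass 4-connected labeling of a binary image given as a list of rows.

    Returns (labels, count): a same-shaped grid with 0 for background and
    1..count for objects, plus the number of objects found.
    """
    rows, cols = len(image), len(image[0])
    labels = [[0] * cols for _ in range(rows)]
    parent = [0]  # parent[i] is the union-find parent of provisional label i

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # First pass: assign provisional labels and record equivalences.
    for r in range(rows):
        for c in range(cols):
            if not image[r][c]:
                continue
            up = labels[r - 1][c] if r > 0 else 0
            left = labels[r][c - 1] if c > 0 else 0
            if up == 0 and left == 0:
                parent.append(len(parent))       # new label is its own root
                labels[r][c] = len(parent) - 1
            else:
                labels[r][c] = min(l for l in (up, left) if l)
                if up and left:
                    root_up, root_left = find(up), find(left)
                    if root_up != root_left:
                        parent[max(root_up, root_left)] = min(root_up, root_left)

    # Second pass: replace each provisional label by a compact final label.
    final = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r][c]:
                root = find(labels[r][c])
                labels[r][c] = final.setdefault(root, len(final) + 1)
    return labels, len(final)

# A U-shaped object (two arms joined at the bottom) plus one isolated pixel.
image = [
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
]
labeled, count = label_components(image)
```

The two arms of the U first receive different provisional labels and are merged only when the bottom row is scanned. That equivalence bookkeeping is a source of the memory accesses and lookup operations mentioned in the abstract.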
Scalability of DL_POLY on High Performance Computing Platform

CSIR Research Space (South Africa)

Mabakane, Mabule S

2017-12-01

SACJ 29(3) December... when using many processors within the compute nodes of the supercomputer. The type of the processors of compute nodes and their memory also play an important role in the overall performance of the parallel application running on a supercomputer. DL...
High-Bandwidth Tactical-Network Data Analysis in a High-Performance-Computing (HPC) Environment: Packet-Level Analysis

Science.gov (United States)

2015-09-01

individual fragments using the hash-based method. In general, fragments appear in order and relatively close to each other in the file. A fragment...data product derived from the data model is shown in Fig. 5, a Google Earth Keyhole Markup Language (KML) file. This product includes aggregate...System BLOb binary large object FPGA field-programmable gate array HPC high-performance computing IP Internet Protocol KML Keyhole Markup Language
Adaptive Probabilistic Tracking Embedded in Smart Cameras for Distributed Surveillance in a 3D Model

Directory of Open Access Journals (Sweden)

Sven Fleck

2006-12-01

Tracking applications based on distributed and embedded sensor networks are emerging today, both in the fields of surveillance and industrial vision. Traditional centralized approaches have several drawbacks, due to limited communication bandwidth, computational requirements, and thus limited spatial camera resolution and frame rate. In this article, we present network-enabled smart cameras for probabilistic tracking. They are capable of tracking objects adaptively in real time and offer a very bandwidth-conservative approach, as the whole computation is performed embedded in each smart camera and only the tracking results, which are on a higher level of abstraction, are transmitted. Based on this, we present a distributed surveillance system. The smart cameras' tracking results are embedded in an integrated 3D environment as live textures and can be viewed from arbitrary perspectives. A georeferenced live visualization embedded in Google Earth is also presented.
Adaptive Probabilistic Tracking Embedded in Smart Cameras for Distributed Surveillance in a 3D Model

Directory of Open Access Journals (Sweden)

Fleck Sven

2007-01-01

Tracking applications based on distributed and embedded sensor networks are emerging today, both in the fields of surveillance and industrial vision. Traditional centralized approaches have several drawbacks, due to limited communication bandwidth, computational requirements, and thus limited spatial camera resolution and frame rate. In this article, we present network-enabled smart cameras for probabilistic tracking. They are capable of tracking objects adaptively in real time and offer a very bandwidth-conservative approach, as the whole computation is performed embedded in each smart camera and only the tracking results, which are on a higher level of abstraction, are transmitted. Based on this, we present a distributed surveillance system. The smart cameras' tracking results are embedded in an integrated 3D environment as live textures and can be viewed from arbitrary perspectives. A georeferenced live visualization embedded in Google Earth is also presented.
Design and Implementation of an Embedded NIOS II System for JPEG2000 Tier II Encoding

Directory of Open Access Journals (Sweden)

John M. McNichols

2013-01-01

This paper presents a novel implementation of the JPEG2000 standard as a system on a chip (SoC). While most of the research in this field centers on acceleration of the EBCOT Tier I encoder, this work focuses on an embedded solution for EBCOT Tier II. Specifically, this paper proposes using an embedded softcore processor to perform Tier II processing as the back end of an encoding pipeline. The Altera NIOS II processor is chosen for the implementation and is coupled with existing embedded processing modules to realize a fully embedded JPEG2000 encoder. The design is synthesized on a Stratix IV FPGA and is shown to outperform other comparable SoC implementations by 39% in computation time.
Solving Problems in Various Domains by Hybrid Models of High Performance Computations

Directory of Open Access Journals (Sweden)

Yurii Rogozhin

2014-03-01

This work presents a hybrid model of high performance computations. The model is based on a membrane system (P system) in which some membranes may contain a quantum device that is triggered by the data entering the membrane. This model is intended to take advantage of both the biomolecular and quantum paradigms and to overcome some of their inherent limitations. The proposed approach is demonstrated through two selected problems: SAT and image retrieval.
High electrochemical performance of RuO₂–Fe₂O₃ nanoparticles embedded ordered mesoporous carbon as a supercapacitor electrode material

International Nuclear Information System (INIS)

Xiang, Dong; Yin, Longwei; Wang, Chenxiang; Zhang, Luyuan

2016-01-01

The electrode materials, RuO₂ or RuO₂–Fe₂O₃ nanoparticles embedded in OMC (ordered mesoporous carbon), are prepared by impregnation and in situ heating. The mesoporous structure optimizes the electron and proton conducting pathways, leading to enhanced capacitive performance of the composite materials. The average nanoparticle size of RuO₂ and RuO₂–Fe₂O₃ is 2.54 and 1.96 nm, respectively. The fine RuO₂–Fe₂O₃ nanoparticles are dispersed evenly in the pore channel wall of the two-dimensional mesoporous carbon without blocking the mesoporous channel, and they have a higher specific surface area, a larger pore volume, a proper pore size and a small charge transfer impedance value. The specific electrochemical capacitance of RuO₂–Fe₂O₃/OMC tested in acid electrolyte (H₂SO₄) is measured to be as high as 1668 F g⁻¹, which is higher than that of RuO₂/OMC. Meanwhile, the RuO₂–Fe₂O₃/OMC composites show a good cycling performance of 93% capacitance retention (3000 cycles), better reversibility, a higher energy density (134 Wh kg⁻¹) and a higher power density (4000 W kg⁻¹). The composite electrode of RuO₂–Fe₂O₃/OMC, which combines double layer capacitance with pseudo-capacitance, is shown to be a suitable high performance electrode material for hybrid supercapacitor applications. - Highlights: • The nanocomposites of RuO₂–Fe₂O₃/OMC are prepared by impregnation and in situ heating. • The fine RuO₂–Fe₂O₃ nanoparticles are distributed in the pore channel wall of OMC. • We discuss a reversible redox reaction mechanism of RuO₂–Fe₂O₃/OMC in acid solutions. • RuO₂–Fe₂O₃ nanoparticles embedded in OMC show a higher supercapacitive performance.
General rigid motion correction for computed tomography imaging based on locally linear embedding

Science.gov (United States)

Chen, Mianyi; He, Peng; Feng, Peng; Liu, Baodong; Yang, Qingsong; Wei, Biao; Wang, Ge

2018-02-01

Patient motion can degrade the quality of computed tomography images, which are typically acquired in cone-beam geometry. Rigid patient motion is characterized by six geometric parameters and is more challenging to correct than in fan-beam geometry. We extend our previous rigid patient motion correction method based on the principle of locally linear embedding (LLE) from fan-beam to cone-beam geometry and accelerate the computational procedure with the graphics processing unit (GPU)-based all scale tomographic reconstruction Antwerp toolbox. The major merit of our method is that we need neither fiducial markers nor motion-tracking devices. The numerical and experimental studies show that the LLE-based patient motion correction is capable of calibrating the six parameters of the patient motion simultaneously, reducing patient motion artifacts significantly.

FY 1996 Blue Book: High Performance Computing and Communications: Foundations for America's Information Future

Data.gov (United States)

Networking and Information Technology Research and Development, Executive Office of the President — The Federal High Performance Computing and Communications HPCC Program will celebrate its fifth anniversary in October 1996 with an impressive array of...
FY 1997 Blue Book: High Performance Computing and Communications: Advancing the Frontiers of Information Technology

Data.gov (United States)

Networking and Information Technology Research and Development, Executive Office of the President — The Federal High Performance Computing and Communications HPCC Program will celebrate its fifth anniversary in October 1996 with an impressive array of...
Cell-Averaged discretization for incompressible Navier-Stokes with embedded boundaries and locally refined Cartesian meshes: a high-order finite volume approach

Science.gov (United States)

Bhalla, Amneet Pal Singh; Johansen, Hans; Graves, Dan; Martin, Dan; Colella, Phillip; Applied Numerical Algorithms Group Team

2017-11-01

We present a consistent cell-averaged discretization for incompressible Navier-Stokes equations on complex domains using embedded boundaries. The embedded boundary is allowed to freely cut the locally-refined background Cartesian grid. Implicit-function representation is used for the embedded boundary, which allows us to convert the required geometric moments in the Taylor series expansion (up to arbitrary order) of polynomials into an algebraic problem in lower dimensions. The computed geometric moments are then used to construct stencils for various operators like the Laplacian, divergence, gradient, etc., by solving a least-squares system locally. We also construct the inter-level data-transfer operators like prolongation and restriction for multigrid solvers using the same least-squares system approach. This allows us to retain high-order accuracy near coarse-fine interfaces and near embedded boundaries. Canonical problems like Taylor-Green vortex flow and flow past bluff bodies will be presented to demonstrate the proposed method. U.S. Department of Energy, Office of Science, ASCR (Award Number DE-AC02-05CH11231).
Field tests on partial embedment effects (embedment effect tests on soil-structure interaction)

International Nuclear Information System (INIS)

Kurimoto, O.; Tsunoda, T.; Inoue, T.; Izumi, M.; Kusakabe, K.; Akino, K.

1993-01-01

A series of Model Tests of Embedment Effect on Reactor Buildings has been carried out by the Nuclear Power Engineering Corporation (NUPEC), under the sponsorship of the Ministry of International Trade and Industry (MITI) of Japan. Nuclear reactor buildings in Japan are partially embedded because of construction conditions or building arrangement. It is necessary to verify the partial embedment effects by experiments and analytical studies in order to incorporate the effects in the seismic design. Forced vibration tests were therefore performed using a model with several types of embedment. Correlated simulation analyses were also performed and the characteristics of partial embedment effects on soil-structure interaction were evaluated. (author)
Challenges and opportunities of modeling plasma–surface interactions in tungsten using high-performance computing

Energy Technology Data Exchange (ETDEWEB)

Wirth, Brian D., E-mail: bdwirth@utk.edu [Department of Nuclear Engineering, University of Tennessee, Knoxville, TN 37996 (United States); Nuclear Science and Engineering Directorate, Oak Ridge National Laboratory, Oak Ridge, TN (United States); Hammond, K.D. [Department of Nuclear Engineering, University of Tennessee, Knoxville, TN 37996 (United States); Krasheninnikov, S.I. [University of California, San Diego, La Jolla, CA (United States); Maroudas, D. [University of Massachusetts, Amherst, Amherst, MA 01003 (United States)

2015-08-15

The performance of plasma facing components (PFCs) is critical for ITER and future magnetic fusion reactors. The ITER divertor will be tungsten, which is the primary candidate material for future reactors. Recent experiments involving tungsten exposure to low-energy helium plasmas reveal significant surface modification, including the growth of nanometer-scale tendrils of “fuzz” and formation of nanometer-sized bubbles in the near-surface region. The large span of spatial and temporal scales governing plasma surface interactions is among the challenges to modeling divertor performance. Fortunately, recent innovations in computational modeling, increasingly powerful high-performance computers, and improved experimental characterization tools provide a path toward self-consistent, experimentally validated models of PFC and divertor performance. Recent advances in understanding tungsten–helium interactions are reviewed, including such processes as helium clustering, in which the clusters serve as nuclei for gas bubbles, and trap mutation, dislocation loop punching and bubble bursting, which together initiate surface morphological modification.
Challenges and opportunities of modeling plasma–surface interactions in tungsten using high-performance computing

International Nuclear Information System (INIS)

Wirth, Brian D.; Hammond, K.D.; Krasheninnikov, S.I.; Maroudas, D.

2015-01-01

The performance of plasma facing components (PFCs) is critical for ITER and future magnetic fusion reactors. The ITER divertor will be tungsten, which is the primary candidate material for future reactors. Recent experiments involving tungsten exposure to low-energy helium plasmas reveal significant surface modification, including the growth of nanometer-scale tendrils of “fuzz” and formation of nanometer-sized bubbles in the near-surface region. The large span of spatial and temporal scales governing plasma surface interactions is among the challenges to modeling divertor performance. Fortunately, recent innovations in computational modeling, increasingly powerful high-performance computers, and improved experimental characterization tools provide a path toward self-consistent, experimentally validated models of PFC and divertor performance. Recent advances in understanding tungsten–helium interactions are reviewed, including such processes as helium clustering, in which the clusters serve as nuclei for gas bubbles, and trap mutation, dislocation loop punching and bubble bursting, which together initiate surface morphological modification.
Scalable domain decomposition solvers for stochastic PDEs in high performance computing

International Nuclear Information System (INIS)

Desai, Ajit; Pettit, Chris; Poirel, Dominique; Sarkar, Abhijit

2017-01-01

Stochastic spectral finite element models of practical engineering systems may involve solutions of linear systems or linearized systems for non-linear problems with billions of unknowns. For stochastic modeling, it is therefore essential to design robust, parallel and scalable algorithms that can efficiently utilize high-performance computing to tackle such large-scale systems. Domain decomposition based iterative solvers can handle such systems. Although these algorithms exhibit excellent scalability, significant algorithmic and implementational challenges exist in extending them to solve extreme-scale stochastic systems using emerging computing platforms. Intrusive polynomial chaos expansion based domain decomposition algorithms are extended here to concurrently handle high resolution in both spatial and stochastic domains using an in-house implementation. Sparse iterative solvers with efficient preconditioners are employed to solve the resulting global and subdomain level local systems through multi-level iterative solvers. We also use parallel sparse matrix–vector operations to reduce the floating-point operations and memory requirements. Numerical and parallel scalabilities of these algorithms are presented for the diffusion equation having a spatially varying diffusion coefficient modeled by a non-Gaussian stochastic process. Scalability of the solvers with respect to the number of random variables is also investigated.
High performance computations using dynamical nucleation theory

International Nuclear Information System (INIS)

Windus, T L; Crosby, L D; Kathmann, S M

2008-01-01

Chemists continue to explore the use of very large computations to perform simulations that describe the molecular level physics of critical challenges in science. In this paper, we describe the Dynamical Nucleation Theory Monte Carlo (DNTMC) model - a model for determining molecular scale nucleation rate constants - and its parallel capabilities. The potential for bottlenecks and the challenges to running on future petascale or larger resources are delineated. A 'master-slave' solution is proposed to scale to the petascale and will be developed in the NWChem software. In addition, mathematical and data analysis challenges are described
Application of High Performance Computing to Earthquake Hazard and Disaster Estimation in Urban Area

Directory of Open Access Journals (Sweden)

Muneo Hori

2018-02-01

Integrated earthquake simulation (IES) is a seamless simulation that analyzes all processes of earthquake hazard and disaster. There are two difficulties in carrying out IES, namely, the requirement of large-scale computation and the requirement of numerous analysis models for structures in an urban area, and they are solved by taking advantage of high performance computing (HPC) and by developing a system of automated model construction. HPC is a key element in developing IES, as it needs to analyze wave propagation and amplification processes in an underground structure; a high-fidelity model of the underground structure has more than 100 billion degrees of freedom. Examples of IES for Tokyo Metropolis are presented; the numerical computation is carried out on the K computer, a Japanese supercomputer. The estimation of earthquake hazard and disaster for a given earthquake scenario is made by the ground motion simulation and the urban area seismic response simulation, respectively, for the target area of 10,000 m × 10,000 m.
Development of a Computational Steering Framework for High Performance Computing Environments on Blue Gene/P Systems

KAUST Repository

Danani, Bob K.

2012-07-01

Computational steering has revolutionized the traditional workflow in high performance computing (HPC) applications. The standard workflow that consists of preparation of an application’s input, running of a simulation, and visualization of simulation results in a post-processing step is now transformed into a real-time interactive workflow that significantly reduces development and testing time. Computational steering provides the capability to direct or re-direct the progress of a simulation application at run-time. It allows modification of application-defined control parameters at run-time using various user-steering applications. In this project, we propose a computational steering framework for HPC environments that provides an innovative solution and easy-to-use platform, which allows users to connect and interact with running application(s) in real-time. This framework uses RealityGrid as the underlying steering library and adds several enhancements to the library to enable steering support for Blue Gene systems. Included in the scope of this project is the development of a scalable and efficient steering relay server that supports many-to-many connectivity between multiple steered applications and multiple steering clients. Steered applications can range from intermediate simulation and physical modeling applications to complex computational fluid dynamics (CFD) applications or advanced visualization applications. The Blue Gene supercomputer presents special challenges for remote access because the compute nodes reside on private networks. This thesis presents an implemented solution and demonstrates it on representative applications. Thorough implementation details and application enablement steps are also presented in this thesis to encourage direct usage of this framework.
Reconfigurable computing the theory and practice of FPGA-based computation

CERN Document Server

Hauck, Scott

2010-01-01

Reconfigurable Computing marks a revolutionary and hot topic that bridges the gap between the separate worlds of hardware and software design. The key feature of reconfigurable computing is its groundbreaking ability to perform computations in hardware to increase performance while retaining the flexibility of a software solution. Reconfigurable computers serve as affordable, fast, and accurate tools for developing designs ranging from single chip architectures to multi-chip and embedded systems. Scott Hauck and Andre DeHon have assembled a group of the key experts in the fields of both hardwa
High-Throughput Computing on High-Performance Platforms: A Case Study

Energy Technology Data Exchange (ETDEWEB)

Oleynik, D [University of Texas at Arlington; Panitkin, S [Brookhaven National Laboratory (BNL); Matteo, Turilli [Rutgers University; Angius, Alessio [Rutgers University; Oral, H Sarp [ORNL; De, K [University of Texas at Arlington; Klimentov, A [Brookhaven National Laboratory (BNL); Wells, Jack C. [ORNL; Jha, S [Rutgers University

2017-10-01

The computing systems used by LHC experiments have historically consisted of the federation of hundreds to thousands of distributed resources, ranging from small to mid-size resources. In spite of the impressive scale of the existing distributed computing solutions, the federation of small to mid-size resources will be insufficient to meet projected future demands. This paper is a case study of how the ATLAS experiment has embraced Titan, a DOE leadership facility, in conjunction with traditional distributed high-throughput computing to reach sustained production scales of approximately 52M core-hours a year. The three main contributions of this paper are: (i) a critical evaluation of design and operational considerations to support the sustained, scalable and production usage of Titan; (ii) a preliminary characterization of a next generation executor for PanDA to support new workloads and advanced execution modes; and (iii) early lessons for how current and future experimental and observational systems can be integrated with production supercomputers and other platforms in a general and extensible manner.
Cloud object store for archive storage of high performance computing data using decoupling middleware

Science.gov (United States)

Bent, John M.; Faibish, Sorin; Grider, Gary

2015-06-30

Cloud object storage is enabled for archived data, such as checkpoints and results, of high performance computing applications using a middleware process. A plurality of archived files, such as checkpoint files and results, generated by a plurality of processes in a parallel computing system are stored by obtaining the plurality of archived files from the parallel computing system; converting the plurality of archived files to objects using a log structured file system middleware process; and providing the objects for storage in a cloud object storage system. The plurality of processes may run, for example, on a plurality of compute nodes. The log structured file system middleware process may be embodied, for example, as a Parallel Log-Structured File System (PLFS). The log structured file system middleware process optionally executes on a burst buffer node.
A novel 2 T P-channel nano-crystal memory for low power/high speed embedded NVM applications

International Nuclear Information System (INIS)

Zhang Junyu; Wang Yong; Liu Jing; Zhang Manhong; Xu Zhongguang; Huo Zongliang; Liu Ming

2012-01-01

We introduce a novel 2 T P-channel nano-crystal memory structure for low power and high speed embedded non-volatile memory (NVM) applications. By using the band-to-band tunneling-induced hot-electron (BTBTIHE) injection scheme, both high-speed and low power programming can be achieved at the same time. Due to the use of a select transistor, the 'erased states' can be set to below 0 V, so that the periphery HV circuit (high-voltage generating and management) and read-out circuit can be simplified. Good memory cell performance has also been achieved, including a fast program/erase (P/E) speed (a 1.15 V memory window under 10 μs program pulse), an excellent data retention (only 20% charge loss for 10 years). The data shows that the device has strong potential for future embedded NVM applications. (semiconductor devices)
Matrix multiplication operations with data pre-conditioning in a high performance computing architecture

Science.gov (United States)

Eichenberger, Alexandre E; Gschwind, Michael K; Gunnels, John A

2013-11-05

Mechanisms for performing matrix multiplication operations with data pre-conditioning in a high performance computing architecture are provided. A vector load operation is performed to load a first vector operand of the matrix multiplication operation to a first target vector register. A load and splat operation is performed to load an element of a second vector operand and replicating the element to each of a plurality of elements of a second target vector register. A multiply add operation is performed on elements of the first target vector register and elements of the second target vector register to generate a partial product of the matrix multiplication operation. The partial product of the matrix multiplication operation is accumulated with other partial products of the matrix multiplication operation.
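The load, splat, multiply-add, and accumulate sequence described in this abstract can be illustrated in plain Python. This is only a sketch of the general pattern, not the mechanism claimed in the record: `VLEN`, `splat`, `fma`, and `matmul_splat` are hypothetical names, Python lists stand in for vector registers, and the register width is an assumption.

```python
VLEN = 4  # assumed vector register width (elements per register)


def splat(x, width=VLEN):
    """Load-and-splat: replicate one scalar into every element of a register."""
    return [x] * width


def fma(acc, a, b):
    """Element-wise multiply-add: acc[i] + a[i] * b[i]."""
    return [c + x * y for c, x, y in zip(acc, a, b)]


def matmul_splat(A, B):
    """Compute C = A @ B, where every row of B holds exactly VLEN elements.

    For each element A[i][k], a row of B is 'vector loaded', A[i][k] is
    splatted across a second register, and the multiply-add result is
    accumulated into the partial products for row i of C.
    """
    C = []
    for i in range(len(A)):
        acc = [0] * VLEN                   # accumulator for row i of C
        for k in range(len(B)):
            b_row = B[k]                   # vector load of the first operand
            a_splat = splat(A[i][k])       # load and splat of the second operand
            acc = fma(acc, b_row, a_splat) # accumulate one partial product
        C.append(acc)
    return C


A = [[1, 2],
     [3, 4]]
B = [[1, 0, 2, 0],
     [0, 1, 0, 2]]
C = matmul_splat(A, B)
print(C)
```

The point of the pattern is that the inner loop touches only whole registers, so no element of the result row has to be reduced horizontally across a register.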
Atrial Fibrillation Screening in Nonmetropolitan Areas Using a Telehealth Surveillance System With an Embedded Cloud-Computing Algorithm: Prospective Pilot Study.

Science.gov (United States)

Chen, Ying-Hsien; Hung, Chi-Sheng; Huang, Ching-Chang; Hung, Yu-Chien; Hwang, Juey-Jen; Ho, Yi-Lwun

2017-09-26

Atrial fibrillation (AF) is a common form of arrhythmia that is associated with increased risk of stroke and mortality. Detecting AF before the first complication occurs is a recognized priority. No previous studies have examined the feasibility of undertaking AF screening using a telehealth surveillance system with an embedded cloud-computing algorithm; we address this issue in this study. The objective of this study was to evaluate the feasibility of AF screening in nonmetropolitan areas using a telehealth surveillance system with an embedded cloud-computing algorithm. We conducted a prospective AF screening study in a nonmetropolitan area using a single-lead electrocardiogram (ECG) recorder. All ECG measurements were reviewed on the telehealth surveillance system and interpreted by the cloud-computing algorithm and a cardiologist. The process of AF screening was evaluated with a satisfaction questionnaire. Between March 11, 2016 and August 31, 2016, 967 ECGs were recorded from 922 residents in nonmetropolitan areas. A total of 22 (2.4%, 22/922) residents with AF were identified by the physician's ECG interpretation, and only 0.2% (2/967) of ECGs contained significant artifacts. The novel cloud-computing algorithm for AF detection had a sensitivity of 95.5% (95% CI 77.2%-99.9%) and specificity of 97.7% (95% CI 96.5%-98.5%). The overall satisfaction score for the process of AF screening was 92.1%. AF screening in nonmetropolitan areas using a telehealth surveillance system with an embedded cloud-computing algorithm is feasible. ©Ying-Hsien Chen, Chi-Sheng Hung, Ching-Chang Huang, Yu-Chien Hung, Juey-Jen Hwang, Yi-Lwun Ho. Originally published in JMIR Mhealth and Uhealth (http://mhealth.jmir.org), 26.09.2017.
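The percentages reported in this abstract follow from simple ratios, sketched below. The prevalence and artifact rate use counts stated in the abstract. The confusion-matrix counts (`tp`, `fn`, `tn`, `fp`) are not given in the abstract; they are hypothetical per-resident values chosen only because they reproduce the reported sensitivity and specificity, and the study may have computed these per ECG instead.

```python
def sensitivity(tp, fn):
    """Fraction of true AF cases that the algorithm flags."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of non-AF cases that the algorithm correctly clears."""
    return tn / (tn + fp)


# Counts stated in the abstract.
residents, af_cases = 922, 22
ecgs, artifact_ecgs = 967, 2

prevalence = af_cases / residents      # share of residents with AF
artifact_rate = artifact_ecgs / ecgs   # share of ECGs with significant artifacts

# Hypothetical confusion-matrix counts (assumption, not from the source).
tp, fn = 21, 1      # 22 residents with AF
tn, fp = 879, 21    # 900 residents without AF

print(f"prevalence    {100 * prevalence:.1f}%")
print(f"artifact rate {100 * artifact_rate:.1f}%")
print(f"sensitivity   {100 * sensitivity(tp, fn):.1f}%")
print(f"specificity   {100 * specificity(tn, fp):.1f}%")
```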
High performance computing, supercomputing, demanding computing

Czech Academy of Sciences Publication Activity Database

Okrouhlík, Miloslav

2003-01-01

Vol. 10, No. 5 (2003), pp. 429-438 ISSN 1210-2717 R&D Projects: GA ČR GA101/02/0072 Institutional research plan: CEZ:AV0Z2076919 Keywords: high performance computing * vector and parallel computers * programming tools for parallelization Subject RIV: BI - Acoustics
Requirements for high performance computing for lattice QCD. Report of the ECFA working panel

International Nuclear Information System (INIS)

Jegerlehner, F.; Kenway, R.D.; Martinelli, G.; Michael, C.; Pene, O.; Petersson, B.; Petronzio, R.; Sachrajda, C.T.; Schilling, K.

2000-01-01

This report, prepared at the request of the European Committee for Future Accelerators (ECFA), contains an assessment of the High Performance Computing resources which will be required in coming years by European physicists working in Lattice Field Theory and a review of the scientific opportunities which these resources would open. (orig.)
High-performance computing on the Intel Xeon Phi how to fully exploit MIC architectures

CERN Document Server

Wang, Endong; Shen, Bo; Zhang, Guangyong; Lu, Xiaowei; Wu, Qing; Wang, Yajuan

2014-01-01

The aim of this book is to explain to high-performance computing (HPC) developers how to utilize the Intel® Xeon Phi™ series products efficiently. To that end, it introduces some computing grammar, programming technology and optimization methods for using many-integrated-core (MIC) platforms and also offers tips and tricks for actual use, based on the authors' first-hand optimization experience. The material is organized in three sections. The first section, "Basics of MIC", introduces the fundamentals of MIC architecture and programming, including the specific Intel MIC programming environment
Toward real-time virtual biopsy of oral lesions using confocal laser endomicroscopy interfaced with embedded computing.

Science.gov (United States)

Thong, Patricia S P; Tandjung, Stephanus S; Movania, Muhammad Mobeen; Chiew, Wei-Ming; Olivo, Malini; Bhuvaneswari, Ramaswamy; Seah, Hock-Soon; Lin, Feng; Qian, Kemao; Soo, Khee-Chee

2012-05-01

Oral lesions are conventionally diagnosed using white light endoscopy and histopathology. This can pose a challenge because the lesions may be difficult to visualise under white light illumination. Confocal laser endomicroscopy can be used for confocal fluorescence imaging of surface and subsurface cellular and tissue structures. To move toward real-time "virtual" biopsy of oral lesions, we interfaced an embedded computing system to a confocal laser endomicroscope to achieve a prototype three-dimensional (3-D) fluorescence imaging system. A field-programmable gate array computing platform was programmed to enable synchronization of cross-sectional image grabbing and Z-depth scanning, automate the acquisition of confocal image stacks and perform volume rendering. Fluorescence imaging of the human and murine oral cavities was carried out using the fluorescent dyes fluorescein sodium and hypericin. Volume rendering of cellular and tissue structures from the oral cavity demonstrates the potential of the system for 3-D fluorescence visualization of the oral cavity in real-time. We aim toward achieving a real-time virtual biopsy technique that can complement current diagnostic techniques and aid in targeted biopsy for better clinical outcomes.

Manufacturing of highly integrated mechatronic modules by using the technology of embedding stereolithography

Science.gov (United States)

Rechtenwald, Thomas; Frick, Thomas; Schmidt, Michael

The embedding stereolithography is an additive, hybrid process, which allows the construction of highly integrated 3D assemblies for the use in automotive applications. The flexible process of stereolithography is combined with the embedding of functional components and supplemented by the additive manufacturing of electrical or optical conductive structures. This combination of sub-processes implies a high potential regarding the obtainable integration density of mechatronical modules. This work considers basic restrictions, which limit the mechanical stability of the manufactured modules by calculating the superposition of residual and external stress using a thermo-mechanical finite element model and develops a procedure to qualify stereolithography matrix materials for the process of the embedding stereolithography.
Polarizable Density Embedding

DEFF Research Database (Denmark)

Olsen, Jógvan Magnus Haugaard; Steinmann, Casper; Ruud, Kenneth

2015-01-01

We present a new QM/QM/MM-based model for calculating molecular properties and excited states of solute-solvent systems. We denote this new approach the polarizable density embedding (PDE) model and it represents an extension of our previously developed polarizable embedding (PE) strategy. The PDE model is a focused computational approach in which a core region of the system studied is represented by a quantum-chemical method, whereas the environment is divided into two other regions: an inner and an outer region. Molecules belonging to the inner region are described by their exact densities...
Department of Energy: MICS (Mathematical, Information, and Computational Sciences Division). High performance computing and communications program

Energy Technology Data Exchange (ETDEWEB)

NONE

1996-06-01

This document is intended to serve two purposes. Its first purpose is that of a program status report of the considerable progress that the Department of Energy (DOE) has made since 1993, the time of the last such report (DOE/ER-0536, "The DOE Program in HPCC"), toward achieving the goals of the High Performance Computing and Communications (HPCC) Program. The second purpose is that of a summary report of the many research programs administered by the Mathematical, Information, and Computational Sciences (MICS) Division of the Office of Energy Research under the auspices of the HPCC Program and to provide, wherever relevant, easy access to pertinent information about MICS-Division activities via universal resource locators (URLs) on the World Wide Web (WWW). The information pointed to by the URL is updated frequently, and the interested reader is urged to access the WWW for the latest information.
Analysis and Modeling of Social Influence in High Performance Computing Workloads

KAUST Repository

Zheng, Shuai

2011-06-01

High Performance Computing (HPC) is becoming a common tool in many research areas. Social influence (e.g., project collaboration) among the growing number of users of HPC systems creates bursty behavior in underlying workloads. This bursty behavior is increasingly common with the advent of grid computing and cloud computing. Mining the users' bursty behavior is important for HPC workload prediction and scheduling, which has a direct impact on overall HPC computing performance. A representative work in this area is the Mixed User Group Model (MUGM), which clusters users according to the resource demand features of their submissions, such as duration time and parallelism. However, MUGM has some difficulties when implemented in real-world systems. First, representing user behaviors by the features of their resource demand is usually difficult. Second, these features are not always available. Third, measuring the similarities among users is not a well-defined problem. In this work, we propose a Social Influence Model (SIM) to identify, analyze, and quantify the level of social influence across HPC users. The advantage of the SIM model is that it finds HPC communities by analyzing user job submission time, thereby avoiding the difficulties of MUGM. An offline algorithm and a fast-converging, computationally-efficient online learning algorithm for identifying social groups are proposed. Both offline and online algorithms are applied to several HPC and grid workloads, including Grid 5000, EGEE 2005 and 2007, and KAUST Supercomputing Lab (KSL) BGP data. From the experimental results, we show the existence of a social graph, which is characterized by a pattern of dominant users and followers. In order to evaluate the effectiveness of identified user groups, we show that the pattern discovered by the offline algorithm follows a power-law distribution, which is consistent with those observed in mainstream social networks. We finally conclude the thesis and discuss future directions of our work.
High performance stream computing for particle beam transport simulations

International Nuclear Information System (INIS)

Appleby, R; Bailey, D; Higham, J; Salt, M

2008-01-01

Understanding modern particle accelerators requires simulating charged particle transport through the machine elements. These simulations can be very time consuming due to the large number of particles and the need to consider many turns of a circular machine. Stream computing offers an attractive way to dramatically improve the performance of such simulations by calculating the simultaneous transport of many particles using dedicated hardware. Modern Graphics Processing Units (GPUs) are powerful and affordable stream computing devices. The results of simulations of particle transport through the booster-to-storage-ring transfer line of the DIAMOND synchrotron light source using an NVidia GeForce 7900 GPU are compared to the standard transport code MAD. It is found that particle transport calculations are suitable for stream processing and large performance increases are possible. The accuracy and potential speed gains are compared and the prospects for future work in the area are discussed
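The reason particle transport suits stream processing is that, in the linear approximation, every particle is pushed through the same transfer matrix independently of all the others. The sketch below shows that structure in plain Python. It is an illustration under stated assumptions, not the code or lattice used in the record: the one-dimensional phase space (x, x'), the drift and thin-lens elements, and their lengths are hypothetical, and a list comprehension stands in for a GPU kernel.

```python
def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [
        [a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]],
        [a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]],
    ]


def drift(length):
    """Field-free section: position grows by length * angle."""
    return [[1.0, length], [0.0, 1.0]]


def thin_quad(focal_length):
    """Thin focusing lens: angle changes by -x / focal_length."""
    return [[1.0, 0.0], [-1.0 / focal_length, 1.0]]


def beamline_matrix(elements):
    """Combine elements listed in the order the beam meets them."""
    total = [[1.0, 0.0], [0.0, 1.0]]
    for m in elements:
        total = matmul2(m, total)  # later elements multiply from the left
    return total


def transport(m, particle):
    """Apply one transfer matrix to one particle (x, x')."""
    x, xp = particle
    return (m[0][0] * x + m[0][1] * xp, m[1][0] * x + m[1][1] * xp)


# Hypothetical line: 2 m drift followed by a thin lens with 2 m focal length.
line = beamline_matrix([drift(2.0), thin_quad(2.0)])

# The same map is applied to every particle; no particle depends on another,
# which is the property a stream processor exploits.
particles = [(1.0, 0.5), (0.0, 0.0), (-1.0, 0.25)]
transported = [transport(line, p) for p in particles]
print(transported)
```

Because the per-particle work is identical and independent, the list comprehension could be replaced by any data-parallel primitive without changing the result.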
Parametric analysis of electromechanical and fatigue performance of total knee replacement bearing with embedded piezoelectric transducers

Science.gov (United States)

Safaei, Mohsen; Meneghini, R. Michael; Anton, Steven R.

2017-09-01

Total knee arthroplasty is a common procedure in the United States; it has been estimated that about 4 million people are currently living with primary knee replacement in this country. Despite huge improvements in material properties, implant design, and surgical techniques, some implants fail a few years after surgery. A lack of information about in vivo kinetics of the knee prevents the establishment of a correlated intra- and postoperative loading pattern in knee implants. In this study, a conceptual design of an ultra high molecular weight (UHMW) knee bearing with embedded piezoelectric transducers is proposed, which is able to measure the reaction forces from knee motion as well as harvest energy to power embedded electronics. A simplified geometry consisting of a disk of UHMW with a single embedded piezoelectric ceramic is used in this work to study the general parametric trends of an instrumented knee bearing. A combined finite element and electromechanical modeling framework is employed to investigate the fatigue behavior of the instrumented bearing and the electromechanical performance of the embedded piezoelectric. The model is validated through experimental testing and utilized for further parametric studies. Parametric studies consist of the investigation of the effects of several dimensional and piezoelectric material parameters on the durability of the bearing and electrical output of the transducers. Among all the parameters, it is shown that adding large fillet radii results in noticeable improvement in the fatigue life of the bearing. Additionally, the design is highly sensitive to the depth of piezoelectric pocket. Finally, using PZT-5H piezoceramics, higher voltage and slightly enhanced fatigue life is achieved.
Extreme learning machine based optimal embedding location finder for image steganography.

Directory of Open Access Journals (Sweden)

Hayfaa Abdulzahra Atee

In image steganography, determining the optimum location for embedding the secret message precisely with minimum distortion of the host medium remains a challenging issue. Yet, an effective approach for the selection of the best embedding location with least deformation is far from being achieved. To attain this goal, we propose a novel high-performance approach for image steganography, in which the extreme learning machine (ELM) algorithm is modified to create a supervised mathematical model. This ELM is first trained on a part of an image or any host medium before being tested in the regression mode. This allowed us to choose the optimal location for embedding the message with best values of the predicted evaluation metrics. Contrast, homogeneity, and other texture features are used for training on a new metric. Furthermore, the developed ELM is exploited to counter over-fitting during training. The performance of the proposed steganography approach is evaluated by computing the correlation, structural similarity (SSIM) index, fusion matrices, and mean square error (MSE). The modified ELM is found to outperform the existing approaches in terms of imperceptibility. Excellent features of the experimental results demonstrate that the proposed steganographic approach is greatly proficient for preserving the visual information of an image. An improvement in imperceptibility of as much as 28% is achieved compared to the existing state-of-the-art methods.
Electrochemical properties for high surface area and improved electrical conductivity of platinum-embedded porous carbon nanofibers

Science.gov (United States)

An, Geon-Hyoung; Ahn, Hyo-Jin; Hong, Woong-Ki

2015-01-01

Four different types of carbon nanofibers (CNFs) for electrical double-layer capacitors (EDLCs), porous and non-porous CNFs with and without Pt metal nanoparticles, are synthesized by an electrospinning method and their performance in electrical double-layer capacitors (EDLCs) is characterized. In particular, the Pt-embedded porous CNFs (PCNFs) exhibit a high specific surface area of 670 m² g⁻¹, a large mesopore volume of 55.7%, and a low electrical resistance of 1.7 × 10³. The synergistic effects of the high specific surface area with a large mesopore volume, and superior electrical conductivity result in an excellent specific capacitance of 130.2 F g⁻¹, a good high-rate performance, superior cycling durability, and high energy density of 16.9-15.4 W h kg⁻¹ for the performance of EDLCs.
A State-Based Modeling Approach for Efficient Performance Evaluation of Embedded System Architectures at Transaction Level

Directory of Open Access Journals (Sweden)

Anthony Barreteau

2012-01-01

Abstract models are necessary to assist system architects in the evaluation process of hardware/software architectures and to cope with the still increasing complexity of embedded systems. Efficient methods are required to create reliable models of system architectures and to allow early performance evaluation and fast exploration of the design space. In this paper, we present a specific transaction level modeling approach for performance evaluation of hardware/software architectures. This approach relies on a generic execution model that requires little modeling effort. Created models are used to evaluate by simulation expected processing and memory resources according to various architectures. The proposed execution model relies on a specific computation method defined to improve the simulation speed of transaction level models. The benefits of the proposed approach are highlighted through two case studies. The first case study is a didactic example illustrating the modeling approach. In this example, a simulation speed-up by a factor of 7.62 is achieved by using the proposed computation method. The second case study concerns the analysis of a communication receiver supporting part of the physical layer of the LTE protocol. In this case study, architecture exploration is carried out in order to improve the allocation of processing functions.
Comparison in performance of sediment microbial fuel cells according to depth of embedded anode.

Science.gov (United States)

An, Junyeong; Kim, Bongkyu; Nam, Jonghyeon; Ng, How Yong; Chang, In Seop

2013-01-01

Five rigid graphite plates were embedded in evenly divided sections of sediment, ranging from 2 cm (A1) to 10 cm (A5) below the top sediment layer. The maximum power and current of the MFCs increased in depth order; however, despite the increase in the internal resistance, the power and current density of the A5 MFC were 2.2 and 3.5 times higher, respectively, than those of the A1 MFC. In addition, the anode open circuit potentials (OCPs) of the sediment microbial fuel cells (SMFCs) became more negative with sediment depth. Based on these results, it can be concluded that as the anode-embedding depth increases, the anode environment becomes thermodynamically and kinetically favorable to anodophiles or electrophiles. Therefore, the anode-embedding depth should be considered an important parameter that determines the performance of SMFCs, and we posit that the anode potential could be one indicator for selecting the anode-embedding depth. Copyright © 2012 Elsevier Ltd. All rights reserved.
Embedded Control in Wearable Medical Devices: Application to the Artificial Pancreas

Directory of Open Access Journals (Sweden)

Stamatina Zavitsanou

2016-09-01

Significant increases in processing power, coupled with the miniaturization of processing units operating at low power levels, have motivated the embedding of modern control systems into medical devices. The design of such embedded decision-making strategies for medical applications is driven by multiple crucial factors, such as: (i) guaranteed safety in the presence of exogenous disturbances and unexpected system failures; (ii) constraints on computing resources; (iii) portability and longevity in terms of size and power consumption; and (iv) constraints on manufacturing and maintenance costs. Embedded control systems are especially compelling in the context of modern artificial pancreas (AP) systems used in glucose regulation for patients with type 1 diabetes mellitus (T1DM). Herein, a review of potential embedded control strategies that can be leveraged in a fully-automated and portable AP is presented. Amongst competing controllers, emphasis is placed on model predictive control (MPC), since it has been established as a very promising control strategy for glucose regulation using the AP. Challenges involved in the design, implementation and validation of safety-critical embedded model predictive controllers for the AP application are discussed in detail. Additionally, the computational expenditure inherent to MPC strategies is investigated, and a comparative study of runtime performances and storage requirements among modern quadratic programming solvers is reported for a desktop environment and a prototype hardware platform.
Compact FPGA hardware architecture for public key encryption in embedded devices.

Science.gov (United States)

Rodríguez-Flores, Luis; Morales-Sandoval, Miguel; Cumplido, René; Feregrino-Uribe, Claudia; Algredo-Badillo, Ignacio

2018-01-01

Security is a crucial requirement in the envisioned applications of the Internet of Things (IoT), where most of the underlying computing platforms are embedded systems with reduced computing capabilities and energy constraints. In this paper we present the design and evaluation of a scalable low-area FPGA hardware architecture that serves as a building block to accelerate the costly operations of exponentiation and multiplication in [Formula: see text], commonly required in security protocols relying on public key encryption, such as in key agreement, authentication and digital signature. The proposed design can process operands of different sizes using the same datapath, which exhibits a significant reduction in area without loss of efficiency when compared to representative state-of-the-art designs. For example, our design uses 96% less standard logic than a similar design optimized for performance, and 46% fewer resources than another design optimized for area. Even using fewer area resources, our design still performs better than its embedded software counterparts (190x and 697x).
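Exponentiation in public key schemes is normally built from repeated multiplication, which is why the two operations are accelerated together. The sketch below shows the textbook right-to-left square-and-multiply method in software. It is not the architecture from the record: the algebraic structure is elided in the abstract ("[Formula: see text]"), so integers modulo `m` are assumed purely for illustration, and `mod_mul` and `mod_exp` are hypothetical names.

```python
def mod_mul(a, b, m):
    """Modular multiplication, the primitive that exponentiation reuses."""
    return (a * b) % m


def mod_exp(base, exponent, m):
    """Right-to-left square-and-multiply: base**exponent mod m.

    One squaring per exponent bit, plus one extra multiplication for each
    bit that is set, so the cost grows with the bit length of the exponent.
    """
    if exponent < 0:
        raise ValueError("exponent must be non-negative")
    result = 1 % m          # handles m == 1 correctly
    base %= m
    while exponent > 0:
        if exponent & 1:                     # current bit set: multiply in
            result = mod_mul(result, base, m)
        base = mod_mul(base, base, m)        # square for the next bit
        exponent >>= 1
    return result


print(mod_exp(4, 13, 497))
```

Because the same `mod_mul` step serves both the squaring and the conditional multiply, a single multiplier datapath can in principle be shared between them, which is one common motivation for low-area designs.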
Performance Management of High Performance Computing for Medical Image Processing in Amazon Web Services.

Science.gov (United States)

Bao, Shunxing; Damon, Stephen M; Landman, Bennett A; Gokhale, Aniruddha

2016-02-27

Adopting high performance cloud computing for medical image processing is a popular trend given the pressing needs of large studies. Amazon Web Services (AWS) provide reliable, on-demand, and inexpensive cloud computing services. Our research objective is to implement an affordable, scalable and easy-to-use AWS framework for the Java Image Science Toolkit (JIST). JIST is a plugin for Medical-Image Processing, Analysis, and Visualization (MIPAV) that provides a graphical pipeline implementation allowing users to quickly test and develop pipelines. JIST is DRMAA-compliant allowing it to run on portable batch system grids. However, as new processing methods are implemented and developed, memory may often be a bottleneck for not only lab computers, but also possibly some local grids. Integrating JIST with the AWS cloud alleviates these possible restrictions and does not require users to have deep knowledge of programming in Java. Workflow definition/management and cloud configurations are two key challenges in this research. Using a simple unified control panel, users have the ability to set the numbers of nodes and select from a variety of pre-configured AWS EC2 nodes with different numbers of processors and memory storage. Intuitively, we configured Amazon S3 storage to be mounted by pay-for-use Amazon EC2 instances. Hence, S3 storage is recognized as a shared cloud resource. The Amazon EC2 instances provide pre-installs of all necessary packages to run JIST. This work presents an implementation that facilitates the integration of JIST with AWS. We describe the theoretical cost/benefit formulae to decide between local serial execution versus cloud computing and apply this analysis to an empirical diffusion tensor imaging pipeline.
Performance management of high performance computing for medical image processing in Amazon Web Services

Science.gov (United States)

Bao, Shunxing; Damon, Stephen M.; Landman, Bennett A.; Gokhale, Aniruddha

2016-03-01

Adopting high performance cloud computing for medical image processing is a popular trend given the pressing needs of large studies. Amazon Web Services (AWS) provide reliable, on-demand, and inexpensive cloud computing services. Our research objective is to implement an affordable, scalable and easy-to-use AWS framework for the Java Image Science Toolkit (JIST). JIST is a plugin for Medical-Image Processing, Analysis, and Visualization (MIPAV) that provides a graphical pipeline implementation allowing users to quickly test and develop pipelines. JIST is DRMAA-compliant allowing it to run on portable batch system grids. However, as new processing methods are implemented and developed, memory may often be a bottleneck for not only lab computers, but also possibly some local grids. Integrating JIST with the AWS cloud alleviates these possible restrictions and does not require users to have deep knowledge of programming in Java. Workflow definition/management and cloud configurations are two key challenges in this research. Using a simple unified control panel, users have the ability to set the numbers of nodes and select from a variety of pre-configured AWS EC2 nodes with different numbers of processors and memory storage. Intuitively, we configured Amazon S3 storage to be mounted by pay-for-use Amazon EC2 instances. Hence, S3 storage is recognized as a shared cloud resource. The Amazon EC2 instances provide pre-installs of all necessary packages to run JIST. This work presents an implementation that facilitates the integration of JIST with AWS. We describe the theoretical cost/benefit formulae to decide between local serial execution versus cloud computing and apply this analysis to an empirical diffusion tensor imaging pipeline.
Gas expulsion in highly substructured embedded star clusters

Science.gov (United States)

Farias, J. P.; Fellhauer, M.; Smith, R.; Domínguez, R.; Dabringhausen, J.

2018-06-01

We investigate the response of initially substructured, young, embedded star clusters to instantaneous expulsion of their natal gas. We introduce primordial substructure to the stars and the gas by simplistically modelling the star formation process so as to obtain a variety of substructure distributed within our modelled star-forming regions. We show that, by measuring the virial ratio of the stars alone (disregarding the gas completely), we can estimate how much mass a star cluster will retain after gas expulsion to within 10 per cent accuracy, no matter how complex the background structure of the gas is, and we present a simple analytical recipe describing this behaviour. We show that the evolution of the star cluster while still embedded in the natal gas, and the behaviour of the gas before being expelled, are crucial processes that affect the time-scale on which the cluster can evolve into a virialized spherical system. Embedded star clusters that have high levels of substructure are subvirial for longer times, enabling them to survive gas expulsion better than a virialized and spherical system. By using a more realistic treatment for the background gas than our previous studies, we find it very difficult to destroy the young clusters with instantaneous gas expulsion. We conclude that gas removal may not be the main culprit for the dissolution of young star clusters.
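For context, the virial ratio mentioned in this abstract is conventionally defined as Q = T/|Ω|, the total kinetic energy divided by the magnitude of the gravitational potential energy, with Q = 0.5 corresponding to virial equilibrium and Q < 0.5 to a subvirial system. The following is a minimal sketch of measuring Q for the stars alone in an N-body snapshot. The function name, the choice of units (G = 1), and the direct pairwise summation are illustrative assumptions, not the authors' code.

```python
import math

def virial_ratio(masses, positions, velocities, G=1.0):
    """Return Q = T / |Omega| for a set of point masses (stars only).

    Hypothetical helper: direct O(N^2) pairwise summation, no softening.
    """
    n = len(masses)
    total_mass = sum(masses)
    # Centre-of-mass velocity, so bulk motion is not counted as kinetic energy.
    v_com = [sum(m * v[k] for m, v in zip(masses, velocities)) / total_mass
             for k in range(3)]
    kinetic = 0.0
    for m, v in zip(masses, velocities):
        kinetic += 0.5 * m * sum((v[k] - v_com[k]) ** 2 for k in range(3))
    potential = 0.0
    for i in range(n):
        for j in range(i + 1, n):
            separation = math.dist(positions[i], positions[j])
            potential -= G * masses[i] * masses[j] / separation
    return kinetic / abs(potential)

# Two equal masses on a circular orbit of separation 1 (G = 1): a bound,
# virialized pair, so Q should come out at 0.5.
v = math.sqrt(0.5)
q = virial_ratio([1.0, 1.0],
                 [(-0.5, 0.0, 0.0), (0.5, 0.0, 0.0)],
                 [(0.0, -v, 0.0), (0.0, v, 0.0)])
print(round(q, 6))
```

Halving both velocities in the example would give Q = 0.125, a subvirial configuration of the kind the abstract describes as surviving gas expulsion better.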
Security Implications for Ultra-Low Power Configurable SoC FPAA Embedded Systems

Directory of Open Access Journals (Sweden)

Jennifer Hasler

2018-06-01

We discuss the impact of physical computing techniques on classifying network security issues for ultra-low power networked IoT devices. Physical computing approaches enable at least a factor of 1000 improvement in computational energy efficiency, empowering a new generation of local computational structures for embedded IoT devices. These techniques offer computational capability to address network security concerns. This paper begins the discussion of security opportunities for, and issues using, FPAA devices for small embedded IoT platforms. These FPAAs enable devices often utilized for low-power context-aware computation. Embedded FPAA devices have both positive security attributes and potential vulnerabilities. FPAA devices can be part of the resulting secure computation, such as implementing unique functions. FPAA devices can be used to investigate the security of analog/mixed-signal capabilities. The paper concludes by summarizing key improvements for secure ultra-low power embedded FPAA devices.
THE IMPROVEMENT OF COMPUTER NETWORK PERFORMANCE WITH BANDWIDTH MANAGEMENT IN KEMURNIAN II SENIOR HIGH SCHOOL

Directory of Open Access Journals (Sweden)

Bayu Kanigoro

2012-05-01

This research describes the improvement of computer network performance through bandwidth management at Kemurnian II Senior High School. The main issue is the absence of bandwidth division among computers: when one user downloads data, that user absorbs the available bandwidth, leaving other users without any. In addition, IP addresses have been divided by room (computer, teacher and administration rooms) to support the learning process at Kemurnian II Senior High School, so a wireless network is needed. The method consists of on-site observation and interviews with related parties at the school, analysis of the running network, and the design of a new network topology, including the wireless network along with its configuration and the bandwidth separation and limits on a MikroTik router. As a result, network traffic at Kemurnian II Senior High School can be shared evenly among users; IX and IIX traffic are separated, which improves network access speed at the school; and a wireless network is implemented. Keywords: Bandwidth Management; Wireless Network
Contributing to the design of run-time systems dedicated to high performance computing

International Nuclear Information System (INIS)

Perache, M.

2006-10-01

In the field of intensive scientific computing, the quest for performance has to face the increasing complexity of parallel architectures. Nowadays, these machines exhibit a deep memory hierarchy which complicates the design of efficient parallel applications. This thesis proposes a programming environment for designing efficient parallel programs on top of clusters of multi-processors. It features a programming model centered around collective communications and synchronizations, and provides load balancing facilities. The programming interface, named MPC, provides high level paradigms which are optimized according to the underlying architecture. The environment is fully functional and used within the CEA/DAM (TERANOVA) computing center. The evaluations presented in this document confirm the relevance of our approach. (author)
Glass-embedded two-dimensional silicon photonic crystal devices with a broad bandwidth waveguide and a high quality nanocavity.

Science.gov (United States)

Jeon, Seung-Woo; Han, Jin-Kyu; Song, Bong-Shik; Noda, Susumu

2010-08-30

To enhance the mechanical stability of a two-dimensional photonic crystal slab structure and maintain its excellent performance, we designed a glass-embedded silicon photonic crystal device consisting of a broad bandwidth waveguide and a nanocavity with a high quality (Q) factor, and then fabricated the structure using spin-on glass (SOG). Furthermore, we showed that the refractive index of the SOG could be tuned from 1.37 to 1.57 by varying the curing temperature of the SOG. Finally, we demonstrated a glass-embedded heterostructured cavity with an ultrahigh Q factor of 160,000 by adjusting the refractive index of the SOG.
Co-design for an SoC embedded network controller

Institute of Scientific and Technical Information of China (English)

None

2006-01-01

With the development of Ethernet systems and the growing capacity of modern silicon technology, embedded communication networks are playing an increasingly important role in embedded and safety-critical systems. Hardware/software co-design is a methodology for solving design problems in processor-based embedded systems. In this work, we implemented a new 1-cycle pipeline microprocessor and a fast Ethernet transceiver, established a low-cost, high-performance embedded network controller, and designed a TCP/IP stack to access the Internet. We first discuss the hardware/software architecture, and then the whole system-on-a-chip on an Altera Stratix EP1S25F780C6 device. Using the FPGA environment and a SmartBit tester, we tested the system's throughput. Our results showed that the maximum throughput of Ethernet packets is up to 7 Mbps, that of UDP packets is up to 5.8 Mbps, and that of TCP packets is up to 3.4 Mbps. This shows that the embedded system can easily transmit basic voice and video signals over Ethernet, and that a single chip can give many electronic devices direct, high-performance access to the Internet.

T and D-Bench--Innovative Combined Support for Education and Research in Computer Architecture and Embedded Systems

Science.gov (United States)

Soares, S. N.; Wagner, F. R.

2011-01-01

Teaching and Design Workbench (T&D-Bench) is a framework aimed at education and research in the areas of computer architecture and embedded systems. It includes a set of features not found in other educational environments. This set of features is the result of an original combination of design requirements for T&D-Bench: that the…
Design and implement of pack filter module base on embedded firewall

Science.gov (United States)

Tian, Libo; Wang, Chen; Yang, Shunbo

2011-10-01

In traditional security solutions, a software firewall cannot intercept and respond to an intrusion before the attack takes effect. And because of its high cost, a hardware firewall is not suited to the security strategy of end nodes, so we have designed an embedded firewall solution that combines hardware and software. Using an ARM processor running an embedded Linux operating system, we have designed a packet filter module and an intrusion detection module to implement the basic functions of a firewall. Experiments and results show that the firewall has the advantages of low cost, high processing speed, high safety and applicability to computer terminals. This paper focuses on the design and implementation of the packet filtering module.
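The abstract does not describe how the packet filter module works internally. As a rough illustration of the general technique, the sketch below implements a first-match rule table with a default-deny policy. The `Rule` fields, the `filter_packet` function and the example rule set are all hypothetical and are not taken from the paper; a real module on embedded Linux would typically hook into the kernel's packet path rather than run in Python.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Rule:
    action: str                      # "accept" or "drop"
    src: str = "0.0.0.0/0"           # source network in CIDR notation
    protocol: Optional[str] = None   # "tcp", "udp", or None for any
    dst_port: Optional[int] = None   # destination port, or None for any

def filter_packet(rules, src_ip, protocol, dst_port, default="drop"):
    """Return the action of the first matching rule, else the default policy."""
    addr = ipaddress.ip_address(src_ip)
    for rule in rules:
        if addr not in ipaddress.ip_network(rule.src):
            continue
        if rule.protocol is not None and rule.protocol != protocol:
            continue
        if rule.dst_port is not None and rule.dst_port != dst_port:
            continue
        return rule.action
    return default

# Hypothetical rule set: block one private range, allow web traffic,
# and drop everything else through the default policy.
RULES = [
    Rule("drop", src="10.0.0.0/8"),
    Rule("accept", protocol="tcp", dst_port=80),
]

print(filter_packet(RULES, "10.1.2.3", "tcp", 80))     # blocked source range
print(filter_packet(RULES, "192.168.1.5", "tcp", 80))  # allowed web traffic
print(filter_packet(RULES, "192.168.1.5", "udp", 53))  # no rule matches
```

Rule order matters in a first-match design: the block on 10.0.0.0/8 must precede the port-80 rule, otherwise web traffic from that range would be accepted.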
Unified, Cross-Platform, Open-Source Library Package for High-Performance Computing

Energy Technology Data Exchange (ETDEWEB)

Kozacik, Stephen [EM Photonics, Inc., Newark, DE (United States)

2017-05-15

Compute power is continually increasing, but this increased performance is largely found in sophisticated computing devices and supercomputer resources that are difficult to use, resulting in under-utilization. We developed a unified set of programming tools that will allow users to take full advantage of the new technology by allowing them to work at a level abstracted away from the platform specifics, encouraging the use of modern computing systems, including government-funded supercomputer facilities.
A Case Study on Neural Inspired Dynamic Memory Management Strategies for High Performance Computing.

Energy Technology Data Exchange (ETDEWEB)

Vineyard, Craig Michael [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Verzi, Stephen Joseph [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

2017-09-01

As high performance computing architectures pursue more computational power there is a need for increased memory capacity and bandwidth as well. A multi-level memory (MLM) architecture addresses this need by combining multiple memory types with different characteristics as varying levels of the same architecture. How to efficiently utilize this memory infrastructure is an unknown challenge, and in this research we sought to investigate whether neural inspired approaches can meaningfully help with memory management. In particular we explored neurogenesis inspired resource allocation, and were able to show a neural inspired mixed controller policy can beneficially impact how MLM architectures utilize memory.
Modeling of High-Speed InP DHBTs using Electromagnetic Simulation Based De-embedding

DEFF Research Database (Denmark)

Johansen, Tom Keinicke; Krozer, Viktor; Konczykowska, Agnieszka

2006-01-01

In this paper an approach for high-speed InP DHBT modeling valid to 110 GHz is reported. Electromagnetic (EM) simulation is applied to predict the embedded network model caused by pad parasitics. The form of the parasitic network calls for a 4-step de-embedding approach. Applying direct parameter extraction on the de-embedded device response leads to an accurate small-signal model description of the InP DHBT. A parameter extraction approach is described for the Agilent HBT model, which assures consistency between large-signal and bias-dependent small-signal modeling.
Real-Time Operating Systems for Multicore Embedded Systems

OpenAIRE

Tomiyama, Hiroyuki; Honda, Shinya; Takada, Hiroaki

2008-01-01

Multicore systems-on-chip have become popular in the design of embedded systems in order to simultaneously achieve high performance and low power consumption. On the software side, real-time operating systems are necessary in order to handle the growing complexity of embedded software. This paper describes requirements, design principles and implementation techniques for real-time operating systems to be used in asymmetric multicore systems.
A parallel calibration utility for WRF-Hydro on high performance computers

Science.gov (United States)

Wang, J.; Wang, C.; Kotamarthi, V. R.

2017-12-01

Successful modeling of complex hydrological processes comprises establishing an integrated hydrological model which simulates the hydrological processes in each water regime, calibrating and validating the model performance based on observation data, and estimating the uncertainties from different sources, especially those associated with parameters. Such a model system requires large computing resources and often has to be run on High Performance Computers (HPC). The recently developed WRF-Hydro modeling system provides a significant advancement in the capability to simulate regional water cycles more completely. The WRF-Hydro model has a large range of parameters, such as those in the input table files (GENPARM.TBL, SOILPARM.TBL and CHANPARM.TBL) and several distributed scaling factors such as OVROUGHRTFAC. These parameters affect the behavior and outputs of the model and thus may need to be calibrated against observations in order to obtain good modeling performance. A parameter calibration tool built specifically for automated calibration and uncertainty estimation of the WRF-Hydro model can provide significant convenience for the modeling community. In this study, we developed a customized tool using the parallel version of the model-independent parameter estimation and uncertainty analysis tool, PEST, to enable it to run on HPC with the PBS and SLURM workload managers and job schedulers. We also developed a series of PEST input file templates specifically for WRF-Hydro model calibration and uncertainty analysis. Here we present a case study of a flood that occurred in April 2013 over the Midwest. The sensitivity and uncertainties are analyzed using the customized PEST tool we developed.
Accessible high performance computing solutions for near real-time image processing for time critical applications

Science.gov (United States)

Bielski, Conrad; Lemoine, Guido; Syryczynski, Jacek

2009-09-01

High Performance Computing (HPC) hardware solutions such as grid computing and General-Purpose computing on Graphics Processing Units (GPGPU) are now accessible to users with general computing needs. Grid computing infrastructures in the form of computing clusters or blades are becoming commonplace, and GPGPU solutions that leverage the processing power of the video card are quickly being integrated into personal workstations. Our interest in these HPC technologies stems from the need to produce near real-time maps from a combination of pre- and post-event satellite imagery in support of post-disaster management. Faster processing provides a twofold gain in this situation: (1) critical information can be provided faster and (2) more elaborate automated processing can be performed prior to providing the critical information. In our particular case, we test the use of the PANTEX index, which is based on analysis of image textural measures extracted using anisotropic, rotation-invariant GLCM statistics. The use of this index, applied in a moving window, has been shown to successfully identify built-up areas in remotely sensed imagery. Built-up index image masks are an important input to the structuring of damage assessment interpretation because they help optimise the workload. The performance of computing the PANTEX workflow is compared on two different HPC hardware architectures: (1) a blade server with 4 blades, each having dual quad-core CPUs and (2) a CUDA-enabled GPU workstation. The reference platform is a dual-CPU, quad-core workstation, and the total computing time of the PANTEX workflow is measured. Furthermore, as part of a qualitative evaluation, the differences in setting up and configuring the various hardware solutions and the related software coding effort are presented.
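To make the texture measure more concrete, the sketch below computes the grey-level co-occurrence matrix (GLCM) contrast of one image window for several displacement directions and takes the minimum over directions. Taking the minimum is one common way to make a directional (anisotropic) statistic rotation-invariant; whether it matches the exact PANTEX definition used by the authors is an assumption here, as are the function names, the four displacements and the toy windows. A production workflow would apply this in a moving window over the image and would normally use an optimized library or GPU kernel.

```python
from collections import Counter

def glcm_contrast(window, dy, dx):
    """GLCM contrast of a 2-D list of integer grey levels for one displacement."""
    rows, cols = len(window), len(window[0])
    counts = Counter()  # co-occurrence counts of grey-level pairs (i, j)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(window[r][c], window[r2][c2])] += 1
    total = sum(counts.values())
    # Contrast = sum over (i, j) of (i - j)^2 * p(i, j), p = normalized counts.
    return sum((i - j) ** 2 * n for (i, j), n in counts.items()) / total

def min_contrast(window, displacements=((0, 1), (1, 0), (1, 1), (1, -1))):
    """Minimum GLCM contrast over several displacement directions."""
    return min(glcm_contrast(window, dy, dx) for dy, dx in displacements)

# A striped window (a linear feature, e.g. a road) has zero contrast along
# the stripes, so its minimum is low; a blocky window has contrast in
# every direction, so its minimum stays high.
stripes = [[0, 0, 0, 0],
           [3, 3, 3, 3],
           [0, 0, 0, 0],
           [3, 3, 3, 3]]
blocks = [[0, 0, 3, 3],
          [0, 0, 3, 3],
          [3, 3, 0, 0],
          [3, 3, 0, 0]]

print(min_contrast(stripes), min_contrast(blocks))
```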
Exploring Infiniband Hardware Virtualization in OpenNebula towards Efficient High-Performance Computing

Energy Technology Data Exchange (ETDEWEB)

Pais Pitta de Lacerda Ruivo, Tiago [IIT, Chicago; Bernabeu Altayo, Gerard [Fermilab; Garzoglio, Gabriele [Fermilab; Timm, Steven [Fermilab; Kim, Hyun-Woo [Fermilab; Noh, Seo-Young [KISTI, Daejeon; Raicu, Ioan [IIT, Chicago

2014-11-11

It has been widely accepted that software virtualization has a big negative impact on high-performance computing (HPC) application performance. This work explores the potential use of Infiniband hardware virtualization in an OpenNebula cloud towards the efficient support of MPI-based workloads. We have implemented, deployed, and tested an Infiniband network on the FermiCloud private Infrastructure-as-a-Service (IaaS) cloud. To avoid software virtualization and so minimize the virtualization overhead, we employed a technique called Single Root Input/Output Virtualization (SRIOV). Our solution spanned modifications to the Linux hypervisor as well as the OpenNebula manager. We evaluated the performance of the hardware virtualization on up to 56 virtual machines connected by up to 8 DDR Infiniband network links, with micro-benchmarks (latency and bandwidth) as well as an MPI-intensive application (the HPL Linpack benchmark).
Deep embedding convolutional neural network for synthesizing CT image from T1-Weighted MR image.

Science.gov (United States)

Xiang, Lei; Wang, Qian; Nie, Dong; Zhang, Lichi; Jin, Xiyao; Qiao, Yu; Shen, Dinggang

2018-07-01

Recently, more and more attention has been drawn to the field of medical image synthesis across modalities. In particular, the synthesis of a computed tomography (CT) image from a T1-weighted magnetic resonance (MR) image is of great importance, although the mapping between them is highly complex due to the large gaps in appearance between the two modalities. In this work, we aim to tackle this MR-to-CT synthesis task with a novel deep embedding convolutional neural network (DECNN). Specifically, we generate the feature maps from MR images, and then transform these feature maps forward through convolutional layers in the network. We can further compute a tentative CT synthesis from the midway of the flow of feature maps, and then embed this tentative CT synthesis result back into the feature maps. This embedding operation results in better feature maps, which are further transformed forward in DECNN. After repeating this embedding procedure several times in the network, we can eventually synthesize a final CT image at the end of the DECNN. We have validated our proposed method on both brain and prostate imaging datasets, also comparing it with state-of-the-art methods. Experimental results suggest that our DECNN (with repeated embedding operations) demonstrates superior performance, in terms of both the perceptive quality of the synthesized CT image and the run-time cost for synthesizing a CT image. Copyright © 2018. Published by Elsevier B.V.
Heads in the Cloud: A Primer on Neuroimaging Applications of High Performance Computing.

Science.gov (United States)

Shatil, Anwar S; Younas, Sohail; Pourreza, Hossein; Figley, Chase R

2015-01-01

With larger data sets and more sophisticated analyses, it is becoming increasingly common for neuroimaging researchers to push (or exceed) the limitations of standalone computer workstations. Nonetheless, although high-performance computing platforms such as clusters, grids and clouds are already in routine use by a small handful of neuroimaging researchers to increase their storage and/or computational power, the adoption of such resources by the broader neuroimaging community remains relatively uncommon. Therefore, the goal of the current manuscript is to: 1) inform prospective users about the similarities and differences between computing clusters, grids and clouds; 2) highlight their main advantages; 3) discuss when it may (and may not) be advisable to use them; 4) review some of their potential problems and barriers to access; and finally 5) give a few practical suggestions for how interested new users can start analyzing their neuroimaging data using cloud resources. Although the aim of cloud computing is to hide most of the complexity of the infrastructure management from end-users, we recognize that this can still be an intimidating area for cognitive neuroscientists, psychologists, neurologists, radiologists, and other neuroimaging researchers lacking a strong computational background. Therefore, with this in mind, we have aimed to provide a basic introduction to cloud computing in general (including some of the basic terminology, computer architectures, infrastructure and service models, etc.), a practical overview of the benefits and drawbacks, and a specific focus on how cloud resources can be used for various neuroimaging applications.
Monitoring performance of a highly distributed and complex computing infrastructure in LHCb

Science.gov (United States)

Mathe, Z.; Haen, C.; Stagni, F.

2017-10-01

In order to ensure optimal performance of the LHCb Distributed Computing, based on LHCbDIRAC, it is necessary to be able to inspect the behavior over time of many components: firstly the agents and services on which the infrastructure is built, but also all the computing tasks and data transfers that are managed by this infrastructure. This consists of recording and then analyzing time series of a large number of observables, for which the usage of SQL relational databases is far from optimal. Therefore within DIRAC we have been studying novel possibilities based on NoSQL databases (ElasticSearch, OpenTSDB and InfluxDB). As a result of this study, we developed a new monitoring system based on ElasticSearch. It has been deployed on the LHCb Distributed Computing infrastructure, for which it collects data from all the components (agents, services, jobs) and allows creating reports through Kibana and a web user interface, which is based on the DIRAC web framework. In this paper we describe this new implementation of the DIRAC monitoring system. We give details on the ElasticSearch implementation within the DIRAC general framework, as well as an overview of the advantages of the pipeline aggregation used for creating a dynamic bucketing of the time series. We present the advantages of using the ElasticSearch DSL high-level library for creating and running queries. Finally we present the performance of the system.
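The idea of dynamic bucketing of a time series can be illustrated independently of any database: choose the bucket width from the time span being plotted, so that the number of points in a report stays bounded, then aggregate the samples in each bucket. The sketch below does this in plain Python. It is only an illustration of the concept; the function name, the candidate widths and the averaging are assumptions, and it does not reproduce the ElasticSearch aggregation API or the DIRAC implementation.

```python
def bucket_series(samples, max_buckets=4, widths=(60, 300, 3600, 86400)):
    """Average (timestamp, value) samples into time buckets.

    The bucket width (in seconds) is the smallest candidate that keeps the
    number of buckets within max_buckets; if none does, the largest
    candidate width is used.
    """
    if not samples:
        return []
    start = min(t for t, _ in samples)
    span = max(t for t, _ in samples) - start
    width = next((w for w in widths if span // w + 1 <= max_buckets), widths[-1])
    sums = {}
    for t, value in samples:
        key = start + ((t - start) // width) * width  # start time of the bucket
        total, count = sums.get(key, (0.0, 0))
        sums[key] = (total + value, count + 1)
    return [(key, total / count) for key, (total, count) in sorted(sums.items())]

# Hypothetical samples: (seconds since start, observed value).
samples = [(0, 1.0), (30, 3.0), (90, 5.0), (400, 7.0)]
coarse = bucket_series(samples)                # 5-minute buckets are chosen
fine = bucket_series(samples, max_buckets=10)  # 1-minute buckets fit
print(coarse)
print(fine)
```

With a budget of four buckets the 400-second span forces 300-second buckets, while a budget of ten allows 60-second buckets, which is the sense in which the bucketing adapts to the requested time range.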
High performance in software development

CERN Multimedia

CERN. Geneva; Haapio, Petri; Liukkonen, Juha-Matti

2015-01-01

What are the ingredients of high-performing software? Software development, especially for large high-performance systems, is one of the most complex tasks mankind has ever tried. Technological change leads to huge opportunities but challenges our old ways of working. Processing large data sets, possibly in real time or with other tight computational constraints, requires an efficient solution architecture. Efficiency requirements span from the distributed storage and large-scale organization of computation and data onto the lowest level of processor and data bus behavior. Integrating performance behavior over these levels is especially important when the computation is resource-bounded, as it is in numerics: physical simulation, machine learning, estimation of statistical models, etc. For example, memory locality and utilization of vector processing are essential for harnessing the computing power of modern processor architectures due to the deep memory hierarchies of modern general-purpose computers. As a r...
MPC Related Computational Capabilities of ARMv7A Processors

DEFF Research Database (Denmark)

Frison, Gianluca; Jørgensen, John Bagterp

2015-01-01

In recent years, the mass market of mobile devices has pushed the demand for increasingly fast but cheap processors. ARM, the world leader in this sector, has developed the Cortex-A series of processors with focus on computationally intensive applications. If properly programmed, these processors are powerful enough to solve the complex optimization problems arising in MPC in real-time, while keeping the traditional low-cost and low-power consumption. This makes these processors ideal candidates for use in embedded MPC. In this paper, we investigate the floating-point capabilities of Cortex A7, A9 and A15 and show how to exploit the unique features of each processor to obtain the best performance, in the context of a novel implementation method for the linear-algebra routines used in MPC solvers. This method adapts high-performance computing techniques to the needs of embedded MPC. In particular...
Analytical thermal modelling of multilayered active embedded chips into high density electronic board

Directory of Open Access Journals (Sweden)

Monier-Vinard Eric

2013-01-01

The recent Printed Wiring Board embedding technology is an attractive packaging alternative that allows a very high degree of miniaturization by stacking multiple layers of embedded chips. This disruptive technology will further increase the thermal management challenges by concentrating heat dissipation at the heart of the organic substrate structure. An analytical thermal modelling approach was established to allow the electronic designer to analyze, early in the design, the limits of the power dissipation as a function of the embedded chip location inside the board, as well as the thermal interactions with other buried chips or surface-mounted electronic components. The presented work compares the analytical model results with numerical models of various embedded chip configurations. The thermal behaviour predictions of the analytical model, found to be within ±10% relative error, demonstrate its relevance for modelling high density electronic boards. In addition, the approach offers a practical way to study the potential gain from conducting part of the heat flow from the components towards a set of localized cooled board pads.
Leveraging the Power of High Performance Computing for Next Generation Sequencing Data Analysis: Tricks and Twists from a High Throughput Exome Workflow

Science.gov (United States)

Wonczak, Stephan; Thiele, Holger; Nieroda, Lech; Jabbari, Kamel; Borowski, Stefan; Sinha, Vishal; Gunia, Wilfried; Lang, Ulrich; Achter, Viktor; Nürnberg, Peter

2015-01-01

Next generation sequencing (NGS) has been a great success and is now a standard method of research in the life sciences. With this technology, dozens of whole genomes or hundreds of exomes can be sequenced in rather short time, producing huge amounts of data. Complex bioinformatics analyses are required to turn these data into scientific findings. In order to run these analyses fast, automated workflows implemented on high performance computers are state of the art. While providing sufficient compute power and storage to meet the NGS data challenge, high performance computing (HPC) systems require special care when utilized for high throughput processing. This is especially true if the HPC system is shared by different users. Here, stability, robustness and maintainability are as important for automated workflows as speed and throughput. To achieve all of these aims, dedicated solutions have to be developed. In this paper, we present the tricks and twists that we utilized in the implementation of our exome data processing workflow. It may serve as a guideline for other high throughput data analysis projects using a similar infrastructure. The code implementing our solutions is provided in the supporting information files. PMID:25942438
High-performance floating-point image computing workstation for medical applications

Science.gov (United States)

Mills, Karl S.; Wong, Gilman K.; Kim, Yongmin

1990-07-01

The medical imaging field relies increasingly on imaging and graphics techniques in diverse applications with needs similar to (or more stringent than) those of the military, industrial and scientific communities. However, most image processing and graphics systems available for use in medical imaging today are either expensive, specialized, or in most cases both. High performance imaging and graphics workstations which can provide real-time results for a number of applications, while maintaining affordability and flexibility, can facilitate the application of digital image computing techniques in many different areas. This paper describes the hardware and software architecture of a medium-cost floating-point image processing and display subsystem for the NeXT computer, and its applications as a medical imaging workstation. Medical imaging applications of the workstation include use in a Picture Archiving and Communications System (PACS), in multimodal image processing and 3-D graphics workstation for a broad range of imaging modalities, and as an electronic alternator utilizing its multiple monitor display capability and large and fast frame buffer. The subsystem provides a 2048 x 2048 x 32-bit frame buffer (16 Mbytes of image storage) and supports both 8-bit gray scale and 32-bit true color images. When used to display 8-bit gray scale images, up to four different 256-color palettes may be used for each of four 2K x 2K x 8-bit image frames. Three of these image frames can be used simultaneously to provide pixel selectable region of interest display. A 1280 x 1024 pixel screen with 1:1 aspect ratio can be windowed into the frame buffer for display of any portion of the processed image or images. In addition, the system provides hardware support for integer zoom and an 82-color cursor. This subsystem is implemented on an add-in board occupying a single slot in the NeXT computer. Up to three boards may be added to the NeXT for multiple display capability (e
An Embedded Ghost-Fluid Method for Compressible Flow in Complex Geometry

KAUST Repository

Almarouf, Mohamad Abdulilah Alhusain Alali

2016-06-03

We present an embedded ghost-fluid method for numerical solutions of the compressible Navier Stokes (CNS) equations in arbitrary complex domains. The PDE multidimensional extrapolation approach of Aslam [1] is used to reconstruct the solution in the ghost-fluid regions and impose boundary conditions at the fluid-solid interface. The CNS equations are numerically solved by the second order multidimensional upwind method of Colella [2] and Saltzman [3]. Block-structured adaptive mesh refinement implemented under the Chombo framework is utilized to reduce the computational cost while keeping high-resolution mesh around the embedded boundary and regions of high gradient solutions. Numerical examples with different Reynolds numbers for low and high Mach number flow will be presented. We compare our simulation results with other reported experimental and computational results. The significance and advantages of our implementation, which revolve around balancing between the solution accuracy and implementation difficulties, are briefly discussed as well. © 2016 Trans Tech Publications.
An Embedded Ghost-Fluid Method for Compressible Flow in Complex Geometry

KAUST Repository

Almarouf, Mohamad Abdulilah Alhusain Alali; Samtaney, Ravi

2016-01-01

We present an embedded ghost-fluid method for numerical solutions of the compressible Navier Stokes (CNS) equations in arbitrary complex domains. The PDE multidimensional extrapolation approach of Aslam [1] is used to reconstruct the solution in the ghost-fluid regions and impose boundary conditions at the fluid-solid interface. The CNS equations are numerically solved by the second order multidimensional upwind method of Colella [2] and Saltzman [3]. Block-structured adaptive mesh refinement implemented under the Chombo framework is utilized to reduce the computational cost while keeping high-resolution mesh around the embedded boundary and regions of high gradient solutions. Numerical examples with different Reynolds numbers for low and high Mach number flow will be presented. We compare our simulation results with other reported experimental and computational results. The significance and advantages of our implementation, which revolve around balancing between the solution accuracy and implementation difficulties, are briefly discussed as well. © 2016 Trans Tech Publications.
A real-time photogrammetry system based on embedded architecture

Directory of Open Access Journals (Sweden)

S. Y. Zheng

2014-06-01

To meet the demand for real-time spatial data processing and to improve the online processing capability of photogrammetric systems, this paper proposes a real-time photogrammetry method. A system based on an embedded architecture is designed around this method: FPGA, ARM+DSP, and other embedded computing technologies are used to build a specialized hardware operating environment, existing photogrammetric algorithms are ported to and optimized for the embedded system, and real-time photogrammetric data processing is thereby realized. An aerial photogrammetric experiment shows that the method achieves high-speed and stable online processing of photogrammetric data, and it verifies the feasibility of the proposed real-time photogrammetric system based on an embedded architecture. According to the authors, this is the first realization of a real-time aerial photogrammetric system; it can raise the online processing efficiency of photogrammetry and broaden its field of application.

The data embedding method

Energy Technology Data Exchange (ETDEWEB)

Sandford, M.T. II; Bradley, J.N.; Handel, T.G.

1996-06-01

Data embedding is a new steganographic method for combining digital information sets. This paper describes the data embedding method and gives examples of its application using software written in the C programming language. Sandford and Handel produced a computer program (BMPEMBED, Ver. 1.51 written for IBM PC/AT or compatible, MS/DOS Ver. 3.3 or later) that implements data embedding in an application for digital imagery. Information is embedded into, and extracted from, Truecolor or color-palette images in Microsoft® bitmap (.BMP) format. Hiding data in the noise component of a host, by means of an algorithm that modifies or replaces the noise bits, is termed "steganography." Data embedding differs markedly from conventional steganography, because it uses the noise component of the host to insert information with few or no modifications to the host data values or their statistical properties. Consequently, the entropy of the host data is affected little by using data embedding to add information. The data embedding method applies to host data compressed with transform, or "lossy," compression algorithms, for example ones based on discrete cosine transform and wavelet functions. Analysis of the host noise generates a key required for embedding and extracting the auxiliary data from the combined data. The key is stored easily in the combined data. Images without the key cannot be processed to extract the embedded information. To provide security for the embedded data, one can remove the key from the combined data and manage it separately. The image key can be encrypted and stored in the combined data or transmitted separately as a ciphertext much smaller in size than the embedded data. The key size is typically ten to one-hundred bytes, and it is in data an analysis algorithm.
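For contrast with the data embedding method, the conventional approach that the abstract calls steganography (replacing the noise bits of the host) can be sketched as follows. This is a generic least-significant-bit illustration, not the algorithm of Sandford et al. and not the BMPEMBED program; the function names and the byte-string host are assumptions made for the example.

```python
def embed_lsb(host: bytes, payload: bytes) -> bytes:
    """Replace the lowest bit of successive host samples with payload bits."""
    # Unpack the payload into bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(host):
        raise ValueError("payload too large for host")
    combined = bytearray(host)
    for index, bit in enumerate(bits):
        combined[index] = (combined[index] & 0xFE) | bit
    return bytes(combined)


def extract_lsb(combined: bytes, n_bytes: int) -> bytes:
    """Rebuild n_bytes of payload from the lowest bits of the samples."""
    payload = bytearray()
    for i in range(n_bytes):
        value = 0
        for sample in combined[8 * i: 8 * i + 8]:
            value = (value << 1) | (sample & 1)
        payload.append(value)
    return bytes(payload)


host = bytes(range(256))          # stand-in for 8-bit image samples
secret = b"key"
combined = embed_lsb(host, secret)
recovered = extract_lsb(combined, len(secret))
largest_change = max(abs(a - b) for a, b in zip(host, combined))
print(recovered, largest_change)
```

Each host sample changes by at most one gray level, but the low-order bits are overwritten, which is exactly the kind of modification to host values that the abstract says data embedding largely avoids.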
Running Interactive Jobs on Peregrine | High-Performance Computing | NREL

Science.gov (United States)

shell prompt, which allows users to execute commands and scripts as they would on the login nodes. Login performed on the compute nodes rather than on login nodes. This page provides instructions and examples of , start GUIs etc. and the commands will execute on that node instead of on the login node. The -V option
Web Server Embedded System

Directory of Open Access Journals (Sweden)

Adharul Muttaqin

2014-07-01

Embedded systems are currently a particular focus in computer technology. Several Linux operating systems and a variety of web servers have been prepared to support embedded systems, and one application that can run on an embedded system is a web server. Web server selection for embedded environments is still rarely studied, so this study focuses on two web servers whose main feature is a light CPU and memory footprint: Light HTTPD and Tiny HTTPD. Using thread parameters (users, ramp-up period, and loop count) in a stress test of the embedded system, this study determines which of Light HTTPD and Tiny HTTPD is better suited to embedded use on a BeagleBoard in terms of CPU and memory consumption. The results show that, in terms of CPU consumption on the BeagleBoard embedded system, Light HTTPD is recommended over Tiny HTTPD because of the very significant difference in CPU load between the two web services. Keywords: embedded system, web server
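The three stress-test parameters named in this abstract (users, ramp-up period, loop count) can be illustrated with a small thread-based load generator. This is a hedged sketch and not the tool used in the study: `run_stress_test` and `fake_request` are hypothetical names, and the sketch records request latency on the client side, whereas the study measured CPU and memory consumption on the board.

```python
import threading
import time
from typing import Callable, List


def run_stress_test(request_fn: Callable[[], None], users: int,
                    ramp_up_s: float, loop_count: int) -> List[float]:
    """Start `users` threads spread over `ramp_up_s` seconds.

    Each thread calls `request_fn` `loop_count` times; the latency of
    every call is collected and returned.
    """
    latencies: List[float] = []
    lock = threading.Lock()

    def user_session() -> None:
        for _ in range(loop_count):
            start = time.perf_counter()
            request_fn()
            elapsed = time.perf_counter() - start
            with lock:
                latencies.append(elapsed)

    delay = ramp_up_s / users if users else 0.0
    threads = []
    for _ in range(users):
        thread = threading.Thread(target=user_session)
        thread.start()
        threads.append(thread)
        time.sleep(delay)  # stagger thread start-up across the ramp-up period
    for thread in threads:
        thread.join()
    return latencies


def fake_request() -> None:
    """Placeholder for an HTTP request to the server under test."""
    time.sleep(0.001)


samples = run_stress_test(fake_request, users=4, ramp_up_s=0.04, loop_count=5)
print(len(samples), "requests completed")
```

In a real measurement, `fake_request` would be replaced by an HTTP GET against the web server on the board, and CPU load and memory would be sampled on the board itself while the test runs.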
Potential Functional Embedding Theory at the Correlated Wave Function Level. 2. Error Sources and Performance Tests.

Science.gov (United States)

Cheng, Jin; Yu, Kuang; Libisch, Florian; Dieterich, Johannes M; Carter, Emily A

2017-03-14

Quantum mechanical embedding theories partition a complex system into multiple spatial regions that can use different electronic structure methods within each, to optimize trade-offs between accuracy and cost. The present work incorporates accurate but expensive correlated wave function (CW) methods for a subsystem containing the phenomenon or feature of greatest interest, while self-consistently capturing quantum effects of the surroundings using fast but less accurate density functional theory (DFT) approximations. We recently proposed two embedding methods [for a review, see: Acc. Chem. Res. 2014, 47, 2768]: density functional embedding theory (DFET) and potential functional embedding theory (PFET). DFET provides a fast but non-self-consistent density-based embedding scheme, whereas PFET offers a more rigorous theoretical framework to perform fully self-consistent, variational CW/DFT calculations [as defined in part 1, CW/DFT means subsystem 1(2) is treated with CW(DFT) methods]. When originally presented, PFET was only tested at the DFT/DFT level of theory as a proof of principle within a planewave (PW) basis. Part 1 of this two-part series demonstrated that PFET can be made to work well with mixed Gaussian type orbital (GTO)/PW bases, as long as optimized GTO bases and consistent electron-ion potentials are employed throughout. Here in part 2 we conduct the first PFET calculations at the CW/DFT level and compare them to DFET and full CW benchmarks. We test the performance of PFET at the CW/DFT level for a variety of types of interactions (hydrogen bonding, metallic, and ionic). By introducing an intermediate CW/DFT embedding scheme denoted DFET/PFET, we show how PFET remedies different types of errors in DFET, serving as a more robust type of embedding theory.
GRID computing for experimental high energy physics

International Nuclear Information System (INIS)

Moloney, G.R.; Martin, L.; Seviour, E.; Taylor, G.N.; Moorhead, G.F.

2002-01-01

The Large Hadron Collider (LHC), to be completed at the CERN laboratory in 2006, will generate 11 petabytes of data per year. The processing of this large data stream requires a large, distributed computing infrastructure. A recent innovation in high performance distributed computing, the GRID, has been identified as an important tool in data analysis for the LHC. GRID computing has actual and potential application in many fields which require computationally intensive analysis of large, shared data sets. The Australian experimental High Energy Physics community has formed partnerships with the High Performance Computing community to establish a GRID node at the University of Melbourne. Through Australian membership of the ATLAS experiment at the LHC, Australian researchers have an opportunity to be involved in the European DataGRID project. This presentation will include an introduction to the GRID and its application to experimental High Energy Physics. We will present the results of our studies, including participation in the first LHC data challenge.
PtRu nanoparticles embedded in nitrogen doped carbon with highly stable CO tolerance and durability

Science.gov (United States)

Ling, Ying; Yang, Zehui; Yang, Jun; Zhang, Yunfeng; Zhang, Quan; Yu, Xinxin; Cai, Weiwei

2018-02-01

As is well known, the lower durability and sluggish methanol oxidation reaction (MOR) of PtRu alloy electrocatalyst blocks the commercialization of direct methanol fuel cells (DMFCs). Here, we design a new PtRu electrocatalyst, with highly stable CO tolerance and durability, in which the PtRu nanoparticles are embedded in nitrogen doped carbon layers derived from carbonization of poly(vinyl pyrrolidone). The newly fabricated electrocatalyst exhibits no loss in electrochemical surface area (ECSA) and MOR activity after potential cycling from 0.6-1.0 V versus reversible hydrogen electrode, while commercial CB/PtRu retains only 50% of its initial ECSA. Meanwhile, due to the same protective layers, the Ru dissolution is decelerated, resulting in stable CO tolerance. Methanol oxidation reaction (MOR) testing indicates that the activity of newly fabricated electrocatalyst is two times higher than that of commercial CB/PtRu, and the fuel cell performance of the embedded PtRu electrocatalyst was comparable to that of commercial CB/PtRu. The embedded PtRu electrocatalyst is applicable in real DMFC operation. This study offers important and useful information for the design and fabrication of durable and CO tolerant electrocatalysts.
Performance Evaluation of UML2-Modeled Embedded Streaming Applications with System-Level Simulation

Directory of Open Access Journals (Sweden)

Arpinen Tero

2009-01-01

This article presents an efficient method to capture an abstract performance model of streaming-data real-time embedded systems (RTESs). Unified Modeling Language version 2 (UML2) is used for the performance modeling and as a front-end for a tool framework that enables simulation-based performance evaluation and design-space exploration. The adopted application meta-model in UML resembles the Kahn Process Network (KPN) model and is targeted at simulation-based performance evaluation. The application workload is modeled using UML2 activity diagrams, and the platform is described with structural UML2 diagrams and model elements. These concepts are defined using a subset of the profile for Modeling and Analysis of Realtime and Embedded (MARTE) systems from OMG and custom stereotype extensions. The goal of the performance modeling and simulation is to achieve early estimates of task response times and of processing element, memory, and on-chip network utilizations, among other information that is used for design-space exploration. As a case study, a video codec application on multiple processors is modeled, evaluated, and explored. In comparison to related work, this is the first proposal that defines a transformation between UML activity diagrams and streaming-data application workload meta-models and successfully adopts it for RTES performance evaluation.
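The Kahn Process Network model that the application meta-model is said to resemble can be pictured as sequential processes connected by FIFO channels with blocking reads. The sketch below is a generic illustration under that description, not the article's UML2 tool framework; the process names, the `END` sentinel, and the three-stage pipeline are assumptions.

```python
import queue
import threading
from typing import List

END = None  # sentinel token marking the end of the stream (assumption of this sketch)


def source(out_ch: queue.Queue, frames: int) -> None:
    """Produce a stream of frame identifiers."""
    for frame_id in range(frames):
        out_ch.put(frame_id)
    out_ch.put(END)


def encode(in_ch: queue.Queue, out_ch: queue.Queue) -> None:
    """Transform each token; `get()` blocks until a token is available."""
    while True:
        token = in_ch.get()
        if token is END:
            out_ch.put(END)
            return
        out_ch.put(f"encoded-{token}")


def sink(in_ch: queue.Queue, results: List[str]) -> None:
    """Consume tokens until the end of the stream."""
    while True:
        token = in_ch.get()
        if token is END:
            return
        results.append(token)


def run_network(frames: int) -> List[str]:
    """Wire source -> encode -> sink with unbounded FIFO channels and run it."""
    raw, coded = queue.Queue(), queue.Queue()
    results: List[str] = []
    workers = [
        threading.Thread(target=source, args=(raw, frames)),
        threading.Thread(target=encode, args=(raw, coded)),
        threading.Thread(target=sink, args=(coded, results)),
    ]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results


print(run_network(3))
```

Because every channel has a single writer and a single reader and reads block, the output order does not depend on how the threads are scheduled, which is the property that makes such networks convenient for workload modeling.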
Heads in the Cloud: A Primer on Neuroimaging Applications of High Performance Computing

Science.gov (United States)

Shatil, Anwar S.; Younas, Sohail; Pourreza, Hossein; Figley, Chase R.

2015-01-01

With larger data sets and more sophisticated analyses, it is becoming increasingly common for neuroimaging researchers to push (or exceed) the limitations of standalone computer workstations. Nonetheless, although high-performance computing platforms such as clusters, grids and clouds are already in routine use by a small handful of neuroimaging researchers to increase their storage and/or computational power, the adoption of such resources by the broader neuroimaging community remains relatively uncommon. Therefore, the goal of the current manuscript is to: 1) inform prospective users about the similarities and differences between computing clusters, grids and clouds; 2) highlight their main advantages; 3) discuss when it may (and may not) be advisable to use them; 4) review some of their potential problems and barriers to access; and finally 5) give a few practical suggestions for how interested new users can start analyzing their neuroimaging data using cloud resources. Although the aim of cloud computing is to hide most of the complexity of the infrastructure management from end-users, we recognize that this can still be an intimidating area for cognitive neuroscientists, psychologists, neurologists, radiologists, and other neuroimaging researchers lacking a strong computational background. Therefore, with this in mind, we have aimed to provide a basic introduction to cloud computing in general (including some of the basic terminology, computer architectures, infrastructure and service models, etc.), a practical overview of the benefits and drawbacks, and a specific focus on how cloud resources can be used for various neuroimaging applications. PMID:27279746
Heads in the Cloud: A Primer on Neuroimaging Applications of High Performance Computing

Directory of Open Access Journals (Sweden)

Anwar S. Shatil

2015-01-01

With larger data sets and more sophisticated analyses, it is becoming increasingly common for neuroimaging researchers to push (or exceed) the limitations of standalone computer workstations. Nonetheless, although high-performance computing platforms such as clusters, grids and clouds are already in routine use by a small handful of neuroimaging researchers to increase their storage and/or computational power, the adoption of such resources by the broader neuroimaging community remains relatively uncommon. Therefore, the goal of the current manuscript is to: 1) inform prospective users about the similarities and differences between computing clusters, grids and clouds; 2) highlight their main advantages; 3) discuss when it may (and may not) be advisable to use them; 4) review some of their potential problems and barriers to access; and finally 5) give a few practical suggestions for how interested new users can start analyzing their neuroimaging data using cloud resources. Although the aim of cloud computing is to hide most of the complexity of the infrastructure management from end-users, we recognize that this can still be an intimidating area for cognitive neuroscientists, psychologists, neurologists, radiologists, and other neuroimaging researchers lacking a strong computational background. Therefore, with this in mind, we have aimed to provide a basic introduction to cloud computing in general (including some of the basic terminology, computer architectures, infrastructure and service models, etc.), a practical overview of the benefits and drawbacks, and a specific focus on how cloud resources can be used for various neuroimaging applications.
Accelerated Synchrotron X-ray Diffraction Data Analysis on a Heterogeneous High Performance Computing System

Energy Technology Data Exchange (ETDEWEB)

Qin, J; Bauer, M A, E-mail: qin.jinhui@gmail.com, E-mail: bauer@uwo.ca [Computer Science Department, University of Western Ontario, London, ON N6A 5B7 (Canada)

2010-11-01

The analysis of synchrotron X-ray Diffraction (XRD) data has been used by scientists and engineers to understand and predict properties of materials. However, the large volume of XRD image data and the intensive computations involved in the data analysis make it hard for researchers to quickly reach any conclusions about the images from an experiment when using conventional XRD data analysis software. Synchrotron time is valuable and delays in XRD data analysis can impact decisions about subsequent experiments or about materials that they are investigating. In order to improve the data analysis performance, ideally to achieve near real time data analysis during an XRD experiment, we designed and implemented software for accelerated XRD data analysis. The software has been developed for a heterogeneous high performance computing (HPC) system, comprised of IBM PowerXCell 8i processors and Intel quad-core Xeon processors. This paper describes the software and reports on the improved performance. The results indicate that it is possible for XRD data to be analyzed at the rate it is being produced.
Accelerated Synchrotron X-ray Diffraction Data Analysis on a Heterogeneous High Performance Computing System

International Nuclear Information System (INIS)

Qin, J; Bauer, M A

2010-01-01

The analysis of synchrotron X-ray Diffraction (XRD) data has been used by scientists and engineers to understand and predict properties of materials. However, the large volume of XRD image data and the intensive computations involved in the data analysis make it hard for researchers to quickly reach any conclusions about the images from an experiment when using conventional XRD data analysis software. Synchrotron time is valuable and delays in XRD data analysis can impact decisions about subsequent experiments or about materials that they are investigating. In order to improve the data analysis performance, ideally to achieve near real time data analysis during an XRD experiment, we designed and implemented software for accelerated XRD data analysis. The software has been developed for a heterogeneous high performance computing (HPC) system, comprised of IBM PowerXCell 8i processors and Intel quad-core Xeon processors. This paper describes the software and reports on the improved performance. The results indicate that it is possible for XRD data to be analyzed at the rate it is being produced.
High-performance secure multi-party computation for data mining applications

DEFF Research Database (Denmark)

Bogdanov, Dan; Niitsoo, Margus; Toft, Tomas

2012-01-01

Secure multi-party computation (MPC) is a technique well suited for privacy-preserving data mining. Even with the recent progress in two-party computation techniques such as fully homomorphic encryption, general MPC remains relevant as it has shown promising performance metrics in real...... operations such as multiplication and comparison. Secondly, the confidential processing of financial data requires the use of more complex primitives, including a secure division operation. This paper describes new protocols in the Sharemind model for secure multiplication, share conversion, equality, bit...
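The idea behind secure multiplication on secret-shared values can be sketched with generic additive secret sharing. The code below is not the protocol set described in this paper: it uses the textbook Beaver-triple technique with a trusted dealer, and the ring size `2**32`, the three-party setting, and all function names are assumptions made for illustration.

```python
import secrets
from typing import List

MODULUS = 2**32  # ring size; an assumption of this sketch


def share(value: int, parties: int = 3) -> List[int]:
    """Split `value` into additive shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares: List[int]) -> int:
    """Recombine additive shares."""
    return sum(shares) % MODULUS


def add_shares(a: List[int], b: List[int]) -> List[int]:
    """Addition needs no communication: each party adds its own shares."""
    return [(x + y) % MODULUS for x, y in zip(a, b)]


def multiply_shares(a: List[int], b: List[int]) -> List[int]:
    """Multiply shared values using a Beaver triple from a trusted dealer."""
    x = secrets.randbelow(MODULUS)
    y = secrets.randbelow(MODULUS)
    xs, ys, zs = share(x), share(y), share((x * y) % MODULUS)
    # The parties open the masked differences d = a - x and e = b - y.
    d = reconstruct([(ai - xi) % MODULUS for ai, xi in zip(a, xs)])
    e = reconstruct([(bi - yi) % MODULUS for bi, yi in zip(b, ys)])
    # a*b = x*y + d*y + e*x + d*e, evaluated share by share.
    product = [(zi + d * yi + e * xi) % MODULUS for xi, yi, zi in zip(xs, ys, zs)]
    product[0] = (product[0] + d * e) % MODULUS  # public term added by one party
    return product


total = reconstruct(add_shares(share(20), share(22)))
product = reconstruct(multiply_shares(share(6), share(7)))
print(total, product)
```

No single share reveals anything about the underlying value, and the opened values `d` and `e` are masked by the random triple. Primitives such as comparison and division need considerably more machinery than this sketch shows.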
Reconfigurable Computing

CERN Document Server

Cardoso, Joao MP

2011-01-01

As the complexity of modern embedded systems increases, it becomes less practical to design monolithic processing platforms. As a result, reconfigurable computing is being adopted widely for more flexible design. Reconfigurable Computers offer the spatial parallelism and fine-grained customizability of application-specific circuits with the postfabrication programmability of software. To make the most of this unique combination of performance and flexibility, designers need to be aware of both hardware and software issues. FPGA users must think not only about the gates needed to perform a comp
Additive Manufacturing and High-Performance Computing: a Disruptive Latent Technology

Science.gov (United States)

Goodwin, Bruce

2015-03-01

This presentation discusses the relationship between recent advances in Additive Manufacturing (AM) technology, High-Performance Computing (HPC) simulation and design capabilities, and related advances in Uncertainty Quantification (UQ), and then examines their impact on national and international security. The presentation surveys how AM accelerates the fabrication process, while HPC combined with UQ provides a fast track for the engineering design cycle. The combination of AM and HPC/UQ almost eliminates the engineering design and prototype iterative cycle, thereby dramatically reducing cost of production and time-to-market. These methods thereby present significant benefits for US national interests, both civilian and military, in an age of austerity. Finally, considering cyber security issues and the advent of the "cloud," these disruptive, currently latent technologies may well enable proliferation and so challenge both nuclear and non-nuclear aspects of international security.
Software Applications on the Peregrine System | High-Performance Computing

Science.gov (United States)

Algebraic Modeling System (GAMS) Statistics and analysis High-level modeling system for mathematical reactivity. Gurobi Optimizer Statistics and analysis Solver for mathematical programming LAMMPS Chemistry and , reactivities, and vibrational, electronic and NMR spectra. R Statistical Computing Environment Statistics and
An Embedded System for Safe, Secure and Reliable Execution of High Consequence Software

Energy Technology Data Exchange (ETDEWEB)

MCCOY,JAMES A.

2000-08-29

As more complex and functionally diverse requirements are placed on high consequence embedded applications, ensuring safe and secure operation requires an execution environment that is ultra reliable from a system viewpoint. In many cases the safety and security of the system depends upon the reliable cooperation between the hardware and the software to meet real-time system throughput requirements. The selection of a microprocessor and its associated development environment for an embedded application has more far-reaching effects on the development and production of the system than any other element in the design. The effects of this choice ripple through the remainder of the hardware design and profoundly affect the entire software development process. While state-of-the-art software engineering principles indicate that an object oriented (OO) methodology provides a superior development environment, traditional programming languages available for microprocessors targeted for deeply embedded applications do not directly support OO techniques. Furthermore, the microprocessors themselves typically neither support nor enforce an OO environment. This paper describes a system level approach for the design of a microprocessor intended for use in deeply embedded high consequence applications that both supports and enforces an OO execution environment.
Homemade Buckeye-Pi: A Learning Many-Node Platform for High-Performance Parallel Computing

Science.gov (United States)

Amooie, M. A.; Moortgat, J.

2017-12-01

We report on the "Buckeye-Pi" cluster, the supercomputer developed in The Ohio State University School of Earth Sciences from 128 inexpensive Raspberry Pi (RPi) 3 Model B single-board computers. Each RPi is equipped with fast Quad Core 1.2GHz ARMv8 64bit processor, 1GB of RAM, and 32GB microSD card for local storage. Therefore, the cluster has a total RAM of 128GB that is distributed on the individual nodes and a flash capacity of 4TB with 512 processors, while it benefits from low power consumption, easy portability, and low total cost. The cluster uses the Message Passing Interface protocol to manage the communications between each node. These features render our platform the most powerful RPi supercomputer to date and suitable for educational applications in high-performance-computing (HPC) and handling of large datasets. In particular, we use the Buckeye-Pi to implement optimized parallel codes in our in-house simulator for subsurface media flows with the goal of achieving a massively-parallelized scalable code. We present benchmarking results for the computational performance across various number of RPi nodes. We believe our project could inspire scientists and students to consider the proposed unconventional cluster architecture as a mainstream and a feasible learning platform for challenging engineering and scientific problems.
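The aggregate figures quoted for the cluster follow directly from the per-node specification. The short check below is only a sketch: the constant names are hypothetical, and it assumes that 1 TB means 1024 GB and that the "512 processors" refers to cores (four per board).

```python
# Per-node figures taken from the abstract.
NODES = 128
RAM_GB_PER_NODE = 1
STORAGE_GB_PER_NODE = 32
CORES_PER_NODE = 4  # quad-core processor on each board

total_ram_gb = NODES * RAM_GB_PER_NODE
total_storage_tb = NODES * STORAGE_GB_PER_NODE / 1024  # assumption: 1 TB = 1024 GB
total_cores = NODES * CORES_PER_NODE

print(f"RAM: {total_ram_gb} GB, flash: {total_storage_tb:.0f} TB, cores: {total_cores}")
```

The memory is distributed, so a single process can address only the 1 GB on its own node; anything larger has to be partitioned across nodes and exchanged through message passing.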
Computer-Related Task Performance

DEFF Research Database (Denmark)

Longstreet, Phil; Xiao, Xiao; Sarker, Saonee

2016-01-01

The existing information system (IS) literature has acknowledged computer self-efficacy (CSE) as an important factor contributing to enhancements in computer-related task performance. However, the empirical results of CSE on performance have not always been consistent, and increasing an individual......'s CSE is often a cumbersome process. Thus, we introduce the theoretical concept of self-prophecy (SP) and examine how this social influence strategy can be used to improve computer-related task performance. Two experiments are conducted to examine the influence of SP on task performance. Results show...... that SP and CSE interact to influence performance. Implications are then discussed in terms of organizations’ ability to increase performance....
A comprehensive approach to decipher biological computation to achieve next generation high-performance exascale computing.

Energy Technology Data Exchange (ETDEWEB)

James, Conrad D.; Schiess, Adrian B.; Howell, Jamie; Baca, Michael J.; Partridge, L. Donald; Finnegan, Patrick Sean; Wolfley, Steven L.; Dagel, Daryl James; Spahn, Olga Blum; Harper, Jason C.; Pohl, Kenneth Roy; Mickel, Patrick R.; Lohn, Andrew; Marinella, Matthew

2013-10-01

The human brain (volume = 1200 cm^3) consumes 20 W and is capable of performing > 10^16 operations/s. Current supercomputer technology has reached 10^15 operations/s, yet it requires 1500 m^3 and 3 MW, giving the brain a 10^12 advantage in operations/s/W/cm^3. Thus, to reach exascale computation, two achievements are required: 1) improved understanding of computation in biological tissue, and 2) a paradigm shift towards neuromorphic computing where hardware circuits mimic properties of neural tissue. To address 1), we will interrogate corticostriatal networks in mouse brain tissue slices, specifically with regard to their frequency filtering capabilities as a function of input stimulus. To address 2), we will instantiate biological computing characteristics such as multi-bit storage into hardware devices with future computational and memory applications. Resistive memory devices will be modeled, designed, and fabricated in the MESA facility in consultation with our internal and external collaborators.
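The order-of-magnitude advantage quoted in this abstract can be reproduced from its own numbers, reading the supercomputer figure as 10^15 operations/s and treating the brain's 10^16 operations/s as a lower bound. The helper name in this sketch is hypothetical.

```python
import math


def ops_per_watt_per_cm3(ops_per_s: float, watts: float, volume_cm3: float) -> float:
    """Operations per second, normalised by power and by volume."""
    return ops_per_s / (watts * volume_cm3)


CM3_PER_M3 = 100**3  # 1 m^3 = 1e6 cm^3

brain = ops_per_watt_per_cm3(ops_per_s=1e16, watts=20, volume_cm3=1200)
supercomputer = ops_per_watt_per_cm3(ops_per_s=1e15, watts=3e6,
                                     volume_cm3=1500 * CM3_PER_M3)

advantage = brain / supercomputer
order_of_magnitude = math.floor(math.log10(advantage))
print(f"advantage is of order 10^{order_of_magnitude}")
```

The ratio comes out slightly above 10^12, which agrees with the figure stated in the abstract.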
Interactive Data Exploration for High-Performance Fluid Flow Computations through Porous Media

KAUST Repository

Perovic, Nevena

2014-09-01

© 2014 IEEE. The huge volumes of data produced by high-performance computing (HPC) applications such as fluid flow simulations usually hinder the interactive processing and exploration of simulation results. Such interactive data exploration not only allows scientists to 'play' with their data but also lets them visualise huge (distributed) data sets in an efficient and easy way. Therefore, we propose an HPC data exploration service based on a sliding window concept that enables researchers to access remote data (available on a supercomputer or cluster) during simulation runtime without exceeding any bandwidth limitations between the HPC back-end and the user front-end.

Computational Performance of a Parallelized Three-Dimensional High-Order Spectral Element Toolbox

Science.gov (United States)

Bosshard, Christoph; Bouffanais, Roland; Clémençon, Christian; Deville, Michel O.; Fiétier, Nicolas; Gruber, Ralf; Kehtari, Sohrab; Keller, Vincent; Latt, Jonas

In this paper, a comprehensive performance review of an MPI-based high-order three-dimensional spectral element method C++ toolbox is presented. The focus is put on the performance evaluation of several aspects with a particular emphasis on the parallel efficiency. The performance evaluation is analyzed with help of a time prediction model based on a parameterization of the application and the hardware resources. A tailor-made CFD computation benchmark case is introduced and used to carry out this review, stressing the particular interest for clusters with up to 8192 cores. Some problems in the parallel implementation have been detected and corrected. The theoretical complexities with respect to the number of elements, to the polynomial degree, and to communication needs are correctly reproduced. It is concluded that this type of code has a nearly perfect speed up on machines with thousands of cores, and is ready to make the step to next-generation petaflop machines.
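Speed-up and parallel efficiency, the quantities this review focuses on, are conventionally defined as S(p) = T(1)/T(p) and E(p) = S(p)/p. The sketch below applies those standard definitions; the timings are hypothetical placeholders chosen for illustration and are not results from the paper.

```python
from typing import Dict


def speedup(t_serial: float, t_parallel: float) -> float:
    """S(p) = T(1) / T(p)."""
    return t_serial / t_parallel


def parallel_efficiency(t_serial: float, t_parallel: float, cores: int) -> float:
    """E(p) = S(p) / p; a value of 1.0 means perfect scaling."""
    return speedup(t_serial, t_parallel) / cores


# Hypothetical wall-clock times in seconds, keyed by core count.
timings: Dict[int, float] = {1: 8192.0, 1024: 8.4, 8192: 1.1}

for cores, seconds in timings.items():
    efficiency = parallel_efficiency(timings[1], seconds, cores)
    print(f"{cores:5d} cores: efficiency {efficiency:.2f}")
```

A "nearly perfect speed up" in this sense corresponds to an efficiency that stays close to 1.0 as the core count grows.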
A Real-Time Embedded Control System for Electro-Fused Magnesia Furnace

Directory of Open Access Journals (Sweden)

Fang Zheng

2013-01-01

Because the smelting process of an electro-fused magnesia furnace is complicated, with complex operating conditions, strong nonlinearities, and strong couplings, a traditional linear controller cannot control it well. An advanced intelligent control strategy is a good solution for this kind of industrial process. However, such a strategy usually involves a large programming effort and is hard to debug and maintain. In this paper, a real-time embedded control system is proposed for the process control of the electro-fused magnesia furnace based on an intelligent control strategy and model-based design technology. As for hardware, an embedded controller based on an industrial Single Board Computer (SBC) is developed to meet the demands of the industrial field environment. As for software, Linux with the Real-Time Application Interface (RTAI) is used as the real-time kernel of the controller to improve its real-time performance. The embedded software platform is also modified to support generating embedded code automatically from Simulink/Stateflow models. Based on the proposed embedded control system, the intelligent embedded control software of the electro-fused magnesia furnace can be generated directly from Simulink/Stateflow models. To validate the effectiveness of the proposed embedded control system, both hardware-in-the-loop (HIL) and industrial field experiments are carried out. Experimental results show that the embedded control system works very well in both laboratory and industrial environments.
Virtual Network Embedding via Monte Carlo Tree Search.

Science.gov (United States)

Haeri, Soroush; Trajkovic, Ljiljana

2018-02-01

Network virtualization helps overcome shortcomings of the current Internet architecture. The virtualized network architecture enables coexistence of multiple virtual networks (VNs) on an existing physical infrastructure. The VN embedding (VNE) problem, which deals with the embedding of VN components onto a physical network, is known to be NP-hard. In this paper, we propose two VNE algorithms: MaVEn-M and MaVEn-S. MaVEn-M employs the multicommodity flow algorithm for virtual link mapping while MaVEn-S uses the shortest-path algorithm. They formalize the virtual node mapping problem by using the Markov decision process (MDP) framework and devise action policies (node mappings) for the proposed MDP using the Monte Carlo tree search algorithm. Service providers may adjust the execution time of the MaVEn algorithms based on the traffic load of VN requests. The objective of the algorithms is to maximize the profit of infrastructure providers. We develop a discrete event VNE simulator to implement and evaluate performance of MaVEn-M, MaVEn-S, and several recently proposed VNE algorithms. We introduce profitability as a new performance metric that captures both acceptance and revenue to cost ratios. Simulation results show that the proposed algorithms find more profitable solutions than the existing algorithms. Given additional computation time, they further improve embedding solutions.
Embedded DCT and wavelet methods for fine granular scalable video: analysis and comparison

Science.gov (United States)

van der Schaar-Mitrea, Mihaela; Chen, Yingwei; Radha, Hayder

2000-04-01

Video transmission over bandwidth-varying networks is becoming increasingly important due to emerging applications such as streaming of video over the Internet. The fundamental obstacle in designing such systems resides in the varying characteristics of the Internet (i.e. bandwidth variations and packet-loss patterns). In MPEG-4, a new SNR scalability scheme, called Fine-Granular-Scalability (FGS), is currently under standardization, which is able to adapt in real-time (i.e. at transmission time) to Internet bandwidth variations. The FGS framework consists of a non-scalable motion-predicted base-layer and an intra-coded fine-granular scalable enhancement layer. For example, the base layer can be coded using a DCT-based MPEG-4 compliant, highly efficient video compression scheme. Subsequently, the difference between the original and decoded base-layer is computed, and the resulting FGS-residual signal is intra-frame coded with an embedded scalable coder. In order to achieve high coding efficiency when compressing the FGS enhancement layer, it is crucial to analyze the nature and characteristics of residual signals common to the SNR scalability framework (including FGS). In this paper, we present a thorough analysis of SNR residual signals by evaluating their statistical properties, compaction efficiency and frequency characteristics. The signal analysis revealed that the energy compaction of the DCT and wavelet transforms is limited and the frequency characteristics of SNR residual signals decay rather slowly. Moreover, the blockiness artifacts of the low bit-rate coded base-layer result in artificial high frequencies in the residual signal. Subsequently, a variety of wavelet and embedded DCT coding techniques applicable to the FGS framework are evaluated and their results are interpreted based on the identified signal properties. As expected from the theoretical signal analysis, the rate-distortion performances of the embedded wavelet and DCT-based coders are very
Teaching Embedded System Concepts for Technological Literacy

Science.gov (United States)

Winzker, M.; Schwandt, A.

2011-01-01

A basic understanding of technology is recognized as important knowledge even for students not connected with engineering and computer science. This paper shows that embedded system concepts can be taught in a technological literacy course. An embedded system teaching block that has been used in an electronics module for non-engineers is…
Direct numerical simulation of reactor two-phase flows enabled by high-performance computing

Energy Technology Data Exchange (ETDEWEB)

Fang, Jun; Cambareri, Joseph J.; Brown, Cameron S.; Feng, Jinyong; Gouws, Andre; Li, Mengnan; Bolotnov, Igor A.

2018-04-01

Nuclear reactor two-phase flows remain a great engineering challenge: the high-resolution two-phase flow database that could inform practical model development is still sparse because of the extreme reactor operating conditions and measurement difficulties. Owing to the rapid growth of computing power, direct numerical simulation (DNS) is enjoying renewed interest for investigating the related flow problems. Combining DNS with an interface tracking method provides a unique opportunity to study two-phase flows based on first-principles calculations. More importantly, state-of-the-art high-performance computing (HPC) facilities are helping unlock this great potential. This paper reviews the recent research progress of two-phase flow DNS related to reactor applications. The progress in large-scale bubbly flow DNS has been focused not only on the sheer size of those simulations in terms of resolved Reynolds number, but also on the associated advanced modeling and analysis techniques. Specifically, the current areas of active research include modeling of sub-cooled boiling, bubble coalescence, as well as the advanced post-processing toolkit for bubbly flow simulations in reactor geometries. A novel bubble tracking method has been developed to track the evolution of bubbles in two-phase bubbly flow. Also, spectral analysis of the DNS database in different geometries has been performed to investigate the modulation of the energy spectrum slope due to bubble-induced turbulence. In addition, single- and two-phase analysis results are presented for turbulent flows within pressurized water reactor (PWR) core geometries. The related simulations can be carried out only on world-leading HPC platforms. These simulations are allowing more complex turbulence model development and validation for use in 3D multiphase computational fluid dynamics (M-CFD) codes.
Development of high performance scientific components for interoperability of computing packages

Energy Technology Data Exchange (ETDEWEB)

Gulabani, Teena Pratap [Iowa State Univ., Ames, IA (United States)]

2008-01-01

Three major high performance quantum chemistry computational packages, NWChem, GAMESS and MPQC, have been developed by different research efforts following different design patterns. The goal is to achieve interoperability among these packages by overcoming the challenges caused by the different communication patterns and software design of each of these packages. Developing a chemistry algorithm is difficult and time consuming; integration of large quantum chemistry packages will allow resource sharing and thus avoid reinventing the wheel. Creating connections between these incompatible packages is the major motivation of the proposed work. This interoperability is achieved by bringing the benefits of Component Based Software Engineering through a plug-and-play component framework called Common Component Architecture (CCA). In this thesis, I present a strategy and process used for interfacing two widely used and important computational chemistry methodologies: Quantum Mechanics and Molecular Mechanics. To show the feasibility of the proposed approach, the Tuning and Analysis Utility (TAU) has been coupled with the NWChem code and its CCA components. Results show that the overhead is negligible when compared to the ease and potential of organizing and coping with large-scale software applications.
Efficient physical embedding of topologically complex information processing networks in brains and computer circuits.

Directory of Open Access Journals (Sweden)

Danielle S Bassett

2010-04-01

Nervous systems are information processing networks that evolved by natural selection, whereas very large scale integrated (VLSI) computer circuits have evolved by commercially driven technology development. Here we follow historic intuition that all physical information processing systems will share key organizational properties, such as modularity, that generally confer adaptivity of function. It has long been observed that modular VLSI circuits demonstrate an isometric scaling relationship between the number of processing elements and the number of connections, known as Rent's rule, which is related to the dimensionality of the circuit's interconnect topology and its logical capacity. We show that human brain structural networks, and the nervous system of the nematode C. elegans, also obey Rent's rule, and exhibit some degree of hierarchical modularity. We further show that the estimated Rent exponent of human brain networks, derived from MRI data, can explain the allometric scaling relations between gray and white matter volumes across a wide range of mammalian species, again suggesting that these principles of nervous system design are highly conserved. For each of these fractal modular networks, the dimensionality of the interconnect topology was greater than the 2 or 3 Euclidean dimensions of the space in which it was embedded. This relatively high complexity entailed extra cost in physical wiring: although all networks were economically or cost-efficiently wired they did not strictly minimize wiring costs. Artificial and biological information processing systems both may evolve to optimize a trade-off between physical cost and topological complexity, resulting in the emergence of homologous principles of economical, fractal and modular design across many different kinds of nervous and computational networks.
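Rent's rule is commonly written as a power law, T = k · N^p, where N is the number of processing elements in a partition, T the number of connections crossing its boundary, and p the Rent exponent. As a minimal sketch of how such an exponent can be estimated, the code below fits a straight line to log T versus log N by ordinary least squares. The function name and the synthetic data are assumptions for illustration; the abstract does not describe the authors' estimation procedure.

```python
import math


def fit_rent_exponent(element_counts, connection_counts):
    """Least-squares fit of log T = log k + p * log N.

    Returns (k, p): the Rent coefficient and the Rent exponent.
    """
    xs = [math.log(n) for n in element_counts]
    ys = [math.log(t) for t in connection_counts]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    p = sxy / sxx                       # slope in log-log space
    k = math.exp(mean_y - p * mean_x)   # intercept mapped back from log space
    return k, p


if __name__ == "__main__":
    # Synthetic partitions that follow T = 3 * N^0.75 exactly (illustrative only).
    sizes = [2, 4, 8, 16, 32, 64]
    terminals = [3.0 * n ** 0.75 for n in sizes]
    print(fit_rent_exponent(sizes, terminals))
```

On real partition data the points scatter around the line, so the fitted exponent should be reported with a confidence interval rather than as a single number.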
High-Performance Networking

CERN Multimedia

CERN. Geneva

2003-01-01

The series will start with a historical introduction to what people regarded as high performance message communication in their time and how that developed into what is known today as "standard computer network communication". It will be followed by a far more technical part that uses the High Performance Computer Network standards of the 90's, with 1 Gbit/s systems, as an introduction to an in-depth explanation of the three new 10 Gbit/s network and interconnect technology standards that already exist or are emerging. Where necessary for a good understanding, some sidesteps will be included to explain important protocols as well as some necessary details of the Wide Area Network (WAN) standards concerned, including some basics of wavelength multiplexing (DWDM). Some remarks will be made concerning the rapidly expanding applications of networked storage.
Using the Eclipse Parallel Tools Platform to Assist Earth Science Model Development and Optimization on High Performance Computers

Science.gov (United States)

Alameda, J. C.

2011-12-01

Development and optimization of computational science models, particularly on high performance computers and, with the advent of ubiquitous multicore processor systems, on practically every system, has been accomplished with basic software tools: typically command-line-based compilers, debuggers, and performance tools that have not changed substantially from the days of serial and early vector computers. However, model complexity, including the complexity added by modern message passing libraries such as MPI, and the need for hybrid code models (such as OpenMP and MPI) to take full advantage of high performance computers with an increasing core count per shared memory node, has made development and optimization of such codes an increasingly arduous task. Additional architectural developments, such as many-core processors, only complicate the situation further. In this paper, we describe how our NSF-funded project, "SI2-SSI: A Productive and Accessible Development Workbench for HPC Applications Using the Eclipse Parallel Tools Platform" (WHPC) seeks to improve the Eclipse Parallel Tools Platform, an environment designed to support scientific code development targeted at a diverse set of high performance computing systems. Our WHPC project takes an application-centric view to improving Eclipse PTP. We are using a set of scientific applications, each with a variety of challenges, and using PTP both to drive further improvements to the scientific applications and to understand shortcomings in Eclipse PTP from an application developer's perspective, which in turn drives the list of improvements we seek to make. We are also partnering with performance tool providers to drive higher quality performance tool integration. We have partnered with the Cactus group at Louisiana State University to improve Eclipse's ability to work with computational frameworks and extremely complex build systems, as well as to develop educational materials to incorporate into
Vision-based Nano Robotic System for High-throughput Non-embedded Cell Cutting.

Science.gov (United States)

Shang, Wanfeng; Lu, Haojian; Wan, Wenfeng; Fukuda, Toshio; Shen, Yajing

2016-03-04

Cell cutting is a significant task in biological research, but highly productive non-embedded cell cutting is still a big challenge for current techniques. This paper proposes a vision-based nano robotic system and then realizes automatic non-embedded cell cutting with this system. First, the nano robotic system is developed and integrated with a nanoknife inside an environmental scanning electron microscope (ESEM). Then, the positions of the nanoknife and the single cell are recognized, and the distance between them is calculated dynamically based on image processing. To guarantee the positioning accuracy and the working efficiency, we propose a distance-regulated speed adapting strategy, in which the moving speed is adjusted intelligently based on the distance between the nanoknife and the target cell. The results indicate that automatic non-embedded cutting can be achieved within 1-2 min with low invasiveness, benefiting from the high-precision nanorobot system and the sharp edge of the nanoknife. This research paves the way for high-throughput cell cutting in the cell's natural condition, which is expected to have a significant impact on biological studies, especially for in-situ analysis at the cellular and subcellular scale, such as cell interaction investigation, neural signal transduction and low-invasive cell surgery.
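The abstract does not state the control law behind the distance-regulated speed adapting strategy. The sketch below assumes the simplest plausible form, a speed proportional to the remaining distance and clamped between a minimum and a maximum, to show why such a scheme is fast when far from the target and precise when close. All names, units and numeric values are hypothetical.

```python
def adapted_speed(distance_um, v_min=0.05, v_max=5.0, gain=0.5):
    """Hypothetical speed law in micrometres per second.

    The speed grows linearly with the remaining distance and is clamped to
    [v_min, v_max]; it is zero once the target has been reached.
    """
    if distance_um <= 0.0:
        return 0.0
    return max(v_min, min(v_max, gain * distance_um))


def approach(start_um, dt=0.1, tolerance_um=0.01):
    """Simulate the approach and return the number of control steps taken."""
    distance = start_um
    steps = 0
    while distance > tolerance_um:
        # In the real system the distance would come from image processing;
        # here it is simply integrated from the commanded speed.
        distance = max(0.0, distance - adapted_speed(distance) * dt)
        steps += 1
    return steps


if __name__ == "__main__":
    print(approach(50.0))
```

Compared with moving at the minimum speed throughout, the adaptive law needs far fewer steps for the same final tolerance, which is the trade-off between efficiency and positioning accuracy that the abstract describes.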
Intelligence for embedded systems a methodological approach

CERN Document Server

Alippi, Cesare

2014-01-01

Addressing current issues of which any engineer or computer scientist should be aware, this monograph is a response to the need to adopt a new computational paradigm as the methodological basis for designing pervasive embedded systems with sensor capabilities. The requirements of this paradigm are to control complexity, to limit cost and energy consumption, and to provide adaptation and cognition abilities allowing the embedded system to interact proactively with the real world. The quest for such intelligence requires the formalization of a new generation of intelligent systems able to exploit advances in digital architectures and in sensing technologies. The book sheds light on the theory behind intelligence for embedded systems with specific focus on: · robustness (the robustness of a computational flow and its evaluation); · intelligence (how to mimic the adaptation and cognition abilities of the human brain); · the capacity to learn in non-stationary and evolv...
Comparison of Pilot Symbol Embedded Channel Estimation Algorithms

Directory of Open Access Journals (Sweden)

P. Kadlec

2009-12-01

In this paper, algorithms for pilot symbol embedded channel estimation are compared. Attention is focused on the Least Squares (LS) channel estimation and the Sliding Correlator (SC) algorithm. Both algorithms are implemented in Matlab to estimate the Channel Impulse Response (CIR) of a channel exhibiting multi-path propagation. The algorithms are compared in terms of their computational demands and of the influence of the Additive White Gaussian Noise (AWGN), the embedded pilot symbol, and the computed CIR on the estimation error.
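As a minimal sketch of the sliding correlator idea, the code below correlates a received signal with a known pilot sequence at successive lags to estimate the channel taps. It is written in Python rather than the Matlab used in the paper, is noise-free, and the choice of a Barker-13 pilot, the three-tap channel and all function names are assumptions for illustration only.

```python
# Barker-13 sequence, chosen here because its autocorrelation sidelobes are small.
BARKER_13 = [1, 1, 1, 1, 1, -1, -1, 1, 1, -1, 1, -1, 1]


def convolve(signal, taps):
    """Linear convolution: models the pilot passing through a multipath channel."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(taps):
            out[i + j] += s * h
    return out


def sliding_correlator(received, pilot, num_taps):
    """Estimate `num_taps` channel taps by correlating with the pilot at each lag."""
    length = len(pilot)
    estimate = []
    for lag in range(num_taps):
        acc = sum(received[lag + i] * pilot[i] for i in range(length))
        estimate.append(acc / length)  # normalise by the pilot energy
    return estimate


if __name__ == "__main__":
    channel = [1.0, 0.5, 0.25]  # hypothetical channel impulse response
    received = convolve(BARKER_13, channel)
    print(sliding_correlator(received, BARKER_13, len(channel)))
```

The estimate is slightly biased because the pilot's autocorrelation is not a perfect impulse; a least squares estimator removes that bias at the price of a matrix inversion, which is the kind of accuracy versus computation trade-off the paper compares.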
The Centre of High-Performance Scientific Computing, Geoverbund, ABC/J - Geosciences enabled by HPSC

Science.gov (United States)

Kollet, Stefan; Görgen, Klaus; Vereecken, Harry; Gasper, Fabian; Hendricks-Franssen, Harrie-Jan; Keune, Jessica; Kulkarni, Ketan; Kurtz, Wolfgang; Sharples, Wendy; Shrestha, Prabhakar; Simmer, Clemens; Sulis, Mauro; Vanderborght, Jan

2016-04-01

The Centre of High-Performance Scientific Computing (HPSC TerrSys) was founded in 2011 to establish a centre of competence in high-performance scientific computing in terrestrial systems and the geosciences, enabling fundamental and applied geoscientific research in the Geoverbund ABC/J (geoscientific research alliance of the Universities of Aachen, Cologne, Bonn and the Research Centre Jülich, Germany). The specific goals of HPSC TerrSys are to achieve relevance at the national and international level in (i) the development and application of HPSC technologies in the geoscientific community; (ii) student education; (iii) HPSC services and support also to the wider geoscientific community; and in (iv) the industry and public sectors via e.g., useful applications and data products. A key feature of HPSC TerrSys is the Simulation Laboratory Terrestrial Systems, which is located at the Jülich Supercomputing Centre (JSC) and provides extensive capabilities with respect to porting, profiling, tuning and performance monitoring of geoscientific software in JSC's supercomputing environment. We will present a summary of success stories of HPSC applications including integrated terrestrial model development, parallel profiling and its application from watersheds to the continent; massively parallel data assimilation using physics-based models and ensemble methods; quasi-operational terrestrial water and energy monitoring; and convection-permitting climate simulations over Europe. The success stories stress the need for a formalized education of students in the application of HPSC technologies in the future.
Current state and future direction of computer systems at NASA Langley Research Center

Science.gov (United States)

Rogers, James L. (Editor); Tucker, Jerry H. (Editor)

1992-01-01

Computer systems have advanced at a rate unmatched by any other area of technology. As performance has dramatically increased there has been an equally dramatic reduction in cost. This constant cost performance improvement has precipitated the pervasiveness of computer systems into virtually all areas of technology. This improvement is due primarily to advances in microelectronics. Most people are now convinced that the new generation of supercomputers will be built using a large number (possibly thousands) of high performance microprocessors. Although the spectacular improvements in computer systems have come about because of these hardware advances, there has also been a steady improvement in software techniques. In an effort to understand how these hardware and software advances will affect research at NASA LaRC, the Computer Systems Technical Committee drafted this white paper to examine the current state and possible future directions of computer systems at the Center. This paper discusses selected important areas of computer systems including real-time systems, embedded systems, high performance computing, distributed computing networks, data acquisition systems, artificial intelligence, and visualization.
Type-based homeomorphic embedding for online termination

DEFF Research Database (Denmark)

Albert, Elvira; Gallagher, John Patrick; Gómez-Zamalloa, Miguel

2009-01-01

that the computations supervised are performed over a finite signature, i.e., the number of constants and function symbols involved is finite. However, there are many situations, for example numeric computations, which involve an infinite signature and thus HEm does not guarantee termination. Some extensions to HEm for the case of infinite signatures have been proposed which guarantee termination. However, the existing techniques either do not provide systematic means for generating such extensions or the extensions are too simplistic and do not produce the expected results in practice. We propose Type-based Homeomorphic Embedding (TbHEm) as an extension of the standard, untyped, HEm. By taking static information about the behavior of the computation into account, expressed as types, TbHEm allows obtaining more precise results than those of the previous extensions to HEm for the case of infinite signatures. We show...
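For orientation, the following is a minimal sketch of the standard, untyped homeomorphic embedding test on ground terms, using the usual textbook "coupling" and "diving" rules. It is not the type-based variant (TbHEm) proposed in the paper, and the tuple encoding of terms is an assumption made for this example.

```python
def embeds(s, t):
    """Return True if ground term `s` is homeomorphically embedded in `t`.

    Terms are tuples: the first item is the function symbol, the remaining
    items are argument terms; a constant is a one-element tuple.
    """
    s_symbol, s_args = s[0], s[1:]
    t_symbol, t_args = t[0], t[1:]
    # Coupling: same symbol and arity, arguments embedded pairwise.
    if (s_symbol == t_symbol and len(s_args) == len(t_args)
            and all(embeds(a, b) for a, b in zip(s_args, t_args))):
        return True
    # Diving: s is embedded in one of the arguments of t.
    return any(embeds(s, arg) for arg in t_args)


if __name__ == "__main__":
    zero = ("0",)
    one = ("s", zero)            # Peano numeral s(0)
    two = ("s", one)             # s(s(0))
    print(embeds(one, two))      # finite signature {0, s}: growth is detected
    print(embeds(("1",), ("2",)))  # numbers as distinct constants: no embedding
```

The second call illustrates the problem the abstract raises: when every number is its own constant the signature is infinite, no term in the sequence 1, 2, 3, ... embeds an earlier one, and the untyped test never signals that the computation should stop.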
The Use of Video-Gaming Devices as a Motivation for Learning Embedded Systems Programming

Science.gov (United States)

Gonzalez, J.; Pomares, H.; Damas, M.; Garcia-Sanchez, P.; Rodriguez-Alvarez, M.; Palomares, J. M.

2013-01-01

As embedded systems are becoming prevalent in everyday life, many universities are incorporating embedded systems-related courses in their undergraduate curricula. However, it is not easy to motivate students in such courses since they conceive of embedded systems as bizarre computing elements, different from the personal computers with which they…
The ongoing investigation of high performance parallel computing in HEP

CERN Document Server

Peach, Kenneth J; Böck, R K; Dobinson, Robert W; Hansroul, M; Norton, Alan Robert; Willers, Ian Malcolm; Baud, J P; Carminati, F; Gagliardi, F; McIntosh, E; Metcalf, M; Robertson, L; CERN. Geneva. Detector Research and Development Committee

1993-01-01

Past and current exploitation of parallel computing in High Energy Physics is summarized and a list of R & D projects in this area is presented. The applicability of new parallel hardware and software to physics problems is investigated, in the light of the requirements for computing power of LHC experiments and the current trends in the computer industry. Four main themes are discussed (possibilities for a finer grain of parallelism; fine-grain communication mechanism; usable parallel programming environment; different programming models and architectures, using standard commercial products). Parallel computing technology is potentially of interest for offline and vital for real time applications in LHC. A substantial investment in applications development and evaluation of state of the art hardware and software products is needed. A solid development environment is required at an early stage, before mainline LHC program development begins.
DNS and Embedded DNS as Tools for Investigating Unsteady Heat Transfer Phenomena in Turbines

Science.gov (United States)

vonTerzi, Dominic; Bauer, H.-J.

2010-01-01

DNS is a powerful tool with high potential for investigating unsteady heat transfer and fluid flow phenomena, in particular for cases involving transition to turbulence and/or large coherent structures. - DNS of idealized configurations related to turbomachinery components is already possible. - For more realistic configurations and the inclusion of more effects, reduction of computational cost is a key issue (e.g., via hybrid methods). - Approach pursued here: embedded DNS (segregated coupling of DNS with LES and/or RANS). - Embedded DNS is an enabling technology for many studies. - Pre-transitional heat transfer and trailing-edge cutback film-cooling are good candidates for (embedded) DNS studies.

Power for Vehicle Embedded MEMS Sensors, Phase II

Data.gov (United States)

National Aeronautics and Space Administration — Embedded wireless sensors of the future will enable flight vehicle systems to be "highly aware" of onboard health and performance parameters, as well as the external...
Power for Vehicle Embedded MEMS Sensors, Phase I

Data.gov (United States)

National Aeronautics and Space Administration — Embedded wireless sensors of the future will enable flight vehicle systems to be "highly aware" of onboard health and performance parameters, as well as the external...
DOE High Performance Computing Operational Review (HPCOR): Enabling Data-Driven Scientific Discovery at HPC Facilities

Energy Technology Data Exchange (ETDEWEB)

Gerber, Richard; Allcock, William; Beggio, Chris; Campbell, Stuart; Cherry, Andrew; Cholia, Shreyas; Dart, Eli; England, Clay; Fahey, Tim; Foertter, Fernanda; Goldstone, Robin; Hick, Jason; Karelitz, David; Kelly, Kaki; Monroe, Laura; Prabhat; Skinner, David; White, Julia

2014-10-17

U.S. Department of Energy (DOE) High Performance Computing (HPC) facilities are on the verge of a paradigm shift in the way they deliver systems and services to science and engineering teams. Research projects are producing a wide variety of data at unprecedented scale and level of complexity, with community-specific services that are part of the data collection and analysis workflow. On June 18-19, 2014, representatives from six DOE HPC centers met in Oakland, CA at the DOE High Performance Computing Operational Review (HPCOR) to discuss how they can best provide facilities and services to enable large-scale data-driven scientific discovery at the DOE national laboratories. The report contains findings from that review.
Reward-based learning under hardware constraints-using a RISC processor embedded in a neuromorphic substrate.

Science.gov (United States)

Friedmann, Simon; Frémaux, Nicolas; Schemmel, Johannes; Gerstner, Wulfram; Meier, Karlheinz

2013-01-01

In this study, we propose and analyze in simulations a new, highly flexible method of implementing synaptic plasticity in a wafer-scale, accelerated neuromorphic hardware system. The study focuses on globally modulated STDP as a special use case of this method. Flexibility is achieved by embedding a general-purpose processor dedicated to plasticity into the wafer. To evaluate the suitability of the proposed system, we use a reward-modulated STDP rule in a spike train learning task. A single layer of neurons is trained to fire at specific points in time with only the reward as feedback. This model is simulated to measure its performance, i.e., the increase in received reward after learning. Using this performance as a baseline, we then simulate the model with various constraints imposed by the proposed implementation and compare the performance. The simulated constraints include discretized synaptic weights, a restricted interface between analog synapses and embedded processor, and mismatch of analog circuits. We find that probabilistic updates can increase the performance of low-resolution weights, that a simple interface between analog synapses and processor is sufficient for learning, and that performance is insensitive to mismatch. Further, we consider the communication latency between the wafer and the conventional control computer system that simulates the environment. This latency increases the delay with which the reward is sent to the embedded processor. Because of the time-continuous operation of the analog synapses, this delay can cause the updates to deviate from those in the undelayed case. We find that for highly accelerated systems latency has to be kept to a minimum. This study demonstrates the suitability of the proposed implementation to emulate the selected reward-modulated STDP learning rule. It is therefore an ideal candidate for implementation in an upgraded version of the wafer-scale system developed within the BrainScaleS project.
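One common way to realize probabilistic updates on low-resolution weights is stochastic rounding: a fractional weight change is applied as a whole step with a probability equal to its fractional part, so that small updates are preserved on average instead of always being rounded to zero. The sketch below assumes this scheme and a 4-bit weight range; the abstract does not specify the exact rule or resolution used, and all names are hypothetical.

```python
import math
import random


def stochastic_update(weight, delta, rng, w_max=15):
    """Apply a fractional update `delta` to an integer weight in [0, w_max].

    The integer part of `delta` is always applied; one extra step is added
    with probability equal to the fractional part (stochastic rounding).
    """
    base = math.floor(delta)
    fraction = delta - base
    step = base + (1 if rng.random() < fraction else 0)
    return min(w_max, max(0, weight + step))


if __name__ == "__main__":
    rng = random.Random(0)  # seeded for reproducibility
    trials = 10000
    mean_change = sum(stochastic_update(5, 0.3, rng) - 5
                      for _ in range(trials)) / trials
    # Deterministic rounding would apply round(0.3) == 0 every time.
    print(mean_change)
```

Averaged over many updates the expected change matches the requested fractional value, which is one explanation for why probabilistic updates can help when only a few weight levels are available.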
Engineering embedded systems physics, programs, circuits

CERN Document Server

Hintenaus, Peter

2015-01-01

This is a textbook for graduate and final-year-undergraduate computer-science and electrical-engineering students interested in the hardware and software aspects of embedded and cyberphysical systems design. It is comprehensive and self-contained, covering everything from the basics to case-study implementation. Emphasis is placed on the physical nature of the problem domain and of the devices used. The reader is assumed to be familiar on a theoretical level with mathematical tools like ordinary differential equations and Fourier transforms. In this book these tools will be put to practical use. Engineering Embedded Systems begins by addressing basic material on signals and systems, before introducing electronics. Treatment of digital electronics accentuating synchronous circuits and including high-speed effects proceeds to micro-controllers, digital signal processors and programmable logic. Peripheral units and decentralized networks are given due weight. The properties of analog circuits and devices like ...
JMS: An Open Source Workflow Management System and Web-Based Cluster Front-End for High Performance Computing

Science.gov (United States)

Brown, David K.; Penkler, David L.; Musyoka, Thommas M.; Bishop, Özlem Tastan

2015-01-01

Complex computational pipelines are becoming a staple of modern scientific research. Often these pipelines are resource intensive and require days of computing time. In such cases, it makes sense to run them over high performance computing (HPC) clusters where they can take advantage of the aggregated resources of many powerful computers. In addition to this, researchers often want to integrate their workflows into their own web servers. In these cases, software is needed to manage the submission of jobs from the web interface to the cluster and then return the results once the job has finished executing. We have developed the Job Management System (JMS), a workflow management system and web interface for high performance computing (HPC). JMS provides users with a user-friendly web interface for creating complex workflows with multiple stages. It integrates this workflow functionality with the resource manager, a tool that is used to control and manage batch jobs on HPC clusters. As such, JMS combines workflow management functionality with cluster administration functionality. In addition, JMS provides developer tools including a code editor and the ability to version tools and scripts. JMS can be used by researchers from any field to build and run complex computational pipelines and provides functionality to include these pipelines in external interfaces. JMS is currently being used to house a number of bioinformatics pipelines at the Research Unit in Bioinformatics (RUBi) at Rhodes University. JMS is an open-source project and is freely available at https://github.com/RUBi-ZA/JMS. PMID:26280450
SAME4HPC: A Promising Approach in Building a Scalable and Mobile Environment for High-Performance Computing

Energy Technology Data Exchange (ETDEWEB)

Karthik, Rajasekar [ORNL

2014-01-01

In this paper, an architecture for building a Scalable And Mobile Environment For High-Performance Computing with spatial capabilities, called SAME4HPC, is described using cutting-edge technologies and standards such as Node.js, HTML5, ECMAScript 6, and PostgreSQL 9.4. Mobile devices are increasingly becoming powerful enough to run high-performance apps. At the same time, there exists a significant number of low-end and older devices that rely heavily on the server or the cloud infrastructure to do the heavy lifting. Our architecture aims to support both of these types of devices to provide a high-performance, rich user experience. A cloud infrastructure consisting of OpenStack with Ubuntu, GeoServer, and high-performance JavaScript frameworks are some of the key open-source and industry-standard practices that have been adopted in this architecture.
Integrated Design Tools for Embedded Control Systems

OpenAIRE

Jovanovic, D.S.; Hilderink, G.H.; Broenink, Johannes F.; Karelse, F.

2001-01-01

Currently, computer-based control systems are still being implemented using the same techniques as 10 years ago. The purpose of this project is the development of a design framework, consisting of tools and libraries, which allows the designer to build highly reliable heterogeneous real-time embedded systems in a very short time at a fraction of the present-day costs. The ultimate focus of current research is on transforming control laws to efficient concurrent algorithms, with concerns about...
Enabling Efficient Climate Science Workflows in High Performance Computing Environments

Science.gov (United States)

Krishnan, H.; Byna, S.; Wehner, M. F.; Gu, J.; O'Brien, T. A.; Loring, B.; Stone, D. A.; Collins, W.; Prabhat, M.; Liu, Y.; Johnson, J. N.; Paciorek, C. J.

2015-12-01

A typical climate science workflow often involves a combination of acquisition of data, modeling, simulation, analysis, visualization, publishing, and storage of results. Each of these tasks provides a myriad of challenges when running on a high performance computing environment such as Hopper or Edison at NERSC. Hurdles such as data transfer and management, job scheduling, parallel analysis routines, and publication require a lot of forethought and planning to ensure that proper quality control mechanisms are in place. These steps require effectively utilizing a combination of well-tested and newly developed functionality to move data, perform analysis, apply statistical routines, and finally, serve results and tools to the greater scientific community. As part of the CAlibrated and Systematic Characterization, Attribution and Detection of Extremes (CASCADE) project we highlight a stack of tools our team utilizes and has developed to ensure that large scale simulation and analysis work are commonplace and provide operations that assist in everything from generation/procurement of data (HTAR/Globus) to automating publication of results to portals like the Earth Systems Grid Federation (ESGF), all while executing everything in between in a scalable environment in a task-parallel way (MPI). We highlight the use and benefit of these tools by showing several climate science analysis use cases they have been applied to.
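The task-parallel execution mentioned in this abstract can be illustrated with a minimal sketch. It is not the CASCADE tooling: it only shows one common way to split independent tasks (here, hypothetical file names) across parallel processes by static round-robin assignment. The function name `tasks_for_rank` is an assumption; in an MPI job the rank and size would normally come from the communicator (for example `MPI.COMM_WORLD.Get_rank()` and `Get_size()` in mpi4py), but they are passed in explicitly here so the sketch runs without MPI.

```python
def tasks_for_rank(tasks, rank, size):
    """Return the tasks that process `rank` (0-based) out of `size` should handle.

    Static round-robin split: rank r takes tasks r, r + size, r + 2*size, ...
    Every task is assigned to exactly one rank.
    """
    if not 0 <= rank < size:
        raise ValueError("rank must satisfy 0 <= rank < size")
    return tasks[rank::size]


if __name__ == "__main__":
    # Hypothetical input files; a real workflow would list its own data sets.
    files = [f"run_{i:03d}.nc" for i in range(10)]
    for rank in range(4):
        print(rank, tasks_for_rank(files, rank, 4))
```

Round-robin assignment needs no communication between processes, which suits independent analysis tasks; workloads with very uneven task durations may prefer dynamic scheduling instead.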
Integrated Design and Implementation of Embedded Control Systems with Scilab.

Science.gov (United States)

Ma, Longhua; Xia, Feng; Peng, Zhe

2008-09-05

Embedded systems are playing an increasingly important role in control engineering. Despite their popularity, embedded systems are generally subject to resource constraints and it is therefore difficult to build complex control systems on embedded platforms. Traditionally, the design and implementation of control systems are often separated, which causes the development of embedded control systems to be highly time-consuming and costly. To address these problems, this paper presents a low-cost, reusable, reconfigurable platform that enables integrated design and implementation of embedded control systems. To minimize the cost, free and open source software packages such as Linux and Scilab are used. Scilab is ported to the embedded ARM-Linux system. The drivers for interfacing Scilab with several communication protocols including serial, Ethernet, and Modbus are developed. Experiments are conducted to test the developed embedded platform. The use of Scilab enables implementation of complex control algorithms on embedded platforms. With the developed platform, it is possible to perform all phases of the development cycle of embedded control systems in a unified environment, thus facilitating the reduction of development time and cost.
OpenVX-based Python Framework for real-time cross platform acceleration of embedded computer vision applications

Directory of Open Access Journals (Sweden)

Ori Heimlich

2016-11-01

Embedded real-time vision applications are being rapidly deployed in a large realm of consumer electronics, ranging from automotive safety to surveillance systems. However, the relatively limited computational power of embedded platforms is considered a bottleneck for many vision applications, necessitating optimization. OpenVX is a standardized interface, released in late 2014, in an attempt to provide both system and kernel level optimization to vision applications. With OpenVX, vision processing is modeled with coarse-grained data flow graphs, which can be optimized and accelerated by the platform implementer. Current full implementations of OpenVX are given in the programming language C, which does not support advanced programming paradigms such as object-oriented, imperative and functional programming, nor does it have runtime or type-checking. Here we present a Python-based full implementation of OpenVX, which eliminates many of the discrepancies between the object-oriented paradigm used by many modern applications and the native C implementations. Our open-source implementation can be used for rapid development of OpenVX applications in embedded platforms. Demonstration includes static and real-time image acquisition and processing using a Raspberry Pi and a GoPro camera. Code is given as supplementary information. Code project and linked deployable virtual machine are located on GitHub: https://github.com/NBEL-lab/PythonOpenVX.
Bendable solid-state supercapacitors with Au nanoparticle-embedded graphene hydrogel films

Science.gov (United States)

Yang, Kyungwhan; Cho, Kyoungah; Yoon, Dae Sung; Kim, Sangsig

2017-01-01

In this study, we fabricate bendable solid-state supercapacitors with Au nanoparticle (NP)-embedded graphene hydrogel (GH) electrodes and investigate the influence of the Au NP embedment on the internal resistance and capacitive performance. Embedding the Au NPs into the GH electrodes results in a decrease of the internal resistance from 35 to 21 Ω, and a threefold reduction of the IR drop at a current density of 5 A/g when compared with GH electrodes without Au NPs. The Au NP-embedded GH supercapacitors (NP-GH SCs) exhibit excellent capacitive performances, with large specific capacitance (135 F/g) and high energy density (15.2 W·h/kg). Moreover, the NP-GH SCs exhibit comparable areal capacitance (168 mF/cm²) and operate under tensile/compressive bending. PMID:28074865
Mixed-Language High-Performance Computing for Plasma Simulations

Directory of Open Access Journals (Sweden)

Quanming Lu

2003-01-01

Java is receiving increasing attention as the most popular platform for distributed computing. However, programmers are still reluctant to embrace Java as a tool for writing scientific and engineering applications due to its still noticeable performance drawbacks compared with other programming languages such as Fortran or C. In this paper, we present a hybrid Java/Fortran implementation of a parallel particle-in-cell (PIC) algorithm for plasma simulations. In our approach, the time-consuming components of this application are designed and implemented as Fortran subroutines, while less calculation-intensive components usually involved in building the user interface are written in Java. The two types of software modules have been glued together using the Java native interface (JNI). Our mixed-language PIC code was tested and its performance compared with pure Java and Fortran versions of the same algorithm on a Sun E6500 SMP system and a Linux cluster of Pentium III machines.
Scheduling of network access for feedback-based embedded systems

Science.gov (United States)

Liberatore, Vincenzo

2002-07-01

nd communication capabilities. Examples range from smart dust embedded in building materials to networks of appliances in the home. Embedded devices will be deployed in unprecedented numbers, will enable pervasive distributed computing, and will radically change the way people interact with the surrounding environment [EGH00a]. The paper targets embedded systems and their real-time (RT) communication requirements. RT requirements arise from the
Minimum curvilinearity to enhance topological prediction of protein interactions by network embedding

KAUST Repository

Cannistraci, Carlo

2013-06-21

Motivation: Most functions within the cell emerge thanks to protein-protein interactions (PPIs), yet experimental determination of PPIs is both expensive and time-consuming. PPI networks present significant levels of noise and incompleteness. Predicting interactions using only PPI-network topology (topological prediction) is difficult but essential when prior biological knowledge is absent or unreliable. Methods: Network embedding emphasizes the relations between network proteins embedded in a low-dimensional space, in which protein pairs that are closer to each other represent good candidate interactions. To achieve network denoising, which boosts prediction performance, we first applied minimum curvilinear embedding (MCE), and then adopted shortest path (SP) in the reduced space to assign likelihood scores to candidate interactions. Furthermore, we introduce (i) a new valid variation of MCE, named non-centred MCE (ncMCE); (ii) two automatic strategies for selecting the appropriate embedding dimension; and (iii) two new randomized procedures for evaluating predictions. Results: We compared our method against several unsupervised and supervisedly tuned embedding approaches and node neighbourhood techniques. Despite its computational simplicity, ncMCE-SP was the overall leader, outperforming the current methods in topological link prediction. Conclusion: Minimum curvilinearity is a valuable non-linear framework that we successfully applied to the embedding of protein networks for the unsupervised prediction of novel PPIs. The rationale for our approach is that biological and evolutionary information is imprinted in the non-linear patterns hidden behind the protein network topology, and can be exploited for predicting new protein links. The predicted PPIs represent good candidates for testing in high-throughput experiments or for exploitation in systems biology tools such as those used for network-based inference and prediction of disease-related functional modules. The
Tackling some of the most intricate geophysical challenges via high-performance computing

Science.gov (United States)

Khosronejad, A.

2016-12-01

Recently, the world has witnessed significant enhancements in the computing power of supercomputers. Computer clusters, in conjunction with advanced mathematical algorithms, have set the stage for developing and applying powerful numerical tools to tackle some of the most intricate geophysical challenges that today's engineers face. One such challenge is to understand how turbulent flows, in real-world settings, interact with (a) rigid and/or mobile complex bed bathymetry of waterways and sea-beds in the coastal areas; (b) objects with complex geometry that are fully or partially immersed; and (c) the free surface of waterways and water surface waves in the coastal area. This understanding is especially important because the turbulent flows in real-world environments are often bounded by geometrically complex boundaries, which dynamically deform and give rise to multi-scale and multi-physics transport phenomena, and are characterized by multi-lateral interactions among various phases (e.g. air/water/sediment phases). Herein, I present some of the multi-scale and multi-physics geophysical fluid mechanics processes that I have attempted to study using an in-house high-performance computational model, the so-called VFS-Geophysics. More specifically, I will present the simulation results of turbulence/sediment/solute/turbine interactions in real-world settings. Parts of the simulations I present are performed to gain scientific insights into processes such as sand wave formation (A. Khosronejad and F. Sotiropoulos, (2014), Numerical simulation of sand waves in a turbulent open channel flow, Journal of Fluid Mechanics, 753:150-216), while others are carried out to predict the effects of climate change and large flood events on societal infrastructures (A. Khosronejad, et al., (2016), Large eddy simulation of turbulence and solute transport in a forested headwater stream, Journal of Geophysical Research, doi: 10.1002/2014JF003423).
The tracking performance of distributed recoverable flight control systems subject to high intensity radiated fields

Science.gov (United States)

Wang, Rui

It is known that high intensity radiated fields (HIRF) can produce upsets in digital electronics, and thereby degrade the performance of digital flight control systems. Such upsets, either from natural or man-made sources, can change data values on digital buses and memory and affect CPU instruction execution. HIRF environments are also known to trigger common-mode faults, affecting nearly-simultaneously multiple fault containment regions, and hence reducing the benefits of n-modular redundancy and other fault-tolerant computing techniques. Thus, it is important to develop models which describe the integration of the embedded digital system, where the control law is implemented, as well as the dynamics of the closed-loop system. In this dissertation, theoretical tools are presented to analyze the relationship between the design choices for a class of distributed recoverable computing platforms and the tracking performance degradation of a digital flight control system implemented on such a platform while operating in a HIRF environment. Specifically, a tractable hybrid performance model is developed for a digital flight control system implemented on a computing platform inspired largely by the NASA family of fault-tolerant, reconfigurable computer architectures known as SPIDER (scalable processor-independent design for enhanced reliability). The focus will be on the SPIDER implementation, which uses the computer communication system known as ROBUS-2 (reliable optical bus). A physical HIRF experiment was conducted at the NASA Langley Research Center in order to validate the theoretical tracking performance degradation predictions for a distributed Boeing 747 flight control system subject to a HIRF environment. An extrapolation of these results for scenarios that could not be physically tested is also presented.
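The remark that common-mode faults reduce the benefit of n-modular redundancy can be made concrete with a small sketch. This is a generic majority voter, not the SPIDER or ROBUS-2 mechanism described in the dissertation; the function name `majority_vote` and the sample values are assumptions chosen for illustration.

```python
from collections import Counter


def majority_vote(outputs):
    """Return the value produced by a strict majority of replicas, else None."""
    value, count = Counter(outputs).most_common(1)[0]
    return value if count > len(outputs) // 2 else None


if __name__ == "__main__":
    correct, upset = 1.0, 9.9  # illustrative replica outputs

    # Independent fault: one of three replicas is upset and the voter masks it.
    print(majority_vote([correct, correct, upset]))

    # Common-mode fault: two replicas are upset in the same way at the same
    # time, so the voter confidently selects the wrong value.
    print(majority_vote([upset, upset, correct]))
```

The voter only masks faults that stay in a minority of replicas, which is why a disturbance that hits several fault containment regions nearly simultaneously undermines the scheme.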

Operating system concepts for embedded multicores

OpenAIRE

Horst, Oliver; Schmidt, Adriaan

2014-01-01

Currently we can see an increasing adoption of multi-core platforms in the area of embedded systems. While these new hardware platforms offer the potential to satisfy the ever increasing demand for computational power, they pose considerable challenges with regard to software development. This affects the application software itself, but also the system design and architecture. Here, we address the consequences for operating system architecture in embedded systems. After discussing current a...
Parametric embedding for class visualization.

Science.gov (United States)

Iwata, Tomoharu; Saito, Kazumi; Ueda, Naonori; Stromsten, Sean; Griffiths, Thomas L; Tenenbaum, Joshua B

2007-09-01

We propose a new method, parametric embedding (PE), that embeds objects with the class structure into a low-dimensional visualization space. PE takes as input a set of class conditional probabilities for given data points and tries to preserve the structure in an embedding space by minimizing a sum of Kullback-Leibler divergences, under the assumption that samples are generated by a Gaussian mixture with equal covariances in the embedding space. PE has many potential uses depending on the source of the input data, providing insight into the classifier's behavior in supervised, semisupervised, and unsupervised settings. The PE algorithm has a computational advantage over conventional embedding methods based on pairwise object relations since its complexity scales with the product of the number of objects and the number of classes. We demonstrate PE by visualizing supervised categorization of Web pages, semisupervised categorization of digits, and the relations of words and latent topics found by an unsupervised algorithm, latent Dirichlet allocation.
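The objective described in this abstract can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes unit (identity) covariances for the equal-covariance Gaussian mixture, takes the class priors as an explicit argument, and only evaluates the sum of Kullback-Leibler divergences for given coordinates (the optimization step is omitted). All function and parameter names are hypothetical.

```python
import math


def embedding_posteriors(point, centers, priors):
    """Class posteriors q(c | r) for an embedded point r.

    Assumes a Gaussian mixture with identity covariance for every class, so
    q(c | r) is proportional to prior_c * exp(-0.5 * ||r - center_c||^2).
    """
    logits = [
        math.log(prior) - 0.5 * sum((a - b) ** 2 for a, b in zip(point, center))
        for center, prior in zip(centers, priors)
    ]
    top = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(value - top) for value in logits]
    total = sum(weights)
    return [w / total for w in weights]


def pe_objective(input_posteriors, points, centers, priors):
    """Sum over objects of KL(p(. | x_n) || q(. | r_n))."""
    total = 0.0
    for p, r in zip(input_posteriors, points):
        q = embedding_posteriors(r, centers, priors)
        # Terms with p_c == 0 contribute nothing to the divergence.
        total += sum(pc * math.log(pc / qc) for pc, qc in zip(p, q) if pc > 0.0)
    return total


if __name__ == "__main__":
    centers = [(0.0, 0.0), (2.0, 0.0)]  # two class centers in a 2-D space
    priors = [0.5, 0.5]
    # One object placed midway between the centers.
    print(pe_objective([[0.5, 0.5]], [(1.0, 0.0)], centers, priors))
    print(pe_objective([[0.9, 0.1]], [(1.0, 0.0)], centers, priors))
```

Each evaluation loops over objects and classes only, which matches the abstract's statement that the cost scales with the product of the number of objects and the number of classes rather than with the number of object pairs.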
Enabling MPEG-2 video playback in embedded systems through improved data cache efficiency

Science.gov (United States)

Soderquist, Peter; Leeser, Miriam E.

1999-01-01

Digital video decoding, enabled by the MPEG-2 Video standard, is an important future application for embedded systems, particularly PDAs and other information appliances. Many such systems require portability and wireless communication capabilities, and thus face severe limitations in size and power consumption. This places a premium on integration and efficiency, and favors software solutions for video functionality over specialized hardware. The processors in most embedded systems currently lack the computational power needed to perform video decoding, but a related and equally important problem is the required data bandwidth, and the need to cost-effectively ensure adequate data supply. MPEG data sets are very large, and generate significant amounts of excess memory traffic for standard data caches, up to 100 times the amount required for decoding. Meanwhile, cost and power limitations restrict cache sizes in embedded systems. Some systems, including many media processors, eliminate caches in favor of memories under direct, painstaking software control in the manner of digital signal processors. Yet MPEG data has locality which caches can exploit if properly optimized, providing fast, flexible, and automatic data supply. We propose a set of enhancements which target the specific needs of the heterogeneous types within the MPEG decoder working set. These optimizations significantly improve the efficiency of small caches, reducing cache-memory traffic by almost 70 percent, and can make an enhanced 4 KB cache perform better than a standard 1 MB cache. This performance improvement can enable high-resolution, full frame rate video playback in cheaper, smaller systems than would otherwise be possible.
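The notion of excess cache-memory traffic in this abstract can be illustrated with a toy model. The sketch below is not the enhanced cache proposed by the authors: it simulates a plain direct-mapped cache, counts only the bytes fetched on read misses (write-back traffic is ignored), and uses an assumed 32-byte line size. The function name `cache_traffic` and both access patterns are hypothetical.

```python
def cache_traffic(addresses, cache_bytes=4096, line_bytes=32):
    """Bytes fetched from memory by a direct-mapped cache for a byte-address trace."""
    num_lines = cache_bytes // line_bytes
    resident = [None] * num_lines  # block number currently held by each line
    fetched = 0
    for addr in addresses:
        block = addr // line_bytes
        index = block % num_lines
        if resident[index] != block:  # miss: fetch the whole line
            resident[index] = block
            fetched += line_bytes
    return fetched


if __name__ == "__main__":
    # Sequential access: every fetched byte is used.
    sequential = list(range(1024))
    print(cache_traffic(sequential), "bytes fetched for", len(sequential), "bytes read")

    # Two addresses that map to the same line of a 4 KB cache keep evicting
    # each other, so every single-byte read fetches a full line.
    conflicting = [0, 4096] * 5
    print(cache_traffic(conflicting), "bytes fetched for", len(conflicting), "bytes read")
```

The ratio of bytes fetched to bytes actually read is one simple way to quantify excess traffic; the abstract's figures come from the authors' own measurements on MPEG-2 decoding, not from a model like this.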
Computational fluid dynamics (CFD) assisted performance evaluation of the Twincer (TM) disposable high-dose dry powder inhaler

NARCIS (Netherlands)

de Boer, Anne H.; Hagedoorn, Paul; Woolhouse, Robert; Wynn, Ed

Objectives To use computational fluid dynamics (CFD) for evaluating and understanding the performance of the high-dose disposable Twincer (TM) dry powder inhaler, as well as to learn the effect of design modifications on dose entrainment, powder dispersion and retention behaviour. Methods Comparison
Embedded pitch adapters: A high-yield interconnection solution for strip sensors

Energy Technology Data Exchange (ETDEWEB)

Ullán, M., E-mail: miguel.ullan@imb-cnm.csic.es [Centro Nacional de Microelectronica (IMB-CNM, CSIC), Campus UAB-Bellaterra, 08193 Barcelona (Spain); Allport, P.P.; Baca, M.; Broughton, J.; Chisholm, A.; Nikolopoulos, K.; Pyatt, S.; Thomas, J.P.; Wilson, J.A. [School of Physics and Astronomy, University of Birmingham, Birmingham B15 2TT (United Kingdom); Kierstead, J.; Kuczewski, P.; Lynn, D. [Brookhaven National Laboratory, Physics Department and Instrumentation Division, Upton, NY 11973-5000 (United States); Hommels, L.B.A. [Cavendish Laboratory, University of Cambridge, JJ Thomson Avenue, Cambridge CB3 0HE (United Kingdom); Fleta, C.; Fernandez-Tejero, J.; Quirion, D. [Centro Nacional de Microelectronica (IMB-CNM, CSIC), Campus UAB-Bellaterra, 08193 Barcelona (Spain); Bloch, I.; Díez, S.; Gregor, I.M.; Lohwasser, K. [DESY, Notkestrasse 85, 22607 Hamburg (Germany); and others

2016-09-21

A proposal to fabricate large area strip sensors with integrated, or embedded, pitch adapters is presented for the End-cap part of the Inner Tracker in the ATLAS experiment. To implement the embedded pitch adapters, a second metal layer is used in the sensor fabrication, for signal routing to the ASICs. Sensors with different embedded pitch adapters have been fabricated in order to optimize the design and technology. Inter-strip capacitance, noise, pick-up, cross-talk, signal efficiency, and fabrication yield have been taken into account in their design and fabrication. Inter-strip capacitance tests taking into account all channel neighbors reveal the important differences between the various designs considered. These tests have been correlated with noise figures obtained in fully assembled modules, showing that the tests performed on the bare sensors are a valid tool to estimate the final noise in the full module. The full modules have been subjected to test beam experiments in order to evaluate the incidence of cross-talk, pick-up, and signal loss. The detailed analysis shows no indication of cross-talk or pick-up as no additional hits can be observed in any channel not being hit by the beam above 170 mV threshold, and the signal in those channels is always below 1% of the signal recorded in the channel being hit, above 100 mV threshold. First results on irradiated mini-sensors with embedded pitch adapters do not show any change in the interstrip capacitance measurements with only the first neighbors connected.
Big Data and High-Performance Computing in Global Seismology

Science.gov (United States)

Bozdag, Ebru; Lefebvre, Matthieu; Lei, Wenjie; Peter, Daniel; Smith, James; Komatitsch, Dimitri; Tromp, Jeroen

2014-05-01

Much of our knowledge of Earth's interior is based on seismic observations and measurements. Adjoint methods provide an efficient way of incorporating 3D full wave propagation in iterative seismic inversions to enhance tomographic images and thus our understanding of processes taking place inside the Earth. Our aim is to take adjoint tomography, which has been successfully applied to regional and continental scale problems, further to image the entire planet. This is one of the extreme imaging challenges in seismology, mainly due to the intense computational requirements and vast amount of high-quality seismic data that can potentially be assimilated. We have started low-resolution inversions (T > 30 s and T > 60 s for body and surface waves, respectively) with a limited data set (253 carefully selected earthquakes and seismic data from permanent and temporary networks) on Oak Ridge National Laboratory's Cray XK7 "Titan" system. Recent improvements in our 3D global wave propagation solvers, such as a GPU version of the SPECFEM3D_GLOBE package, will enable us to perform higher-resolution (T > 9 s) and longer duration (~180 m) simulations to take advantage of high-frequency body waves and major-arc surface waves, thereby improving imbalanced ray coverage as a result of the uneven global distribution of sources and receivers. Our ultimate goal is to use all earthquakes in the global CMT catalogue within the magnitude range of our interest and data from all available seismic networks. To take full advantage of computational resources, we need a solid framework to manage big data sets during numerical simulations, pre-processing (i.e., data requests and quality checks, processing data, window selection, etc.) and post-processing (i.e., pre-conditioning and smoothing kernels, etc.). We address the bottlenecks in our global seismic workflow, which are mainly coming from heavy I/O traffic during simulations and the pre- and post-processing stages, by defining new data
Embedded Empiricisms in Soft Soil Technology

Science.gov (United States)

Wijeyesekera, D. C.; John, L. M. S. Alvin; Adnan, Z.

2016-07-01

Civil engineers of today are continuously challenged by innovative projects that push further the knowledge boundaries with conceptual and/or ingenious solutions leading to the realization of what was once considered impossible in the realms of geotechnology. Some of the forward developments rely on empirical methods embedded within soft soil technology and the spectral realms of engineering in its entirety. Empiricisms, unlike folklore, are not always shrouded in mysticism but can find scientific reasoning to justify their adoption in design and tangible construction projects. This lecture therefore is an outline exposition of how empiricism has been integrally embedded in total empirical beginnings in the evolution of soft soil technology from the Renaissance time, through the developments of soil mechanics in the 19th century which in turn has paved the way to the rise of computational soil mechanics. Developments in computational soil mechanics have always embraced and are founded on a wide backdrop of empirical geoenvironment simulations. However, it is imperative that a competent geotechnical engineer has postgraduate training combined with empiricism that is based on years of well-winnowed practical experience to fathom the diverseness and complexity of nature. However, experience being regarded more highly than expertise can, perhaps inadvertently, inhibit development and innovation.
Miniature high speed compressor having embedded permanent magnet motor

Science.gov (United States)

Zhou, Lei (Inventor); Zheng, Liping (Inventor); Chow, Louis (Inventor); Kapat, Jayanta S. (Inventor); Wu, Thomas X. (Inventor); Kota, Krishna M. (Inventor); Li, Xiaoyi (Inventor); Acharya, Dipjyoti (Inventor)

2011-01-01

A high speed centrifugal compressor for compressing fluids includes a permanent magnet synchronous motor (PMSM) having a hollow shaft, the shaft being supported on its ends by ball bearing supports. A permanent magnet core is embedded inside the shaft. A stator with a winding is located radially outward of the shaft. The PMSM includes a rotor including at least one impeller secured to the shaft or integrated with the shaft as a single piece. The rotor is a high rigidity rotor providing a bending mode speed of at least 100,000 RPM which advantageously permits implementation of relatively low-cost ball bearing supports.
An embedded control and acquisition system for multichannel detectors

International Nuclear Information System (INIS)

Gori, L.; Tommasini, R.; Cautero, G.; Giuressi, D.; Barnaba, M.; Accardo, A.; Carrato, S.; Paolucci, G.

1999-01-01

We present a pulse counting multichannel data acquisition system, characterized by a high number of high speed acquisition channels and by a modular, embedded system architecture. The former leads to very fast acquisitions and makes it possible to obtain sequences of snapshots for the study of time-dependent phenomena. The latter, thanks to the integration of a CPU into the system, provides high computational capabilities, so that interfacing with the user computer is very simple and user friendly. Moreover, the user computer is free from control and acquisition tasks. The system has been developed for one of the beamlines of the third generation synchrotron radiation source ELETTRA and, because of its modular architecture, can be useful in various other kinds of experiments where parallel acquisition, high data rates, and user friendliness are required. First experimental results on a double pass hemispherical electron analyser provided with a 96 channel detector confirm the validity of the approach. (author)
Efficient Implementation of Solvers for Linear Model Predictive Control on Embedded Devices

DEFF Research Database (Denmark)

Frison, Gianluca; Kwame Minde Kufoalor, D.; Imsland, Lars

2014-01-01

This paper proposes a novel approach for the efficient implementation of solvers for linear MPC on embedded devices. The main focus is to explain in detail the approach used to optimize the linear algebra for selected low-power embedded devices, and to show how the high-performance implementation...
Designing a High Performance Parallel Personal Cluster

OpenAIRE

Kapanova, K. G.; Sellier, J. M.

2016-01-01

Today, many scientific and engineering areas require high performance computing to perform computationally intensive experiments. For example, many advances in transport phenomena, thermodynamics, material properties, computational chemistry and physics are possible only because of the availability of such large scale computing infrastructures. Yet many challenges are still open. The cost of energy consumption, cooling, competition for resources have been some of the reasons why the scientifi...
Static Schedulers for Embedded Real-Time Systems

Science.gov (United States)

1989-12-01

Because of the need for having efficient scheduling algorithms in large scale real time systems, software engineers put a lot of effort on developing...provide static schedulers for the Embedded Real Time Systems with single processor using Ada programming language. The independent nonpreemptable...support the Computer Aided Rapid Prototyping for Embedded Real Time Systems so that we determine whether the system, as designed, meets the required
High-Performance Operating Systems

DEFF Research Database (Denmark)

Sharp, Robin

1999-01-01

Notes prepared for the DTU course 49421 "High Performance Operating Systems". The notes deal with quantitative and qualitative techniques for use in the design and evaluation of operating systems in computer systems for which performance is an important parameter, such as real-time applications, communication systems and multimedia systems.
Reward-based learning under hardware constraints - Using a RISC processor embedded in a neuromorphic substrate

Directory of Open Access Journals (Sweden)

Simon Friedmann

2013-09-01

In this study, we propose and analyze in simulations a new, highly flexible method of implementing synaptic plasticity in a wafer-scale, accelerated neuromorphic hardware system. The study focuses on globally modulated STDP, as a special use-case of this method. Flexibility is achieved by embedding a general-purpose processor dedicated to plasticity into the wafer. To evaluate the suitability of the proposed system, we use a reward modulated STDP rule in a spike train learning task. A single layer of neurons is trained to fire at specific points in time with only the reward as feedback. This model is simulated to measure its performance, i.e. the increase in received reward after learning. Using this performance as baseline, we then simulate the model with various constraints imposed by the proposed implementation and compare the performance. The simulated constraints include discretized synaptic weights, a restricted interface between analog synapses and embedded processor, and mismatch of analog circuits. We find that probabilistic updates can increase the performance of low-resolution weights, a simple interface between analog synapses and processor is sufficient for learning, and performance is insensitive to mismatch. Further, we consider communication latency between wafer and the conventional control computer system that is simulating the environment. This latency increases the delay, with which the reward is sent to the embedded processor. Because of the time continuous operation of the analog synapses, delay can cause a deviation of the updates as compared to the not delayed situation. We find that for highly accelerated systems latency has to be kept to a minimum. This study demonstrates the suitability of the proposed implementation to emulate the selected reward modulated STDP learning rule. It is therefore an ideal candidate for implementation in an upgraded version of the wafer-scale system developed within the BrainScaleS project.
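The record above reports that probabilistic updates can help low-resolution weights. The abstract does not say which scheme was used, so the following is only a minimal sketch of one common option, stochastic rounding: a real-valued update is rounded up or down at random so that its expected value is preserved. The function names and the 4-bit weight range (0 to 15) are assumptions made for illustration, not details taken from the study.

```python
import math
import random


def stochastic_round(value, rng):
    """Round to a neighbouring integer; round up with probability equal to the fractional part."""
    lower = math.floor(value)
    frac = value - lower
    return lower + (1 if rng.random() < frac else 0)


def update_weight(weight, delta, rng, w_min=0, w_max=15):
    """Apply a real-valued update to an integer weight and clip to [w_min, w_max].

    The 4-bit range 0..15 is a hypothetical choice for this sketch.
    """
    new_weight = weight + stochastic_round(delta, rng)
    return max(w_min, min(w_max, new_weight))


if __name__ == "__main__":
    rng = random.Random(0)
    # A small update of 0.25 would always vanish under deterministic rounding;
    # with stochastic rounding it is applied on roughly a quarter of the trials.
    samples = [stochastic_round(0.25, rng) for _ in range(10000)]
    print(sum(samples) / len(samples))
```

With deterministic rounding, any update smaller than half a weight step is lost entirely, which is why a randomized scheme of this kind can matter at low resolution.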
Chromatin accessibility prediction via convolutional long short-term memory networks with k-mer embedding.

Science.gov (United States)

Min, Xu; Zeng, Wanwen; Chen, Ning; Chen, Ting; Jiang, Rui

2017-07-15

Experimental techniques for measuring chromatin accessibility are expensive and time consuming, which calls for the development of computational approaches to predict open chromatin regions from DNA sequences. Along this direction, existing methods fall into two classes: one based on handcrafted k-mer features and the other based on convolutional neural networks. Although both categories have shown good performance in specific applications thus far, a comprehensive framework to integrate useful k-mer co-occurrence information with recent advances in deep learning is still lacking. We fill this gap by addressing the problem of chromatin accessibility prediction with a convolutional Long Short-Term Memory (LSTM) network with k-mer embedding. We first split DNA sequences into k-mers and pre-train k-mer embedding vectors based on the co-occurrence matrix of k-mers by using an unsupervised representation learning approach. We then construct a supervised deep learning architecture comprised of an embedding layer, three convolutional layers and a Bidirectional LSTM (BLSTM) layer for feature learning and classification. We demonstrate that our method gains high-quality fixed-length features from variable-length sequences and consistently outperforms baseline methods. We show that k-mer embedding can effectively enhance model performance by exploring different embedding strategies. We also prove the efficacy of both the convolution and the BLSTM layers by comparing two variations of the network architecture. We confirm the robustness of our model to hyper-parameters by performing sensitivity analysis. We hope our method can eventually reinforce our understanding of employing deep learning in genomic studies and shed light on research regarding mechanisms of chromatin accessibility. The source code can be downloaded from https://github.com/minxueric/ismb2017_lstm . Contact: tingchen@tsinghua.edu.cn or ruijiang@tsinghua.edu.cn. Supplementary materials are available at
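The first preprocessing step described in the record above (splitting DNA into k-mers and counting their co-occurrences) can be sketched as follows. This is an illustrative sketch only, not the authors' code: the function names, the stride, and the symmetric context window are assumptions, and the published repository may make different choices.

```python
from collections import Counter


def split_kmers(seq, k, stride=1):
    """Return the overlapping k-mers of a sequence, stepping by `stride`."""
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def cooccurrence(kmers, window):
    """Count ordered (center, context) pairs within `window` positions of each k-mer."""
    counts = Counter()
    for i, center in enumerate(kmers):
        start = max(0, i - window)
        stop = min(len(kmers), i + window + 1)
        for j in range(start, stop):
            if j != i:
                counts[(center, kmers[j])] += 1
    return counts


if __name__ == "__main__":
    kmers = split_kmers("ACGTAC", k=3)
    print(kmers)
    print(cooccurrence(kmers, window=1))
```

Counts of this kind are the usual input to unsupervised embedding methods that factorize a co-occurrence matrix; which method the authors used is not stated in the abstract.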
High Performance Computing Facility Operational Assessment, FY 2011 Oak Ridge Leadership Computing Facility

Energy Technology Data Exchange (ETDEWEB)

Baker, Ann E [ORNL; Bland, Arthur S Buddy [ORNL; Hack, James J [ORNL; Barker, Ashley D [ORNL; Boudwin, Kathlyn J. [ORNL; Kendall, Ricky A [ORNL; Messer, Bronson [ORNL; Rogers, James H [ORNL; Shipman, Galen M [ORNL; Wells, Jack C [ORNL; White, Julia C [ORNL

2011-08-01

Oak Ridge National Laboratory's Leadership Computing Facility (OLCF) continues to deliver the most powerful resources in the U.S. for open science. At 2.33 petaflops peak performance, the Cray XT Jaguar delivered more than 1.5 billion core hours in calendar year (CY) 2010 to researchers around the world for computational simulations relevant to national and energy security; advancing the frontiers of knowledge in physical sciences and areas of biological, medical, environmental, and computer sciences; and providing world-class research facilities for the nation's science enterprise. Scientific achievements by OLCF users range from collaboration with university experimentalists to produce a working supercapacitor that uses atom-thick sheets of carbon materials to finely determining the resolution requirements for simulations of coal gasifiers and their components, thus laying the foundation for development of commercial-scale gasifiers. OLCF users are pushing the boundaries with software applications sustaining more than one petaflop of performance in the quest to illuminate the fundamental nature of electronic devices. Other teams of researchers are working to resolve predictive capabilities of climate models, to refine and validate genome sequencing, and to explore the most fundamental materials in nature - quarks and gluons - and their unique properties. Details of these scientific endeavors - not possible without access to leadership-class computing resources - are detailed in Section 4 of this report and in the INCITE in Review. Effective operations of the OLCF play a key role in the scientific missions and accomplishments of its users. This Operational Assessment Report (OAR) will delineate the policies, procedures, and innovations implemented by the OLCF to continue delivering a petaflop-scale resource for cutting-edge research. The 2010 operational assessment of the OLCF yielded recommendations that have been addressed (Reference Section 1) and
SCinet Architecture: Featured at the International Conference for High Performance Computing, Networking, Storage and Analysis 2016

Energy Technology Data Exchange (ETDEWEB)

Lyonnais, Marc; Smith, Matt; Mace, Kate P.

2017-02-06

SCinet is the purpose-built network that operates during the International Conference for High Performance Computing, Networking, Storage and Analysis (Super Computing or SC). Created each year for the conference, SCinet brings to life a high-capacity network that supports applications and experiments that are a hallmark of the SC conference. The network links the convention center to research and commercial networks around the world. This resource serves as a platform for exhibitors to demonstrate the advanced computing resources of their home institutions and elsewhere by supporting a wide variety of applications. Volunteers from academia, government and industry work together to design and deliver the SCinet infrastructure. Industry vendors and carriers donate millions of dollars in equipment and services needed to build and support the local and wide area networks. Planning begins more than a year in advance of each SC conference and culminates in a high intensity installation in the days leading up to the conference. The SCinet architecture for SC16 illustrates a dramatic increase in participation from the vendor community, particularly those that focus on network equipment. Software-Defined Networking (SDN) and Data Center Networking (DCN) are present in nearly all aspects of the design.
Embedding of solid high-level wastes into metal and non-metal matrices

International Nuclear Information System (INIS)

Geel, J. van; Eschrich, H.; Dobbels, F.; Favre, P.; Sterner, H.

1980-03-01

The primary objective of embedding solidified high-level waste forms of high specific activity into a matrix material is to obtain final waste composites with moderate inner temperatures, even at large waste loadings per meter of cylinder length. Secondary objectives are to produce a non-porous, crack-free composite product with a durability superior to that of the embedded waste form itself. The temperature distribution in a composite material composed of vitreous beads embedded in a metal matrix (vitromets) is compared with that in a vitreous block of equal heat generation per meter of height, during short- and long-term storage. It was found that for storage under water, inner temperatures below 100 °C are assured in vitromets produced from short-cooled high-level wastes and containing high waste loadings per meter of canister height. The chemical and mechanical stability, as well as the thermal conductivity, have been examined for vitromets containing various matrix materials, with emphasis on lead and aluminum alloys. The corrosion of lead and aluminum alloys in distilled water, brine solution and dry salt has been examined at temperatures up to 230 °C and pressures up to 3.5 MPa. Some lead alloys were found to exhibit better corrosion resistance in these chemical environments than certain reference borosilicate glasses. The deformation behavior of vitromets under axial compression has been investigated at different temperatures and varying height-to-diameter ratios. The maturity of vitromet production is finally demonstrated by presenting process data from hot-laboratory scale and cold semi-industrial scale production units. (author)
Correlation coefficient based supervised locally linear embedding for pulmonary nodule recognition.

Science.gov (United States)

Wu, Panpan; Xia, Kewen; Yu, Hengyong

2016-11-01

Dimensionality reduction techniques are developed to suppress the negative effects of the high dimensional feature space of lung CT images on classification performance in computer aided detection (CAD) systems for pulmonary nodule detection. An improved supervised locally linear embedding (SLLE) algorithm is proposed based on the concept of correlation coefficient. Spearman's rank correlation coefficient is introduced to adjust the distance metric in the SLLE algorithm to ensure that more suitable neighborhood points can be identified, and thus to enhance the discriminating power of the embedded data. The proposed Spearman's rank correlation coefficient based SLLE (SC(2)SLLE) is implemented and validated in our pilot CAD system using a clinical dataset collected from the publicly available lung image database consortium and image database resource initiative (LIDC-IDRI). In particular, a representative CAD system for solitary pulmonary nodule detection is designed and implemented. After a sequence of medical image processing steps, 64 nodules and 140 non-nodules are extracted, and 34 representative features are calculated. The SC(2)SLLE, as well as the SLLE and LLE algorithms, are applied to reduce the dimensionality. Several quantitative measurements are also used to evaluate and compare the performances. Using a 5-fold cross-validation methodology, the proposed algorithm achieves 87.65% accuracy, 79.23% sensitivity, 91.43% specificity, and 8.57% false positive rate, on average. Experimental results indicate that the proposed algorithm outperforms the original locally linear embedding and SLLE coupled with the support vector machine (SVM) classifier. Based on the preliminary results from a limited number of nodules in our dataset, this study demonstrates the great potential to improve the performance of a CAD system for nodule detection using the proposed SC(2)SLLE. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
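The four figures quoted in the record above are standard confusion-matrix quantities, and the reported specificity and false positive rate are complementary (91.43% + 8.57% = 100%). The sketch below shows how such figures are computed. The class totals (64 nodules, 140 non-nodules) come from the abstract, but the split into true and false predictions is hypothetical and chosen only for illustration; the study reports averages over five folds, not a single confusion matrix.

```python
def confusion_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity, specificity and false positive rate from counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),          # true positive rate
        "specificity": tn / (tn + fp),          # true negative rate
        "false_positive_rate": fp / (fp + tn),  # equals 1 - specificity
    }


if __name__ == "__main__":
    # Hypothetical counts: 64 nodules (tp + fn) and 140 non-nodules (tn + fp).
    metrics = confusion_metrics(tp=50, fn=14, tn=128, fp=12)
    for name, value in metrics.items():
        print(f"{name}: {value:.4f}")
```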
Highly Adaptive Primary Mirror Having Embedded Actuators, Sensors, and Neural Control, Phase II

Data.gov (United States)

National Aeronautics and Space Administration — Xinetics has demonstrated the technology required to fabricate a self-compensating highly adaptive silicon carbide primary mirror system having embedded actuators,...
