WorldWideScience

Sample records for perform parallel-processing computing

  1. High Performance Parallel Processing Project: Industrial computing initiative. Progress reports for fiscal year 1995

    Energy Technology Data Exchange (ETDEWEB)

    Koniges, A.

    1996-02-09

    This project is a package of 11 individual CRADAs plus hardware. This innovative project established a three-year multi-party collaboration that is significantly accelerating the availability of commercial massively parallel processing computing software technology to U.S. government, academic, and industrial end-users. This report contains individual presentations from nine principal investigators along with overall program information.

  2. Parameters that affect parallel processing for computational electromagnetic simulation codes on high performance computing clusters

    Science.gov (United States)

    Moon, Hongsik

    What is the impact of multicore and associated advanced technologies on computational software for science? Most researchers and students have multicore laptops or desktops for their research, and they need computing power to run computational software packages. Computing power was initially derived from Central Processing Unit (CPU) clock speed. That changed when increases in clock speed became constrained by power requirements. Chip manufacturers turned to multicore CPU architectures and associated technological advancements to create the CPUs for the future. Most software applications benefited from the increased computing power the same way that increases in clock speed helped applications run faster. However, for Computational ElectroMagnetics (CEM) software developers, this change was not an obvious benefit - it appeared to be a detriment. Developers were challenged to find a way to correctly utilize the advancements in hardware so that their codes could benefit. The solution was parallelization, and this dissertation details the investigation to address these challenges. Prior to multicore CPUs, advanced computer technologies were compared using benchmark software, with the metric being FLoating-point Operations Per Second (FLOPS), which indicates system performance for scientific applications that make heavy use of floating-point calculations. Is FLOPS an effective metric for parallelized CEM simulation tools on new multicore systems? Parallel CEM software needs to be benchmarked not only by FLOPS but also by the performance of other parameters related to the type and utilization of the hardware, such as CPU, Random Access Memory (RAM), hard disk, network, etc. The codes need to be optimized for more than just FLOPS, and new parameters must be included in benchmarking. In this dissertation, the parallel CEM software named High Order Basis Based Integral Equation Solver (HOBBIES) is introduced. This code was developed to address the needs of the
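    As a concrete illustration of the FLOPS metric questioned above, the following minimal Python sketch (not from the dissertation; the matrix size n and the conventional 2*n^3 operation count for a dense matrix multiply are illustrative assumptions) times one multiply and converts the elapsed time into floating-point operations per second.

```python
# Rough FLOPS-style measurement: time one dense matrix multiply (GEMM) and
# divide the ~2*n^3 floating-point operations it performs by the elapsed time.
import time
import numpy as np

n = 1024
rng = np.random.default_rng(0)
a, b = rng.random((n, n)), rng.random((n, n))

t0 = time.perf_counter()
c = a @ b                      # the measured kernel
elapsed = time.perf_counter() - t0

flops = 2 * n ** 3 / elapsed
print(f"{flops / 1e9:.1f} GFLOPS ({elapsed * 1e3:.1f} ms)")
```

    As the abstract argues, a single number of this kind says little about the memory, disk, or network behavior of a parallel CEM code.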

  3. Preliminary Study on the Enhancement of Reconstruction Speed for Emission Computed Tomography Using Parallel Processing

    International Nuclear Information System (INIS)

    Park, Min Jae; Lee, Jae Sung; Kim, Soo Mee; Kang, Ji Yeon; Lee, Dong Soo; Park, Kwang Suk

    2009-01-01

    Conventional image reconstruction uses simplified physical models of projection. However, reconstruction with realistic physics, for example full 3D reconstruction, takes too long to process all the data in the clinic and cannot run on a common reconstruction machine because of the large memory required by complex physical models. We suggest a distributed-memory model of fast reconstruction using parallel processing on personal computers to enable such large-scale reconstructions. Preliminary feasibility tests on virtual machines and various performance tests on the commercial supercomputer Tachyon were performed. The expectation maximization algorithm with a common 2D projector and a realistic 3D line-of-response model was tested. Since the processing time slowed down (by up to 6 times) after a certain number of iterations, compiler optimization was performed to maximize the efficiency of parallelization. Parallel processing of a program on multiple computers was available on Linux with MPICH and NFS. We verified that differences between the parallel-processed image and the single-processed image at the same iterations were below the precision of floating-point numbers, about 6 bits. Two processors showed good parallel computing efficiency (a 1.96-times speedup). The delay phenomenon was solved by a vectorization method using SSE. Through the study, a realistic parallel computing system for the clinic was established, able to reconstruct with ample memory using realistic physical models that could not be simplified
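    The core of such a scheme can be sketched in a few lines: each worker back-projects its own chunk of the projection data, and the partial results are summed before the multiplicative EM update. The sketch below (plain Python with a random toy system matrix, not the authors' MPICH/NFS implementation) also reproduces the kind of parallel-versus-serial comparison mentioned in the abstract.

```python
# Minimal ML-EM update parallelized over chunks of projection bins.
import numpy as np
from concurrent.futures import ProcessPoolExecutor

rng = np.random.default_rng(0)
A = rng.random((240, 64))            # toy system matrix (bins x voxels)
y = A @ rng.random(64)               # noiseless projections for the demo

def partial_backprojection(args):
    """Back-project the ratio y/(A x) for one chunk of projection bins."""
    A_chunk, y_chunk, x = args
    ratio = y_chunk / np.maximum(A_chunk @ x, 1e-12)
    return A_chunk.T @ ratio

def mlem_update(x, n_workers=4):
    idx_chunks = np.array_split(np.arange(len(y)), n_workers)
    jobs = [(A[idx], y[idx], x) for idx in idx_chunks]
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_backprojection, jobs))
    return x * sum(partials) / A.sum(axis=0)      # sensitivity = column sums

if __name__ == "__main__":
    x_par = np.ones(64)
    x_ser = np.ones(64)
    for _ in range(10):
        x_par = mlem_update(x_par)
        ratio = y / np.maximum(A @ x_ser, 1e-12)  # serial reference update
        x_ser *= (A.T @ ratio) / A.sum(axis=0)
    print("max |parallel - serial| =", np.abs(x_par - x_ser).max())
```

    The printed difference is at the level of floating-point rounding, mirroring the agreement between parallel and single-processor images reported above.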

  4. Parallel processing using an optical delay-based reservoir computer

    Science.gov (United States)

    Van der Sande, Guy; Nguimdo, Romain Modeste; Verschaffelt, Guy

    2016-04-01

    Delay systems subject to delayed optical feedback have recently shown great potential in solving computationally hard tasks. By implementing a neuro-inspired computational scheme relying on the transient response to optical data injection, high processing speeds have been demonstrated. However, reservoir computing systems based on delay dynamics discussed in the literature are designed by coupling many different stand-alone components, which leads to bulky, non-monolithic systems that lack long-term stability. Here we numerically investigate the possibility of implementing reservoir computing schemes based on semiconductor ring lasers (SRLs). Semiconductor ring lasers are semiconductor lasers where the laser cavity consists of a ring-shaped waveguide. SRLs are highly integrable and scalable, making them ideal candidates for key components in photonic integrated circuits. SRLs can generate light in two counterpropagating directions between which bistability has been demonstrated. We demonstrate that two independent machine learning tasks, even with inputs of a different nature and different input data signals, can be simultaneously computed using a single photonic nonlinear node, relying on the parallelism offered by photonics. We illustrate the performance on simultaneous chaotic time series prediction and nonlinear channel equalization classification. We take advantage of the different directional modes to process individual tasks. Each directional mode processes one individual task to mitigate possible crosstalk between the tasks. Our results indicate that prediction/classification with errors comparable to state-of-the-art performance can be obtained even with noise, despite the two tasks being computed simultaneously. We also find that good performance is obtained for both tasks over a broad range of parameters. The results are discussed in detail in [Nguimdo et al., IEEE Trans. Neural Netw. Learn. Syst. 26, pp. 3301-3307, 2015].

  5. Application of the parallel processing computer to a nuclear disaster prevention support system

    Energy Technology Data Exchange (ETDEWEB)

    Shigehiro, Nukatsuka; Osami, Watanabe [Mitsubishi Heavy Industries, LTD (Japan)]

    2003-07-01

    At the time of a nuclear emergency, it is important to identify the type and the cause of the accident. Besides these, it is also important to provide adequate information to the emergency response organization to support decision making, by predicting and evaluating the development of the event and the influence of the release of radioactivity on the environment. Recently, a new type of nuclear disaster prevention support system called MEASURES (Multiple Radiological Emergency Assistance System for Urgent Response) was developed, which provides not only the current state of the nuclear power plant and the influence of the radioactivity on the environment, but also a future prediction of the accident's development. In order to provide accurate results of these analyses quickly, MEASURES utilizes various techniques, such as a multiple nesting method, which narrows down the calculation area gradually, and a parallel processing computer for three-dimensional analyses, such as the air current distribution analysis. In this paper, the outline and features of MEASURES are presented, with particular focus on the use of the parallel processing computer for the three-dimensional air current distribution analysis. (authors)

  6. Application of the parallel processing computer to a nuclear disaster prevention support system

    International Nuclear Information System (INIS)

    Shigehiro, Nukatsuka; Osami, Watanabe

    2003-01-01

    At the time of a nuclear emergency, it is important to identify the type and the cause of the accident. Besides these, it is also important to provide adequate information to the emergency response organization to support decision making, by predicting and evaluating the development of the event and the influence of the release of radioactivity on the environment. Recently, a new type of nuclear disaster prevention support system called MEASURES (Multiple Radiological Emergency Assistance System for Urgent Response) was developed, which provides not only the current state of the nuclear power plant and the influence of the radioactivity on the environment, but also a future prediction of the accident's development. In order to provide accurate results of these analyses quickly, MEASURES utilizes various techniques, such as a multiple nesting method, which narrows down the calculation area gradually, and a parallel processing computer for three-dimensional analyses, such as the air current distribution analysis. In this paper, the outline and features of MEASURES are presented, with particular focus on the use of the parallel processing computer for the three-dimensional air current distribution analysis. (authors)

  7. Parallel processing algorithms for hydrocodes on a computer with MIMD architecture (DENELCOR's HEP)

    International Nuclear Information System (INIS)

    Hicks, D.L.

    1983-11-01

    In real-time simulation/prediction of complex systems such as water-cooled nuclear reactors, if reactor operators had fast simulator/predictors to check the consequences of their operations before implementing them, events such as the incident at Three Mile Island might be avoided. However, existing simulator/predictors such as RELAP run slower than real time on serial computers. It appears that the only way to overcome the barrier to higher computing rates is to use computers with architectures that allow concurrent computations or parallel processing. The computer architecture with the greatest degree of parallelism is labeled Multiple Instruction Stream, Multiple Data Stream (MIMD). An example of a machine of this type is the HEP computer by DENELCOR. It appears that hydrocodes are very well suited for parallelization on the HEP. It is a straightforward exercise to parallelize explicit, one-dimensional Lagrangian hydrocodes in a zone-by-zone parallelization. Similarly, implicit schemes can be parallelized in a zone-by-zone fashion via an a priori, symbolic inversion of the tridiagonal matrix that arises in an implicit scheme. These techniques are extended to Eulerian hydrocodes by using Harlow's rezone technique. The extension from single-phase Eulerian to two-phase Eulerian is straightforward. This step-by-step extension leads to hydrocodes with zone-by-zone parallelization that are capable of two-phase flow simulation. Extensions to two and three spatial dimensions can be achieved by operator splitting. It appears that a zone-by-zone parallelization is the best way to utilize the capabilities of an MIMD machine. 40 references

  8. Distributed and cloud computing from parallel processing to the Internet of Things

    CERN Document Server

    Hwang, Kai; Fox, Geoffrey C

    2012-01-01

    Distributed and Cloud Computing, named a 2012 Outstanding Academic Title by the American Library Association's Choice publication, explains how to create high-performance, scalable, reliable systems, exposing the design principles, architecture, and innovative applications of parallel, distributed, and cloud computing systems. Starting with an overview of modern distributed models, the book provides comprehensive coverage of distributed and cloud computing, including: Facilitating management, debugging, migration, and disaster recovery through virtualization Clustered systems for resear

  9. Parallel Processing Performance Evaluation of Mixed T10/T100 Ethernet Topologies on Linux Pentium Systems

    National Research Council Canada - National Science Library

    Decato, Steven

    1997-01-01

    ... performed on relatively inexpensive off-the-shelf components. Alternative network topologies were implemented using 10 and 100 megabit-per-second Ethernet cards under the Linux operating system on Pentium-based personal computer platforms...

  10. Parallel processing for fluid dynamics applications

    International Nuclear Information System (INIS)

    Johnson, G.M.

    1989-01-01

    The impact of parallel processing on computational science and, in particular, on computational fluid dynamics is growing rapidly. In this paper, particular emphasis is given to developments which have occurred within the past two years. Parallel processing is defined and the reasons for its importance in high-performance computing are reviewed. Parallel computer architectures are classified according to the number and power of their processing units, their memory, and the nature of their connection scheme. Architectures which show promise for fluid dynamics applications are emphasized. Fluid dynamics problems are examined for parallelism inherent at the physical level. CFD algorithms and their mappings onto parallel architectures are discussed. Several examples are presented to document the performance of fluid dynamics applications on present-generation parallel processing devices

  11. A learnable parallel processing architecture towards unity of memory and computing.

    Science.gov (United States)

    Li, H; Gao, B; Chen, Z; Zhao, Y; Huang, P; Ye, H; Liu, L; Liu, X; Kang, J

    2015-08-14

    Developing energy-efficient parallel information processing systems beyond the von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need for efficient information processing in data-driven applications such as big data and the Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named "iMemComp", where memory and logic are unified with single-type devices. Leveraging the nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped "iMemComp" with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such an architecture eliminates the energy-hungry data movement of von Neumann computers. Compared with contemporary silicon technology, adder circuits based on "iMemComp" can improve speed by 76.8% and power dissipation by 60.3%, together with a 700-fold reduction in circuit area.

  12. A learnable parallel processing architecture towards unity of memory and computing

    Science.gov (United States)

    Li, H.; Gao, B.; Chen, Z.; Zhao, Y.; Huang, P.; Ye, H.; Liu, L.; Liu, X.; Kang, J.

    2015-08-01

    Developing energy-efficient parallel information processing systems beyond the von Neumann architecture is a long-standing goal of modern information technologies. The widely used von Neumann computer architecture separates memory and computing units, which leads to energy-hungry data movement when computers work. In order to meet the need for efficient information processing in data-driven applications such as big data and the Internet of Things, an energy-efficient processing architecture beyond von Neumann is critical for the information society. Here we show a non-von Neumann architecture built of resistive switching (RS) devices named “iMemComp”, where memory and logic are unified with single-type devices. Leveraging the nonvolatile nature and structural parallelism of crossbar RS arrays, we have equipped “iMemComp” with capabilities of computing in parallel and learning user-defined logic functions for large-scale information processing tasks. Such an architecture eliminates the energy-hungry data movement of von Neumann computers. Compared with contemporary silicon technology, adder circuits based on “iMemComp” can improve speed by 76.8% and power dissipation by 60.3%, together with a 700-fold reduction in circuit area.

  13. Performance of MPI parallel processing implemented by MCNP5/ MCNPX for criticality benchmark problems

    International Nuclear Information System (INIS)

    Mark Dennis Usang; Mohd Hairie Rabir; Mohd Amin Sharifuldin Salleh; Mohamad Puad Abu

    2012-01-01

    MPI parallelism is implemented on a SUN workstation for running MCNPX and on the High Performance Computing Facility (HPC) for running MCNP5. 23 inputs obtained from the MCNP Criticality Validation Suite are utilized for the purpose of evaluating the amount of speed-up achievable by using the parallel capabilities of MPI. More importantly, we study the economics of using more processors and the type of problem where the performance gains are obvious. This is important to enable better practices of resource sharing, especially of the HPC facility's processing time. Future endeavours in this direction might even reveal clues for best MCNP5/MCNPX coding practices for optimum performance of MPI parallelism. (author)
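    The "economics of using more processors" mentioned above is often summarized with Amdahl's law. The toy sketch below (illustrative only, with an assumed 5% serial fraction, not a figure from the paper) shows how parallel efficiency falls off as processors are added.

```python
# Amdahl's law: upper bound on speed-up for a job with a fixed serial fraction.
def speedup(serial_fraction, processors):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)

f = 0.05                                     # assumed 5% inherently serial work
for p in (1, 2, 4, 8, 16, 32, 64, 128):
    s = speedup(f, p)
    print(f"p = {p:3d}   speedup = {s:6.2f}   efficiency = {s / p:5.2f}")
```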

  14. Eighth SIAM conference on parallel processing for scientific computing: Final program and abstracts

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-12-31

    This SIAM conference is the premier forum for developments in parallel numerical algorithms, a field that has seen very lively and fruitful developments over the past decade, and whose health is still robust. Themes for this conference were: combinatorial optimization; data-parallel languages; large-scale parallel applications; message-passing; molecular modeling; parallel I/O; parallel libraries; parallel software tools; parallel compilers; particle simulations; problem-solving environments; and sparse matrix computations.

  15. Biological neural networks as model systems for designing future parallel processing computers

    Science.gov (United States)

    Ross, Muriel D.

    1991-01-01

    One of the more interesting debates of the present day centers on whether human intelligence can be simulated by computer. The author works under the premise that neurons individually are not smart at all. Rather, they are physical units which are impinged upon continuously by other matter that influences the direction of voltage shifts across the units' membranes. It is only through the action of a great many neurons, billions in the case of the human nervous system, that intelligent behavior emerges. What is required to understand even the simplest neural system is painstaking analysis, bit by bit, of the architecture and the physiological functioning of its various parts. The biological neural networks studied, the vestibular utricular and saccular maculas of the inner ear, are among the simplest of the mammalian neural networks to understand and model. While there is still a long way to go to understand even this simplest neural network in sufficient detail for extrapolation to computers and robots, a start was made. Moreover, the insights obtained and the technologies developed help advance the understanding of the more complex neural networks that underlie human intelligence.

  16. Parallel Processing and Bio-inspired Computing for Biomedical Image Registration

    Directory of Open Access Journals (Sweden)

    Silviu Ioan Bejinariu

    2014-07-01

    Image Registration (IR) is an optimization problem computing optimal parameters of a geometric transform used to overlay one or more source images onto a given model by maximizing a similarity measure. In this paper the use of bio-inspired optimization algorithms in image registration is analyzed. Results obtained by means of three different algorithms are compared: the Bacterial Foraging Optimization Algorithm (BFOA), the Genetic Algorithm (GA) and the Clonal Selection Algorithm (CSA). Depending on the image type, the registration may be area-based, which is slow but more precise, or feature-based, which is faster. In this paper a feature-based approach built on the Scale Invariant Feature Transform (SIFT) is proposed. Finally, results obtained using sequential and parallel implementations on multi-core systems for area-based and feature-based image registration are compared.
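    The optimization at the heart of feature-based registration is the estimation of a geometric transform from matched keypoints. The sketch below (synthetic point pairs and a plain least-squares affine fit; a stand-in far simpler than the SIFT plus BFOA/GA/CSA pipeline of the paper) shows that core step.

```python
# Estimate an affine transform (A, t) from matched keypoint pairs: dst ~ src A^T + t.
import numpy as np

rng = np.random.default_rng(0)
src = rng.uniform(0, 100, (30, 2))                 # keypoints in the source image
true_A = np.array([[0.9, -0.2], [0.2, 0.9]])
true_t = np.array([5.0, -3.0])
dst = src @ true_A.T + true_t + rng.normal(0, 0.1, src.shape)  # noisy matches

X = np.hstack([src, np.ones((len(src), 1))])       # homogeneous coordinates
params, *_ = np.linalg.lstsq(X, dst, rcond=None)   # least-squares fit of [A | t]
A_est, t_est = params[:2].T, params[2]

print("estimated A:\n", np.round(A_est, 3))
print("estimated t:", np.round(t_est, 3))
```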

  17. Parallel processing of structural integrity analysis codes

    International Nuclear Information System (INIS)

    Swami Prasad, P.; Dutta, B.K.; Kushwaha, H.S.

    1996-01-01

    Structural integrity analysis plays an important role in assessing and demonstrating the safety of nuclear reactor components. This analysis is performed using analytical tools such as the Finite Element Method (FEM) with the help of digital computers. The complexity of the problems involved in nuclear engineering demands high-speed computation facilities to obtain solutions in a reasonable amount of time. Parallel processing systems such as ANUPAM provide an efficient platform for realising high-speed computation. The development and implementation of software on parallel processing systems is an interesting and challenging task. The data and algorithm structure of the codes play an important role in exploiting the parallel processing system capabilities. Structural analysis codes based on FEM can be divided into two categories with respect to their implementation on parallel processing systems. Codes in the first category, such as those used for harmonic analysis and mechanistic fuel performance codes, do not require parallelisation of individual modules. The second category of codes, such as conventional FEM codes, requires parallelisation of individual modules. In this category, parallelisation of the equation solution module poses major difficulties. Different solution schemes such as the domain decomposition method (DDM), the parallel active column solver and the substructuring method are currently used on parallel processing systems. Two codes, FAIR and TABS, belonging to these two categories respectively, have been implemented on ANUPAM. The implementation details of these codes and the performance of different equation solvers are highlighted. (author). 5 refs., 12 figs., 1 tab

  18. The parallel processing of EGS4 code on distributed memory scalar parallel computer:Intel Paragon XP/S15-256

    Energy Technology Data Exchange (ETDEWEB)

    Takemiya, Hiroshi; Ohta, Hirofumi; Honma, Ichirou

    1996-03-01

    The parallelization of the electromagnetic cascade Monte Carlo simulation code EGS4 on the distributed-memory scalar parallel computer Intel Paragon XP/S15-256 is described. EGS4 has the feature that the calculation time differs greatly from one incident particle to another because of the dynamic generation of secondary particles and the different behavior of each particle. Granularity for parallel processing, the parallel programming model and the algorithm for parallel random number generation are discussed, and two kinds of method, which allocate particles dynamically or statically, are used for the purpose of realizing high-speed parallel processing of this code. Among four problems chosen for performance evaluation, the speedup factors for three problems reached nearly 100 times with 128 processors. It has been found that when both the calculation time for each incident particle and its dispersion are large, it is preferable to use the dynamic particle allocation method, which can average the load over the processors. It has also been found that when they are small, it is preferable to use the static particle allocation method, which reduces the communication overhead. Moreover, it is pointed out that to get accurate results, it is necessary to use double precision variables in the EGS4 code. Finally, the workflow of program parallelization is analyzed, and tools for program parallelization are discussed through the experience of the EGS4 parallelization. (author).
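    The trade-off between static and dynamic particle allocation described above can be illustrated with a toy simulation (heavy-tailed per-history costs and an idealized work queue with no communication cost; none of this is the EGS4 code itself):

```python
# Compare static (round-robin) and dynamic assignment of particle histories
# whose per-history costs vary widely, as in electromagnetic cascades.
import heapq
import random

random.seed(1)
costs = [random.paretovariate(1.5) for _ in range(10_000)]  # per-history cost
P = 128                                                     # processors

# Static: histories dealt out round-robin before the run starts.
static_makespan = max(sum(costs[i::P]) for i in range(P))

# Dynamic: each history goes to the currently least-loaded processor
# (an idealized shared work queue, ignoring communication overhead).
heap = [(0.0, i) for i in range(P)]
heapq.heapify(heap)
for c in costs:
    load, i = heapq.heappop(heap)
    heapq.heappush(heap, (load + c, i))
dynamic_makespan = max(load for load, _ in heap)

print(f"static makespan  : {static_makespan:8.1f}")
print(f"dynamic makespan : {dynamic_makespan:8.1f}")
```

    With strongly dispersed costs the dynamic scheme finishes much sooner, matching the conclusion above; when costs are nearly uniform, the two schemes converge and the static scheme's lower communication overhead wins.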

  19. Parallel processing for artificial intelligence 1

    CERN Document Server

    Kanal, LN; Kumar, V; Suttner, CB

    1994-01-01

    Parallel processing for AI problems is of great current interest because of its potential for alleviating the computational demands of AI procedures. The articles in this book consider parallel processing for problems in several areas of artificial intelligence: image processing, knowledge representation in semantic networks, production rules, mechanization of logic, constraint satisfaction, parsing of natural language, data filtering and data mining. The publication is divided into six sections. The first addresses parallel computing for processing and understanding images. The second discus

  20. Advanced parallel processing with supercomputer architectures

    International Nuclear Information System (INIS)

    Hwang, K.

    1987-01-01

    This paper investigates advanced parallel processing techniques and innovative hardware/software architectures that can be applied to boost the performance of supercomputers. Critical issues on architectural choices, parallel languages, compiling techniques, resource management, concurrency control, programming environment, parallel algorithms, and performance enhancement methods are examined and the best answers are presented. The authors cover advanced processing techniques suitable for supercomputers, high-end mainframes, minisupers, and array processors. The coverage emphasizes vectorization, multitasking, multiprocessing, and distributed computing. In order to achieve these operation modes, parallel languages, smart compilers, synchronization mechanisms, load balancing methods, mapping parallel algorithms, operating system functions, application library, and multidiscipline interactions are investigated to ensure high performance. At the end, they assess the potentials of optical and neural technologies for developing future supercomputers

  1. Linear parallel processing machines I

    Energy Technology Data Exchange (ETDEWEB)

    Von Kunze, M

    1984-01-01

    As is well known, non-context-free grammars for generating formal languages happen to be of a certain intrinsic computational power that presents serious difficulties to efficient parsing algorithms as well as to the development of an algebraic theory of context-sensitive languages. In this paper a framework is given for the investigation of the computational power of formal grammars, in order to start a thorough analysis of grammars consisting of derivation rules of the form aB → A_1 ... A_n b_1 ... b_m. These grammars may be thought of as automata by means of parallel processing, if one considers the variables as operators acting on the terminals while reading them right-to-left. This kind of automata and their 2-dimensional programming language prove to be useful by allowing a concise linear-time algorithm for integer multiplication. Linear parallel processing machines (LP-machines), which are, in their general form, equivalent to Turing machines, include finite automata and pushdown automata (with states encoded) as special cases. Bounded LP-machines yield deterministic accepting automata for nondeterministic context-free languages, and they define an interesting class of context-sensitive languages. A characterization of this class in terms of generating grammars is established by using derivation trees with crossings as a helpful tool. From the algebraic point of view, deterministic LP-machines are effectively represented semigroups with distinguished subsets. Concerning the dualism between generating and accepting devices of formal languages within the algebraic setting, the concept of accepting automata turns out to reduce essentially to embeddability in an effectively represented extension monoid, even in the classical cases.

  2. A multitransputer parallel processing system (MTPPS)

    International Nuclear Information System (INIS)

    Jethra, A.K.; Pande, S.S.; Borkar, S.P.; Khare, A.N.; Ghodgaonkar, M.D.; Bairi, B.R.

    1993-01-01

    This report describes the design and implementation of a 16-node Multi Transputer Parallel Processing System (MTPPS), which is a platform for parallel program development. It is a MIMD machine based on the message-passing paradigm. The basic compute engine is an Inmos transputer IMS T800-20. A transputer with local memory constitutes the processing element (NODE) of this MIMD architecture. Multiple NODEs can be connected to each other in an identifiable network topology through the high-speed serial links of the transputer. A Network Configuration Unit (NCU) incorporates the necessary hardware to provide software-controlled network configuration. The system is modularly expandable, and more NODEs can be added to achieve the required processing power. The system is a backend to an IBM-PC, which has been integrated into the system to provide the user I/O interface; PC resources are available to the programmer. The interface hardware between the PC and the network of transputers is INMOS compatible. Therefore, all commercially available development software compatible with INMOS products can run on this system. While giving the details of design and implementation, this report briefly summarises MIMD architectures, the transputer architecture and parallel processing software development issues. A LINPACK performance evaluation of the system and solutions of neutron physics and plasma physics problems are discussed along with results. (author). 12 refs., 22 figs., 3 tabs., 3 appendixes

  3. Practical parallel processing

    International Nuclear Information System (INIS)

    Arendt, M.L.

    1986-01-01

    ELXSI, a San Jose based computer company, was founded in January of 1979 for the purpose of developing and marketing a tightly-coupled multiple processor system. After five years ELXSI succeeded in making the first commercial installations at Digicon Geophysical, NASA-Dryden, and Sandia National Laboratories. Since that time over fifty-one systems and ninety-three processors have been installed. The commercial success of the ELXSI system 6400(TM) is due to several significant breakthroughs in computer technology including a system bus operating at 320 million bytes per second, a new Message-Based Operating System, EMBOS (TM), and a new system organization which allows for easy expansion in any dimension without changes to the operating system, the user environment, or the application programs. (Auth.)

  4. Parallel processing of Monte Carlo code MCNP for particle transport problem

    Energy Technology Data Exchange (ETDEWEB)

    Higuchi, Kenji; Kawasaki, Takuji

    1996-06-01

    It is possible to vectorize or parallelize Monte Carlo (MC) codes for photon and neutron transport problems, making use of the independence of the calculation for each particle. The applicability of existing MC codes to parallel processing is discussed. As for parallel computers, we have used both a vector-parallel processor and a scalar-parallel processor in the performance evaluation. We have carried out (i) vector-parallel processing of the MCNP code on the Monte Carlo machine Monte-4 with four vector processors, and (ii) parallel processing on the Paragon XP/S with 256 processors. In this report we describe the methodology and results of parallel processing on these two types of parallel or distributed-memory computers. In addition, we discuss the evaluation of parallel programming environments for the parallel computers used in the present work, as part of the work developing the STA (Seamless Thinking Aid) Basic Software. (author)

  5. Modern industrial simulation tools: Kernel-level integration of high performance parallel processing, object-oriented numerics, and adaptive finite element analysis. Final report, July 16, 1993--September 30, 1997

    Energy Technology Data Exchange (ETDEWEB)

    Deb, M.K.; Kennon, S.R.

    1998-04-01

    A cooperative R&D effort between industry and the US government, this project, under the HPPP (High Performance Parallel Processing) initiative of the Dept. of Energy, started the investigations into parallel object-oriented (OO) numerics. The basic goal was to research and utilize the emerging technologies to create a physics-independent computational kernel for applications using the adaptive finite element method. The industrial team included Computational Mechanics Co., Inc. (COMCO) of Austin, TX (as the primary contractor), Scientific Computing Associates, Inc. (SCA) of New Haven, CT, Texaco and CONVEX. Sandia National Laboratory (Albq., NM) was the technology partner from the government side. COMCO had the responsibility for the main kernel design and development, SCA had the lead in parallel solver technology, and guidance on OO technologies was Sandia's main expertise in this venture. CONVEX and Texaco supported the partnership with hardware resources and application knowledge, respectively. As such, a minimum of fifty-percent cost-sharing was provided by the industry partnership during this project. This report describes the R&D activities and provides some details about the prototype kernel and example applications.

  6. Parallel processing for artificial intelligence 2

    CERN Document Server

    Kumar, V; Suttner, CB

    1994-01-01

    With the increasing availability of parallel machines and the rising interest in large-scale and real-world applications, research on parallel processing for Artificial Intelligence (AI) is gaining greater importance in the computer science environment. Many applications have been implemented and delivered but the field is still considered to be in its infancy. This book assembles diverse aspects of research in the area, providing an overview of the current state of technology. It also aims to promote further growth across the discipline. Contributions have been grouped according to their

  7. Parallel processing of neutron transport in fuel assembly calculation

    International Nuclear Information System (INIS)

    Song, Jae Seung

    1992-02-01

    Group constants, which are used for reactor analyses by the nodal method, are generated by fuel assembly calculations based on neutron transport theory, since one or a quarter of a fuel assembly corresponds to a unit mesh in current nodal calculations. The group constant calculation for a fuel assembly is performed through spectrum calculations, a two-dimensional fuel assembly calculation, and depletion calculations. The purpose of this study is to develop a parallel algorithm to be used on a parallel processor for the fuel assembly calculation and the depletion calculations of the group constant generation. A serial program, which solves the neutron integral transport equation using the transmission probability method and the linear depletion equation, was prepared and verified by a benchmark calculation. Small changes to the serial program were enough to parallelize the depletion calculation, which has inherent parallel characteristics. In the fuel assembly calculation, however, efficient parallelization is not simple and easy because of the many coupling parameters in the calculation and the data communications among CPUs. In this study, the group distribution method is introduced for the parallel processing of the fuel assembly calculation to minimize the data communications. The parallel processing was performed on a Quadputer with 4 CPUs operating in the NURAD Lab. at KAIST. Efficiencies of 54.3 % and 78.0 % were obtained in the fuel assembly calculation and depletion calculation, respectively, which leads to an overall speedup of about 2.5. As a result, it is concluded that the computing time consumed for the group constant generation can be easily reduced by parallel processing on a parallel computer with small-size CPUs

  8. Real-time parallel processing of grammatical structure in the fronto-striatal system: a recurrent network simulation study using reservoir computing.

    Science.gov (United States)

    Hinaut, Xavier; Dominey, Peter Ford

    2013-01-01

    Sentence processing takes place in real-time. Previous words in the sentence can influence the processing of the current word in the timescale of hundreds of milliseconds. Recent neurophysiological studies in humans suggest that the fronto-striatal system (frontal cortex, and striatum--the major input locus of the basal ganglia) plays a crucial role in this process. The current research provides a possible explanation of how certain aspects of this real-time processing can occur, based on the dynamics of recurrent cortical networks, and plasticity in the cortico-striatal system. We simulate prefrontal area BA47 as a recurrent network that receives on-line input about word categories during sentence processing, with plastic connections between cortex and striatum. We exploit the homology between the cortico-striatal system and reservoir computing, where recurrent frontal cortical networks are the reservoir, and plastic cortico-striatal synapses are the readout. The system is trained on sentence-meaning pairs, where meaning is coded as activation in the striatum corresponding to the roles that different nouns and verbs play in the sentences. The model learns an extended set of grammatical constructions, and demonstrates the ability to generalize to novel constructions. It demonstrates how early in the sentence, a parallel set of predictions are made concerning the meaning, which are then confirmed or updated as the processing of the input sentence proceeds. It demonstrates how on-line responses to words are influenced by previous words in the sentence, and by previous sentences in the discourse, providing new insight into the neurophysiology of the P600 ERP scalp response to grammatical complexity. This demonstrates that a recurrent neural network can decode grammatical structure from sentences in real-time in order to generate a predictive representation of the meaning of the sentences. This can provide insight into the underlying mechanisms of human cortico
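    The reservoir-computing analogy above (recurrent cortical network as reservoir, plastic cortico-striatal synapses as readout) follows the standard echo state network recipe. A minimal generic sketch is given below; the reservoir size, delay task, and ridge-regression readout are illustrative stand-ins, far simpler than the sentence-meaning model of the paper.

```python
# Minimal echo state network: fixed random recurrent reservoir, trained linear readout.
import numpy as np

rng = np.random.default_rng(0)
n_res, n_in = 200, 1
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W = rng.normal(0, 1, (n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))     # spectral radius ~0.9

def run_reservoir(u):
    """Collect reservoir states for an input sequence u of shape (T, n_in)."""
    x = np.zeros(n_res)
    states = []
    for t in range(len(u)):
        x = np.tanh(W_in @ u[t] + W @ x)
        states.append(x.copy())
    return np.array(states)

# Toy task: reproduce the input delayed by 3 steps (a stand-in for word-by-word
# role prediction); the readout is fit by ridge regression, like a plastic readout.
u = rng.uniform(-1, 1, (2000, 1))
y = np.roll(u[:, 0], 3)
X = run_reservoir(u)
W_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(n_res), X.T @ y)

pred = X @ W_out
print("training MSE:", np.mean((pred[10:] - y[10:]) ** 2))
```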

  9. Applications of Parallel Processing in Mobile Banking

    Directory of Open Access Journals (Sweden)

    2007-01-01

    The future of mobile banking will be represented by applications that support mobile banking, Internet banking and EFT (Electronic Funds Transfer) transactions in a single user interface. In such a way, mobile banking will be able to cover all the types of applications demanded at the market level. The parallel processing of credit card bank transactions could be performed with the help of a grid network. Apart from some limitations, grid processing offers huge opportunities to exploit parallelism. For this reason, many applications of waiting queues in grid processing have been developed in recent years. Grid networks represent a distinctive and very modern field of parallel and distributed processing.

  10. Parallel-Processing Test Bed For Simulation Software

    Science.gov (United States)

    Blech, Richard; Cole, Gary; Townsend, Scott

    1996-01-01

    Second-generation Hypercluster computing system is multiprocessor test bed for research on parallel algorithms for simulation in fluid dynamics, electromagnetics, chemistry, and other fields with large computational requirements but relatively low input/output requirements. Built from standard, off-the-shelf hardware readily upgraded as improved technology becomes available. System used for experiments with such parallel-processing concepts as message-passing algorithms, debugging software tools, and computational steering. First-generation Hypercluster system described in "Hypercluster Parallel Processor" (LEW-15283).

  11. Parallel processing of genomics data

    Science.gov (United States)

    Agapito, Giuseppe; Guzzi, Pietro Hiram; Cannataro, Mario

    2016-10-01

    The availability of high-throughput experimental platforms for the analysis of biological samples, such as mass spectrometry, microarrays and Next Generation Sequencing, has made it possible to analyze a whole genome in a single experiment. Such platforms produce an enormous volume of data per experiment, thus the analysis of this enormous flow of data poses several challenges in terms of data storage, preprocessing, and analysis. To face those issues, efficient, possibly parallel, bioinformatics software needs to be used to preprocess and analyze data, for instance to highlight genetic variation associated with complex diseases. In this paper we present a parallel algorithm for the preprocessing and statistical analysis of genomics data, able to cope with high-dimensional data and achieve good response times. The proposed system is able to find statistically significant biological markers able to discriminate classes of patients that respond to drugs in different ways. Experiments performed on real and synthetic genomic datasets show good speed-up and scalability.
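    A common pattern for this kind of parallel preprocessing and statistics is to split the marker columns into chunks and evaluate a per-marker test in worker processes. The sketch below (synthetic data, a plain two-sample t statistic, and Python's standard process pool rather than the authors' system) illustrates the idea.

```python
# Per-marker two-sample t statistics computed in parallel column chunks.
import numpy as np
from concurrent.futures import ProcessPoolExecutor

rng = np.random.default_rng(0)
data = rng.normal(size=(200, 20_000))            # 200 patients x 20k markers
responder = rng.integers(0, 2, size=200).astype(bool)   # drug-response classes

def t_stats(cols):
    """Welch-style t statistic for each marker column in `cols`."""
    a, b = data[responder][:, cols], data[~responder][:, cols]
    num = a.mean(0) - b.mean(0)
    den = np.sqrt(a.var(0, ddof=1) / len(a) + b.var(0, ddof=1) / len(b))
    return num / den

if __name__ == "__main__":
    chunks = np.array_split(np.arange(data.shape[1]), 8)
    with ProcessPoolExecutor(max_workers=8) as pool:
        t = np.concatenate(list(pool.map(t_stats, chunks)))
    print("markers with |t| > 4:", int((np.abs(t) > 4).sum()))
```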

  12. Parallel processing at the SSC: The fact and the fiction

    International Nuclear Information System (INIS)

    Bourianoff, G.; Cole, B.

    1991-10-01

    Accurately modelling the behavior of particles circulating in accelerators is a computationally demanding task. The particle tracking code currently in use at the SSC is based upon a "thin element" analysis (TEAPOT). In this model each magnet in the lattice is described by a thin element at which the particle experiences an impulsive kick. Each kick requires approximately 200 floating-point operations (FLOP). For the SSC collider lattice consisting of 10^4 elements, performing a tracking study for a set of 100 particles over 10^7 turns would require 2 x 10^15 FLOPs. Even on a machine capable of 100 MFLOP/sec (MFLOPS), this would require 2 x 10^7 seconds, and many such runs are necessary. It should be noted that the accuracy with which the kicks are to be calculated is important: the large number of iterations involved will magnify the effects of small errors. The inability of current computational resources to effectively perform the full calculation motivates the migration of this calculation to the most powerful computers available. A survey of current research into new technologies for supercomputing reveals that the supercomputers of the future will be parallel in nature. Further, numerous such machines exist today, and are being used to solve other difficult problems. Thus it seems clear that it is not too early to begin developing tracking codes for parallel architectures. This report discusses implementing parallel processing on the SSC
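    The cost estimate quoted above can be checked with one line of arithmetic; the sketch below simply reproduces the numbers stated in the abstract.

```python
# Back-of-envelope check of the tracking cost quoted in the abstract.
flop_per_kick = 200          # floating-point operations per thin-element kick
elements      = 1e4          # thin elements in the collider lattice
particles     = 100
turns         = 1e7

total_flop = flop_per_kick * elements * particles * turns
print(f"total work        : {total_flop:.1e} FLOPs")           # 2.0e+15
print(f"time at 100 MFLOPS: {total_flop / 100e6:.1e} seconds")  # 2.0e+07
```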

  13. Parallel Processing of Images in Mobile Devices using BOINC

    Science.gov (United States)

    Curiel, Mariela; Calle, David F.; Santamaría, Alfredo S.; Suarez, David F.; Flórez, Leonardo

    2018-04-01

    Medical image processing helps health professionals make decisions for the diagnosis and treatment of patients. Since some algorithms for processing images require substantial amounts of resources, one could take advantage of distributed or parallel computing. A mobile grid can be an adequate computing infrastructure for this problem. A mobile grid is a grid that includes mobile devices as resource providers. In a previous step of this research, we selected BOINC as the infrastructure to build our mobile grid. However, parallel processing of images in mobile devices poses at least two important challenges: the execution of standard libraries for processing images and obtaining adequate performance when compared to desktop computer grids. By the time we started our research, the use of BOINC in mobile devices also involved two issues: a) the execution of programs in mobile devices required modifying the code to insert calls to the BOINC API, and b) the division of the image among the mobile devices as well as its merging required additional code in some BOINC components. This article presents answers to these four challenges.

  14. Parallel Processing of Images in Mobile Devices using BOINC

    Directory of Open Access Journals (Sweden)

    Curiel Mariela

    2018-04-01

    Medical image processing helps health professionals make decisions for the diagnosis and treatment of patients. Since some algorithms for processing images require substantial amounts of resources, one could take advantage of distributed or parallel computing. A mobile grid can be an adequate computing infrastructure for this problem. A mobile grid is a grid that includes mobile devices as resource providers. In a previous step of this research, we selected BOINC as the infrastructure to build our mobile grid. However, parallel processing of images in mobile devices poses at least two important challenges: the execution of standard libraries for processing images and obtaining adequate performance when compared to desktop computer grids. By the time we started our research, the use of BOINC in mobile devices also involved two issues: a) the execution of programs in mobile devices required modifying the code to insert calls to the BOINC API, and b) the division of the image among the mobile devices as well as its merging required additional code in some BOINC components. This article presents answers to these four challenges.

  15. An intelligent allocation algorithm for parallel processing

    Science.gov (United States)

    Carroll, Chester C.; Homaifar, Abdollah; Ananthram, Kishan G.

    1988-01-01

    The problem of allocating nodes of a program graph to processors in a parallel processing architecture is considered. The algorithm is based on critical path analysis, some allocation heuristics, and the execution granularity of nodes in a program graph. These factors, and the structure of the interprocessor communication network, influence the allocation. To achieve realistic estimations of the execution durations of allocations, the algorithm considers the fact that nodes in a program graph have to communicate through varying numbers of tokens. Coarse and fine granularities have been implemented, with interprocessor token-communication durations varying from zero up to values comparable to the execution durations of individual nodes. The effect of communication network structure on allocation is demonstrated by performing allocations for crossbar (non-blocking) and star (blocking) networks. The algorithm assumes the availability of as many processors as it needs for the optimal allocation of any program graph. Hence, the focus of allocation has been on varying token-communication durations rather than varying the number of processors. The algorithm always utilizes as many processors as necessary for the optimal allocation of any program graph, depending upon granularity and the characteristics of the interprocessor communication network.
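    The critical-path part of such an allocation scheme reduces to a longest-path computation over the task graph. The sketch below (a toy five-node DAG with made-up durations, ignoring token-communication costs) shows that computation in topological order.

```python
# Critical path (longest execution path) of a task DAG, computed in topological order.
from collections import defaultdict

duration = {"A": 3, "B": 2, "C": 4, "D": 1, "E": 5}        # node execution times
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D"), ("D", "E")]

succ, indeg = defaultdict(list), defaultdict(int)
for u, v in edges:
    succ[u].append(v)
    indeg[v] += 1

# Kahn topological order.
order, queue = [], [n for n in duration if indeg[n] == 0]
while queue:
    u = queue.pop()
    order.append(u)
    for v in succ[u]:
        indeg[v] -= 1
        if indeg[v] == 0:
            queue.append(v)

# Earliest finish time = own duration plus longest chain of predecessors.
finish = {n: duration[n] for n in order}
for u in order:
    for v in succ[u]:
        finish[v] = max(finish[v], finish[u] + duration[v])

print("critical path length:", max(finish.values()))        # A-C-D-E = 13
```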

  16. Human Computer Music Performance

    OpenAIRE

    Dannenberg, Roger B.

    2012-01-01

    Human Computer Music Performance (HCMP) is the study of music performance by live human performers and real-time computer-based performers. One goal of HCMP is to create a highly autonomous artificial performer that can fill the role of a human, especially in a popular music setting. This will require advances in automated music listening and understanding, new representations for music, techniques for music synchronization, real-time human-computer communication, music generation, sound synt...

  17. Morphological evidence for parallel processing of information in rat macula

    Science.gov (United States)

    Ross, M. D.

    1988-01-01

    Study of montages, tracings and reconstructions prepared from a series of 570 consecutive ultrathin sections shows that rat maculas are morphologically organized for parallel processing of linear acceleratory information. Type II cells of one terminal field distribute information to neighboring terminals as well. The findings are examined in light of physiological data which indicate that macular receptor fields have a preferred directional vector, and are interpreted by analogy to a computer technology known as an information network.

  18. Evidence of Parallel Processing During Translation

    DEFF Research Database (Denmark)

    Balling, Laura Winther; Hvelplund, Kristian Tangsgaard; Sjørup, Annette Camilla

    2014-01-01

    conclude that translation is a parallel process and that literal translation is likely to be a universal initial default strategy in translation. This conclusion is strengthened by the fact that all three experiments were relatively naturalistic, due to the combination of remote eye tracking and mixed...

  19. Load balancing in highly parallel processing of Monte Carlo code for particle transport

    International Nuclear Information System (INIS)

    Higuchi, Kenji; Takemiya, Hiroshi; Kawasaki, Takuji

    1998-01-01

    In parallel processing of Monte Carlo (MC) codes for neutron, photon and electron transport problems, particle histories are assigned to processors, making use of the independence of the calculation for each particle. Although we can easily parallelize the main part of an MC code by this method, it is necessary, and practically difficult, to optimize the code with regard to load balancing in order to attain a high speedup ratio in highly parallel processing. In fact, the speedup ratio in the case of 128 processors remains at nearly one hundred when using the test bed for the performance evaluation. Through the parallel processing of the MCNP code, which is widely used in the nuclear field, it is shown that it is difficult to attain high performance by static load balancing, especially in neutron transport problems, and that a load balancing method which dynamically changes the number of assigned particles, minimizing the sum of the computational and communication costs, overcomes the difficulty, resulting in nearly a fifteen percent reduction in execution time. (author)

  20. Parallel processing of two-dimensional Sn transport calculations

    International Nuclear Information System (INIS)

    Uematsu, M.

    1997-01-01

    A parallel processing method for the two-dimensional Sn transport code DOT3.5 has been developed to achieve a drastic reduction in computation time. In the proposed method, parallelization is achieved with angular domain decomposition and/or space domain decomposition. The calculational speed of parallel processing by angular domain decomposition is largely influenced by frequent communications between processing elements. To assess parallelization efficiency, sample problems with up to 32 x 32 spatial meshes were solved with a Sun workstation using the PVM message-passing library. As a result, parallel calculation using 16 processing elements, for example, was found to be nine times as fast as that with one processing element. As for parallel processing by geometry segmentation, the influence of processing element communications on computation time is small; however, discontinuity at the segment boundary degrades convergence speed. To accelerate the convergence, an alternate sweep of angular flux in conjunction with space domain decomposition and a two-step rescaling method consisting of segmentwise rescaling and ordinary pointwise rescaling have been developed. By applying the developed method, the number of iterations needed to obtain a converged flux solution was reduced by a factor of 2. As a result, parallel calculation using 16 processing elements was found to be 5.98 times as fast as the original DOT3.5 calculation

  1. Aspects of parallel processing and control engineering

    OpenAIRE

    McKittrick, Brendan J

    1991-01-01

    The concept of parallel processing is not a new one, but the application of it to control engineering tasks is a relatively recent development, made possible by contemporary hardware and software innovation. It has long been accepted that, if properly orchestrated, several processors/CPUs combined can form a powerful processing entity. What prevented this from being implemented in commercial systems was the adequacy of the microprocessor for most tasks and hence the expense of a multi-pro...

  2. Researching the Parallel Process in Supervision and Psychotherapy

    DEFF Research Database (Denmark)

    Jacobsen, Claus Haugaard

    Reflects upon how to do process research in supervision and in the parallel process. A single case study is presented illustrating how a study on parallel process can be carried out.

  3. Partitioning sparse rectangular matrices for parallel processing

    Energy Technology Data Exchange (ETDEWEB)

    Kolda, T.G.

    1998-05-01

    The authors are interested in partitioning sparse rectangular matrices for parallel processing. The partitioning problem has been well-studied in the square symmetric case, but the rectangular problem has received very little attention. They will formalize the rectangular matrix partitioning problem and discuss several methods for solving it. They will extend the spectral partitioning method for symmetric matrices to the rectangular case and compare this method to three new methods -- the alternating partitioning method and two hybrid methods. The hybrid methods will be shown to be best.
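    One common way to realize spectral bipartitioning for a rectangular matrix is to split rows and columns by the sign of the second singular vectors of a degree-normalized version of the nonzero pattern. The sketch below follows that generic recipe (in the spirit of the abstract, not necessarily the authors' exact formulation; the random test matrix is illustrative).

```python
# Spectral bipartitioning heuristic for a sparse rectangular 0/1 pattern.
import numpy as np

rng = np.random.default_rng(0)
A = (rng.random((40, 60)) < 0.1).astype(float)    # sparse nonzero pattern

r = np.maximum(A.sum(axis=1), 1.0)                # row degrees
c = np.maximum(A.sum(axis=0), 1.0)                # column degrees
An = A / np.sqrt(r)[:, None] / np.sqrt(c)[None, :]

U, s, Vt = np.linalg.svd(An, full_matrices=False)
row_part = U[:, 1] >= 0                           # split rows by 2nd left singular vector
col_part = Vt[1] >= 0                             # split cols by 2nd right singular vector

# Nonzeros crossing the partition imply communication in a parallel matrix-vector product.
cut = A[row_part][:, ~col_part].sum() + A[~row_part][:, col_part].sum()
print(f"rows: {row_part.sum()} vs {(~row_part).sum()}, "
      f"cols: {col_part.sum()} vs {(~col_part).sum()}, cut nonzeros: {cut:.0f}")
```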

  4. CUDA/GPU Technology : Parallel Programming For High Performance Scientific Computing

    OpenAIRE

    YUHENDRA; KUZE, Hiroaki; JOSAPHAT, Tetuko Sri Sumantyo

    2009-01-01

    Graphics processing units (GPUs), originally designed for computer video cards, have emerged as the most powerful chips in a high-performance workstation. In terms of high-performance computation capabilities, graphics processing units (GPUs) deliver much more powerful performance than conventional CPUs by means of parallel processing. In 2007, the birth of the Compute Unified Device Architecture (CUDA) and CUDA-enabled GPUs by NVIDIA Corporation brought a revolution in the general purpose GPU a...

  5. High performance computing in linear control

    International Nuclear Information System (INIS)

    Datta, B.N.

    1993-01-01

    Remarkable progress has been made in both theory and applications of all important areas of control. The theory is rich and very sophisticated. Some beautiful applications of control theory are presently being made in aerospace, biomedical engineering, industrial engineering, robotics, economics, power systems, etc. Unfortunately, the same assessment of progress does not hold in general for computations in control theory. Control Theory is lagging behind other areas of science and engineering in this respect. Nowadays there is a revolution going on in the world of high performance scientific computing. Many powerful computers with vector and parallel processing have been built and have been available in recent years. These supercomputers offer very high speed in computations. Highly efficient software, based on powerful algorithms, has been developed to use on these advanced computers, and has also contributed to increased performance. While workers in many areas of science and engineering have taken great advantage of these hardware and software developments, control scientists and engineers, unfortunately, have not been able to take much advantage of these developments

  6. Digital intermediate frequency QAM modulator using parallel processing

    Science.gov (United States)

    Pao, Hsueh-Yuan [Livermore, CA]; Tran, Binh-Nien [San Ramon, CA]

    2008-05-27

    The digital Intermediate Frequency (IF) modulator applies to various modulation types and offers a simple and low cost method to implement a high-speed digital IF modulator using field programmable gate arrays (FPGAs). The architecture eliminates multipliers and sequential processing by storing the pre-computed modulated cosine and sine carriers in ROM look-up-tables (LUTs). The high-speed input data stream is parallel processed using the corresponding LUTs, which reduces the main processing speed, allowing the use of low cost FPGAs.
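    The idea of trading multipliers for table lookups can be sketched in software: pre-compute the modulated carrier for every possible symbol once, then modulate by indexing. The toy 16-QAM example below (arbitrary sample count and carrier frequency; it is not the patented FPGA architecture) illustrates the LUT principle.

```python
# LUT-based QAM modulation: one pre-computed modulated-carrier waveform per symbol.
import numpy as np

samples_per_symbol = 8
fc_cycles = 2                                   # carrier cycles per symbol
t = np.arange(samples_per_symbol) / samples_per_symbol
cos_c = np.cos(2 * np.pi * fc_cycles * t)
sin_c = np.sin(2 * np.pi * fc_cycles * t)

# 16-QAM constellation: I and Q each take values in {-3, -1, +1, +3}.
levels = np.array([-3, -1, 1, 3])
lut = np.empty((16, samples_per_symbol))
for sym in range(16):
    i_amp, q_amp = levels[sym >> 2], levels[sym & 3]
    lut[sym] = i_amp * cos_c - q_amp * sin_c    # pre-computed modulated carrier

# Run time: no multiplies, just indexing the LUT with the 4-bit symbols.
symbols = np.random.default_rng(0).integers(0, 16, size=1000)
waveform = lut[symbols].ravel()
print(waveform.shape)                           # (8000,)
```

    In hardware, independent LUT copies can serve several input samples at once, which is the parallel-processing aspect emphasized in the abstract.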

  7. Oxytocin: parallel processing in the social brain?

    Science.gov (United States)

    Dölen, Gül

    2015-06-01

    Early studies attempting to disentangle the network complexity of the brain exploited the accessibility of sensory receptive fields to reveal circuits made up of synapses connected both in series and in parallel. More recently, extension of this organisational principle beyond the sensory systems has been made possible by the advent of modern molecular, viral and optogenetic approaches. Here, evidence supporting parallel processing of social behaviours mediated by oxytocin is reviewed. Understanding oxytocinergic signalling from this perspective has significant implications for the design of oxytocin-based therapeutic interventions aimed at disorders such as autism, where disrupted social function is a core clinical feature. Moreover, identification of opportunities for novel technology development will require a better appreciation of the complexity of the circuit-level organisation of the social brain. © 2015 The Authors. Journal of Neuroendocrinology published by John Wiley & Sons Ltd on behalf of British Society for Neuroendocrinology.

  8. High Performance Computing Multicast

    Science.gov (United States)

    2012-02-01

    Excerpt from the report's references and acronym list: "A History of the Virtual Synchrony Replication Model," in Replication: Theory and Practice, Charron-Bost, B., Pedone, F., and Schiper, A. (Eds.); IP/IPv4, Internet Protocol (version 4.0); IPMC, Internet Protocol Multicast; LAN, Local Area Network; MCMD, Dr. Multicast; MPI, ...

  9. A qualitative single case study of parallel processes

    DEFF Research Database (Denmark)

    Jacobsen, Claus Haugaard

    2007-01-01

    Parallel process in psychotherapy and supervision is a phenomenon manifest in relationships and interactions that originates in one setting and is reflected in another. This article presents an explorative single case study of parallel processes based on qualitative analyses of two successive randomly chosen psychotherapy sessions with a schizophrenic patient and the supervision session given in between. The author's analysis is verified by an independent examiner's analysis. Parallel processes are identified and described. Reflections on the dynamics of parallel processes and supervisory...

  10. Performing stencil computations

    Energy Technology Data Exchange (ETDEWEB)

    Donofrio, David

    2018-01-16

    A method and apparatus for performing stencil computations efficiently are disclosed. In one embodiment, a processor receives an offset, and in response, retrieves a value from a memory via a single instruction, where the retrieving comprises: identifying, based on the offset, one of a plurality of registers of the processor; loading an address stored in the identified register; and retrieving from the memory the value at the address.

  11. The Acoustic and Perceptual Effects of Series and Parallel Processing

    Directory of Open Access Journals (Sweden)

    Melinda C. Anderson

    2009-01-01

    Temporal envelope (TE) cues provide a great deal of speech information. This paper explores how spectral subtraction and dynamic-range compression gain modifications affect TE fluctuations for parallel and series configurations. In parallel processing, algorithms compute gains based on the same input signal, and the gains in dB are summed. In series processing, output from the first algorithm forms the input to the second algorithm. Acoustic measurements show that the parallel arrangement produces more gain fluctuations, introducing more changes to the TE than the series configurations. Intelligibility tests for normal-hearing (NH) and hearing-impaired (HI) listeners show (1) parallel processing gives significantly poorer speech understanding than an unprocessed (UNP) signal and the series arrangement and (2) series processing and UNP yield similar results. Speech quality tests show that UNP is preferred to both parallel and series arrangements, although spectral subtraction is the most preferred. No significant differences exist in sound quality between the series and parallel arrangements, or between the NH group and the HI group. These results indicate that gain modifications affect intelligibility and sound quality differently. Listeners appear to have a higher tolerance for gain modifications with regard to intelligibility, while judgments for sound quality appear to be more affected by smaller amounts of gain modification.
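
    The distinction between the two configurations can be made concrete with a small sketch. Here `gain_algs` stands for hypothetical gain-computing functions (for example spectral subtraction or compression) that each return a per-sample gain in dB; the functions and names are illustrative, not taken from the study.

```python
import numpy as np

def apply_parallel(x, gain_algs):
    """Parallel configuration: every algorithm sees the same input signal,
    the per-sample dB gains are summed, and the total gain is applied once."""
    total_db = sum(alg(x) for alg in gain_algs)
    return x * 10.0 ** (total_db / 20.0)

def apply_series(x, gain_algs):
    """Series configuration: each algorithm processes the output of the previous one."""
    y = np.asarray(x, dtype=float)
    for alg in gain_algs:
        y = y * 10.0 ** (alg(y) / 20.0)
    return y

# Hypothetical gain functions: a crude compressor and a fixed 3 dB attenuation.
compressor = lambda s: -0.5 * 20.0 * np.log10(np.maximum(np.abs(s), 1e-6))
attenuator = lambda s: np.full(np.shape(s), -3.0)
```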

  12. Density functional theory and parallel processing

    International Nuclear Information System (INIS)

    Ward, R.C.; Geist, G.A.; Butler, W.H.

    1987-01-01

    The authors demonstrate a method for obtaining the ground state energies and charge densities of a system of atoms described within density functional theory using simulated annealing on a parallel computer

  13. Parallel Processing with TreeClust

    Science.gov (United States)

    2017-09-01


  14. Parallel processing from applications to systems

    CERN Document Server

    Moldovan, Dan I

    1993-01-01

    This text provides one of the broadest presentations of parallel processing available, including the structure of parallel processors and parallel algorithms. The emphasis is on mapping algorithms to highly parallel computers, with extensive coverage of array and multiprocessor architectures. Early chapters provide insightful coverage on the analysis of parallel algorithms and program transformations, effectively integrating a variety of material previously scattered throughout the literature. Theory and practice are well balanced across diverse topics in this concise presentation. For exceptional cla...

  15. Computational Biology and High Performance Computing 2000

    Energy Technology Data Exchange (ETDEWEB)

    Simon, Horst D.; Zorn, Manfred D.; Spengler, Sylvia J.; Shoichet, Brian K.; Stewart, Craig; Dubchak, Inna L.; Arkin, Adam P.

    2000-10-19

    The pace of extraordinary advances in molecular biology has accelerated in the past decade due in large part to discoveries coming from genome projects on human and model organisms. The advances in the genome project so far, happening well ahead of schedule and under budget, have exceeded any dreams by its protagonists, let alone formal expectations. Biologists expect the next phase of the genome project to be even more startling in terms of dramatic breakthroughs in our understanding of human biology, the biology of health and of disease. Only today can biologists begin to envision the necessary experimental, computational and theoretical steps necessary to exploit genome sequence information for its medical impact, its contribution to biotechnology and economic competitiveness, and its ultimate contribution to environmental quality. High performance computing has become one of the critical enabling technologies, which will help to translate this vision of future advances in biology into reality. Biologists are increasingly becoming aware of the potential of high performance computing. The goal of this tutorial is to introduce the exciting new developments in computational biology and genomics to the high performance computing community.

  16. ParaBTM: A Parallel Processing Framework for Biomedical Text Mining on Supercomputers.

    Science.gov (United States)

    Xing, Yuting; Wu, Chengkun; Yang, Xi; Wang, Wei; Zhu, En; Yin, Jianping

    2018-04-27

    A prevailing way of extracting valuable information from biomedical literature is to apply text mining methods on unstructured texts. However, the massive amount of literature that needs to be analyzed poses a big data challenge to the processing efficiency of text mining. In this paper, we address this challenge by introducing parallel processing on a supercomputer. We developed paraBTM, a runnable framework that enables parallel text mining on the Tianhe-2 supercomputer. It employs a low-cost yet effective load balancing strategy to maximize the efficiency of parallel processing. We evaluated the performance of paraBTM on several datasets, utilizing three types of named entity recognition tasks as demonstration. Results show that, in most cases, the processing efficiency can be greatly improved with parallel processing, and the proposed load balancing strategy is simple and effective. In addition, our framework can be readily applied to other tasks of biomedical text mining besides NER.
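
    The paper's specific load-balancing strategy is not reproduced here; as a generic illustration of a low-cost static assignment of text-mining tasks to workers, a longest-processing-time heuristic could look like the following (document names, sizes, and worker counts are placeholders).

```python
import heapq

def assign_documents(doc_sizes, n_workers):
    """Longest-processing-time heuristic: sort documents by size and always hand
    the next one to the currently least-loaded worker."""
    heap = [(0.0, w, []) for w in range(n_workers)]        # (load, worker id, documents)
    heapq.heapify(heap)
    for doc, size in sorted(doc_sizes.items(), key=lambda kv: -kv[1]):
        load, w, docs = heapq.heappop(heap)                # least-loaded worker so far
        docs.append(doc)
        heapq.heappush(heap, (load + size, w, docs))
    return {w: docs for _, w, docs in heap}

# Example: document name -> size (e.g. number of sentences), placeholder numbers.
print(assign_documents({"d1": 900, "d2": 400, "d3": 850, "d4": 300}, n_workers=2))
```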

  17. Recent development for the ITS code system: Parallel processing and visualization

    International Nuclear Information System (INIS)

    Fan, W.C.; Turner, C.D.; Halbleib, J.A. Sr.; Kensek, R.P.

    1996-01-01

    A brief overview is given for two software developments related to the ITS code system. These developments provide parallel processing and visualization capabilities and thus allow users to perform ITS calculations more efficiently. Timing results and a graphical example are presented to demonstrate these capabilities

  18. Computer-Related Task Performance

    DEFF Research Database (Denmark)

    Longstreet, Phil; Xiao, Xiao; Sarker, Saonee

    2016-01-01

    The existing information system (IS) literature has acknowledged computer self-efficacy (CSE) as an important factor contributing to enhancements in computer-related task performance. However, the empirical results of CSE on performance have not always been consistent, and increasing an individual's CSE is often a cumbersome process. Thus, we introduce the theoretical concept of self-prophecy (SP) and examine how this social influence strategy can be used to improve computer-related task performance. Two experiments are conducted to examine the influence of SP on task performance. Results show that SP and CSE interact to influence performance. Implications are then discussed in terms of organizations' ability to increase performance.

  19. The Extended Parallel Process Model: Illuminating the Gaps in Research

    Science.gov (United States)

    Popova, Lucy

    2012-01-01

    This article examines constructs, propositions, and assumptions of the extended parallel process model (EPPM). Review of the EPPM literature reveals that its theoretical concepts are thoroughly developed, but the theory lacks consistency in operational definitions of some of its constructs. Out of the 12 propositions of the EPPM, a few have not…

  20. Using Motivational Interviewing Techniques to Address Parallel Process in Supervision

    Science.gov (United States)

    Giordano, Amanda; Clarke, Philip; Borders, L. DiAnne

    2013-01-01

    Supervision offers a distinct opportunity to experience the interconnection of counselor-client and counselor-supervisor interactions. One product of this network of interactions is parallel process, a phenomenon by which counselors unconsciously identify with their clients and subsequently present to their supervisors in a similar fashion…

  1. Parallel processing approach to transform-based image coding

    Science.gov (United States)

    Normile, James O.; Wright, Dan; Chu, Ken; Yeh, Chia L.

    1991-06-01

    This paper describes a flexible parallel processing architecture designed for use in real time video processing. The system consists of floating point DSP processors connected to each other via fast serial links; each processor has access to a globally shared memory. A multiple bus architecture in combination with a dual ported memory allows communication with a host control processor. The system has been applied to prototyping of video compression and decompression algorithms. The decomposition of transform based algorithms for decompression into a form suitable for parallel processing is described. A technique for automatic load balancing among the processors is developed and discussed; results are presented with image statistics and data rates. Finally, techniques for accelerating the system throughput are analyzed and results from the application of one such modification are described.

  2. Parallel processing and learning in simple systems. Final report, 10 January 1986-14 January 1989

    Energy Technology Data Exchange (ETDEWEB)

    Mpitsos, G.J.

    1989-03-15

    Work over the three-year tenure of this grant has dealt with interrelated studies of (1) neuropharmacology, (2) behavior, and (3) distributed/parallel processing in the generation of variable motor patterns in the buccal-oral system of the sea slug Pleurobranchaea californica. (4) Computer simulations of simple neural networks have been undertaken to examine neurointegrative principles that could not be examined in biological preparations. The simulation work has set the basis for further simulations dealing with networks having characteristics relating to real neurons. All of the work has had the goal of developing interdisciplinary tools for understanding the scale-independent problem of how individuals, each possessing only local knowledge of group activity, act within a group to produce different and variable adaptive outputs, and, in turn, of how the group influences the activity of the individual. The pharmacologic studies have had the goal of developing biochemical tools with which to identify groups of neurons that perform specific tasks during the production of a given behavior but are multifunctional by being critically involved in generating several different behaviors.

  3. Method of parallel processing in SANPO real time system

    International Nuclear Information System (INIS)

    Ostrovnoj, A.I.; Salamatin, I.M.

    1981-01-01

    A method of parallel processing in the SANPO real time system is described. Algorithms for data accumulation and preliminary processing in this system, implemented as parallel processes using a specialized high-level programming language, are described. The hierarchy of elementary processes is also described. It provides the synchronization of concurrent processes without semaphores. The developed means are applied to systems of experiment automation using SM-3 minicomputers [ru

  4. Improving engineers' performance with computers

    International Nuclear Information System (INIS)

    Purvis, E.E. III

    1984-01-01

    The problem addressed is how to improve the performance of engineers in the design, operation, and maintenance of nuclear power plants. The application of computer science to this problem offers a challenge in maximizing the use of developments outside the nuclear industry and setting priorities to address the most fruitful areas first. Areas of potential benefits include data base management through design, analysis, procurement, construction, operation maintenance, cost, schedule and interface control and planning, and quality engineering on specifications, inspection, and training

  5. Parallel processing is good for your scientific codes...But massively parallel processing is so much better

    International Nuclear Information System (INIS)

    Thomas, B.; Domain, Ch.; Souffez, Y.; Eon-Duval, P.

    1998-01-01

    Harnessing the power of many computers to solve difficult scientific problems concurrently is one of the most innovative trends in High Performance Computing. At EDF, we have invested in parallel computing and have achieved significant results. First, we improved the processing speed of strategic codes in order to extend their scope. Then we turned to numerical simulations at the atomic scale. These computations, which we never dreamt of before, provided us with a better understanding of metallurgical phenomena. More precisely, we were able to trace defects in alloys that are used in nuclear power plants. (author)

  6. A Parallel Processing Algorithm for Remote Sensing Classification

    Science.gov (United States)

    Gualtieri, J. Anthony

    2005-01-01

    A current thread in parallel computation is the use of cluster computers created by networking a few to thousands of commodity general-purpose workstation-level computers using the Linux operating system. For example, on the Medusa cluster at NASA/GSFC, this provides supercomputing performance, 130 Gflops (Linpack Benchmark), at moderate cost, $370K. However, to be useful for scientific computing in the area of Earth science, issues of ease of programming, access to existing scientific libraries, and portability of existing code need to be considered. In this paper, I address these issues in the context of tools for rendering earth science remote sensing data into useful products. In particular, I focus on a problem that can be decomposed into a set of independent tasks, which on a serial computer would be performed sequentially, but with a cluster computer can be performed in parallel, giving an obvious speedup. To make the ideas concrete, I consider the problem of classifying hyperspectral imagery where some ground truth is available to train the classifier. In particular I will use the Support Vector Machine (SVM) approach as applied to hyperspectral imagery. The approach will be to introduce notions about parallel computation and then to restrict the development to the SVM problem. Pseudocode (an outline of the computation) will be described and then details specific to the implementation will be given. Then timing results will be reported to show what speedups are possible using parallel computation. The paper will close with a discussion of the results.
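
    A minimal sketch of this decomposition, assuming the scikit-learn SVM rather than the author's implementation and a single multi-core machine rather than the Medusa cluster: train once, split the pixels into independent chunks, and classify the chunks in parallel.

```python
from concurrent.futures import ProcessPoolExecutor
from functools import partial
import numpy as np
from sklearn.svm import SVC

def _predict(model, chunk):
    """Worker: classify one chunk of pixel spectra with an already-trained model."""
    return model.predict(chunk)

def classify_pixels(train_spectra, train_labels, pixels, n_workers=4):
    """Train one SVM on the ground-truth pixels, then classify independent
    chunks of the remaining pixels in parallel. A sketch of the task
    decomposition only; the cluster (MPI) machinery is not reproduced here."""
    model = SVC(kernel="rbf", gamma="scale").fit(train_spectra, train_labels)
    chunks = np.array_split(pixels, n_workers)           # independent tasks
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        parts = pool.map(partial(_predict, model), chunks)
    return np.concatenate(list(parts))
```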

  7. Computed radiography systems performance evaluation

    International Nuclear Information System (INIS)

    Xavier, Clarice C.; Nersissian, Denise Y.; Furquim, Tania A.C.

    2009-01-01

    The performance of a computed radiography system was evaluated according to the AAPM Report No. 93. Evaluation tests proposed by the publication were performed, and the following nonconformities were found: imaging plate (IP) dark noise, which compromises the clinical image acquired using the IP; an uncalibrated exposure indicator, which can cause underexposure of the IP; nonlinearity of the system response, which causes overexposure; a resolution limit below that declared by the manufacturer and uncalibrated erasure thoroughness, impairing the visualization of structures; a Moire pattern visualized in the grid response; and IP throughput above that specified by the manufacturer. These nonconformities indicate that a lack of calibration in digital imaging systems can cause an increase in dose in order to solve image problems. (author)

  8. Automatic analysis (aa): efficient neuroimaging workflows and parallel processing using Matlab and XML

    Directory of Open Access Journals (Sweden)

    Rhodri eCusack

    2015-01-01

    Recent years have seen neuroimaging data becoming richer, with larger cohorts of participants, a greater variety of acquisition techniques, and increasingly complex analyses. These advances have made data analysis pipelines complex to set up and run (increasing the risk of human error) and time consuming to execute (restricting what analyses are attempted). Here we present an open-source framework, automatic analysis (aa), to address these concerns. Human efficiency is increased by making code modular and reusable, and managing its execution with a processing engine that tracks what has been completed and what needs to be (re)done. Analysis is accelerated by optional parallel processing of independent tasks on cluster or cloud computing resources. A pipeline comprises a series of modules that each perform a specific task. The processing engine keeps track of the data, calculating a map of upstream and downstream dependencies for each module. Existing modules are available for many analysis tasks, such as SPM-based fMRI preprocessing, individual and group level statistics, voxel-based morphometry, tractography, and multi-voxel pattern analyses (MVPA). However, aa also allows for full customization, and encourages efficient management of code: new modules may be written with only a small code overhead. aa has been used by more than 50 researchers in hundreds of neuroimaging studies comprising thousands of subjects. It has been found to be robust, fast and efficient, for simple single subject studies up to multimodal pipelines on hundreds of subjects. It is attractive to both novice and experienced users. aa can reduce the amount of time neuroimaging laboratories spend performing analyses and reduce errors, expanding the range of scientific questions it is practical to address.

  9. Automatic analysis (aa): efficient neuroimaging workflows and parallel processing using Matlab and XML.

    Science.gov (United States)

    Cusack, Rhodri; Vicente-Grabovetsky, Alejandro; Mitchell, Daniel J; Wild, Conor J; Auer, Tibor; Linke, Annika C; Peelle, Jonathan E

    2014-01-01

    Recent years have seen neuroimaging data sets becoming richer, with larger cohorts of participants, a greater variety of acquisition techniques, and increasingly complex analyses. These advances have made data analysis pipelines complicated to set up and run (increasing the risk of human error) and time consuming to execute (restricting what analyses are attempted). Here we present an open-source framework, automatic analysis (aa), to address these concerns. Human efficiency is increased by making code modular and reusable, and managing its execution with a processing engine that tracks what has been completed and what needs to be (re)done. Analysis is accelerated by optional parallel processing of independent tasks on cluster or cloud computing resources. A pipeline comprises a series of modules that each perform a specific task. The processing engine keeps track of the data, calculating a map of upstream and downstream dependencies for each module. Existing modules are available for many analysis tasks, such as SPM-based fMRI preprocessing, individual and group level statistics, voxel-based morphometry, tractography, and multi-voxel pattern analyses (MVPA). However, aa also allows for full customization, and encourages efficient management of code: new modules may be written with only a small code overhead. aa has been used by more than 50 researchers in hundreds of neuroimaging studies comprising thousands of subjects. It has been found to be robust, fast, and efficient, for simple single-subject studies up to multimodal pipelines on hundreds of subjects. It is attractive to both novice and experienced users. aa can reduce the amount of time neuroimaging laboratories spend performing analyses and reduce errors, expanding the range of scientific questions it is practical to address.
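
    The dependency-driven scheduling described here can be illustrated with a toy engine: launch a module as soon as all of its upstream modules have completed. This is a sketch under the assumption of an acyclic dependency map and picklable, top-level module functions; it is not the framework's actual code.

```python
from concurrent.futures import ProcessPoolExecutor, wait, FIRST_COMPLETED

def run_pipeline(modules, deps, n_workers=4):
    """Dependency-driven execution of pipeline modules.

    `modules` maps a module name to a callable; `deps` maps a module name to
    the set of upstream module names. Independent modules run in parallel.
    """
    done, running = set(), {}
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        while len(done) < len(modules):
            for name, fn in modules.items():
                if name not in done and name not in running and deps.get(name, set()) <= done:
                    running[name] = pool.submit(fn)           # all inputs ready: launch
            finished, _ = wait(running.values(), return_when=FIRST_COMPLETED)
            for name in [n for n, fut in running.items() if fut in finished]:
                running.pop(name).result()                    # re-raise worker errors
                done.add(name)
```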

  10. Vector and parallel computing on the IBM ES/3090, a powerful approach to solving problems in the utility industry

    International Nuclear Information System (INIS)

    Bellucci, V.J.

    1990-01-01

    This paper describes IBM's approach to parallel computing using the IBM ES/3090 computer. Parallel processing concepts are discussed, including advantages, potential performance improvements, and limitations. Particular applications and capabilities of the IBM ES/3090 are presented, along with preliminary results from some utilities in the application of parallel processing to simulation of system reliability, air pollution models, and power network dynamics

  11. "Let's Move" campaign: applying the extended parallel process model.

    Science.gov (United States)

    Batchelder, Alicia; Matusitz, Jonathan

    2014-01-01

    This article examines Michelle Obama's health campaign, "Let's Move," through the lens of the extended parallel process model (EPPM). "Let's Move" aims to reduce the childhood obesity epidemic in the United States. Developed by Kim Witte, EPPM rests on the premise that people's attitudes can be changed when fear is exploited as a factor of persuasion. Fear appeals work best (a) when a person feels a concern about the issue or situation, and (b) when he or she believes to have the capability of dealing with that issue or situation. Overall, the analysis found that "Let's Move" is based on past health campaigns that have been successful. An important element of the campaign is the use of fear appeals (as it is postulated by EPPM). For example, part of the campaign's strategies is to explain the severity of the diseases associated with obesity. By looking at the steps of EPPM, readers can also understand the strengths and weaknesses of "Let's Move."

  12. Plagiarism Detection for Indonesian Language using Winnowing with Parallel Processing

    Science.gov (United States)

    Arifin, Y.; Isa, S. M.; Wulandhari, L. A.; Abdurachman, E.

    2018-03-01

    Plagiarism takes many forms, not only copy-paste but also changing passive into active voice or paraphrasing without appropriate acknowledgment. It happens in all languages, including Indonesian. There is much previous research related to plagiarism detection in Indonesian using different methods, but some parts still have room for improvement. This research proposes a solution that improves the plagiarism detection technique so that it can detect more than the copy-paste form. The proposed solution uses Winnowing with additional steps in the pre-processing stage: stemming for Indonesian, and fingerprint generation with parallel processing, which saves processing time and produces the plagiarism result for the suspected document.
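
    The winnowing step itself is standard and can be sketched briefly: hash every k-gram of the normalised text, keep the minimum hash of each sliding window, and fingerprint many suspect documents in parallel. The parameter values, hash choice, and helper names below are illustrative; the Indonesian stemming step is assumed to have been applied to the text beforehand.

```python
import hashlib
from multiprocessing import Pool

def winnow(text, k=5, window=4):
    """Winnowing fingerprints: hash every k-gram of the normalised text, then
    keep the minimum hash of each window of `window` consecutive k-gram hashes
    (rightmost minimum on ties)."""
    text = "".join(ch.lower() for ch in text if ch.isalnum())
    grams = [text[i:i + k] for i in range(len(text) - k + 1)]
    hashes = [int(hashlib.md5(g.encode()).hexdigest(), 16) & 0xFFFFFFFF for g in grams]
    fingerprints = set()
    for i in range(max(len(hashes) - window + 1, 0)):
        win = hashes[i:i + window]
        m = min(win)
        pos = i + window - 1 - win[::-1].index(m)   # rightmost occurrence of the minimum
        fingerprints.add((m, pos))
    return fingerprints

def fingerprint_corpus(documents, processes=4):
    """Generate fingerprints for many suspect documents in parallel."""
    with Pool(processes) as pool:
        return pool.map(winnow, documents)

def similarity(fp_a, fp_b):
    """Jaccard similarity of the fingerprint hash sets of two documents."""
    a = {h for h, _ in fp_a}; b = {h for h, _ in fp_b}
    return len(a & b) / len(a | b) if a | b else 0.0
```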

  13. Parallel processing implementation for the coupled transport of photons and electrons using OpenMP

    Science.gov (United States)

    Doerner, Edgardo

    2016-05-01

    In this work the use of OpenMP to implement the parallel processing of the Monte Carlo (MC) simulation of the coupled transport of photons and electrons is presented. This implementation was carried out using a modified EGSnrc platform which enables the use of the Microsoft Visual Studio 2013 (VS2013) environment, together with the developing tools available in the Intel Parallel Studio XE 2015 (XE2015). The performance study of this new implementation was carried out on a desktop PC with a multi-core CPU, taking as a reference the performance of the original platform. The results were satisfactory, both in terms of scalability and parallelization efficiency.

  14. High Performance Spaceflight Computing (HPSC)

    Data.gov (United States)

    National Aeronautics and Space Administration — Space-based computing has not kept up with the needs of current and future NASA missions. We are developing a next-generation flight computing system that addresses...

  15. Parallel processing architecture for H.264 deblocking filter on multi-core platforms

    Science.gov (United States)

    Prasad, Durga P.; Sonachalam, Sekar; Kunchamwar, Mangesh K.; Gunupudi, Nageswara Rao

    2012-03-01

    This work presents a parallel processing architecture for the H.264 deblocking filter for multi-core platforms such as HyperX technology. Parallel techniques such as parallel processing of independent macroblocks, sub-blocks, and pixel rows are examined in this work. The deblocking architecture consists of a basic cell called the deblocking filter unit (DFU) and a dependent data buffer manager (DFM). The DFU can be used in several instances, catering to different performance needs; the DFM serves the data required for the different numbers of DFUs and also manages all the neighboring data required for future data processing by the DFUs. This approach achieves the scalability, flexibility, and performance excellence required in deblocking filters.

  16. High performance computing in Windows Azure cloud

    OpenAIRE

    Ambruš, Dejan

    2013-01-01

    High performance, security, availability, scalability, flexibility, and lower maintenance costs have essentially contributed to the growing popularity of cloud computing in all spheres of life, especially in business. In fact, cloud computing offers even more than this. With the use of virtual computing clusters, a runtime environment for high performance computing can also be implemented efficiently in a cloud. There are many advantages but also some disadvantages of cloud computing, some ...

  17. Big Data GPU-Driven Parallel Processing Spatial and Spatio-Temporal Clustering Algorithms

    Science.gov (United States)

    Konstantaras, Antonios; Skounakis, Emmanouil; Kilty, James-Alexander; Frantzeskakis, Theofanis; Maravelakis, Emmanuel

    2016-04-01

    Advances in graphics processing units' technology towards encompassing parallel architectures [1], comprised of thousands of cores and multiples of parallel threads, provide the foundation in terms of hardware for the rapid processing of various parallel applications regarding seismic big data analysis. Seismic data are normally stored as collections of vectors in massive matrices, growing rapidly in size as wider areas are covered, denser recording networks are being established and decades of data are being compiled together [2]. Yet, many processes regarding seismic data analysis are performed on each seismic event independently or as distinct tiles [3] of specific grouped seismic events within a much larger data set. Such processes, independent of one another, can be performed in parallel, narrowing down processing times drastically [1,3]. This research work presents the development and implementation of three parallel processing algorithms using CUDA C [4] for the investigation of potentially distinct seismic regions [5,6] present in the vicinity of the southern Hellenic seismic arc. The algorithms, programmed and executed in parallel comparatively, are: fuzzy k-means clustering with expert knowledge [7] in assigning the overall number of clusters; density-based clustering [8]; and a self-developed spatio-temporal clustering algorithm encompassing expert [9] and empirical knowledge [10] for the specific area under investigation. Indexing terms: GPU parallel programming, CUDA C, heterogeneous processing, distinct seismic regions, parallel clustering algorithms, spatio-temporal clustering. References: [1] Kirk, D. and Hwu, W.: 'Programming massively parallel processors - A hands-on approach', 2nd Edition, Morgan Kaufman Publisher, 2013; [2] Konstantaras, A., Valianatos, F., Varley, M.R. and Makris, J.P.: 'Soft-Computing Modelling of Seismicity in the Southern Hellenic Arc', Geoscience and Remote Sensing Letters, vol. 5 (3), pp. 323-327, 2008; [3] Papadakis, S. and ...

  18. A model for dealing with parallel processes in supervision

    Directory of Open Access Journals (Sweden)

    Lilja Cajvert

    2011-03-01

    Supervision in social work is essential for successful outcomes when working with clients. In social work, unconscious difficulties may arise and similar difficulties may occur in supervision as parallel processes. In this article, the development of a practice-based model of supervision to deal with parallel processes in supervision is described. The model has six phases. In the first phase, the focus is on the supervisor's inner world, his/her own reflections and observations. In the second phase, the supervision situation is "frozen", and the supervisees are invited to join the supervisor in taking a meta-perspective on the current situation of supervision. The focus in the third phase is on the inner world of all the group members as well as the visualization and identification of reflections and feelings that arose during the supervision process. Phase four focuses on the supervisee who presented a case, and in phase five the focus shifts to the common understanding and theorization of the supervision process as well as the definition and identification of possible parallel processes. In the final phase, the supervisee, with the assistance of the supervisor and other members of the group, develops a solution and determines how to proceed with the client in treatment. This article uses phenomenological concepts to provide a theoretical framework for the supervision model. Phenomenological reduction is an important approach to examine and to externalize and visualize the inner worlds of the supervisor and supervisees.

  19. A dataflow analysis tool for parallel processing of algorithms

    Science.gov (United States)

    Jones, Robert L., III

    1993-01-01

    A graph-theoretic design process and software tool is presented for selecting a multiprocessing scheduling solution for a class of computational problems. The problems of interest are those that can be described using a dataflow graph and are intended to be executed repetitively on a set of identical parallel processors. Typical applications include signal processing and control law problems. Graph analysis techniques are introduced and shown to effectively determine performance bounds, scheduling constraints, and resource requirements. The software tool is shown to facilitate the application of the design process to a given problem.
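
    Two of the graph-theoretic quantities mentioned (a latency bound and a throughput/resource bound) can be computed directly from a cost-weighted dataflow graph. The sketch below is a generic illustration, not the tool itself; the node costs, the graph encoding, and the assumption of an acyclic graph are all illustrative.

```python
import functools

def performance_bounds(graph, cost, n_processors):
    """Quick bounds from a dataflow graph: the critical-path (latency) bound and
    the work bound for `n_processors` identical processors.
    `graph` maps node -> list of successor nodes; `cost` maps node -> execution
    time. Assumes an acyclic graph."""

    @functools.lru_cache(maxsize=None)
    def longest_from(node):
        succ = graph.get(node, [])
        return cost[node] + (max(longest_from(s) for s in succ) if succ else 0.0)

    critical_path = max(longest_from(n) for n in cost)       # no schedule can beat this latency
    work_bound = sum(cost.values()) / n_processors            # nor this average throughput limit
    return critical_path, max(critical_path, work_bound)

# Example: a small signal-processing graph with unit-ish costs.
graph = {"read": ["filter"], "filter": ["fft", "ctrl"], "fft": ["write"], "ctrl": ["write"]}
cost = {"read": 1.0, "filter": 2.0, "fft": 4.0, "ctrl": 1.0, "write": 1.0}
print(performance_bounds(graph, cost, n_processors=3))
```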

  20. Implementation of parallel processing in the basf2 framework for Belle II

    International Nuclear Information System (INIS)

    Itoh, Ryosuke; Lee, Soohyung; Katayama, N; Mineo, S; Moll, A; Kuhr, T; Heck, M

    2012-01-01

    Recent PC servers are equipped with multi-core CPUs, and it is desirable to utilize their full processing power for data analysis in large-scale HEP experiments. The software framework basf2 is being developed for use in the Belle II experiment, a new-generation B-factory experiment at KEK, and parallel event processing to utilize multi-core CPUs is part of its design for use in massive data production. The details of the implementation of parallel event processing in the basf2 framework are discussed, with a report of a preliminary performance study under realistic use on a 32-core PC server.

  1. Ordering schemes for parallel processing of certain mesh problems

    International Nuclear Information System (INIS)

    O'Leary, D.

    1984-01-01

    In this work, some ordering schemes for mesh points are presented which enable algorithms such as the Gauss-Seidel or SOR iteration to be performed efficiently for the nine-point operator finite difference method on computers consisting of a two-dimensional grid of processors. Convergence results are presented for the discretization of u_{xx} + u_{yy} on a uniform mesh over a square, showing that the spectral radius of the iteration for these orderings is no worse than that for the standard row-by-row ordering of mesh points. Further applications of these mesh point orderings to network problems, more general finite difference operators, and picture processing problems are noted

  2. Test generation for digital circuits using parallel processing

    Science.gov (United States)

    Hartmann, Carlos R.; Ali, Akhtar-Uz-Zaman M.

    1990-12-01

    The problem of test generation for digital logic circuits is an NP-Hard problem. Recently, the availability of low cost, high performance parallel machines has spurred interest in developing fast parallel algorithms for computer-aided design and test. This report describes a method of applying a 15-valued logic system for digital logic circuit test vector generation in a parallel programming environment. A concept called fault site testing allows for test generation, in parallel, that targets more than one fault at a given location. The multi-valued logic system allows results obtained by distinct processors and/or processes to be merged by means of simple set intersections. A machine-independent description is given for the proposed algorithm.

  3. High-performance computing using FPGAs

    CERN Document Server

    Benkrid, Khaled

    2013-01-01

    This book is concerned with the emerging field of High Performance Reconfigurable Computing (HPRC), which aims to harness the high performance and relative low power of reconfigurable hardware–in the form of Field Programmable Gate Arrays (FPGAs)–in High Performance Computing (HPC) applications. It presents the latest developments in this field from applications, architecture, and tools and methodologies points of view. We hope that this work will form a reference for existing researchers in the field, and entice new researchers and developers to join the HPRC community.  The book includes:  Thirteen application chapters which present the most important application areas tackled by high performance reconfigurable computers, namely: financial computing, bioinformatics and computational biology, data search and processing, stencil computation e.g. computational fluid dynamics and seismic modeling, cryptanalysis, astronomical N-body simulation, and circuit simulation.     Seven architecture chapters which...

  4. Network survivability performance (computer diskette)

    Science.gov (United States)

    1993-11-01

    File characteristics: Data file; 1 file. Physical description: 1 computer diskette; 3 1/2 in.; high density; 2.0MB. System requirements: Mac; Word. This technical report has been developed to address the survivability of telecommunications networks including services. It responds to the need for a common understanding of, and assessment techniques for network survivability, availability, integrity, and reliability. It provides a basis for designing and operating telecommunication networks to user expectations for network survivability.

  5. A new decomposition method for parallel processing multi-level optimization

    International Nuclear Information System (INIS)

    Park, Hyung Wook; Kim, Min Soo; Choi, Dong Hoon

    2002-01-01

    In practical designs, most multidisciplinary problems have a large and complicated design system. Since multidisciplinary problems have hundreds of analyses and thousands of variables, the grouping of the analyses and their order within each group affect the speed of the total design cycle. Therefore, it is very important to reorder and regroup the original design processes in order to minimize the total computational cost, by decomposing large multidisciplinary problems into several MultiDisciplinary Analysis SubSystems (MDASS) and by processing them in parallel. In this study, a new decomposition method is proposed for parallel processing of multidisciplinary design optimization, such as Collaborative Optimization (CO) and the Individual Discipline Feasible (IDF) method. Numerical results for two example problems are presented to show the feasibility of the proposed method

  6. High-performance computing — an overview

    Science.gov (United States)

    Marksteiner, Peter

    1996-08-01

    An overview of high-performance computing (HPC) is given. Different types of computer architectures used in HPC are discussed: vector supercomputers, high-performance RISC processors, various parallel computers like symmetric multiprocessors, workstation clusters, massively parallel processors. Software tools and programming techniques used in HPC are reviewed: vectorizing compilers, optimization and vector tuning, optimization for RISC processors; parallel programming techniques like shared-memory parallelism, message passing and data parallelism; and numerical libraries.

  7. In-cylinder diesel spray combustion simulations using parallel computation: A performance benchmarking study

    International Nuclear Information System (INIS)

    Pang, Kar Mun; Ng, Hoon Kiat; Gan, Suyin

    2012-01-01

    Highlights: ► A performance benchmarking exercise is conducted for diesel combustion simulations. ► The reduced chemical mechanism shows its advantages over base and skeletal models. ► High efficiency and great reduction of CPU runtime are achieved through the 4-node solver. ► Increasing ISAT memory from 0.1 to 2 GB reduces the CPU runtime by almost 35%. ► Combustion and soot processes are predicted well with minimal computational cost. - Abstract: In the present study, in-cylinder diesel combustion simulation was performed with parallel processing on an Intel Xeon Quad-Core platform to allow both fluid dynamics and chemical kinetics of the surrogate diesel fuel model to be solved simultaneously on multiple processors. Here, Cartesian Z-Coordinate was selected as the most appropriate partitioning algorithm since it computationally bisects the domain such that the dynamic load associated with fuel particle tracking was evenly distributed during parallel computations. Other variables examined included the number of compute nodes, chemistry sizes and in situ adaptive tabulation (ISAT) parameters. Based on the performance benchmarking test conducted, a parallel configuration of 4 compute nodes was found to reduce the computational runtime most efficiently, whereby a parallel efficiency of up to 75.4% was achieved. The simulation results also indicated that the accuracy level was insensitive to the number of partitions or the partitioning algorithms. The effect of reducing the number of species on computational runtime was observed to be more significant than reducing the number of reactions. Besides, the study showed that an increase in the ISAT maximum storage of up to 2 GB reduced the computational runtime by 50%. Also, the ISAT error tolerance of 10^-3 was chosen to strike a balance between results accuracy and computational runtime. The optimised parameters in parallel processing and ISAT, as well as the use of the in-house reduced chemistry model, allowed accurate...

  8. Fraud Detection in Credit Card Transactions; Using Parallel Processing of Anomalies in Big Data

    Directory of Open Access Journals (Sweden)

    Mohammad Reza Taghva

    2016-10-01

    In parallel with the increasing use of electronic cards, especially in the banking industry, the volume of transactions using these cards has grown rapidly. Moreover, the financial nature of these cards has made fraud in this area attractive. The present study, using a MapReduce approach and parallel processing, applied the Kohonen neural network model to detect anomalies in bank card transactions. For this purpose, it was first proposed to classify all transactions into fraudulent and legal, which showed better performance compared with other methods. In the next step, we transformed the Kohonen model into the form of a parallel task, which demonstrated appropriate performance in terms of time and, as expected, is well suited to transactions under Big Data assumptions.

  9. US QCD computational performance studies with PERI

    International Nuclear Information System (INIS)

    Zhang, Y; Fowler, R; Huck, K; Malony, A; Porterfield, A; Reed, D; Shende, S; Taylor, V; Wu, X

    2007-01-01

    We report on some of the interactions between two SciDAC projects: The National Computational Infrastructure for Lattice Gauge Theory (USQCD), and the Performance Engineering Research Institute (PERI). Many modern scientific programs consistently report the need for faster computational resources to maintain global competitiveness. However, as the size and complexity of emerging high end computing (HEC) systems continue to rise, achieving good performance on such systems is becoming ever more challenging. In order to take full advantage of the resources, it is crucial to understand the characteristics of relevant scientific applications and the systems these applications are running on. Using tools developed under PERI and by other performance measurement researchers, we studied the performance of two applications, MILC and Chroma, on several high performance computing systems at DOE laboratories. In the case of Chroma, we discuss how the use of C++ and modern software engineering and programming methods are driving the evolution of performance tools

  10. GPU: the biggest key processor for AI and parallel processing

    Science.gov (United States)

    Baji, Toru

    2017-07-01

    Two types of processors exist in the market. One is the conventional CPU and the other is the Graphics Processing Unit (GPU). A typical CPU is composed of 1 to 8 cores, while a GPU has thousands of cores. CPUs are good for sequential processing, while GPUs are good at accelerating software with heavy parallel executions. The GPU was initially dedicated to 3D graphics. However, from 2006, when GPUs started to adopt general-purpose cores, it was noticed that this architecture can be used as a general-purpose massively parallel processor. NVIDIA developed a software framework, Compute Unified Device Architecture (CUDA), that makes it possible to easily program the GPU for these applications. With CUDA, GPUs started to be used widely in workstations and supercomputers. Recently two key technologies are highlighted in the industry: Artificial Intelligence (AI) and autonomous driving cars. AI requires massive parallel operations to train many layers of neural networks. With CPUs alone, it was impossible to finish the training in a practical time. The latest multi-GPU system with P100 makes it possible to finish the training in a few hours. For autonomous driving cars, TOPS-class performance is required to implement perception, localization and path planning processing, and again an SoC with an integrated GPU will play a key role there. In this paper, the evolution of the GPU, which is one of the biggest commercial devices requiring state-of-the-art fabrication technology, will be introduced, along with an overview of the GPU-demanding key applications described above.

  11. Debugging a high performance computing program

    Science.gov (United States)

    Gooding, Thomas M.

    2013-08-20

    Methods, apparatus, and computer program products are disclosed for debugging a high performance computing program by gathering lists of addresses of calling instructions for a plurality of threads of execution of the program, assigning the threads to groups in dependence upon the addresses, and displaying the groups to identify defective threads.
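
    The grouping step described in the abstract amounts to bucketing threads by their calling-instruction addresses so that small, unusual groups stand out. A toy illustration follows; the data layout and names are assumptions, not the patented implementation.

```python
from collections import defaultdict

def group_threads(call_addresses):
    """Group threads by their tuple of calling-instruction addresses and return
    the groups smallest-first, since small outlier groups often contain the
    defective (e.g. hung or diverged) threads."""
    groups = defaultdict(list)
    for thread_id, addresses in call_addresses.items():
        groups[tuple(addresses)].append(thread_id)
    return sorted(groups.items(), key=lambda item: len(item[1]))

# Example: most threads wait at the same call site, one is stuck elsewhere.
stacks = {0: [0x4A10, 0x4005], 1: [0x4A10, 0x4005], 2: [0x7F32, 0x4005]}
for addrs, tids in group_threads(stacks):
    print(f"{len(tids)} thread(s) at {', '.join(hex(a) for a in addrs)}")
```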

  12. Parallel processing of dose calculation for external photon beam therapy

    International Nuclear Information System (INIS)

    Kunieda, Etsuo; Ando, Yutaka; Tsukamoto, Nobuhiro; Ito, Hisao; Kubo, Atsushi

    1994-01-01

    We implemented external photon beam dose calculation programs on a parallel processor system consisting of Transputers, 32-bit processors especially suitable for multi-processor configurations. Two network conformations, binary-tree and pipeline, were evaluated for rectangular and irregular field dose calculation algorithms. Although computation speed increased in proportion to the number of CPUs, substantial overhead caused by inter-processor communication occurred when a smaller computation load was delivered to each processor. On the other hand, for irregular field calculation, which requires more computation capability for each calculation point, the communication overhead remained small even when more than 50 processors were involved. Real-time responses could be expected for more complex algorithms by increasing the number of processors. (author)

  13. Embedded High Performance Scalable Computing Systems

    National Research Council Canada - National Science Library

    Ngo, David

    2003-01-01

    The Embedded High Performance Scalable Computing Systems (EHPSCS) program is a cooperative agreement between Sanders, A Lockheed Martin Company and DARPA that ran for three years, from Apr 1995 - Apr 1998...

  14. Cloud Computing for Complex Performance Codes.

    Energy Technology Data Exchange (ETDEWEB)

    Appel, Gordon John [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Hadgu, Teklu [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Klein, Brandon Thorin [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Miner, John Gifford [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2017-02-01

    This report describes the use of cloud computing services for running complex public domain performance assessment problems. The work consisted of two phases: Phase 1 was to demonstrate complex codes, on several differently configured servers, could run and compute trivial small scale problems in a commercial cloud infrastructure. Phase 2 focused on proving non-trivial large scale problems could be computed in the commercial cloud environment. The cloud computing effort was successfully applied using codes of interest to the geohydrology and nuclear waste disposal modeling community.

  15. An Adaptive Middleware for Improved Computational Performance

    DEFF Research Database (Denmark)

    Bonnichsen, Lars Frydendal

    The performance improvements in computer systems over the past 60 years have been fueled by an exponential increase in energy efficiency. In recent years, the phenomenon known as the end of Dennard's scaling has slowed energy efficiency improvements — but improving computer energy efficiency is more important now than ever. Traditionally, most improvements in computer energy efficiency have come from improvements in lithography — the ability to produce smaller transistors — and computer architecture - the ability to apply those transistors efficiently. Since the end of scaling, we have seen... we are improving computational performance by exploiting modern hardware features, such as dynamic voltage-frequency scaling and transactional memory. Adapting software is an iterative process, requiring that we continually revisit it to meet new requirements or realities; a time consuming process...

  16. UFMulti: A new parallel processing software system for HEP

    Science.gov (United States)

    Avery, Paul; White, Andrew

    1989-12-01

    UFMulti is a multiprocessing software package designed for general purpose high energy physics applications, including physics and detector simulation, data reduction and DST physics analysis. The system is particularly well suited for installations where several workstations or computers are connected through a local area network (LAN). The initial configuration of the software is currently running on VAX/VMS machines with a planned extension to ULTRIX, using the new RISC CPUs from Digital, in the near future.

  17. UFMULTI: A new parallel processing software system for HEP

    International Nuclear Information System (INIS)

    Avery, P.; White, A.

    1989-01-01

    UFMulti is a multiprocessing software package designed for general purpose high energy physics applications, including physics and detector simulation, data reduction and DST physics analysis. The system is particularly well suited for installations where several workstations or computers are connected through a local area network (LAN). The initial configuration of the software is currently running on VAX/VMS machines with a planned extension to ULTRIX, using the new RISC CPUs from Digital, in the near future. (orig.)

  18. Regional-scale calculation of the LS factor using parallel processing

    Science.gov (United States)

    Liu, Kai; Tang, Guoan; Jiang, Ling; Zhu, A.-Xing; Yang, Jianyi; Song, Xiaodong

    2015-05-01

    With the increase of data resolution and the increasing application of USLE over large areas, the existing serial implementation of algorithms for computing the LS factor is becoming a bottleneck. In this paper, a parallel processing model based on the message passing interface (MPI) is presented for the calculation of the LS factor, so that massive datasets at a regional scale can be processed efficiently. The parallel model contains algorithms for calculating flow direction, flow accumulation, drainage network, slope, slope length and the LS factor. According to the existence of data dependence, the algorithms are divided into local algorithms and global algorithms. Parallel strategies are designed according to the algorithm characteristics, including a decomposition method that maintains the integrity of the results, an optimized workflow that reduces the time taken to export unnecessary intermediate data, and a buffer-communication-computation strategy that improves communication efficiency. Experiments on a multi-node system show that the proposed parallel model allows efficient calculation of the LS factor at a regional scale with a massive dataset.
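
    The buffer-communication-computation pattern for a local algorithm such as slope can be sketched with mpi4py: each rank holds a strip of the DEM plus ghost rows, exchanges halos with its neighbours, and then computes without further communication. The decomposition, array sizes, and variable names below are illustrative assumptions, not the paper's code.

```python
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

# Each rank owns a horizontal strip of the DEM plus one ghost row above and below.
rows_per_rank = 512 // size                      # placeholder grid size
local = np.random.rand(rows_per_rank + 2, 1024)  # stand-in for the real DEM tile

up = rank - 1 if rank > 0 else MPI.PROC_NULL
down = rank + 1 if rank < size - 1 else MPI.PROC_NULL

# Communication phase: exchange halo rows with the neighbouring strips.
comm.Sendrecv(sendbuf=local[1, :], dest=up, recvbuf=local[-1, :], source=down)
comm.Sendrecv(sendbuf=local[-2, :], dest=down, recvbuf=local[0, :], source=up)

# Computation phase: slope on interior cells by central differences, no more communication.
dzdx = (local[1:-1, 2:] - local[1:-1, :-2]) / 2.0
dzdy = (local[2:, 1:-1] - local[:-2, 1:-1]) / 2.0
slope_deg = np.degrees(np.arctan(np.hypot(dzdx, dzdy)))
```

    Run with, for example, `mpirun -n 4 python slope_strips.py`; each rank then holds the slope values for its own strip.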

  19. Function Follows Performance in Evolutionary Computational Processing

    DEFF Research Database (Denmark)

    Pasold, Anke; Foged, Isak Worre

    2011-01-01

    As the title ‘Function Follows Performance in Evolutionary Computational Processing’ suggests, this paper explores the potentials of employing multiple design and evaluation criteria within one processing model in order to account for a number of performative parameters desired within varied...

  20. Parallel processing based decomposition technique for efficient collaborative optimization

    International Nuclear Information System (INIS)

    Park, Hyung Wook; Kim, Sung Chan; Kim, Min Soo; Choi, Dong Hoon

    2001-01-01

    In practical design studies, most designers solve multidisciplinary problems with large and complex design systems. These multidisciplinary problems have hundreds of analyses and thousands of variables. The sequence of processes used to solve these problems affects the speed of the total design cycle. Thus it is very important for designers to reorder the original design processes to minimize the total computational cost. This is accomplished by decomposing a large multidisciplinary problem into several MultiDisciplinary Analysis SubSystems (MDASS) and processing them in parallel. This paper proposes a new strategy for parallel decomposition of multidisciplinary problems to raise design efficiency by using a genetic algorithm, and shows the relationship between decomposition and Multidisciplinary Design Optimization (MDO) methodology

  1. AHPCRC - Army High Performance Computing Research Center

    Science.gov (United States)

    2010-01-01

    Of particular interest is the ability of a distributed jamming network (DJN) to jam signals in all or part of a sensor or communications network.

  2. DURIP: High Performance Computing in Biomathematics Applications

    Science.gov (United States)

    2017-05-10

    The goal of this award was to enhance the capabilities of the Department of Applied Mathematics and Statistics (AMS) at the University of California, Santa Cruz (UCSC) to conduct research and research-related education in areas of high performance computing in biomathematics applications.

  3. Computer technique for evaluating collimator performance

    International Nuclear Information System (INIS)

    Rollo, F.D.

    1975-01-01

    A computer program has been developed to theoretically evaluate the overall performance of collimators used with radioisotope scanners and γ cameras. The first step of the program involves the determination of the line spread function (LSF) and geometrical efficiency from the fundamental parameters of the collimator being evaluated. The working equations can be applied to any plane of interest. The resulting LSF is applied to subroutine computer programs which compute corresponding modulation transfer function and contrast efficiency functions. The latter function is then combined with appropriate geometrical efficiency data to determine the performance index function. The overall computer program allows one to predict from the physical parameters of the collimator alone how well the collimator will reproduce various sized spherical voids of activity in the image plane. The collimator performance program can be used to compare the performance of various collimator types, to study the effects of source depth on collimator performance, and to assist in the design of collimators. The theory of the collimator performance equation is discussed, a comparison between the experimental and theoretical LSF values is made, and examples of the application of the technique are presented
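
    The LSF-to-MTF step of such a program is essentially a normalised Fourier transform of the line spread function. A generic sketch follows; the sampling interval, units, and names are assumptions for illustration, not the report's code.

```python
import numpy as np

def mtf_from_lsf(lsf, dx):
    """Modulation transfer function from a sampled line spread function.
    `lsf` is sampled at spacing `dx` (cm); returns spatial frequencies
    (cycles/cm) and the magnitude of the normalised Fourier transform,
    so that MTF(0) = 1."""
    lsf = np.asarray(lsf, dtype=float)
    lsf = lsf / lsf.sum()                      # normalise area to 1
    mtf = np.abs(np.fft.rfft(lsf))             # one-sided spectrum
    freqs = np.fft.rfftfreq(lsf.size, d=dx)
    return freqs, mtf

# Example: a Gaussian LSF with 0.1 mm sampling (0.01 cm).
x = np.arange(-64, 64) * 0.01
freqs, mtf = mtf_from_lsf(np.exp(-x**2 / (2 * 0.05**2)), dx=0.01)
```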

  4. Misleading Performance Claims in Parallel Computations

    Energy Technology Data Exchange (ETDEWEB)

    Bailey, David H.

    2009-05-29

    In a previous humorous note entitled 'Twelve Ways to Fool the Masses,' I outlined twelve common ways in which performance figures for technical computer systems can be distorted. In this paper and accompanying conference talk, I give a reprise of these twelve 'methods' and give some actual examples that have appeared in peer-reviewed literature in years past. I then propose guidelines for reporting performance, the adoption of which would raise the level of professionalism and reduce the level of confusion, not only in the world of device simulation but also in the larger arena of technical computing.

  5. High-performance computing for airborne applications

    International Nuclear Information System (INIS)

    Quinn, Heather M.; Manuzatto, Andrea; Fairbanks, Tom; Dallmann, Nicholas; Desgeorges, Rose

    2010-01-01

Recently, there have been attempts to move common satellite tasks to unmanned aerial vehicles (UAVs). UAVs are significantly cheaper to buy than satellites and easier to deploy on an as-needed basis. The more benign radiation environment also allows for an aggressive adoption of state-of-the-art commercial computational devices, which increases the amount of data that can be collected. There are a number of commercial computing devices currently available that are well suited to high-performance computing. These devices range from specialized computational devices, such as field-programmable gate arrays (FPGAs) and digital signal processors (DSPs), to traditional computing platforms, such as microprocessors. Even though the radiation environment is relatively benign, these devices could be susceptible to single-event effects. In this paper, we present radiation data for high-performance computing devices in an accelerated neutron environment. These devices include a multi-core digital signal processor, two field-programmable gate arrays, and a microprocessor. From these results, we found that all of these devices are suitable for many airplane environments without reliability problems.

  6. Application of parallel processing for automatic inspection of printed circuits

    International Nuclear Information System (INIS)

    Lougheed, R.M.

    1986-01-01

    Automated visual inspection of printed electronic circuits is a challenging application for image processing systems. Detailed inspection requires high speed analysis of gray scale imagery along with high quality optics, lighting, and sensing equipment. A prototype system has been developed and demonstrated at the Environmental Research Institute of Michigan (ERIM) for inspection of multilayer thick-film circuits. The central problem of real-time image processing is solved by a special-purpose parallel processor which includes a new high-speed Cytocomputer. In this chapter the inspection process and the algorithms used are summarized, along with the functional requirements of the machine vision system. Next, the parallel processor is described in detail and then performance on this application is given

  7. Spatially parallel processing of within-dimension conjunctions.

    Science.gov (United States)

    Linnell, K J; Humphreys, G W

    2001-01-01

    Within-dimension conjunction search for red-green targets amongst red-blue, and blue-green, nontargets is extremely inefficient (Wolfe et al, 1990 Journal of Experimental Psychology: Human Perception and Performance 16 879-892). We tested whether pairs of red-green conjunction targets can nevertheless be processed spatially in parallel. Participants made speeded detection responses whenever a red-green target was present. Across trials where a second identical target was present, the distribution of detection times was compatible with the assumption that targets were processed in parallel (Miller, 1982 Cognitive Psychology 14 247-279). We show that this was not an artifact of response-competition or feature-based processing. We suggest that within-dimension conjunctions can be processed spatially in parallel. Visual search for such items may be inefficient owing to within-dimension grouping between items.
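
The parallel-processing test cited here (Miller, 1982) compares the redundant-target response-time distribution against the race-model bound given by the sum of the single-target distributions. The sketch below illustrates that comparison on synthetic data; the response times, sample sizes, and time grid are assumptions, not the study's data.

```python
# Hedged sketch of the race-model inequality test (Miller, 1982): the
# redundant-target RT distribution is compared with the sum of the two
# single-target RT distributions. Data here are synthetic, for illustration only.
import numpy as np

rng = np.random.default_rng(1)
rt_single_1 = rng.normal(520, 60, 400)     # ms, one target present (assumed)
rt_single_2 = rng.normal(540, 60, 400)
rt_redundant = rng.normal(470, 55, 400)    # ms, two targets present

t_grid = np.arange(300, 800, 10)           # evaluation times in ms

def cdf(samples, t):
    """Empirical CDF of the samples evaluated on the time grid."""
    return np.mean(samples[:, None] <= t[None, :], axis=0)

bound = cdf(rt_single_1, t_grid) + cdf(rt_single_2, t_grid)   # race-model bound
redundant = cdf(rt_redundant, t_grid)

violations = t_grid[redundant > bound]
if violations.size:
    print("race-model bound exceeded at t (ms):", violations)
else:
    print("no violation of the race-model bound")
```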

  8. High performance parallel computers for science

    International Nuclear Information System (INIS)

    Nash, T.; Areti, H.; Atac, R.; Biel, J.; Cook, A.; Deppe, J.; Edel, M.; Fischler, M.; Gaines, I.; Hance, R.

    1989-01-01

This paper reports that Fermilab's Advanced Computer Program (ACP) has been developing cost-effective, yet practical, parallel computers for high energy physics since 1984. The ACP's latest developments are proceeding in two directions. A Second Generation ACP Multiprocessor System for experiments will include $3500 RISC processors, each with performance over 15 VAX MIPS. To support such high performance, the new system allows parallel I/O, parallel interprocess communication, and parallel host processes. The ACP Multi-Array Processor has been developed for theoretical physics. Each $4000 node is a FORTRAN- or C-programmable pipelined 20 Mflops (peak), 10 MByte single-board computer. These are plugged into a 16-port crossbar switch crate which handles both inter- and intra-crate communication. The crates are connected in a hypercube. Site-oriented applications like lattice gauge theory are supported by system software called CANOPY, which makes the hardware virtually transparent to users. A 256-node, 5 GFlop system is under construction

  9. Performative Computation-aided Design Optimization

    Directory of Open Access Journals (Sweden)

    Ming Tang

    2012-12-01

    Full Text Available This article discusses a collaborative research and teaching project between the University of Cincinnati, Perkins+Will’s Tech Lab, and the University of North Carolina Greensboro. The primary investigation focuses on the simulation, optimization, and generation of architectural designs using performance-based computational design approaches. The projects examine various design methods, including relationships between building form, performance and the use of proprietary software tools for parametric design.

  10. High performance computing on vector systems

    CERN Document Server

    Roller, Sabine

    2008-01-01

    Presents the developments in high-performance computing and simulation on modern supercomputer architectures. This book covers trends in hardware and software development in general and specifically the vector-based systems and heterogeneous architectures. It presents innovative fields like coupled multi-physics or multi-scale simulations.

  11. Reliable and Efficient Parallel Processing Algorithms and Architectures for Modern Signal Processing. Ph.D. Thesis

    Science.gov (United States)

    Liu, Kuojuey Ray

    1990-01-01

Least-squares (LS) estimation and spectral decomposition algorithms constitute the heart of modern signal processing and communication problems. Implementations of recursive LS and spectral decomposition algorithms onto parallel processing architectures such as systolic arrays, with efficient fault-tolerant schemes, are the major concerns of this dissertation. There are four major results in this dissertation. First, we propose the systolic block Householder transformation with application to recursive least-squares minimization. It is successfully implemented on a systolic array with a two-level pipelined implementation at the vector level as well as at the word level. Second, a real-time algorithm-based concurrent error detection scheme based on the residual method is proposed for the QRD RLS systolic array. The fault diagnosis, order-degraded reconfiguration, and performance analysis are also considered. Third, the dynamic range, stability, error detection capability under finite-precision implementation, order-degraded performance, and residual estimation under faulty situations for the QRD RLS systolic array are studied in detail. Finally, we propose the use of multiphase systolic algorithms for spectral decomposition based on the QR algorithm. Two systolic architectures, one based on a triangular array and another based on a rectangular array, are presented for the multiphase operations with fault-tolerant considerations. Eigenvectors and singular vectors can be easily obtained by using the multiphase operations. Performance issues are also considered.
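
For readers unfamiliar with the core computation these arrays pipeline, the following is a hedged, purely sequential numpy sketch of QR-decomposition-based recursive least squares with exponential forgetting. It shows the triangular update that a QRD-RLS array performs in hardware; the forgetting factor, signal model, and dimensions are illustrative assumptions, not the dissertation's design.

```python
# Hedged numpy sketch (not a systolic-array implementation) of QR-decomposition
# based recursive least squares with exponential forgetting. Data are made up.
import numpy as np

rng = np.random.default_rng(0)
n = 3                                      # number of filter coefficients
lam = 0.98                                 # forgetting factor (assumed)
true_w = np.array([0.5, -1.0, 2.0])

R = np.eye(n) * 1e-3                       # small initial triangular factor
z = np.zeros(n)

for _ in range(500):
    x = rng.normal(size=n)                 # regressor
    y = x @ true_w + 0.01 * rng.normal()   # noisy observation
    # Stack the forgotten triangular system with the new row and re-triangularize.
    M = np.vstack([np.hstack([np.sqrt(lam) * R, np.sqrt(lam) * z[:, None]]),
                   np.hstack([x, [y]])])
    Rfull = np.linalg.qr(M, mode="r")      # (n+1) x (n+1) upper triangular
    R, z = Rfull[:n, :n], Rfull[:n, n]

w = np.linalg.solve(R, z)                  # back-substitution gives the estimate
print("estimated coefficients:", np.round(w, 3))
```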

  12. Highly scalable parallel processing of extracellular recordings of Multielectrode Arrays.

    Science.gov (United States)

    Gehring, Tiago V; Vasilaki, Eleni; Giugliano, Michele

    2015-01-01

Technological advances of Multielectrode Arrays (MEAs) used for multisite, parallel electrophysiological recordings lead to an ever increasing amount of raw data being generated. Arrays with hundreds up to a few thousands of electrodes are slowly seeing widespread use, and the expectation is that more sophisticated arrays will become available in the near future. In order to process the large data volumes resulting from MEA recordings, there is a pressing need for new software tools able to process many data channels in parallel. Here we present a new tool for processing MEA data recordings that makes use of new programming paradigms and recent technology developments to unleash the power of modern highly parallel hardware, such as multi-core CPUs with vector instruction sets or GPGPUs. Our tool builds on and complements existing MEA data analysis packages. It shows high scalability and can be used to speed up some performance-critical pre-processing steps such as data filtering and spike detection, helping to make the analysis of larger data sets tractable.
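
The channel-parallel pre-processing pattern described here (independent band-pass filtering and threshold-based spike detection per electrode) can be sketched as below. This is a generic illustration using a process pool, not the authors' tool; the channel count, sampling rate, filter band, and threshold are assumptions.

```python
# Hedged sketch of channel-parallel MEA pre-processing: band-pass filtering plus
# threshold spike detection, one worker process per electrode batch.
import numpy as np
from scipy.signal import butter, filtfilt
from multiprocessing import Pool

FS = 20_000                                # sampling rate in Hz (assumed)
B, A = butter(3, [300 / (FS / 2), 3000 / (FS / 2)], btype="band")

def detect_spikes(channel):
    """Filter one electrode trace and return sample indices crossing -4*sigma."""
    filtered = filtfilt(B, A, channel)
    threshold = -4.0 * np.std(filtered)
    below = filtered < threshold
    onsets = np.flatnonzero(below[1:] & ~below[:-1]) + 1   # falling-edge crossings
    return onsets

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(0, 10, size=(64, 2 * FS))            # 64 channels, 2 s of noise
    with Pool() as pool:                                    # one worker per CPU core
        spikes_per_channel = pool.map(detect_spikes, data)
    print([len(s) for s in spikes_per_channel[:8]])
```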

  13. High-performance computing in seismology

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1996-09-01

    The scientific, technical, and economic importance of the issues discussed here presents a clear agenda for future research in computational seismology. In this way these problems will drive advances in high-performance computing in the field of seismology. There is a broad community that will benefit from this work, including the petroleum industry, research geophysicists, engineers concerned with seismic hazard mitigation, and governments charged with enforcing a comprehensive test ban treaty. These advances may also lead to new applications for seismological research. The recent application of high-resolution seismic imaging of the shallow subsurface for the environmental remediation industry is an example of this activity. This report makes the following recommendations: (1) focused efforts to develop validated documented software for seismological computations should be supported, with special emphasis on scalable algorithms for parallel processors; (2) the education of seismologists in high-performance computing technologies and methodologies should be improved; (3) collaborations between seismologists and computational scientists and engineers should be increased; (4) the infrastructure for archiving, disseminating, and processing large volumes of seismological data should be improved.

  14. HPCToolkit: performance tools for scientific computing

    Energy Technology Data Exchange (ETDEWEB)

    Tallent, N; Mellor-Crummey, J; Adhianto, L; Fagan, M; Krentel, M [Department of Computer Science, Rice University, Houston, TX 77005 (United States)

    2008-07-15

    As part of the U.S. Department of Energy's Scientific Discovery through Advanced Computing (SciDAC) program, science teams are tackling problems that require simulation and modeling on petascale computers. As part of activities associated with the SciDAC Center for Scalable Application Development Software (CScADS) and the Performance Engineering Research Institute (PERI), Rice University is building software tools for performance analysis of scientific applications on the leadership-class platforms. In this poster abstract, we briefly describe the HPCToolkit performance tools and how they can be used to pinpoint bottlenecks in SPMD and multi-threaded parallel codes. We demonstrate HPCToolkit's utility by applying it to two SciDAC applications: the S3D code for simulation of turbulent combustion and the MFDn code for ab initio calculations of microscopic structure of nuclei.

  15. HPCToolkit: performance tools for scientific computing

    International Nuclear Information System (INIS)

    Tallent, N; Mellor-Crummey, J; Adhianto, L; Fagan, M; Krentel, M

    2008-01-01

    As part of the U.S. Department of Energy's Scientific Discovery through Advanced Computing (SciDAC) program, science teams are tackling problems that require simulation and modeling on petascale computers. As part of activities associated with the SciDAC Center for Scalable Application Development Software (CScADS) and the Performance Engineering Research Institute (PERI), Rice University is building software tools for performance analysis of scientific applications on the leadership-class platforms. In this poster abstract, we briefly describe the HPCToolkit performance tools and how they can be used to pinpoint bottlenecks in SPMD and multi-threaded parallel codes. We demonstrate HPCToolkit's utility by applying it to two SciDAC applications: the S3D code for simulation of turbulent combustion and the MFDn code for ab initio calculations of microscopic structure of nuclei

  16. High performance computing and communications: Advancing the frontiers of information technology

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-12-31

This report, which supplements the President's Fiscal Year 1997 Budget, describes the interagency High Performance Computing and Communications (HPCC) Program. The HPCC Program will celebrate its fifth anniversary in October 1996 with an impressive array of accomplishments to its credit. Over its five-year history, the HPCC Program has focused on developing high performance computing and communications technologies that can be applied to computation-intensive applications. Major highlights for FY 1996: (1) High performance computing systems enable practical solutions to complex problems with accuracies not possible five years ago; (2) HPCC-funded research in very large scale networking techniques has been instrumental in the evolution of the Internet, which continues exponential growth in size, speed, and availability of information; (3) The combination of hardware capability measured in gigaflop/s, networking technology measured in gigabit/s, and new computational science techniques for modeling phenomena has demonstrated that very large scale accurate scientific calculations can be executed across heterogeneous parallel processing systems located thousands of miles apart; (4) Federal investments in HPCC software R and D support researchers who pioneered the development of parallel languages and compilers, high performance mathematical, engineering, and scientific libraries, and software tools--technologies that allow scientists to use powerful parallel systems to focus on Federal agency mission applications; and (5) HPCC support for virtual environments has enabled the development of immersive technologies, where researchers can explore and manipulate multi-dimensional scientific and engineering problems. Educational programs fostered by the HPCC Program have brought into classrooms new science and engineering curricula designed to teach computational science. This document contains a small sample of the significant HPCC Program accomplishments in FY 1996.

  17. Analytical performance modeling for computer systems

    CERN Document Server

    Tay, Y C

    2013-01-01

This book is an introduction to analytical performance modeling for computer systems, i.e., writing equations to describe their performance behavior. It is accessible to readers who have taken college-level courses in calculus and probability, networking and operating systems. This is not a training manual for becoming an expert performance analyst. Rather, the objective is to help the reader construct simple models for analyzing and understanding the systems that they are interested in. Describing a complicated system abstractly with mathematical equations requires a careful choice of assumpti
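
As a flavour of what such an analytical model looks like, here is a toy single-server (M/M/1) calculation expressing response time and queue length purely in terms of arrival and service rates. The numbers are arbitrary illustrations and the example is not taken from the book.

```python
# Toy analytical performance model: a single server treated as an M/M/1 queue,
# characterized entirely by its arrival rate and service rate. Numbers are assumed.
arrival_rate = 40.0      # requests per second offered to the server
service_rate = 50.0      # requests per second the server can complete

utilization = arrival_rate / service_rate            # fraction of time busy
mean_response = 1.0 / (service_rate - arrival_rate)  # seconds, standard M/M/1 result
mean_in_system = utilization / (1.0 - utilization)   # jobs in system, consistent with Little's law

print(f"utilization    = {utilization:.0%}")
print(f"mean response  = {mean_response * 1000:.1f} ms")
print(f"jobs in system = {mean_in_system:.1f}")
```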

  18. Computer performance optimization systems, applications, processes

    CERN Document Server

    Osterhage, Wolfgang W

    2013-01-01

    Computing power performance was important at times when hardware was still expensive, because hardware had to be put to the best use. Later on this criterion was no longer critical, since hardware had become inexpensive. Meanwhile, however, people have realized that performance again plays a significant role, because of the major drain on system resources involved in developing complex applications. This book distinguishes between three levels of performance optimization: the system level, application level and business processes level. On each, optimizations can be achieved and cost-cutting p

  19. High Performance Computing in Science and Engineering '15 : Transactions of the High Performance Computing Center

    CERN Document Server

    Kröner, Dietmar; Resch, Michael

    2016-01-01

    This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS) in 2015. The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.

  20. High Performance Computing in Science and Engineering '17 : Transactions of the High Performance Computing Center

    CERN Document Server

    Kröner, Dietmar; Resch, Michael; HLRS 2017

    2018-01-01

This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS) in 2017. The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.

  1. A methodology for performing computer security reviews

    International Nuclear Information System (INIS)

    Hunteman, W.J.

    1991-01-01

    DOE Order 5637.1, ''Classified Computer Security,'' requires regular reviews of the computer security activities for an ADP system and for a site. Based on experiences gained in the Los Alamos computer security program through interactions with DOE facilities, we have developed a methodology to aid a site or security officer in performing a comprehensive computer security review. The methodology is designed to aid a reviewer in defining goals of the review (e.g., preparation for inspection), determining security requirements based on DOE policies, determining threats/vulnerabilities based on DOE and local threat guidance, and identifying critical system components to be reviewed. Application of the methodology will result in review procedures and checklists oriented to the review goals, the target system, and DOE policy requirements. The review methodology can be used to prepare for an audit or inspection and as a periodic self-check tool to determine the status of the computer security program for a site or specific ADP system. 1 tab

  2. A methodology for performing computer security reviews

    International Nuclear Information System (INIS)

    Hunteman, W.J.

    1991-01-01

This paper reports on DOE Order 5637.1, Classified Computer Security, which requires regular reviews of the computer security activities for an ADP system and for a site. Based on experiences gained in the Los Alamos computer security program through interactions with DOE facilities, the authors have developed a methodology to aid a site or security officer in performing a comprehensive computer security review. The methodology is designed to aid a reviewer in defining goals of the review (e.g., preparation for inspection), determining security requirements based on DOE policies, determining threats/vulnerabilities based on DOE and local threat guidance, and identifying critical system components to be reviewed. Application of the methodology will result in review procedures and checklists oriented to the review goals, the target system, and DOE policy requirements. The review methodology can be used to prepare for an audit or inspection and as a periodic self-check tool to determine the status of the computer security program for a site or specific ADP system

  3. HIGH PERFORMANCE PHOTOGRAMMETRIC PROCESSING ON COMPUTER CLUSTERS

    Directory of Open Access Journals (Sweden)

    V. N. Adrov

    2012-07-01

Full Text Available Most CPU-consuming tasks in photogrammetric processing can be done in parallel. The algorithms take independent bits as input and produce independent bits as output. The independence of bits comes from the nature of such algorithms, since images, stereopairs, or small image blocks can be processed independently. Many photogrammetric algorithms are fully automatic and do not require human interference. Photogrammetric workstations can perform tie point measurements, DTM calculations, orthophoto construction, mosaicking, and many other service operations in parallel using distributed calculations. Distributed calculations save time, reducing computations that would take several days to several hours. Modern trends in computer technology show an increase in the number of CPU cores in workstations, rising speeds in local networks, and, as a result, falling prices for supercomputers and computer clusters that can contain hundreds or even thousands of computing nodes. Common distributed processing in DPW is usually targeted at interactive work with a limited number of CPU cores and is not optimized for centralized administration. The bottleneck of common distributed computing in photogrammetry can be the limited LAN throughput and storage performance, since huge volumes of large raster images must be processed.
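
The embarrassingly parallel structure described above (independent image blocks processed by independent workers) can be illustrated with a hedged sketch using a local process pool. The block size and the placeholder per-block function are assumptions and stand in for real photogrammetric kernels.

```python
# Hedged illustration of farming independent image blocks out to worker processes.
# The "per-block work" is a placeholder, not a real photogrammetric algorithm.
import numpy as np
from concurrent.futures import ProcessPoolExecutor

def process_block(block):
    """Placeholder per-block work; a real pipeline would resample/orthorectify here."""
    return float(block.mean())

def split_into_blocks(image, size=256):
    h, w = image.shape
    return [image[r:r + size, c:c + size]
            for r in range(0, h, size) for c in range(0, w, size)]

if __name__ == "__main__":
    image = np.random.default_rng(0).integers(0, 255, size=(2048, 2048), dtype=np.uint8)
    blocks = split_into_blocks(image)
    with ProcessPoolExecutor() as pool:        # scales with available CPU cores
        results = list(pool.map(process_block, blocks))
    print(f"processed {len(results)} blocks")
```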

  4. Monitoring SLAC High Performance UNIX Computing Systems

    International Nuclear Information System (INIS)

    Lettsome, Annette K.

    2005-01-01

Knowledge of the effectiveness and efficiency of computers is important when working with high performance systems. The monitoring of such systems is advantageous in order to foresee possible misfortunes or system failures. Ganglia is a software system designed for high performance computing systems to retrieve specific monitoring information. An alternative storage facility for Ganglia's collected data is needed since its default storage system, the round-robin database (RRD), struggles with data integrity. The creation of a script-driven MySQL database solves this dilemma. This paper describes the process taken in the creation and implementation of the MySQL database for use by Ganglia. Comparisons between data storage by both databases are made using gnuplot and Ganglia's real-time graphical user interface
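
A hedged sketch of the general idea, reading the XML metric dump that gmond typically publishes over TCP and inserting it into MySQL, is shown below. The port, XML element names, table schema, and credentials are assumptions for illustration; this is not the script described in the record.

```python
# Hedged sketch: pull Ganglia-style XML telemetry and store it in MySQL instead of
# RRD. Port, element names, schema, and credentials are assumptions for illustration.
import socket
import xml.etree.ElementTree as ET
import mysql.connector

def fetch_gmond_xml(host="localhost", port=8649):
    """Read the metric dump that gmond is assumed to publish as XML over TCP."""
    chunks = []
    with socket.create_connection((host, port)) as sock:
        while data := sock.recv(4096):
            chunks.append(data)
    return b"".join(chunks).decode()

def store_metrics(xml_text):
    db = mysql.connector.connect(user="ganglia", password="secret", database="ganglia")
    cur = db.cursor()
    for host in ET.fromstring(xml_text).iter("HOST"):
        for metric in host.iter("METRIC"):
            cur.execute(
                "INSERT INTO metrics (host, name, value) VALUES (%s, %s, %s)",
                (host.get("NAME"), metric.get("NAME"), metric.get("VAL")),
            )
    db.commit()
    db.close()

if __name__ == "__main__":
    store_metrics(fetch_gmond_xml())
```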

  5. High-Performance Compute Infrastructure in Astronomy: 2020 Is Only Months Away

    Science.gov (United States)

    Berriman, B.; Deelman, E.; Juve, G.; Rynge, M.; Vöckler, J. S.

    2012-09-01

    By 2020, astronomy will be awash with as much as 60 PB of public data. Full scientific exploitation of such massive volumes of data will require high-performance computing on server farms co-located with the data. Development of this computing model will be a community-wide enterprise that has profound cultural and technical implications. Astronomers must be prepared to develop environment-agnostic applications that support parallel processing. The community must investigate the applicability and cost-benefit of emerging technologies such as cloud computing to astronomy, and must engage the Computer Science community to develop science-driven cyberinfrastructure such as workflow schedulers and optimizers. We report here the results of collaborations between a science center, IPAC, and a Computer Science research institute, ISI. These collaborations may be considered pathfinders in developing a high-performance compute infrastructure in astronomy. These collaborations investigated two exemplar large-scale science-driver workflow applications: 1) Calculation of an infrared atlas of the Galactic Plane at 18 different wavelengths by placing data from multiple surveys on a common plate scale and co-registering all the pixels; 2) Calculation of an atlas of periodicities present in the public Kepler data sets, which currently contain 380,000 light curves. These products have been generated with two workflow applications, written in C for performance and designed to support parallel processing on multiple environments and platforms, but with different compute resource needs: the Montage image mosaic engine is I/O-bound, and the NASA Star and Exoplanet Database periodogram code is CPU-bound. Our presentation will report cost and performance metrics and lessons-learned for continuing development. Applicability of Cloud Computing: Commercial Cloud providers generally charge for all operations, including processing, transfer of input and output data, and for storage of data
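
The CPU-bound periodogram workload mentioned above can be illustrated, very loosely, with a generic Lomb-Scargle computation on a synthetic, irregularly sampled light curve. This is a hedged sketch only; the sampling, period, and trial grid are assumptions, and it bears no relation to the actual NASA Star and Exoplanet Database periodogram code.

```python
# Hedged illustration of a CPU-bound periodicity search: a generic Lomb-Scargle
# periodogram on a synthetic light curve (all numbers are assumed).
import numpy as np
from scipy.signal import lombscargle

rng = np.random.default_rng(2)
t = np.sort(rng.uniform(0, 90, 3000))            # irregular sampling times (days)
true_period = 3.7                                # days (assumed)
flux = 1.0 + 0.01 * np.sin(2 * np.pi * t / true_period) \
           + 0.002 * rng.normal(size=t.size)

periods = np.linspace(0.5, 10.0, 5000)           # trial periods to scan
ang_freqs = 2 * np.pi / periods                  # lombscargle expects angular frequencies
power = lombscargle(t, flux - flux.mean(), ang_freqs)

best = periods[np.argmax(power)]
print(f"strongest periodicity near {best:.2f} days")
```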

  6. Cloud Computing for Maintenance Performance Improvement

    OpenAIRE

    Kour, Ravdeep; Karim, Ramin; Parida, Aditya

    2013-01-01

Cloud Computing is an emerging research area. It can be utilised for acquiring effective and efficient information logistics. This paper uses cloud-based technology for the establishment of information logistics for a railway system, which requires information based on data from different data sources (e.g. railway maintenance, railway operation, and railway business data). In order to improve the performance of the maintenance process, relevant data from various sources need to be acquired, f...

  7. High Performance Computing Operations Review Report

    Energy Technology Data Exchange (ETDEWEB)

    Cupps, Kimberly C. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2013-12-19

    The High Performance Computing Operations Review (HPCOR) meeting—requested by the ASC and ASCR program headquarters at DOE—was held November 5 and 6, 2013, at the Marriott Hotel in San Francisco, CA. The purpose of the review was to discuss the processes and practices for HPC integration and its related software and facilities. Experiences and lessons learned from the most recent systems deployed were covered in order to benefit the deployment of new systems.

  8. Towards OpenVL: Improving Real-Time Performance of Computer Vision Applications

    Science.gov (United States)

    Shen, Changsong; Little, James J.; Fels, Sidney

Meeting constraints for real-time performance is a main issue for computer vision, especially for embedded computer vision systems. This chapter presents our progress on our open vision library (OpenVL), a novel software architecture to address efficiency through facilitating hardware acceleration, reusability, and scalability for computer vision systems. A logical image understanding pipeline is introduced to allow parallel processing. We also discuss progress on our middleware—vision library utility toolkit (VLUT)—that enables applications to operate transparently over a heterogeneous collection of hardware implementations. OpenVL works as a state machine, with an event-driven mechanism to provide users with application-level interaction. Various explicit or implicit synchronization and communication methods are supported among distributed processes in the logical pipelines. The intent of OpenVL is to allow users to quickly and easily recover useful information from multiple scenes, in a cross-platform, cross-language manner across various software environments and hardware platforms. To validate the critical underlying concepts of OpenVL, a human tracking system and a local positioning system are implemented and described. The novel architecture separates the specification of algorithmic details from the underlying implementation, allowing for different components to be implemented on an embedded system without recompiling code.

  9. Parallel processing data network of master and slave transputers controlled by a serial control network

    Science.gov (United States)

    Crosetto, Dario B.

    1996-01-01

    The present device provides for a dynamically configurable communication network having a multi-processor parallel processing system having a serial communication network and a high speed parallel communication network. The serial communication network is used to disseminate commands from a master processor (100) to a plurality of slave processors (200) to effect communication protocol, to control transmission of high density data among nodes and to monitor each slave processor's status. The high speed parallel processing network is used to effect the transmission of high density data among nodes in the parallel processing system. Each node comprises a transputer (104), a digital signal processor (114), a parallel transfer controller (106), and two three-port memory devices. A communication switch (108) within each node (100) connects it to a fast parallel hardware channel (70) through which all high density data arrives or leaves the node.

  10. The path toward HEP High Performance Computing

    International Nuclear Information System (INIS)

    Apostolakis, John; Brun, René; Gheata, Andrei; Wenzel, Sandro; Carminati, Federico

    2014-01-01

High Energy Physics code has been known for making poor use of high performance computing architectures. Efforts in optimising HEP code on vector and RISC architectures have yielded limited results, and recent studies have shown that, on modern architectures, it achieves a performance between 10% and 50% of the peak one. Although several successful attempts have been made to port selected codes to GPUs, no major HEP code suite has a 'High Performance' implementation. With LHC undergoing a major upgrade and a number of challenging experiments on the drawing board, HEP can no longer neglect the less-than-optimal performance of its code and has to try to make the best usage of the hardware. This activity is one of the foci of the SFT group at CERN, which hosts, among others, the Root and Geant4 projects. The activity of the experiments is shared and coordinated via a Concurrency Forum, where the experience in optimising HEP code is presented and discussed. Another activity is the Geant-V project, centred on the development of a high-performance prototype for particle transport. Achieving a good concurrency level on the emerging parallel architectures without a complete redesign of the framework can only be done by parallelizing at event level, or with a much larger effort at track level. Apart from the shareable data structures, this typically implies a multiplication factor in terms of memory consumption compared to the single-threaded version, together with sub-optimal handling of event processing tails. Besides this, the low-level instruction pipelining of modern processors cannot be used efficiently to speed up the program. We have implemented a framework that allows scheduling vectors of particles to an arbitrary number of computing resources in a fine-grain parallel approach. The talk will review the current optimisation activities within the SFT group with a particular emphasis on the development perspectives towards a simulation framework able to profit

  11. A High Performance VLSI Computer Architecture For Computer Graphics

    Science.gov (United States)

    Chin, Chi-Yuan; Lin, Wen-Tai

    1988-10-01

A VLSI computer architecture, consisting of multiple processors, is presented in this paper to satisfy the demands of modern computer graphics, e.g. high resolution, realistic animation, real-time display, etc. All processors share a global memory which is partitioned into multiple banks. Through a crossbar network, data from one memory bank can be broadcast to many processors. Processors are physically interconnected through a hyper-crossbar network (a crossbar-like network). By programming the network, the topology of communication links among processors can be reconfigured to satisfy specific dataflows of different applications. Each processor consists of a controller, arithmetic operators, local memory, a local crossbar network, and I/O ports to communicate with other processors, memory banks, and a system controller. Operations in each processor are characterized into two modes, i.e. object domain and space domain, to fully utilize the data-independence characteristics of graphics processing. Special graphics features such as 3D-to-2D conversion, shadow generation, texturing, and reflection can be easily handled. With the current high density interconnection (MI) technology, it is feasible to implement a 64-processor system to achieve 2.5 billion operations per second, a performance needed in most advanced graphics applications.

  12. NINJA: Java for High Performance Numerical Computing

    Directory of Open Access Journals (Sweden)

    José E. Moreira

    2002-01-01

Full Text Available When Java was first introduced, there was a perception that its many benefits came at a significant performance cost. In the particularly performance-sensitive field of numerical computing, initial measurements indicated a hundred-fold performance disadvantage between Java and more established languages such as Fortran and C. Although much progress has been made, and Java can now be competitive with C/C++ in many important situations, significant performance challenges remain. Existing Java virtual machines are not yet capable of performing the advanced loop transformations and automatic parallelization that are now common in state-of-the-art Fortran compilers. Java also has difficulties in implementing complex arithmetic efficiently. These performance deficiencies can be attacked with a combination of class libraries (packages, in Java) that implement truly multidimensional arrays and complex numbers, and new compiler techniques that exploit the properties of these class libraries to enable other, more conventional, optimizations. Two compiler techniques, versioning and semantic expansion, can be leveraged to allow fully automatic optimization and parallelization of Java code. Our measurements with the NINJA prototype Java environment show that Java can be competitive in performance with highly optimized and tuned Fortran code.

  13. RISC Processors and High Performance Computing

    Science.gov (United States)

    Bailey, David H.; Saini, Subhash; Craw, James M. (Technical Monitor)

    1995-01-01

    This tutorial will discuss the top five RISC microprocessors and the parallel systems in which they are used. It will provide a unique cross-machine comparison not available elsewhere. The effective performance of these processors will be compared by citing standard benchmarks in the context of real applications. The latest NAS Parallel Benchmarks, both absolute performance and performance per dollar, will be listed. The next generation of the NPB will be described. The tutorial will conclude with a discussion of future directions in the field. Technology Transfer Considerations: All of these computer systems are commercially available internationally. Information about these processors is available in the public domain, mostly from the vendors themselves. The NAS Parallel Benchmarks and their results have been previously approved numerous times for public release, beginning back in 1991.

  14. Computer fan performance enhancement via acoustic perturbations

    Energy Technology Data Exchange (ETDEWEB)

    Greenblatt, David, E-mail: davidg@technion.ac.il [Faculty of Mechanical Engineering, Technion - Israel Institute of Technology, Haifa (Israel); Avraham, Tzahi; Golan, Maayan [Faculty of Mechanical Engineering, Technion - Israel Institute of Technology, Haifa (Israel)

    2012-04-15

Highlights: ► Computer fan effectiveness was increased by introducing acoustic perturbations. ► Acoustic perturbations controlled blade boundary layer separation. ► Optimum frequencies corresponded with airfoils studies. ► Exploitation of flow instabilities was responsible for performance improvements. ► Peak pressure and peak flowrate were increased by 40% and 15% respectively. - Abstract: A novel technique for increasing computer fan effectiveness, based on introducing acoustic perturbations onto the fan blades to control boundary layer separation, was assessed. Experiments were conducted in a specially designed facility that simultaneously allowed characterization of fan performance and introduction of the perturbations. A parametric study was conducted to determine the optimum control parameters, namely those that deliver the largest increase in fan pressure for a given flowrate. The optimum reduced frequencies corresponded with those identified on stationary airfoils and it was thus concluded that the exploitation of Kelvin-Helmholtz instabilities, commonly observed on airfoils, was responsible for the fan blade performance improvements. The optimum control inputs, such as acoustic frequency and sound pressure level, showed some variation with different fan flowrates. With the near-optimum control conditions identified, the full operational envelope of the fan, when subjected to acoustic perturbations, was assessed. The peak pressure and peak flowrate were increased by up to 40% and 15% respectively. The peak fan efficiency increased with acoustic perturbations but the overall system efficiency was reduced when the speaker input power was accounted for.

  15. Computer fan performance enhancement via acoustic perturbations

    International Nuclear Information System (INIS)

    Greenblatt, David; Avraham, Tzahi; Golan, Maayan

    2012-01-01

    Highlights: ► Computer fan effectiveness was increased by introducing acoustic perturbations. ► Acoustic perturbations controlled blade boundary layer separation. ► Optimum frequencies corresponded with airfoils studies. ► Exploitation of flow instabilities was responsible for performance improvements. ► Peak pressure and peak flowrate were increased by 40% and 15% respectively. - Abstract: A novel technique for increasing computer fan effectiveness, based on introducing acoustic perturbations onto the fan blades to control boundary layer separation, was assessed. Experiments were conducted in a specially designed facility that simultaneously allowed characterization of fan performance and introduction of the perturbations. A parametric study was conducted to determine the optimum control parameters, namely those that deliver the largest increase in fan pressure for a given flowrate. The optimum reduced frequencies corresponded with those identified on stationary airfoils and it was thus concluded that the exploitation of Kelvin–Helmholtz instabilities, commonly observed on airfoils, was responsible for the fan blade performance improvements. The optimum control inputs, such as acoustic frequency and sound pressure level, showed some variation with different fan flowrates. With the near-optimum control conditions identified, the full operational envelope of the fan, when subjected to acoustic perturbations, was assessed. The peak pressure and peak flowrate were increased by up to 40% and 15% respectively. The peak fan efficiency increased with acoustic perturbations but the overall system efficiency was reduced when the speaker input power was accounted for.

  16. High performance computations using dynamical nucleation theory

    International Nuclear Information System (INIS)

    Windus, T L; Crosby, L D; Kathmann, S M

    2008-01-01

    Chemists continue to explore the use of very large computations to perform simulations that describe the molecular level physics of critical challenges in science. In this paper, we describe the Dynamical Nucleation Theory Monte Carlo (DNTMC) model - a model for determining molecular scale nucleation rate constants - and its parallel capabilities. The potential for bottlenecks and the challenges to running on future petascale or larger resources are delineated. A 'master-slave' solution is proposed to scale to the petascale and will be developed in the NWChem software. In addition, mathematical and data analysis challenges are described
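
The 'master-slave' scaling strategy mentioned here is the classic task-farming pattern: a master distributes independent sampling tasks and aggregates partial results. The sketch below illustrates that pattern on a trivial Monte Carlo estimate; it is unrelated to the DNTMC model or NWChem internals, and the task sizes and worker count are assumptions.

```python
# Hedged sketch of the master/worker ("master-slave") pattern on a toy Monte Carlo
# estimate of pi; each worker draws its own samples and returns a partial count.
import random
from multiprocessing import Pool

def worker(args):
    """Draw n_samples points in the unit square and count hits inside the circle."""
    seed, n_samples = args
    rng = random.Random(seed)
    return sum(rng.random() ** 2 + rng.random() ** 2 <= 1.0 for _ in range(n_samples))

if __name__ == "__main__":
    n_workers, per_worker = 8, 250_000
    tasks = [(seed, per_worker) for seed in range(n_workers)]   # master distributes work
    with Pool(n_workers) as pool:
        partial = pool.map(worker, tasks)                        # master collects results
    pi_estimate = 4.0 * sum(partial) / (n_workers * per_worker)
    print("pi ~", round(pi_estimate, 4))
```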

  17. Analysis of parallel computing performance of the code MCNP

    International Nuclear Information System (INIS)

    Wang Lei; Wang Kan; Yu Ganglin

    2006-01-01

Parallel computing can reduce the running time of the code MCNP effectively. With MPI message-passing software, MCNP5 can perform parallel computing on a PC cluster with the Windows operating system. The parallel computing performance of MCNP is influenced by factors such as the type, the complexity level, and the parameter configuration of the computing problem. This paper analyzes the parallel computing performance of MCNP with respect to these factors and gives measures to improve the MCNP parallel computing performance. (authors)

  18. Evaluation of high-performance computing software

    Energy Technology Data Exchange (ETDEWEB)

    Browne, S.; Dongarra, J. [Univ. of Tennessee, Knoxville, TN (United States); Rowan, T. [Oak Ridge National Lab., TN (United States)

    1996-12-31

The absence of unbiased and up-to-date comparative evaluations of high-performance computing software complicates a user's search for the appropriate software package. The National HPCC Software Exchange (NHSE) is attacking this problem using an approach that includes independent evaluations of software, incorporation of author and user feedback into the evaluations, and Web access to the evaluations. We are applying this approach to the Parallel Tools Library (PTLIB), a new software repository for parallel systems software and tools, and HPC-Netlib, a high performance branch of the Netlib mathematical software repository. Updating the evaluations with feedback and making them available via the Web helps ensure accuracy and timeliness, and using independent reviewers produces unbiased comparative evaluations difficult to find elsewhere.

  19. Cocaine Use and Delinquent Behavior among High-Risk Youths: A Growth Model of Parallel Processes

    Science.gov (United States)

    Dembo, Richard; Sullivan, Christopher

    2009-01-01

    We report the results of a parallel-process, latent growth model analysis examining the relationships between cocaine use and delinquent behavior among youths. The study examined a sample of 278 justice-involved juveniles completing at least one of three follow-up interviews as part of a National Institute on Drug Abuse-funded study. The results…

  20. Parallel processing for nonlinear dynamics simulations of structures including rotating bladed-disk assemblies

    Science.gov (United States)

    Hsieh, Shang-Hsien

    1993-01-01

    The principal objective of this research is to develop, test, and implement coarse-grained, parallel-processing strategies for nonlinear dynamic simulations of practical structural problems. There are contributions to four main areas: finite element modeling and analysis of rotational dynamics, numerical algorithms for parallel nonlinear solutions, automatic partitioning techniques to effect load-balancing among processors, and an integrated parallel analysis system.

  1. Psychodrama: A Creative Approach for Addressing Parallel Process in Group Supervision

    Science.gov (United States)

    Hinkle, Michelle Gimenez

    2008-01-01

    This article provides a model for using psychodrama to address issues of parallel process during group supervision. Information on how to utilize the specific concepts and techniques of psychodrama in relation to group supervision is discussed. A case vignette of the model is provided.

  2. Fear Control and Danger Control: A Test of the Extended Parallel Process Model (EPPM).

    Science.gov (United States)

    Witte, Kim

    1994-01-01

Explores the cognitive and emotional mechanisms underlying the success and failure of fear appeals in the context of AIDS prevention. Offers general support for the Extended Parallel Process Model. Suggests that cognitions lead to fear appeal success (attitude, intention, or behavior changes) via danger control processes, whereas the emotion fear leads to fear…

  3. An Inconvenient Truth: An Application of the Extended Parallel Process Model

    Science.gov (United States)

    Goodall, Catherine E.; Roberto, Anthony J.

    2008-01-01

    "An Inconvenient Truth" is an Academy Award-winning documentary about global warming presented by Al Gore. This documentary is appropriate for a lesson on fear appeals and the extended parallel process model (EPPM). The EPPM is concerned with the effects of perceived threat and efficacy on behavior change. Perceived threat is composed of an…

  4. Parallel processing and non-uniform grids in global air quality modeling

    NARCIS (Netherlands)

    Berkvens, P.J.F.; Bochev, Mikhail A.

    2002-01-01

    A large-scale global air quality model, running efficiently on a single vector processor, is enhanced to make more realistic and more long-term simulations feasible. Two strategies are combined: non-uniform grids and parallel processing. The communication through the hierarchy of non-uniform grids

  5. Strong Bisimilarity and Regularity of Basic Parallel Processes is PSPACE-Hard

    DEFF Research Database (Denmark)

    Srba, Jirí

    2002-01-01

    We show that the problem of checking whether two processes definable in the syntax of Basic Parallel Processes (BPP) are strongly bisimilar is PSPACE-hard. We also demonstrate that there is a polynomial time reduction from the strong bisimilarity checking problem of regular BPP to the strong...

  6. Sustainability Attitudes and Behavioral Motivations of College Students: Testing the Extended Parallel Process Model

    Science.gov (United States)

    Perrault, Evan K.; Clark, Scott K.

    2018-01-01

    Purpose: A planet that can no longer sustain life is a frightening thought--and one that is often present in mass media messages. Therefore, this study aims to test the components of a classic fear appeal theory, the extended parallel process model (EPPM) and to determine how well its constructs predict sustainability behavioral intentions. This…

  7. The path toward HEP High Performance Computing

    CERN Document Server

    Apostolakis, John; Carminati, Federico; Gheata, Andrei; Wenzel, Sandro

    2014-01-01

High Energy Physics code has been known for making poor use of high performance computing architectures. Efforts in optimising HEP code on vector and RISC architectures have yielded limited results, and recent studies have shown that, on modern architectures, it achieves a performance between 10% and 50% of the peak one. Although several successful attempts have been made to port selected codes to GPUs, no major HEP code suite has a 'High Performance' implementation. With LHC undergoing a major upgrade and a number of challenging experiments on the drawing board, HEP can no longer neglect the less-than-optimal performance of its code and has to try to make the best usage of the hardware. This activity is one of the foci of the SFT group at CERN, which hosts, among others, the Root and Geant4 projects. The activity of the experiments is shared and coordinated via a Concurrency Forum, where the experience in optimising HEP code is presented and discussed. Another activity is the Geant-V project, centred on th...

  8. A High Performance COTS Based Computer Architecture

    Science.gov (United States)

    Patte, Mathieu; Grimoldi, Raoul; Trautner, Roland

    2014-08-01

Using Commercial Off The Shelf (COTS) electronic components for space applications is a long-standing idea. Indeed, the difference in processing performance and energy efficiency between radiation-hardened components and COTS components is so important that COTS components are very attractive for use in mass- and power-constrained systems. However, using COTS components in space is not straightforward, as one must account for the effects of the space environment on the behavior of COTS components. In the frame of the ESA-funded activity called High Performance COTS Based Computer, Airbus Defense and Space and its subcontractor OHB CGS have developed and prototyped a versatile COTS-based architecture for high performance processing. The rest of the paper is organized as follows: in the first section we recapitulate the interests and constraints of using COTS components for space applications; then we briefly describe existing fault mitigation architectures and present our solution for fault mitigation based on a component called the SmartIO; in the last part of the paper we describe the prototyping activities executed during the HiP CBC project.

  9. Performance evaluation of a computed radiography system

    Energy Technology Data Exchange (ETDEWEB)

    Roussilhe, J.; Fallet, E. [Carestream Health France, 71 - Chalon/Saone (France); Mango, St.A. [Carestream Health, Inc. Rochester, New York (United States)

    2007-07-01

Computed radiography (CR) standards have been formalized and published in Europe and in the US. The CR system classification is defined in those standards by minimum normalized signal-to-noise ratio (SNRN) and maximum basic spatial resolution (SRb). Both the signal-to-noise ratio (SNR) and the contrast sensitivity of a CR system depend on the dose (exposure time and conditions) at the detector. Because of their wide dynamic range, the same storage phosphor imaging plate can qualify for all six CR system classes. The exposure characteristics from 30 to 450 kV, the contrast sensitivity, and the spatial resolution of the KODAK INDUSTREX CR Digital System have been thoroughly evaluated. This paper will present some of the factors that determine the system's spatial resolution performance. (authors)

  10. Computers in engineering. 1988

    International Nuclear Information System (INIS)

    Tipnis, V.A.; Patton, E.M.

    1988-01-01

These proceedings discuss the following subjects: knowledge-base systems; computers in designing; uses of artificial intelligence; engineering optimization and expert systems of accelerators; and parallel processing in designing

  11. High Performance Computing in Science and Engineering '02 : Transactions of the High Performance Computing Center

    CERN Document Server

    Jäger, Willi

    2003-01-01

This book presents the state-of-the-art in modeling and simulation on supercomputers. Leading German research groups present their results achieved on high-end systems of the High Performance Computing Center Stuttgart (HLRS) for the year 2002. Reports cover all fields of supercomputing simulation ranging from computational fluid dynamics to computer science. Special emphasis is given to industrially relevant applications. Moreover, by presenting results for both vector systems and microprocessor-based systems, the book allows one to compare performance levels and usability of a variety of supercomputer architectures. It therefore becomes an indispensable guidebook for assessing the impact of the Japanese Earth Simulator project on supercomputing in the years to come.

  12. High-Performance Computing Paradigm and Infrastructure

    CERN Document Server

    Yang, Laurence T

    2006-01-01

    With hyperthreading in Intel processors, hypertransport links in next generation AMD processors, multi-core silicon in today's high-end microprocessors from IBM and emerging grid computing, parallel and distributed computers have moved into the mainstream

  13. High Performance Computing in Science and Engineering '99 : Transactions of the High Performance Computing Center

    CERN Document Server

    Jäger, Willi

    2000-01-01

    The book contains reports about the most significant projects from science and engineering of the Federal High Performance Computing Center Stuttgart (HLRS). They were carefully selected in a peer-review process and are showcases of an innovative combination of state-of-the-art modeling, novel algorithms and the use of leading-edge parallel computer technology. The projects of HLRS are using supercomputer systems operated jointly by university and industry and therefore a special emphasis has been put on the industrial relevance of results and methods.

  14. High Performance Computing in Science and Engineering '98 : Transactions of the High Performance Computing Center

    CERN Document Server

    Jäger, Willi

    1999-01-01

    The book contains reports about the most significant projects from science and industry that are using the supercomputers of the Federal High Performance Computing Center Stuttgart (HLRS). These projects are from different scientific disciplines, with a focus on engineering, physics and chemistry. They were carefully selected in a peer-review process and are showcases for an innovative combination of state-of-the-art physical modeling, novel algorithms and the use of leading-edge parallel computer technology. As HLRS is in close cooperation with industrial companies, special emphasis has been put on the industrial relevance of results and methods.

  15. Investigation of Mediational Processes Using Parallel Process Latent Growth Curve Modeling

    Science.gov (United States)

    Cheong, JeeWon; MacKinnon, David P.; Khoo, Siek Toon

    2010-01-01

    This study investigated a method to evaluate mediational processes using latent growth curve modeling. The mediator and the outcome measured across multiple time points were viewed as 2 separate parallel processes. The mediational process was defined as the independent variable influencing the growth of the mediator, which, in turn, affected the growth of the outcome. To illustrate modeling procedures, empirical data from a longitudinal drug prevention program, Adolescents Training and Learning to Avoid Steroids, were used. The program effects on the growth of the mediator and the growth of the outcome were examined first in a 2-group structural equation model. The mediational process was then modeled and tested in a parallel process latent growth curve model by relating the prevention program condition, the growth rate factor of the mediator, and the growth rate factor of the outcome. PMID:20157639

  16. Resistance to awareness of the supervisor's transferences with special reference to the parallel process.

    Science.gov (United States)

    Stimmel, B

    1995-06-01

    Supervision is an essential part of psychoanalytic education. Although not taken for granted, it is not studied with the same critical eye as is the analytic process. This paper examines the supervision specifically with a focus on the supervisor's transference towards the supervisee. The point is made, in the context of clinical examples, that one of the ways these transference reactions may be rationalised is within the setting of the parallel process so often encountered in supervision. Parallel process, a very familiar term, is used frequently and easily when discussing supervision. It may be used also as a resistance to awareness of transference phenomena within the supervisor in relation to the supervisee, particularly because of its clinical presentation. It is an enactment between supervisor and supervisee, thus ripe with possibilities for disguise, displacement and gratification. While transference reactions of the supervisee are often discussed, those of the supervisor are notably missing in our literature.

  17. Strong Bisimilarity and Regularity of Basic Parallel Processes is PSPACE-Hard

    DEFF Research Database (Denmark)

    Srba, Jirí

    2002-01-01

    We show that the problem of checking whether two processes definable in the syntax of Basic Parallel Processes (BPP) are strongly bisimilar is PSPACE-hard. We also demonstrate that there is a polynomial time reduction from the strong bisimilarity checking problem of regular BPP to the strong regularity (finiteness) checking of BPP. This implies that strong regularity of BPP is also PSPACE-hard.

  18. When fast logic meets slow belief: Evidence for a parallel-processing model of belief bias

    OpenAIRE

    Trippas, Dries; Thompson, Valerie A.; Handley, Simon J.

    2016-01-01

    Two experiments pitted the default-interventionist account of belief bias against a parallel-processing model. According to the former, belief bias occurs because a fast, belief-based evaluation of the conclusion pre-empts a working-memory demanding logical analysis. In contrast, according to the latter both belief-based and logic-based responding occur in parallel. Participants were given deductive reasoning problems of variable complexity and instructed to decide whether the conclusion was ...

  19. The vector and parallel processing of MORSE code on Monte Carlo Machine

    International Nuclear Information System (INIS)

    Hasegawa, Yukihiro; Higuchi, Kenji.

    1995-11-01

    The multi-group Monte Carlo code for particle transport, MORSE, has been modified for high-performance computing on the Monte Carlo machine Monte-4. The method and the results are described. Monte-4 was specially developed to realize high-performance computing of Monte Carlo particle transport codes, for which it has been difficult to obtain high performance with vector processing on conventional vector processors. Monte-4 has four vector processor units with special hardware called Monte Carlo pipelines. The vectorization and parallelization of the MORSE code and the performance evaluation on Monte-4 are described. (author)

  20. DOE research in utilization of high-performance computers

    International Nuclear Information System (INIS)

    Buzbee, B.L.; Worlton, W.J.; Michael, G.; Rodrigue, G.

    1980-12-01

    Department of Energy (DOE) and other Government research laboratories depend on high-performance computer systems to accomplish their programmatic goals. As the most powerful computer systems become available, they are acquired by these laboratories so that advances can be made in their disciplines. These advances are often the result of added sophistication to numerical models whose execution is made possible by high-performance computer systems. However, high-performance computer systems have become increasingly complex; consequently, it has become increasingly difficult to realize their potential performance. The result is a need for research on issues related to the utilization of these systems. This report gives a brief description of high-performance computers, and then addresses the use of and future needs for high-performance computers within DOE, the growing complexity of applications within DOE, and areas of high-performance computer systems warranting research. 1 figure

  1. Performance of Air Pollution Models on Massively Parallel Computers

    DEFF Research Database (Denmark)

    Brown, John; Hansen, Per Christian; Wasniewski, Jerzy

    1996-01-01

    To compare the performance and use of three massively parallel SIMD computers, we implemented a large air pollution model on the computers. Using a realistic large-scale model, we gain detailed insight about the performance of the three computers when used to solve large-scale scientific problems...

  2. Peregrine System | High-Performance Computing | NREL

    Science.gov (United States)

    Describes the classes of nodes that users access on the Peregrine system: login nodes (Peregrine has four login nodes, each with Intel E5 processors), the /scratch file systems and the /mss file system (the latter mounted on all login nodes), and compute nodes (Peregrine has 2592 compute nodes).

  3. I-deas TMG to NX Space Systems Thermal Model Conversion and Computational Performance Comparison

    Science.gov (United States)

    Somawardhana, Ruwan

    2011-01-01

    CAD/CAE packages change on a continuous basis as the power of the tools increases to meet demands. End-users must adapt to new products as they come to market and replace legacy packages. CAE modeling has continued to evolve and is constantly becoming more detailed and complex, though this comes at the cost of increased computing requirements; parallel processing coupled with appropriate hardware can minimize computation time. Users of Maya Thermal Model Generator (TMG) are faced with transitioning from NX I-deas to NX Space Systems Thermal (SST). It is important to understand what differences there are when changing software packages; we are looking for consistency in results.

  4. Contemporary high performance computing from petascale toward exascale

    CERN Document Server

    Vetter, Jeffrey S

    2013-01-01

    Contemporary High Performance Computing: From Petascale toward Exascale focuses on the ecosystems surrounding the world's leading centers for high performance computing (HPC). It covers many of the important factors involved in each ecosystem: computer architectures, software, applications, facilities, and sponsors. The first part of the book examines significant trends in HPC systems, including computer architectures, applications, performance, and software. It discusses the growth from terascale to petascale computing and the influence of the TOP500 and Green500 lists. The second part of the

  5. Performing an allreduce operation on a plurality of compute nodes of a parallel computer

    Science.gov (United States)

    Faraj, Ahmad [Rochester, MN

    2012-04-17

    Methods, apparatus, and products are disclosed for performing an allreduce operation on a plurality of compute nodes of a parallel computer. Each compute node includes at least two processing cores. Each processing core has contribution data for the allreduce operation. Performing an allreduce operation on a plurality of compute nodes of a parallel computer includes: establishing one or more logical rings among the compute nodes, each logical ring including at least one processing core from each compute node; performing, for each logical ring, a global allreduce operation using the contribution data for the processing cores included in that logical ring, yielding a global allreduce result for each processing core included in that logical ring; and performing, for each compute node, a local allreduce operation using the global allreduce results for each processing core on that compute node.
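
    The following is a minimal serial sketch of the two-phase decomposition described above, assuming one logical ring per core index (ring i contains core i of every node); the node count, cores per node, and the reduction operator (a plain sum) are illustrative and not taken from the patent.

```cpp
#include <iostream>
#include <vector>

int main() {
    const int nodes = 4;          // number of compute nodes (illustrative)
    const int coresPerNode = 2;   // processing cores per node (illustrative)

    // contribution[n][c] is the contribution data of core c on node n.
    std::vector<std::vector<double>> contribution(nodes, std::vector<double>(coresPerNode));
    for (int n = 0; n < nodes; ++n)
        for (int c = 0; c < coresPerNode; ++c)
            contribution[n][c] = n * 10.0 + c;

    // Phase 1: one logical ring per core index; each ring performs a global
    // allreduce (here a plain sum) over the contributions of its members.
    std::vector<double> ringResult(coresPerNode, 0.0);
    for (int c = 0; c < coresPerNode; ++c)
        for (int n = 0; n < nodes; ++n)
            ringResult[c] += contribution[n][c];

    // Phase 2: each node combines the per-ring results locally, so every
    // core ends up with the complete allreduce result.
    double total = 0.0;
    for (int c = 0; c < coresPerNode; ++c)
        total += ringResult[c];

    std::cout << "allreduce result on every core: " << total << "\n";
    return 0;
}
```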

  6. LMFAO! Humor as a Response to Fear: Decomposing Fear Control within the Extended Parallel Process Model

    Science.gov (United States)

    Abril, Eulàlia P.; Szczypka, Glen; Emery, Sherry L.

    2017-01-01

    This study seeks to analyze fear control responses to the 2012 Tips from Former Smokers campaign using the Extended Parallel Process Model (EPPM). The goal is to examine the occurrence of ancillary fear control responses, like humor. In order to explore individuals’ responses in an organic setting, we use Twitter data—tweets—collected via the Firehose. Content analysis of relevant fear control tweets (N = 14,281) validated the existence of boomerang responses within the EPPM: denial, defensive avoidance, and reactance. More importantly, results showed that humor tweets were not only a significant occurrence but constituted the majority of fear control responses. PMID:29527092

  7. Implementing an Affordable High-Performance Computing for Teaching-Oriented Computer Science Curriculum

    Science.gov (United States)

    Abuzaghleh, Omar; Goldschmidt, Kathleen; Elleithy, Yasser; Lee, Jeongkyu

    2013-01-01

    With the advances in computing power, high-performance computing (HPC) platforms have had an impact on not only scientific research in advanced organizations but also computer science curriculum in the educational community. For example, multicore programming and parallel systems are highly desired courses in the computer science major. However,…

  8. Fine-grain Parallel Processing On A Commodity Platform: A Solution For The Atlas Second-level Trigger

    CERN Document Server

    Boosten, M

    2003-01-01

    From 2005 on, CERN expects to have a new accelerator available for experiments: the Large Hadron Collider (LHC), with a circumference of 27 kilometres. The ATLAS detector produces 40 TeraBytes/s of data. Only a fraction of all data is interesting. A computer system, called the trigger, selects the interesting data through real-time data analysis. The trigger consists of three subsequent filtering levels: LVL1, LVL2, and LVL3. LVL1 will be implemented using special-purpose hardware. LVL2 and LVL3 will be implemented using a Network Of Workstations (NOW). A major problem is to make efficient use of the computing power available in each workstation. The major contribution of this designer's project is an infrastructure named MESH. MESH enables CERN to cost- effectively implement the LVL2 trigger. Furthermore, due to the use of commodity technology, MESH enables the LVL2 trigger to be cost-effectively upgraded and supported during its 20 year lifecycle. MESH facilitates efficient parallel processing on PCs interc...

  9. Performance Analysis of Cloud Computing Architectures Using Discrete Event Simulation

    Science.gov (United States)

    Stocker, John C.; Golomb, Andrew M.

    2011-01-01

    Cloud computing offers the economic benefit of on-demand resource allocation to meet changing enterprise computing needs. However, the flexibility of cloud computing is disadvantaged when compared to traditional hosting in providing predictable application and service performance. Cloud computing relies on resource scheduling in a virtualized network-centric server environment, which makes static performance analysis infeasible. We developed a discrete event simulation model to evaluate the overall effectiveness of organizations in executing their workflow in traditional and cloud computing architectures. The two part model framework characterizes both the demand using a probability distribution for each type of service request as well as enterprise computing resource constraints. Our simulations provide quantitative analysis to design and provision computing architectures that maximize overall mission effectiveness. We share our analysis of key resource constraints in cloud computing architectures and findings on the appropriateness of cloud computing in various applications.
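
    As a toy illustration of the two-part modeling idea described above (a probability distribution for demand plus a resource constraint), the sketch below simulates a single shared compute resource with exponentially distributed arrivals and service times and reports the mean queueing delay; the distributions and parameters are placeholders, not the authors' model.

```cpp
#include <algorithm>
#include <iostream>
#include <random>

int main() {
    std::mt19937 rng(7);
    std::exponential_distribution<double> interarrival(1.0 / 2.0);  // mean 2.0 time units
    std::exponential_distribution<double> service(1.0 / 1.5);       // mean 1.5 time units

    const int numRequests = 100000;
    double arrival = 0.0, serverFreeAt = 0.0, totalWait = 0.0;

    for (int i = 0; i < numRequests; ++i) {
        arrival += interarrival(rng);                  // next service request arrives
        const double start = std::max(arrival, serverFreeAt);
        totalWait += start - arrival;                  // time the request spends queued
        serverFreeAt = start + service(rng);           // resource stays busy until then
    }

    std::cout << "mean wait per request: " << totalWait / numRequests << "\n";
    return 0;
}
```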

  10. Inclusive vision for high performance computing at the CSIR

    CSIR Research Space (South Africa)

    Gazendam, A

    2006-02-01

    Full Text Available and computationally intensive applications. A number of different technologies and standards were identified as core to the open and distributed high-performance infrastructure envisaged...

  11. A parallel process growth model of avoidant personality disorder symptoms and personality traits.

    Science.gov (United States)

    Wright, Aidan G C; Pincus, Aaron L; Lenzenweger, Mark F

    2013-07-01

    Avoidant personality disorder (AVPD), like other personality disorders, has historically been construed as a highly stable disorder. However, results from a number of longitudinal studies have found that the symptoms of AVPD demonstrate marked change over time. Little is known about which other psychological systems are related to this change. Although cross-sectional research suggests a strong relationship between AVPD and personality traits, no work has examined the relationship of their change trajectories. The current study sought to establish the longitudinal relationship between AVPD and basic personality traits using parallel process growth curve modeling. Parallel process growth curve modeling was applied to the trajectories of AVPD and basic personality traits from the Longitudinal Study of Personality Disorders (Lenzenweger, M. F., 2006, The longitudinal study of personality disorders: History, design considerations, and initial findings. Journal of Personality Disorders, 20, 645-670. doi:10.1521/pedi.2006.20.6.645), a naturalistic, prospective, multiwave, longitudinal study of personality disorder, temperament, and normal personality. The focus of these analyses is on the relationship between the rates of change in both AVPD symptoms and basic personality traits. AVPD symptom trajectories demonstrated significant negative relationships with the trajectories of interpersonal dominance and affiliation, and a significant positive relationship to rates of change in neuroticism. These results provide some of the first compelling evidence that trajectories of change in PD symptoms and personality traits are linked. These results have important implications for the ways in which temporal stability is conceptualized in AVPD specifically, and PD in general.

  12. A Parallel Process Growth Model of Avoidant Personality Disorder Symptoms and Personality Traits

    Science.gov (United States)

    Wright, Aidan G. C.; Pincus, Aaron L.; Lenzenweger, Mark F.

    2012-01-01

    Background Avoidant personality disorder (AVPD), like other personality disorders, has historically been construed as a highly stable disorder. However, results from a number of longitudinal studies have found that the symptoms of AVPD demonstrate marked change over time. Little is known about which other psychological systems are related to this change. Although cross-sectional research suggests a strong relationship between AVPD and personality traits, no work has examined the relationship of their change trajectories. The current study sought to establish the longitudinal relationship between AVPD and basic personality traits using parallel process growth curve modeling. Methods Parallel process growth curve modeling was applied to the trajectories of AVPD and basic personality traits from the Longitudinal Study of Personality Disorders (Lenzenweger, 2006), a naturalistic, prospective, multiwave, longitudinal study of personality disorder, temperament, and normal personality. The focus of these analyses is on the relationship between the rates of change in both AVPD symptoms and basic personality traits. Results AVPD symptom trajectories demonstrated significant negative relationships with the trajectories of interpersonal dominance and affiliation, and a significant positive relationship to rates of change in neuroticism. Conclusions These results provide some of the first compelling evidence that trajectories of change in PD symptoms and personality traits are linked. These results have important implications for the ways in which temporal stability is conceptualized in AVPD specifically, and PD in general. PMID:22506627

  13. Advanced Certification Program for Computer Graphic Specialists. Final Performance Report.

    Science.gov (United States)

    Parkland Coll., Champaign, IL.

    A pioneer program in computer graphics was implemented at Parkland College (Illinois) to meet the demand for specialized technicians to visualize data generated on high performance computers. In summer 1989, 23 students were accepted into the pilot program. Courses included C programming, calculus and analytic geometry, computer graphics, and…

  14. Performance of the TRISTAN computer control network

    International Nuclear Information System (INIS)

    Koiso, H.; Abe, K.; Akiyama, A.; Katoh, T.; Kikutani, E.; Kurihara, N.; Kurokawa, S.; Oide, K.; Shinomoto, M.

    1985-01-01

    An N-to-N token ring network of twenty-four minicomputers controls the TRISTAN accelerator complex. The computers are linked by optical fiber cables with 10 Mbps transmission speed. The software system is based on NODAL, a multi-computer interpreter language developed at the CERN SPS. Typical messages exchanged between computers are NODAL programs and NODAL variables transmitted by the EXEC and REMIT commands. These messages are exchanged as clusters of packets whose maximum size is 512 bytes. At present, eleven minicomputers are connected to the network and the total length of the ring is 1.5 km. In this configuration, the maximum attainable throughput is 980 kbytes/s. The response time of a pair of EXEC and REMIT transactions, which transmit a NODAL array A together with the one-line program 'REMIT A' and immediately remit A, is measured to be 95 + 0.039χ ms, where χ is the array size in bytes. In ordinary accelerator operations, the maximum channel utilization is 2%, the average packet length is 96 bytes and the transmission rate is 10 kbytes/s.

  15. Performing quantum computing experiments in the cloud

    Science.gov (United States)

    Devitt, Simon J.

    2016-09-01

    Quantum computing technology has reached a second renaissance in the past five years. Increased interest from both the private and public sector combined with extraordinary theoretical and experimental progress has solidified this technology as a major advancement in the 21st century. As anticipated by many, some of the first realizations of quantum computing technology have occurred over the cloud, with users logging onto dedicated hardware over the classical internet. Recently, IBM has released the Quantum Experience, which allows users to access a five-qubit quantum processor. In this paper we take advantage of this online availability of actual quantum hardware and present four quantum information experiments. We utilize the IBM chip to realize protocols in quantum error correction, quantum arithmetic, quantum graph theory, and fault-tolerant quantum computation by accessing the device remotely through the cloud. While the results are subject to significant noise, the correct results are returned from the chip. This demonstrates the power of experimental groups opening up their technology to a wider audience and will hopefully allow for the next stage of development in quantum information technology.

  16. A high performance scientific cloud computing environment for materials simulations

    Science.gov (United States)

    Jorissen, K.; Vila, F. D.; Rehr, J. J.

    2012-09-01

    We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including tools for execution and monitoring performance, as well as efficient I/O utilities that enable seamless connections to and from the cloud. Our SCC platform is optimized for the Amazon Elastic Compute Cloud (EC2). We present benchmarks for prototypical scientific applications and demonstrate performance comparable to local compute clusters. To facilitate code execution and provide user-friendly access, we have also integrated cloud computing capability in a JAVA-based GUI. Our SCC platform may be an alternative to traditional HPC resources for materials science or quantum chemistry applications.

  17. Performance Aspects of Synthesizable Computing Systems

    DEFF Research Database (Denmark)

    Schleuniger, Pascal

    Embedded systems are used in a broad range of applications that demand high performance within severely constrained mechanical, power, and cost requirements. Embedded systems implemented in ASIC technology tend to provide the highest performance, lowest power consumption and lowest unit cost. How...

  18. A high performance scientific cloud computing environment for materials simulations

    OpenAIRE

    Jorissen, Kevin; Vila, Fernando D.; Rehr, John J.

    2011-01-01

    We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including...

  19. High Performance Computing in Science and Engineering '08 : Transactions of the High Performance Computing Center

    CERN Document Server

    Kröner, Dietmar; Resch, Michael

    2009-01-01

    The discussions and plans on all scientific, advisory, and political levels to realize an even larger “European Supercomputer” in Germany, where the hardware costs alone will be hundreds of millions of Euro – much more than in the past – are getting closer to realization. As part of the strategy, the three national supercomputing centres HLRS (Stuttgart), NIC/JSC (Jülich) and LRZ (Munich) have formed the Gauss Centre for Supercomputing (GCS) as a new virtual organization enabled by an agreement between the Federal Ministry of Education and Research (BMBF) and the state ministries for research of Baden-Württemberg, Bayern, and Nordrhein-Westfalen. Already today, the GCS provides the most powerful high-performance computing infrastructure in Europe. Through GCS, HLRS participates in the European project PRACE (Partnership for Advanced Computing in Europe) and extends its reach to all European member countries. These activities align well with the activities of HLRS in the European HPC infrastructur...

  20. Lamb wave propagation modelling and simulation using parallel processing architecture and graphical cards

    International Nuclear Information System (INIS)

    Paćko, P; Bielak, T; Staszewski, W J; Uhl, T; Spencer, A B; Worden, K

    2012-01-01

    This paper demonstrates new parallel computation technology and an implementation for Lamb wave propagation modelling in complex structures. A graphics processing unit (GPU) and the compute unified device architecture (CUDA), available in low-cost graphics cards in standard PCs, are used for Lamb wave propagation numerical simulations. The local interaction simulation approach (LISA) wave propagation algorithm has been implemented as an example. Other algorithms suitable for parallel discretization can also be used in practice. The method is illustrated using examples related to damage detection. The results demonstrate good accuracy and effective computational performance of very large models. The wave propagation modelling presented in the paper can be used in many practical applications of science and engineering. (paper)

  1. Parallel Processing and Applied Mathematics. 10th International Conference, PPAM 2013. Revised Selected Papers

    DEFF Research Database (Denmark)

    The following topics are dealt with: parallel scientific computing; numerical algorithms; parallel nonnumerical algorithms; cloud computing; evolutionary computing; metaheuristics; applied mathematics; GPU computing; multicore systems; hybrid architectures; hierarchical parallelism; HPC systems; power monitoring; energy monitoring; and distributed computing.

  2. Software Systems for High-performance Quantum Computing

    Energy Technology Data Exchange (ETDEWEB)

    Humble, Travis S [ORNL; Britt, Keith A [ORNL

    2016-01-01

    Quantum computing promises new opportunities for solving hard computational problems, but harnessing this novelty requires breakthrough concepts in the design, operation, and application of computing systems. We define some of the challenges facing the development of quantum computing systems as well as software-based approaches that can be used to overcome these challenges. Following a brief overview of the state of the art, we present quantum programming and execution models, the development of architectures for hybrid high-performance computing systems, and the realization of software stacks for quantum networking. This leads to a discussion of the role that conventional computing plays in the quantum paradigm and how some of the current challenges for exascale computing overlap with those facing quantum computing.

  3. Computer Self-Efficacy, Computer Anxiety, Performance and Personal Outcomes of Turkish Physical Education Teachers

    Science.gov (United States)

    Aktag, Isil

    2015-01-01

    The purpose of this study is to determine the computer self-efficacy, performance outcome, personal outcome, and affect and anxiety level of physical education teachers. Influence of teaching experience, computer usage and participation of seminars or in-service programs on computer self-efficacy level were determined. The subjects of this study…

  4. The Masterson Approach with play therapy: a parallel process between mother and child.

    Science.gov (United States)

    Mulherin, M A

    2001-01-01

    This paper discusses a case in which the Masterson Approach was used with play therapy to treat a child with a developing personality disorder. It describes the parallel progression of the child and mother in adjunct therapy throughout a six-year period. The unique value of the Masterson Approach is that it provides the therapist with a framework and tool to diagnose and treat a child during the dynamic process of play. The case describes the mother-child dyad throughout therapy. It traces their parallel processes that involve separation, individuation, rapprochement, and the recovery of real self-capacities. Each stage of treatment is described, including verbal interventions. The child's internal affective state and intrapsychic structure during the various stages of treatment are illustrated by representative pictures.

  5. When fast logic meets slow belief: Evidence for a parallel-processing model of belief bias.

    Science.gov (United States)

    Trippas, Dries; Thompson, Valerie A; Handley, Simon J

    2017-05-01

    Two experiments pitted the default-interventionist account of belief bias against a parallel-processing model. According to the former, belief bias occurs because a fast, belief-based evaluation of the conclusion pre-empts a working-memory demanding logical analysis. In contrast, according to the latter both belief-based and logic-based responding occur in parallel. Participants were given deductive reasoning problems of variable complexity and instructed to decide whether the conclusion was valid on half the trials or to decide whether the conclusion was believable on the other half. When belief and logic conflict, the default-interventionist view predicts that it should take less time to respond on the basis of belief than logic, and that the believability of a conclusion should interfere with judgments of validity, but not the reverse. The parallel-processing view predicts that beliefs should interfere with logic judgments only if the processing required to evaluate the logical structure exceeds that required to evaluate the knowledge necessary to make a belief-based judgment, and vice versa otherwise. Consistent with this latter view, for the simplest reasoning problems (modus ponens), judgments of belief resulted in lower accuracy than judgments of validity, and believability interfered more with judgments of validity than the converse. For problems of moderate complexity (modus tollens and single-model syllogisms), the interference was symmetrical, in that validity interfered with belief judgments to the same degree that believability interfered with validity judgments. For the most complex (three-term multiple-model syllogisms), conclusion believability interfered more with judgments of validity than vice versa, in spite of the significant interference from conclusion validity on judgments of belief.

  6. Computer performance evaluation of FACOM 230-75 computer system, (2)

    International Nuclear Information System (INIS)

    Fujii, Minoru; Asai, Kiyoshi

    1980-08-01

    This report describes computer performance evaluations for the FACOM 230-75 computers in JAERI. The evaluations are performed on the following items: (1) cost/benefit analysis of timesharing terminals, (2) analysis of the response time of timesharing terminals, (3) analysis of throughput time for batch job processing, (4) estimation of current potential demands for computer time, (5) determination of the appropriate number of card readers and line printers. These evaluations are done mainly from the standpoint of cost reduction of computing facilities. The techniques adapted are very practical ones. This report will be useful for those people who are concerned with the management of a computing installation. (author)

  7. High Performance Computing in Science and Engineering '14

    CERN Document Server

    Kröner, Dietmar; Resch, Michael

    2015-01-01

    This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS). The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and   engineers. The book comes with a wealth of color illustrations and tables of results.  

  8. High Performance Computing Modernization Program Kerberos Throughput Test Report

    Science.gov (United States)

    2017-10-26

    Naval Research Laboratory, Washington, DC 20375-5320. NRL/MR/5524--17-9751. High Performance Computing Modernization Program Kerberos Throughput Test Report. Daniel G. Gdula* and ...

  9. Performance evaluation of computer and communication systems

    CERN Document Server

    Le Boudec, Jean-Yves

    2011-01-01

    … written by a scientist successful in performance evaluation, it is based on his experience and provides many ideas not only to laymen entering the field, but also to practitioners looking for inspiration. The work can be read systematically as a textbook on how to model and test the derived hypotheses on the basis of simulations. Also, separate parts can be studied, as the chapters are self-contained. … the book can be successfully used either for self-study or as a supplementary book for a lecture. I believe that different types of readers will like it: practicing engineers and resea

  10. Computer task performance by subjects with Duchenne muscular dystrophy.

    Science.gov (United States)

    Malheiros, Silvia Regina Pinheiro; da Silva, Talita Dias; Favero, Francis Meire; de Abreu, Luiz Carlos; Fregni, Felipe; Ribeiro, Denise Cardoso; de Mello Monteiro, Carlos Bandeira

    2016-01-01

    Two specific objectives were established to quantify computer task performance among people with Duchenne muscular dystrophy (DMD). First, we compared simple computational task performance between subjects with DMD and age-matched typically developing (TD) subjects. Second, we examined correlations between the ability of subjects with DMD to learn the computational task and their motor functionality, age, and initial task performance. The study included 84 individuals (42 with DMD, mean age of 18±5.5 years, and 42 age-matched controls). They executed a computer maze task; all participants performed the acquisition (20 attempts) and retention (five attempts) phases, repeating the same maze. A different maze was used to verify transfer performance (five attempts). The Motor Function Measure Scale was applied, and the results were compared with maze task performance. In the acquisition phase, a significant decrease was found in movement time (MT) between the first and last acquisition block, but only for the DMD group. For the DMD group, MT during transfer was shorter than during the first acquisition block, indicating improvement from the first acquisition block to transfer. In addition, the TD group showed shorter MT than the DMD group across the study. DMD participants improved their performance after practicing a computational task; however, the difference in MT was present in all attempts among DMD and control subjects. Computational task improvement was positively influenced by the initial performance of individuals with DMD. In turn, the initial performance was influenced by their distal functionality but not their age or overall functionality.

  11. Research Activity in Computational Physics utilizing High Performance Computing: Co-authorship Network Analysis

    Science.gov (United States)

    Ahn, Sul-Ah; Jung, Youngim

    2016-10-01

    The research activities of the computational physicists utilizing high performance computing are analyzed by bibliometric approaches. This study aims at providing the computational physicists utilizing high-performance computing and policy planners with useful bibliometric results for an assessment of research activities. In order to achieve this purpose, we carried out a co-authorship network analysis of journal articles to assess the research activities of researchers for high-performance computational physics as a case study. For this study, we used journal articles of the Scopus database from Elsevier covering the time period of 2004-2013. We extracted the author rank in the physics field utilizing high-performance computing by the number of papers published during ten years from 2004. Finally, we drew the co-authorship network for 45 top-authors and their coauthors, and described some features of the co-authorship network in relation to the author rank. Suggestions for further studies are discussed.

  12. A program system for ab initio MO calculations on vector and parallel processing machines. Pt. 1

    International Nuclear Information System (INIS)

    Ernenwein, R.; Rohmer, M.M.; Benard, M.

    1990-01-01

    We present a program system for ab initio molecular orbital calculations on vector and parallel computers. The present article is devoted to the computation of one- and two-electron integrals over contracted Gaussian basis sets involving s-, p-, d- and f-type functions. The McMurchie and Davidson (MMD) algorithm has been implemented and parallelized by distributing over a limited number of logical tasks the calculation of the 55 relevant classes of integrals. All sections of the MMD algorithm have been efficiently vectorized, leading to a scalar/vector ratio of 5.8. Different algorithms are proposed and compared for an optimal vectorization of the contraction of the 'intermediate integrals' generated by the MMD formalism. Advantage is taken of the dynamic storage allocation for tuning the length of the vector loops (i.e. the size of the vectorization buffer) as a function of (i) the total memory available for the job, (ii) the number of logical tasks defined by the user (≤13), and (iii) the storage requested by each specific class of integrals. Test calculations carried out on a CRAY-2 computer show that the average number of finite integrals computed over a (s, p, d, f) CGTO basis set is about 1180000 per second and per processor. The combination of vectorization and parallelism on this 4-processor machine reduces the CPU time by a factor larger than 20 with respect to the scalar and sequential performance. (orig.)
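
    As a rough illustration of the task-distribution idea described above (not the program system's actual code), the sketch below statically assigns the 55 integral classes to a user-chosen number of logical tasks and derives a vector-loop buffer length from a memory budget; the memory figure and per-element cost are placeholders, not values from the paper.

```cpp
#include <cstddef>
#include <iostream>
#include <vector>

int main() {
    const int numClasses = 55;   // relevant integral classes, as in the paper
    const int numTasks = 4;      // logical tasks chosen by the user (<= 13 in the paper)

    // Round-robin assignment of integral classes to logical tasks.
    std::vector<std::vector<int>> taskClasses(numTasks);
    for (int c = 0; c < numClasses; ++c)
        taskClasses[c % numTasks].push_back(c);

    // Illustrative buffer sizing: divide the memory available per task by a
    // per-element storage cost to get the length of the vector loops.
    const std::size_t memoryPerTaskBytes = 64 * 1024 * 1024;  // placeholder budget
    const std::size_t bytesPerElement = 80;                   // placeholder cost
    const std::size_t bufferLength = memoryPerTaskBytes / bytesPerElement;

    for (int t = 0; t < numTasks; ++t)
        std::cout << "task " << t << ": " << taskClasses[t].size()
                  << " classes, vector buffer length " << bufferLength << "\n";
    return 0;
}
```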

  13. Quantum Accelerators for High-performance Computing Systems

    Energy Technology Data Exchange (ETDEWEB)

    Humble, Travis S. [ORNL; Britt, Keith A. [ORNL; Mohiyaddin, Fahd A. [ORNL

    2017-11-01

    We define some of the programming and system-level challenges facing the application of quantum processing to high-performance computing. Alongside barriers to physical integration, prominent differences in the execution of quantum and conventional programs challenges the intersection of these computational models. Following a brief overview of the state of the art, we discuss recent advances in programming and execution models for hybrid quantum-classical computing. We discuss a novel quantum-accelerator framework that uses specialized kernels to offload select workloads while integrating with existing computing infrastructure. We elaborate on the role of the host operating system to manage these unique accelerator resources, the prospects for deploying quantum modules, and the requirements placed on the language hierarchy connecting these different system components. We draw on recent advances in the modeling and simulation of quantum computing systems with the development of architectures for hybrid high-performance computing systems and the realization of software stacks for controlling quantum devices. Finally, we present simulation results that describe the expected system-level behavior of high-performance computing systems composed from compute nodes with quantum processing units. We describe performance for these hybrid systems in terms of time-to-solution, accuracy, and energy consumption, and we use simple application examples to estimate the performance advantage of quantum acceleration.

  14. A Heterogeneous High-Performance System for Computational and Computer Science

    Science.gov (United States)

    2016-11-15

    expand the research infrastructure at the institution but also to enhance the high-performance computing training provided to both undergraduate and... cloud computing, supercomputing, and the availability of cheap memory and storage led to enormous amounts of data to be sifted through in forensic... High-Performance Computing (HPC) tools that can be integrated with existing curricula and support our research to modernize and dramatically advance

  15. A Pervasive Parallel Processing Framework for Data Visualization and Analysis at Extreme Scale

    Energy Technology Data Exchange (ETDEWEB)

    Ma, Kwan-Liu [Univ. of California, Davis, CA (United States)

    2017-02-01

    Most of today’s visualization libraries and applications are based on what is known today as the visualization pipeline. In the visualization pipeline model, algorithms are encapsulated as “filtering” components with inputs and outputs. These components can be combined by connecting the outputs of one filter to the inputs of another filter. The visualization pipeline model is popular because it provides a convenient abstraction that allows users to combine algorithms in powerful ways. Unfortunately, the visualization pipeline cannot run effectively on exascale computers. Experts agree that the exascale machine will comprise processors that contain many cores. Furthermore, physical limitations will prevent data movement in and out of the chip (that is, between main memory and the processing cores) from keeping pace with improvements in overall compute performance. To use these processors to their fullest capability, it is essential to carefully consider memory access. This is where the visualization pipeline fails. Each filtering component in the visualization library is expected to take a data set in its entirety, perform some computation across all of the elements, and output the complete results. The process of iterating over all elements must be repeated in each filter, which is one of the worst possible ways to traverse memory when trying to maximize the number of executions per memory access. This project investigates a new type of visualization framework that exhibits a pervasive parallelism necessary to run on exascale machines. Our framework achieves this by defining algorithms in terms of functors, which are localized, stateless operations. Functors can be composited in much the same way as filters in the visualization pipeline. But the design of functors allows them to run concurrently on massive numbers of lightweight threads. Only with such fine-grained parallelism can we hope to fill the billions of threads we expect will be necessary for
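
    As a rough sketch of the functor idea described above (not the project's actual framework or API), the code below composes two stateless, element-wise functors and applies the result in a data-parallel loop; OpenMP stands in for the fine-grained threading an exascale runtime would provide, and the field data and operations are invented.

```cpp
#include <algorithm>
#include <iostream>
#include <vector>

// Stateless, element-wise "functors" in the spirit described above.
struct Scale {
    double factor;
    double operator()(double x) const { return factor * x; }
};

struct Clamp {
    double lo, hi;
    double operator()(double x) const { return std::min(hi, std::max(lo, x)); }
};

// Compose two functors into a single element-wise operation (f first, then g).
template <typename F, typename G>
auto compose(F f, G g) {
    return [f, g](double x) { return g(f(x)); };
}

int main() {
    std::vector<double> field(1 << 20, 1.5);
    auto op = compose(Scale{2.0}, Clamp{0.0, 2.5});

    // Each element is independent, so the loop can be mapped onto large
    // numbers of lightweight threads; OpenMP is used here as a stand-in.
    #pragma omp parallel for
    for (long i = 0; i < static_cast<long>(field.size()); ++i)
        field[i] = op(field[i]);

    std::cout << field[0] << "\n";   // 2.5 after scaling then clamping
    return 0;
}
```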

  16. Quantum Accelerators for High-Performance Computing Systems

    OpenAIRE

    Britt, Keith A.; Mohiyaddin, Fahd A.; Humble, Travis S.

    2017-01-01

    We define some of the programming and system-level challenges facing the application of quantum processing to high-performance computing. Alongside barriers to physical integration, prominent differences in the execution of quantum and conventional programs challenges the intersection of these computational models. Following a brief overview of the state of the art, we discuss recent advances in programming and execution models for hybrid quantum-classical computing. We discuss a novel quantu...

  17. High Performance Networks From Supercomputing to Cloud Computing

    CERN Document Server

    Abts, Dennis

    2011-01-01

    Datacenter networks provide the communication substrate for large parallel computer systems that form the ecosystem for high performance computing (HPC) systems and modern Internet applications. The design of new datacenter networks is motivated by an array of applications ranging from communication intensive climatology, complex material simulations and molecular dynamics to such Internet applications as Web search, language translation, collaborative Internet applications, streaming video and voice-over-IP. For both Supercomputing and Cloud Computing the network enables distributed applicati

  18. High Performance Computing in Science and Engineering '16 : Transactions of the High Performance Computing Center, Stuttgart (HLRS) 2016

    CERN Document Server

    Kröner, Dietmar; Resch, Michael

    2016-01-01

    This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS) in 2016. The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.

  19. Simulating electron wave dynamics in graphene superlattices exploiting parallel processing advantages

    Science.gov (United States)

    Rodrigues, Manuel J.; Fernandes, David E.; Silveirinha, Mário G.; Falcão, Gabriel

    2018-01-01

    This work introduces a parallel computing framework to characterize the propagation of electron waves in graphene-based nanostructures. The electron wave dynamics is modeled using both "microscopic" and effective medium formalisms and the numerical solution of the two-dimensional massless Dirac equation is determined using a Finite-Difference Time-Domain scheme. The propagation of electron waves in graphene superlattices with localized scattering centers is studied, and the role of the symmetry of the microscopic potential in the electron velocity is discussed. The computational methodologies target the parallel capabilities of heterogeneous multi-core CPU and multi-GPU environments and are built with the OpenCL parallel programming framework which provides a portable, vendor agnostic and high throughput-performance solution. The proposed heterogeneous multi-GPU implementation achieves speedup ratios up to 75x when compared to multi-thread and multi-core CPU execution, reducing simulation times from several hours to a couple of minutes.

  20. Shredder: GPU-Accelerated Incremental Storage and Computation

    OpenAIRE

    Bhatotia, Pramod; Rodrigues, Rodrigo; Verma, Akshat

    2012-01-01

    Redundancy elimination using data deduplication and incremental data processing has emerged as an important technique to minimize storage and computation requirements in data center computing. In this paper, we present the design, implementation and evaluation of Shredder, a high performance content-based chunking framework for supporting incremental storage and computation systems. Shredder exploits the massively parallel processing power of GPUs to overcome the CPU bottlenecks of content-ba...

  1. Examination of concept of next generation computer. Progress report 1999

    Energy Technology Data Exchange (ETDEWEB)

    Higuchi, Kenji; Hasegawa, Yukihiro; Hirayama, Toshio

    2000-12-01

    The Center for Promotion of Computational Science and Engineering has conducted R and D works on the technology of parallel processing and started the examination of the next generation computer in 1999. This report describes the behavior analyses of quantum calculation codes. It also describes the considerations drawn from the analyses and the results of examining methods to reduce cache misses. Furthermore, it describes a performance simulator that is being developed to quantitatively examine the concept of the next generation computer. (author)

  2. Acceleration and sensitivity analysis of lattice kinetic Monte Carlo simulations using parallel processing and rate constant rescaling.

    Science.gov (United States)

    Núñez, M; Robie, T; Vlachos, D G

    2017-10-28

    Kinetic Monte Carlo (KMC) simulation provides insights into catalytic reactions unobtainable with either experiments or mean-field microkinetic models. Sensitivity analysis of KMC models assesses the robustness of the predictions to parametric perturbations and identifies rate determining steps in a chemical reaction network. Stiffness in the chemical reaction network, a ubiquitous feature, demands lengthy run times for KMC models and renders efficient sensitivity analysis based on the likelihood ratio method unusable. We address the challenge of efficiently conducting KMC simulations and performing accurate sensitivity analysis in systems with unknown time scales by employing two acceleration techniques: rate constant rescaling and parallel processing. We develop statistical criteria that ensure sufficient sampling of non-equilibrium steady state conditions. Our approach provides the twofold benefit of accelerating the simulation itself and enabling likelihood ratio sensitivity analysis, which provides further speedup relative to finite difference sensitivity analysis. As a result, the likelihood ratio method can be applied to real chemistry. We apply our methodology to the water-gas shift reaction on Pt(111).
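
    The following is a minimal Gillespie-style KMC loop meant only to illustrate the idea of rate-constant rescaling for a fast, quasi-equilibrated event pair; the rates, the scaling factor, and the step count are placeholders and do not come from the paper, and the statistical criteria and parallelization the authors develop are not shown.

```cpp
#include <cstddef>
#include <iostream>
#include <random>
#include <vector>

int main() {
    std::mt19937 rng(42);

    // Event rates: events 0 and 1 are a fast forward/reverse pair that is
    // quasi-equilibrated; event 2 is the slow, rate-limiting step.
    std::vector<double> rates = {1.0e6, 1.0e6, 1.0};

    // Rate-constant rescaling: damp the fast pair by a common factor so the
    // simulation spends fewer steps on events that do not advance the slow
    // dynamics (the factor is an illustrative placeholder).
    const double scaleDown = 1.0e-4;
    rates[0] *= scaleDown;
    rates[1] *= scaleDown;

    double time = 0.0;
    std::vector<long> counts(rates.size(), 0);
    for (long step = 0; step < 100000; ++step) {
        const double total = rates[0] + rates[1] + rates[2];

        // Pick the next event with probability proportional to its rate.
        std::uniform_real_distribution<double> pick(0.0, total);
        double r = pick(rng);
        int event = 0;
        while (r > rates[event]) { r -= rates[event]; ++event; }
        ++counts[event];

        // Advance time by an exponentially distributed waiting time.
        std::exponential_distribution<double> wait(total);
        time += wait(rng);
    }

    for (std::size_t i = 0; i < counts.size(); ++i)
        std::cout << "event " << i << " fired " << counts[i] << " times\n";
    std::cout << "simulated time: " << time << "\n";
    return 0;
}
```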

  3. Comparison of Efficacy and Threat Perception Processes in Predicting Smoking among University Students Based on Extended Parallel Process Model

    Directory of Open Access Journals (Sweden)

    S. Bashirian

    2014-04-01

    Full Text Available Introduction & Objective: The survey of smoking, as the most toxic, common and cheapest addiction, and its psychological and demographic variables, especially among the youth who are efficient and constructive individuals of the society, is of great importance. This study was performed to compare efficacy and threat perception in predicting cigarette smoking among university students based on the Extended Parallel Process Model (EPPM). Material & Methods: This cross-sectional descriptive study was carried out on 700 college students of Hamadan recruited with a stratified sampling method. The participants completed a self-administered questionnaire including demographic characteristics, smoking status and the EPPM. Data analysis was done with the SPSS software (version 16), using t-test, one-way ANOVA, Pearson correlation and logistic regression methods. Results: The average scores of threat and efficacy perception were 39.7 and 38.6, respectively. The prevalence of cigarette smoking among participants was 27.1 percent. Also, there were significant differences between the average score of efficacy perception and age, gender, history of drug abuse and dwelling of students (P<0.05). Efficacy and threat perception both predicted student cigarette smoking. Conclusions: Cognitive mediating process of threat perception was a more powerful predictor of cigarette smoking as an unsafe behavior. Therefore, increasing self-efficacy and response efficacy of university students aimed at facilitating the acceptance of safe behavior could be noteworthy as a principle in education. (Sci J Hamadan Univ Med Sci 2014; 21(1): 58-65)

  4. Enabling High-Performance Computing as a Service

    KAUST Repository

    AbdelBaky, Moustafa; Parashar, Manish; Kim, Hyunjoo; Jordan, Kirk E.; Sachdeva, Vipin; Sexton, James; Jamjoom, Hani; Shae, Zon-Yin; Pencheva, Gergina; Tavakoli, Reza; Wheeler, Mary F.

    2012-01-01

    With the right software infrastructure, clouds can provide scientists with as a service access to high-performance computing resources. An award-winning prototype framework transforms the Blue Gene/P system into an elastic cloud to run a

  5. Multicore Challenges and Benefits for High Performance Scientific Computing

    Directory of Open Access Journals (Sweden)

    Ida M.B. Nielsen

    2008-01-01

    Full Text Available Until recently, performance gains in processors were achieved largely by improvements in clock speeds and instruction level parallelism. Thus, applications could obtain performance increases with relatively minor changes by upgrading to the latest generation of computing hardware. Currently, however, processor performance improvements are realized by using multicore technology and hardware support for multiple threads within each core, and taking full advantage of this technology to improve the performance of applications requires exposure of extreme levels of software parallelism. We will here discuss the architecture of parallel computers constructed from many multicore chips as well as techniques for managing the complexity of programming such computers, including the hybrid message-passing/multi-threading programming model. We will illustrate these ideas with a hybrid distributed memory matrix multiply and a quantum chemistry algorithm for energy computation using Møller–Plesset perturbation theory.
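
    As a generic illustration of the hybrid message-passing/multi-threading model mentioned above (not the article's actual code), the sketch below distributes row blocks of a matrix over MPI ranks and uses OpenMP threads within each rank for the local matrix-vector product, a simpler stand-in for the distributed matrix multiply; it assumes the matrix dimension is divisible by the number of ranks, and the matrix contents are placeholders.

```cpp
#include <mpi.h>
#include <cstddef>
#include <iostream>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int n = 1024;                 // assumes n is divisible by the rank count
    const int rowsPerRank = n / size;

    // Root builds A and x; every rank needs the full x but only a block of A.
    std::vector<double> A, x(n, 1.0), localA(rowsPerRank * n), localY(rowsPerRank);
    if (rank == 0) A.assign(static_cast<std::size_t>(n) * n, 2.0);

    // Message passing between nodes: distribute row blocks, broadcast x.
    MPI_Scatter(A.data(), rowsPerRank * n, MPI_DOUBLE,
                localA.data(), rowsPerRank * n, MPI_DOUBLE, 0, MPI_COMM_WORLD);
    MPI_Bcast(x.data(), n, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    // Multi-threading within a node: OpenMP threads share the local block.
    #pragma omp parallel for
    for (int i = 0; i < rowsPerRank; ++i) {
        double sum = 0.0;
        for (int j = 0; j < n; ++j) sum += localA[i * n + j] * x[j];
        localY[i] = sum;
    }

    // Collect the distributed result on the root rank.
    std::vector<double> y;
    if (rank == 0) y.resize(n);
    MPI_Gather(localY.data(), rowsPerRank, MPI_DOUBLE,
               y.data(), rowsPerRank, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    if (rank == 0) std::cout << "y[0] = " << y[0] << "\n";  // equals 2.0 * n
    MPI_Finalize();
    return 0;
}
```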

  6. Enabling high performance computational science through combinatorial algorithms

    International Nuclear Information System (INIS)

    Boman, Erik G; Bozdag, Doruk; Catalyurek, Umit V; Devine, Karen D; Gebremedhin, Assefaw H; Hovland, Paul D; Pothen, Alex; Strout, Michelle Mills

    2007-01-01

    The Combinatorial Scientific Computing and Petascale Simulations (CSCAPES) Institute is developing algorithms and software for combinatorial problems that play an enabling role in scientific and engineering computations. Discrete algorithms will be increasingly critical for achieving high performance for irregular problems on petascale architectures. This paper describes recent contributions by researchers at the CSCAPES Institute in the areas of load balancing, parallel graph coloring, performance improvement, and parallel automatic differentiation

  7. Enabling high performance computational science through combinatorial algorithms

    Energy Technology Data Exchange (ETDEWEB)

    Boman, Erik G [Discrete Algorithms and Math Department, Sandia National Laboratories (United States); Bozdag, Doruk [Biomedical Informatics, and Electrical and Computer Engineering, Ohio State University (United States); Catalyurek, Umit V [Biomedical Informatics, and Electrical and Computer Engineering, Ohio State University (United States); Devine, Karen D [Discrete Algorithms and Math Department, Sandia National Laboratories (United States); Gebremedhin, Assefaw H [Computer Science and Center for Computational Science, Old Dominion University (United States); Hovland, Paul D [Mathematics and Computer Science Division, Argonne National Laboratory (United States); Pothen, Alex [Computer Science and Center for Computational Science, Old Dominion University (United States); Strout, Michelle Mills [Computer Science, Colorado State University (United States)

    2007-07-15

    The Combinatorial Scientific Computing and Petascale Simulations (CSCAPES) Institute is developing algorithms and software for combinatorial problems that play an enabling role in scientific and engineering computations. Discrete algorithms will be increasingly critical for achieving high performance for irregular problems on petascale architectures. This paper describes recent contributions by researchers at the CSCAPES Institute in the areas of load balancing, parallel graph coloring, performance improvement, and parallel automatic differentiation.

  8. Metastable states in the hierarchical Dyson model drive parallel processing in the hierarchical Hopfield network

    International Nuclear Information System (INIS)

    Agliari, Elena; Barra, Adriano; Guerra, Francesco; Galluzzi, Andrea; Tantari, Daniele; Tavani, Flavia

    2015-01-01

    In this paper, we introduce and investigate the statistical mechanics of hierarchical neural networks. First, we approach these systems à la Mattis, by thinking of the Dyson model as a single-pattern hierarchical neural network. We also discuss the stability of different retrievable states as predicted by the related self-consistencies obtained both from a mean-field bound and from a bound that bypasses the mean-field limitation. The latter is worked out by properly reabsorbing the magnetization fluctuations related to higher levels of the hierarchy into effective fields for the lower levels. Remarkably, mixing Amit's ansatz technique for selecting candidate-retrievable states with the interpolation procedure for solving for the free energy of these states, we prove that, due to gauge symmetry, the Dyson model accomplishes both serial and parallel processing. We extend this scenario to multiple stored patterns by implementing the Hebb prescription for learning within the couplings. This results in Hopfield-like networks constrained on a hierarchical topology, for which, by restricting to the low-storage regime where the number of patterns grows at most logarithmically with the number of neurons, we prove the existence of the thermodynamic limit for the free energy, and we give an explicit expression of its mean-field bound and of its related improved bound. The resulting self-consistencies for the Mattis magnetizations, which act as order parameters, are studied, and the stability of solutions is analyzed to get a picture of the overall retrieval capabilities of the system according to both mean-field and non-mean-field scenarios. Our main finding is that embedding the Hebbian rule on a hierarchical topology allows the network to accomplish both serial and parallel processing. By tuning the level of fast noise affecting it or triggering the decay of the interactions with the distance among neurons, the system may switch from sequential retrieval to

  9. Micromagnetics on high-performance workstation and mobile computational platforms

    Science.gov (United States)

    Fu, S.; Chang, R.; Couture, S.; Menarini, M.; Escobar, M. A.; Kuteifan, M.; Lubarda, M.; Gabay, D.; Lomakin, V.

    2015-05-01

    The feasibility of using high-performance desktop and embedded mobile computational platforms is presented, including multi-core Intel central processing units, Nvidia desktop graphics processing units, and the Nvidia Jetson TK1 platform. The FastMag finite-element-method-based micromagnetic simulator is used as a testbed, showing high efficiency on all the platforms. Optimization aspects of improving the performance of the mobile systems are discussed. The high performance, low cost, low power consumption, and rapid performance increase of the embedded mobile systems make them a promising candidate for micromagnetic simulations. Such architectures can be used as standalone systems or can be built as low-power computing clusters.

  10. Computational performance of a smoothed particle hydrodynamics simulation for shared-memory parallel computing

    Science.gov (United States)

    Nishiura, Daisuke; Furuichi, Mikito; Sakaguchi, Hide

    2015-09-01

    The computational performance of a smoothed particle hydrodynamics (SPH) simulation is investigated for three types of current shared-memory parallel computer devices: many integrated core (MIC) processors, graphics processing units (GPUs), and multi-core CPUs. We are especially interested in efficient shared-memory allocation methods for each chipset, because the efficient data access patterns differ between compute unified device architecture (CUDA) programming for GPUs and OpenMP programming for MIC processors and multi-core CPUs. We first introduce several parallel implementation techniques for the SPH code, and then examine these on our target computer architectures to determine the most effective algorithms for each processor unit. In addition, we evaluate the effective computing performance and power efficiency of the SPH simulation on each architecture, as these are critical metrics for overall performance in a multi-device environment. In our benchmark test, the GPU is found to produce the best arithmetic performance as a standalone device unit, and gives the most efficient power consumption. The multi-core CPU obtains the most effective computing performance. The computational speed of the MIC processor on Xeon Phi approached that of two Xeon CPUs. This indicates that using MICs is an attractive choice for existing SPH codes on multi-core CPUs parallelized by OpenMP, as it gains computational acceleration without the need for significant changes to the source code.
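
    A minimal sketch of the shared-memory parallelization pattern discussed above (not the authors' code; the kernel normalization and the brute-force O(N^2) neighbour search are simplifications, and production SPH codes use neighbour lists):

      // OpenMP-parallel SPH density summation (compile with: g++ -O2 -fopenmp sph.cpp)
      #include <cmath>
      #include <cstddef>
      #include <cstdio>
      #include <vector>

      struct Particle { double x, y, z, mass, rho; };

      // Cubic-spline-like smoothing kernel (illustrative normalization only).
      double kernel(double r, double h) {
          const double PI = 3.14159265358979323846;
          double q = r / h, sigma = 1.0 / (PI * h * h * h);
          if (q >= 2.0) return 0.0;
          return (q < 1.0) ? sigma * (1.0 - 1.5 * q * q + 0.75 * q * q * q)
                           : sigma * 0.25 * (2.0 - q) * (2.0 - q) * (2.0 - q);
      }

      void compute_density(std::vector<Particle>& p, double h) {
          // Each particle's density is an independent reduction over neighbours,
          // so the outer loop parallelizes cleanly across threads.
          #pragma omp parallel for schedule(dynamic, 64)
          for (long i = 0; i < (long)p.size(); ++i) {
              double rho = 0.0;
              for (std::size_t j = 0; j < p.size(); ++j) {   // brute force; real codes use cell lists
                  double dx = p[i].x - p[j].x, dy = p[i].y - p[j].y, dz = p[i].z - p[j].z;
                  rho += p[j].mass * kernel(std::sqrt(dx*dx + dy*dy + dz*dz), h);
              }
              p[i].rho = rho;
          }
      }

      int main() {
          std::vector<Particle> p(2000, Particle{0.0, 0.0, 0.0, 1.0, 0.0});
          for (std::size_t i = 0; i < p.size(); ++i) p[i].x = 0.01 * i;
          compute_density(p, 0.05);
          std::printf("rho[0] = %f\n", p[0].rho);
          return 0;
      }

    The same per-particle loop body is what gets mapped onto a CUDA kernel (one thread per particle) or offloaded to a MIC device, which is essentially the comparison the abstract reports.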

  11. High-performance scientific computing in the cloud

    Science.gov (United States)

    Jorissen, Kevin; Vila, Fernando; Rehr, John

    2011-03-01

    Cloud computing has the potential to open up high-performance computational science to a much broader class of researchers, owing to its ability to provide on-demand, virtualized computational resources. However, before such approaches can become commonplace, user-friendly tools must be developed that hide the unfamiliar cloud environment and streamline the management of cloud resources for many scientific applications. We have recently shown that high-performance cloud computing is feasible for parallelized x-ray spectroscopy calculations. We now present benchmark results for a wider selection of scientific applications focusing on electronic structure and spectroscopic simulation software in condensed matter physics. These applications are driven by an improved portable interface that can manage virtual clusters and run various applications in the cloud. We also describe a next generation of cluster tools, aimed at improved performance and a more robust cluster deployment. Supported by NSF grant OCI-1048052.

  12. Decreasing Data Analytics Time: Hybrid Architecture MapReduce-Massive Parallel Processing for a Smart Grid

    Directory of Open Access Journals (Sweden)

    Abdeslam Mehenni

    2017-03-01

    Full Text Available As our populations grow in a world of limited resources, enterprises seek ways to lighten our load on the planet. The idea of modifying consumer behavior appears as a foundation for smart grids. Enterprises demonstrate the value available from deep analysis of electricity consumption histories, consumer messages, outage alerts, etc., mining massive structured and unstructured data. In a nutshell, smart grids result in a flood of data that needs to be analyzed in order to better adjust to demand and give customers more ability to delve into their power consumption. Simply put, smart grids will increasingly have a flexible data warehouse attached to them. The key driver for the adoption of data management strategies is clearly the need to handle and analyze the large amounts of information utilities are now faced with. New approaches to data integration are emerging; Hadoop is in fact now being used by utilities to help manage the huge growth in data whilst maintaining the coherence of the data warehouse. In this paper we define a new Meter Data Management System architecture repository that differs from the three leading MDMSs, in which we use the MapReduce programming model for ETL and a parallel DBMS for query statements (massively parallel processing, MPP).
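
    A conceptual sketch of the ETL aggregation step expressed in MapReduce style (not the paper's code; field names such as meter_id and kwh are hypothetical, and a single process stands in here for what Hadoop would distribute over a cluster before loading results into the MPP warehouse):

      #include <cstdio>
      #include <string>
      #include <unordered_map>
      #include <utility>
      #include <vector>

      struct Reading { std::string meter_id; double kwh; };

      // "map": emit (key, value) pairs from raw meter readings.
      std::vector<std::pair<std::string, double>> map_phase(const std::vector<Reading>& in) {
          std::vector<std::pair<std::string, double>> out;
          for (const auto& r : in) out.emplace_back(r.meter_id, r.kwh);
          return out;
      }

      // "reduce": sum consumption per meter.
      std::unordered_map<std::string, double> reduce_phase(
              const std::vector<std::pair<std::string, double>>& kv) {
          std::unordered_map<std::string, double> totals;
          for (const auto& p : kv) totals[p.first] += p.second;
          return totals;
      }

      int main() {
          std::vector<Reading> batch = {{"M001", 1.2}, {"M002", 0.7}, {"M001", 2.1}};
          for (const auto& t : reduce_phase(map_phase(batch)))
              std::printf("%s %.2f kWh\n", t.first.c_str(), t.second);
          return 0;
      }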

  13. Online measurement for geometrical parameters of wheel set based on structure light and CUDA parallel processing

    Science.gov (United States)

    Wu, Kaihua; Shao, Zhencheng; Chen, Nian; Wang, Wenjie

    2018-01-01

    The wearing degree of the wheel set tread is one of the main factors that influence the safety and stability of a running train. The geometrical parameters mainly include flange thickness and flange height. Line-structured laser light was projected onto the wheel tread surface, and the geometrical parameters can be deduced from the profile image. An online image acquisition system was designed based on asynchronous reset of the CCD and a CUDA parallel processing unit, with image acquisition fulfilled in hardware interrupt mode. A high-efficiency parallel segmentation algorithm based on CUDA is proposed. The algorithm first divides the image into smaller squares and then extracts the squares belonging to the target by a fusion of the k_means and STING clustering image segmentation algorithms. The segmentation time is less than 0.97 ms, a considerable acceleration ratio compared with serial CPU calculation, which greatly improves the real-time image processing capacity. When a wheel set is running at a limited speed, the system, placed alongside the railway line, can measure the geometrical parameters automatically. The maximum measuring speed is 120 km/h.
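
    As an illustration of the block-wise clustering step, the sketch below is a serial CPU analogue (not the authors' CUDA kernel): the image is divided into small squares, each square is described by its mean intensity, and k-means with k = 2 separates laser-profile blocks from background. The CUDA version parallelizes the per-block statistics and the assignment step across GPU threads.

      #include <cmath>
      #include <cstdio>
      #include <vector>

      int main() {
          const int W = 64, H = 64, B = 8;                  // image and block size
          std::vector<unsigned char> img(W * H, 10);        // dark background
          for (int y = H / 2; y < H / 2 + B; ++y)           // bright horizontal laser band
              for (int x = 0; x < W; ++x) img[y * W + x] = 200;

          // Per-block mean intensity acts as a one-dimensional feature.
          std::vector<double> feat;
          for (int by = 0; by < H; by += B)
              for (int bx = 0; bx < W; bx += B) {
                  double s = 0.0;
                  for (int y = 0; y < B; ++y)
                      for (int x = 0; x < B; ++x) s += img[(by + y) * W + (bx + x)];
                  feat.push_back(s / (B * B));
              }

          // Plain k-means with k = 2 (dark vs. bright blocks).
          double c[2] = {0.0, 255.0};
          std::vector<int> label(feat.size(), 0);
          for (int it = 0; it < 20; ++it) {
              double sum[2] = {0, 0}; int cnt[2] = {0, 0};
              for (std::size_t i = 0; i < feat.size(); ++i) {
                  label[i] = (std::fabs(feat[i] - c[0]) <= std::fabs(feat[i] - c[1])) ? 0 : 1;
                  sum[label[i]] += feat[i]; ++cnt[label[i]];
              }
              for (int k = 0; k < 2; ++k) if (cnt[k]) c[k] = sum[k] / cnt[k];
          }
          int target = 0;
          for (std::size_t i = 0; i < label.size(); ++i) target += (label[i] == 1);
          std::printf("blocks flagged as laser profile: %d of %zu\n", target, label.size());
          return 0;
      }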

  14. Hardware system of parallel processing for fast CT image reconstruction based on circular shifting float memory architecture

    International Nuclear Information System (INIS)

    Wang Shi; Kang Kejun; Wang Jingjin

    1995-01-01

    Computerized Tomography (CT) is expected to become an indispensable diagnostic technique in the future. However, the long time required to reconstruct an image has been one of the major drawbacks associated with this technique. Parallel processing is one of the best ways to solve this problem. This paper gives the architecture and hardware design of PIRS-4 (4-processor Parallel Image Reconstruction System), a parallel processing system for fast 3D-CT image reconstruction based on a circular shifting float memory architecture. It covers the structure and components of the system, the design of the crossbar switch, and details of the control model. Test results are described.

  15. COMPUTERS: Teraflops for Europe; EEC Working Group on High Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Anon.

    1991-03-15

    In little more than a decade, simulation on high performance computers has become an essential tool for theoretical physics, capable of solving a vast range of crucial problems inaccessible to conventional analytic mathematics. In many ways, computer simulation has become the calculus for interacting many-body systems, a key to the study of transitions from isolated to collective behaviour.

  16. COMPUTERS: Teraflops for Europe; EEC Working Group on High Performance Computing

    International Nuclear Information System (INIS)

    Anon.

    1991-01-01

    In little more than a decade, simulation on high performance computers has become an essential tool for theoretical physics, capable of solving a vast range of crucial problems inaccessible to conventional analytic mathematics. In many ways, computer simulation has become the calculus for interacting many-body systems, a key to the study of transitions from isolated to collective behaviour

  17. Optical interconnection networks for high-performance computing systems

    International Nuclear Information System (INIS)

    Biberman, Aleksandr; Bergman, Keren

    2012-01-01

    Enabled by silicon photonic technology, optical interconnection networks have the potential to be a key disruptive technology in computing and communication industries. The enduring pursuit of performance gains in computing, combined with stringent power constraints, has fostered the ever-growing computational parallelism associated with chip multiprocessors, memory systems, high-performance computing systems and data centers. Sustaining these parallelism growths introduces unique challenges for on- and off-chip communications, shifting the focus toward novel and fundamentally different communication approaches. Chip-scale photonic interconnection networks, enabled by high-performance silicon photonic devices, offer unprecedented bandwidth scalability with reduced power consumption. We demonstrate that the silicon photonic platforms have already produced all the high-performance photonic devices required to realize these types of networks. Through extensive empirical characterization in much of our work, we demonstrate such feasibility of waveguides, modulators, switches and photodetectors. We also demonstrate systems that simultaneously combine many functionalities to achieve more complex building blocks. We propose novel silicon photonic devices, subsystems, network topologies and architectures to enable unprecedented performance of these photonic interconnection networks. Furthermore, the advantages of photonic interconnection networks extend far beyond the chip, offering advanced communication environments for memory systems, high-performance computing systems, and data centers. (review article)

  18. Computational Fluid Dynamics and Building Energy Performance Simulation

    DEFF Research Database (Denmark)

    Nielsen, Peter V.; Tryggvason, Tryggvi

    An interconnection between a building energy performance simulation program and a Computational Fluid Dynamics program (CFD) for room air distribution will be introduced for improvement of the predictions of both the energy consumption and the indoor environment. The building energy performance...

  19. Contemporary high performance computing from petascale toward exascale

    CERN Document Server

    Vetter, Jeffrey S

    2015-01-01

    A continuation of Contemporary High Performance Computing: From Petascale toward Exascale, this second volume continues the discussion of HPC flagship systems, major application workloads, facilities, and sponsors. The book includes figures and pictures that capture the state of existing systems: pictures of buildings, systems in production, floorplans, and many block diagrams and charts to illustrate system design and performance.

  20. Connecting Performance Analysis and Visualization to Advance Extreme Scale Computing

    Energy Technology Data Exchange (ETDEWEB)

    Bremer, Peer-Timo [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Mohr, Bernd [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Schulz, Martin [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Pascucci, Valerio [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Gamblin, Todd [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Brunst, Holger [Dresden Univ. of Technology (Germany)

    2015-07-29

    The characterization, modeling, analysis, and tuning of software performance has been a central topic in High Performance Computing (HPC) since its early beginnings. The overall goal is to make HPC software run faster on particular hardware, either through better scheduling, on-node resource utilization, or more efficient distributed communication.

  1. Human performance models for computer-aided engineering

    Science.gov (United States)

    Elkind, Jerome I. (Editor); Card, Stuart K. (Editor); Hochberg, Julian (Editor); Huey, Beverly Messick (Editor)

    1989-01-01

    This report discusses a topic important to the field of computational human factors: models of human performance and their use in computer-based engineering facilities for the design of complex systems. It focuses on a particular human factors design problem -- the design of cockpit systems for advanced helicopters -- and on a particular aspect of human performance -- vision and related cognitive functions. By focusing in this way, the authors were able to address the selected topics in some depth and develop findings and recommendations that they believe have application to many other aspects of human performance and to other design domains.

  2. Survey of computer codes applicable to waste facility performance evaluations

    International Nuclear Information System (INIS)

    Alsharif, M.; Pung, D.L.; Rivera, A.L.; Dole, L.R.

    1988-01-01

    This study is an effort to review existing information that is useful to develop an integrated model for predicting the performance of a radioactive waste facility. A summary description of 162 computer codes is given. The identified computer programs address the performance of waste packages, waste transport and equilibrium geochemistry, hydrological processes in unsaturated and saturated zones, and general waste facility performance assessment. Some programs also deal with thermal analysis, structural analysis, and special purposes. A number of these computer programs are being used by the US Department of Energy, the US Nuclear Regulatory Commission, and their contractors to analyze various aspects of waste package performance. Fifty-five of these codes were identified as being potentially useful in the analysis of low-level radioactive waste facilities located above the water table. The code summaries include authors, identification data, model types, and pertinent references. 14 refs., 5 tabs

  3. Routing performance analysis and optimization within a massively parallel computer

    Science.gov (United States)

    Archer, Charles Jens; Peters, Amanda; Pinnow, Kurt Walter; Swartz, Brent Allen

    2013-04-16

    An apparatus, program product and method optimize the operation of a massively parallel computer system by, in part, receiving actual performance data concerning an application executed by the plurality of interconnected nodes, and analyzing the actual performance data to identify an actual performance pattern. A desired performance pattern may be determined for the application, and an algorithm may be selected from among a plurality of algorithms stored within a memory, the algorithm being configured to achieve the desired performance pattern based on the actual performance data.

  4. A Pervasive Parallel Processing Framework for Data Visualization and Analysis at Extreme Scale

    Energy Technology Data Exchange (ETDEWEB)

    Moreland, Kenneth [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Geveci, Berk [Kitware, Inc., Clifton Park, NY (United States)

    2014-11-01

    The evolution of the computing world from teraflop to petaflop has been relatively effortless, with several of the existing programming models scaling effectively to the petascale. The migration to exascale, however, poses considerable challenges. All industry trends suggest that the exascale machine will be built using processors containing hundreds to thousands of cores per chip. It can be inferred that efficient concurrency on exascale machines requires a massive number of concurrent threads, each performing many operations on a localized piece of data. Currently, visualization libraries and applications are based on what is known as the visualization pipeline. In the pipeline model, algorithms are encapsulated as filters with inputs and outputs. These filters are connected by setting the output of one component to the input of another. Parallelism in the visualization pipeline is achieved by replicating the pipeline for each processing thread. This works well for today’s distributed memory parallel computers but cannot be sustained when operating on processors with thousands of cores. Our project investigates a new visualization framework designed to exhibit the pervasive parallelism necessary for extreme scale machines. Our framework achieves this by defining algorithms in terms of worklets, which are localized stateless operations. Worklets are atomic operations that execute when invoked, unlike filters, which execute when a pipeline request occurs. The worklet design allows execution on a massive number of lightweight threads with minimal overhead. Only with such fine-grained parallelism can we hope to fill the billions of threads we expect will be necessary for efficient computation on an exascale machine.
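
    A minimal sketch of the worklet idea described above (illustrative only, not the project's actual API): a worklet is a small, stateless, per-element operation, and a generic dispatcher maps it over the data. Because each invocation touches only its own element, a many-core back end can hand the dispatch loop to millions of lightweight threads.

      #include <cstddef>
      #include <cstdio>
      #include <vector>

      // A worklet: a pure function of its inputs, carrying no internal state.
      struct CelsiusToFahrenheit {
          double operator()(double c) const { return 1.8 * c + 32.0; }
      };

      // A (serial) dispatcher; a parallel back end would replace this loop.
      template <typename Worklet>
      std::vector<double> dispatch(const Worklet& w, const std::vector<double>& in) {
          std::vector<double> out(in.size());
          for (std::size_t i = 0; i < in.size(); ++i) out[i] = w(in[i]);
          return out;
      }

      int main() {
          std::vector<double> field = {0.0, 21.5, 100.0};
          for (double v : dispatch(CelsiusToFahrenheit{}, field)) std::printf("%.1f\n", v);
          return 0;
      }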

  5. 3rd International Conference on High Performance Scientific Computing

    CERN Document Server

    Kostina, Ekaterina; Phu, Hoang; Rannacher, Rolf

    2008-01-01

    This proceedings volume contains a selection of papers presented at the Third International Conference on High Performance Scientific Computing held at the Hanoi Institute of Mathematics, Vietnamese Academy of Science and Technology (VAST), March 6-10, 2006. The conference has been organized by the Hanoi Institute of Mathematics, Interdisciplinary Center for Scientific Computing (IWR), Heidelberg, and its International PhD Program ``Complex Processes: Modeling, Simulation and Optimization'', and Ho Chi Minh City University of Technology. The contributions cover the broad interdisciplinary spectrum of scientific computing and present recent advances in theory, development of methods, and applications in practice. Subjects covered are mathematical modelling, numerical simulation, methods for optimization and control, parallel computing, software development, applications of scientific computing in physics, chemistry, biology and mechanics, environmental and hydrology problems, transport, logistics and site loca...

  6. 5th International Conference on High Performance Scientific Computing

    CERN Document Server

    Hoang, Xuan; Rannacher, Rolf; Schlöder, Johannes

    2014-01-01

    This proceedings volume gathers a selection of papers presented at the Fifth International Conference on High Performance Scientific Computing, which took place in Hanoi on March 5-9, 2012. The conference was organized by the Institute of Mathematics of the Vietnam Academy of Science and Technology (VAST), the Interdisciplinary Center for Scientific Computing (IWR) of Heidelberg University, Ho Chi Minh City University of Technology, and the Vietnam Institute for Advanced Study in Mathematics. The contributions cover the broad interdisciplinary spectrum of scientific computing and present recent advances in theory, development of methods, and practical applications. Subjects covered include mathematical modeling; numerical simulation; methods for optimization and control; parallel computing; software development; and applications of scientific computing in physics, mechanics and biomechanics, material science, hydrology, chemistry, biology, biotechnology, medicine, sports, psychology, transport, logistics, com...

  7. 6th International Conference on High Performance Scientific Computing

    CERN Document Server

    Phu, Hoang; Rannacher, Rolf; Schlöder, Johannes

    2017-01-01

    This proceedings volume highlights a selection of papers presented at the Sixth International Conference on High Performance Scientific Computing, which took place in Hanoi, Vietnam on March 16-20, 2015. The conference was jointly organized by the Heidelberg Institute of Theoretical Studies (HITS), the Institute of Mathematics of the Vietnam Academy of Science and Technology (VAST), the Interdisciplinary Center for Scientific Computing (IWR) at Heidelberg University, and the Vietnam Institute for Advanced Study in Mathematics, Ministry of Education. The contributions cover a broad, interdisciplinary spectrum of scientific computing and showcase recent advances in theory, methods, and practical applications. Subjects covered include numerical simulation, methods for optimization and control, parallel computing, and software development, as well as the applications of scientific computing in physics, mechanics, biomechanics and robotics, material science, hydrology, biotechnology, medicine, transport, scheduling, and in...

  8. GPU-based high-performance computing for radiation therapy

    International Nuclear Information System (INIS)

    Jia, Xun; Jiang, Steve B; Ziegenhein, Peter

    2014-01-01

    Recent developments in radiotherapy demand high computational power to solve challenging problems in a timely fashion in a clinical environment. The graphics processing unit (GPU), as an emerging high-performance computing platform, has been introduced to radiotherapy. It is particularly attractive due to its high computational power, small size, and low cost for facility deployment and maintenance. Over the past few years, GPU-based high-performance computing in radiotherapy has experienced rapid development, and a tremendous amount of research has been conducted, in which large acceleration factors compared with the conventional CPU platform have been observed. In this paper, we will first give a brief introduction to the GPU hardware structure and programming model. We will then review the current applications of GPU in major imaging-related and therapy-related problems encountered in radiotherapy. A comparison of GPU with other platforms will also be presented. (topical review)

  9. Computer Simulation Performed for Columbia Project Cooling System

    Science.gov (United States)

    Ahmad, Jasim

    2005-01-01

    This demo shows a high-fidelity simulation of the air flow in the main computer room housing the Columbia (10,240 Intel Itanium processors) system. The simulation assesses the performance of the cooling system, identifies deficiencies, and recommends modifications to eliminate them. It used two in-house software packages on NAS supercomputers: Chimera Grid Tools to generate a geometric model of the computer room, and the OVERFLOW-2 code for fluid and thermal simulation. This state-of-the-art technology can easily be extended to provide a general capability for air flow analyses in any modern computer room.

  10. Prediction of Adequate Prenatal Care Utilization Based on the Extended Parallel Process Model.

    Science.gov (United States)

    Hajian, Sepideh; Imani, Fatemeh; Riazi, Hedyeh; Salmani, Fatemeh

    2017-10-01

    Pregnancy complications are one of the major public health concerns. One of the main causes of preventable complications is the absence or inadequate provision of prenatal care. The present study was conducted to investigate whether the Extended Parallel Process Model's constructs can predict the utilization of prenatal care services. The present longitudinal prospective study was conducted on 192 pregnant women selected through multi-stage sampling of health facilities in Qeshm, Hormozgan province, from April to June 2015. Participants were followed up from the first half of pregnancy until childbirth to assess adequate or inadequate/non-utilization of prenatal care services. Data were collected using the structured Risk Behavior Diagnosis Scale. The analysis of the data was carried out in SPSS-22 using one-way ANOVA, linear regression and logistic regression analysis. The level of significance was set at 0.05. In total, 178 pregnant women with a mean age of 25.31±5.42 completed the study. Perceived self-efficacy (OR=25.23) predicted adequate utilization of prenatal care. Husband's occupation in the labor market (OR=0.43; P=0.02), unwanted pregnancy (OR=0.352), and caring for minors or the elderly at home (OR=0.35; P=0.045) were associated with lower odds of receiving prenatal care. The model showed that when the perceived efficacy of the prenatal care services overcame the perceived threat, the likelihood of prenatal care usage increased. This study identified some modifiable factors associated with prenatal care usage by women, providing key targets for appropriate clinical interventions.

  11. High-Performance Java Codes for Computational Fluid Dynamics

    Science.gov (United States)

    Riley, Christopher; Chatterjee, Siddhartha; Biswas, Rupak; Biegel, Bryan (Technical Monitor)

    2001-01-01

    The computational science community is reluctant to write large-scale, computationally-intensive applications in Java due to concerns over Java's poor performance, despite the claimed software engineering advantages of its object-oriented features. Naive Java implementations of numerical algorithms can perform poorly compared to corresponding Fortran or C implementations. To achieve high performance, Java applications must be designed with good performance as a primary goal. This paper presents the object-oriented design and implementation of two real-world applications from the field of Computational Fluid Dynamics (CFD): a finite-volume fluid flow solver (LAURA, from NASA Langley Research Center), and an unstructured mesh adaptation algorithm (2D_TAG, from NASA Ames Research Center). This work builds on our previous experience with the design of high-performance numerical libraries in Java. We examine the performance of the applications using the currently available Java infrastructure and show that the Java version of the flow solver LAURA performs almost within a factor of 2 of the original procedural version. Our Java version of the mesh adaptation algorithm 2D_TAG performs within a factor of 1.5 of its original procedural version on certain platforms. Our results demonstrate that object-oriented software design principles are not necessarily inimical to high performance.

  12. High Performance Computing Software Applications for Space Situational Awareness

    Science.gov (United States)

    Giuliano, C.; Schumacher, P.; Matson, C.; Chun, F.; Duncan, B.; Borelli, K.; Desonia, R.; Gusciora, G.; Roe, K.

    The High Performance Computing Software Applications Institute for Space Situational Awareness (HSAI-SSA) has completed its first full year of applications development. The emphasis of our work in this first year was on improving space surveillance sensor models and image enhancement software. These applications are the Space Surveillance Network Analysis Model (SSNAM), the Air Force Space Fence simulation (SimFence), and the physically constrained iterative de-convolution (PCID) image enhancement software tool. Specifically, we have demonstrated order-of-magnitude speed-ups in those codes running on the latest Cray XD-1 Linux supercomputer (Hoku) at the Maui High Performance Computing Center. The software application improvements that HSAI-SSA has made have had a significant impact on the warfighter and have fundamentally changed the role of high performance computing in SSA.

  13. A Perspective on Computational Human Performance Models as Design Tools

    Science.gov (United States)

    Jones, Patricia M.

    2010-01-01

    The design of interactive systems, including levels of automation, displays, and controls, is usually based on design guidelines and iterative empirical prototyping. A complementary approach is to use computational human performance models to evaluate designs. An integrated strategy of model-based and empirical test and evaluation activities is particularly attractive as a methodology for verification and validation of human-rated systems for commercial space. This talk will review several computational human performance modeling approaches and their applicability to design of display and control requirements.

  14. High performance computing and communications: FY 1997 implementation plan

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1996-12-01

    The High Performance Computing and Communications (HPCC) Program was formally authorized by passage, with bipartisan support, of the High-Performance Computing Act of 1991, signed on December 9, 1991. The original Program, in which eight Federal agencies participated, has now grown to twelve agencies. This Plan provides a detailed description of the agencies' FY 1996 HPCC accomplishments and FY 1997 HPCC plans. Section 3 of this Plan provides an overview of the HPCC Program. Section 4 contains more detailed definitions of the Program Component Areas, with an emphasis on the overall directions and milestones planned for each PCA. Appendix A provides a detailed look at HPCC Program activities within each agency.

  15. Visualization and Data Analysis for High-Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Sewell, Christopher Meyer [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2016-09-27

    This is a set of slides from a guest lecture for a class at the University of Texas, El Paso on visualization and data analysis for high-performance computing. The topics covered are the following: trends in high-performance computing; scientific visualization, such as OpenGL, ray tracing and volume rendering, VTK, and ParaView; data science at scale, such as in-situ visualization, image databases, distributed memory parallelism, shared memory parallelism, VTK-m, "big data", and then an analysis example.

  16. Current and planned numerical development for improving computing performance for long duration and/or low pressure transients

    Energy Technology Data Exchange (ETDEWEB)

    Faydide, B. [Commissariat a l`Energie Atomique, Grenoble (France)

    1997-07-01

    This paper presents the current and planned numerical development for improving computing performance for Cathare applications needing real-time performance, such as simulator applications. Cathare is a thermalhydraulic code developed by CEA (DRN), IPSN, EDF and FRAMATOME for PWR safety analysis. First, the general characteristics of the code are presented, dealing with physical models, numerical topics, and validation strategy. Then, the current and planned applications of Cathare in the field of simulators are discussed. Some of these applications were made in the past, using a simplified and fast-running version of Cathare (Cathare-Simu); the status of the numerical improvements obtained with Cathare-Simu is presented. The planned developments concern mainly the Simulator Cathare Release (SCAR) project, which deals with the use of the most recent version of Cathare inside simulators. In this frame, the numerical developments are related to the speed-up of the calculation process, using parallel processing, and to improvement of code reliability on a large set of NPP transients.

  17. Current and planned numerical development for improving computing performance for long duration and/or low pressure transients

    International Nuclear Information System (INIS)

    Faydide, B.

    1997-01-01

    This paper presents the current and planned numerical development for improving computing performance for Cathare applications needing real-time performance, such as simulator applications. Cathare is a thermalhydraulic code developed by CEA (DRN), IPSN, EDF and FRAMATOME for PWR safety analysis. First, the general characteristics of the code are presented, dealing with physical models, numerical topics, and validation strategy. Then, the current and planned applications of Cathare in the field of simulators are discussed. Some of these applications were made in the past, using a simplified and fast-running version of Cathare (Cathare-Simu); the status of the numerical improvements obtained with Cathare-Simu is presented. The planned developments concern mainly the Simulator Cathare Release (SCAR) project, which deals with the use of the most recent version of Cathare inside simulators. In this frame, the numerical developments are related to the speed-up of the calculation process, using parallel processing, and to improvement of code reliability on a large set of NPP transients.

  18. A Review of Parallel Processing Approaches to Robot Kinematics and Jacobian

    OpenAIRE

    Henrich, Dominik; Karl, Joachim; Wörn, Heinz

    1997-01-01

    Due to continuously increasing demands in the area of advanced robot control, it became necessary to speed up the computation. One way to reduce the computation time is to distribute the computation onto several processing units. In this survey we present different approaches to parallel computation of robot kinematics and Jacobian. Thereby, we discuss both the forward and the reverse problem. We introduce a classification scheme and class...

  19. SU-E-T-531: Performance Evaluation of Multithreaded Geant4 for Proton Therapy Dose Calculations in a High Performance Computing Facility

    International Nuclear Information System (INIS)

    Shin, J; Coss, D; McMurry, J; Farr, J; Faddegon, B

    2014-01-01

    Purpose: To evaluate the efficiency of multithreaded Geant4 (Geant4-MT, version 10.0) for proton Monte Carlo dose calculations using a high performance computing facility. Methods: Geant4-MT was used to calculate 3D dose distributions in 1×1×1 mm3 voxels in a water phantom and a patient's head with a 150 MeV proton beam covering approximately 5×5 cm2 in the water phantom. Three timestamps were measured on the fly to separately analyze the time required for initialization (which cannot be parallelized), the processing time of individual threads, and the completion time. Scalability of the averaged processing time per thread was calculated as a function of thread number (1, 100, 150, and 200) for both 1 M and 50 M histories. The total memory usage was recorded. Results: Simulations with 50 M histories were fastest with 100 threads, taking approximately 1.3 hours and 6 hours for the water phantom and the CT data, respectively, with better than 1.0% statistical uncertainty. The calculations show 1/N scalability in the event loops for both cases. The gains from parallel calculation started to decrease with 150 threads. The memory usage increases linearly with the number of threads. No critical failures were observed during the simulations. Conclusion: Multithreading in Geant4-MT decreased simulation time in proton dose distribution calculations by factors of 64 and 54 at a near-optimal 100 threads for the water phantom and the patient data, respectively. Further simulations will be done to determine the efficiency at the optimal thread number. Considering the trend of computer architecture development, utilizing Geant4-MT for radiotherapy simulations is an excellent cost-effective alternative to a distributed batch queuing system. However, because the scalability depends highly on simulation details, i.e., the ratio of the processing time of one event versus the waiting time to access the shared event queue, a performance evaluation as described is recommended.
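
    Because the abstract separates a serial initialization phase from a parallelizable event loop, the expected wall time versus thread count follows an Amdahl-style model, T(n) = T_init + T_events/n, with speedup saturating once T_events/n falls below T_init. The sketch below illustrates this with made-up numbers (not the paper's measurements):

      #include <cstdio>

      int main() {
          const double t_init = 120.0;      // serial geometry/physics initialization [s] (assumed)
          const double t_events = 36000.0;  // single-threaded event-loop time [s] (assumed)
          const int threads[] = {1, 50, 100, 150, 200};
          for (int n : threads) {
              double t = t_init + t_events / n;
              std::printf("threads=%3d  wall=%8.1f s  speedup=%.1fx\n",
                          n, t, (t_init + t_events) / t);
          }
          return 0;
      }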

  20. Clusters of PCS for high-speed computation for modelling of the climate

    International Nuclear Information System (INIS)

    Pabon C, Jose Daniel; Eslava R, Jesus Antonio; Montoya G, Gerardo de Jesus

    2001-01-01

    In order to create a high-speed computing capability, the postgraduate program in Meteorology of the Department of Geosciences, National University of Colombia, installed a cluster of 8 PCs for parallel processing. This high-speed processing machine was tested with the Community Climate Model (CCM3). In this paper, the results related to the performance of this machine are presented.

  1. Visual Analysis of Cloud Computing Performance Using Behavioral Lines.

    Science.gov (United States)

    Muelder, Chris; Zhu, Biao; Chen, Wei; Zhang, Hongxin; Ma, Kwan-Liu

    2016-02-29

    Cloud computing is an essential technology for Big Data analytics and services. A cloud computing system is often comprised of a large number of parallel computing and storage devices. Monitoring the usage and performance of such a system is important for efficient operations, maintenance, and security. Tracing every application on a large cloud system is untenable due to scale and privacy issues, but profile data can be collected relatively efficiently by regularly sampling the state of the system, including properties such as CPU load, memory usage, network usage, and others, creating a set of multivariate time series for each system. Adequate tools for studying such large-scale, multidimensional data are lacking. In this paper, we present a visual analysis approach to understanding and analyzing the performance and behavior of cloud computing systems. Our design is based on similarity measures and a layout method to portray the behavior of each compute node over time. When visualizing a large number of behavioral lines together, distinct patterns often appear, suggesting particular types of performance bottleneck. The resulting system provides multiple linked views, which allow the user to interactively explore the data by examining the data or a selected subset at different levels of detail. Our case studies, which use datasets collected from two different cloud systems, show that this visual approach is effective in identifying trends and anomalies in the systems.
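
    As an illustration of the kind of similarity measure mentioned above (a generic choice, not necessarily the one used in the paper), each compute node can be treated as a multivariate time series of sampled metrics and compared pairwise; nodes far from the rest of the cluster are candidates for anomalous behavior:

      #include <cmath>
      #include <cstddef>
      #include <cstdio>
      #include <vector>

      using Series = std::vector<std::vector<double>>;  // [time sample][metric]

      // Euclidean distance between two equally sampled metric series.
      double distance(const Series& a, const Series& b) {
          double d = 0.0;
          for (std::size_t t = 0; t < a.size(); ++t)
              for (std::size_t m = 0; m < a[t].size(); ++m)
                  d += (a[t][m] - b[t][m]) * (a[t][m] - b[t][m]);
          return std::sqrt(d);
      }

      int main() {
          // Three nodes, three samples, two metrics each (CPU %, memory %); node C misbehaves.
          Series nodeA = {{10, 40}, {12, 41}, {11, 42}};
          Series nodeB = {{11, 39}, {13, 40}, {12, 41}};
          Series nodeC = {{90, 80}, {95, 85}, {97, 88}};
          std::printf("d(A,B)=%.1f  d(A,C)=%.1f  d(B,C)=%.1f\n",
                      distance(nodeA, nodeB), distance(nodeA, nodeC), distance(nodeB, nodeC));
          return 0;
      }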

  2. Development of a parallel processing couple for calculations of control rod worth in terms of burn-up in a WWER-1000 reactor

    Energy Technology Data Exchange (ETDEWEB)

    Noori-Kalkhoran, Omid; Ahangari, R. [Nuclear Science and Technology Research Institute (NSTRI), Tehran (Iran, Islamic Republic of). Reactor Research school; Shirani, A.S. [Shahid Beheshti Univ., Tehran (Iran, Islamic Republic of). Faculty of Engineering

    2017-03-15

    In this study, a code-based method has been developed for calculation of integral and differential control rod worth in terms of burn-up for a WWER-1000 reactor. Parallel processing of WIMSD-5B, PARCS V2.7 and COBRA-EN has been used for this purpose. WIMSD-5B has been used for cell calculation and for handling the burn-up of the core at different days. PARCS V2.7 has been used for the neutronic calculation of the core and the critical boron concentration search. The thermal-hydraulic calculation has been performed by COBRA-EN. A parallel processing algorithm has been developed in MATLAB to couple these codes and transfer suitable data between them at each step. Steady-state power peaking factors (PPFs) of the core and control rod worth have been calculated from the Beginning Of Cycle (BOC) to 289.7 Effective Full Power Days (EFPDs) in several steps. Results have been compared with the Bushehr Nuclear Power Plant (BNPP) Final Safety Analysis Report (FSAR) results. The results show great similarity and confirm the ability of the developed coupling to calculate control rod worth in terms of burn-up.

  3. A comprehensive approach to decipher biological computation to achieve next generation high-performance exascale computing.

    Energy Technology Data Exchange (ETDEWEB)

    James, Conrad D.; Schiess, Adrian B.; Howell, Jamie; Baca, Michael J.; Partridge, L. Donald; Finnegan, Patrick Sean; Wolfley, Steven L.; Dagel, Daryl James; Spahn, Olga Blum; Harper, Jason C.; Pohl, Kenneth Roy; Mickel, Patrick R.; Lohn, Andrew; Marinella, Matthew

    2013-10-01

    The human brain (volume = 1200 cm^3) consumes 20 W and is capable of performing > 10^16 operations/s. Current supercomputer technology has reached 10^15 operations/s, yet it requires 1500 m^3 and 3 MW, giving the brain a 10^12 advantage in operations/s/W/cm^3. Thus, to reach exascale computation, two achievements are required: 1) improved understanding of computation in biological tissue, and 2) a paradigm shift towards neuromorphic computing where hardware circuits mimic properties of neural tissue. To address 1), we will interrogate corticostriatal networks in mouse brain tissue slices, specifically with regard to their frequency filtering capabilities as a function of input stimulus. To address 2), we will instantiate biological computing characteristics such as multi-bit storage into hardware devices with future computational and memory applications. Resistive memory devices will be modeled, designed, and fabricated in the MESA facility in consultation with our internal and external collaborators.
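
    As a rough check, the figures stated in the abstract give

      \frac{10^{16}\ \mathrm{ops/s}}{20\ \mathrm{W}\times 1.2\times 10^{3}\ \mathrm{cm^{3}}}\approx 4\times 10^{11}
      \qquad\text{versus}\qquad
      \frac{10^{15}\ \mathrm{ops/s}}{3\times 10^{6}\ \mathrm{W}\times 1.5\times 10^{9}\ \mathrm{cm^{3}}}\approx 2\times 10^{-1},

    a ratio of roughly 2 x 10^{12}, consistent with the order-of-magnitude claim.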

  4. High performance computing and communications: FY 1996 implementation plan

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1995-05-16

    The High Performance Computing and Communications (HPCC) Program was formally authorized by passage of the High Performance Computing Act of 1991, signed on December 9, 1991. Twelve federal agencies, in collaboration with scientists and managers from US industry, universities, and research laboratories, have developed the Program to meet the challenges of advancing computing and associated communications technologies and practices. This plan provides a detailed description of the agencies' HPCC implementation plans for FY 1995 and FY 1996. This Implementation Plan contains three additional sections. Section 3 provides an overview of the HPCC Program definition and organization. Section 4 contains a breakdown of the five major components of the HPCC Program, with an emphasis on the overall directions and milestones planned for each one. Section 5 provides a detailed look at HPCC Program activities within each agency.

  5. Multi-Language Programming Environments for High Performance Java Computing

    OpenAIRE

    Vladimir Getov; Paul Gray; Sava Mintchev; Vaidy Sunderam

    1999-01-01

    Recent developments in processor capabilities, software tools, programming languages and programming paradigms have brought about new approaches to high performance computing. A steadfast component of this dynamic evolution has been the scientific community’s reliance on established scientific packages. As a consequence, programmers of high‐performance applications are reluctant to embrace evolving languages such as Java. This paper describes the Java‐to‐C Interface (JCI) tool which provides ...

  6. Computational modelling of expressive music performance in hexaphonic guitar

    OpenAIRE

    Siquier, Marc

    2017-01-01

    Computational modelling of expressive music performance has been widely studied in the past. While previous work in this area has mainly focused on classical piano music, there has been very little work on guitar music, and such work has focused on monophonic guitar playing. In this work, we present a machine learning approach to automatically generate expressive performances from non-expressive music scores for polyphonic guitar. We treat the guitar as a hexaphonic instrument, obtaining ...

  7. Enabling High-Performance Computing as a Service

    KAUST Repository

    AbdelBaky, Moustafa

    2012-10-01

    With the right software infrastructure, clouds can provide scientists with as-a-service access to high-performance computing resources. An award-winning prototype framework transforms the Blue Gene/P system into an elastic cloud to run a representative HPC application. © 2012 IEEE.

  8. Computer science of the high performance; Informatica del alto rendimiento

    Energy Technology Data Exchange (ETDEWEB)

    Moraleda, A.

    2008-07-01

    High-performance computing is taking shape as a powerful accelerator of innovation, drastically reducing the waiting times for access to results and findings in a growing number of processes and activities as complex and important as medicine, genetics, pharmacology, the environment, natural resource management, and the simulation of complex processes in a wide variety of industries. (Author)

  9. Performativity, Fabrication and Trust: Exploring Computer-Mediated Moderation

    Science.gov (United States)

    Clapham, Andrew

    2013-01-01

    Based on research conducted in an English secondary school, this paper explores computer-mediated moderation as a performative tool. The Module Assessment Meeting (MAM) was the moderation approach under investigation. I mobilise ethnographic data generated by a key informant, and triangulated with that from other actors in the setting, in order to…

  10. Running Interactive Jobs on Peregrine | High-Performance Computing | NREL

    Science.gov (United States)

    Interactive jobs provide a shell prompt on a compute node, which allows users to execute commands and scripts as they would on the login nodes, with the work performed on the compute nodes rather than on the login nodes. This page provides instructions and examples of how to request an interactive session, start GUIs, etc., so that commands execute on the allocated compute node instead of on the login node.

  11. Performance Evaluation of a Mobile Wireless Computational Grid ...

    African Journals Online (AJOL)

    This work developed and simulated a mathematical model for a mobile wireless computational Grid architecture using networks of queuing theory. This was in order to evaluate the performance of the load-balancing three-tier hierarchical configuration. The throughput and resource utilization metrics were measured and the ...

  12. The performance of low-cost commercial cloud computing as an alternative in computational chemistry.

    Science.gov (United States)

    Thackston, Russell; Fortenberry, Ryan C

    2015-05-05

    The growth of commercial cloud computing (CCC) as a viable means of computational infrastructure is largely unexplored for the purposes of quantum chemistry. In this work, the PSI4 suite of computational chemistry programs is installed on five different types of Amazon Web Services CCC platforms. The performance for a set of electronically excited state single-point energies is compared between these CCC platforms and typical, "in-house" physical machines. Further considerations are made for the number of cores or virtual CPUs (vCPUs, for the CCC platforms), but no considerations are made for full parallelization of the program (even though parallelization of the BLAS library is implemented), complete high-performance computing cluster utilization, or steal time. Even with this most pessimistic view of the computations, CCC resources are shown to be more cost effective for significant numbers of typical quantum chemistry computations. Large numbers of large computations are still best utilized by more traditional means, but smaller-scale research may be more effectively undertaken through CCC services. © 2015 Wiley Periodicals, Inc.

  13. Performance evaluation of scientific programs on advanced architecture computers

    International Nuclear Information System (INIS)

    Walker, D.W.; Messina, P.; Baille, C.F.

    1988-01-01

    Recently a number of advanced architecture machines have become commercially available. These new machines promise better cost-performance than traditional computers, and some of them have the potential of competing with current supercomputers, such as the Cray X-MP, in terms of maximum performance. This paper describes an on-going project to evaluate a broad range of advanced architecture computers using a number of complete scientific application programs. The computers to be evaluated include distributed-memory machines such as the NCUBE, INTEL and Caltech/JPL hypercubes and the MEIKO computing surface; shared-memory, bus-architecture machines such as the Sequent Balance and the Alliant; very long instruction word machines such as the Multiflow Trace 7/200 computer; traditional supercomputers such as the Cray X-MP and Cray-2; and SIMD machines such as the Connection Machine. Currently 11 application codes from a number of scientific disciplines have been selected, although it is not intended to run all codes on all machines. Results are presented for two of the codes (QCD and missile tracking), and future work is proposed

  14. High-performance computing in accelerating structure design and analysis

    International Nuclear Information System (INIS)

    Li Zenghai; Folwell, Nathan; Ge Lixin; Guetz, Adam; Ivanov, Valentin; Kowalski, Marc; Lee, Lie-Quan; Ng, Cho-Kuen; Schussman, Greg; Stingelin, Lukas; Uplenchwar, Ravindra; Wolf, Michael; Xiao, Liling; Ko, Kwok

    2006-01-01

    Future high-energy accelerators such as the Next Linear Collider (NLC) will accelerate multi-bunch beams of high current and low emittance to obtain high luminosity, which put stringent requirements on the accelerating structures for efficiency and beam stability. While numerical modeling has been quite standard in accelerator R and D, designing the NLC accelerating structure required a new simulation capability because of the geometric complexity and level of accuracy involved. Under the US DOE Advanced Computing initiatives (first the Grand Challenge and now SciDAC), SLAC has developed a suite of electromagnetic codes based on unstructured grids and utilizing high-performance computing to provide an advanced tool for modeling structures at accuracies and scales previously not possible. This paper will discuss the code development and computational science research (e.g. domain decomposition, scalable eigensolvers, adaptive mesh refinement) that have enabled the large-scale simulations needed for meeting the computational challenges posed by the NLC as well as projects such as the PEP-II and RIA. Numerical results will be presented to show how high-performance computing has made a qualitative improvement in accelerator structure modeling for these accelerators, either at the component level (single cell optimization), or on the scale of an entire structure (beam heating and long-range wakefields)

  15. Static Memory Deduplication for Performance Optimization in Cloud Computing

    Directory of Open Access Journals (Sweden)

    Gangyong Jia

    2017-04-01

    Full Text Available In a cloud computing environment, the number of virtual machines (VMs on a single physical server and the number of applications running on each VM are continuously growing. This has led to an enormous increase in the demand of memory capacity and subsequent increase in the energy consumption in the cloud. Lack of enough memory has become a major bottleneck for scalability and performance of virtualization interfaces in cloud computing. To address this problem, memory deduplication techniques which reduce memory demand through page sharing are being adopted. However, such techniques suffer from overheads in terms of number of online comparisons required for the memory deduplication. In this paper, we propose a static memory deduplication (SMD technique which can reduce memory capacity requirement and provide performance optimization in cloud computing. The main innovation of SMD is that the process of page detection is performed offline, thus potentially reducing the performance cost, especially in terms of response time. In SMD, page comparisons are restricted to the code segment, which has the highest shared content. Our experimental results show that SMD efficiently reduces memory capacity requirement and improves performance. We demonstrate that, compared to other approaches, the cost in terms of the response time is negligible.
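
    A sketch of the offline page-detection idea described above (illustrative only, not the SMD implementation): split a memory region into fixed-size pages and group pages with identical contents, so that only one physical copy of each group would need to be kept. Here pages are grouped directly by their bytes; a real system would use a hash fingerprint followed by byte-wise verification before merging.

      #include <cstdio>
      #include <cstring>
      #include <string>
      #include <unordered_map>
      #include <vector>

      const std::size_t PAGE = 4096;

      int main() {
          // A fake "code segment": many identical pages plus one distinct page.
          std::vector<char> segment(16 * PAGE, 0x66);
          std::memset(&segment[5 * PAGE], 0x42, PAGE);

          // Map page contents -> list of page indices holding those contents.
          std::unordered_map<std::string, std::vector<std::size_t>> groups;
          for (std::size_t i = 0; i * PAGE < segment.size(); ++i)
              groups[std::string(&segment[i * PAGE], PAGE)].push_back(i);

          std::size_t total = segment.size() / PAGE, unique = groups.size();
          std::printf("pages=%zu unique=%zu shareable=%zu\n", total, unique, total - unique);
          return 0;
      }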

  16. Static Memory Deduplication for Performance Optimization in Cloud Computing.

    Science.gov (United States)

    Jia, Gangyong; Han, Guangjie; Wang, Hao; Yang, Xuan

    2017-04-27

    In a cloud computing environment, the number of virtual machines (VMs) on a single physical server and the number of applications running on each VM are continuously growing. This has led to an enormous increase in the demand of memory capacity and subsequent increase in the energy consumption in the cloud. Lack of enough memory has become a major bottleneck for scalability and performance of virtualization interfaces in cloud computing. To address this problem, memory deduplication techniques which reduce memory demand through page sharing are being adopted. However, such techniques suffer from overheads in terms of number of online comparisons required for the memory deduplication. In this paper, we propose a static memory deduplication (SMD) technique which can reduce memory capacity requirement and provide performance optimization in cloud computing. The main innovation of SMD is that the process of page detection is performed offline, thus potentially reducing the performance cost, especially in terms of response time. In SMD, page comparisons are restricted to the code segment, which has the highest shared content. Our experimental results show that SMD efficiently reduces memory capacity requirement and improves performance. We demonstrate that, compared to other approaches, the cost in terms of the response time is negligible.

  17. Computational Analysis on Performance of Thermal Energy Storage (TES) Diffuser

    Science.gov (United States)

    Adib, M. A. H. M.; Adnan, F.; Ismail, A. R.; Kardigama, K.; Salaam, H. A.; Ahmad, Z.; Johari, N. H.; Anuar, Z.; Azmi, N. S. N.

    2012-09-01

    Application of a thermal energy storage (TES) system reduces cost and energy consumption. The performance of the overall operation is affected by the diffuser design. In this study, computational analysis is used to determine the thermocline thickness. Three-dimensional simulations with different tank height-to-diameter ratios (HD), diffuser openings and numbers of diffuser holes are investigated. Simulations of medium-HD tanks with a double-ring octagonal diffuser show good thermocline behavior and a clear distinction between warm and cold water. The results show that the best thermocline-thickness performance during 50% charging time occurs in the medium tank with a height-to-diameter ratio of 4.0 and a double-ring octagonal diffuser with 48 holes (9 mm opening, ~60%), which is acceptable compared to diffusers with 6 mm (~40%) and 12 mm (~80%) openings. The conclusion is that computational analysis methods are very useful in studying the performance of thermal energy storage (TES).

  18. Computational Analysis on Performance of Thermal Energy Storage (TES) Diffuser

    International Nuclear Information System (INIS)

    Adib, M A H M; Ismail, A R; Kardigama, K; Salaam, H A; Ahmad, Z; Johari, N H; Anuar, Z; Azmi, N S N; Adnan, F

    2012-01-01

    Application of a thermal energy storage (TES) system reduces cost and energy consumption. The performance of the overall operation is affected by the diffuser design. In this study, computational analysis is used to determine the thermocline thickness. Three-dimensional simulations with different tank height-to-diameter ratios (HD), diffuser openings and numbers of diffuser holes are investigated. Simulations of medium-HD tanks with a double-ring octagonal diffuser show good thermocline behavior and a clear distinction between warm and cold water. The results show that the best thermocline-thickness performance during 50% charging time occurs in the medium tank with a height-to-diameter ratio of 4.0 and a double-ring octagonal diffuser with 48 holes (9 mm opening ∼ 60%), which is acceptable compared to diffusers with 6 mm ∼ 40% and 12 mm ∼ 80% openings. The conclusion is that computational analysis methods are very useful in studying the performance of thermal energy storage (TES).

  19. Benchmarking high performance computing architectures with CMS’ skeleton framework

    Science.gov (United States)

    Sexton-Kennedy, E.; Gartung, P.; Jones, C. D.

    2017-10-01

    In 2012 CMS evaluated which underlying concurrency technology would be the best to use for its multi-threaded framework. The available technologies were evaluated on the high-throughput computing systems dominating the resources in use at that time. A skeleton framework benchmarking suite that emulates the tasks performed within a CMSSW application was used to select Intel's Threading Building Blocks library, based on the measured overheads in both memory and CPU on the different technologies benchmarked. In 2016 CMS will get access to high performance computing resources that use new many-core architectures: machines such as Cori Phase 1&2, Theta, and Mira. Because of this we have revived the 2012 benchmark to test its performance and conclusions on these new architectures. This talk will discuss the results of this exercise.

  20. Rotary engine performance computer program (RCEMAP and RCEMAPPC): User's guide

    Science.gov (United States)

    Bartrand, Timothy A.; Willis, Edward A.

    1993-01-01

    This report is a user's guide for a computer code that simulates the performance of several rotary combustion engine configurations. It is intended to assist prospective users in getting started with RCEMAP and/or RCEMAPPC. RCEMAP (Rotary Combustion Engine performance MAP generating code) is the mainframe version, while RCEMAPPC is a simplified subset designed for the personal computer, or PC, environment. Both versions are based on an open, zero-dimensional combustion system model for the prediction of instantaneous pressures, temperature, chemical composition and other in-chamber thermodynamic properties. Both versions predict overall engine performance and thermal characteristics, including bmep, bsfc, exhaust gas temperature, average material temperatures, and turbocharger operating conditions. Required inputs include engine geometry, materials, constants for use in the combustion heat release model, and turbomachinery maps. Illustrative examples and sample input files for both versions are included.

  1. A performance model for the communication in fast multipole methods on high-performance computing platforms

    KAUST Repository

    Ibeid, Huda; Yokota, Rio; Keyes, David E.

    2016-01-01

    model and the actual communication time on four high-performance computing (HPC) systems, when latency, bandwidth, network topology, and multicore penalties are all taken into account. To our knowledge, this is the first formal characterization

  2. Scalability of DL_POLY on High Performance Computing Platform

    CSIR Research Space (South Africa)

    Mabakane, Mabule S

    2017-12-01

    Full Text Available SACJ 29(3) December... when using many processors within the compute nodes of the supercomputer. The type of the processors of compute nodes and their memory also play an important role in the overall performance of the parallel application running on a supercomputer. DL...

  3. Component-based software for high-performance scientific computing

    Energy Technology Data Exchange (ETDEWEB)

    Alexeev, Yuri; Allan, Benjamin A; Armstrong, Robert C; Bernholdt, David E; Dahlgren, Tamara L; Gannon, Dennis; Janssen, Curtis L; Kenny, Joseph P; Krishnan, Manojkumar; Kohl, James A; Kumfert, Gary; McInnes, Lois Curfman; Nieplocha, Jarek; Parker, Steven G; Rasmussen, Craig; Windus, Theresa L

    2005-01-01

    Recent advances in both computational hardware and multidisciplinary science have given rise to an unprecedented level of complexity in scientific simulation software. This paper describes an ongoing grass roots effort aimed at addressing complexity in high-performance computing through the use of Component-Based Software Engineering (CBSE). Highlights of the benefits and accomplishments of the Common Component Architecture (CCA) Forum and SciDAC ISIC are given, followed by an illustrative example of how the CCA has been applied to drive scientific discovery in quantum chemistry. Thrusts for future research are also described briefly.

  4. Performance measurements in 3D ideal magnetohydrodynamic stability computations

    International Nuclear Information System (INIS)

    Anderson, D.V.; Cooper, W.A.; Gruber, R.; Schwenn, U.

    1989-10-01

    The 3D ideal magnetohydrodynamic stability code TERPSICHORE has been designed to take advantage of the vector and microtasking capabilities of the latest CRAY computers. To keep the number of operations small, the most efficient algorithms have been applied in each computational step. The program investigates the stability properties of fusion-reactor-relevant plasma configurations confined by magnetic fields. For a typical 3D HELIAS configuration considered, we obtain an overall performance in excess of 1 Gflops on an eight-processor CRAY-YMP machine. (author) 3 figs., 1 tab., 11 refs

  5. Component-based software for high-performance scientific computing

    International Nuclear Information System (INIS)

    Alexeev, Yuri; Allan, Benjamin A; Armstrong, Robert C; Bernholdt, David E; Dahlgren, Tamara L; Gannon, Dennis; Janssen, Curtis L; Kenny, Joseph P; Krishnan, Manojkumar; Kohl, James A; Kumfert, Gary; McInnes, Lois Curfman; Nieplocha, Jarek; Parker, Steven G; Rasmussen, Craig; Windus, Theresa L

    2005-01-01

    Recent advances in both computational hardware and multidisciplinary science have given rise to an unprecedented level of complexity in scientific simulation software. This paper describes an ongoing grass roots effort aimed at addressing complexity in high-performance computing through the use of Component-Based Software Engineering (CBSE). Highlights of the benefits and accomplishments of the Common Component Architecture (CCA) Forum and SciDAC ISIC are given, followed by an illustrative example of how the CCA has been applied to drive scientific discovery in quantum chemistry. Thrusts for future research are also described briefly

  6. Nuclear forces and high-performance computing: The perfect match

    International Nuclear Information System (INIS)

    Luu, T; Walker-Loud, A

    2009-01-01

    High-performance computing is now enabling the calculation of certain hadronic interaction parameters directly from Quantum Chromodynamics, the quantum field theory that governs the behavior of quarks and gluons and is ultimately responsible for the nuclear strong force. In this paper we briefly describe the state of the field and show how other aspects of hadronic interactions will be ascertained in the near future. We give estimates of computational requirements needed to obtain these goals, and outline a procedure for incorporating these results into the broader nuclear physics community.

  7. Study on Parallel Processing for Efficient Flexible Multibody Analysis based on Subsystem Synthesis Method

    Energy Technology Data Exchange (ETDEWEB)

    Han, Jong-Boo; Song, Hajun; Kim, Sung-Soo [Chungnam Nat’l Univ., Daejeon (Korea, Republic of)

    2017-06-15

    Flexible multibody simulations are widely used in the industry to design mechanical systems. In flexible multibody dynamics, deformation coordinates are described either relatively in the body reference frame that is floating in the space or in the inertial reference frame. Moreover, these deformation coordinates are generated based on the discretization of the body according to the finite element approach. Therefore, the formulation of the flexible multibody system always deals with a huge number of degrees of freedom and the numerical solution methods require a substantial amount of computational time. Parallel computational methods are a solution for efficient computation. However, most of the parallel computational methods are focused on the efficient solution of large-sized linear equations. For multibody analysis, we need to develop an efficient formulation that could be suitable for parallel computation. In this paper, we developed a subsystem synthesis method for a flexible multibody system and proposed efficient parallel computational schemes based on the OpenMP API in order to achieve efficient computation. Simulations of a rotating blade system, which consists of three identical blades, were carried out with two different parallel computational schemes. Actual CPU times were measured to investigate the efficiency of the proposed parallel schemes.
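
    A minimal sketch of the kind of work-sharing scheme the abstract describes, assuming each subsystem (e.g., one rotor blade) can assemble and solve its own equations independently before a small serial coupling step through the base body. The names SubsystemState, assembleAndSolveSubsystem and coupleBaseBody are hypothetical placeholders rather than the authors' API; the OpenMP work-sharing pragma is the part being illustrated.

```cpp
#include <vector>
#include <omp.h>

// Hypothetical per-subsystem data (e.g., one blade's flexible coordinates).
struct SubsystemState {
    std::vector<double> q, qdot, qddot;   // generalized coordinates and derivatives
};

// Placeholder: assemble and solve one subsystem's equations of motion.
void assembleAndSolveSubsystem(SubsystemState& s, double t) {
    (void)s; (void)t;   // ... mass matrix, forces, accelerations for this subsystem ...
}

// Placeholder: couple the subsystem results through the shared base body.
void coupleBaseBody(std::vector<SubsystemState>& subsystems) {
    (void)subsystems;
}

void timeStep(std::vector<SubsystemState>& subsystems, double t) {
    // Independent subsystems (e.g., three identical blades) solved on separate threads.
    #pragma omp parallel for schedule(static)
    for (int i = 0; i < static_cast<int>(subsystems.size()); ++i) {
        assembleAndSolveSubsystem(subsystems[i], t);
    }
    // Small serial coupling step, as in a subsystem synthesis formulation.
    coupleBaseBody(subsystems);
}
```

    The efficiency of such a scheme depends on how much work stays inside the independent subsystem solves relative to the serial coupling step.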

  8. Computer-aided performance monitoring program at Diablo Canyon

    International Nuclear Information System (INIS)

    Nelson, T.; Glynn, R. III; Kessler, T.C.

    1992-01-01

    This paper describes the thermal performance monitoring program at Pacific Gas & Electric Company's (PG&E's) Diablo Canyon Nuclear Power Plant. The plant performance monitoring program at Diablo Canyon uses the THERMAC performance monitoring and analysis computer software provided by Expert-EASE Systems. THERMAC is used to collect performance data from the plant process computers, condition that data to adjust for measurement errors and missing data points, evaluate cycle and component-level performance, archive the data for trend analysis and generate performance reports. The current status of the program is that, after a fair amount of "tuning" of the basic "thermal kit" models provided with the initial THERMAC installation, we have successfully baselined both units to cycle isolation test data from previous reload cycles. Over the course of the past few months, we have accumulated enough data to generate meaningful performance trends and, as a result, have been able to use THERMAC to track a condenser fouling problem that was costing enough megawatts to attract corporate-level attention. Trends from THERMAC clearly related the megawatt loss to a steadily degrading condenser cleanliness factor and verified the subsequent gain in megawatts after the condenser was cleaned. In the future, we expect to rebaseline THERMAC to a beginning of cycle (BOC) data set and to use the program to help track feedwater nozzle fouling

  9. Performance Measurements in a High Throughput Computing Environment

    CERN Document Server

    AUTHOR|(CDS)2145966; Gribaudo, Marco

    The IT infrastructures of companies and research centres are implementing new technologies to satisfy the increasing need for computing resources for big data analysis. In this context, resource profiling plays a crucial role in identifying areas where the improvement of the utilisation efficiency is needed. In order to deal with the profiling and optimisation of computing resources, two complementary approaches can be adopted: the measurement-based approach and the model-based approach. The measurement-based approach gathers and analyses performance metrics by executing benchmark applications on computing resources. Instead, the model-based approach implies the design and implementation of a model as an abstraction of the real system, selecting only those aspects relevant to the study. This Thesis originates from a project carried out by the author within the CERN IT department. CERN is an international scientific laboratory that conducts fundamental research in the domain of elementary particle physics. The p...

  10. Overview of Parallel Platforms for Common High Performance Computing

    Directory of Open Access Journals (Sweden)

    T. Fryza

    2012-04-01

    Full Text Available The paper deals with various parallel platforms used for high performance computing in the signal processing domain. More precisely, methods exploiting multicore central processing units, such as the message passing interface and OpenMP, are taken into account. The properties of the programming methods are experimentally proved in the application of a fast Fourier transform and a discrete cosine transform and they are compared with the possibilities of MATLAB's built-in functions and Texas Instruments digital signal processors with very long instruction word architectures. New FFT and DCT implementations were proposed and tested. The implementation phase was compared with CPU-based computing methods and with the possibilities of the Texas Instruments digital signal processing library on C6747 floating-point DSPs. The optimal combination of computing methods in the signal processing domain and new, fast routines' implementation is proposed as well.
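
    As a rough illustration of the kind of multicore transform kernel benchmarked in the paper, the sketch below parallelizes a naive O(N^2) DCT-II over OpenMP threads. It is not the authors' implementation; a production code would use an FFT-based transform.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>
#include <omp.h>

// Naive O(N^2) DCT-II with the outer loop shared across OpenMP threads.
std::vector<double> dct2(const std::vector<double>& x) {
    const int n = static_cast<int>(x.size());
    const double pi = std::acos(-1.0);
    std::vector<double> X(n);
    #pragma omp parallel for schedule(static)
    for (int k = 0; k < n; ++k) {
        double sum = 0.0;
        for (int i = 0; i < n; ++i)
            sum += x[i] * std::cos(pi / n * (i + 0.5) * k);
        X[k] = sum;
    }
    return X;
}

int main() {
    std::vector<double> signal(4096);
    for (std::size_t i = 0; i < signal.size(); ++i)
        signal[i] = std::sin(0.01 * i);
    double t0 = omp_get_wtime();
    std::vector<double> spectrum = dct2(signal);
    double t1 = omp_get_wtime();
    std::printf("DCT of %zu samples took %.4f s, X[1] = %.3f\n",
                signal.size(), t1 - t0, spectrum[1]);
    return 0;
}
```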

  11. Heat exchanger performance analysis programs for the personal computer

    International Nuclear Information System (INIS)

    Putman, R.E.

    1992-01-01

    Numerous utility industry heat exchange calculations are repetitive and thus lend themselves to being performed on a Personal Computer. These programs may be regarded as engineering tools which, when put together, can form a Toolbox. However, the practicing Results Engineer in the utility industry desires programs that are not only robust and easy to use but can also be used on both desktop and laptop PCs. The latter also offer the opportunity to take the computer into the plant or control room, and use it there to process test or operating data right on the spot. Most programs evolve through the needs which arise in the course of day-to-day work. This paper describes several of the more useful programs of this type and outlines some of the guidelines to be followed when designing personal computer programs for use by the practicing Results Engineer

  12. The parallel processing system for fast 3D-CT image reconstruction by circular shifting float memory architecture

    International Nuclear Information System (INIS)

    Wang Shi; Kang Kejun; Wang Jingjin

    1996-01-01

    Computerized Tomography (CT) is expected to become an inevitable diagnostic technique in the future. However, the long time required to reconstruct an image has been one of the major drawbacks associated with this technique. Parallel processing is one of the best ways to solve this problem. This paper gives the architecture, hardware and software design of PIRS-4 (4-processor Parallel Image Reconstruction System), which is a parallel processing system for fast 3D-CT image reconstruction by circular shifting float memory architecture. It includes the structure and components of the system, the design of the crossbar switch and details of the control model, the description of RPBP image reconstruction, the choice of OS (Operating System) and language, the principle of imitating EMS, direct memory R/W of float and programming in protected mode. Finally, the test results are given

  13. Practical parallel computing

    CERN Document Server

    Morse, H Stephen

    1994-01-01

    Practical Parallel Computing provides information pertinent to the fundamental aspects of high-performance parallel processing. This book discusses the development of parallel applications on a variety of equipment.Organized into three parts encompassing 12 chapters, this book begins with an overview of the technology trends that converge to favor massively parallel hardware over traditional mainframes and vector machines. This text then gives a tutorial introduction to parallel hardware architectures. Other chapters provide worked-out examples of programs using several parallel languages. Thi

  14. High performance parallel computers for science: New developments at the Fermilab advanced computer program

    International Nuclear Information System (INIS)

    Nash, T.; Areti, H.; Atac, R.

    1988-08-01

    Fermilab's Advanced Computer Program (ACP) has been developing highly cost effective, yet practical, parallel computers for high energy physics since 1984. The ACP's latest developments are proceeding in two directions. A Second Generation ACP Multiprocessor System for experiments will include $3500 RISC processors each with performance over 15 VAX MIPS. To support such high performance, the new system allows parallel I/O, parallel interprocess communication, and parallel host processes. The ACP Multi-Array Processor, has been developed for theoretical physics. Each $4000 node is a FORTRAN or C programmable pipelined 20 MFlops (peak), 10 MByte single board computer. These are plugged into a 16 port crossbar switch crate which handles both inter and intra crate communication. The crates are connected in a hypercube. Site oriented applications like lattice gauge theory are supported by system software called CANOPY, which makes the hardware virtually transparent to users. A 256 node, 5 GFlop, system is under construction. 10 refs., 7 figs

  15. Simple, parallel, high-performance virtual machines for extreme computations

    International Nuclear Information System (INIS)

    Chokoufe Nejad, Bijan; Ohl, Thorsten; Reuter, Jurgen

    2014-11-01

    We introduce a high-performance virtual machine (VM) written in a numerically fast language like Fortran or C to evaluate very large expressions. We discuss the general concept of how to perform computations in terms of a VM and present specifically a VM that is able to compute tree-level cross sections for any number of external legs, given the corresponding byte code from the optimal matrix element generator, O'Mega. Furthermore, this approach allows one to formulate the parallel computation of a single phase space point in a simple and obvious way. We hereby analyze the scaling behaviour with multiple threads as well as the benefits and drawbacks introduced by this method. Our implementation of a VM can run faster than the corresponding native, compiled code for certain processes and compilers, especially for very high multiplicities, and generally has runtimes of the same order of magnitude. By avoiding the tedious compile and link steps, which may fail for source code files of gigabyte sizes, new processes or complex higher order corrections that are currently out of reach could be evaluated with a VM given enough computing power.
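
    The following toy stack machine illustrates the general idea of interpreting pre-generated byte code to evaluate an expression. The opcode set and encoding are invented for illustration and are unrelated to O'Mega's actual byte code format.

```cpp
#include <cstdio>
#include <vector>

// Invented opcode set for illustration only.
enum Op { PUSH_CONST, ADD, MUL, HALT };

struct Instr { Op op; double value; };

// A tiny stack machine: walks the instruction stream and evaluates it.
double run(const std::vector<Instr>& code) {
    std::vector<double> stack;
    for (const Instr& ins : code) {
        switch (ins.op) {
            case PUSH_CONST: stack.push_back(ins.value); break;
            case ADD: { double b = stack.back(); stack.pop_back();
                        stack.back() += b; break; }
            case MUL: { double b = stack.back(); stack.pop_back();
                        stack.back() *= b; break; }
            case HALT: return stack.back();
        }
    }
    return stack.back();
}

int main() {
    // Byte code for (2 + 3) * 4, standing in for a much larger expression.
    std::vector<Instr> code = {
        {PUSH_CONST, 2.0}, {PUSH_CONST, 3.0}, {ADD, 0.0},
        {PUSH_CONST, 4.0}, {MUL, 0.0}, {HALT, 0.0}
    };
    std::printf("result = %g\n", run(code));
    return 0;
}
```

    Because the byte code is data rather than compiled source, many instances of such an interpreter can evaluate different phase space points in parallel without any compile or link step.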

  16. Performance monitoring for brain-computer-interface actions.

    Science.gov (United States)

    Schurger, Aaron; Gale, Steven; Gozel, Olivia; Blanke, Olaf

    2017-02-01

    When presented with a difficult perceptual decision, human observers are able to make metacognitive judgements of subjective certainty. Such judgements can be made independently of and prior to any overt response to a sensory stimulus, presumably via internal monitoring. Retrospective judgements about one's own task performance, on the other hand, require first that the subject perform a task and thus could potentially be made based on motor processes, proprioceptive, and other sensory feedback rather than internal monitoring. With this dichotomy in mind, we set out to study performance monitoring using a brain-computer interface (BCI), with which subjects could voluntarily perform an action - moving a cursor on a computer screen - without any movement of the body, and thus without somatosensory feedback. Real-time visual feedback was available to subjects during training, but not during the experiment where the true final position of the cursor was only revealed after the subject had estimated where s/he thought it had ended up after 6s of BCI-based cursor control. During the first half of the experiment subjects based their assessments primarily on the prior probability of the end position of the cursor on previous trials. However, during the second half of the experiment subjects' judgements moved significantly closer to the true end position of the cursor, and away from the prior. This suggests that subjects can monitor task performance when the task is performed without overt movement of the body. Copyright © 2016 Elsevier Inc. All rights reserved.

  17. High performance stream computing for particle beam transport simulations

    International Nuclear Information System (INIS)

    Appleby, R; Bailey, D; Higham, J; Salt, M

    2008-01-01

    Understanding modern particle accelerators requires simulating charged particle transport through the machine elements. These simulations can be very time consuming due to the large number of particles and the need to consider many turns of a circular machine. Stream computing offers an attractive way to dramatically improve the performance of such simulations by calculating the simultaneous transport of many particles using dedicated hardware. Modern Graphics Processing Units (GPUs) are powerful and affordable stream computing devices. The results of simulations of particle transport through the booster-to-storage-ring transfer line of the DIAMOND synchrotron light source using an NVidia GeForce 7900 GPU are compared to the standard transport code MAD. It is found that particle transport calculations are suitable for stream processing and large performance increases are possible. The accuracy and potential speed gains are compared and the prospects for future work in the area are discussed

  18. Unravelling the structure of matter on high-performance computers

    International Nuclear Information System (INIS)

    Kieu, T.D.; McKellar, B.H.J.

    1992-11-01

    The various phenomena and the different forms of matter in nature are believed to be the manifestation of only a handful of fundamental building blocks - the elementary particles - which interact through the four fundamental forces. In the study of the structure of matter at this level one has to consider forces which are not sufficiently weak to be treated as small perturbations to the system, an example of which is the strong force that binds the nucleons together. High-performance computers, both vector and parallel machines, have facilitated the necessary non-perturbative treatments. The principles and the techniques of computer simulations applied to Quantum Chromodynamics are explained; examples include the strong interactions, the calculation of the mass of nucleons and their decay rates. Some commercial and special-purpose high-performance machines for such calculations are also mentioned. 3 refs., 2 tabs

  19. A performance evaluation of the IBM 370/XT personal computer

    Science.gov (United States)

    Dominick, Wayne D. (Editor); Triantafyllopoulos, Spiros

    1984-01-01

    An evaluation of the IBM 370/XT personal computer is given. This evaluation focuses primarily on the use of the 370/XT for scientific and technical applications and applications development. A measurement of the capabilities of the 370/XT was performed by means of test programs which are presented. Also included is a review of facilities provided by the operating system (VM/PC), along with comments on the IBM 370/XT hardware configuration.

  20. Neuroanatomical correlates of brain-computer interface performance.

    Science.gov (United States)

    Kasahara, Kazumi; DaSalla, Charles Sayo; Honda, Manabu; Hanakawa, Takashi

    2015-04-15

    Brain-computer interfaces (BCIs) offer a potential means to replace or restore lost motor function. However, BCI performance varies considerably between users, the reasons for which are poorly understood. Here we investigated the relationship between sensorimotor rhythm (SMR)-based BCI performance and brain structure. Participants were instructed to control a computer cursor using right- and left-hand motor imagery, which primarily modulated their left- and right-hemispheric SMR powers, respectively. Although most participants were able to control the BCI with success rates significantly above chance level even at the first encounter, they also showed substantial inter-individual variability in BCI success rate. Participants also underwent T1-weighted three-dimensional structural magnetic resonance imaging (MRI). The MRI data were subjected to voxel-based morphometry using BCI success rate as an independent variable. We found that BCI performance correlated with gray matter volume of the supplementary motor area, supplementary somatosensory area, and dorsal premotor cortex. We suggest that SMR-based BCI performance is associated with development of non-primary somatosensory and motor areas. Advancing our understanding of BCI performance in relation to its neuroanatomical correlates may lead to better customization of BCIs based on individual brain structure. Copyright © 2015 Elsevier Inc. All rights reserved.

  1. High performance computer code for molecular dynamics simulations

    International Nuclear Information System (INIS)

    Levay, I.; Toekesi, K.

    2007-01-01

    Complete text of publication follows. Molecular Dynamics (MD) simulation is a widely used technique for modeling complicated physical phenomena. Since 2005 we have been developing an MD simulation code for PC computers. The computer code is written in the C++ object-oriented programming language. The aim of our work is twofold: a) to develop a fast computer code for the study of the random walk of guest atoms in a Be crystal, and b) to visualize the particles' motion in three dimensions (3D). In this case we mimic the motion of the guest atoms in the crystal (diffusion-type motion) and the motion of atoms in the crystal lattice (crystal deformation). Nowadays, it is common to use graphics devices for intensive computational problems. There are several ways to use this extreme processing performance, but programming these devices has never been as easy as it is now. The CUDA (Compute Unified Device Architecture) introduced by nVidia Corporation in 2007 is very useful for every processor-hungry application. A unified-architecture GPU includes 96-128 or more stream processors, so the raw calculation performance is 576(!) GFLOPS. It is ten times faster than the fastest dual-core CPU [Fig. 1]. Our improved MD simulation software uses this new technology, which speeds it up so that the code runs 10 times faster in the critical calculation segment. Although the GPU is a very powerful tool, it has a strongly parallel structure. This means that we have to create an algorithm that works on several processors without deadlock. Our code currently uses 256 threads and shared and constant on-chip memory instead of global memory, which is about 100 times slower. It is possible to implement the entire algorithm on the GPU, therefore we do not need to download and upload the data in every iteration. For maximal throughput, every thread runs the same instructions
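
    As a hedged, CPU-only illustration of the kind of pairwise-force loop that dominates such an MD step (the "critical calculation segment" above), the sketch below shares a naive O(N^2) Lennard-Jones force evaluation across OpenMP threads. It stands in for, and is not, the authors' CUDA kernel; units, parameters and the integration scheme are simplified.

```cpp
#include <vector>
#include <omp.h>

struct Vec3 { double x, y, z; };

// One simple explicit step: naive O(N^2) Lennard-Jones forces (reduced units,
// unit mass), then an update of velocities and positions.
void mdStep(std::vector<Vec3>& pos, std::vector<Vec3>& vel, double dt) {
    const int n = static_cast<int>(pos.size());
    std::vector<Vec3> force(n, {0.0, 0.0, 0.0});

    // The expensive pair loop: each particle's force row is independent.
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < n; ++i) {
        for (int j = 0; j < n; ++j) {
            if (i == j) continue;
            double dx = pos[i].x - pos[j].x;
            double dy = pos[i].y - pos[j].y;
            double dz = pos[i].z - pos[j].z;
            double r2 = dx * dx + dy * dy + dz * dz + 1e-12;
            double inv6 = 1.0 / (r2 * r2 * r2);
            double f = 24.0 * (2.0 * inv6 * inv6 - inv6) / r2;  // repulsive when positive
            force[i].x += f * dx; force[i].y += f * dy; force[i].z += f * dz;
        }
    }
    for (int i = 0; i < n; ++i) {
        vel[i].x += force[i].x * dt; vel[i].y += force[i].y * dt; vel[i].z += force[i].z * dt;
        pos[i].x += vel[i].x * dt;   pos[i].y += vel[i].y * dt;   pos[i].z += vel[i].z * dt;
    }
}
```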

  2. Scalability of DL_POLY on High Performance Computing Platform

    Directory of Open Access Journals (Sweden)

    Mabule Samuel Mabakane

    2017-12-01

    Full Text Available This paper presents a case study on the scalability of several versions of the molecular dynamics code (DL_POLY) performed on South Africa's Centre for High Performance Computing e1350 IBM Linux cluster, Sun system and Lengau supercomputers. Within this study, different problem sizes were designed and the same chosen systems were employed in order to test the performance of DL_POLY using weak and strong scalability. It was found that the speed-up results for the small systems were better than those for large systems on both the Ethernet and InfiniBand networks. However, simulations of large systems in DL_POLY performed well using the InfiniBand network on the Lengau cluster as compared to the e1350 and Sun supercomputers.

  3. Scintillator performance considerations for dedicated breast computed tomography

    Science.gov (United States)

    Vedantham, Srinivasan; Shi, Linxi; Karellas, Andrew

    2017-09-01

    Dedicated breast computed tomography (BCT) is an emerging clinical modality that can eliminate tissue superposition and has the potential for improved sensitivity and specificity for breast cancer detection and diagnosis. It is performed without physical compression of the breast. Most of the dedicated BCT systems use large-area detectors operating in cone-beam geometry and are referred to as cone-beam breast CT (CBBCT) systems. The large-area detectors in CBBCT systems are energy-integrating, indirect-type detectors employing a scintillator that converts x-ray photons to light, followed by detection of optical photons. A key consideration that determines the image quality achieved by such CBBCT systems is the choice of scintillator and its performance characteristics. In this work, a framework for analyzing the impact of the scintillator on CBBCT performance and its use for task-specific optimization of CBBCT imaging performance is described.

  4. What Physicists Should Know About High Performance Computing - Circa 2002

    Science.gov (United States)

    Frederick, Donald

    2002-08-01

    High Performance Computing (HPC) is a dynamic, cross-disciplinary field that traditionally has involved applied mathematicians, computer scientists, and others primarily from the various disciplines that have been major users of HPC resources - physics, chemistry, engineering, with increasing use by those in the life sciences. There is a technological dynamic that is powered by economic as well as by technical innovations and developments. This talk will discuss practical ideas to be considered when developing numerical applications for research purposes. Even with the rapid pace of development in the field, the author believes that these concepts will not become obsolete for a while, and will be of use to scientists who either are considering, or who have already started down the HPC path. These principles will be applied in particular to current parallel HPC systems, but there will also be references of value to desktop users. The talk will cover such topics as: computing hardware basics, single-cpu optimization, compilers, timing, numerical libraries, debugging and profiling tools and the emergence of Computational Grids.

  5. Computational Fluid Dynamics (CFD) Computations With Zonal Navier-Stokes Flow Solver (ZNSFLOW) Common High Performance Computing Scalable Software Initiative (CHSSI) Software

    National Research Council Canada - National Science Library

    Edge, Harris

    1999-01-01

    ...), computational fluid dynamics (CFD) 6 project. Under the project, a proven zonal Navier-Stokes solver was rewritten for scalable parallel performance on both shared memory and distributed memory high performance computers...

  6. High performance computing in science and engineering '09: transactions of the High Performance Computing Center, Stuttgart (HLRS) 2009

    National Research Council Canada - National Science Library

    Nagel, Wolfgang E; Kröner, Dietmar; Resch, Michael

    2010-01-01

    ...), NIC/JSC (Jülich), and LRZ (Munich). As part of that strategic initiative, in May 2009 NIC/JSC already installed the first phase of the GCS HPC Tier-0 resources, an IBM Blue Gene/P with roughly 300,000 cores, this time in Jülich. With that, the GCS provides the most powerful high-performance computing infrastructure in Europe alread...

  7. Using Monte Carlo techniques and parallel processing for debris hazard analysis of rocket systems

    Energy Technology Data Exchange (ETDEWEB)

    LaFarge, R.A.

    1994-02-01

    Sandia National Laboratories has been involved with rocket systems for many years. Some of these systems have carried high explosive onboard, while others have had FTS for destruction purposes whenever a potential hazard is detected. Recently, Sandia has also been involved with flight tests in which a target vehicle is intentionally destroyed by a projectile. Such endeavors always raise questions about the safety of personnel and the environment in the event of a premature detonation of the explosive or an activation of the FTS, as well as intentional vehicle destruction. Previous attempts to investigate fragmentation hazards for similar configurations have analyzed fragment size and shape in detail but have computed only a limited number of trajectories to determine the probabilities of impact and casualty expectations. A computer program SAFETIE has been written in support of various SNL flight experiments to compute better approximations of the hazards. SAFETIE uses the AMEER trajectory computer code and the Engineering Sciences Center LAN of Sun workstations to determine more realistically the probability of impact for an arbitrary number of exclusion areas. The various debris generation models are described.
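
    A toy illustration of the underlying idea: sample many fragment trajectories in parallel and count how often they land inside an exclusion area. The ballistic model (flat earth, no drag), the sampling distributions, and insideExclusionArea() are invented for illustration and bear no relation to the AMEER code or SAFETIE's actual models.

```cpp
#include <cmath>
#include <cstdio>
#include <random>
#include <omp.h>

// Toy exclusion-area test: a circle of radius 500 m centred 2 km downrange.
bool insideExclusionArea(double x, double y) {
    double dx = x - 2000.0, dy = y;
    return dx * dx + dy * dy < 500.0 * 500.0;
}

int main() {
    const int samples = 1000000;
    long hits = 0;

    // Each thread samples fragment velocities and flies a simple ballistic arc.
    #pragma omp parallel reduction(+:hits)
    {
        std::mt19937 rng(12345 + omp_get_thread_num());
        std::normal_distribution<double> speed(300.0, 50.0);   // m/s
        std::uniform_real_distribution<double> elev(0.1, 1.4); // rad
        std::uniform_real_distribution<double> azim(-0.2, 0.2);

        #pragma omp for
        for (int i = 0; i < samples; ++i) {
            double v = speed(rng), th = elev(rng), ph = azim(rng);
            double t = 2.0 * v * std::sin(th) / 9.81;          // time of flight
            double range = v * std::cos(th) * t;               // flat earth, no drag
            double x = range * std::cos(ph), y = range * std::sin(ph);
            if (insideExclusionArea(x, y)) ++hits;
        }
    }
    std::printf("P(impact in exclusion area) ~= %.4f\n",
                static_cast<double>(hits) / samples);
    return 0;
}
```

    Distributing such independent trajectory samples across processors is what makes it practical to move from a limited number of trajectories to statistically meaningful impact probabilities.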

  8. Information-Limited Parallel Processing in Difficult Heterogeneous Covert Visual Search

    Science.gov (United States)

    Dosher, Barbara Anne; Han, Songmei; Lu, Zhong-Lin

    2010-01-01

    Difficult visual search is often attributed to time-limited serial attention operations, although neural computations in the early visual system are parallel. Using probabilistic search models (Dosher, Han, & Lu, 2004) and a full time-course analysis of the dynamics of covert visual search, we distinguish unlimited capacity parallel versus serial…

  9. High performance computing and communications: FY 1995 implementation plan

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1994-04-01

    The High Performance Computing and Communications (HPCC) Program was formally established following passage of the High Performance Computing Act of 1991 signed on December 9, 1991. Ten federal agencies in collaboration with scientists and managers from US industry, universities, and laboratories have developed the HPCC Program to meet the challenges of advancing computing and associated communications technologies and practices. This plan provides a detailed description of the agencies' HPCC implementation plans for FY 1994 and FY 1995. This Implementation Plan contains three additional sections. Section 3 provides an overview of the HPCC Program definition and organization. Section 4 contains a breakdown of the five major components of the HPCC Program, with an emphasis on the overall directions and milestones planned for each one. Section 5 provides a detailed look at HPCC Program activities within each agency. Although the Department of Education is an official HPCC agency, its current funding and reporting of crosscut activities goes through the Committee on Education and Health Resources, not the HPCC Program. For this reason the Implementation Plan covers nine HPCC agencies.

  10. High Performance Computing - Power Application Programming Interface Specification.

    Energy Technology Data Exchange (ETDEWEB)

    Laros, James H.,; Kelly, Suzanne M.; Pedretti, Kevin; Grant, Ryan; Olivier, Stephen Lecler; Levenhagen, Michael J.; DeBonis, David

    2014-08-01

    Measuring and controlling the power and energy consumption of high performance computing systems by various components in the software stack is an active research area [13, 3, 5, 10, 4, 21, 19, 16, 7, 17, 20, 18, 11, 1, 6, 14, 12]. Implementations in lower level software layers are beginning to emerge in some production systems, which is very welcome. To be most effective, a portable interface to measurement and control features would significantly facilitate participation by all levels of the software stack. We present a proposal for a standard power Application Programming Interface (API) that endeavors to cover the entire software space, from generic hardware interfaces to the input from the computer facility manager.

  11. Performance of an extrapolation chamber in computed tomography standard beams

    International Nuclear Information System (INIS)

    Castro, Maysa C.; Silva, Natália F.; Caldas, Linda V.E.

    2017-01-01

    Among the medical uses of ionizing radiation, the computed tomography (CT) diagnostic exams are responsible for the highest dose values to the patients. The dosimetry procedure in CT scanner beams makes use of pencil ionization chambers with sensitive volume lengths of 10 cm. The aim of its calibration is to compare the values that are obtained with the instrument to be calibrated and a standard reference system. However, there is no primary standard system for this kind of radiation beam. Therefore, an extrapolation ionization chamber built at the Calibration Laboratory (LCI), was used to establish a CT primary standard. The objective of this work was to perform some characterization tests (short- and medium-term stabilities, saturation curve, polarity effect and ion collection efficiency) in the standard X-ray beams established for computed tomography at the LCI. (author)

  12. Performance of an extrapolation chamber in computed tomography standard beams

    Energy Technology Data Exchange (ETDEWEB)

    Castro, Maysa C.; Silva, Natália F.; Caldas, Linda V.E., E-mail: mcastro@ipen.br [Instituto de Pesquisas Energeticas e Nucleares (IPEN/CNEN-SP), Sao Paulo, SP (Brazil)

    2017-07-01

    Among the medical uses of ionizing radiation, the computed tomography (CT) diagnostic exams are responsible for the highest dose values to the patients. The dosimetry procedure in CT scanner beams makes use of pencil ionization chambers with sensitive volume lengths of 10 cm. The aim of its calibration is to compare the values that are obtained with the instrument to be calibrated and a standard reference system. However, there is no primary standard system for this kind of radiation beam. Therefore, an extrapolation ionization chamber built at the Calibration Laboratory (LCI), was used to establish a CT primary standard. The objective of this work was to perform some characterization tests (short- and medium-term stabilities, saturation curve, polarity effect and ion collection efficiency) in the standard X-ray beams established for computed tomography at the LCI. (author)

  13. Evaluating computer program performance on the CRAY-1

    International Nuclear Information System (INIS)

    Rudsinski, L.; Pieper, G.W.

    1979-01-01

    The Advanced Scientific Computers Project of Argonne's Applied Mathematics Division has two objectives: to evaluate supercomputers and to determine their effect on Argonne's computing workload. Initial efforts have focused on the CRAY-1, which is the only advanced computer currently available. Users from seven Argonne divisions executed test programs on the CRAY and made performance comparisons with the IBM 370/195 at Argonne. This report describes these experiences and discusses various techniques for improving run times on the CRAY. Direct translations of code from scalar to vector processor reduced running times as much as two-fold, and this reduction will become more pronounced as the CRAY compiler is developed. Further improvement (two- to ten-fold) was realized by making minor code changes to facilitate compiler recognition of the parallel and vector structure within the programs. Finally, extensive rewriting of the FORTRAN code structure reduced execution times dramatically, in three cases by a factor of more than 20; and even greater reduction should be possible by changing algorithms within a production code. It is concluded that the CRAY-1 would be of great benefit to Argonne researchers. Existing codes could be modified with relative ease to run significantly faster than on the 370/195. More important, the CRAY would permit scientists to investigate complex problems currently deemed infeasible on traditional scalar machines. Finally, an interface between the CRAY-1 and IBM computers such as the 370/195, scheduled by Cray Research for the first quarter of 1979, would considerably facilitate the task of integrating the CRAY into Argonne's Central Computing Facility. 13 tables
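
    The report's point about minor restructuring that helps the compiler recognize vectorizable structure can be shown with a small, generic before/after (written in C++ here; the Argonne codes were FORTRAN). The example splits a loop-carried recurrence from the independent element-wise work so the latter can be vectorized; it is not taken from the codes discussed in the report.

```cpp
#include <vector>

// Harder to vectorize: every iteration depends on the previous one through
// the running variable "accum" used inside the update.
void scaleWithRunningSum(std::vector<double>& a, const std::vector<double>& b) {
    double accum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        accum += b[i];
        a[i] = a[i] * accum;   // depends on accum from all earlier iterations
    }
}

// Vector-friendly restructuring: isolate the recurrence (prefix sum) in one
// loop, leaving a second loop of fully independent iterations.
void scaleWithRunningSumVectorizable(std::vector<double>& a, const std::vector<double>& b) {
    std::vector<double> prefix(b.size());
    double accum = 0.0;
    for (std::size_t i = 0; i < b.size(); ++i) {   // still sequential (prefix sum)
        accum += b[i];
        prefix[i] = accum;
    }
    for (std::size_t i = 0; i < a.size(); ++i)     // independent iterations: vectorizable
        a[i] *= prefix[i];
}
```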

  14. Parallel computation for distributed parameter system-from vector processors to Adena computer

    Energy Technology Data Exchange (ETDEWEB)

    Nogi, T

    1983-04-01

    Research on advanced parallel hardware and software architectures for very high-speed computation deserves and needs more support and attention to fulfil its promise. Novel architectures for parallel processing are being made ready. Architectures for parallel processing can be roughly divided into two groups. One is a vector processor in which a single central processing unit involves multiple vector-arithmetic registers. The other is a processor array in which slave processors are connected to a host processor to perform parallel computation. In this review, the concept and data structure of the Adena (alternating-direction edition nexus array) architecture, which is conformable to distributed-parameter simulation algorithms, are described. 5 references.

  15. The parallel processing impact in the optimization of the reactors neutronic by genetic algorithms

    International Nuclear Information System (INIS)

    Pereira, Claudio M.N.A.; Universidade Federal, Rio de Janeiro, RJ; Lapa, Celso M.F.; Mol, Antonio C.A.

    2002-01-01

    Nowadays, many optimization problems found in nuclear engineering have been solved through genetic algorithms (GA). The robustness of such methods is strongly related to the nature of the search process, which is based on populations of candidate solutions, and this fact implies a high computational cost in the optimization process. The use of GA becomes more critical when the evaluation of a candidate solution is highly time-consuming. Problems of this nature are common in nuclear engineering; an example is reactor design optimization, where neutronic codes, which consume considerable CPU time, must be run. Aiming to investigate the impact of the use of parallel computation in the solution, through GA, of a reactor design optimization problem, a parallel genetic algorithm (PGA) using the Island Model was developed. Exhaustive experiments, amounting to more than 1500 processing hours on 550 MHz personal computers, have been carried out in order to compare the conventional GA with the PGA. Such experiments have demonstrated the superiority of the PGA not only in terms of execution time, but also in the optimization results. (author)
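
    A compact sketch of the island model: each thread evolves its own sub-population and periodically exchanges its best individual with a neighbour island. The fitness function below is a trivial stand-in for the expensive neutronic evaluation, and selection/crossover are collapsed into elitist mutation to keep the sketch short; none of this is the authors' implementation.

```cpp
#include <algorithm>
#include <cstdio>
#include <random>
#include <vector>
#include <omp.h>

// Stand-in for the expensive candidate evaluation (a neutronics code run in the paper).
double fitness(const std::vector<double>& x) {
    double s = 0.0;
    for (double v : x) s -= (v - 0.3) * (v - 0.3);   // optimum at every gene == 0.3
    return s;
}

int main() {
    const int islands = 4, popSize = 40, genes = 8, generations = 200, migrateEvery = 25;
    std::vector<std::vector<double>> migrants(islands, std::vector<double>(genes, 0.5));

    #pragma omp parallel for num_threads(islands)
    for (int isl = 0; isl < islands; ++isl) {
        std::mt19937 rng(100 + isl);
        std::uniform_real_distribution<double> u(0.0, 1.0);
        std::normal_distribution<double> mut(0.0, 0.05);

        std::vector<std::vector<double>> pop(popSize, std::vector<double>(genes));
        for (auto& ind : pop) for (double& g : ind) g = u(rng);

        for (int gen = 0; gen < generations; ++gen) {
            std::sort(pop.begin(), pop.end(),
                      [](const std::vector<double>& a, const std::vector<double>& b) {
                          return fitness(a) > fitness(b); });
            for (int i = 1; i < popSize; ++i) {          // refill from the elite
                pop[i] = pop[0];
                for (double& g : pop[i]) g += mut(rng);
            }
            if (gen % migrateEvery == 0) {               // exchange best individuals
                #pragma omp critical
                {
                    pop[popSize - 1] = migrants[(isl + 1) % islands];
                    migrants[isl] = pop[0];
                }
            }
        }
        #pragma omp critical
        std::printf("island %d best fitness = %.6f\n", isl, fitness(pop[0]));
    }
    return 0;
}
```

    The appeal of the island model for expensive evaluations is that islands run almost independently, so communication stays rare and cheap compared with the candidate evaluations themselves.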

  16. A checkpoint compression study for high-performance computing systems

    Energy Technology Data Exchange (ETDEWEB)

    Ibtesham, Dewan [Univ. of New Mexico, Albuquerque, NM (United States). Dept. of Computer Science; Ferreira, Kurt B. [Sandia National Laboratories (SNL-NM), Albuquerque, NM (United States). Scalable System Software Dept.; Arnold, Dorian [Univ. of New Mexico, Albuquerque, NM (United States). Dept. of Computer Science

    2015-02-17

    As high-performance computing systems continue to increase in size and complexity, higher failure rates and increased overheads for checkpoint/restart (CR) protocols have raised concerns about the practical viability of CR protocols for future systems. Previously, compression has proven to be a viable approach for reducing checkpoint data volumes and, thereby, reducing CR protocol overhead and leading to improved application performance. In this article, we further explore compression-based CR optimization by exploring its baseline performance and scaling properties, evaluating whether improved compression algorithms might lead to even better application performance and comparing checkpoint compression against and alongside other software- and hardware-based optimizations. The highlights of our results are: (1) compression is a very viable CR optimization; (2) generic, text-based compression algorithms appear to perform near optimally for checkpoint data compression and faster compression algorithms will not lead to better application performance; (3) compression-based optimizations fare well against and alongside other software-based optimizations; and (4) while hardware-based optimizations outperform software-based ones, they are not as cost effective.
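
    A minimal sketch of the basic software-level optimization the article evaluates: run the raw checkpoint bytes through a generic compressor (zlib here) before writing them to stable storage. The buffer contents and file layout are invented; real CR libraries integrate compression into their own checkpoint formats.

```cpp
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <zlib.h>   // link with -lz

int main() {
    // Pretend checkpoint state: a large array of doubles (smooth data compresses well).
    std::vector<double> state(1 << 20);
    for (std::size_t i = 0; i < state.size(); ++i) state[i] = 1.0 + 1e-6 * i;

    const Bytef* src = reinterpret_cast<const Bytef*>(state.data());
    uLong srcLen = static_cast<uLong>(state.size() * sizeof(double));

    // Compress the raw checkpoint bytes before writing them out.
    uLongf dstLen = compressBound(srcLen);
    std::vector<Bytef> compressed(dstLen);
    if (compress(compressed.data(), &dstLen, src, srcLen) != Z_OK) {
        std::fprintf(stderr, "compression failed\n");
        return EXIT_FAILURE;
    }

    std::FILE* f = std::fopen("checkpoint.z", "wb");
    if (!f) return EXIT_FAILURE;
    std::fwrite(compressed.data(), 1, dstLen, f);
    std::fclose(f);

    std::printf("checkpoint: %lu bytes raw, %lu bytes compressed (%.1f%%)\n",
                static_cast<unsigned long>(srcLen), static_cast<unsigned long>(dstLen),
                100.0 * dstLen / srcLen);
    return 0;
}
```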

  17. Resilient and Robust High Performance Computing Platforms for Scientific Computing Integrity

    Energy Technology Data Exchange (ETDEWEB)

    Jin, Yier [Univ. of Central Florida, Orlando, FL (United States)

    2017-07-14

    As technology advances, computer systems are subject to increasingly sophisticated cyber-attacks that compromise both their security and integrity. High performance computing platforms used in commercial and scientific applications involving sensitive, or even classified data, are frequently targeted by powerful adversaries. This situation is made worse by a lack of fundamental security solutions that both perform efficiently and are effective at preventing threats. Current security solutions fail to address the threat landscape and ensure the integrity of sensitive data. As challenges rise, both private and public sectors will require robust technologies to protect their computing infrastructure. The research outcomes from this project address these challenges. For example, we present LAZARUS, a novel technique to harden kernel Address Space Layout Randomization (KASLR) against paging-based side-channel attacks. In particular, our scheme allows for fine-grained protection of the virtual memory mappings that implement the randomization. We demonstrate the effectiveness of our approach by hardening a recent Linux kernel with LAZARUS, mitigating all of the previously presented side-channel attacks on KASLR. Our extensive evaluation shows that LAZARUS incurs only 0.943% overhead for standard benchmarks, and is therefore highly practical. We also introduce HA2lloc, a hardware-assisted allocator that is capable of leveraging an extended memory management unit to detect memory errors in the heap. We test HA2lloc in a simulation environment and find that the approach is capable of preventing common memory vulnerabilities.

  18. Review of quantum computation

    International Nuclear Information System (INIS)

    Lloyd, S.

    1992-01-01

    Digital computers are machines that can be programmed to perform logical and arithmetical operations. Contemporary digital computers are "universal," in the sense that a program that runs on one computer can, if properly compiled, run on any other computer that has access to enough memory space and time. Any one universal computer can simulate the operation of any other; and the set of tasks that any such machine can perform is common to all universal machines. Since Bennett's discovery that computation can be carried out in a non-dissipative fashion, a number of Hamiltonian quantum-mechanical systems have been proposed whose time-evolutions over discrete intervals are equivalent to those of specific universal computers. The first quantum-mechanical treatment of computers was given by Benioff, who exhibited a Hamiltonian system with a basis whose members corresponded to the logical states of a Turing machine. In order to make the Hamiltonian local, in the sense that its structure depended only on the part of the computation being performed at that time, Benioff found it necessary to make the Hamiltonian time-dependent. Feynman discovered a way to make the computational Hamiltonian both local and time-independent by incorporating the direction of computation in the initial condition. In Feynman's quantum computer, the program is a carefully prepared wave packet that propagates through different computational states. Deutsch presented a quantum computer that exploits the possibility of existing in a superposition of computational states to perform tasks that a classical computer cannot, such as generating purely random numbers, and carrying out superpositions of computations as a method of parallel processing. In this paper, we show that such computers, by virtue of their common function, possess a common form for their quantum dynamics

  19. Fast Performance Computing Model for Smart Distributed Power Systems

    Directory of Open Access Journals (Sweden)

    Umair Younas

    2017-06-01

    Full Text Available Plug-in Electric Vehicles (PEVs) are becoming the more prominent solution compared to fossil fuel car technology due to their significant role in Greenhouse Gas (GHG) reduction, flexible storage, and ancillary service provision as a Distributed Generation (DG) resource in Vehicle to Grid (V2G) regulation mode. However, large-scale penetration of PEVs and the growing demand of energy-intensive Data Centers (DCs) bring undesirable higher load peaks in electricity demand and hence impose supply-demand imbalance and threaten the reliability of the wholesale and retail power market. In order to overcome the aforementioned challenges, the proposed research considers a smart Distributed Power System (DPS) comprising conventional sources, renewable energy, V2G regulation, and flexible storage energy resources. Moreover, price- and incentive-based Demand Response (DR) programs are implemented to sustain the balance between net demand and available generating resources in the DPS. In addition, we adapted a novel strategy to implement the computationally intensive jobs of the proposed DPS model, including incoming load profiles, V2G regulation, battery State of Charge (SOC) indication, and fast computation in a decision-based automated DR algorithm, using Fast Performance Computing resources of DCs. In response, the DPS provides economical and stable power to DCs under strict power quality constraints. Finally, the improved results are verified using a case study of ISO California integrated with hybrid generation.

  20. Elastic Alignment of Microscopic Images Using Parallel Processing on CUDA-Supported Graphics Processor Units

    Czech Academy of Sciences Publication Activity Database

    Michálek, Jan; Čapek, M.; Janáček, Jiří; Kubínová, Lucie

    2010-01-01

    Vol. 16, Suppl. 2 (2010), pp. 730-731 ISSN 1431-9276. [Microscopy and Microanalysis 2010. Portland, 01.08.2010-05.08.2010] R&D Projects: GA ČR(CZ) GA102/08/0691; GA ČR(CZ) GA304/09/0733; GA MŠk(CZ) LC06063 Institutional research plan: CEZ:AV0Z50110509 Keywords: elastic alignment * CUDA * confocal microscopy Subject RIV: JD - Computer Applications, Robotics Impact factor: 2.179, year: 2010

  1. Real-time Tsunami Inundation Prediction Using High Performance Computers

    Science.gov (United States)

    Oishi, Y.; Imamura, F.; Sugawara, D.

    2014-12-01

    Recently, off-shore tsunami observation stations based on cabled ocean bottom pressure gauges are actively being deployed, especially in Japan. These cabled systems are designed to provide real-time tsunami data before tsunamis reach coastlines for disaster mitigation purposes. To receive real benefits from these observations, real-time analysis techniques to make effective use of these data are necessary. A representative study was made by Tsushima et al. (2009), who proposed a method to provide instant tsunami source prediction based on achieving tsunami waveform data. As time passes, the prediction is improved by using updated waveform data. After a tsunami source is predicted, tsunami waveforms are synthesized from pre-computed tsunami Green functions of linear long wave equations. Tsushima et al. (2014) updated the method by combining the tsunami waveform inversion with an instant inversion of coseismic crustal deformation and improved the prediction accuracy and speed in the early stages. For disaster mitigation purposes, real-time predictions of tsunami inundation are also important. In this study, we discuss the possibility of real-time tsunami inundation predictions, which require faster-than-real-time tsunami inundation simulation in addition to instant tsunami source analysis. Although solving the non-linear shallow water equations for inundation prediction is computationally demanding, it has become feasible through recent developments in high-performance computing technologies. We conducted parallel computations of tsunami inundation and achieved 6.0 TFLOPS by using 19,000 CPU cores. We employed a leap-frog finite difference method with nested staggered grids whose resolutions range from 405 m to 5 m. The resolution ratio of each nested domain was 1/3. The total number of grid points was 13 million, and the time step was 0.1 seconds. Tsunami sources of the 2011 Tohoku-oki earthquake were tested. The inundation prediction up to 2 hours after the
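
    A one-dimensional sketch of the staggered-grid leap-frog update for the linear shallow-water equations, the basic scheme behind such inundation models (the real simulation is two-dimensional, non-linear, and uses nested grids from 405 m down to 5 m). The grid size, depth, and initial hump below are arbitrary illustration values.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

// 1-D linear shallow water on a staggered grid (velocities on cell faces):
//   du/dt  = -g * d(eta)/dx      (momentum)
//   d(eta)/dt = -h * du/dx       (continuity, constant depth h)
int main() {
    const int    n  = 2000;          // grid points
    const double dx = 405.0;         // m, coarsest spacing quoted in the abstract
    const double dt = 0.1;           // s, time step quoted in the abstract
    const double g  = 9.81, h = 4000.0;

    std::vector<double> eta(n, 0.0), u(n + 1, 0.0);
    for (int i = 0; i < n; ++i)                        // initial sea-surface hump
        eta[i] = 2.0 * std::exp(-std::pow((i - n / 2) * dx / 20000.0, 2));

    const int steps = 3600;                            // 6 minutes of simulated time
    for (int s = 0; s < steps; ++s) {
        for (int i = 1; i < n; ++i)                    // update face velocities
            u[i] -= g * dt / dx * (eta[i] - eta[i - 1]);
        for (int i = 0; i < n; ++i)                    // update surface elevation
            eta[i] -= h * dt / dx * (u[i + 1] - u[i]);
    }
    double peak = 0.0;
    for (double e : eta) peak = std::max(peak, e);
    std::printf("peak elevation after %d steps: %.3f m\n", steps, peak);
    return 0;
}
```

    With dt = 0.1 s, dx = 405 m and a 4000 m depth, the CFL number is about 0.05, comfortably within the stability limit of the scheme; faster-than-real-time execution in the real system comes from parallelizing these loops over thousands of cores.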

  2. Computational Environments and Analysis methods available on the NCI High Performance Computing (HPC) and High Performance Data (HPD) Platform

    Science.gov (United States)

    Evans, B. J. K.; Foster, C.; Minchin, S. A.; Pugh, T.; Lewis, A.; Wyborn, L. A.; Evans, B. J.; Uhlherr, A.

    2014-12-01

    The National Computational Infrastructure (NCI) has established a powerful in-situ computational environment to enable both high performance computing and data-intensive science across a wide spectrum of national environmental data collections - in particular climate, observational data and geoscientific assets. This paper examines 1) the computational environments that supports the modelling and data processing pipelines, 2) the analysis environments and methods to support data analysis, and 3) the progress in addressing harmonisation of the underlying data collections for future transdisciplinary research that enable accurate climate projections. NCI makes available 10+ PB major data collections from both the government and research sectors based on six themes: 1) weather, climate, and earth system science model simulations, 2) marine and earth observations, 3) geosciences, 4) terrestrial ecosystems, 5) water and hydrology, and 6) astronomy, social and biosciences. Collectively they span the lithosphere, crust, biosphere, hydrosphere, troposphere, and stratosphere. The data is largely sourced from NCI's partners (which include the custodians of many of the national scientific records), major research communities, and collaborating overseas organisations. The data is accessible within an integrated HPC-HPD environment - a 1.2 PFlop supercomputer (Raijin), a HPC class 3000 core OpenStack cloud system and several highly connected large scale and high-bandwidth Lustre filesystems. This computational environment supports a catalogue of integrated reusable software and workflows from earth system and ecosystem modelling, weather research, satellite and other observed data processing and analysis. To enable transdisciplinary research on this scale, data needs to be harmonised so that researchers can readily apply techniques and software across the corpus of data available and not be constrained to work within artificial disciplinary boundaries. Future challenges will

  3. Mixed-Language High-Performance Computing for Plasma Simulations

    Directory of Open Access Journals (Sweden)

    Quanming Lu

    2003-01-01

    Full Text Available Java is receiving increasing attention as the most popular platform for distributed computing. However, programmers are still reluctant to embrace Java as a tool for writing scientific and engineering applications due to its still noticeable performance drawbacks compared with other programming languages such as Fortran or C. In this paper, we present a hybrid Java/Fortran implementation of a parallel particle-in-cell (PIC) algorithm for plasma simulations. In our approach, the time-consuming components of this application are designed and implemented as Fortran subroutines, while less calculation-intensive components usually involved in building the user interface are written in Java. The two types of software modules have been glued together using the Java native interface (JNI). Our mixed-language PIC code was tested and its performance compared with pure Java and Fortran versions of the same algorithm on a Sun E6500 SMP system and a Linux cluster of Pentium III machines.

  4. Computational Fluid Dynamics and Building Energy Performance Simulation

    DEFF Research Database (Denmark)

    Nielsen, Peter Vilhelm; Tryggvason, T.

    1998-01-01

    An interconnection between a building energy performance simulation program and a Computational Fluid Dynamics program (CFD) for room air distribution will be introduced for improvement of the predictions of both the energy consumption and the indoor environment. The building energy performance simulation program requires a detailed description of the energy flow in the air movement which can be obtained by a CFD program. The paper describes an energy consumption calculation in a large building, where the building energy simulation program is modified by CFD predictions of the flow between three zones connected by open areas with pressure- and buoyancy-driven air flow. The two programs are interconnected in an iterative procedure. The paper also shows an evaluation of the air quality in the main area of the buildings based on CFD predictions. It is shown that an interconnection between a CFD...

  5. Leveraging human oversight and intervention in large-scale parallel processing of open-source data

    Science.gov (United States)

    Casini, Enrico; Suri, Niranjan; Bradshaw, Jeffrey M.

    2015-05-01

    The popularity of cloud computing, along with the increased availability of cheap storage, has led to the necessity of elaborating and transforming large volumes of open-source data, all in parallel. One way to handle such extensive volumes of information properly is to take advantage of distributed computing frameworks like Map-Reduce. Unfortunately, an entirely automated approach that excludes human intervention is often unpredictable and error-prone. Highly accurate data processing and decision-making can be achieved by supporting an automatic process through human collaboration, in a variety of environments such as warfare, cyber security and threat monitoring. Although this mutual participation seems easily exploitable, human-machine collaboration in the field of data analysis presents several challenges. First, due to the asynchronous nature of human intervention, it is necessary to verify that once a correction is made, all the necessary reprocessing is carried out down the chain. Second, it is often necessary to minimize the amount of reprocessing in order to optimize the usage of resources due to limited availability. In order to improve on these strict requirements, this paper introduces improvements to an innovative approach for human-machine collaboration in the processing of large amounts of open-source data in parallel.

  6. NCI's High Performance Computing (HPC) and High Performance Data (HPD) Computing Platform for Environmental and Earth System Data Science

    Science.gov (United States)

    Evans, Ben; Allen, Chris; Antony, Joseph; Bastrakova, Irina; Gohar, Kashif; Porter, David; Pugh, Tim; Santana, Fabiana; Smillie, Jon; Trenham, Claire; Wang, Jingbo; Wyborn, Lesley

    2015-04-01

    The National Computational Infrastructure (NCI) has established a powerful and flexible in-situ petascale computational environment to enable both high performance computing and Data-intensive Science across a wide spectrum of national environmental and earth science data collections - in particular climate, observational data and geoscientific assets. This paper examines 1) the computational environments that supports the modelling and data processing pipelines, 2) the analysis environments and methods to support data analysis, and 3) the progress so far to harmonise the underlying data collections for future interdisciplinary research across these large volume data collections. NCI has established 10+ PBytes of major national and international data collections from both the government and research sectors based on six themes: 1) weather, climate, and earth system science model simulations, 2) marine and earth observations, 3) geosciences, 4) terrestrial ecosystems, 5) water and hydrology, and 6) astronomy, social and biosciences. Collectively they span the lithosphere, crust, biosphere, hydrosphere, troposphere, and stratosphere. The data is largely sourced from NCI's partners (which include the custodians of many of the major Australian national-scale scientific collections), leading research communities, and collaborating overseas organisations. New infrastructures created at NCI mean the data collections are now accessible within an integrated High Performance Computing and Data (HPC-HPD) environment - a 1.2 PFlop supercomputer (Raijin), a HPC class 3000 core OpenStack cloud system and several highly connected large-scale high-bandwidth Lustre filesystems. The hardware was designed at inception to ensure that it would allow the layered software environment to flexibly accommodate the advancement of future data science. New approaches to software technology and data models have also had to be developed to enable access to these large and exponentially

  7. Performance Evaluation of Resource Management in Cloud Computing Environments.

    Science.gov (United States)

    Batista, Bruno Guazzelli; Estrella, Julio Cezar; Ferreira, Carlos Henrique Gomes; Filho, Dionisio Machado Leite; Nakamura, Luis Hideo Vasconcelos; Reiff-Marganiec, Stephan; Santana, Marcos José; Santana, Regina Helena Carlucci

    2015-01-01

    Cloud computing is a computational model in which resource providers can offer on-demand services to clients in a transparent way. However, to be able to guarantee quality of service without limiting the number of accepted requests, providers must be able to dynamically manage the available resources so that they can be optimized. This dynamic resource management is not a trivial task, since it involves meeting several challenges related to workload modeling, virtualization, performance modeling, deployment and monitoring of applications on virtualized resources. This paper carries out a performance evaluation of a module for resource management in a cloud environment that includes handling available resources during execution time and ensuring the quality of service defined in the service level agreement. An analysis was conducted of different resource configurations to define which dimension of resource scaling has a real influence on client requests. The results were used to model and implement a simulated cloud system, in which the allocated resource can be changed on-the-fly, with a corresponding change in price. In this way, the proposed module seeks to satisfy both the client by ensuring quality of service, and the provider by ensuring the best use of resources at a fair price.

  8. Performance of scientific computing platforms with MCNP4B

    International Nuclear Information System (INIS)

    McLaughlin, H.E.; Hendricks, J.S.

    1998-01-01

    Several computing platforms were evaluated with the MCNP4B Monte Carlo radiation transport code. The DEC AlphaStation 500/500 was the fastest to run MCNP4B. Compared to the HP 9000-735, the fastest platform 4 yr ago, the AlphaStation is 335% faster, the HP C180 is 133% faster, the SGI Origin 2000 is 82% faster, the Cray T94/4128 is 1% faster, the IBM RS/6000-590 is 93% as fast, the DEC 3000/600 is 81% as fast, the Sun Sparc20 is 57% as fast, the Cray YMP 8/8128 is 57% as fast, the Sun Sparc5 is 33% as fast, and the Sun Sparc2 is 13% as fast. All results presented are reproducible and allow for comparison to computer platforms not included in this study. Timing studies are seen to be very problem dependent. The performance gains resulting from advances in software were also investigated. Various compilers and operating systems were seen to have a modest impact on performance, whereas hardware improvements have resulted in a factor of 4 improvement. MCNP4B also ran approximately as fast as MCNP4A

  9. Performance Evaluation of Resource Management in Cloud Computing Environments.

    Directory of Open Access Journals (Sweden)

    Bruno Guazzelli Batista

    Full Text Available Cloud computing is a computational model in which resource providers can offer on-demand services to clients in a transparent way. However, to be able to guarantee quality of service without limiting the number of accepted requests, providers must be able to dynamically manage the available resources so that they can be optimized. This dynamic resource management is not a trivial task, since it involves meeting several challenges related to workload modeling, virtualization, performance modeling, deployment and monitoring of applications on virtualized resources. This paper carries out a performance evaluation of a module for resource management in a cloud environment that includes handling available resources during execution time and ensuring the quality of service defined in the service level agreement. An analysis was conducted of different resource configurations to define which dimension of resource scaling has a real influence on client requests. The results were used to model and implement a simulated cloud system, in which the allocated resource can be changed on-the-fly, with a corresponding change in price. In this way, the proposed module seeks to satisfy both the client by ensuring quality of service, and the provider by ensuring the best use of resources at a fair price.

  10. Play for Performance: Using Computer Games to Improve Motivation and Test-Taking Performance

    Science.gov (United States)

    Dennis, Alan R.; Bhagwatwar, Akshay; Minas, Randall K.

    2013-01-01

    The importance of testing, especially certification and high-stakes testing, has increased substantially over the past decade. Building on the "serious gaming" literature and the psychology "priming" literature, we developed a computer game designed to improve test-taking performance using psychological priming. The game primed…

  11. PARALLEL PROCESSING OF BIG POINT CLOUDS USING Z-ORDER-BASED PARTITIONING

    Directory of Open Access Journals (Sweden)

    C. Alis

    2016-06-01

    Full Text Available As laser scanning technology improves and costs are coming down, the amount of point cloud data being generated can be prohibitively difficult and expensive to process on a single machine. This data explosion is not only limited to point cloud data. Voluminous amounts of high-dimensionality and quickly accumulating data, collectively known as Big Data, such as those generated by social media, Internet of Things devices and commercial transactions, are becoming more prevalent as well. New computing paradigms and frameworks are being developed to efficiently handle the processing of Big Data, many of which utilize a compute cluster composed of several commodity grade machines to process chunks of data in parallel. A central concept in many of these frameworks is data locality. By its nature, Big Data is large enough that the entire dataset would not fit on the memory and hard drives of a single node hence replicating the entire dataset to each worker node is impractical. The data must then be partitioned across worker nodes in a manner that minimises data transfer across the network. This is a challenge for point cloud data because there exist different ways to partition data and they may require data transfer. We propose a partitioning based on Z-order which is a form of locality-sensitive hashing. The Z-order or Morton code is computed by dividing each dimension to form a grid then interleaving the binary representation of each dimension. For example, the Z-order code for the grid square with coordinates (x = 1 = 01_2, y = 3 = 11_2) is 1011_2 = 11. The number of points in each partition is controlled by the number of bits per dimension: the more bits, the fewer the points. The number of bits per dimension also controls the level of detail with more bits yielding finer partitioning. We present this partitioning method by implementing it on Apache Spark and investigating how different parameters affect the accuracy and running time of the k nearest

  12. Parallel Processing of Big Point Clouds Using Z-Order Partitioning

    Science.gov (United States)

    Alis, C.; Boehm, J.; Liu, K.

    2016-06-01

    As laser scanning technology improves and costs are coming down, the amount of point cloud data being generated can be prohibitively difficult and expensive to process on a single machine. This data explosion is not only limited to point cloud data. Voluminous amounts of high-dimensionality and quickly accumulating data, collectively known as Big Data, such as those generated by social media, Internet of Things devices and commercial transactions, are becoming more prevalent as well. New computing paradigms and frameworks are being developed to efficiently handle the processing of Big Data, many of which utilize a compute cluster composed of several commodity grade machines to process chunks of data in parallel. A central concept in many of these frameworks is data locality. By its nature, Big Data is large enough that the entire dataset would not fit on the memory and hard drives of a single node hence replicating the entire dataset to each worker node is impractical. The data must then be partitioned across worker nodes in a manner that minimises data transfer across the network. This is a challenge for point cloud data because there exist different ways to partition data and they may require data transfer. We propose a partitioning based on Z-order which is a form of locality-sensitive hashing. The Z-order or Morton code is computed by dividing each dimension to form a grid then interleaving the binary representation of each dimension. For example, the Z-order code for the grid square with coordinates (x = 1 = 01_2, y = 3 = 11_2) is 1011_2 = 11. The number of points in each partition is controlled by the number of bits per dimension: the more bits, the fewer the points. The number of bits per dimension also controls the level of detail with more bits yielding finer partitioning. We present this partitioning method by implementing it on Apache Spark and investigating how different parameters affect the accuracy and running time of the k nearest neighbour algorithm
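
    The bit-interleaving described above is small enough to show directly. A minimal sketch in plain Python (not the authors' Apache Spark implementation; the function name and arguments are illustrative) reproduces the worked example from the abstract:

        # Z-order / Morton code: quantise each coordinate onto a grid, then
        # interleave the bits of the grid indices.
        def morton2d(ix, iy, bits):
            code = 0
            for b in range(bits):
                code |= ((ix >> b) & 1) << (2 * b)      # x contributes the even bits
                code |= ((iy >> b) & 1) << (2 * b + 1)  # y contributes the odd bits
            return code

        # Example from the abstract: x = 1 = 01_2, y = 3 = 11_2  ->  1011_2 = 11
        assert morton2d(1, 3, bits=2) == 0b1011 == 11

    Using fewer bits per dimension coarsens the grid, so more points share a code and each partition holds more points, matching the trade-off described above.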

  13. A New Track Reconstruction Algorithm suitable for Parallel Processing based on Hit Triplets and Broken Lines

    Directory of Open Access Journals (Sweden)

    Schöning André

    2016-01-01

    Full Text Available Track reconstruction in high track multiplicity environments at current and future high rate particle physics experiments is a big challenge and very time consuming. The search for track seeds and the fitting of track candidates are usually the most time consuming steps in the track reconstruction. Here, a new and fast track reconstruction method based on hit triplets is proposed which exploits a three-dimensional fit model including multiple scattering and hit uncertainties from the very start, including the search for track seeds. The hit triplet based reconstruction method assumes a homogeneous magnetic field, which allows an analytical solution to be given for the triplet fit result. This method is highly parallelizable, needs fewer operations than other standard track reconstruction methods and is therefore ideal for implementation on parallel computing architectures. The proposed track reconstruction algorithm has been studied in the context of the Mu3e-experiment and a typical LHC experiment.

  14. Global restructuring of the CPM-2 transport algorithm for vector and parallel processing

    International Nuclear Information System (INIS)

    Vujic, J.L.; Martin, W.R.

    1989-01-01

    The CPM-2 code is an assembly transport code based on the collision probability (CP) method. It can in principle be applied to global reactor problems, but its excessive computational demands prevent this application. Therefore, a new transport algorithm for CPM-2 has been developed for vector-parallel architectures, which has resulted in an overall factor of 20 speedup (wall clock) on the IBM 3090-600E. This paper presents the detailed results of this effort as well as a brief description of ongoing effort to remove some of the modeling limitations in CPM-2 that inhibit its use for global applications, such as the use of the pure CP treatment and the assumption of isotropic scattering

  15. Department of Energy Mathematical, Information, and Computational Sciences Division: High Performance Computing and Communications Program

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1996-11-01

    This document is intended to serve two purposes. Its first purpose is that of a program status report of the considerable progress that the Department of Energy (DOE) has made since 1993, the time of the last such report (DOE/ER-0536, The DOE Program in HPCC), toward achieving the goals of the High Performance Computing and Communications (HPCC) Program. The second purpose is that of a summary report of the many research programs administered by the Mathematical, Information, and Computational Sciences (MICS) Division of the Office of Energy Research under the auspices of the HPCC Program and to provide, wherever relevant, easy access to pertinent information about MICS-Division activities via universal resource locators (URLs) on the World Wide Web (WWW).

  16. Department of Energy: MICS (Mathematical Information, and Computational Sciences Division). High performance computing and communications program

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1996-06-01

    This document is intended to serve two purposes. Its first purpose is that of a program status report of the considerable progress that the Department of Energy (DOE) has made since 1993, the time of the last such report (DOE/ER-0536, "The DOE Program in HPCC"), toward achieving the goals of the High Performance Computing and Communications (HPCC) Program. The second purpose is that of a summary report of the many research programs administered by the Mathematical, Information, and Computational Sciences (MICS) Division of the Office of Energy Research under the auspices of the HPCC Program and to provide, wherever relevant, easy access to pertinent information about MICS-Division activities via universal resource locators (URLs) on the World Wide Web (WWW). The information pointed to by the URL is updated frequently, and the interested reader is urged to access the WWW for the latest information.

  17. Integrated modeling tool for performance engineering of complex computer systems

    Science.gov (United States)

    Wright, Gary; Ball, Duane; Hoyt, Susan; Steele, Oscar

    1989-01-01

    This report summarizes Advanced System Technologies' accomplishments on the Phase 2 SBIR contract NAS7-995. The technical objectives of the report are: (1) to develop an evaluation version of a graphical, integrated modeling language according to the specification resulting from the Phase 2 research; and (2) to determine the degree to which the language meets its objectives by evaluating ease of use, utility of two sets of performance predictions, and the power of the language constructs. The technical approach followed to meet these objectives was to design, develop, and test an evaluation prototype of a graphical, performance prediction tool. The utility of the prototype was then evaluated by applying it to a variety of test cases found in the literature and in AST case histories. Numerous models were constructed and successfully tested. The major conclusion of this Phase 2 SBIR research and development effort is that complex, real-time computer systems can be specified in a non-procedural manner using combinations of icons, windows, menus, and dialogs. Such a specification technique provides an interface that system designers and architects find natural and easy to use. In addition, PEDESTAL's multiview approach provides system engineers with the capability to perform the trade-offs necessary to produce a design that meets timing performance requirements. Sample system designs analyzed during the development effort showed that models could be constructed in a fraction of the time required by non-visual system design capture tools.

  18. Multi-Language Programming Environments for High Performance Java Computing

    Directory of Open Access Journals (Sweden)

    Vladimir Getov

    1999-01-01

    Full Text Available Recent developments in processor capabilities, software tools, programming languages and programming paradigms have brought about new approaches to high performance computing. A steadfast component of this dynamic evolution has been the scientific community’s reliance on established scientific packages. As a consequence, programmers of high‐performance applications are reluctant to embrace evolving languages such as Java. This paper describes the Java‐to‐C Interface (JCI) tool which provides application programmers wishing to use Java with immediate accessibility to existing scientific packages. The JCI tool also facilitates rapid development and reuse of existing code. These benefits are provided at minimal cost to the programmer. While beneficial to the programmer, the additional advantages of mixed‐language programming in terms of application performance and portability are addressed in detail within the context of this paper. In addition, we discuss how the JCI tool is complementing other ongoing projects such as IBM’s High‐Performance Compiler for Java (HPCJ) and IceT’s metacomputing environment.

  19. High Performance Computing Facility Operational Assessment, FY 2010 Oak Ridge Leadership Computing Facility

    Energy Technology Data Exchange (ETDEWEB)

    Bland, Arthur S Buddy [ORNL; Hack, James J [ORNL; Baker, Ann E [ORNL; Barker, Ashley D [ORNL; Boudwin, Kathlyn J. [ORNL; Kendall, Ricky A [ORNL; Messer, Bronson [ORNL; Rogers, James H [ORNL; Shipman, Galen M [ORNL; White, Julia C [ORNL

    2010-08-01

    Oak Ridge National Laboratory's (ORNL's) Cray XT5 supercomputer, Jaguar, kicked off the era of petascale scientific computing in 2008 with applications that sustained more than a thousand trillion floating point calculations per second - or 1 petaflop. Jaguar continues to grow even more powerful as it helps researchers broaden the boundaries of knowledge in virtually every domain of computational science, including weather and climate, nuclear energy, geosciences, combustion, bioenergy, fusion, and materials science. Their insights promise to broaden our knowledge in areas that are vitally important to the Department of Energy (DOE) and the nation as a whole, particularly energy assurance and climate change. The science of the 21st century, however, will demand further revolutions in computing, supercomputers capable of a million trillion calculations a second - 1 exaflop - and beyond. These systems will allow investigators to continue attacking global challenges through modeling and simulation and to unravel longstanding scientific questions. Creating such systems will also require new approaches to daunting challenges. High-performance systems of the future will need to be codesigned for scientific and engineering applications with best-in-class communications networks and data-management infrastructures and teams of skilled researchers able to take full advantage of these new resources. The Oak Ridge Leadership Computing Facility (OLCF) provides the nation's most powerful open resource for capability computing, with a sustainable path that will maintain and extend national leadership for DOE's Office of Science (SC). The OLCF has engaged a world-class team to support petascale science and to take a dramatic step forward, fielding new capabilities for high-end science. This report highlights the successful delivery and operation of a petascale system and shows how the OLCF fosters application development teams, developing cutting-edge tools

  20. Performance characteristics of a Kodak computed radiography system.

    Science.gov (United States)

    Bradford, C D; Peppler, W W; Dobbins, J T

    1999-01-01

    The performance characteristics of a photostimulable phosphor based computed radiographic (CR) system were studied. The modulation transfer function (MTF), noise power spectra (NPS), and detective quantum efficiency (DQE) of the Kodak Digital Science computed radiography (CR) system (Eastman Kodak Co.-model 400) were measured and compared to previously published results of a Fuji based CR system (Philips Medical Systems-PCR model 7000). To maximize comparability, the same measurement techniques and analysis methods were used. The DQE at four exposure levels (30, 3, 0.3, 0.03 mR) and two plate types (standard and high resolution) was calculated from the NPS and MTF measurements. The NPS was determined from two-dimensional Fourier analysis of uniformly exposed plates. The presampling MTF was determined from the Fourier transform (FT) of the system's finely sampled line spread function (LSF) as produced by a narrow slit. A comparison of the slit type ("beveled edge" versus "straight edge") and its effect on the resulting MTF measurements was also performed. The results show that both systems are comparable in resolution performance. The noise power studies indicated a higher level of noise for the Kodak images (approximately 20% at the low exposure levels and 40%-70% at higher exposure levels). Within the clinically relevant exposure range (0.3-3 mR), the resulting DQE for the Kodak plates was 20%-50% lower than for the corresponding Fuji plates. Measurements of the presampling MTF with the two slit types have shown that a correction factor can be applied to compensate for transmission through the relief edges.

  1. The Development of Reading and Spelling in Arabic Orthography: Two Parallel Processes?

    Science.gov (United States)

    Taha, Haitham

    2016-01-01

    The parallels between reading and spelling skills in Arabic were tested. One-hundred forty-three native Arab students, with typical reading development, from second, fourth, and sixth grades were tested with reading, spelling and orthographic decision tasks. The results indicated a full parallel between the reading and spelling performances within…

  2. Canadian conference on electrical and computer engineering proceedings. Congres canadien en genie electrique et informatique

    Energy Technology Data Exchange (ETDEWEB)

    Bhargava, V K [ed.

    1993-01-01

    A conference was held on the subject of electrical and computer engineering. Papers were presented on the subjects of artificial intelligence, video, signal processing, radar, power electronics, neural networks, control, computer systems, transportation electronics, software tools, error control coding, electrothermal phenomena, performance evaluation of computer systems, wireless communication, satellite communication, very large scale integration, parallel processing, pattern recognition, telephony, graphs and algorithms, multimedia, broadcast systems, remote sensing, computer networks, modulation and coding, robotics, computer architecture, spread spectrum, image processing, microwave circuits, biomedical engineering, specification and verification, image restoration, communications networks, computer-aided design, drives, energy systems, expert systems, and optics. Separate abstracts have been prepared for 56 papers from the conference.

  3. High Performance Computing Facility Operational Assessment 2015: Oak Ridge Leadership Computing Facility

    Energy Technology Data Exchange (ETDEWEB)

    Barker, Ashley D. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Bernholdt, David E. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Bland, Arthur S. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Gary, Jeff D. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Hack, James J. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; McNally, Stephen T. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Rogers, James H. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Smith, Brian E. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Straatsma, T. P. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Sukumar, Sreenivas Rangan [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Thach, Kevin G. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Tichenor, Suzy [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Vazhkudai, Sudharshan S. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility; Wells, Jack C. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States). Oak Ridge Leadership Computing Facility

    2016-03-01

    Oak Ridge National Laboratory’s (ORNL’s) Leadership Computing Facility (OLCF) continues to surpass its operational target goals: supporting users; delivering fast, reliable systems; creating innovative solutions for high-performance computing (HPC) needs; and managing risks, safety, and security aspects associated with operating one of the most powerful computers in the world. The results can be seen in the cutting-edge science delivered by users and the praise from the research community. Calendar year (CY) 2015 was filled with outstanding operational results and accomplishments: a very high rating from users on overall satisfaction that ties the highest-ever mark set in CY 2014; the greatest number of core-hours delivered to research projects; the largest percentage of capability usage since the OLCF began tracking the metric in 2009; and success in delivering on the allocation of 60, 30, and 10% of core hours offered for the INCITE (Innovative and Novel Computational Impact on Theory and Experiment), ALCC (Advanced Scientific Computing Research Leadership Computing Challenge), and Director’s Discretionary programs, respectively. These accomplishments, coupled with the extremely high utilization rate, represent the fulfillment of the promise of Titan: maximum use by maximum-size simulations. The impact of all of these successes and more is reflected in the accomplishments of OLCF users, with publications this year in notable journals Nature, Nature Materials, Nature Chemistry, Nature Physics, Nature Climate Change, ACS Nano, Journal of the American Chemical Society, and Physical Review Letters, as well as many others. The achievements included in the 2015 OLCF Operational Assessment Report reflect first-ever or largest simulations in their communities; for example Titan enabled engineers in Los Angeles and the surrounding region to design and begin building improved critical infrastructure by enabling the highest-resolution Cybershake map for Southern

  4. High performance computing network for cloud environment using simulators

    OpenAIRE

    Singh, N. Ajith; Hemalatha, M.

    2012-01-01

    Cloud computing is the next generation of computing. Adopting cloud computing is like signing up for a new form of website. The GUI that controls the cloud computing makes it possible to directly control the hardware resources and your application. The difficult part of cloud computing is deploying it in a real environment. It is difficult to know the exact cost and the requirements until the service is bought, and whether it will support the existing application which is available on traditional...

  5. High performance computing environment for multidimensional image analysis.

    Science.gov (United States)

    Rao, A Ravishankar; Cecchi, Guillermo A; Magnasco, Marcelo

    2007-07-10

    The processing of images acquired through microscopy is a challenging task due to the large size of datasets (several gigabytes) and the fast turnaround time required. If the throughput of the image processing stage is significantly increased, it can have a major impact in microscopy applications. We present a high performance computing (HPC) solution to this problem. This involves decomposing the spatial 3D image into segments that are assigned to unique processors, and matched to the 3D torus architecture of the IBM Blue Gene/L machine. Communication between segments is restricted to the nearest neighbors. When running on a 2 GHz Intel CPU, the task of 3D median filtering on a typical 256 megabyte dataset takes two and a half hours, whereas by using 1024 nodes of Blue Gene, this task can be performed in 18.8 seconds, a 478x speedup. Our parallel solution dramatically improves the performance of image processing, feature extraction and 3D reconstruction tasks. This increased throughput permits biologists to conduct unprecedented large scale experiments with massive datasets.
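
    The decomposition idea, though not the Blue Gene/L code itself, can be sketched in a few lines. The toy below (numpy/scipy stand-ins; the slab count and filter size are arbitrary assumptions) splits the volume into slabs along one axis and filters each slab with a one-voxel halo, so the pieces agree with the single-pass filter at the seams; each slab could then be handed to a separate processor:

        import numpy as np
        from scipy.ndimage import median_filter

        def filter_in_slabs(volume, n_slabs=4, size=3):
            r = size // 2                                # halo width needed by the filter window
            edges = np.linspace(0, volume.shape[0], n_slabs + 1, dtype=int)
            out = np.empty_like(volume)
            for lo, hi in zip(edges[:-1], edges[1:]):
                lo_h, hi_h = max(lo - r, 0), min(hi + r, volume.shape[0])
                filtered = median_filter(volume[lo_h:hi_h], size=size)
                out[lo:hi] = filtered[lo - lo_h: lo - lo_h + (hi - lo)]
            return out

        vol = np.random.rand(64, 64, 64).astype(np.float32)
        assert np.allclose(filter_in_slabs(vol), median_filter(vol, size=3))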

  6. Improving Software Performance in the Compute Unified Device Architecture

    Directory of Open Access Journals (Sweden)

    Alexandru PIRJAN

    2010-01-01

    Full Text Available This paper analyzes several aspects regarding the improvement of software performance for applications written in the Compute Unified Device Architecture (CUDA). We address an issue of great importance when programming a CUDA application: the Graphics Processing Unit’s (GPU’s) memory management through transpose kernels. We also benchmark and evaluate the performance of progressively optimizing a matrix-transposing application in CUDA. One particular interest was to research how well the optimization techniques, applied to a software application written in CUDA, scale to the latest generation of general-purpose graphics processing units (GPGPU), like the Fermi architecture implemented in the GTX480 and the previous architecture implemented in the GTX280. Lately, there has been a lot of interest in the literature in this type of optimization analysis, but none of the works so far (to our best knowledge) has tried to validate whether the optimizations apply to a GPU from the latest Fermi architecture and how well the Fermi architecture scales to these software performance improving techniques.

  7. A program system for ab initio MO calculations on vector and parallel processing machines. Pt. 3

    International Nuclear Information System (INIS)

    Wiest, R.; Demuynck, J.; Benard, M.; Rohmer, M.M.; Ernenwein, R.

    1991-01-01

    This series of three papers presents a program system for ab initio molecular orbital calculations on vector and parallel computers. Part III is devoted to the four-index transformation on a molecular orbital basis of size NMO of the file of two-electron integrals (pq||rs) generated by a contracted Gaussian set of size NATO (number of atomic orbitals). A fast Yoshimine algorithm first sorts the (pq||rs) integrals with respect to index pq only. This file of half-sorted integrals labelled by their rs-index can be processed without further modification to generate either the transformed integrals or the supermatrix elements. The large memory available on the CRAY-2 has made it possible to implement the transformation algorithm proposed by Bender in 1972, which requires a core-storage allocation varying as (NATO)^3. Two versions of Bender's algorithm are included in the present program. The first version is an in-core version, where the complete file of accumulated contributions to transformed integrals is stored and updated in central memory. This version has been parallelized by distributing over a limited number of logical tasks the NATO steps corresponding to the scanning of the most external loop. The second version is an out-of-core version, in which twin files are alternately used as input and output for the accumulated contributions to transformed integrals. This version is not parallel. The choice of one or another version and (for version 1) the determination of the number of tasks depends upon the balance between the available and the requested amounts of storage. The storage management and the choice of the proper version are carried out automatically using dynamic storage allocation. Both versions are vectorized and take advantage of the molecular symmetry. (orig.)
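
    The quarter-transformation strategy behind such codes, transforming one index at a time so the cost stays at O(N^5) rather than O(N^8), can be sketched with numpy as a stand-in (illustrative only, not Bender's vectorized CRAY-2 algorithm; array names and sizes are assumptions):

        import numpy as np

        nao, nmo = 10, 8
        eri_ao = np.random.rand(nao, nao, nao, nao)   # two-electron integrals (pq||rs), AO basis
        C = np.random.rand(nao, nmo)                  # AO -> MO coefficient matrix

        # Four quarter-transformations, one index at a time.
        tmp = np.einsum('pqrs,pi->iqrs', eri_ao, C)
        tmp = np.einsum('iqrs,qj->ijrs', tmp, C)
        tmp = np.einsum('ijrs,rk->ijks', tmp, C)
        eri_mo = np.einsum('ijks,sl->ijkl', tmp, C)

        # Reference: contract all four indices in one call (numpy optimizes internally).
        ref = np.einsum('pqrs,pi,qj,rk,sl->ijkl', eri_ao, C, C, C, C, optimize=True)
        assert np.allclose(eri_mo, ref)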

  8. Trends in high-performance computing for engineering calculations.

    Science.gov (United States)

    Giles, M B; Reguly, I

    2014-08-13

    High-performance computing has evolved remarkably over the past 20 years, and that progress is likely to continue. However, in recent years, this progress has been achieved through greatly increased hardware complexity with the rise of multicore and manycore processors, and this is affecting the ability of application developers to achieve the full potential of these systems. This article outlines the key developments on the hardware side, both in the recent past and in the near future, with a focus on two key issues: energy efficiency and the cost of moving data. It then discusses the much slower evolution of system software, and the implications of all of this for application developers. © 2014 The Author(s) Published by the Royal Society. All rights reserved.

  9. Power/energy use cases for high performance computing

    Energy Technology Data Exchange (ETDEWEB)

    Laros, James H. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Kelly, Suzanne M. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Hammond, Steven [National Renewable Energy Lab. (NREL), Golden, CO (United States); Elmore, Ryan [National Renewable Energy Lab. (NREL), Golden, CO (United States); Munch, Kristin [National Renewable Energy Lab. (NREL), Golden, CO (United States)

    2013-12-01

    Power and Energy have been identified as a first order challenge for future extreme scale high performance computing (HPC) systems. In practice the breakthroughs will need to be provided by the hardware vendors. But making the best use of the solutions in an HPC environment will likely require periodic tuning by facility operators and software components. This document describes the actions and interactions needed to maximize power resources. It strives to cover the entire operational space that an HPC system occupies. The descriptions are presented as formal use cases, as documented in the Unified Modeling Language Specification [1]. The document is intended to provide a common understanding to the HPC community of the necessary management and control capabilities. Assuming a common understanding can be achieved, the next step will be to develop a set of Application Programming Interfaces (APIs) that hardware vendors and software developers could utilize to steer power consumption.

  10. Topology and computational performance of attractor neural networks

    International Nuclear Information System (INIS)

    McGraw, Patrick N.; Menzinger, Michael

    2003-01-01

    To explore the relation between network structure and function, we studied the computational performance of Hopfield-type attractor neural nets with regular lattice, random, small-world, and scale-free topologies. The random configuration is the most efficient for storage and retrieval of patterns by the network as a whole. However, in the scale-free case retrieval errors are not distributed uniformly among the nodes. The portion of a pattern encoded by the subset of highly connected nodes is more robust and efficiently recognized than the rest of the pattern. The scale-free network thus achieves a very strong partial recognition. The implications of these findings for brain function and social dynamics are suggestive
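
    A Hopfield-type attractor network of the kind compared in this study can be sketched in a few lines (a toy stand-in, not the authors' model; the random connection mask only illustrates how a non-complete topology might be imposed):

        import numpy as np

        rng = np.random.default_rng(0)
        n, n_patterns = 100, 5
        patterns = rng.choice([-1, 1], size=(n_patterns, n))

        # Hebbian (outer-product) storage rule with zero self-coupling.
        W = patterns.T @ patterns / n
        np.fill_diagonal(W, 0.0)

        # Impose a topology by masking connections; here a symmetric random mask.
        mask = np.triu(rng.random((n, n)) < 0.5, 1)
        W_random = W * (mask | mask.T)

        def recall(W, probe, steps=20):
            s = probe.copy()
            for _ in range(steps):
                s = np.where(W @ s >= 0, 1, -1)   # synchronous sign update
            return s

        # Retrieve pattern 0 from a copy with roughly 10% of its bits flipped.
        probe = patterns[0] * np.where(rng.random(n) < 0.1, -1, 1)
        print("overlap, fully connected:", recall(W, probe) @ patterns[0] / n)
        print("overlap, random topology:", recall(W_random, probe) @ patterns[0] / n)

    Storage and retrieval quality can then be compared across masks built from lattice, random, small-world or scale-free graphs, which is the comparison the abstract describes.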

  11. Real-time hypothesis driven feature extraction on parallel processing architectures

    DEFF Research Database (Denmark)

    Granmo, O.-C.; Jensen, Finn Verner

    2002-01-01

    the problem of higher-order feature-content/feature-feature correlation, causally complexly interacting features are identified through Bayesian network d-separation analysis and combined into joint features. When used on a moderately complex object-tracking case, the technique is able to select...... extraction, which selectively extract relevant features one-by-one, have in some cases achieved real-time performance on single processing element architectures. In this paper we propose a novel technique which combines the above two approaches. Features are selectively extracted in parallelizable sets...

  12. A New Tool for Intelligent Parallel Processing of Radar/SAR Remotely Sensed Imagery

    Directory of Open Access Journals (Sweden)

    A. Castillo Atoche

    2013-01-01

    Full Text Available A novel parallel tool for large-scale image enhancement/reconstruction and postprocessing of radar/SAR sensor systems is addressed. The proposed parallel tool performs the following intelligent processing steps: image formation, for the application of different system-level effects of image degradation with a particular remote sensing (RS) system and simulation of random noising effects, enhancement/reconstruction by employing nonparametric robust high-resolution techniques, and image postprocessing using the fuzzy anisotropic diffusion technique which incorporates a better edge-preserving noise removal effect and faster diffusion process. This innovative tool allows the processing of high-resolution images provided by different radar/SAR sensor systems as required by RS end-users for environmental monitoring, risk prevention, and resource management. To verify the performance of the proposed parallel framework implementation, the processing steps are developed and specifically tested on graphic processing units (GPU), achieving considerable speedups compared to the serial version of the same techniques implemented in C language.

  13. Vectorization of KENO IV code and an estimate of vector-parallel processing

    International Nuclear Information System (INIS)

    Asai, Kiyoshi; Higuchi, Kenji; Katakura, Jun-ichi; Kurita, Yutaka.

    1986-10-01

    The multi-group criticality safety code KENO IV has been vectorized and tested on FACOM VP-100 vector processor. At first the vectorized KENO IV on a scalar processor became slower than the original one by a factor of 1.4 because of the overhead introduced by the vectorization. Making modifications of algorithms and techniques for vectorization, the vectorized version has become faster than the original one by a factor of 1.4 and 3.0 on the vector processor for sample problems of complex and simple geometries, respectively. For further speedup of the code, some improvements on compiler and hardware, especially on addition of Monte Carlo pipelines to the vector processor, are discussed. Finally a pipelined parallel processor system is proposed and its performance is estimated. (author)

  14. Parallel processing streams for motor output and sensory prediction during action preparation.

    Science.gov (United States)

    Stenner, Max-Philipp; Bauer, Markus; Heinze, Hans-Jochen; Haggard, Patrick; Dolan, Raymond J

    2015-03-15

    Sensory consequences of one's own actions are perceived as less intense than identical, externally generated stimuli. This is generally taken as evidence for sensory prediction of action consequences. Accordingly, recent theoretical models explain this attenuation by an anticipatory modulation of sensory processing prior to stimulus onset (Roussel et al. 2013) or even action execution (Brown et al. 2013). Experimentally, prestimulus changes that occur in anticipation of self-generated sensations are difficult to disentangle from more general effects of stimulus expectation, attention and task load (performing an action). Here, we show that an established manipulation of subjective agency over a stimulus leads to a predictive modulation in sensory cortex that is independent of these factors. We recorded magnetoencephalography while subjects performed a simple action with either hand and judged the loudness of a tone caused by the action. Effector selection was manipulated by subliminal motor priming. Compatible priming is known to enhance a subjective experience of agency over a consequent stimulus (Chambon and Haggard 2012). In line with this effect on subjective agency, we found stronger sensory attenuation when the action that caused the tone was compatibly primed. This perceptual effect was reflected in a transient phase-locked signal in auditory cortex before stimulus onset and motor execution. Interestingly, this sensory signal emerged at a time when the hemispheric lateralization of motor signals in M1 indicated ongoing effector selection. Our findings confirm theoretical predictions of a sensory modulation prior to self-generated sensations and support the idea that a sensory prediction is generated in parallel to motor output (Walsh and Haggard 2010), before an efference copy becomes available. Copyright © 2015 the American Physiological Society.

  15. A collaborative brain-computer interface for improving human performance.

    Directory of Open Access Journals (Sweden)

    Yijun Wang

    Full Text Available Electroencephalogram (EEG) based brain-computer interfaces (BCI) have been studied since the 1970s. Currently, the main focus of BCI research lies on the clinical use, which aims to provide a new communication channel to patients with motor disabilities to improve their quality of life. However, the BCI technology can also be used to improve human performance for normal healthy users. Although this application has been proposed for a long time, little progress has been made in real-world practices due to technical limits of EEG. To overcome the bottleneck of low single-user BCI performance, this study proposes a collaborative paradigm to improve overall BCI performance by integrating information from multiple users. To test the feasibility of a collaborative BCI, this study quantitatively compares the classification accuracies of collaborative and single-user BCI applied to the EEG data collected from 20 subjects in a movement-planning experiment. This study also explores three different methods for fusing and analyzing EEG data from multiple subjects: (1) Event-related potentials (ERP) averaging, (2) Feature concatenating, and (3) Voting. In a demonstration system using the Voting method, the classification accuracy of predicting movement directions (reaching left vs. reaching right) was enhanced substantially from 66% to 80%, 88%, 93%, and 95% as the numbers of subjects increased from 1 to 5, 10, 15, and 20, respectively. Furthermore, the decision of reaching direction could be made around 100-250 ms earlier than the subject's actual motor response by decoding the ERP activities arising mainly from the posterior parietal cortex (PPC), which are related to the processing of visuomotor transmission. Taken together, these results suggest that a collaborative BCI can effectively fuse brain activities of a group of people to improve the overall performance of natural human behavior.
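
    The Voting scheme is simple enough to simulate. In the sketch below the labels are simulated rather than taken from the study's EEG recordings; the 66% single-user accuracy comes from the abstract, and independence between subjects is an assumption:

        import numpy as np

        rng = np.random.default_rng(1)
        p_single = 0.66          # single-user accuracy reported in the abstract
        n_trials = 10000

        def group_accuracy(n_subjects):
            truth = rng.integers(0, 2, n_trials)               # 0 = reach left, 1 = reach right
            correct = rng.random((n_subjects, n_trials)) < p_single
            votes = np.where(correct, truth, 1 - truth)        # each subject's predicted label
            majority = (votes.sum(axis=0) * 2 > n_subjects).astype(int)  # ties fall back to "left"
            return float((majority == truth).mean())

        for n in (1, 5, 10, 15, 20):
            print(f"{n:2d} subjects -> {group_accuracy(n):.2f}")

    Under these toy assumptions the majority vote climbs from about 0.66 for a single subject towards the 0.8-0.95 range as more subjects are pooled, the same qualitative trend reported above.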

  16. Reconstruction for Time-Domain In Vivo EPR 3D Multigradient Oximetric Imaging—A Parallel Processing Perspective

    Directory of Open Access Journals (Sweden)

    Christopher D. Dharmaraj

    2009-01-01

    Full Text Available Three-dimensional Oximetric Electron Paramagnetic Resonance Imaging using the Single Point Imaging modality generates unpaired spin density and oxygen images that can readily distinguish between normal and tumor tissues in small animals. It is also possible with fast imaging to track the changes in tissue oxygenation in response to the oxygen content in the breathing air. However, this involves dealing with gigabytes of data for each 3D oximetric imaging experiment involving digital band pass filtering and background noise subtraction, followed by 3D Fourier reconstruction. This process is rather slow in a conventional uniprocessor system. This paper presents a parallelization framework using OpenMP runtime support and parallel MATLAB to execute such computationally intensive programs. The Intel compiler is used to develop a parallel C++ code based on OpenMP. The code is executed on four Dual-Core AMD Opteron shared memory processors, to reduce the computational burden of the filtration task significantly. The results show that the parallel code for filtration has achieved a speed up factor of 46.66 as against the equivalent serial MATLAB code. In addition, a parallel MATLAB code has been developed to perform 3D Fourier reconstruction. Speedup factors of 4.57 and 4.25 have been achieved during the reconstruction process and oximetry computation, for a data set with 23×23×23 gradient steps. The execution time has been computed for both the serial and parallel implementations using different dimensions of the data and presented for comparison. The reported system has been designed to be easily accessible even from low-cost personal computers through local internet (NIHnet). The experimental results demonstrate that the parallel computing provides a source of high computational power to obtain biophysical parameters from 3D EPR oximetric imaging, almost in real-time.

  17. Reconstruction for time-domain in vivo EPR 3D multigradient oximetric imaging--a parallel processing perspective.

    Science.gov (United States)

    Dharmaraj, Christopher D; Thadikonda, Kishan; Fletcher, Anthony R; Doan, Phuc N; Devasahayam, Nallathamby; Matsumoto, Shingo; Johnson, Calvin A; Cook, John A; Mitchell, James B; Subramanian, Sankaran; Krishna, Murali C

    2009-01-01

    Three-dimensional Oximetric Electron Paramagnetic Resonance Imaging using the Single Point Imaging modality generates unpaired spin density and oxygen images that can readily distinguish between normal and tumor tissues in small animals. It is also possible with fast imaging to track the changes in tissue oxygenation in response to the oxygen content in the breathing air. However, this involves dealing with gigabytes of data for each 3D oximetric imaging experiment involving digital band pass filtering and background noise subtraction, followed by 3D Fourier reconstruction. This process is rather slow in a conventional uniprocessor system. This paper presents a parallelization framework using OpenMP runtime support and parallel MATLAB to execute such computationally intensive programs. The Intel compiler is used to develop a parallel C++ code based on OpenMP. The code is executed on four Dual-Core AMD Opteron shared memory processors, to reduce the computational burden of the filtration task significantly. The results show that the parallel code for filtration has achieved a speed up factor of 46.66 as against the equivalent serial MATLAB code. In addition, a parallel MATLAB code has been developed to perform 3D Fourier reconstruction. Speedup factors of 4.57 and 4.25 have been achieved during the reconstruction process and oximetry computation, for a data set with 23 x 23 x 23 gradient steps. The execution time has been computed for both the serial and parallel implementations using different dimensions of the data and presented for comparison. The reported system has been designed to be easily accessible even from low-cost personal computers through local internet (NIHnet). The experimental results demonstrate that the parallel computing provides a source of high computational power to obtain biophysical parameters from 3D EPR oximetric imaging, almost in real-time.

  18. FY 1995 Blue Book: High Performance Computing and Communications: Technology for the National Information Infrastructure

    Data.gov (United States)

    Networking and Information Technology Research and Development, Executive Office of the President — The Federal High Performance Computing and Communications (HPCC) Program was created to accelerate the development of future generations of high performance computers...

  19. The contribution of high-performance computing and modelling for industrial development

    CSIR Research Space (South Africa)

    Sithole, Happy

    2017-10-01

    Full Text Available High-Performance Computing and Modelling for Industrial Development, Dr Happy Sithole and Dr Onno Ubbink. Strategic context: high-performance computing (HPC) combined with machine learning and artificial intelligence present opportunities to non...

  20. Real-time SHVC software decoding with multi-threaded parallel processing

    Science.gov (United States)

    Gudumasu, Srinivas; He, Yuwen; Ye, Yan; He, Yong; Ryu, Eun-Seok; Dong, Jie; Xiu, Xiaoyu

    2014-09-01

    This paper proposes a parallel decoding framework for scalable HEVC (SHVC). Various optimization technologies are implemented on the basis of SHVC reference software SHM-2.0 to achieve real-time decoding speed for the two layer spatial scalability configuration. SHVC decoder complexity is analyzed with profiling information. The decoding process at each layer and the up-sampling process are designed in parallel and scheduled by a high level application task manager. Within each layer, multi-threaded decoding is applied to accelerate the layer decoding speed. Entropy decoding, reconstruction, and in-loop processing are pipelined with multiple threads based on groups of coding tree units (CTUs). A group of CTUs is treated as a processing unit in each pipeline stage to achieve a better trade-off between parallelism and synchronization. Motion compensation, inverse quantization, and inverse transform modules are further optimized with SSE4 SIMD instructions. Simulations on a desktop with an Intel i7 processor 2600 running at 3.4 GHz show that the parallel SHVC software decoder is able to decode 1080p spatial 2x at up to 60 fps (frames per second) and 1080p spatial 1.5x at up to 50 fps for those bitstreams generated with SHVC common test conditions in the JCT-VC standardization group. The decoding performance at various bitrates with different optimization technologies and different numbers of threads is compared in terms of decoding speed and resource usage, including processor and memory.
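
    The CTU-group pipeline can be pictured with a toy three-stage producer/consumer chain. Plain Python threads and queues stand in for the SHM-2.0 decoder here; the stage bodies are placeholders, not SHVC code:

        import queue
        import threading

        STOP = object()   # sentinel passed down the pipeline when the picture is finished

        def stage(fn, inbox, outbox):
            """Consume CTU groups from `inbox`, apply `fn`, forward results to `outbox`."""
            while True:
                item = inbox.get()
                if item is STOP:
                    if outbox is not None:
                        outbox.put(STOP)
                    return
                result = fn(item)
                if outbox is not None:
                    outbox.put(result)

        entropy_q, recon_q, inloop_q = queue.Queue(4), queue.Queue(4), queue.Queue(4)
        decoded = []

        threads = [
            threading.Thread(target=stage, args=(lambda g: g, entropy_q, recon_q)),   # entropy decoding
            threading.Thread(target=stage, args=(lambda g: g, recon_q, inloop_q)),    # reconstruction
            threading.Thread(target=stage, args=(decoded.append, inloop_q, None)),    # in-loop filtering
        ]
        for t in threads:
            t.start()
        for ctu_group in range(8):   # eight groups of CTUs from one picture
            entropy_q.put(ctu_group)
        entropy_q.put(STOP)
        for t in threads:
            t.join()
        print("groups through the pipeline:", decoded)

    The bounded queues limit how far one stage can run ahead of the next, which is the parallelism-versus-synchronization trade-off the abstract describes for the choice of CTU-group size.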

  1. Enabling Efficient Climate Science Workflows in High Performance Computing Environments

    Science.gov (United States)

    Krishnan, H.; Byna, S.; Wehner, M. F.; Gu, J.; O'Brien, T. A.; Loring, B.; Stone, D. A.; Collins, W.; Prabhat, M.; Liu, Y.; Johnson, J. N.; Paciorek, C. J.

    2015-12-01

    A typical climate science workflow often involves a combination of acquisition of data, modeling, simulation, analysis, visualization, publishing, and storage of results. Each of these tasks provide a myriad of challenges when running on a high performance computing environment such as Hopper or Edison at NERSC. Hurdles such as data transfer and management, job scheduling, parallel analysis routines, and publication require a lot of forethought and planning to ensure that proper quality control mechanisms are in place. These steps require effectively utilizing a combination of well tested and newly developed functionality to move data, perform analysis, apply statistical routines, and finally, serve results and tools to the greater scientific community. As part of the CAlibrated and Systematic Characterization, Attribution and Detection of Extremes (CASCADE) project we highlight a stack of tools our team utilizes and has developed to ensure that large scale simulation and analysis work are commonplace and provide operations that assist in everything from generation/procurement of data (HTAR/Globus) to automating publication of results to portals like the Earth Systems Grid Federation (ESGF), all while executing everything in between in a scalable environment in a task parallel way (MPI). We highlight the use and benefit of these tools by showing several climate science analysis use cases they have been applied to.

  2. Using the extended parallel process model to prevent noise-induced hearing loss among coal miners in Appalachia

    Energy Technology Data Exchange (ETDEWEB)

    Murray-Johnson, L.; Witte, K.; Patel, D.; Orrego, V.; Zuckerman, C.; Maxfield, A.M.; Thimons, E.D. [Ohio State University, Columbus, OH (US)

    2004-12-15

    Occupational noise-induced hearing loss is the second most self-reported occupational illness or injury in the United States. Among coal miners, more than 90% of the population reports a hearing deficit by age 55. In this formative evaluation, focus groups were conducted with coal miners in Appalachia to ascertain whether miners perceive hearing loss as a major health risk and if so, what would motivate the consistent wearing of hearing protection devices (HPDs). The theoretical framework of the Extended Parallel Process Model was used to identify the miners' knowledge, attitudes, beliefs, and current behaviors regarding hearing protection. Focus group participants had strong perceived severity and varying levels of perceived susceptibility to hearing loss. Various barriers significantly reduced the self-efficacy and the response efficacy of using hearing protection.

  3. New computing systems, future computing environment, and their implications on structural analysis and design

    Science.gov (United States)

    Noor, Ahmed K.; Housner, Jerrold M.

    1993-01-01

    Recent advances in computer technology that are likely to impact structural analysis and design of flight vehicles are reviewed. A brief summary is given of the advances in microelectronics, networking technologies, and in the user-interface hardware and software. The major features of new and projected computing systems, including high performance computers, parallel processing machines, and small systems, are described. Advances in programming environments, numerical algorithms, and computational strategies for new computing systems are reviewed. The impact of the advances in computer technology on structural analysis and the design of flight vehicles is described. A scenario for future computing paradigms is presented, and the near-term needs in the computational structures area are outlined.

  4. High Performance Computing Facility Operational Assessment, CY 2011 Oak Ridge Leadership Computing Facility

    Energy Technology Data Exchange (ETDEWEB)

    Baker, Ann E [ORNL; Barker, Ashley D [ORNL; Bland, Arthur S Buddy [ORNL; Boudwin, Kathlyn J. [ORNL; Hack, James J [ORNL; Kendall, Ricky A [ORNL; Messer, Bronson [ORNL; Rogers, James H [ORNL; Shipman, Galen M [ORNL; Wells, Jack C [ORNL; White, Julia C [ORNL; Hudson, Douglas L [ORNL

    2012-02-01

    Oak Ridge National Laboratory's Leadership Computing Facility (OLCF) continues to deliver the most powerful resources in the U.S. for open science. At 2.33 petaflops peak performance, the Cray XT Jaguar delivered more than 1.4 billion core hours in calendar year (CY) 2011 to researchers around the world for computational simulations relevant to national and energy security; advancing the frontiers of knowledge in physical sciences and areas of biological, medical, environmental, and computer sciences; and providing world-class research facilities for the nation's science enterprise. Users reported more than 670 publications this year arising from their use of OLCF resources. Of these we report the 300 in this review that are consistent with guidance provided. Scientific achievements by OLCF users cut across all range scales from atomic to molecular to large-scale structures. At the atomic scale, researchers discovered that the anomalously long half-life of Carbon-14 can be explained by calculating, for the first time, the very complex three-body interactions between all the neutrons and protons in the nucleus. At the molecular scale, researchers combined experimental results from LBL's light source and simulations on Jaguar to discover how DNA replication continues past a damaged site so a mutation can be repaired later. Other researchers combined experimental results from ORNL's Spallation Neutron Source and simulations on Jaguar to reveal the molecular structure of ligno-cellulosic material used in bioethanol production. This year, Jaguar has been used to do billion-cell CFD calculations to develop shock wave compression turbo machinery as a means to meet DOE goals for reducing carbon sequestration costs. General Electric used Jaguar to calculate the unsteady flow through turbo machinery to learn what efficiencies the traditional steady flow assumption is hiding from designers. Even a 1% improvement in turbine design can save the nation

  5. Comparing the performance of SIMD computers by running large air pollution models

    DEFF Research Database (Denmark)

    Brown, J.; Hansen, Per Christian; Wasniewski, J.

    1996-01-01

    To compare the performance and use of three massively parallel SIMD computers, we implemented a large air pollution model on these computers. Using a realistic large-scale model, we gained detailed insight about the performance of the computers involved when used to solve large-scale scientific...... problems that involve several types of numerical computations. The computers used in our study are the Connection Machines CM-200 and CM-5, and the MasPar MP-2216...

  6. Parallel computers and three-dimensional computational electromagnetics

    International Nuclear Information System (INIS)

    Madsen, N.K.

    1994-01-01

    The authors have continued to enhance their ability to use new massively parallel processing computers to solve time-domain electromagnetic problems. New vectorization techniques have improved the performance of their code DSI3D by factors of 5 to 15, depending on the computer used. New radiation boundary conditions and far-field transformations now allow the computation of radar cross-section values for complex objects. A new parallel-data extraction code has been developed that allows the extraction of data subsets from large problems, which have been run on parallel computers, for subsequent post-processing on workstations with enhanced graphics capabilities. A new charged-particle-pushing version of DSI3D is under development. Finally, DSI3D has become a focal point for several new Cooperative Research and Development Agreement activities with industrial companies such as Lockheed Advanced Development Company, Varian, Hughes Electron Dynamics Division, General Atomic, and Cray

  7. High performance computing in power and energy systems

    Energy Technology Data Exchange (ETDEWEB)

    Khaitan, Siddhartha Kumar [Iowa State Univ., Ames, IA (United States); Gupta, Anshul (eds.) [IBM Watson Research Center, Yorktown Heights, NY (United States)

    2013-07-01

    The twin challenge of meeting global energy demands in the face of growing economies and populations while restricting greenhouse gas emissions is one of the most daunting that humanity has ever faced. Smart electrical generation and distribution infrastructure will play a crucial role in meeting these challenges. We will need to develop capabilities to handle the large volumes of data generated by power system components such as PMUs, DFRs and other data acquisition devices, as well as the capacity to process these data at high resolution via multi-scale and multi-period simulations, cascading and security analysis, interaction between hybrid systems (electric, transport, gas, oil, coal, etc.) and so on, to obtain meaningful information in real time and ensure a secure, reliable and stable power grid. Advanced research on the development and implementation of market-ready, leading-edge, high-speed enabling technologies and algorithms for solving real-time, dynamic, resource-critical problems will be required for dynamic security analysis targeted toward successful implementation of Smart Grid initiatives. This book aims to bring together some of the latest research developments, as well as thoughts on future research directions, for high performance computing applications in electric power system planning, operations, security, markets, and grid integration of alternative energy sources.

  8. The Future of Software Engineering for High Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Pope, G [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2015-07-16

    DOE ASCR requested that, from May through mid-July 2015, a study group identify issues and recommend solutions, from a software engineering perspective, for transitioning into the next generation of High Performance Computing. The approach used was to ask some of the DOE complex experts who will be responsible for doing this work to contribute to the study group. The technique used was to solicit elevator speeches: short, concise write-ups done as if the author were a speaker with only a few minutes to convince a decision maker of their top issues. Pages 2-18 contain the original texts of the contributed elevator speeches and end notes identifying the 20 contributors. The study group also ranked the importance of each topic, and those scores are displayed with each topic heading. A perfect score (and highest priority) is three, two is medium priority, and one is lowest priority. The highest scoring topic areas were software engineering and testing resources; the lowest scoring area was compliance with DOE standards. The following two paragraphs are an elevator speech summarizing the contributed elevator speeches. Each sentence or phrase in the summary is hyperlinked to its source via a numeral embedded in the text. A one-line risk statement has also been added to each topic to allow future risk tracking and mitigation.

  9. Tablet computer enhanced training improves internal medicine exam performance.

    Science.gov (United States)

    Baumgart, Daniel C; Wende, Ilja; Grittner, Ulrike

    2017-01-01

    Traditional teaching concepts in medical education do not take full advantage of current information technology. We aimed to objectively determine the impact of Tablet PC enhanced training on learning experience and MKSAP® (medical knowledge self-assessment program) exam performance. In this single-center, prospective, controlled study, final-year medical students and medical residents doing an inpatient service rotation were alternately assigned to either the active test group (Tablet PC with a custom multimedia education software package) or the traditional education (control) group. All completed an extensive questionnaire to collect socio-demographic data and to evaluate educational status, computer affinity and skills, problem solving, eLearning knowledge, and self-rated medical knowledge. Both groups were MKSAP® tested at the beginning and the end of their rotation. The MKSAP® score at the final exam was the primary endpoint. Data from 55 participants (tablet n = 24, controls n = 31; 36.4% male; median age 28 years; 65.5% students) were evaluable. The mean MKSAP® score improved in the Tablet PC group (score Δ +8, SD 11) but not in the control group (score Δ −7, SD 11). After adjustment for baseline score and confounders, the Tablet PC group showed on average 11% better MKSAP® test results than the control group. These results support adding Tablet PC based eLearning to the respective training programs.

  10. Current configuration and performance of the TFTR computer system

    International Nuclear Information System (INIS)

    Sauthoff, N.R.; Barnes, D.J.; Daniels, R.; Davis, S.; Reid, A.; Snyder, T.; Oliaro, G.; Stark, W.; Thompson, J.R. Jr.

    1986-01-01

    Developments in the TFTR (Tokamak Fusion Test Reactor) computer support system since its startup phases are described. Early emphasis on tokamak process control has been augmented by improved physics data handling, both on-line and off-line. Data acquisition volume and rate have been increased, and data is transmitted automatically to a new VAX-based off-line data reduction system. The number of interface points has increased dramatically, as has the number of man-machine interfaces. The graphics system performance has been accelerated by the introduction of parallelism, and new features such as shadowing and device independence have been added. To support multicycle operation for neutral beam conditioning and independence, the program control system has been generalized. A status and alarm system, including calculated variables, is in the installation phase. System reliability has been enhanced by both the redesign of weaker components and installation of a system status monitor. Development productivity has been enhanced by the addition of tools

  11. Lightweight Provenance Service for High-Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Dai, Dong; Chen, Yong; Carns, Philip; Jenkins, John; Ross, Robert

    2017-09-09

    Provenance describes detailed information about the history of a piece of data, containing the relationships among elements such as users, processes, jobs, and workflows that contribute to the existence of data. Provenance is key to supporting many data management functionalities that are increasingly important in operations such as identifying data sources, parameters, or assumptions behind a given result; auditing data usage; or understanding details about how inputs are transformed into outputs. Despite its importance, however, provenance support is largely underdeveloped in highly parallel architectures and systems. One major challenge is the demanding requirements of providing provenance service in situ. The need to remain lightweight and to be always on often conflicts with the need to be transparent and offer an accurate catalog of details regarding the applications and systems. To tackle this challenge, we introduce a lightweight provenance service, called LPS, for high-performance computing (HPC) systems. LPS leverages a kernel instrumentation mechanism to achieve transparency and introduces representative execution and flexible granularity to capture comprehensive provenance with controllable overhead. Extensive evaluations and use cases have confirmed its efficiency and usability. We believe that LPS can be integrated into current and future HPC systems to support a variety of data management needs.

  12. Building a High Performance Computing Infrastructure for Novosibirsk Scientific Center

    International Nuclear Information System (INIS)

    Adakin, A; Chubarov, D; Nikultsev, V; Belov, S; Kaplin, V; Sukharev, A; Zaytsev, A; Kalyuzhny, V; Kuchin, N; Lomakin, S

    2011-01-01

    Novosibirsk Scientific Center (NSC), also known worldwide as Akademgorodok, is one of the largest Russian scientific centers, hosting Novosibirsk State University (NSU) and more than 35 research organizations of the Siberian Branch of the Russian Academy of Sciences, including the Budker Institute of Nuclear Physics (BINP), the Institute of Computational Technologies (ICT), and the Institute of Computational Mathematics and Mathematical Geophysics (ICM and MG). Since each institute has specific requirements on the architecture of the computing farms involved in its research field, several computing facilities are currently hosted by NSC institutes, each optimized for a particular set of tasks; the largest are the NSU Supercomputer Center, the Siberian Supercomputer Center (ICM and MG), and the Grid Computing Facility of BINP. Recently a dedicated optical network with an initial bandwidth of 10 Gbps connecting these three facilities was built in order to make it possible to share the computing resources among the research communities of the participating institutes, thus providing a common platform for building the computing infrastructure for various scientific projects. Unification of the computing infrastructure is achieved by extensive use of virtualization technologies based on the XEN and KVM platforms. The solution implemented was tested thoroughly within the computing environment of the KEDR detector experiment, which is being carried out at BINP, and is foreseen to be applied to the use cases of other HEP experiments in the near future.

  13. Big Data and High-Performance Computing in Global Seismology

    Science.gov (United States)

    Bozdag, Ebru; Lefebvre, Matthieu; Lei, Wenjie; Peter, Daniel; Smith, James; Komatitsch, Dimitri; Tromp, Jeroen

    2014-05-01

    Much of our knowledge of Earth's interior is based on seismic observations and measurements. Adjoint methods provide an efficient way of incorporating 3D full wave propagation in iterative seismic inversions to enhance tomographic images and thus our understanding of processes taking place inside the Earth. Our aim is to take adjoint tomography, which has been successfully applied to regional and continental scale problems, further to image the entire planet. This is one of the extreme imaging challenges in seismology, mainly due to the intense computational requirements and the vast amount of high-quality seismic data that can potentially be assimilated. We have started low-resolution inversions (T > 30 s and T > 60 s for body and surface waves, respectively) with a limited data set (253 carefully selected earthquakes and seismic data from permanent and temporary networks) on Oak Ridge National Laboratory's Cray XK7 "Titan" system. Recent improvements in our 3D global wave propagation solvers, such as a GPU version of the SPECFEM3D_GLOBE package, will enable us to perform higher-resolution (T > 9 s) and longer duration (~180 m) simulations to take advantage of high-frequency body waves and major-arc surface waves, thereby improving imbalanced ray coverage resulting from the uneven global distribution of sources and receivers. Our ultimate goal is to use all earthquakes in the global CMT catalogue within the magnitude range of our interest and data from all available seismic networks. To take full advantage of computational resources, we need a solid framework to manage big data sets during numerical simulations, pre-processing (i.e., data requests and quality checks, processing data, window selection, etc.) and post-processing (i.e., pre-conditioning and smoothing kernels, etc.). We address the bottlenecks in our global seismic workflow, which mainly come from heavy I/O traffic during simulations and the pre- and post-processing stages, by defining new data

  14. Performance studies of four-dimensional cone beam computed tomography

    International Nuclear Information System (INIS)

    Qi Zhihua; Chen Guanghong

    2011-01-01

    Four-dimensional cone beam computed tomography (4DCBCT) has been proposed to characterize the breathing motion of tumors before radiotherapy treatment. However, when the acquired cone beam projection data are retrospectively gated into several respiratory phases, the available data to reconstruct each phase is under-sampled and thus causes streaking artifacts in the reconstructed images. To solve the under-sampling problem and improve image quality in 4DCBCT, various methods have been developed. This paper presents performance studies of three different 4DCBCT methods based on different reconstruction algorithms. The aims of this paper are to study (1) the relationship between the accuracy of the extracted motion trajectories and the data acquisition time of a 4DCBCT scan and (2) the relationship between the accuracy of the extracted motion trajectories and the number of phase bins used to sort projection data. These aims will be applied to three different 4DCBCT methods: conventional filtered backprojection reconstruction (FBP), FBP with McKinnon-Bates correction (MB) and prior image constrained compressed sensing (PICCS) reconstruction. A hybrid phantom consisting of realistic chest anatomy and a moving elliptical object with known 3D motion trajectories was constructed by superimposing the analytical projection data of the moving object to the simulated projection data from a chest CT volume dataset. CBCT scans with gantry rotation times from 1 to 4 min were simulated, and the generated projection data were sorted into 5, 10 and 20 phase bins before different methods were used to reconstruct 4D images. The motion trajectories of the moving object were extracted using a fast free-form deformable registration algorithm. The root mean square errors (RMSE) of the extracted motion trajectories were evaluated for all simulated cases to quantitatively study the performance. The results demonstrate (1) longer acquisition times result in more accurate motion delineation

  15. Graphics processing unit based computation for NDE applications

    Science.gov (United States)

    Nahas, C. A.; Rajagopal, Prabhu; Balasubramaniam, Krishnan; Krishnamurthy, C. V.

    2012-05-01

    Advances in parallel processing in recent years are helping to reduce the cost of numerical simulation. Breakthroughs in Graphical Processing Unit (GPU) based computation now offer the prospect of further drastic improvements. The introduction of 'compute unified device architecture' (CUDA) by NVIDIA (the global technology company based in Santa Clara, California, USA) has made programming GPUs for general purpose computing accessible to the average programmer. Here we use CUDA to develop parallel finite difference schemes as applicable to two problems of interest to the NDE community, namely heat diffusion and elastic wave propagation. The implementations are two-dimensional. Performance improvement of the GPU implementation against a serial CPU implementation is then discussed.
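
    The abstract above describes explicit finite-difference kernels ported to CUDA. As a point of reference only (not code from the paper), the following is a minimal serial C sketch of the explicit 2-D heat-diffusion update such a kernel computes; the grid size, diffusivity, and boundary handling are assumed values, and the CUDA version would assign one thread per grid point.

      #include <stdio.h>

      /* Explicit (FTCS) finite-difference update for 2-D heat diffusion on an
       * N x N grid. Serial CPU reference of the kind of kernel described in the
       * abstract; grid size, material constants, and boundary handling are
       * illustrative assumptions. */
      #define N 256

      static double u[N][N], unew[N][N];

      int main(void)
      {
          const double alpha = 1.0e-4;              /* thermal diffusivity (assumed) */
          const double h     = 1.0e-2;              /* grid spacing (assumed)        */
          const double dt    = 0.2 * h * h / alpha; /* time step within FTCS limit   */
          const double r     = alpha * dt / (h * h);

          u[N / 2][N / 2] = 100.0;                  /* initial hot spot in the centre */

          for (int step = 0; step < 1000; ++step) {
              for (int i = 1; i < N - 1; ++i)
                  for (int j = 1; j < N - 1; ++j)
                      unew[i][j] = u[i][j] + r * (u[i + 1][j] + u[i - 1][j] +
                                                  u[i][j + 1] + u[i][j - 1] -
                                                  4.0 * u[i][j]);
              /* copy interior back; boundaries stay fixed at zero */
              for (int i = 1; i < N - 1; ++i)
                  for (int j = 1; j < N - 1; ++j)
                      u[i][j] = unew[i][j];
          }
          printf("centre temperature after 1000 steps: %g\n", u[N / 2][N / 2]);
          return 0;
      }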

  16. Department of Energy research in utilization of high-performance computers

    International Nuclear Information System (INIS)

    Buzbee, B.L.; Worlton, W.J.; Michael, G.; Rodrigue, G.

    1980-08-01

    Department of Energy (DOE) and other Government research laboratories depend on high-performance computer systems to accomplish their programmatic goals. As the most powerful computer systems become available, they are acquired by these laboratories so that advances can be made in their disciplines. These advances are often the result of added sophistication to numerical models, the execution of which is made possible by high-performance computer systems. However, high-performance computer systems have become increasingly complex, and consequently it has become increasingly difficult to realize their potential performance. The result is a need for research on issues related to the utilization of these systems. This report gives a brief description of high-performance computers, and then addresses the use of and future needs for high-performance computers within DOE, the growing complexity of applications within DOE, and areas of high-performance computer systems warranting research. 1 figure

  17. A performance model for the communication in fast multipole methods on high-performance computing platforms

    KAUST Repository

    Ibeid, Huda

    2016-03-04

    Exascale systems are predicted to have approximately 1 billion cores, assuming gigahertz cores. Limitations on affordable network topologies for distributed memory systems of such massive scale bring new challenges to the currently dominant parallel programming model. Currently, there are many efforts to evaluate the hardware and software bottlenecks of exascale designs. It is therefore of interest to model application performance and to understand what changes need to be made to ensure extrapolated scalability. The fast multipole method (FMM) was originally developed for accelerating N-body problems in astrophysics and molecular dynamics but has recently been extended to a wider range of problems. Its high arithmetic intensity combined with its linear complexity and asynchronous communication patterns make it a promising algorithm for exascale systems. In this paper, we discuss the challenges for FMM on current parallel computers and future exascale architectures, with a focus on internode communication. We focus on the communication part only; the efficiency of the computational kernels is beyond the scope of the present study. We develop a performance model that considers the communication patterns of the FMM and observe a good match between our model and the actual communication time on four high-performance computing (HPC) systems, when latency, bandwidth, network topology, and multicore penalties are all taken into account. To our knowledge, this is the first formal characterization of internode communication in FMM that validates the model against actual measurements of communication time. The ultimate communication model is predictive in an absolute sense; however, on complex systems, this objective is often out of reach or of a difficulty out of proportion to its benefit when there exists a simpler model that is inexpensive and sufficient to guide coding decisions leading to improved scaling. The current model provides such guidance.
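
    The paper's model accounts for latency, bandwidth, network topology, and multicore penalties; the model itself is not reproduced in this record. As a much simpler illustration of the style of reasoning involved, the C sketch below estimates per-rank communication time with a plain latency-bandwidth (alpha-beta) term per message; it is a generic textbook model, not the FMM-specific model of the paper, and all parameter values are assumptions.

      #include <stdio.h>

      /* Plain alpha-beta estimate of the time to send `nmsg` messages of
       * `bytes` bytes each: T = nmsg * (alpha + bytes / beta), where alpha is
       * the per-message latency [s] and beta the bandwidth [bytes/s]. */
      static double comm_time(long nmsg, double bytes, double alpha, double beta)
      {
          return (double)nmsg * (alpha + bytes / beta);
      }

      int main(void)
      {
          double alpha = 1.0e-6;        /* 1 microsecond latency (assumed)  */
          double beta  = 5.0e9;         /* 5 GB/s link bandwidth (assumed)  */
          long   nmsg  = 26;            /* messages per rank (assumed)      */
          double bytes = 64.0 * 1024.0; /* 64 KiB per message (assumed)     */

          printf("estimated communication time per rank: %.3e s\n",
                 comm_time(nmsg, bytes, alpha, beta));
          return 0;
      }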

  18. The European computer model for optronic system performance prediction (ECOMOS)

    NARCIS (Netherlands)

    Kessler, S.; Bijl, P.; Labarre, L.; Repasi, E.; Wittenstein, W.; Bürsing, H.

    2017-01-01

    ECOMOS is a multinational effort within the framework of an EDA Project Arrangement. Its aim is to provide a generally accepted and harmonized European computer model for computing nominal Target Acquisition (TA) ranges of optronic imagers operating in the Visible or thermal Infrared (IR). The

  19. Impact of computer advances on future finite elements computations. [for aircraft and spacecraft design

    Science.gov (United States)

    Fulton, Robert E.

    1985-01-01

    Research performed over the past 10 years in engineering data base management and parallel computing is discussed, and certain opportunities for research toward the next generation of structural analysis capability are proposed. Particular attention is given to data base management associated with the IPAD project and parallel processing associated with the Finite Element Machine project, both sponsored by NASA, and a near term strategy for a distributed structural analysis capability based on relational data base management software and parallel computers for a future structural analysis system.

  20. High Performance Computing Facility Operational Assessment, FY 2011 Oak Ridge Leadership Computing Facility

    Energy Technology Data Exchange (ETDEWEB)

    Baker, Ann E [ORNL; Bland, Arthur S Buddy [ORNL; Hack, James J [ORNL; Barker, Ashley D [ORNL; Boudwin, Kathlyn J. [ORNL; Kendall, Ricky A [ORNL; Messer, Bronson [ORNL; Rogers, James H [ORNL; Shipman, Galen M [ORNL; Wells, Jack C [ORNL; White, Julia C [ORNL

    2011-08-01

    Oak Ridge National Laboratory's Leadership Computing Facility (OLCF) continues to deliver the most powerful resources in the U.S. for open science. At 2.33 petaflops peak performance, the Cray XT Jaguar delivered more than 1.5 billion core hours in calendar year (CY) 2010 to researchers around the world for computational simulations relevant to national and energy security; advancing the frontiers of knowledge in physical sciences and areas of biological, medical, environmental, and computer sciences; and providing world-class research facilities for the nation's science enterprise. Scientific achievements by OLCF users range from collaboration with university experimentalists to produce a working supercapacitor that uses atom-thick sheets of carbon materials to finely determining the resolution requirements for simulations of coal gasifiers and their components, thus laying the foundation for development of commercial-scale gasifiers. OLCF users are pushing the boundaries with software applications sustaining more than one petaflop of performance in the quest to illuminate the fundamental nature of electronic devices. Other teams of researchers are working to resolve predictive capabilities of climate models, to refine and validate genome sequencing, and to explore the most fundamental materials in nature - quarks and gluons - and their unique properties. Details of these scientific endeavors - not possible without access to leadership-class computing resources - are detailed in Section 4 of this report and in the INCITE in Review. Effective operations of the OLCF play a key role in the scientific missions and accomplishments of its users. This Operational Assessment Report (OAR) will delineate the policies, procedures, and innovations implemented by the OLCF to continue delivering a petaflop-scale resource for cutting-edge research. The 2010 operational assessment of the OLCF yielded recommendations that have been addressed (Reference Section 1) and

  1. Performance management of high performance computing for medical image processing in Amazon Web Services

    Science.gov (United States)

    Bao, Shunxing; Damon, Stephen M.; Landman, Bennett A.; Gokhale, Aniruddha

    2016-03-01

    Adopting high performance cloud computing for medical image processing is a popular trend given the pressing needs of large studies. Amazon Web Services (AWS) provide reliable, on-demand, and inexpensive cloud computing services. Our research objective is to implement an affordable, scalable and easy-to-use AWS framework for the Java Image Science Toolkit (JIST). JIST is a plugin for Medical-Image Processing, Analysis, and Visualization (MIPAV) that provides a graphical pipeline implementation allowing users to quickly test and develop pipelines. JIST is DRMAA-compliant allowing it to run on portable batch system grids. However, as new processing methods are implemented and developed, memory may often be a bottleneck for not only lab computers, but also possibly some local grids. Integrating JIST with the AWS cloud alleviates these possible restrictions and does not require users to have deep knowledge of programming in Java. Workflow definition/management and cloud configurations are two key challenges in this research. Using a simple unified control panel, users have the ability to set the numbers of nodes and select from a variety of pre-configured AWS EC2 nodes with different numbers of processors and memory storage. Intuitively, we configured Amazon S3 storage to be mounted by pay-for-use Amazon EC2 instances. Hence, S3 storage is recognized as a shared cloud resource. The Amazon EC2 instances provide pre-installs of all necessary packages to run JIST. This work presents an implementation that facilitates the integration of JIST with AWS. We describe the theoretical cost/benefit formulae to decide between local serial execution versus cloud computing and apply this analysis to an empirical diffusion tensor imaging pipeline.

  2. Performance Management of High Performance Computing for Medical Image Processing in Amazon Web Services.

    Science.gov (United States)

    Bao, Shunxing; Damon, Stephen M; Landman, Bennett A; Gokhale, Aniruddha

    2016-02-27

    Adopting high performance cloud computing for medical image processing is a popular trend given the pressing needs of large studies. Amazon Web Services (AWS) provide reliable, on-demand, and inexpensive cloud computing services. Our research objective is to implement an affordable, scalable and easy-to-use AWS framework for the Java Image Science Toolkit (JIST). JIST is a plugin for Medical-Image Processing, Analysis, and Visualization (MIPAV) that provides a graphical pipeline implementation allowing users to quickly test and develop pipelines. JIST is DRMAA-compliant allowing it to run on portable batch system grids. However, as new processing methods are implemented and developed, memory may often be a bottleneck for not only lab computers, but also possibly some local grids. Integrating JIST with the AWS cloud alleviates these possible restrictions and does not require users to have deep knowledge of programming in Java. Workflow definition/management and cloud configurations are two key challenges in this research. Using a simple unified control panel, users have the ability to set the numbers of nodes and select from a variety of pre-configured AWS EC2 nodes with different numbers of processors and memory storage. Intuitively, we configured Amazon S3 storage to be mounted by pay-for-use Amazon EC2 instances. Hence, S3 storage is recognized as a shared cloud resource. The Amazon EC2 instances provide pre-installs of all necessary packages to run JIST. This work presents an implementation that facilitates the integration of JIST with AWS. We describe the theoretical cost/benefit formulae to decide between local serial execution versus cloud computing and apply this analysis to an empirical diffusion tensor imaging pipeline.
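
    The cost/benefit formulae mentioned in the abstract are not reproduced in this record. The C sketch below only illustrates the general shape such a comparison can take, trading a fixed cloud start-up and data-transfer overhead against parallel speedup across EC2 nodes; the variable names and numbers are assumptions, not the authors' model.

      #include <stdio.h>

      /* Generic comparison of local serial execution time against cloud
       * execution time with provisioning and transfer overheads. Illustrative
       * only; the constants below are assumptions, not values from the paper. */
      int main(void)
      {
          double n_jobs      = 500.0;  /* pipeline jobs to run (assumed)      */
          double t_job       = 120.0;  /* seconds per job on one core         */
          double n_nodes     = 16.0;   /* cloud instances requested           */
          double t_provision = 300.0;  /* node start-up overhead [s]          */
          double t_transfer  = 600.0;  /* data upload/download overhead [s]   */

          double t_local = n_jobs * t_job;
          double t_cloud = t_provision + t_transfer + n_jobs * t_job / n_nodes;

          printf("local serial: %.0f s, cloud (%g nodes): %.0f s\n",
                 t_local, n_nodes, t_cloud);
          printf("cloud execution is %s here\n",
                 t_cloud < t_local ? "faster" : "slower");
          return 0;
      }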

  3. The ongoing investigation of high performance parallel computing in HEP

    CERN Document Server

    Peach, Kenneth J; Böck, R K; Dobinson, Robert W; Hansroul, M; Norton, Alan Robert; Willers, Ian Malcolm; Baud, J P; Carminati, F; Gagliardi, F; McIntosh, E; Metcalf, M; Robertson, L; CERN. Geneva. Detector Research and Development Committee

    1993-01-01

    Past and current exploitation of parallel computing in High Energy Physics is summarized and a list of R & D projects in this area is presented. The applicability of new parallel hardware and software to physics problems is investigated, in the light of the requirements for computing power of LHC experiments and the current trends in the computer industry. Four main themes are discussed (possibilities for a finer grain of parallelism; fine-grain communication mechanism; usable parallel programming environment; different programming models and architectures, using standard commercial products). Parallel computing technology is potentially of interest for offline and vital for real time applications in LHC. A substantial investment in applications development and evaluation of state of the art hardware and software products is needed. A solid development environment is required at an early stage, before mainline LHC program development begins.

  4. Distributed metadata in a high performance computing environment

    Science.gov (United States)

    Bent, John M.; Faibish, Sorin; Zhang, Zhenhua; Liu, Xuezhao; Tang, Haiying

    2017-07-11

    A computer-executable method, system, and computer program product for managing metadata in a distributed storage system, wherein the distributed storage system includes one or more burst buffers enabled to operate with a distributed key-value store, the computer-executable method, system, and computer program product comprising receiving a request for metadata associated with a block of data stored in a first burst buffer of the one or more burst buffers in the distributed storage system, wherein the metadata is associated with a key-value, determining which of the one or more burst buffers stores the requested metadata, and upon determination that a first burst buffer of the one or more burst buffers stores the requested metadata, locating the key-value in a portion of the distributed key-value store accessible from the first burst buffer.
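
    The patent abstract describes determining which burst buffer stores the metadata for a requested block. One simple way to make that determination deterministic is to hash the key and map it onto the set of burst buffers; the C sketch below illustrates that idea with an FNV-1a hash. The routing scheme and names are hypothetical and are not taken from the patent.

      #include <stdio.h>
      #include <stdint.h>

      /* Hypothetical routing of a metadata key to one of `nbuffers` burst
       * buffers using an FNV-1a hash; illustrates one way the "determining
       * which burst buffer stores the requested metadata" step could work. */
      static uint64_t fnv1a(const char *key)
      {
          uint64_t h = 1469598103934665603ULL;
          for (const char *p = key; *p != '\0'; ++p) {
              h ^= (uint64_t)(unsigned char)*p;
              h *= 1099511628211ULL;
          }
          return h;
      }

      static int burst_buffer_for(const char *key, int nbuffers)
      {
          return (int)(fnv1a(key) % (uint64_t)nbuffers);
      }

      int main(void)
      {
          const char *keys[] = { "block:0042", "block:0043", "block:0044" };
          for (int i = 0; i < 3; ++i)
              printf("%s -> burst buffer %d\n",
                     keys[i], burst_buffer_for(keys[i], 4));
          return 0;
      }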

  5. Software Applications on the Peregrine System | High-Performance Computing

    Science.gov (United States)

    [Flattened application table from the Peregrine software listing; recoverable entries:] General Algebraic Modeling System (GAMS): high-level modeling system for mathematical programming; Gurobi Optimizer: solver for mathematical programming; LAMMPS: chemistry / molecular dynamics; R Statistical Computing Environment: statistics and analysis.

  6. Spying on real-time computers to improve performance

    International Nuclear Information System (INIS)

    Taff, L.M.

    1975-01-01

    The sampled program-counter histogram, an established technique for shortening the execution times of programs, is described for a real-time computer. The use of a real-time clock allows particularly easy implementation. (Auth.)
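
    The idea behind a sampled program-counter histogram is to let a periodic clock interrupt record where the program is executing so that hot spots stand out. The C sketch below keeps the idea portable by sampling a region marker rather than the raw program counter, using a POSIX profiling timer; it illustrates the technique and is not the implementation described in the paper.

      #include <signal.h>
      #include <stdio.h>
      #include <string.h>
      #include <sys/time.h>

      /* A SIGPROF timer fires periodically and increments a histogram bucket
       * for the region the program is currently executing. A real PC-histogram
       * profiler would read the interrupted program counter instead. */
      #define NREGIONS 3

      static volatile sig_atomic_t current_region = 0;
      static volatile long histogram[NREGIONS];

      static void on_profile_tick(int sig)
      {
          (void)sig;
          histogram[current_region]++;
      }

      int main(void)
      {
          struct sigaction sa;
          memset(&sa, 0, sizeof sa);
          sa.sa_handler = on_profile_tick;
          sigaction(SIGPROF, &sa, NULL);

          struct itimerval tv = { { 0, 1000 }, { 0, 1000 } }; /* 1 ms ticks */
          setitimer(ITIMER_PROF, &tv, NULL);

          volatile double x = 0.0;
          for (int r = 0; r < NREGIONS; ++r) {
              current_region = r;
              for (long i = 0; i < 20000000L * (r + 1); ++i) /* region r work */
                  x += (double)i;
          }

          for (int r = 0; r < NREGIONS; ++r)
              printf("region %d: %ld samples\n", r, histogram[r]);
          return 0;
      }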

  7. 76 FR 60939 - Metal Fatigue Analysis Performed by Computer Software

    Science.gov (United States)

    2011-09-30

    ... Software AGENCY: Nuclear Regulatory Commission. ACTION: Regulatory issue summary; request for comment... computer software package, WESTEMS TM , to demonstrate compliance with Section III, ``Rules for... Software Addressees All holders of, and applicants for, a power reactor operating license or construction...

  8. Benchmark Numerical Toolkits for High Performance Computing, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — Computational codes in physics and engineering often use implicit solution algorithms that require linear algebra tools such as Ax=b solvers, eigenvalue,...

  9. Performance Evaluation of a Mobile Wireless Computational Grid ...

    African Journals Online (AJOL)

    PROF. OLIVER OSUAGWA

    2015-12-01

    Dec 1, 2015 ... Abstract. This work developed and simulated a mathematical model for a mobile wireless computational Grid ... which mobile nodes will process the tasks ... evaluation are analytical modelling, simulation ... MATLAB 7.10.0.

  10. Introducing remarks upon the analysis of computer systems performance

    International Nuclear Information System (INIS)

    Baum, D.

    1980-05-01

    Some of the basic ideas of analytical techniques to study the behaviour of computer systems are presented. Single systems as well as networks of computers are viewed as stochastic dynamical systems which may be modelled by queueing networks. Therefore this report primarily serves as an introduction to probabilistic methods for the qualitative analysis of systems. It is supplemented by an application example of Chandy's collapsing method. (orig.) [de

  11. FY 1992 Blue Book: Grand Challenges: High Performance Computing and Communications

    Data.gov (United States)

    Networking and Information Technology Research and Development, Executive Office of the President — High performance computing and computer communications networks are becoming increasingly important to scientific advancement, economic competition, and national...

  12. FY 1993 Blue Book: Grand Challenges 1993: High Performance Computing and Communications

    Data.gov (United States)

    Networking and Information Technology Research and Development, Executive Office of the President — High performance computing and computer communications networks are becoming increasingly important to scientific advancement, economic competition, and national...

  13. A study of kinematic cues and anticipatory performance in tennis using computational manipulation and computer graphics.

    Science.gov (United States)

    Ida, Hirofumi; Fukuhara, Kazunobu; Kusubori, Seiji; Ishii, Motonobu

    2011-09-01

    Computer graphics of digital human models can be used to display human motions as visual stimuli. This study presents our technique for manipulating human motion with a forward kinematics calculation without violating anatomical constraints. A motion modulation of the upper extremity was conducted by proportionally modulating the anatomical joint angular velocity calculated by motion analysis. The effect of this manipulation was examined in a tennis situation--that is, the receiver's performance of predicting ball direction when viewing a digital model of the server's motion derived by modulating the angular velocities of the forearm or that of the elbow during the forward swing. The results showed that the faster the server's forearm pronated, the more the receiver's anticipation of the ball direction tended to the left side of the serve box. In contrast, the faster the server's elbow extended, the more the receiver's anticipation of the ball direction tended to the right. This suggests that tennis players are sensitive to the motion modulation of their opponent's racket-arm.
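
    The manipulation described above amounts to differentiating a joint-angle trajectory obtained from motion analysis, scaling the resulting anatomical joint angular velocity by a chosen factor, and re-integrating to get the modulated trajectory that drives the forward-kinematics model. The C sketch below shows that core step for a single joint; the sampling rate, scale factor, and sample data are assumptions for illustration.

      #include <stdio.h>

      /* Proportionally modulate a joint angular velocity and re-integrate it
       * into a new joint-angle trajectory (single joint, illustrative data).
       * The full method would drive a forward-kinematics model of the upper
       * extremity with the modulated angles. */
      #define NSAMPLES 6

      int main(void)
      {
          const double dt    = 0.01;  /* 100 Hz sampling (assumed)        */
          const double scale = 1.5;   /* 150% of the original velocity     */
          const double theta[NSAMPLES] =          /* original angles [rad] */
              { 0.00, 0.05, 0.15, 0.30, 0.50, 0.75 };

          double modulated[NSAMPLES];
          modulated[0] = theta[0];
          for (int k = 1; k < NSAMPLES; ++k) {
              double omega = (theta[k] - theta[k - 1]) / dt; /* angular velocity */
              modulated[k] = modulated[k - 1] + scale * omega * dt;
          }

          for (int k = 0; k < NSAMPLES; ++k)
              printf("t=%.2f s  original=%.3f  modulated=%.3f rad\n",
                     k * dt, theta[k], modulated[k]);
          return 0;
      }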

  14. Assessment of Substance Abuse Behaviors in Adolescents’: Integration of Self-Control into Extended Parallel Process Model

    Directory of Open Access Journals (Sweden)

    K Witte

    2005-04-01

    Introduction: An effective preventive health education program on drug abuse can be delivered by applying behavior change theories in a complementary fashion. Methods: The aim of this study was to assess the effectiveness of integrating self-control into the Extended Parallel Process Model for drug abuse behaviors. A sample of 189 governmental high school students participated in this survey. Information was collected individually by completing a researcher-designed questionnaire and a urinary rapid immuno-chromatography test for opium and marijuana. Results: The results show that 6.9% of students used drugs (especially opium and marijuana), and peer pressure was a determinant factor for using drugs. Moreover, the EPPM theoretical variables of perceived severity and perceived self-efficacy, together with self-control, are predictive factors of behavioral intention against substance abuse. Self-control had a significant effect on protection motivation and perceived efficacy. Low self-control was a predictive factor of drug abuse, and low self-control students had drug abuse experience. Conclusion: The results of this study suggest that integrating self-control into the EPPM can be effective in designing primary preventive programs against drug abuse and in assessing abuse and deviance behaviors among adolescent populations, especially risk seekers

  15. Serial and parallel processing in reading: investigating the effects of parafoveal orthographic information on nonisolated word recognition.

    Science.gov (United States)

    Dare, Natasha; Shillcock, Richard

    2013-01-01

    We present a novel lexical decision task and three boundary paradigm eye-tracking experiments that clarify the picture of parallel processing in word recognition in context. First, we show that lexical decision is facilitated by associated letter information to the left and right of the word, with no apparent hemispheric specificity. Second, we show that parafoveal preview of a repeat of word n at word n + 1 facilitates reading of word n relative to a control condition with an unrelated word at word n + 1. Third, using a version of the boundary paradigm that allowed for a regressive eye movement, we show no parafoveal "postview" effect on reading word n of repeating word n at word n - 1. Fourth, we repeat the second experiment but compare the effects of parafoveal previews consisting of a repeated word n with a transposed central bigram (e.g., caot for coat) and a substituted central bigram (e.g., ceit for coat), showing the latter to have a deleterious effect on processing word n, thereby demonstrating that the parafoveal preview effect is at least orthographic and not purely visual.

  16. Combining self-affirmation with the extended parallel process model: the consequences for motivation to eat more fruit and vegetables.

    Science.gov (United States)

    Napper, Lucy E; Harris, Peter R; Klein, William M P

    2014-01-01

    There is potential for fruitful integration of research using the Extended Parallel Process Model (EPPM) with research using Self-affirmation Theory. However, to date no studies have attempted to do this. This article reports an experiment that tests whether (a) the effects of a self-affirmation manipulation add to those of EPPM variables in predicting intentions to improve a health behavior and (b) self-affirmation moderates the relationship between EPPM variables and intentions. Participants (N = 80) were randomized to either a self-affirmation or control condition prior to receiving personally relevant health information about the risks of not eating at least five portions of fruit and vegetables per day. A hierarchical regression model revealed that efficacy, threat × efficacy, self-affirmation, and self-affirmation × efficacy all uniquely contributed to the prediction of intentions to eat at least five portions per day. Self-affirmed participants and those with higher efficacy reported greater motivation to change. Threat predicted intentions at low levels of efficacy, but not at high levels. Efficacy had a stronger relationship with intentions in the nonaffirmed condition than in the self-affirmed condition. The findings indicate that self-affirmation processes can moderate the impact of variables in the EPPM and also add to the variance explained. We argue that there is potential for integration of the two traditions of research, to the benefit of both.

  17. Multi-target parallel processing approach for gene-to-structure determination of the influenza polymerase PB2 subunit.

    Science.gov (United States)

    Armour, Brianna L; Barnes, Steve R; Moen, Spencer O; Smith, Eric; Raymond, Amy C; Fairman, James W; Stewart, Lance J; Staker, Bart L; Begley, Darren W; Edwards, Thomas E; Lorimer, Donald D

    2013-06-28

    Pandemic outbreaks of highly virulent influenza strains can cause widespread morbidity and mortality in human populations worldwide. In the United States alone, an average of 41,400 deaths and 1.86 million hospitalizations are caused by influenza virus infection each year (1). Point mutations in the polymerase basic protein 2 subunit (PB2) have been linked to the adaptation of the viral infection in humans (2). Findings from such studies have revealed the biological significance of PB2 as a virulence factor, thus highlighting its potential as an antiviral drug target. The structural genomics program put forth by the National Institute of Allergy and Infectious Disease (NIAID) provides funding to Emerald Bio and three other Pacific Northwest institutions that together make up the Seattle Structural Genomics Center for Infectious Disease (SSGCID). The SSGCID is dedicated to providing the scientific community with three-dimensional protein structures of NIAID category A-C pathogens. Making such structural information available to the scientific community serves to accelerate structure-based drug design. Structure-based drug design plays an important role in drug development. Pursuing multiple targets in parallel greatly increases the chance of success for new lead discovery by targeting a pathway or an entire protein family. Emerald Bio has developed a high-throughput, multi-target parallel processing pipeline (MTPP) for gene-to-structure determination to support the consortium. Here we describe the protocols used to determine the structure of the PB2 subunit from four different influenza A strains.

  18. COMPUTING

    CERN Multimedia

    I. Fisk

    2011-01-01

    Introduction CMS distributed computing system performed well during the 2011 start-up. The events in 2011 have more pile-up and are more complex than last year; this results in longer reconstruction times and harder events to simulate. Significant increases in computing capacity were delivered in April for all computing tiers, and the utilisation and load is close to the planning predictions. All computing centre tiers performed their expected functionalities. Heavy-Ion Programme The CMS Heavy-Ion Programme had a very strong showing at the Quark Matter conference. A large number of analyses were shown. The dedicated heavy-ion reconstruction facility at the Vanderbilt Tier-2 is still involved in some commissioning activities, but is available for processing and analysis. Facilities and Infrastructure Operations Facility and Infrastructure operations have been active with operations and several important deployment tasks. Facilities participated in the testing and deployment of WMAgent and WorkQueue+Request...

  19. On the impact of quantum computing technology on future developments in high-performance scientific computing

    OpenAIRE

    Möller, Matthias; Vuik, Cornelis

    2017-01-01

    Quantum computing technologies have become a hot topic in academia and industry receiving much attention and financial support from all sides. Building a quantum computer that can be used practically is in itself an outstanding challenge that has become the ‘new race to the moon’. Next to researchers and vendors of future computing technologies, national authorities are showing strong interest in maturing this technology due to its known potential to break many of today’s encryption technique...

  20. Reducing power consumption while performing collective operations on a plurality of compute nodes

    Science.gov (United States)

    Archer, Charles J [Rochester, MN; Blocksome, Michael A [Rochester, MN; Peters, Amanda E [Rochester, MN; Ratterman, Joseph D [Rochester, MN; Smith, Brian E [Rochester, MN

    2011-10-18

    Methods, apparatus, and products are disclosed for reducing power consumption while performing collective operations on a plurality of compute nodes that include: receiving, by each compute node, instructions to perform a type of collective operation; selecting, by each compute node from a plurality of collective operations for the collective operation type, a particular collective operation in dependence upon power consumption characteristics for each of the plurality of collective operations; and executing, by each compute node, the selected collective operation.
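
    The claimed method has three steps: each compute node receives the type of collective to perform, selects a particular implementation of that collective based on per-implementation power-consumption characteristics, and then executes the selected collective. The C sketch below mirrors the selection step over a small table of hypothetical implementations and power figures; the names and numbers are illustrative, not taken from the patent.

      #include <stdio.h>

      /* Hypothetical per-implementation power characteristics for one
       * collective operation type (e.g. allreduce). Values are assumptions. */
      struct collective_impl {
          const char *name;
          double      watts_per_node;  /* assumed average draw during the op */
      };

      static const struct collective_impl allreduce_impls[] = {
          { "recursive-doubling", 42.0 },
          { "ring",               37.5 },
          { "binomial-tree",      40.2 },
      };

      /* Select the implementation with the lowest assumed power draw. */
      static const struct collective_impl *select_lowest_power(
              const struct collective_impl *impls, int n)
      {
          const struct collective_impl *best = &impls[0];
          for (int i = 1; i < n; ++i)
              if (impls[i].watts_per_node < best->watts_per_node)
                  best = &impls[i];
          return best;
      }

      int main(void)
      {
          const struct collective_impl *chosen =
              select_lowest_power(allreduce_impls, 3);
          printf("selected %s (%.1f W/node); the node would then execute it\n",
                 chosen->name, chosen->watts_per_node);
          return 0;
      }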

  1. Performance analysis of cloud computing services for many-tasks scientific computing

    NARCIS (Netherlands)

    Iosup, A.; Ostermann, S.; Yigitbasi, M.N.; Prodan, R.; Fahringer, T.; Epema, D.H.J.

    2011-01-01

    Cloud computing is an emerging commercial infrastructure paradigm that promises to eliminate the need for maintaining expensive computing facilities by companies and institutes alike. Through the use of virtualization and resource time sharing, clouds serve with a single set of physical resources a

  2. A performance analysis of EC2 cloud computing services for scientific computing

    NARCIS (Netherlands)

    Ostermann, S.; Iosup, A.; Yigitbasi, M.N.; Prodan, R.; Fahringer, T.; Epema, D.H.J.; Avresky, D.; Diaz, M.; Bode, A.; Bruno, C.; Dekel, E.

    2010-01-01

    Cloud Computing is emerging today as a commercial infrastructure that eliminates the need for maintaining expensive computing hardware. Through the use of virtualization, clouds promise to address with the same shared set of physical resources a large user base with different needs. Thus, clouds

  3. Computer Access and Computer Use for Science Performance of Racial and Linguistic Minority Students

    Science.gov (United States)

    Chang, Mido; Kim, Sunha

    2009-01-01

    This study examined the effects of computer access and computer use on the science achievement of elementary school students, with focused attention on the effects for racial and linguistic minority students. The study used the Early Childhood Longitudinal Study (ECLS-K) database and conducted statistical analyses with proper weights and…

  4. On the impact of quantum computing technology on future developments in high-performance scientific computing

    NARCIS (Netherlands)

    Möller, M.; Vuik, C.

    2017-01-01

    Quantum computing technologies have become a hot topic in academia and industry receiving much attention and financial support from all sides. Building a quantum computer that can be used practically is in itself an outstanding challenge that has become the ‘new race to the moon’. Next to

  5. Automated procedure for performing computer security risk analysis

    International Nuclear Information System (INIS)

    Smith, S.T.; Lim, J.J.

    1984-05-01

    Computers, the invisible backbone of nuclear safeguards, monitor and control plant operations and support many materials accounting systems. Our automated procedure to assess computer security effectiveness differs from traditional risk analysis methods. The system is modeled as an interactive questionnaire, fully automated on a portable microcomputer. A set of modular event trees links the questionnaire to the risk assessment. Qualitative scores are obtained for target vulnerability, and qualitative impact measures are evaluated for a spectrum of threat-target pairs. These are then combined by a linguistic algebra to provide an accurate and meaningful risk measure. 12 references, 7 figures

  6. Validation and computing and performance studies for the ATLAS simulation

    CERN Document Server

    Marshall, Z; The ATLAS collaboration

    2009-01-01

    We present the validation of the ATLAS simulation software project. Software development is controlled by nightly builds and several levels of automatic tests to ensure stability. Computing validation, including CPU time, memory, and disk space required per event, is benchmarked for all software releases. Several different physics processes and event types are checked to thoroughly test all aspects of the detector simulation. The robustness of the simulation software is demonstrated by the production of 500 million events on the Worldwide LHC Computing Grid in the last year.

  7. Performance predictions for solar-chemical convertors by computer simulation

    Energy Technology Data Exchange (ETDEWEB)

    Luttmer, J.D.; Trachtenberg, I.

    1985-08-01

    A computer model which simulates the operation of the Texas Instruments solar-chemical convertor (SCC) was developed. The model allows optimization of SCC processes, materials, and configuration by facilitating decisions on tradeoffs among ease of manufacturing, power conversion efficiency, and cost effectiveness. The model includes various algorithms which define the electrical, electrochemical, and resistance parameters and which describe the operation of the discrete components of the SCC. Results of the model which depict the effect of material and geometric changes on various parameters are presented. The computer-calculated operation is compared with experimentally observed hydrobromic acid electrolysis rates.

  8. Performance comparison between Java and JNI for optimal implementation of computational micro-kernels

    OpenAIRE

    Halli , Nassim; Charles , Henri-Pierre; Méhaut , Jean-François

    2015-01-01

    International audience; General purpose CPUs used in high performance computing (HPC) support a vector instruction set and an out-of-order engine dedicated to increase the instruction level parallelism. Hence, related optimizations are currently critical to improve the performance of applications requiring numerical computation. Moreover, the use of a Java run-time environment such as the HotSpot Java Virtual Machine (JVM) in high performance computing is a promising alternative. It benefits ...

  9. Performance of Cloud Computing Centers with Multiple Priority Classes

    NARCIS (Netherlands)

    Ellens, W.; Zivkovic, Miroslav; Akkerboom, J.; Litjens, R.; van den Berg, Hans Leo

    In this paper we consider the general problem of resource provisioning within cloud computing. We analyze the problem of how to allocate resources to different clients such that the service level agreements (SLAs) for all of these clients are met. A model with multiple service request classes

  10. Running Batch Jobs on Peregrine | High-Performance Computing | NREL

    Science.gov (United States)

    and run your application. Users typically create or edit job scripts using a text editor such as vi. [Page section: Using Resource Feature to Request Different Node Types] Peregrine has several types of compute nodes, which differ in the amount of memory and number of processor cores. The majority of the nodes have 24

  11. Understanding and Improving the Performance Consistency of Distributed Computing Systems

    NARCIS (Netherlands)

    Yigitbasi, M.N.

    2012-01-01

    With the increasing adoption of distributed systems in both academia and industry, and with the increasing computational and storage requirements of distributed applications, users inevitably demand more from these systems. Moreover, users also depend on these systems for latency and throughput

  12. Simulating elastic light scattering using high performance computing methods

    NARCIS (Netherlands)

    Hoekstra, A.G.; Sloot, P.M.A.; Verbraeck, A.; Kerckhoffs, E.J.H.

    1993-01-01

    The Coupled Dipole method, as originally formulated by Purcell and Pennypacker, is a very powerful method to simulate the Elastic Light Scattering from arbitrary particles. This method, which is a particle simulation model for Computational Electromagnetics, has one major drawback: if the size of the

  13. Computer-Supported Instruction in Enhancing the Performance of Dyscalculics

    Science.gov (United States)

    Kumar, S. Praveen; Raja, B. William Dharma

    2010-01-01

    The use of instructional media is an essential component of teaching-learning process which contributes to the efficiency as well as effectiveness of the teaching-learning process. Computer-supported instruction has a very important role to play as an advanced technological instruction as it employs different instructional techniques like…

  14. Investigation of the applicability of a functional programming model to fault-tolerant parallel processing for knowledge-based systems

    Science.gov (United States)

    Harper, Richard

    1989-01-01

    In a fault-tolerant parallel computer, a functional programming model can facilitate distributed checkpointing, error recovery, load balancing, and graceful degradation. Such a model has been implemented on the Draper Fault-Tolerant Parallel Processor (FTPP). When used in conjunction with the FTPP's fault detection and masking capabilities, this implementation results in a graceful degradation of system performance after faults. Three graceful degradation algorithms have been implemented and are presented. A user interface has been implemented which requires minimal cognitive overhead by the application programmer, masking such complexities as the system's redundancy, distributed nature, variable complement of processing resources, load balancing, fault occurrence and recovery. This user interface is described and its use demonstrated. The applicability of the functional programming style to the Activation Framework, a paradigm for intelligent systems, is then briefly described.

  15. 'Research and development of research information infrastructure'. Achievement report on development of parallel processing software technology for discrete value solving methods; Kenkyu joho kiban kenkyu kaihatsu seika hokokusho. Risanka suchi kaiho no tame no heiretsu shori software gijutsu kaihatsu

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2000-09-01

    Research and development has been performed on a general-purpose parallel processing software platform that can be used with discrete value solving methods such as the finite element, finite volume, and finite difference methods. The achievements may be summarized as follows: the platform is parallelized by domain decomposition of the elements (calculation cells) and is applicable to any of the finite element, finite volume, and finite difference methods; a researcher who has developed a program can carry out the parallelization work easily and still obtain good parallel performance; the platform can be used at the several levels of parallelism required by the user; for large problems, parallel efficiencies above 70% were achieved for the solver parts using 32 processors of the SR8000 at the computation center of the Agency of Industrial Science and Technology; the stiffness-matrix assembly part shows an efficiency close to 100%; and the platform is under continued evaluation at the Machine Technology Research Institute and the Material Engineering Research Institute. (NEDO)
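
    The platform described above parallelizes finite element/volume/difference solvers by decomposing the elements (calculation cells) across processes, which in practice means each process owns a block of cells and exchanges a halo of boundary values with its neighbours every iteration. The MPI sketch below (plain C) shows that exchange for a 1-D decomposition; it is a generic illustration of the approach, not code from the platform.

      #include <mpi.h>
      #include <stdio.h>

      /* 1-D domain decomposition with halo (ghost-cell) exchange, the basic
       * communication pattern behind a cell/element decomposition. Each rank
       * owns NLOCAL cells plus one ghost cell at each end; values are dummies. */
      #define NLOCAL 8

      int main(int argc, char **argv)
      {
          MPI_Init(&argc, &argv);

          int rank, size;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
          int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

          double u[NLOCAL + 2];                 /* [0] and [NLOCAL+1] are ghosts */
          u[0] = u[NLOCAL + 1] = 0.0;
          for (int i = 1; i <= NLOCAL; ++i)
              u[i] = rank * NLOCAL + i;         /* dummy cell values */

          /* send my boundary cells, receive neighbours' into my ghost cells */
          MPI_Sendrecv(&u[NLOCAL], 1, MPI_DOUBLE, right, 0,
                       &u[0],      1, MPI_DOUBLE, left,  0,
                       MPI_COMM_WORLD, MPI_STATUS_IGNORE);
          MPI_Sendrecv(&u[1],          1, MPI_DOUBLE, left,  1,
                       &u[NLOCAL + 1], 1, MPI_DOUBLE, right, 1,
                       MPI_COMM_WORLD, MPI_STATUS_IGNORE);

          printf("rank %d ghosts: left=%g right=%g\n", rank, u[0], u[NLOCAL + 1]);

          MPI_Finalize();
          return 0;
      }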

  16. Co-development of Problem Gambling and Depression Symptoms in Emerging Adults: A Parallel-Process Latent Class Growth Model.

    Science.gov (United States)

    Edgerton, Jason D; Keough, Matthew T; Roberts, Lance W

    2018-02-21

    This study examines whether there are multiple joint trajectories of depression and problem gambling co-development in a sample of emerging adults. Data were from the Manitoba Longitudinal Study of Young Adults (n = 679), which was collected in 4 waves across 5 years (age 18-20 at baseline). Parallel process latent class growth modeling was used to identify 5 joint trajectory classes: low decreasing gambling, low increasing depression (81%); low stable gambling, moderate decreasing depression (9%); low stable gambling, high decreasing depression (5%); low stable gambling, moderate stable depression (3%); moderate stable problem gambling, no depression (2%). There was no evidence of reciprocal growth in problem gambling and depression in any of the joint classes. Multinomial logistic regression analyses of baseline risk and protective factors found that only neuroticism, escape-avoidance coping, and perceived level of family social support were significant predictors of joint trajectory class membership. Consistent with the pathways model framework, we observed that individuals in the problem gambling only class were more likely to be using gambling as a stable way to cope with negative emotions. Similarly, high levels of neuroticism and low levels of family support were associated with increased odds of being in a class with moderate to high levels of depressive symptoms (but low gambling problems). The results suggest that interventions for problem gambling and/or depression need to focus on promoting more adaptive coping skills among more "at-risk" young adults, and such interventions should be tailored in relation to specific subtypes of comorbid mental illness.

  17. Does the extended parallel process model fear appeal theory explain fears and barriers to prenatal physical activity?

    Science.gov (United States)

    Redmond, Michelle L; Dong, Fanglong; Frazier, Linda M

    2015-01-01

    Few studies have looked at the impact of fear on exercise behavior during pregnancy using a fear appeal theory. It is beneficial to understand how women receive the message of safe exercise during pregnancy and whether established guidelines have any influence on their decision to exercise. Using the extended parallel process model (EPPM), we explored women's fears about prenatal physical activity. We conducted a prospective, cross-sectional study on the fears and barriers to prenatal exercise among a racially/ethnically diverse population of pregnant women. Participants were recruited from local prenatal clinics. Ninety females with a singleton pregnancy between 16 and 30 weeks gestation were enrolled in the study. The primary outcome measure was classification of risk behavior based on the EPPM theory. Women who scored high on self-efficacy for exercising safely were more likely to exercise during pregnancy for at least 90 minutes per week (adjusted odds ratio, 5.95; 95% CI, 1.39-25.39; P = .016). Participants who exercised at least 90 minutes per week during pregnancy scored higher on perceived ability to control danger to the baby, and lower on perceived susceptibility to harm and threat to the baby from moderate prenatal exercise. More education and counseling on specific guidelines for safely exercising during pregnancy are needed. The EPPM framework has the potential to help improve health communications about exercise safety and guidelines between patients and health care professionals during pregnancy. Copyright © 2015 Jacobs Institute of Women's Health. Published by Elsevier Inc. All rights reserved.

  18. Soft Computing Techniques for the Protein Folding Problem on High Performance Computing Architectures.

    Science.gov (United States)

    Llanes, Antonio; Muñoz, Andrés; Bueno-Crespo, Andrés; García-Valverde, Teresa; Sánchez, Antonia; Arcas-Túnez, Francisco; Pérez-Sánchez, Horacio; Cecilia, José M

    2016-01-01

    The protein-folding problem has been extensively studied during the last fifty years. Understanding the dynamics of a protein's global shape and its influence on biological function can help us discover new and more effective drugs for diseases of pharmacological relevance. Different computational approaches have been developed to predict the three-dimensional arrangement of a protein's atoms from its sequence. However, the computational complexity of the problem makes it necessary to search for new models, novel algorithmic strategies, and hardware platforms that provide solutions in a reasonable time frame. In this review we present past and current trends in protein folding simulation from both perspectives: hardware and software. Of particular interest are both the use of inexact solutions to this computationally hard problem and the hardware platforms that have been used to run such Soft Computing techniques.

  19. COMPUTING

    CERN Multimedia

    M. Kasemann

    Overview In autumn the main focus was to process and handle CRAFT data and to perform the Summer08 MC production. The operational aspects were well covered by regular Computing Shifts, experts on duty and Computing Run Coordination. At the Computing Resource Board (CRB) in October a model to account for service work at Tier 2s was approved. The computing resources for 2009 were reviewed for presentation at the C-RRB. The quarterly resource monitoring is continuing. Facilities/Infrastructure operations Operations during CRAFT data taking ran fine. This proved to be a very valuable experience for T0 workflows and operations. The transfers of custodial data to most T1s went smoothly. A first round of reprocessing started at the Tier-1 centers end of November; it will take about two weeks. The Computing Shifts procedure was tested full scale during this period and proved to be very efficient: 30 Computing Shifts Persons (CSP) and 10 Computing Resources Coordinators (CRC). The shift program for the shut down w...

  20. Performing a local reduction operation on a parallel computer

    Science.gov (United States)

    Blocksome, Michael A.; Faraj, Daniel A.

    2012-12-11

    A parallel computer including compute nodes, each including two reduction processing cores, a network write processing core, and a network read processing core, each processing core assigned an input buffer. Copying, in interleaved chunks by the reduction processing cores, contents of the reduction processing cores' input buffers to an interleaved buffer in shared memory; copying, by one of the reduction processing cores, contents of the network write processing core's input buffer to shared memory; copying, by another of the reduction processing cores, contents of the network read processing core's input buffer to shared memory; and locally reducing in parallel by the reduction processing cores: the contents of the reduction processing core's input buffer; every other interleaved chunk of the interleaved buffer; the copied contents of the network write processing core's input buffer; and the copied contents of the network read processing core's input buffer.
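
    The record above describes a data-movement pattern rather than code, so a small illustration may help. The following NumPy sketch mimics one compute node: two reduction cores interleave their input buffers into a shared buffer, the network cores' buffers are copied alongside, and an element-wise sum is taken over all four. The four-buffer layout, chunk size, and the use of a plain sum are assumptions for illustration only, not the patented implementation.

      import numpy as np

      def local_reduce(red0, red1, net_write, net_read, chunk=64):
          """Toy element-wise local reduction over four per-core input buffers.

          Assumes all buffers have the same length, divisible by `chunk`.
          """
          n = len(red0)

          # The two reduction cores copy their input buffers, in alternating
          # chunks, into one interleaved buffer standing in for shared memory.
          interleaved = np.empty(2 * n, dtype=red0.dtype)
          view = interleaved.reshape(-1, chunk)      # rows are chunks
          view[0::2] = red0.reshape(-1, chunk)       # even chunks from core 0
          view[1::2] = red1.reshape(-1, chunk)       # odd chunks from core 1

          # The network write/read cores' buffers are copied to shared memory
          # (done by the two reduction cores in the patented scheme).
          shared_w, shared_r = net_write.copy(), net_read.copy()

          # A reduction core sums its own buffer, every other interleaved chunk
          # (the other core's data), and the two copied network buffers.
          other = view[1::2].reshape(-1)             # core 1's chunks, as seen by core 0
          return red0 + other + shared_w + shared_r

      # Example: four random buffers of 256 doubles
      bufs = [np.random.rand(256) for _ in range(4)]
      assert np.allclose(local_reduce(*bufs), sum(bufs))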

  1. The European computer model for optronic system performance prediction (ECOMOS)

    Science.gov (United States)

    Keßler, Stefan; Bijl, Piet; Labarre, Luc; Repasi, Endre; Wittenstein, Wolfgang; Bürsing, Helge

    2017-10-01

    ECOMOS is a multinational effort within the framework of an EDA Project Arrangement. Its aim is to provide a generally accepted and harmonized European computer model for computing nominal Target Acquisition (TA) ranges of optronic imagers operating in the Visible or thermal Infrared (IR). The project involves close co-operation of defence and security industry and public research institutes from France, Germany, Italy, The Netherlands and Sweden. ECOMOS uses and combines well-accepted existing European tools to build up a strong competitive position. This includes two TA models: the analytical TRM4 model and the image-based TOD model. In addition, it uses the atmosphere model MATISSE. In this paper, the central idea of ECOMOS is presented. The overall software structure and the underlying models are shown and elucidated. The status of the project development is given as well as a short discussion of validation tests and an outlook on the future potential of simulation for sensor assessment.

  2. A high level language for a high performance computer

    Science.gov (United States)

    Perrott, R. H.

    1978-01-01

    The proposed computational aerodynamic facility will join the ranks of the supercomputers due to its architecture and increased execution speed. At present, the languages used to program these supercomputers have been modifications of programming languages which were designed many years ago for sequential machines. A new programming language should be developed based on the techniques which have proved valuable for sequential programming languages and incorporating the algorithmic techniques required for these supercomputers. The design objectives for such a language are outlined.

  3. WinHPC System Configuration | High-Performance Computing | NREL

    Science.gov (United States)

    The WinHPC system comprises a head node, a login node (WinHPC02), and worker/compute nodes. The head node acts as the file, DNS, and license server; the login node, WinHPC02, is where users connect to access the cluster. Node 03 has dual Intel Xeon E5530 processors, and the cluster runs Windows Server 2008 R2 HPC Edition.

  4. Architecture and Programming Models for High Performance Intensive Computation

    Science.gov (United States)

    2016-06-29

    ...commands from the data processing center to the sensors is needed. It has been noted that the ubiquity of mobile communication devices offers the ... commands from a Processing Facility by way of mobile Relay Stations. The activity of each component of this model other than the Merge module can be ... evaluation of the initial system implementation. Gao also was in charge of the development of the Fresh Breeze architecture backend on new many-core computers.

  5. Effects of Task Performance and Task Complexity on the Validity of Computational Models of Attention

    NARCIS (Netherlands)

    Koning, L. de; Maanen, P.P. van; Dongen, K. van

    2008-01-01

    Computational models of attention can be used as a component of decision support systems. For accurate support, a computational model of attention has to be valid and robust. The effects of task performance and task complexity on the validity of three different computational models of attention were

  6. Gender Differences in Attitudes toward Computers and Performance in the Accounting Information Systems Class

    Science.gov (United States)

    Lenard, Mary Jane; Wessels, Susan; Khanlarian, Cindi

    2010-01-01

    Using a model developed by Young (2000), this paper explores the relationship between performance in the Accounting Information Systems course, self-assessed computer skills, and attitudes toward computers. Results show that after taking the AIS course, students experience a change in perception about their use of computers. Females'…

  7. An urban energy performance evaluation system and its computer implementation.

    Science.gov (United States)

    Wang, Lei; Yuan, Guan; Long, Ruyin; Chen, Hong

    2017-12-15

    To improve the urban environment and effectively reflect and promote urban energy performance, an urban energy performance evaluation system was constructed, thereby strengthening urban environmental management capabilities. From the perspectives of internalization and externalization, a framework of evaluation indicators and key factors that determine urban energy performance was proposed, according to established theory and previous studies, to explore the reasons for differences in performance. Using the improved stochastic frontier analysis method, an urban energy performance evaluation and factor analysis model was built that brings performance evaluation and factor analysis into the same stage for study. According to data obtained for the Chinese provincial capitals from 2004 to 2013, the coefficients of the evaluation indicators and key factors were calculated by the urban energy performance evaluation and factor analysis model. These coefficients were then used to compile the program file. The urban energy performance evaluation system developed in this study was designed in three parts: a database, a distributed component server, and a human-machine interface. Its functions were designed as login, addition, edit, input, calculation, analysis, comparison, inquiry, and export. On the basis of these contents, an urban energy performance evaluation system was developed using Microsoft Visual Studio .NET 2015. The system can effectively reflect the status of and any changes in urban energy performance. Beijing was considered as an example to conduct an empirical study, which further verified the applicability and convenience of this evaluation system. Copyright © 2017 Elsevier Ltd. All rights reserved.

  8. Performance Analysis of Ivshmem for High-Performance Computing in Virtual Machines

    Science.gov (United States)

    Ivanovic, Pavle; Richter, Harald

    2018-01-01

    High-performance computing (HPC) is rarely accomplished via virtual machines (VMs). In this paper, we present a remake of ivshmem which can change this. Ivshmem was a shared memory (SHM) between virtual machines on the same server, with SHM-access synchronization included, until about 5 years ago when newer versions of Linux and its virtualization library libvirt evolved. We restored that SHM-access synchronization feature because it is indispensable for HPC and made ivshmem runnable with contemporary versions of Linux, libvirt, KVM, QEMU and especially MPICH, which is an implementation of MPI - the standard HPC communication library. Additionally, MPICH was transparently modified by us to get ivshmem included, resulting in a three to ten times performance improvement compared to TCP/IP. Furthermore, we have transparently replaced MPI_PUT, a single-sided MPICH communication mechanism, with our own MPI_PUT wrapper. As a result, our ivshmem even surpasses non-virtualized SHM data transfers for block lengths greater than 512 KBytes, showing the benefits of virtualization. All improvements were possible without using SR-IOV.
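
    The abstract mentions wrapping MPI_PUT, MPI's one-sided put operation. The mpi4py sketch below shows generic MPI one-sided communication through a window and fences; it is an illustration of the standard MPI interface, not the authors' ivshmem-backed wrapper, and the buffer size is arbitrary.

      # Run with at least two ranks, e.g.: mpiexec -n 2 python put_demo.py
      from mpi4py import MPI
      import numpy as np

      comm = MPI.COMM_WORLD
      rank = comm.Get_rank()

      n = 1 << 16                                    # 64 Ki doubles per rank
      local = np.zeros(n, dtype='d')                 # memory exposed to other ranks
      win = MPI.Win.Create(local, comm=comm)         # RMA window over `local`

      src = np.full(n, float(rank), dtype='d')       # data this rank will push

      win.Fence()                                    # open an RMA access epoch
      if rank == 0:
          win.Put([src, MPI.DOUBLE], target_rank=1)  # one-sided write into rank 1
      win.Fence()                                    # close the epoch

      if rank == 1:
          print("rank 1 now holds", local[:4], "...")
      win.Free()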

  9. Parallel processing method for high-speed real time digital pulse processing for gamma-ray spectroscopy

    International Nuclear Information System (INIS)

    Fernandes, A.M.; Pereira, R.C.; Sousa, J.; Neto, A.; Carvalho, P.; Batista, A.J.N.; Carvalho, B.B.; Varandas, C.A.F.; Tardocchi, M.; Gorini, G.

    2010-01-01

    A new data acquisition (DAQ) system was developed to fulfil the requirements of the gamma-ray spectrometer (GRS) JET-EP2 (Joint European Torus enhancement project 2), providing high-resolution spectroscopy at very high count rates (up to a few MHz). The system is based on the Advanced Telecommunications Computing Architecture (ATCA) and includes a transient record (TR) module with 8 channels of 14-bit resolution at 400 MSamples/s (MSPS) sampling rate, 4 GB of local memory, and 2 field programmable gate arrays (FPGAs) able to perform real-time algorithms for data reduction and digital pulse processing. Although at 400 MSPS only fast programmable devices such as FPGAs can be used for data processing and data transfer, FPGA resources also present speed limitations for some specific tasks, leading to unavoidable data loss when demanding algorithms are applied. To overcome this problem, and foreseeing an increase of algorithm complexity, a new digital parallel filter was developed, aiming to perform real-time pulse processing in the FPGAs of the TR module at the presented sampling rate. The filter is based on the conventional digital time-invariant trapezoidal shaper operating with parallelized data while performing pulse height analysis (PHA) and pile-up rejection (PUR). The incoming sampled data is successively parallelized and fed into the processing algorithm block at one fourth of the sampling rate. The following data processing and data transfer is also performed at one fourth of the sampling rate. The algorithm based on the data parallelization technique was implemented and tested at JET facilities, where a spectrum was obtained. Based on the observed results, the PHA algorithm will be improved by implementing pulse pile-up discrimination.
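
    As a point of reference for the filter named in the abstract, the following NumPy sketch implements a plain (serial) digital trapezoidal shaper with rise/fall length k and flat-top length m; the parallelized FPGA version and the pole-zero correction needed for exponentially decaying detector pulses are not reproduced, and the parameter values are assumptions.

      import numpy as np

      def trapezoidal_shaper(v, k=32, m=16):
          """FIR trapezoidal shaper: rise/fall length k, flat-top length m.

          For a step-like input the output ramps up over k samples, stays flat
          (at k times the step height) for m samples, then ramps back down.
          """
          v = np.asarray(v, dtype=float)
          d = v.copy()
          d[k:]         -= v[:-k]             # - v[n-k]
          d[k + m:]     -= v[:-(k + m)]       # - v[n-k-m]
          d[2*k + m:]   += v[:-(2*k + m)]     # + v[n-2k-m]
          return np.cumsum(d)

      # Example: a noiseless step of height 1 at sample 100
      pulse = np.zeros(512)
      pulse[100:] = 1.0
      shaped = trapezoidal_shaper(pulse)
      print(shaped.max() / 32)                # pulse-height estimate, approximately 1.0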

  10. High Performance Computing (HPC) Challenge (HPCC) Benchmark Suite Development

    National Research Council Canada - National Science Library

    Dongarra, J. J

    2005-01-01

    .... The applications of performance modeling are numerous, including evaluation of algorithms, optimization of code implementation, parallel library development, and comparison of system architectures...

  11. COMPUTER-IMPLEMENTED METHOD OF PERFORMING A SEARCH USING SIGNATURES

    DEFF Research Database (Denmark)

    2017-01-01

    A computer-implemented method of processing a query vector and a data vector, comprising: generating a set of masks and a first set of multiple signatures and a second set of multiple signatures by applying the set of masks to the query vector and the data vector, respectively, and generating ... candidate pairs, of a first signature and a second signature, by identifying matches of a first signature and a second signature. The set of masks comprises a configuration of the elements that is a Hadamard code; a permutation of a Hadamard code; or a code that deviates from a Hadamard code...
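
    The claim language above leaves the details open, so the following sketch is only one plausible reading: rows of a Hadamard matrix act as masks, each mask contributes one sign bit to a signature, and a query/data pair becomes a candidate when enough signature bits agree. The vector length, the sign-based signature, and the match threshold are all assumptions for illustration.

      import numpy as np
      from scipy.linalg import hadamard

      def signatures(vec, masks):
          """One bit per mask: the sign of the masked projection of `vec`."""
          return (masks @ vec > 0).astype(np.uint8)

      def candidate_pair(q, d, masks, min_matches):
          """Flag (query, data) as a candidate pair if enough signature bits match."""
          return np.sum(signatures(q, masks) == signatures(d, masks)) >= min_matches

      n = 16
      masks = hadamard(n).astype(float)        # rows of a Hadamard matrix as masks
      rng = np.random.default_rng(0)
      q = rng.standard_normal(n)
      d = q + 0.1 * rng.standard_normal(n)     # a data vector close to the query
      far = rng.standard_normal(n)

      print(candidate_pair(q, d, masks, min_matches=14))    # likely True
      print(candidate_pair(q, far, masks, min_matches=14))  # likely False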

  12. Computational performance of a projection and rescaling algorithm

    OpenAIRE

    Pena, Javier; Soheili, Negar

    2018-01-01

    This paper documents a computational implementation of a {\em projection and rescaling algorithm} for finding most interior solutions to the pair of feasibility problems \[ \text{find } x \in L \cap \mathbb{R}^n_{+} \quad\text{and}\quad \text{find } \hat x \in L^\perp \cap \mathbb{R}^n_{+}, \] where $L$ denotes a linear subspace in $\mathbb{R}^n$ and $L^\perp$ denotes its orthogonal complement. The projection and rescaling algorithm is a recently developed method that combines a ...

  13. Using High Performance Computing to Support Water Resource Planning

    Energy Technology Data Exchange (ETDEWEB)

    Groves, David G. [RAND Corporation, Santa Monica, CA (United States); Lembert, Robert J. [RAND Corporation, Santa Monica, CA (United States); May, Deborah W. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Leek, James R. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Syme, James [RAND Corporation, Santa Monica, CA (United States)

    2015-10-22

    In recent years, decision support modeling has embraced deliberation-with-analysis, an iterative process in which decisionmakers come together with experts to evaluate a complex problem and alternative solutions in a scientifically rigorous and transparent manner. Simulation modeling supports decisionmaking throughout this process; visualizations enable decisionmakers to assess how proposed strategies stand up over time in uncertain conditions. But running these simulation models over standard computers can be slow. This, in turn, can slow the entire decisionmaking process, interrupting valuable interaction between decisionmakers and analytics.

  14. High performance computing system in the framework of the Higgs boson studies

    CERN Document Server

    Belyaev, Nikita; The ATLAS collaboration

    2017-01-01

    The Higgs boson physics is one of the most important and promising fields of study in modern High Energy Physics. To perform precision measurements of the Higgs boson properties, the use of fast and efficient instruments of Monte Carlo event simulation is required. Due to the increasing amount of data and to the growing complexity of the simulation software tools, the computing resources currently available for Monte Carlo simulation on the LHC GRID are not sufficient. One of the possibilities to address this shortfall of computing resources is the usage of institutes' computer clusters, commercial computing resources and supercomputers. In this paper, a brief description of the Higgs boson physics, the Monte-Carlo generation and event simulation techniques is presented. A description of modern high performance computing systems and tests of their performance are also discussed. These studies have been performed on the Worldwide LHC Computing Grid and Kurchatov Institute Data Processing Center, including Tier...

  15. Failure detection in high-performance clusters and computers using chaotic map computations

    Science.gov (United States)

    Rao, Nageswara S.

    2015-09-01

    A programmable media includes a processing unit capable of independent operation in a machine that is capable of executing 10^18 floating point operations per second. The processing unit is in communication with a memory element and an interconnect that couples computing nodes. The programmable media includes a logical unit configured to execute arithmetic functions, comparative functions, and/or logical functions. The processing unit is configured to detect computing component failures, memory element failures and/or interconnect failures by executing programming threads that generate one or more chaotic map trajectories. The central processing unit or graphical processing unit is configured to detect a computing component failure, memory element failure and/or an interconnect failure through an automated comparison of signal trajectories generated by the chaotic maps.
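
    A toy version of the detection idea, in which two redundant executions of the same chaotic (logistic) map diverge rapidly after any tiny arithmetic fault, is sketched below; the map, the injected perturbation, and the comparison threshold are illustrative assumptions, not the patented detection logic.

      import numpy as np

      def logistic_trajectory(x0, steps, r=3.99, fault_at=None):
          """Iterate the logistic map; optionally inject a tiny fault at one step."""
          x, traj = x0, []
          for i in range(steps):
              x = r * x * (1.0 - x)
              if i == fault_at:
                  x += 1e-12                  # simulated bit-flip / arithmetic fault
              traj.append(x)
          return np.array(traj)

      # Two "compute units" run the same trajectory from the same seed; any fault
      # is amplified exponentially by the chaotic dynamics and is easy to detect.
      healthy = logistic_trajectory(0.123456, 200)
      faulty  = logistic_trajectory(0.123456, 200, fault_at=50)
      mismatch = np.argmax(np.abs(healthy - faulty) > 1e-6)
      print("fault detected near step", mismatch)   # a few dozen steps after step 50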

  16. Cooperative storage of shared files in a parallel computing system with dynamic block size

    Science.gov (United States)

    Bent, John M.; Faibish, Sorin; Grider, Gary

    2015-11-10

    Improved techniques are provided for parallel writing of data to a shared object in a parallel computing system. A method is provided for storing data generated by a plurality of parallel processes to a shared object in a parallel computing system. The method is performed by at least one of the processes and comprises: dynamically determining a block size for storing the data; exchanging a determined amount of the data with at least one additional process to achieve a block of the data having the dynamically determined block size; and writing the block of the data having the dynamically determined block size to a file system. The determined block size comprises, e.g., a total amount of the data to be stored divided by the number of parallel processes. The file system comprises, for example, a log structured virtual parallel file system, such as a Parallel Log-Structured File System (PLFS).
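
    The block-size rule quoted above (total data divided by the number of processes) is easy to state in code; the helper below computes such a layout, with the last process absorbing any remainder. It is a generic sketch, not the PLFS implementation, and omits the data exchange and the actual write calls.

      def block_layout(total_bytes, nprocs):
          """Dynamic block size: total data divided evenly among the processes.

          Returns (block_size, [(offset, length) per process]).
          """
          block = total_bytes // nprocs
          layout = []
          for rank in range(nprocs):
              offset = rank * block
              length = block if rank < nprocs - 1 else total_bytes - offset
              layout.append((offset, length))
          return block, layout

      # Example: 10 MiB written cooperatively by 8 processes; each process would
      # exchange data with its neighbours until it holds exactly its own block,
      # then issue one large aligned write.
      block, layout = block_layout(10 * 1024 * 1024, 8)
      print(block, layout[0], layout[-1])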

  17. Computational Modeling of Human Multiple-Task Performance

    National Research Council Canada - National Science Library

    Kieras, David E; Meyer, David

    2005-01-01

    This is the final report for a project that was a continuation of an earlier, long-term project on the development and validation of the EPIC cognitive architecture for modeling human cognition and performance...

  18. Performance Evaluation of the Myrias SPS-2 Computer

    National Research Council Canada - National Science Library

    McBryan, Oliver A; Pozo, Roldan

    1990-01-01

    .... The highlight of the software environment is the virtual shared memory environment. We have analyzed the performance of the shared memory environment by studying system efficiency as a function of both number of processors in use and of paging activity...

  19. Computer analysis of sodium cold trap design and performance

    International Nuclear Information System (INIS)

    McPheeters, C.C.; Raue, D.J.

    1983-11-01

    Normal steam-side corrosion of steam-generator tubes in Liquid Metal Fast Breeder Reactors (LMFBRs) results in liberation of hydrogen, and most of this hydrogen diffuses through the tubes into the heat-transfer sodium and must be removed by the purification system. Cold traps are normally used to purify sodium, and they operate by cooling the sodium to temperatures near the melting point, where soluble impurities including hydrogen and oxygen precipitate as NaH and Na2O, respectively. A computer model was developed to simulate the processes that occur in sodium cold traps. The Model for Analyzing Sodium Cold Traps (MASCOT) simulates any desired configuration of mesh arrangements and dimensions and calculates pressure drops and flow distributions, temperature profiles, impurity concentration profiles, and impurity mass distributions

  20. Computational study of performance characteristics for truncated conical aerospike nozzles

    Science.gov (United States)

    Nair, Prasanth P.; Suryan, Abhilash; Kim, Heuy Dong

    2017-12-01

    Aerospike nozzles are advanced rocket nozzles that can maintain its aerodynamic efficiency over a wide range of altitudes. It belongs to class of altitude compensating nozzles. A vehicle with an aerospike nozzle uses less fuel at low altitudes due to its altitude adaptability, where most missions have the greatest need for thrust. Aerospike nozzles are better suited to Single Stage to Orbit (SSTO) missions compared to conventional nozzles. In the current study, the flow through 20% and 40% aerospike nozzle is analyzed in detail using computational fluid dynamics technique. Steady state analysis with implicit formulation is carried out. Reynolds averaged Navier-Stokes equations are solved with the Spalart-Allmaras turbulence model. The results are compared with experimental results from previous work. The transition from open wake to closed wake happens in lower Nozzle Pressure Ratio for 20% as compared to 40% aerospike nozzle.

  1. Electromagnetic Modeling of Human Body Using High Performance Computing

    Science.gov (United States)

    Ng, Cho-Kuen; Beall, Mark; Ge, Lixin; Kim, Sanghoek; Klaas, Ottmar; Poon, Ada

    Realistic simulation of electromagnetic wave propagation in the actual human body can expedite the investigation of wirelessly powering implanted devices via coupling from external sources. The parallel electromagnetics code suite ACE3P developed at SLAC National Accelerator Laboratory is based on the finite element method for high fidelity accelerator simulation, which can be enhanced to model electromagnetic wave propagation in the human body. Starting with a CAD model of a human phantom that is characterized by a number of tissues, a finite element mesh representing the complex geometries of the individual tissues is built for simulation. Employing an optimal power source with a specific pattern of field distribution, the propagation and focusing of electromagnetic waves in the phantom have been demonstrated. Substantial speedup of the simulation is achieved by using multiple compute cores on supercomputers.

  2. Causal Analysis for Performance Modeling of Computer Programs

    Directory of Open Access Journals (Sweden)

    Jan Lemeire

    2007-01-01

    Causal modeling and the accompanying learning algorithms provide useful extensions for in-depth statistical investigation and automation of performance modeling. We enlarged the scope of existing causal structure learning algorithms by using the form-free information-theoretic concept of mutual information and by introducing the complexity criterion for selecting direct relations among equivalent relations. The underlying probability distribution of experimental data is estimated by kernel density estimation. We then reported on the benefits of a dependency analysis and the decompositional capacities of causal models. Useful qualitative models, providing insight into the role of every performance factor, were inferred from experimental data. This paper reports on the results for an LU decomposition algorithm and on the study of the parameter sensitivity of the Kakadu implementation of the JPEG-2000 standard. Next, the analysis was used to search for generic performance characteristics of the applications.
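
    The dependency analysis described above rests on estimating mutual information from data. The sketch below uses a simple histogram estimate instead of the kernel density estimation used in the paper, purely to illustrate the quantity being computed; the bin count and the synthetic performance data are assumptions.

      import numpy as np

      def mutual_information(x, y, bins=32):
          """Histogram-based estimate of I(X;Y) in nats (KDE would be smoother)."""
          pxy, _, _ = np.histogram2d(x, y, bins=bins)
          pxy = pxy / pxy.sum()
          px = pxy.sum(axis=1, keepdims=True)
          py = pxy.sum(axis=0, keepdims=True)
          nz = pxy > 0
          return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

      # Example: run time strongly dependent on problem size, independent of noise
      rng = np.random.default_rng(1)
      size = rng.uniform(1, 100, 5000)
      runtime = 0.01 * size**2 + rng.normal(0, 1, 5000)
      noise = rng.normal(0, 1, 5000)
      print(mutual_information(size, runtime))   # clearly positive
      print(mutual_information(size, noise))     # close to zero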

  3. Proceedings of ISCIS III, the third international symposium on computer and information sciences

    International Nuclear Information System (INIS)

    Gelenbe, E.; Orhun, E.; Basar, E.

    1988-01-01

    The symposium presented in this book covered the following topics in computer and information sciences: computer networks, computers in education, software engineering, modelling and simulation, concurrency, artificial intelligence, image and signal processing, data base systems, operating systems, parallel processing, robotics, reliability, computer architecture, CAD/CAM, and social and technical applications. Many of the papers presented are studies on computer networks, computers in education, artificial intelligence, software engineering, concurrency, data base systems, image processing, and parallel processing

  4. XOP, a fast versatile processor, as a building block for parallel processing in high energy physics experiments

    International Nuclear Information System (INIS)

    Baehler, P.; Bosco, N.; Lingjaerde, T.; Ljuslin, C.; Van Praag, A.; Werner, P.

    1986-01-01

    The XOP processor has been designed for trigger calculation and data compression in high energy physics experiments. Therefore, emphasis has been placed upon fast execution and high input/output rate. The fast execution is achieved by a wide instruction word holding operations which are executed concurrently. Thus, the arithmetic operations, data address calculations, data accessing, condition checking, loop count checking and next instruction evaluation all overlap in time. In conventional micro-processors these operations are performed sequentially. In addition, the instruction set comprises not only the classical computer instructions, but also specialized instructions suitable for trigger calculations, such as bit search, population count, loose compare and vector instructions. In order to achieve a high input/output rate, each XOP ECLine interface board is equipped with an input and an output port which fulfil the LeCroy ECLine specifications. The autonomous input port allows a data rate of 40 Mbytes/sec, while the program-controlled output port allows 20 Mbytes/sec. For Fastbus-based systems a dual Fastbus master interface is under design which allows a Fastbus multi-processor system to be built. This design is being done in collaboration with LAPP Annecy for the CERN LEP L3 experiment. Their scheme comprises 4-5 XOP processors, each of them with a master interface on a data input segment and a master interface on a data output segment. This paper describes the structure of the XOP processor, the interface capabilities and the software development and debugging tools. (Auth.)

  5. 8051 microcontroller to FPGA and ADC interface design for high speed parallel processing systems – Application in ultrasound scanners

    Directory of Open Access Journals (Sweden)

    J. Jean Rossario Raj

    2016-09-01

    Microcontrollers perform the hardware control in many instruments. Instruments requiring huge data throughput and parallel computing use FPGAs for data processing. The microcontroller in turn configures the application hardware devices such as FPGAs, ADCs, and Ethernet chips. The interfacing of these devices uses an address/data bus interface, a serial interface, or a serial peripheral interface. The choice of interface depends upon the input/output pins available on the different devices, programming ease, and proprietary interfaces supported by devices such as ADCs. The novelty of this paper is in describing the programming logic used for various types of interface scenarios from a microcontroller to different programmable devices. The study presented describes the methods and logic flowcharts for the different interfaces. The interface logic was implemented in prototype hardware for an ultrasound scanner. The internal devices were controlled from a graphical user interface on a laptop and scan results were acquired. It is seen that an optimal hardware design can be achieved by using a common serial interface to all the devices.
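
    The interface logic itself would live in 8051 C or assembly; the Python-style sketch below only illustrates the bit-level sequence of a generic SPI mode-0 transfer (clock idle low, data sampled on the rising edge). The set_pin/get_pin GPIO helpers and the register-write framing are assumed for illustration and are not taken from the paper.

      # Generic SPI mode-0 byte transfer; on a real 8051 this logic would be
      # written in C or assembly against the port SFR bits.
      MOSI, MISO, SCLK, CS = "MOSI", "MISO", "SCLK", "CS"

      def spi_transfer_byte(set_pin, get_pin, byte_out):
          byte_in = 0
          for bit in range(7, -1, -1):               # MSB first
              set_pin(MOSI, (byte_out >> bit) & 1)   # present data while SCLK is low
              set_pin(SCLK, 1)                       # rising edge: slave samples MOSI
              byte_in = (byte_in << 1) | get_pin(MISO)
              set_pin(SCLK, 0)                       # falling edge: slave shifts out
          return byte_in

      def configure_device(set_pin, get_pin, register, value):
          """Write one register of an SPI-configured device (e.g. an ADC)."""
          set_pin(CS, 0)                             # select the device
          spi_transfer_byte(set_pin, get_pin, register)
          spi_transfer_byte(set_pin, get_pin, value)
          set_pin(CS, 1)                             # deselect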

  6. Replica-Based High-Performance Tuple Space Computing

    DEFF Research Database (Denmark)

    Andric, Marina; De Nicola, Rocco; Lluch Lafuente, Alberto

    2015-01-01

    of concurrency and data access. We investigate issues related to replica consistency, provide an operational semantics that guides the implementation of the language, and discuss the main synchronization mechanisms of our prototypical run-time framework. Finally, we provide a performance analysis, which includes...

  7. PAPIRUS - a computer code for FBR fuel performance analysis

    International Nuclear Information System (INIS)

    Kobayashi, Y.; Tsuboi, Y.; Sogame, M.

    1991-01-01

    The FBR fuel performance analysis code PAPIRUS has been developed to design fuels for demonstration and future commercial reactors. A pellet structural model was developed to describe the generation, depletion and transport of vacancies and atomic elements in a unified fashion. PAPIRUS results, in comparison with the power-to-melt test data from HEDL, showed the validity of the code at initial reactor startup. (author)

  8. Advanced Modulation Techniques for High-Performance Computing Optical Interconnects

    DEFF Research Database (Denmark)

    Karinou, Fotini; Borkowski, Robert; Zibar, Darko

    2013-01-01

    We experimentally assess the performance of a 64 × 64 optical switch fabric used for ns-speed optical cell switching in supercomputer optical interconnects. More specifically, we study four alternative modulation formats and detection schemes, namely, 10-Gb/s nonreturn-to-zero differential phase-...

  9. HIGH-PERFORMANCE COMPUTING FOR THE STUDY OF EARTH AND ENVIRONMENTAL SCIENCE MATERIALS USING SYNCHROTRON X-RAY COMPUTED MICROTOMOGRAPHY

    International Nuclear Information System (INIS)

    FENG, H.; JONES, K.W.; MCGUIGAN, M.; SMITH, G.J.; SPILETIC, J.

    2001-01-01

    Synchrotron x-ray computed microtomography (CMT) is a non-destructive method for examination of rock, soil, and other types of samples studied in the earth and environmental sciences. The high x-ray intensities of the synchrotron source make possible the acquisition of tomographic volumes at a high rate that requires the application of high-performance computing techniques for data reconstruction to produce the three-dimensional volumes, for their visualization, and for data analysis. These problems are exacerbated by the need to share information between collaborators at widely separated locations over both local and wide-area networks. A summary of the CMT technique and examples of applications are given here together with a discussion of the applications of high-performance computing methods to improve the experimental techniques and analysis of the data

  10. HIGH-PERFORMANCE COMPUTING FOR THE STUDY OF EARTH AND ENVIRONMENTAL SCIENCE MATERIALS USING SYNCHROTRON X-RAY COMPUTED MICROTOMOGRAPHY.

    Energy Technology Data Exchange (ETDEWEB)

    FENG,H.; JONES,K.W.; MCGUIGAN,M.; SMITH,G.J.; SPILETIC,J.

    2001-10-12

    Synchrotron x-ray computed microtomography (CMT) is a non-destructive method for examination of rock, soil, and other types of samples studied in the earth and environmental sciences. The high x-ray intensities of the synchrotron source make possible the acquisition of tomographic volumes at a high rate that requires the application of high-performance computing techniques for data reconstruction to produce the three-dimensional volumes, for their visualization, and for data analysis. These problems are exacerbated by the need to share information between collaborators at widely separated locations over both local and wide-area networks. A summary of the CMT technique and examples of applications are given here together with a discussion of the applications of high-performance computing methods to improve the experimental techniques and analysis of the data.

  11. PERCON: A flexible computer code for detailed thermal performance studies

    International Nuclear Information System (INIS)

    Boardman, F.B.; Collier, W.D.

    1975-07-01

    PERCON is a computer code which evaluates temperatures in three dimensions for a block containing heat sources and having coolant flow in one dimension. The solution is obtained at successive planes perpendicular to the coolant flow, and the progression from one plane to the next occurs by the heat transferred to the coolant determining the convective boundary conditions at the next plane, after due allowance is made for any lateral mixing or mass transfer between coolants. It is also possible to calculate the diametral change along a radius as a function of irradiation shrinkage and thermal expansion. This is used in a 'through life' calculation which evaluates interaction pressure in tubular fuel elements. Physical property data used by the code may be specified as functions of temperature. The coolant flow may be specified, or alternatively derived by the program to satisfy either a specified overall pressure drop or mixed mean temperature rise. The pressure drop through each coolant is calculated and the flow modified, followed by a repeat of the temperature calculation, until the pressure imbalance between chosen flow channels at chosen axial positions is less than the specified convergence limit. A detailed description of the facilities in the code is given and some cases which have been studied are discussed. (U.K.)
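
    A toy version of the flow-redistribution loop described above is sketched below: parallel channels share a fixed total flow, and per-channel flows are adjusted until the pressure-drop imbalance falls below a convergence limit. The quadratic pressure-drop law and the update rule are assumptions for illustration, not PERCON's actual scheme.

      import numpy as np

      def balance_flows(K, total_flow, tol=1e-6, max_iter=100):
          """Split `total_flow` among parallel channels with loss coefficients K
          so that all channels see (nearly) the same pressure drop dp = K * m**2."""
          K = np.asarray(K, dtype=float)
          m = np.full(K.size, total_flow / K.size)       # equal split to start
          for _ in range(max_iter):
              dp = K * m**2                              # per-channel pressure drop
              target = dp.mean()
              if (dp.max() - dp.min()) / target < tol:   # convergence limit reached
                  break
              m *= np.sqrt(target / dp)                  # push flows toward equal dp
              m *= total_flow / m.sum()                  # keep total flow fixed
          return m, dp

      m, dp = balance_flows([1.0, 2.0, 4.0], total_flow=1.0)
      print(np.round(m, 4), np.round(dp, 4))   # more flow through low-resistance channels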

  12. Measuring Human Performance within Computer Security Incident Response Teams

    Energy Technology Data Exchange (ETDEWEB)

    McClain, Jonathan T. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Silva, Austin Ray [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Avina, Glory Emmanuel [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Forsythe, James C. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2015-09-01

    Human performance has become a pertinent issue within cyber security. However, this research has been stymied by the limited availability of expert cyber security professionals. This is partly attributable to the ongoing workload faced by cyber security professionals, which is compounded by the limited number of qualified personnel and turnover of personnel across organizations. Additionally, it is difficult to conduct research, and particularly, openly published research, due to the sensitivity inherent to cyber operations at most organizations. As an alternative, the current research has focused on data collection during cyber security training exercises. These events draw individuals with a range of knowledge and experience extending from seasoned professionals to recent college graduates to college students. The current paper describes research involving data collection at two separate cyber security exercises. This data collection involved multiple measures which included behavioral performance based on human-machine transactions and questionnaire-based assessments of cyber security experience.

  13. Computer-mediated communication: task performance and satisfaction.

    Science.gov (United States)

    Simon, Andrew F

    2006-06-01

    The author assessed satisfaction and performance on 3 tasks (idea generation, intellective, judgment) among 75 dyads (N = 150) working through 1 of 3 modes of communication (instant messaging, videoconferencing, face to face). The author based predictions on the Media Naturalness Theory (N. Kock, 2001, 2002) and on findings from past researchers (e.g., D. M. DeRosa, C. Smith, & D. A. Hantula, in press) of the interaction between tasks and media. The present author did not identify task performance differences, although satisfaction with the medium was lower among those dyads communicating through an instant-messaging system than among those interacting face to face or through videoconferencing. The findings support the Media Naturalness Theory. The author discussed them in relation to the participants' frequent use of instant messaging and their familiarity with new communication media.

  14. Computer versus paper--does it make any difference in test performance?

    Science.gov (United States)

    Karay, Yassin; Schauber, Stefan K; Stosch, Christoph; Schüttpelz-Brauns, Katrin

    2015-01-01

    CONSTRUCT: In this study, we examine the differences in test performance between the paper-based and the computer-based version of the Berlin formative Progress Test. In this context it is the first study that allows controlling for students' prior performance. Computer-based tests make possible a more efficient examination procedure for test administration and review. Although university staff will benefit largely from computer-based tests, the question arises whether computer-based tests influence students' test performance. A total of 266 German students from the 9th and 10th semester of medicine (comparable with the 4th-year North American medical school schedule) participated in the study (paper = 132, computer = 134). The allocation of the test format was conducted as a randomized matched-pair design in which students were first sorted according to their prior test results. The organizational procedure, the examination conditions, the room and seating arrangements, as well as the order of questions and answers, were identical in both groups. The sociodemographic variables and pretest scores of both groups were comparable. The test results from the paper and computer versions did not differ. The groups remained within the allotted time, but students using the computer version (particularly the high performers) needed significantly less time to complete the test. In addition, we found significant differences in guessing behavior. Low performers using the computer version guess significantly more than low-performing students using the paper-pencil version. Participants in computer-based tests are not at a disadvantage in terms of their test results. The computer-based test required less processing time. The reason for the longer processing time with the paper-pencil version might be the time needed to write the answer down and to check that the answer was transferred correctly. It is still not known why students using the computer version (particularly low-performing

  15. Mapping the Conjugate Gradient Algorithm onto High Performance Heterogeneous Computers

    Science.gov (United States)

    2014-05-01

    Only fragments of this report's front matter survive in the record: a reference to "Solution of sparse indefinite systems of linear equations" (Society for Industrial and Applied Mathematics, 12(4), 617-629), a reference to Parker, M. (2009), "Taking advantage...", and part of a list of symbols and abbreviations (API: Application Programming Interface; ASIC: Application Specific Integrated Circuit). The only surviving body text notes that, for the FPGA designer, final implementations were nearly always performed using fixed-point or integer arithmetic (Parker 2009). With the recent ...
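
    Since the report concerns mapping the conjugate gradient (CG) method onto heterogeneous hardware, a plain double-precision NumPy reference of the classical CG iteration for a symmetric positive-definite system is sketched below; the fixed-point/integer FPGA variants discussed in the report are not reproduced.

      import numpy as np

      def conjugate_gradient(A, b, tol=1e-10, max_iter=None):
          """Classical CG for A x = b with A symmetric positive definite."""
          n = b.size
          x = np.zeros(n)
          r = b - A @ x                      # initial residual
          p = r.copy()                       # initial search direction
          rs = r @ r
          for _ in range(max_iter or n):
              Ap = A @ p
              alpha = rs / (p @ Ap)
              x += alpha * p
              r -= alpha * Ap
              rs_new = r @ r
              if np.sqrt(rs_new) < tol:
                  break
              p = r + (rs_new / rs) * p      # conjugate direction update
              rs = rs_new
          return x

      # Example: a small SPD system
      rng = np.random.default_rng(0)
      M = rng.standard_normal((50, 50))
      A = M @ M.T + 50 * np.eye(50)
      b = rng.standard_normal(50)
      print(np.allclose(conjugate_gradient(A, b), np.linalg.solve(A, b)))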

  16. 14th annual Results and Review Workshop on High Performance Computing in Science and Engineering

    CERN Document Server

    Nagel, Wolfgang E; Resch, Michael M; Transactions of the High Performance Computing Center, Stuttgart (HLRS) 2011; High Performance Computing in Science and Engineering '11

    2012-01-01

    This book presents the state-of-the-art in simulation on supercomputers. Leading researchers present results achieved on systems of the High Performance Computing Center Stuttgart (HLRS) for the year 2011. The reports cover all fields of computational science and engineering, ranging from CFD to computational physics and chemistry, to computer science, with a special emphasis on industrially relevant applications. Presenting results for both vector systems and microprocessor-based systems, the book allows readers to compare the performance levels and usability of various architectures. As HLRS

  17. The correlation between a passion for computer games and the school performance of younger schoolchildren.

    Directory of Open Access Journals (Sweden)

    Maliy D.V.

    2015-07-01

    Today computer games occupy a significant place in children's lives and fundamentally affect the process of the formation and development of their personalities. A number of present-day researchers assert that computer games have a developmental effect on players. Others share the point of view that computer games have negative effects on the cognitive and emotional spheres of a child and claim that children with low self-esteem who neglect their schoolwork and have difficulties in communication are particularly passionate about computer games. This article reviews theoretical and experimental pedagogical and psychological studies of the nature of the correlation between a passion for computer games and the school performance of younger schoolchildren. Our analysis of foreign and Russian psychology studies regarding the problem of playing activities mediated by information and computer technologies allowed us to single out the main criteria for children's passion for computer games and school performance. This article presents the results of a pilot study of the nature of the correlation between a passion for computer games and the school performance of younger schoolchildren. The research involved 32 pupils (12 girls and 20 boys) aged 10-11 years in the 4th grade. The general hypothesis was that there are divergent correlations between the passion of younger schoolchildren for computer games and their school performance. A questionnaire survey administered to the pupils allowed us to obtain information about the amount of time they devoted to computer games, their preferences for computer-game genres, and the extent of their passion for games. To determine the level of school performance we analyzed class registers. To establish the correlation between a passion for computer games and the school performance of younger schoolchildren, as well as to determine the effect of a passion for computer games on the personal qualities of the children

  18. Performance evaluation for compressible flow calculations on five parallel computers of different architectures

    International Nuclear Information System (INIS)

    Kimura, Toshiya.

    1997-03-01

    A two-dimensional explicit Euler solver has been implemented for five MIMD parallel computers of different machine architectures at the Center for Promotion of Computational Science and Engineering of the Japan Atomic Energy Research Institute. These parallel computers are the Fujitsu VPP300, NEC SX-4, CRAY T94, IBM SP2, and Hitachi SR2201. The code was parallelized by several parallelization methods, and a typical compressible flow problem has been calculated for different grid sizes while changing the number of processors. Their effective performances for parallel calculations, such as calculation speed, speed-up ratio and parallel efficiency, have been investigated and evaluated. The communication time among processors has also been measured and evaluated. As a result, the differences in performance and characteristics between vector-parallel and scalar-parallel computers can be identified, providing basic data for the efficient use of parallel computers and for large-scale CFD simulations on parallel computers. (author)
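
    The speed-up ratio and parallel efficiency evaluated above follow the usual definitions S(p) = T(1)/T(p) and E(p) = S(p)/p; the helper below applies them to made-up timings, purely as a reminder of the metrics.

      def speedup_and_efficiency(t_serial, t_parallel, nprocs):
          """Speed-up S = T(1)/T(p) and parallel efficiency E = S/p."""
          s = t_serial / t_parallel
          return s, s / nprocs

      # Hypothetical timings for a fixed-size compressible-flow run
      for p, t in [(1, 800.0), (4, 230.0), (16, 75.0)]:
          s, e = speedup_and_efficiency(800.0, t, p)
          print(f"{p:3d} procs: speed-up {s:5.2f}, efficiency {e:4.2f}")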

  19. Generalized Portable SHMEM Library for High Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Parzyszek, Krzysztof [Iowa State Univ., Ames, IA (United States)

    2003-01-01

    This dissertation describes the efforts to design and implement the Generalized Portable SHMEM library, GPSHMEM, as well as supplementary tools. There are two major components of the GPSHMEM project: the GPSHMEM library itself and the Fortran 77 source-to-source translator. The rest of this thesis is divided into two parts. Part I introduces the shared memory model and the distributed shared memory model. It explains the motivation behind GPSHMEM and presents its functionality and performance results. Part II is entirely devoted to the Fortran 77 translator, called fgpp. The need for such a tool is demonstrated, functionality goals are stated, and the design issues are presented along with the development of the solutions.

  20. Spiking neural networks on high performance computer clusters

    Science.gov (United States)

    Chen, Chong; Taha, Tarek M.

    2011-09-01

    In this paper we examine the acceleration of two spiking neural network models on three clusters of multicore processors representing three categories of processors: x86, STI Cell, and NVIDIA GPGPUs. The x86 cluster utilized consists of 352 dual-core AMD Opterons, the Cell cluster consists of 320 Sony PlayStation 3s, while the GPGPU cluster contains 32 NVIDIA Tesla S1070 systems. The results indicate that the GPGPU platform can dominate in performance compared to the Cell and x86 platforms examined. From a cost perspective, however, the GPGPU is more expensive in terms of neuron/s throughput. If the cost of GPGPUs goes down in the future, this platform will become very cost effective for these models.
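
    As a stand-in for the spiking models accelerated in the paper (which are not specified in the abstract), the sketch below advances a population of generic leaky integrate-and-fire neurons in NumPy; all membrane parameters and input currents are illustrative assumptions.

      import numpy as np

      def simulate_lif(n_neurons=1000, steps=1000, dt=1e-3, tau=0.02,
                       v_rest=-65e-3, v_thresh=-50e-3, v_reset=-65e-3, r=1e7):
          """Leaky integrate-and-fire neurons driven by independent noisy currents."""
          rng = np.random.default_rng(0)
          v = np.full(n_neurons, v_rest)
          spikes = np.zeros((steps, n_neurons), dtype=bool)
          for t in range(steps):
              i_in = rng.normal(1.6e-9, 0.5e-9, n_neurons)    # input current (A)
              v += dt / tau * (-(v - v_rest) + r * i_in)      # membrane update
              fired = v >= v_thresh
              spikes[t] = fired
              v[fired] = v_reset                              # reset after a spike
          return spikes

      spikes = simulate_lif()
      print("mean rate:", spikes.sum() / (1000 * 1.0), "Hz")  # spikes per neuron per second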

  1. Computational Human Performance Modeling For Alarm System Design

    Energy Technology Data Exchange (ETDEWEB)

    Jacques Hugo

    2012-07-01

    The introduction of new technologies like adaptive automation systems and advanced alarm processing and presentation techniques in nuclear power plants is already having an impact on the safety and effectiveness of plant operations and also on the role of the control room operator. This impact is expected to escalate dramatically as more and more nuclear power utilities embark on upgrade projects in order to extend the lifetime of their plants. One of the most visible impacts in control rooms will be the need to replace aging alarm systems. Because most of these alarm systems use obsolete technologies, the methods, techniques and tools that were used to design the previous generation of alarm systems are no longer effective and need to be updated. The same applies to the need to analyze and redefine operators' alarm handling tasks. In the past, methods for analyzing human tasks and workload have relied on crude, paper-based methods that often lacked traceability. New approaches are needed to allow analysts to model and represent the new concepts of alarm operation and human-system interaction. State-of-the-art task simulation tools are now available that offer a cost-effective and efficient method for examining the effect of operator performance in different conditions and operational scenarios. A discrete event simulation system was used by human factors researchers at the Idaho National Laboratory to develop a generic alarm handling model to examine the effect of operator performance with a simulated modern alarm system. It allowed analysts to evaluate alarm generation patterns as well as critical task times and human workload predicted by the system.
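
    A minimal discrete-event sketch in the spirit of the generic alarm-handling model is shown below using the SimPy library: alarms arrive at random, a single operator resource acknowledges and diagnoses each one, and the waiting times indicate workload. The arrival and handling rates are made-up values, and the model is far simpler than the one described.

      import random
      import simpy

      ARRIVAL_MEAN = 30.0      # seconds between alarms (assumed)
      HANDLE_MEAN = 20.0       # seconds to acknowledge and diagnose (assumed)
      waits = []

      def alarm_source(env, operator):
          while True:
              yield env.timeout(random.expovariate(1.0 / ARRIVAL_MEAN))
              env.process(handle_alarm(env, operator))

      def handle_alarm(env, operator):
          raised = env.now
          with operator.request() as req:            # wait for the operator
              yield req
              waits.append(env.now - raised)         # time the alarm sat unhandled
              yield env.timeout(random.expovariate(1.0 / HANDLE_MEAN))

      random.seed(1)
      env = simpy.Environment()
      operator = simpy.Resource(env, capacity=1)     # one control-room operator
      env.process(alarm_source(env, operator))
      env.run(until=8 * 3600)                        # one 8-hour shift
      print(f"{len(waits)} alarms, mean wait {sum(waits) / len(waits):.1f} s")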

  2. A parallel process model of the development of positive smoking expectancies and smoking behavior during early adolescence in Caucasian and African American girls

    OpenAIRE

    Chung, Tammy; White, Helene R.; Hipwell, Alison E.; Stepp, Stephanie D.; Loeber, Rolf

    2010-01-01

    This study examined the development of positive smoking expectancies and smoking behavior in an urban cohort of girls followed annually over ages 11-14. Longitudinal data from the oldest cohort of the Pittsburgh Girls Study (N=566, 56% African American, 44% Caucasian) were used to estimate a parallel process growth model of positive smoking expectancies and smoking behavior. Average level of positive smoking expectancies was relatively stable over ages 11-14, although there was significant va...

  3. Unified, Cross-Platform, Open-Source Library Package for High-Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Kozacik, Stephen [EM Photonics, Inc., Newark, DE (United States)

    2017-05-15

    Compute power is continually increasing, but this increased performance is largely found in sophisticated computing devices and supercomputer resources that are difficult to use, resulting in under-utilization. We developed a unified set of programming tools that will allow users to take full advantage of the new technology by allowing them to work at a level abstracted away from the platform specifics, encouraging the use of modern computing systems, including government-funded supercomputer facilities.

  4. Scientific Grand Challenges: Forefront Questions in Nuclear Science and the Role of High Performance Computing

    International Nuclear Information System (INIS)

    Khaleel, Mohammad A.

    2009-01-01

    This report is an account of the deliberations and conclusions of the workshop on 'Forefront Questions in Nuclear Science and the Role of High Performance Computing' held January 26-28, 2009, co-sponsored by the U.S. Department of Energy (DOE) Office of Nuclear Physics (ONP) and the DOE Office of Advanced Scientific Computing (ASCR). Representatives from the national and international nuclear physics communities, as well as from the high performance computing community, participated. The purpose of this workshop was to (1) identify forefront scientific challenges in nuclear physics and then determine which-if any-of these could be aided by high performance computing at the extreme scale; (2) establish how and why new high performance computing capabilities could address issues at the frontiers of nuclear science; (3) provide nuclear physicists the opportunity to influence the development of high performance computing; and (4) provide the nuclear physics community with plans for development of future high performance computing capability by DOE ASCR.

  5. Scientific Grand Challenges: Forefront Questions in Nuclear Science and the Role of High Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Khaleel, Mohammad A.

    2009-10-01

    This report is an account of the deliberations and conclusions of the workshop on "Forefront Questions in Nuclear Science and the Role of High Performance Computing" held January 26-28, 2009, co-sponsored by the U.S. Department of Energy (DOE) Office of Nuclear Physics (ONP) and the DOE Office of Advanced Scientific Computing (ASCR). Representatives from the national and international nuclear physics communities, as well as from the high performance computing community, participated. The purpose of this workshop was to 1) identify forefront scientific challenges in nuclear physics and then determine which-if any-of these could be aided by high performance computing at the extreme scale; 2) establish how and why new high performance computing capabilities could address issues at the frontiers of nuclear science; 3) provide nuclear physicists the opportunity to influence the development of high performance computing; and 4) provide the nuclear physics community with plans for development of future high performance computing capability by DOE ASCR.

  6. Advanced Computational and Experimental Techniques for Nacelle Liner Performance Evaluation

    Science.gov (United States)

    Gerhold, Carl H.; Jones, Michael G.; Brown, Martha C.; Nark, Douglas

    2009-01-01

    The Curved Duct Test Rig (CDTR) has been developed to investigate sound propagation through a duct of size comparable to the aft bypass duct of typical aircraft engines. The axial dimension of the bypass duct is often curved and this geometric characteristic is captured in the CDTR. The semiannular bypass duct is simulated by a rectangular test section in which the height corresponds to the circumferential dimension and the width corresponds to the radial dimension. The liner samples are perforate over honeycomb core and are installed on the side walls of the test section. The top and bottom surfaces of the test section are acoustically rigid to simulate a hard wall bifurcation or pylon. A unique feature of the CDTR is the control system that generates sound incident on the liner test section in specific modes. Uniform air flow, at ambient temperature and flow speed Mach 0.275, is introduced through the duct. Experiments to investigate configuration effects such as curvature along the flow path on the acoustic performance of a sample liner are performed in the CDTR and reported in this paper. Combinations of treated and acoustically rigid side walls are investigated. The scattering of modes of the incident wave, both by the curvature and by the asymmetry of wall treatment, is demonstrated in the experimental results. The effect that mode scattering has on total acoustic effectiveness of the liner treatment is also shown. Comparisons of measured liner attenuation with numerical results predicted by an analytic model based on the parabolic approximation to the convected Helmholtz equation are reported. The spectra of attenuation produced by the analytic model are similar to experimental results for both walls treated, straight and curved flow path, with plane wave and higher order modes incident. The numerical model is used to define the optimized resistance and reactance of a liner that significantly improves liner attenuation in the frequency range 1900-2400 Hz. A

  7. Assessment of calcium scoring performance in cardiac computed tomography

    International Nuclear Information System (INIS)

    Ulzheimer, Stefan; Kalender, Willi A.

    2003-01-01

    Electron beam tomography (EBT) has been used for cardiac diagnosis and the quantitative assessment of coronary calcium since the late 1980s. The introduction of mechanical multi-slice spiral CT (MSCT) scanners with shorter rotation times opened new possibilities for cardiac imaging with conventional CT scanners. The purpose of this work was to qualitatively and quantitatively evaluate the performance of EBT and MSCT for the task of coronary artery calcium imaging as a function of acquisition protocol, heart rate, spiral reconstruction algorithm (where applicable) and calcium scoring method. A cardiac CT semi-anthropomorphic phantom was designed and manufactured for the investigation of all relevant image quality parameters in cardiac CT. This phantom includes various test objects, some of which can be moved within the anthropomorphic phantom in a manner that mimics realistic heart motion. These tools were used to qualitatively and quantitatively demonstrate the accuracy of coronary calcium imaging using typical protocols for an electron beam scanner (Evolution C-150XP, Imatron, South San Francisco, Calif.) and a 0.5-s four-slice spiral CT scanner (Sensation 4, Siemens, Erlangen, Germany). A special focus was put on the method of quantifying coronary calcium, and three scoring systems were evaluated (Agatston, volume, and mass scoring). Good reproducibility in coronary calcium scoring is always the result of a combination of high temporal and spatial resolution; consequently, thin-slice protocols in combination with retrospective gating on MSCT scanners yielded the best results. The Agatston score was found to be the least reproducible scoring method. The hydroxyapatite mass, being more reproducible, comparable across different scanners, and a physical quantitative measure, appears to be the method of choice for future clinical studies. The hydroxyapatite mass is highly correlated with the Agatston score. The introduced phantoms can be used to quantitatively assess the
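
    For reference, the sketch below computes the three scores named above for a single lesion on one slice: the Agatston score (lesion area times a weight set by peak attenuation), the volume score, and a hydroxyapatite mass. The 130 HU threshold and weight bands follow the usual Agatston convention; the mass calibration factor and the example lesion are assumed values.

      import numpy as np

      def agatston_weight(max_hu):
          """Standard Agatston density weight for a lesion's peak attenuation."""
          return 1 if max_hu < 200 else 2 if max_hu < 300 else 3 if max_hu < 400 else 4

      def calcium_scores(lesion_pixels_hu, pixel_area_mm2, slice_thickness_mm,
                         mass_calibration=0.8e-3):
          """Agatston, volume (mm^3) and hydroxyapatite mass (mg) for one lesion.

          `lesion_pixels_hu` holds HU values of connected pixels >= 130 HU;
          `mass_calibration` (mg per HU*mm^3) is scanner-specific and assumed here.
          """
          hu = np.asarray(lesion_pixels_hu, dtype=float)
          area = hu.size * pixel_area_mm2
          agatston = area * agatston_weight(hu.max())
          volume = area * slice_thickness_mm
          mass = mass_calibration * hu.mean() * volume
          return agatston, volume, mass

      # Example: a 25-pixel lesion (HU drawn between 130 and 320) on 0.5 x 0.5 mm
      # pixels with a 3 mm slice thickness
      print(calcium_scores(np.random.default_rng(0).uniform(130, 320, 25), 0.25, 3.0))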

  8. Newmark local time stepping on high-performance computing architectures

    KAUST Repository

    Rietmann, Max

    2016-11-25

    In multi-scale complex media, finite element meshes often require areas of local refinement, creating small elements that can dramatically reduce the global time-step for wave-propagation problems due to the CFL condition. Local time stepping (LTS) algorithms allow an explicit time-stepping scheme to adapt the time-step to the element size, allowing near-optimal time-steps everywhere in the mesh. We develop an efficient multilevel LTS-Newmark scheme and implement it in a widely used continuous finite element seismic wave-propagation package. In particular, we extend the standard LTS formulation with adaptations to continuous finite element methods that can be implemented very efficiently with very strong element-size contrasts (more than 100×). Capable of running on large CPU and GPU clusters, we present both synthetic validation examples and large scale, realistic application examples to demonstrate the performance and applicability of the method and implementation on thousands of CPU cores and hundreds of GPUs.
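
    The LTS scheme above builds on the classical Newmark-beta update for M u'' + K u = f(t). The NumPy sketch below implements the standard constant-average-acceleration variant (beta = 1/4, gamma = 1/2) on a single undamped oscillator; the multilevel local-time-stepping machinery and the finite element context are not reproduced.

      import numpy as np

      def newmark(M, K, f, u0, v0, dt, steps, beta=0.25, gamma=0.5):
          """Newmark-beta time integration of  M u'' + K u = f(t)  (no damping)."""
          u, v = u0.copy(), v0.copy()
          a = np.linalg.solve(M, f(0.0) - K @ u)            # initial acceleration
          lhs = M + beta * dt**2 * K                        # effective system matrix
          history = [u.copy()]
          for n in range(1, steps + 1):
              u_pred = u + dt * v + (0.5 - beta) * dt**2 * a
              v_pred = v + (1.0 - gamma) * dt * a
              a = np.linalg.solve(lhs, f(n * dt) - K @ u_pred)
              u = u_pred + beta * dt**2 * a
              v = v_pred + gamma * dt * a
              history.append(u.copy())
          return np.array(history)

      # Example: single undamped oscillator, K/M = (2*pi)^2, i.e. 1 Hz free vibration
      M = np.array([[1.0]]); K = np.array([[(2 * np.pi)**2]])
      traj = newmark(M, K, lambda t: np.zeros(1), np.array([1.0]), np.zeros(1),
                     dt=0.01, steps=200)
      print(traj[0, 0], traj[50, 0], traj[100, 0])   # approximately +1, -1, +1 over one period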

  9. Newmark local time stepping on high-performance computing architectures

    Energy Technology Data Exchange (ETDEWEB)

    Rietmann, Max, E-mail: max.rietmann@erdw.ethz.ch [Institute for Computational Science, Università della Svizzera italiana, Lugano (Switzerland); Institute of Geophysics, ETH Zurich (Switzerland); Grote, Marcus, E-mail: marcus.grote@unibas.ch [Department of Mathematics and Computer Science, University of Basel (Switzerland); Peter, Daniel, E-mail: daniel.peter@kaust.edu.sa [Institute for Computational Science, Università della Svizzera italiana, Lugano (Switzerland); Institute of Geophysics, ETH Zurich (Switzerland); Schenk, Olaf, E-mail: olaf.schenk@usi.ch [Institute for Computational Science, Università della Svizzera italiana, Lugano (Switzerland)

    2017-04-01

    In multi-scale complex media, finite element meshes often require areas of local refinement, creating small elements that can dramatically reduce the global time-step for wave-propagation problems due to the CFL condition. Local time stepping (LTS) algorithms allow an explicit time-stepping scheme to adapt the time-step to the element size, allowing near-optimal time-steps everywhere in the mesh. We develop an efficient multilevel LTS-Newmark scheme and implement it in a widely used continuous finite element seismic wave-propagation package. In particular, we extend the standard LTS formulation with adaptations to continuous finite element methods that can be implemented very efficiently with very strong element-size contrasts (more than 100x). Capable of running on large CPU and GPU clusters, we present both synthetic validation examples and large scale, realistic application examples to demonstrate the performance and applicability of the method and implementation on thousands of CPU cores and hundreds of GPUs.

  10. A Review of Lightweight Thread Approaches for High Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Castello, Adrian; Pena, Antonio J.; Seo, Sangmin; Mayo, Rafael; Balaji, Pavan; Quintana-Orti, Enrique S.

    2016-09-12

    High-level, directive-based solutions are becoming the dominant programming models (PMs) for multi- and many-core architectures. Several solutions relying on operating system (OS) threads work well with a moderate number of cores. However, exascale systems will spawn hundreds of thousands of threads in order to exploit their massively parallel architectures, and conventional OS threads are too heavy for that purpose. Several lightweight thread (LWT) libraries have recently appeared, offering lighter mechanisms to tackle massive concurrency. In order to examine the suitability of LWTs in high-level runtimes, we develop a set of microbenchmarks consisting of commonly found patterns in current parallel codes. Moreover, we study the semantics offered by some LWT libraries in order to expose the similarities between different LWT application programming interfaces. This study reveals that a reduced set of LWT functions can be sufficient to cover the common parallel code patterns, and that those LWT libraries outperform OS-thread-based solutions for task and nested parallelism, which are becoming more common on new architectures.
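
    The granularity argument can be illustrated outside any particular LWT library: creating one OS thread per fine-grained task is dominated by thread creation and scheduling cost, whereas multiplexing many small tasks over a few workers amortizes it. The Python sketch below only mimics that comparison with the standard library; it does not use the LWT libraries studied in the paper.

        import threading
        import time
        from concurrent.futures import ThreadPoolExecutor

        def tiny_task(x):
            return x * x                     # fine-grained unit of work

        N = 1000

        # One OS thread per task: creation and teardown dominate.
        start = time.perf_counter()
        threads = [threading.Thread(target=tiny_task, args=(i,)) for i in range(N)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()
        print("one OS thread per task :", time.perf_counter() - start)

        # The same tasks multiplexed over a small, fixed pool of workers.
        start = time.perf_counter()
        with ThreadPoolExecutor(max_workers=4) as pool:
            list(pool.map(tiny_task, range(N)))
        print("task pool over 4 workers:", time.perf_counter() - start)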

  11. Newmark local time stepping on high-performance computing architectures

    KAUST Repository

    Rietmann, Max; Grote, Marcus; Peter, Daniel; Schenk, Olaf

    2016-01-01

    In multi-scale complex media, finite element meshes often require areas of local refinement, creating small elements that can dramatically reduce the global time-step for wave-propagation problems due to the CFL condition. Local time stepping (LTS) algorithms allow an explicit time-stepping scheme to adapt the time-step to the element size, allowing near-optimal time-steps everywhere in the mesh. We develop an efficient multilevel LTS-Newmark scheme and implement it in a widely used continuous finite element seismic wave-propagation package. In particular, we extend the standard LTS formulation with adaptations to continuous finite element methods that can be implemented very efficiently with very strong element-size contrasts (more than 100×). Capable of running on large CPU and GPU clusters, we present both synthetic validation examples and large scale, realistic application examples to demonstrate the performance and applicability of the method and implementation on thousands of CPU cores and hundreds of GPUs.

  12. Computational Studies on the Performance of Flow Distributor in Tank

    International Nuclear Information System (INIS)

    Shin, Soo Jai; Kim, Young In; Ryu, Seungyeob; Bae, Youngmin

    2014-01-01

    The core make-up tank (CMT) is filled with borated water and provides makeup and boration to the reactor coolant system (RCS) during the early stage of a loss of coolant accident (LOCA) and non-LOCA events. The top and bottom of the CMT are connected to the RCS through the pressure balance line (PBL) and the safety injection line (SIL), respectively. Each PBL is normally open to keep the CMT at RCS pressure, and this arrangement enables the CMT to inject water into the RCS by gravity when the isolation valves of the SIL are open. During CMT injection into the reactor, condensation and thermal stratification are observed in the CMT, and rapid condensation can disturb the injection operation. An optimal design of the flow distributor is very important to ensure the structural integrity of the reactor system and its safe operation during transient or accident conditions. In the present study, we numerically investigated the performance of the flow distributor in the tank for different shape factors such as the total number of holes, the pitch-to-hole-diameter ratio (p/d), the hole diameter and the area ratio. These data will contribute to the design of the flow distributor. The flow distributor in the tank is simulated using the commercial CFD software Fluent 13.0, varying shape factors such as the total number of holes, the hole diameter and the area ratio. As the hole diameter decreases, the velocity difference between holes located at the upper and lower positions of the flow distributor also decreases. For a larger area ratio, the velocity through the holes is lower. When the hole diameter is large enough for the velocity difference between holes to be large, however, the velocity through the holes is not inversely proportional to the area ratio

  13. Computational Studies on the Performance of Flow Distributor in Tank

    Energy Technology Data Exchange (ETDEWEB)

    Shin, Soo Jai; Kim, Young In; Ryu, Seungyeob; Bae, Youngmin [Korea Atomic Energy Research Institute, Daejeon (Korea, Republic of)

    2014-05-15

    The core make-up tank (CMT) is filled with borated water and provides makeup and boration to the reactor coolant system (RCS) during the early stage of a loss of coolant accident (LOCA) and non-LOCA events. The top and bottom of the CMT are connected to the RCS through the pressure balance line (PBL) and the safety injection line (SIL), respectively. Each PBL is normally open to keep the CMT at RCS pressure, and this arrangement enables the CMT to inject water into the RCS by gravity when the isolation valves of the SIL are open. During CMT injection into the reactor, condensation and thermal stratification are observed in the CMT, and rapid condensation can disturb the injection operation. An optimal design of the flow distributor is very important to ensure the structural integrity of the reactor system and its safe operation during transient or accident conditions. In the present study, we numerically investigated the performance of the flow distributor in the tank for different shape factors such as the total number of holes, the pitch-to-hole-diameter ratio (p/d), the hole diameter and the area ratio. These data will contribute to the design of the flow distributor. The flow distributor in the tank is simulated using the commercial CFD software Fluent 13.0, varying shape factors such as the total number of holes, the hole diameter and the area ratio. As the hole diameter decreases, the velocity difference between holes located at the upper and lower positions of the flow distributor also decreases. For a larger area ratio, the velocity through the holes is lower. When the hole diameter is large enough for the velocity difference between holes to be large, however, the velocity through the holes is not inversely proportional to the area ratio.

  14. Performance of a computer-based assessment of cognitive function measures in two cohorts of seniors

    Science.gov (United States)

    Computer-administered assessment of cognitive function is being increasingly incorporated in clinical trials; however, its performance in these settings has not been systematically evaluated. The Seniors Health and Activity Research Program (SHARP) pilot trial (N=73) developed a computer-based tool f...

  15. Computer-aided Detection of Lung Cancer on Chest Radiographs: Effect on Observer Performance

    NARCIS (Netherlands)

    de Hoop, Bartjan; de Boo, Diederik W.; Gietema, Hester A.; van Hoorn, Frans; Mearadji, Banafsche; Schijf, Laura; van Ginneken, Bram; Prokop, Mathias; Schaefer-Prokop, Cornelia

    2010-01-01

    Purpose: To assess how computer-aided detection (CAD) affects reader performance in detecting early lung cancer on chest radiographs. Materials and Methods: In this ethics committee-approved study, 46 individuals with 49 computed tomographically (CT)-detected and histologically proved lung cancers

  16. Effectiveness of an Electronic Performance Support System on Computer Ethics and Ethical Decision-Making Education

    Science.gov (United States)

    Kert, Serhat Bahadir; Uz, Cigdem; Gecu, Zeynep

    2014-01-01

    This study examined the effectiveness of an electronic performance support system (EPSS) on computer ethics education and the ethical decision-making processes. There were five different phases to this ten-month study: (1) Writing computer ethics scenarios, (2) Designing a decision-making framework, (3) Developing EPSS software, (4) Using EPSS in a…

  17. Computational Model-Based Prediction of Human Episodic Memory Performance Based on Eye Movements

    Science.gov (United States)

    Sato, Naoyuki; Yamaguchi, Yoko

    Subjects' episodic memory performance is not simply reflected by eye movements. We use a ‘theta phase coding’ model of the hippocampus to predict subjects' memory performance from their eye movements. Results demonstrate the ability of the model to predict subjects' memory performance. These studies provide a novel approach to computational modeling in the human-machine interface.

  18. COMPUTING

    CERN Multimedia

    M. Kasemann

    Introduction During the past six months, Computing participated in the STEP09 exercise, had a major involvement in the October exercise and has been working with CMS sites on improving open issues relevant for data taking. At the same time, operations for MC production, real data reconstruction and re-reconstructions and data transfers at large scales were performed. STEP09 was successfully conducted in June as a joint exercise with ATLAS and the other experiments. It gave a good indication of the readiness of the WLCG infrastructure with the two major LHC experiments stressing the reading, writing and processing of physics data. The October Exercise, in contrast, was conducted as an all-CMS exercise, where Physics, Computing and Offline worked on a common plan to exercise all steps to efficiently access and analyze data. As one of the major results, the CMS Tier-2 sites were demonstrated to be fully capable of performing data analysis. In recent weeks, efforts were devoted to CMS Computing readiness. All th...

  19. The Computer in Performance and Instruction: Or, How to Tell the True Color of a Chameleon.

    Science.gov (United States)

    Davis, Richard

    1979-01-01

    Discusses such potential uses for the computer in employee training programs as management support for training and performance, training project control, improved design and development methods, training program implementation and delivery, and program evaluation, revision, and maintenance. (JEG)

  20. Spatial Processing of Urban Acoustic Wave Fields from High-Performance Computations

    National Research Council Canada - National Science Library

    Ketcham, Stephen A; Wilson, D. K; Cudney, Harley H; Parker, Michael W

    2007-01-01

    .... The objective of this work is to develop spatial processing techniques for acoustic wave propagation data from three-dimensional high-performance computations to quantify scattering due to urban...

  1. FY 1996 Blue Book: High Performance Computing and Communications: Foundations for America`s Information Future

    Data.gov (United States)

    Networking and Information Technology Research and Development, Executive Office of the President — The Federal High Performance Computing and Communications HPCC Program will celebrate its fifth anniversary in October 1996 with an impressive array of...

  2. Optical high-performance computing: introduction to the JOSA A and Applied Optics feature.

    Science.gov (United States)

    Caulfield, H John; Dolev, Shlomi; Green, William M J

    2009-08-01

    The feature issues in both Applied Optics and the Journal of the Optical Society of America A focus on topics of immediate relevance to the community working in the area of optical high-performance computing.

  3. High-performance computing on GPUs for resistivity logging of oil and gas wells

    Science.gov (United States)

    Glinskikh, V.; Dudaev, A.; Nechaev, O.; Surodina, I.

    2017-10-01

    We developed and implemented in software an algorithm for high-performance simulation of electrical logs from oil and gas wells using heterogeneous computing. The numerical solution of the 2D forward problem is based on the finite-element method and the Cholesky decomposition for solving the system of linear algebraic equations (SLAE). Software implementations of the algorithm were made using NVIDIA CUDA technology and computing libraries, allowing us to perform the decomposition of the SLAE and find its solution on the central processing unit (CPU) and the graphics processing unit (GPU). The calculation time is analyzed as a function of the matrix size and the number of its non-zero elements. We estimated the computing speed on the CPU and GPU, including high-performance heterogeneous CPU-GPU computing. Using the developed algorithm, we simulated resistivity data in realistic models.
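
    The CPU-side core of such a solver, Cholesky factorization of a symmetric positive-definite system followed by two triangular solves, can be sketched in a few lines. The sketch below uses SciPy on a small random stand-in matrix; it is not the CUDA implementation or the finite-element system from the study.

        import numpy as np
        from scipy.linalg import cho_factor, cho_solve

        # Stand-in for the assembled FEM matrix: a small SPD system A x = b.
        rng = np.random.default_rng(0)
        M = rng.standard_normal((200, 200))
        A = M @ M.T + 200.0 * np.eye(200)     # symmetric positive definite by construction
        b = rng.standard_normal(200)

        # Cholesky decomposition A = L L^T, then forward/backward substitution.
        c, low = cho_factor(A)
        x = cho_solve((c, low), b)

        print("residual norm:", np.linalg.norm(A @ x - b))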

  4. FY 1997 Blue Book: High Performance Computing and Communications: Advancing the Frontiers of Information Technology

    Data.gov (United States)

    Networking and Information Technology Research and Development, Executive Office of the President — The Federal High Performance Computing and Communications HPCC Program will celebrate its fifth anniversary in October 1996 with an impressive array of...

  5. Performance of a Sequential and Parallel Computational Fluid Dynamic (CFD) Solver on a Missile Body Configuration

    National Research Council Canada - National Science Library

    Hisley, Dixie

    1999-01-01

    .... The goals of this report are: (1) to investigate the performance of message passing and loop level parallelization techniques, as they were implemented in the computational fluid dynamics (CFD...

  6. Can We Build a Truly High Performance Computer Which is Flexible and Transparent?

    KAUST Repository

    Rojas, Jhonathan Prieto; Sevilla, Galo T.; Hussain, Muhammad Mustafa

    2013-01-01

    cost advantage. In that context, low-cost mono-crystalline bulk silicon (100) based high performance transistors are considered as the heart of today's computers. One limitation is silicon's rigidity and brittleness. Here we show a generic batch process

  7. Comprehensive Simulation Lifecycle Management for High Performance Computing Modeling and Simulation, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — There are significant logistical barriers to entry-level high performance computing (HPC) modeling and simulation (M&S). IllinoisRocstar sets up the infrastructure for...

  8. High performance computing system in the framework of the Higgs boson studies

    CERN Document Server

    Belyaev, Nikita; The ATLAS collaboration; Velikhov, Vasily; Konoplich, Rostislav

    2017-01-01

    The Higgs boson physics is one of the most important and promising fields of study in modern high energy physics. It is important to note that GRID computing resources are becoming strictly limited due to the increasing amount of statistics required for physics analyses and the unprecedented LHC performance. One of the possibilities for addressing the shortfall of computing resources is the use of computer institutes' clusters, commercial computing resources and supercomputers. To perform precision measurements of the Higgs boson properties under these constraints, it is also essential to have effective instruments to simulate kinematic distributions of signal events. In this talk we give a brief description of the modern distribution reconstruction method called Morphing and perform a few efficiency tests to demonstrate its potential. These studies have been performed on the WLCG and the Kurchatov Institute’s Data Processing Center, including a Tier-1 GRID site as well as a supercomputer. We also analyze the CPU efficienc...

  9. Analysis and modeling of social influence in high performance computing workloads

    KAUST Repository

    Zheng, Shuai; Shae, Zon Yin; Zhang, Xiangliang; Jamjoom, Hani T.; Fong, Liana

    2011-01-01

    Social influence among users (e.g., collaboration on a project) creates bursty behavior in the underlying high performance computing (HPC) workloads. Using representative HPC and cluster workload logs, this paper identifies, analyzes, and quantifies

  10. Export Controls: Implementation of the 1998 Legislative Mandate for High Performance Computers

    National Research Council Canada - National Science Library

    1999-01-01

    We found that most of the 938 proposed exports of high performance computers to civilian end users in countries of concern from February 3, 1998, when procedures implementing the 1998 authorization...

  11. Computational Performance Analysis of Nonlinear Dynamic Systems using Semi-infinite Programming

    Directory of Open Access Journals (Sweden)

    Tor A. Johansen

    2001-01-01

    For nonlinear systems that satisfy certain regularity conditions, it is shown that upper and lower bounds on the performance (cost) function can be computed using linear or quadratic programming. The performance conditions derived from Hamilton-Jacobi inequalities are formulated as linear inequalities defined pointwise by discretizing the state space, assuming a linearly parameterized class of functions representing the candidate performance bounds. Uncertainty with respect to some system parameters can be incorporated by also gridding the parameter set. In addition to performance analysis, the method can also be used to compute Lyapunov functions that guarantee uniform exponential stability.

  12. High performance computing in science and engineering Garching/Munich 2016

    Energy Technology Data Exchange (ETDEWEB)

    Wagner, Siegfried; Bode, Arndt; Bruechle, Helmut; Brehm, Matthias (eds.)

    2016-11-01

    Computer simulations are the well-established third pillar of natural sciences along with theory and experimentation. Particularly high performance computing is growing fast and constantly demands more and more powerful machines. To keep pace with this development, in spring 2015, the Leibniz Supercomputing Centre installed the high performance computing system SuperMUC Phase 2, only three years after the inauguration of its sibling SuperMUC Phase 1. Thereby, the compute capabilities were more than doubled. This book covers the time-frame June 2014 until June 2016. Readers will find many examples of outstanding research in the more than 130 projects that are covered in this book, with each one of these projects using at least 4 million core-hours on SuperMUC. The largest scientific communities using SuperMUC in the last two years were computational fluid dynamics simulations, chemistry and material sciences, astrophysics, and life sciences.

  13. The application of cloud computing to scientific workflows: a study of cost and performance.

    Science.gov (United States)

    Berriman, G Bruce; Deelman, Ewa; Juve, Gideon; Rynge, Mats; Vöckler, Jens-S

    2013-01-28

    The current model of transferring data from data centres to desktops for analysis will soon be rendered impractical by the accelerating growth in the volume of science datasets. Processing will instead often take place on high-performance servers co-located with data. Evaluations of how new technologies such as cloud computing would support such a new distributed computing model are urgently needed. Cloud computing is a new way of purchasing computing and storage resources on demand through virtualization technologies. We report here the results of investigations of the applicability of commercial cloud computing to scientific computing, with an emphasis on astronomy, including investigations of what types of applications can be run cheaply and efficiently on the cloud, and an example of an application well suited to the cloud: processing a large dataset to create a new science product.

  14. Simulation of synchrotron white-beam topographs. An algorithm for parallel processing: application to the study of piezoelectric devices

    International Nuclear Information System (INIS)

    Epelboin, Y.

    1996-01-01

    This paper presents a new algorithm for the integration of Takagi-Taupin equations taking into account the fact that X-ray diffraction is a parallel phenomenon. The diffraction equations show that the propagation of the waves is independent in each incidence plane. It is thus possible to compute in parallel the propagation of the waves in different planes. Two algorithms are presented: the first one for multiprocessor machines where the processors share a common memory, the second one for massively parallel computers. The program is written to achieve a high vectorization ratio and to make it as efficient as possible with modern superscalar and array processors. The simulation of the image of a defect has been divided into two independent parts. In the first one, one computes the derivatives of the deformation inside the crystal; in the second one, these results are used to simulate the image. This allows one rapidly to change the model for a defect, something that was not feasible in all previously written simulation programs since the computation of the deformation was part of the simulation. The study of stroboscopic images of the propagation of acoustic waves in piezoelectric devices is given as an example of the possibilities of this new program. (orig.)
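
    Because each incidence plane is independent, the coarse-grained parallel structure described here maps directly onto a pool of workers. The Python sketch below only shows that distribution pattern; propagate_plane is a hypothetical placeholder, not the Takagi-Taupin integrator of the paper.

        from multiprocessing import Pool

        def propagate_plane(plane_index):
            """Placeholder for integrating the Takagi-Taupin equations in one
            incidence plane; here it just returns a dummy intensity value."""
            return plane_index, float(plane_index) ** 0.5

        if __name__ == "__main__":
            n_planes = 64
            # Planes are independent, so they can be farmed out to processes.
            with Pool(processes=4) as pool:
                results = pool.map(propagate_plane, range(n_planes))
            image = dict(results)   # assemble the per-plane results into the image
            print(len(image), "planes computed")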

  15. Development of a Computational Steering Framework for High Performance Computing Environments on Blue Gene/P Systems

    KAUST Repository

    Danani, Bob K.

    2012-07-01

    Computational steering has revolutionized the traditional workflow in high performance computing (HPC) applications. The standard workflow that consists of preparation of an application’s input, running of a simulation, and visualization of simulation results in a post-processing step is now transformed into a real-time interactive workflow that significantly reduces development and testing time. Computational steering provides the capability to direct or re-direct the progress of a simulation application at run-time. It allows modification of application-defined control parameters at run-time using various user-steering applications. In this project, we propose a computational steering framework for HPC environments that provides an innovative solution and easy-to-use platform, which allows users to connect and interact with running application(s) in real-time. This framework uses RealityGrid as the underlying steering library and adds several enhancements to the library to enable steering support for Blue Gene systems. Included in the scope of this project is the development of a scalable and efficient steering relay server that supports many-to-many connectivity between multiple steered applications and multiple steering clients. Steered applications can range from intermediate simulation and physical modeling applications to complex computational fluid dynamics (CFD) applications or advanced visualization applications. The Blue Gene supercomputer presents special challenges for remote access because the compute nodes reside on private networks. This thesis presents an implemented solution and demonstrates it on representative applications. Thorough implementation details and application enablement steps are also presented in this thesis to encourage direct usage of this framework.

  16. Systems, methods and computer-readable media to model kinetic performance of rechargeable electrochemical devices

    Science.gov (United States)

    Gering, Kevin L.

    2013-01-01

    A system includes an electrochemical cell, monitoring hardware, and a computing system. The monitoring hardware samples performance characteristics of the electrochemical cell. The computing system determines cell information from the performance characteristics. The computing system also analyzes the cell information of the electrochemical cell with a Butler-Volmer (BV) expression modified to determine exchange current density of the electrochemical cell by including kinetic performance information related to pulse-time dependence, electrode surface availability, or a combination thereof. A set of sigmoid-based expressions may be included with the modified-BV expression to determine kinetic performance as a function of pulse time. The determined exchange current density may be used with the modified-BV expression, with or without the sigmoid expressions, to analyze other characteristics of the electrochemical cell. Model parameters can be defined in terms of cell aging, making the overall kinetics model amenable to predictive estimates of cell kinetic performance along the aging timeline.
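
    For orientation, the standard (unmodified) Butler-Volmer relation linking overpotential, exchange current density and current density is sketched below with illustrative parameter values; the patent's pulse-time and electrode-surface-availability modifications and the sigmoid-based expressions are not reproduced.

        import math

        F = 96485.0      # Faraday constant, C/mol
        R = 8.314        # gas constant, J/(mol K)

        def butler_volmer(eta, i0, alpha_a=0.5, alpha_c=0.5, T=298.15):
            """Standard Butler-Volmer current density for overpotential eta (V)
            and exchange current density i0 (result has the units of i0)."""
            return i0 * (math.exp(alpha_a * F * eta / (R * T))
                         - math.exp(-alpha_c * F * eta / (R * T)))

        # Illustrative values only
        for eta in (0.01, 0.05, 0.10):
            print(eta, butler_volmer(eta, i0=2.0))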

  17. GPU-computing in econophysics and statistical physics

    Science.gov (United States)

    Preis, T.

    2011-03-01

    A recent trend in computer science and related fields is general purpose computing on graphics processing units (GPUs), which can yield impressive performance. With multiple cores connected by high memory bandwidth, today's GPUs offer resources for non-graphics parallel processing. This article provides a brief introduction into the field of GPU computing and includes examples. In particular computationally expensive analyses employed in financial market context are coded on a graphics card architecture which leads to a significant reduction of computing time. In order to demonstrate the wide range of possible applications, a standard model in statistical physics - the Ising model - is ported to a graphics card architecture as well, resulting in large speedup values.
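
    Ising-model ports of this kind typically rely on a checkerboard decomposition so that an entire sub-lattice can be updated in parallel. A plain NumPy version of one Metropolis checkerboard sweep is sketched below as a CPU reference; the lattice size and temperature are illustrative and no GPU code is shown.

        import numpy as np

        def metropolis_sweep(spins, beta, rng):
            """One checkerboard Metropolis sweep of the 2D Ising model with
            periodic boundaries; updating one sub-lattice at a time keeps the
            vectorized update valid."""
            i, j = np.indices(spins.shape)
            for parity in (0, 1):
                nn = (np.roll(spins, 1, 0) + np.roll(spins, -1, 0)
                      + np.roll(spins, 1, 1) + np.roll(spins, -1, 1))
                dE = 2.0 * spins * nn                          # energy cost of a flip
                accept = rng.random(spins.shape) < np.exp(-beta * dE)
                spins[((i + j) % 2 == parity) & accept] *= -1
            return spins

        rng = np.random.default_rng(1)
        spins = rng.choice([-1, 1], size=(64, 64))
        for _ in range(200):
            metropolis_sweep(spins, beta=0.45, rng=rng)
        print("magnetization per spin:", spins.mean())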

  18. COMPUTING

    CERN Multimedia

    Matthias Kasemann

    Overview The main focus during the summer was to handle data coming from the detector and to perform Monte Carlo production. The lessons learned during the CCRC and CSA08 challenges in May were addressed by dedicated PADA campaigns led by the Integration team. Big improvements were achieved in the stability and reliability of the CMS Tier1 and Tier2 centres by regular and systematic follow-up of faults and errors with the help of the Savannah bug tracking system. In preparation for data taking, the roles of a Computing Run Coordinator and of regular computing shifts, which monitor the services and infrastructure and interface to the data operations tasks, are being defined. The shift plan until the end of 2008 is being put together. User support worked on documentation and organized several training sessions. The ECoM task force delivered the report on “Use Cases for Start-up of pp Data-Taking” with recommendations and a set of tests to be performed for trigger rates much higher than the ...

  19. Modeling Students' Problem Solving Performance in the Computer-Based Mathematics Learning Environment

    Science.gov (United States)

    Lee, Young-Jin

    2017-01-01

    Purpose: The purpose of this paper is to develop a quantitative model of problem solving performance of students in the computer-based mathematics learning environment. Design/methodology/approach: Regularized logistic regression was used to create a quantitative model of problem solving performance of students that predicts whether students can…

  20. Comparing Postsecondary Marketing Student Performance on Computer-Based and Handwritten Essay Tests

    Science.gov (United States)

    Truell, Allen D.; Alexander, Melody W.; Davis, Rodney E.

    2004-01-01

    The purpose of this study was to determine if there were differences in postsecondary marketing student performance on essay tests based on test format (i.e., computer-based or handwritten). Specifically, the variables of performance, test completion time, and gender were explored for differences based on essay test format. Results of the study…

  1. The high performance cluster computing system for BES offline data analysis

    International Nuclear Information System (INIS)

    Sun Yongzhao; Xu Dong; Zhang Shaoqiang; Yang Ting

    2004-01-01

    A high performance cluster computing system (EPCfarm), which is used for BES offline data analysis, is introduced. The setup and the characteristics of the hardware and software of EPCfarm are described. The PBS queue management package and the performance of EPCfarm are also presented. (authors)

  2. Systems, methods and computer-readable media for modeling cell performance fade of rechargeable electrochemical devices

    Science.gov (United States)

    Gering, Kevin L

    2013-08-27

    A system includes an electrochemical cell, monitoring hardware, and a computing system. The monitoring hardware periodically samples performance characteristics of the electrochemical cell. The computing system determines cell information from the performance characteristics of the electrochemical cell. The computing system also develops a mechanistic level model of the electrochemical cell to determine performance fade characteristics of the electrochemical cell and analyzes the mechanistic level model to estimate performance fade characteristics over the aging of a similar electrochemical cell. The mechanistic level model uses first constant-current pulses applied to the electrochemical cell at a first aging period and at three or more current values bracketing a first exchange current density. The mechanistic level model also is based on second constant-current pulses applied to the electrochemical cell at a second aging period and at three or more current values bracketing the second exchange current density.

  3. Computer simulation of steady-state performance of air-to-air heat pumps

    Energy Technology Data Exchange (ETDEWEB)

    Ellison, R D; Creswick, F A

    1978-03-01

    A computer model by which the performance of air-to-air heat pumps can be simulated is described. The intended use of the model is to evaluate analytically the improvements in performance that can be effected by various component improvements. The model is based on a trio of independent simulation programs originated at the Massachusetts Institute of Technology Heat Transfer Laboratory. The three programs have been combined so that user intervention and decision making between major steps of the simulation are unnecessary. The program was further modified by substituting a new compressor model and adding a capillary tube model, both of which are described. Performance predicted by the computer model is shown to be in reasonable agreement with performance data observed in our laboratory. Planned modifications by which the utility of the computer model can be enhanced in the future are described. User instructions and a FORTRAN listing of the program are included.

  4. A high performance parallel approach to medical imaging

    International Nuclear Information System (INIS)

    Frieder, G.; Frieder, O.; Stytz, M.R.

    1988-01-01

    Research into medical imaging using general purpose parallel processing architectures is described and a review of the performance of previous medical imaging machines is provided. Results demonstrating that general purpose parallel architectures can achieve performance comparable to other, specialized, medical imaging machine architectures are presented. A new back-to-front hidden-surface removal algorithm is described. Results demonstrating the computational savings obtained by using the modified back-to-front hidden-surface removal algorithm are presented. Performance figures for forming a full-scale medical image on a mesh interconnected multiprocessor are presented
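
    The back-to-front idea itself is compact: sort the primitives by depth and draw them from the farthest to the nearest, so closer contributions simply overwrite hidden ones. The representation of a "primitive" in the sketch below is invented for illustration and is not the data structure of the article.

        def back_to_front_render(primitives, width, height):
            """Painter's-style hidden-surface removal: each primitive is
            (depth, pixels, value), where pixels lists the (x, y) cells it covers."""
            image = [[0] * width for _ in range(height)]
            for depth, pixels, value in sorted(primitives, key=lambda p: p[0], reverse=True):
                for x, y in pixels:
                    image[y][x] = value      # nearer primitives overwrite farther ones
            return image

        prims = [
            (5.0, [(0, 0), (1, 0)], 1),      # far primitive
            (2.0, [(1, 0), (2, 0)], 2),      # near primitive wins on the overlap
        ]
        print(back_to_front_render(prims, width=3, height=1))   # [[1, 2, 2]]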

  5. Performance comparison of heuristic algorithms for task scheduling in IaaS cloud computing environment

    Science.gov (United States)

    Madni, Syed Hamid Hussain; Abd Latiff, Muhammad Shafie; Abdullahi, Mohammed; Usman, Mohammed Joda

    2017-01-01

    Cloud computing infrastructure is suitable for meeting computational needs of large task sizes. Optimal scheduling of tasks in cloud computing environment has been proved to be an NP-complete problem, hence the need for the application of heuristic methods. Several heuristic algorithms have been developed and used in addressing this problem, but choosing the appropriate algorithm for solving task assignment problem of a particular nature is difficult since the methods are developed under different assumptions. Therefore, six rule based heuristic algorithms are implemented and used to schedule autonomous tasks in homogeneous and heterogeneous environments with the aim of comparing their performance in terms of cost, degree of imbalance, makespan and throughput. First Come First Serve (FCFS), Minimum Completion Time (MCT), Minimum Execution Time (MET), Max-min, Min-min and Sufferage are the heuristic algorithms considered for the performance comparison and analysis of task scheduling in cloud computing. PMID:28467505
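
    As an illustration of how such list-scheduling heuristics operate, a short sketch of the Min-min heuristic over an expected-time-to-compute (ETC) matrix is given below; the ETC values are invented for the example and the sketch is not taken from the paper's implementation.

        def min_min(etc):
            """Min-min scheduling: etc[t][m] is the execution time of task t on
            machine m.  Returns the task-to-machine assignment and the makespan."""
            n_machines = len(etc[0])
            ready = [0.0] * n_machines              # machine ready times
            unscheduled = set(range(len(etc)))
            assignment = {}
            while unscheduled:
                # pick the task whose earliest possible completion time is smallest
                best = None                         # (completion time, task, machine)
                for t in unscheduled:
                    for m in range(n_machines):
                        ct = ready[m] + etc[t][m]
                        if best is None or ct < best[0]:
                            best = (ct, t, m)
                ct, t, m = best
                assignment[t] = m
                ready[m] = ct
                unscheduled.remove(t)
            return assignment, max(ready)

        # Illustrative ETC matrix: 4 tasks on 2 heterogeneous machines
        etc = [[4.0, 6.0], [3.0, 5.0], [8.0, 2.0], [5.0, 4.0]]
        print(min_min(etc))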

  6. High Performance Numerical Computing for High Energy Physics: A New Challenge for Big Data Science

    International Nuclear Information System (INIS)

    Pop, Florin

    2014-01-01

    Modern physics is based on both theoretical analysis and experimental validation. Complex scenarios like subatomic dimensions, high energy, and lower absolute temperature are frontiers for many theoretical models. Simulation with stable numerical methods represents an excellent instrument for high accuracy analysis, experimental validation, and visualization. High performance computing support offers the possibility of running simulations at large scale and in parallel, but the volume of data generated by these experiments creates a new challenge for Big Data Science. This paper presents existing computational methods for high energy physics (HEP) analyzed from two perspectives: numerical methods and high performance computing. The computational methods presented are Monte Carlo methods and simulations of HEP processes, Markovian Monte Carlo, unfolding methods in particle physics, kernel estimation in HEP, and Random Matrix Theory used in analysis of particles spectrum. All of these methods produce data-intensive applications, which introduce new challenges and requirements for ICT systems architecture, programming paradigms, and storage capabilities.
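
    The first item on that list, plain Monte Carlo estimation, is easy to show in miniature: sample the integrand uniformly, average, and attach a statistical error. The sketch below integrates a Breit-Wigner-like resonance shape with invented parameters; it stands in for none of the specific HEP codes discussed.

        import numpy as np

        def mc_integrate(f, a, b, n, seed=0):
            """Plain Monte Carlo estimate of the integral of f over [a, b],
            with the standard error of the mean as a crude uncertainty."""
            rng = np.random.default_rng(seed)
            x = rng.uniform(a, b, n)
            y = f(x)
            volume = b - a
            return volume * y.mean(), volume * y.std(ddof=1) / np.sqrt(n)

        # Illustrative Breit-Wigner-like shape (mass and width values are made up)
        def bw(e, m=91.2, g=2.5):
            return 1.0 / ((e**2 - m**2) ** 2 + (m * g) ** 2)

        estimate, error = mc_integrate(bw, 80.0, 100.0, 100_000)
        print(f"integral = {estimate:.3e} +/- {error:.1e}")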

  7. Performance comparison of heuristic algorithms for task scheduling in IaaS cloud computing environment.

    Science.gov (United States)

    Madni, Syed Hamid Hussain; Abd Latiff, Muhammad Shafie; Abdullahi, Mohammed; Abdulhamid, Shafi'i Muhammad; Usman, Mohammed Joda

    2017-01-01

    Cloud computing infrastructure is suitable for meeting computational needs of large task sizes. Optimal scheduling of tasks in cloud computing environment has been proved to be an NP-complete problem, hence the need for the application of heuristic methods. Several heuristic algorithms have been developed and used in addressing this problem, but choosing the appropriate algorithm for solving task assignment problem of a particular nature is difficult since the methods are developed under different assumptions. Therefore, six rule based heuristic algorithms are implemented and used to schedule autonomous tasks in homogeneous and heterogeneous environments with the aim of comparing their performance in terms of cost, degree of imbalance, makespan and throughput. First Come First Serve (FCFS), Minimum Completion Time (MCT), Minimum Execution Time (MET), Max-min, Min-min and Sufferage are the heuristic algorithms considered for the performance comparison and analysis of task scheduling in cloud computing.

  8. The QUANTUM I project: Parallel processing in a local area network work dedicated to ab initio calculation of potential hypersurfaces

    International Nuclear Information System (INIS)

    Lavenir, E.; Pic, J.M.; Alibran, P.; Leclercq, J.M.

    1987-01-01

    The QUANTUM I project is a three-stage device. The stages are respectively dedicated to particular steps of the ab initio determination of a point on the hypersurface. The first stage deals with the computation of the integrals between the basis functions, the second with the S.C.F. (or M.C.S.C.F.) process and the third with the C.I. treatment. Each step is implemented in a parallel mode (M.I.M.D.), with the whole device working in a pipeline fashion: the three stages work simultaneously on different points

  9. Effect of aging on performance, muscle activation and perceived stress during mentally demanding computer tasks

    DEFF Research Database (Denmark)

    Alkjaer, Tine; Pilegaard, Marianne; Bakke, Merete

    2005-01-01

    OBJECTIVES: This study examined the effects of age on performance, muscle activation, and perceived stress during computer tasks with different levels of mental demand. METHODS: Fifteen young and thirteen elderly women performed two computer tasks [color word test and reference task] with different levels of mental demand but similar physical demands. The performance (clicking frequency, percentage of correct answers, and response time for correct answers) and electromyography from the forearm, shoulder, and neck muscles were recorded. Visual analogue scales were used to measure the participants' perception of the stress and difficulty related to the tasks. RESULTS: Performance decreased significantly in both groups during the color word test in comparison with performance on the reference task. However, the performance reduction was more pronounced in the elderly group than in the young group...

  10. Performing three-dimensional neutral particle transport calculations on tera scale computers

    International Nuclear Information System (INIS)

    Woodward, C.S.; Brown, P.N.; Chang, B.; Dorr, M.R.; Hanebutte, U.R.

    1999-01-01

    A scalable, parallel code system to perform neutral particle transport calculations in three dimensions is presented. To utilize the hyper-cluster architecture of emerging tera scale computers, the parallel code combines MPI message passing with additional parallel programming paradigms. The code's capabilities are demonstrated by a shielding calculation containing over 14 billion unknowns. This calculation was accomplished on the IBM SP "ASCI Blue-Pacific" computer located at Lawrence Livermore National Laboratory (LLNL)
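
    The communication pattern behind such a code, each rank owning a sub-domain and combining partial results through collective operations, can be sketched with mpi4py. The sketch assumes mpi4py is available and uses a stand-in local computation; it is not the transport solver described in the record.

        # Run with, e.g.:  mpiexec -n 4 python transport_sketch.py
        from mpi4py import MPI
        import numpy as np

        comm = MPI.COMM_WORLD
        rank = comm.Get_rank()
        size = comm.Get_size()

        # Each rank owns a slab of the domain and computes a partial quantity
        # (here just a stand-in for a local flux integral).
        local_cells = 1_000_000 // size
        local_flux = np.full(local_cells, 1.0e-6).sum()

        # The global result is obtained by a reduction across all ranks.
        total_flux = comm.allreduce(local_flux, op=MPI.SUM)
        if rank == 0:
            print("ranks:", size, "total flux:", total_flux)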

  11. Evaluating Electronic Customer Relationship Management Performance: Case Studies from Persian Automotive and Computer Industry

    OpenAIRE

    Safari, Narges; Safari, Fariba; Olesen, Karin; Shahmehr, Fatemeh

    2016-01-01

    This research paper investigates the influence of industry on electronic customer relationship management (e-CRM) performance. A case study approach with two cases was applied to evaluate the influence of e-CRM on customer behavioral and attitudinal loyalty along with the customer pyramid. The cases covered two industries, the computer and automotive industries. For investigating customer behavioral loyalty and the customer pyramid, computations were performed on the companies' databases, while for examining custome...

  12. Topic 14+16: High-performance and scientific applications and extreme-scale computing (Introduction)

    KAUST Repository

    Downes, Turlough P.

    2013-01-01

    As our understanding of the world around us increases it becomes more challenging to make use of what we already know, and to increase our understanding still further. Computational modeling and simulation have become critical tools in addressing this challenge. The requirements of high-resolution, accurate modeling have outstripped the ability of desktop computers and even small clusters to provide the necessary compute power. Many applications in the scientific and engineering domains now need very large amounts of compute time, while other applications, particularly in the life sciences, frequently have large data I/O requirements. There is thus a growing need for a range of high performance applications which can utilize parallel compute systems effectively, which have efficient data handling strategies and which have the capacity to utilise current and future systems. The High Performance and Scientific Applications topic aims to highlight recent progress in the use of advanced computing and algorithms to address the varied, complex and increasing challenges of modern research throughout both the "hard" and "soft" sciences. This necessitates being able to use large numbers of compute nodes, many of which are equipped with accelerators, and to deal with difficult I/O requirements. © 2013 Springer-Verlag.

  13. Thinking processes used by high-performing students in a computer programming task

    Directory of Open Access Journals (Sweden)

    Marietjie Havenga

    2011-07-01

    Computer programmers must be able to understand programming source code and write programs that execute complex tasks to solve real-world problems. This article is a transdisciplinary study at the intersection of computer programming, education and psychology. It outlines the role of mental processes in the process of programming and indicates how successful thinking processes can support computer science students in writing correct and well-defined programs. A mixed methods approach was used to better understand the thinking activities and programming processes of participating students. Data collection involved both computer programs and students’ reflective thinking processes recorded in their journals. This enabled analysis of psychological dimensions of participants’ thinking processes and their problem-solving activities as they considered a programming problem. Findings indicate that the cognitive, reflective and psychological processes used by high-performing programmers contributed to their success in solving a complex programming problem. Based on the thinking processes of high performers, we propose a model of integrated thinking processes, which can support computer programming students. Keywords: Computer programming, education, mixed methods research, thinking processes.  Disciplines: Computer programming, education, psychology

  14. An object-oriented programming paradigm for parallelization of computational fluid dynamics

    International Nuclear Information System (INIS)

    Ohta, Takashi.

    1997-03-01

    We propose an object-oriented programming paradigm for the parallelization of scientific computing programs, and show that the approach can be a very useful strategy. Generally, parallelization of scientific programs tends to be complicated and unportable due to the specific requirements of each parallel computer or compiler. In this paper, we show that an object-oriented design, which separates the parallel processing parts from the solver of the application, can achieve a large improvement in the maintainability of the codes, as well as high portability. We design the program for the two-dimensional Euler equations according to the paradigm, and evaluate the parallel performance on IBM SP2. (author)
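
    The design point, keeping all communication behind a small exchange interface so the numerical kernel never sees it, can be sketched with two interchangeable exchanger classes. The class and method names below are invented for illustration and are not the ones used in the report.

        class SerialExchanger:
            """Boundary exchange for a single-process run: nothing to do."""
            def exchange_halo(self, field):
                return field

        class MPIExchanger:
            """Same interface, but halo strips would be swapped with the
            neighbouring ranks (the communication itself is omitted here)."""
            def __init__(self, comm):
                self.comm = comm
            def exchange_halo(self, field):
                # send/recv of boundary strips would go here
                return field

        class EulerSolver:
            """Solver skeleton: all parallel concerns live behind the exchanger,
            so the numerical code is identical in serial and parallel runs."""
            def __init__(self, field, exchanger):
                self.field = field
                self.exchanger = exchanger
            def step(self):
                self.field = self.exchanger.exchange_halo(self.field)
                # ... flux evaluation and time update on the local block ...
                return self.field

        solver = EulerSolver(field=[0.0] * 16, exchanger=SerialExchanger())
        solver.step()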

  15. Heads in the Cloud: A Primer on Neuroimaging Applications of High Performance Computing

    Directory of Open Access Journals (Sweden)

    Anwar S. Shatil

    2015-01-01

    With larger data sets and more sophisticated analyses, it is becoming increasingly common for neuroimaging researchers to push (or exceed) the limitations of standalone computer workstations. Nonetheless, although high-performance computing platforms such as clusters, grids and clouds are already in routine use by a small handful of neuroimaging researchers to increase their storage and/or computational power, the adoption of such resources by the broader neuroimaging community remains relatively uncommon. Therefore, the goal of the current manuscript is to: 1) inform prospective users about the similarities and differences between computing clusters, grids and clouds; 2) highlight their main advantages; 3) discuss when it may (and may not) be advisable to use them; 4) review some of their potential problems and barriers to access; and finally 5) give a few practical suggestions for how interested new users can start analyzing their neuroimaging data using cloud resources. Although the aim of cloud computing is to hide most of the complexity of the infrastructure management from end-users, we recognize that this can still be an intimidating area for cognitive neuroscientists, psychologists, neurologists, radiologists, and other neuroimaging researchers lacking a strong computational background. Therefore, with this in mind, we have aimed to provide a basic introduction to cloud computing in general (including some of the basic terminology, computer architectures, infrastructure and service models, etc.), a practical overview of the benefits and drawbacks, and a specific focus on how cloud resources can be used for various neuroimaging applications.

  16. Heads in the Cloud: A Primer on Neuroimaging Applications of High Performance Computing.

    Science.gov (United States)

    Shatil, Anwar S; Younas, Sohail; Pourreza, Hossein; Figley, Chase R

    2015-01-01

    With larger data sets and more sophisticated analyses, it is becoming increasingly common for neuroimaging researchers to push (or exceed) the limitations of standalone computer workstations. Nonetheless, although high-performance computing platforms such as clusters, grids and clouds are already in routine use by a small handful of neuroimaging researchers to increase their storage and/or computational power, the adoption of such resources by the broader neuroimaging community remains relatively uncommon. Therefore, the goal of the current manuscript is to: 1) inform prospective users about the similarities and differences between computing clusters, grids and clouds; 2) highlight their main advantages; 3) discuss when it may (and may not) be advisable to use them; 4) review some of their potential problems and barriers to access; and finally 5) give a few practical suggestions for how interested new users can start analyzing their neuroimaging data using cloud resources. Although the aim of cloud computing is to hide most of the complexity of the infrastructure management from end-users, we recognize that this can still be an intimidating area for cognitive neuroscientists, psychologists, neurologists, radiologists, and other neuroimaging researchers lacking a strong computational background. Therefore, with this in mind, we have aimed to provide a basic introduction to cloud computing in general (including some of the basic terminology, computer architectures, infrastructure and service models, etc.), a practical overview of the benefits and drawbacks, and a specific focus on how cloud resources can be used for various neuroimaging applications.

  17. Heads in the Cloud: A Primer on Neuroimaging Applications of High Performance Computing

    Science.gov (United States)

    Shatil, Anwar S.; Younas, Sohail; Pourreza, Hossein; Figley, Chase R.

    2015-01-01

    With larger data sets and more sophisticated analyses, it is becoming increasingly common for neuroimaging researchers to push (or exceed) the limitations of standalone computer workstations. Nonetheless, although high-performance computing platforms such as clusters, grids and clouds are already in routine use by a small handful of neuroimaging researchers to increase their storage and/or computational power, the adoption of such resources by the broader neuroimaging community remains relatively uncommon. Therefore, the goal of the current manuscript is to: 1) inform prospective users about the similarities and differences between computing clusters, grids and clouds; 2) highlight their main advantages; 3) discuss when it may (and may not) be advisable to use them; 4) review some of their potential problems and barriers to access; and finally 5) give a few practical suggestions for how interested new users can start analyzing their neuroimaging data using cloud resources. Although the aim of cloud computing is to hide most of the complexity of the infrastructure management from end-users, we recognize that this can still be an intimidating area for cognitive neuroscientists, psychologists, neurologists, radiologists, and other neuroimaging researchers lacking a strong computational background. Therefore, with this in mind, we have aimed to provide a basic introduction to cloud computing in general (including some of the basic terminology, computer architectures, infrastructure and service models, etc.), a practical overview of the benefits and drawbacks, and a specific focus on how cloud resources can be used for various neuroimaging applications. PMID:27279746

  18. COMPUTING

    CERN Multimedia

    Contributions from I. Fisk

    2012-01-01

    Introduction The start of the 2012 run has been busy for Computing. We have reconstructed, archived, and served a larger sample of new data than in 2011, and we are in the process of producing an even larger new sample of simulations at 8 TeV. The running conditions and system performance are largely what was anticipated in the plan, thanks to the hard work and preparation of many people. Heavy ions Heavy Ions has been actively analysing data and preparing for conferences. Operations Office (Figure 6: transfers from all sites in the last 90 days) For ICHEP and the Upgrade efforts, we needed to produce and process record amounts of MC samples while supporting the very successful data-taking. This was a large burden, especially on the team members. Nevertheless the last three months were very successful and the total output was phenomenal, thanks to our dedicated site admins who keep the sites operational and the computing project members who spend countless hours nursing the...

  19. 8th International Workshop on Parallel Tools for High Performance Computing

    CERN Document Server

    Gracia, José; Knüpfer, Andreas; Resch, Michael; Nagel, Wolfgang

    2015-01-01

    Numerical simulation and modelling using High Performance Computing have evolved into an established technique in academic and industrial research. At the same time, the High Performance Computing infrastructure is becoming ever more complex. For instance, most of the current top systems around the world use thousands of nodes in which classical CPUs are combined with accelerator cards in order to enhance their compute power and energy efficiency. This complexity can only be mastered with adequate development and optimization tools. Key topics addressed by these tools include parallelization on heterogeneous systems, performance optimization for CPUs and accelerators, debugging of increasingly complex scientific applications, and optimization of energy usage in the spirit of green IT. This book represents the proceedings of the 8th International Parallel Tools Workshop, held October 1-2, 2014 in Stuttgart, Germany, which is a forum to discuss the latest advancements in parallel tools.

  20. COMPUTING

    CERN Multimedia

    I. Fisk

    2011-01-01

    Introduction The Computing Team successfully completed the storage, initial processing, and distribution for analysis of proton-proton data in 2011. There are still a variety of activities ongoing to support winter conference activities and preparations for 2012. Heavy ions The heavy-ion run for 2011 started in early November and has already demonstrated good machine performance and success of some of the more advanced workflows planned for 2011. Data collection will continue until early December. Facilities and Infrastructure Operations Operational and deployment support for WMAgent and WorkQueue+Request Manager components, routinely used in production by Data Operations, are provided. The GlideInWMS and components installation are now deployed at CERN, which is added to the GlideInWMS factory placed in the US. There has been new operational collaboration between the CERN team and the UCSD GlideIn factory operators, covering each other's time zones by monitoring/debugging pilot jobs sent from the facto...