WorldWideScience

Sample records for non-mpi computation time

  1. Parallel Computing Characteristics of CUPID code under MPI and Hybrid environment

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Jae Ryong; Yoon, Han Young [Korea Atomic Energy Research Institute, Daejeon (Korea, Republic of); Jeon, Byoung Jin; Choi, Hyoung Gwon [Seoul National Univ. of Science and Technology, Seoul (Korea, Republic of)

    2014-05-15

    In this paper, the characteristics of a parallel algorithm for solving an elliptic-type equation of CUPID via a domain decomposition method using MPI are presented, and the parallel performance is estimated in terms of scalability, i.e., the speedup ratio. In addition, the time-consuming pattern of major subroutines is studied. Two different grid systems are taken into account: 40,000 meshes for the coarse system and 320,000 meshes for the fine system. Since the matrix of the CUPID code differs according to whether the flow is single-phase or two-phase, the effect of the matrix shape is evaluated, as is the effect of the preconditioner for the matrix solver. Finally, the hybrid (OpenMP+MPI) parallel algorithm for the pressure solver is introduced and discussed in detail. The component-scale thermal-hydraulics code CUPID has been developed for two-phase flow analysis; it adopts a three-dimensional, transient, three-field model and has been parallelized to fulfill a recent demand for long-transient and highly resolved multi-phase flow behavior. In this study, the parallel performance of the CUPID code was investigated in terms of scalability. The CUPID code was parallelized with a domain decomposition method, and the MPI library was adopted to communicate information between neighboring domains. For managing the sparse matrix effectively, the CSR storage format is used. To account for the characteristics of the pressure matrix, which becomes asymmetric for two-phase flow, both single-phase and two-phase calculations were run. In addition, the effect of matrix size and preconditioning was also investigated. The fine-mesh calculation shows better scalability than the coarse-mesh one because the coarse mesh does not contain enough cells to warrant decomposing the computational domain into many subdomains. The fine mesh can exhibit good scalability when the geometry is divided with the ratio between computation and communication time taken into account. For a given mesh, single-phase flow
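
    The abstract describes a domain decomposition in which MPI communicates information between neighboring subdomains. A minimal sketch of that halo-exchange pattern in C is shown below; the 1-D layout, array sizes, and tags are invented for illustration and are not CUPID code.

      /* Minimal 1-D domain decomposition halo exchange (illustrative;
         not CUPID code). Each rank owns n cells plus two ghost cells. */
      #include <mpi.h>
      #include <stdlib.h>

      int main(int argc, char **argv)
      {
          MPI_Init(&argc, &argv);
          int rank, size;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          const int n = 1000;                          /* local cells */
          double *u = calloc(n + 2, sizeof(double));   /* +2 ghosts   */
          int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
          int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

          /* exchange ghost cells with the neighboring subdomains */
          MPI_Sendrecv(&u[1], 1, MPI_DOUBLE, left, 0,
                       &u[n + 1], 1, MPI_DOUBLE, right, 0,
                       MPI_COMM_WORLD, MPI_STATUS_IGNORE);
          MPI_Sendrecv(&u[n], 1, MPI_DOUBLE, right, 1,
                       &u[0], 1, MPI_DOUBLE, left, 1,
                       MPI_COMM_WORLD, MPI_STATUS_IGNORE);

          free(u);
          MPI_Finalize();
          return 0;
      }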

  2. Computational mathematics models, methods, and analysis with Matlab and MPI

    CERN Document Server

    White, Robert E

    2004-01-01

    Computational Mathematics: Models, Methods, and Analysis with MATLAB and MPI explores and illustrates this process. Each section of the first six chapters is motivated by a specific application. The author applies a model, selects a numerical method, implements computer simulations, and assesses the ensuing results. These chapters include an abundance of MATLAB code. By studying the code instead of using it as a "black box," you take the first step toward more sophisticated numerical modeling. The last four chapters focus on multiprocessing algorithms implemented using the message passing interface (MPI). These chapters include Fortran 9x codes that illustrate the basic MPI subroutines and revisit the applications of the previous chapters from a parallel implementation perspective. All of the codes are available for download from www4.ncsu.edu/~white. This book is not just about math, not just about computing, and not just about applications, but about all three--in other words, computational science. Whether us...

  3. Overlapping Communication and Computation with OpenMP and MPI

    Directory of Open Access Journals (Sweden)

    Timothy H. Kaiser

    2001-01-01

    Machines comprising a distributed collection of shared memory or SMP nodes are becoming common for parallel computing. OpenMP can be combined with MPI on many such machines, and the motivations for combining them are discussed. While OpenMP is typically used for exploiting loop-level parallelism, it can also be used to enable coarse-grain parallelism, potentially leading to less overhead. We show how coarse-grain OpenMP parallelism can also be used to facilitate overlapping MPI communication and computation for stencil-based grid programs, such as a program performing Gauss-Seidel iteration with red-black ordering. Spatial subdivision, or domain decomposition, is used to assign a portion of the grid to each thread. One thread is assigned a null calculation region so that it is free to perform communication. Example calculations were run on an IBM SP using both the Kuck & Associates and IBM compilers.
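
    A minimal sketch of the pattern described here, with one thread assigned to communication while the others compute, is given below in C; the grid update, partitioning, and halo variables are illustrative, not the paper's code. MPI must be initialized with at least MPI_THREAD_FUNNELED for this to be legal.

      #include <mpi.h>
      #include <omp.h>

      #define N 4096
      static double grid[N], newgrid[N], halo_out, halo_in;

      /* One timestep: thread 0 drives MPI while the rest compute.
         Assumes two or more OpenMP threads. */
      void timestep(int left, int right)
      {
          #pragma omp parallel
          {
              int tid = omp_get_thread_num();
              int nthreads = omp_get_num_threads();
              if (tid == 0) {
                  /* the "null calculation" thread exchanges halos */
                  MPI_Sendrecv(&halo_out, 1, MPI_DOUBLE, right, 0,
                               &halo_in, 1, MPI_DOUBLE, left, 0,
                               MPI_COMM_WORLD, MPI_STATUS_IGNORE);
              } else {
                  /* workers update interior points via a manual
                     partition (a worksharing "omp for" cannot be
                     used, since thread 0 never reaches it) */
                  int nworkers = nthreads - 1;
                  int chunk = (N - 2 + nworkers - 1) / nworkers;
                  int lo = 1 + (tid - 1) * chunk;
                  int hi = lo + chunk > N - 1 ? N - 1 : lo + chunk;
                  for (int i = lo; i < hi; i++)
                      newgrid[i] = 0.5 * (grid[i - 1] + grid[i + 1]);
              }
          }   /* implicit barrier: halo and interior both complete */
      }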

  4. MPI_XSTAR: MPI-based Parallelization of the XSTAR Photoionization Program

    Science.gov (United States)

    Danehkar, Ashkbiz; Nowak, Michael A.; Lee, Julia C.; Smith, Randall K.

    2018-02-01

    We describe a program for the parallel implementation of multiple runs of XSTAR, a photoionization code that is used to predict the physical properties of an ionized gas from its emission and/or absorption lines. The parallelization program, called MPI_XSTAR, has been developed and implemented in the C++ language by using the Message Passing Interface (MPI) protocol, a conventional standard of parallel computing. We have benchmarked parallel multiprocessing executions of XSTAR, using MPI_XSTAR, against a serial execution of XSTAR, in terms of the parallelization speedup and the computing resource efficiency. Our experience indicates that the parallel execution runs significantly faster than the serial execution; however, the efficiency in terms of computing resource usage decreases as the number of processors used in the parallel computation increases.
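
    For reference, the two quantities benchmarked here have standard definitions. Writing T(1) for the serial run time and T(p) for the run time on p processors (our notation, not the paper's):

      S(p) = T(1) / T(p),        E(p) = S(p) / p = T(1) / (p T(p))

    Amdahl's law, S(p) <= 1 / (s + (1 - s)/p) for a serial fraction s, implies that E(p) decreases as p grows whenever s > 0, which is exactly the efficiency decline reported above.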

  5. Hybrid MPI/OpenMP parallelization of the explicit Volterra integral equation solver for multi-core computer architectures

    KAUST Repository

    Al Jarro, Ahmed

    2011-08-01

    A hybrid MPI/OpenMP scheme for efficiently parallelizing the explicit marching-on-in-time (MOT)-based solution of the time-domain volume (Volterra) integral equation (TD-VIE) is presented. The proposed scheme equally distributes tested field values and operations pertinent to the computation of tested fields among the nodes using the MPI standard, while the source field values are stored on all nodes. Within each node, the OpenMP standard is used to further accelerate the computation of the tested fields. Numerical results demonstrate that the proposed parallelization scheme scales well for problems involving three million or more spatial discretization elements. © 2011 IEEE.

  6. SBML-PET-MPI: a parallel parameter estimation tool for Systems Biology Markup Language based models.

    Science.gov (United States)

    Zi, Zhike

    2011-04-01

    Parameter estimation is crucial for the modeling and dynamic analysis of biological systems. However, implementing parameter estimation is time consuming and computationally demanding. Here, we introduced a parallel parameter estimation tool for Systems Biology Markup Language (SBML)-based models (SBML-PET-MPI). SBML-PET-MPI allows the user to perform parameter estimation and parameter uncertainty analysis by collectively fitting multiple experimental datasets. The tool is developed and parallelized using the message passing interface (MPI) protocol, which provides good scalability with the number of processors. SBML-PET-MPI is freely available for non-commercial use at http://www.bioss.uni-freiburg.de/cms/sbml-pet-mpi.html or http://sites.google.com/site/sbmlpetmpi/.

  7. Study on High Performance of MPI-Based Parallel FDTD from WorkStation to Super Computer Platform

    Directory of Open Access Journals (Sweden)

    Z. L. He

    2012-01-01

    The parallel FDTD method is applied to analyze the electromagnetic problems of electrically large targets on a supercomputer. It is well known that computing time decreases as the number of processors increases. Nevertheless, with the same number of processors, computing efficiency is affected by the scheme of the MPI virtual topology. Accordingly, the influence of different virtual topology schemes on the parallel performance of parallel FDTD is studied in detail. General rules are presented on how to obtain the highest efficiency of the parallel FDTD algorithm by optimizing the MPI virtual topology. To show the validity of the presented method, several numerical results are given later in the paper. Various comparisons are made and some useful conclusions are summarized.
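
    For context, an MPI virtual topology of the kind being tuned here is created with calls like the following; the 3-D decomposition below is a generic sketch, not the paper's optimized scheme.

      /* Sketch: create a 3-D Cartesian virtual topology and find
         the neighbors used for halo exchange. */
      #include <mpi.h>

      int main(int argc, char **argv)
      {
          MPI_Init(&argc, &argv);
          int size;
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          int dims[3] = {0, 0, 0};         /* let MPI factor the count */
          MPI_Dims_create(size, 3, dims);  /* e.g. 64 -> 4 x 4 x 4 */

          int periods[3] = {0, 0, 0};      /* non-periodic boundaries */
          MPI_Comm cart;
          MPI_Cart_create(MPI_COMM_WORLD, 3, dims, periods, 1, &cart);

          int left, right;                 /* neighbors along x */
          MPI_Cart_shift(cart, 0, 1, &left, &right);

          MPI_Comm_free(&cart);
          MPI_Finalize();
          return 0;
      }

    Which assignment of dims performs best depends on the ratio of computation to communication, which is the question the paper studies.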

  8. DISP: Optimizations towards Scalable MPI Startup

    Energy Technology Data Exchange (ETDEWEB)

    Fu, Huansong [Florida State University, Tallahassee; Pophale, Swaroop S [ORNL; Gorentla Venkata, Manjunath [ORNL; Yu, Weikuan [Florida State University, Tallahassee

    2016-01-01

    Despite the popularity of MPI for high performance computing, the startup of MPI programs faces a scalability challenge as both the execution time and memory consumption increase drastically at scale. We have examined this problem using the collective modules of Cheetah and Tuned in Open MPI as representative implementations. Previous improvements for collectives have focused on algorithmic advances and hardware off-load. In this paper, we examine the startup cost of the collective module within a communicator and explore various techniques to improve its efficiency and scalability. Accordingly, we have developed a new scalable startup scheme with three internal techniques, namely Delayed Initialization, Module Sharing and Prediction-based Topology Setup (DISP). Our DISP scheme greatly benefits the collective initialization of the Cheetah module. At the same time, it helps boost the performance of non-collective initialization in the Tuned module. We evaluate the performance of our implementation on the Titan supercomputer at ORNL with up to 4096 processes. The results show that our delayed initialization can speed up the startup of Tuned and Cheetah by an average of 32.0% and 29.2%, respectively, our module sharing can reduce the memory consumption of Tuned and Cheetah by up to 24.1% and 83.5%, respectively, and our prediction-based topology setup can speed up the startup of Cheetah by up to 80%.

  9. Power-aware load balancing of large scale MPI applications

    OpenAIRE

    Etinski, Maja; Corbalán González, Julita; Labarta Mancho, Jesús José; Valero Cortés, Mateo; Veidenbaum, Alex

    2009-01-01

    Power consumption is a very important issue for the HPC community, both at the level of a single application and at the level of the whole workload. Load imbalance of an MPI application can be exploited to save CPU energy without penalizing the execution time. An application is load imbalanced when some nodes are assigned more computation than others. The nodes with less computation can be run at lower frequency since otherwise they have to wait for the nodes with more computation blocked in MPI calls. A te...

  10. OpenMPI and ExxonMobil Topics

    Energy Technology Data Exchange (ETDEWEB)

    Hjelm, Nathan Thomas [Los Alamos National Lab. (LANL), Los Alamos, NM (United States); Pritchard, Howard Porter [Los Alamos National Lab. (LANL), Los Alamos, NM (United States)

    2016-09-09

    These are a series of slides from a presentation for ExxonMobil's visit to Los Alamos National Laboratory. Topics covered are: Open MPI - The Release Story; MPI-3 RMA in Open MPI; MPI dynamic process management and Open MPI; and new options with CLE 6. Open MPI RMA features include: full support for the MPI-3.1 specification since v2.0.0; support for non-contiguous datatypes; support for direct use of the RDMA capabilities of high-performance networks (Cray Gemini/Aries, InfiniBand); and, starting in v2.1.0, support for using network atomic operations for MPI_Fetch_and_op and MPI_Compare_and_swap, tested with MPI_THREAD_MULTIPLE.
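
    As a small illustration of the RMA feature named above, the following hedged sketch builds an atomic shared counter on MPI_Fetch_and_op; the window layout and ticket semantics are our own example.

      /* Each rank atomically fetches and increments a counter
         hosted on rank 0 (MPI-3 RMA). */
      #include <mpi.h>
      #include <stdio.h>

      int main(int argc, char **argv)
      {
          MPI_Init(&argc, &argv);
          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          long counter = 0;      /* rank 0's copy is the shared one */
          MPI_Win win;
          MPI_Win_create(&counter, sizeof(long), sizeof(long),
                         MPI_INFO_NULL, MPI_COMM_WORLD, &win);

          long one = 1, previous;
          MPI_Win_lock(MPI_LOCK_SHARED, 0, 0, win);
          MPI_Fetch_and_op(&one, &previous, MPI_LONG, 0, 0, MPI_SUM, win);
          MPI_Win_unlock(0, win);
          printf("rank %d drew ticket %ld\n", rank, previous);

          MPI_Win_free(&win);
          MPI_Finalize();
          return 0;
      }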

  11. Processing MPI Datatypes Outside MPI

    Science.gov (United States)

    Ross, Robert; Latham, Robert; Gropp, William; Lusk, Ewing; Thakur, Rajeev

    The MPI datatype functionality provides a powerful tool for describing structured memory and file regions in parallel applications, enabling noncontiguous data to be operated on by MPI communication and I/O routines. However, no facilities are provided by the MPI standard to allow users to efficiently manipulate MPI datatypes in their own codes.
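
    A minimal example of the facility in question is sketched below: an MPI_Type_vector describing a noncontiguous matrix column, so it can be communicated without manual packing (the 8x8 layout is illustrative; run with two ranks).

      #include <mpi.h>

      int main(int argc, char **argv)
      {
          MPI_Init(&argc, &argv);
          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          double matrix[8][8];
          for (int i = 0; i < 8; i++)
              for (int j = 0; j < 8; j++)
                  matrix[i][j] = 8 * i + j;

          /* one column = 8 blocks of 1 double, stride of 8 doubles */
          MPI_Datatype column;
          MPI_Type_vector(8, 1, 8, MPI_DOUBLE, &column);
          MPI_Type_commit(&column);

          if (rank == 0)        /* ship column 3 without packing */
              MPI_Send(&matrix[0][3], 1, column, 1, 0, MPI_COMM_WORLD);
          else if (rank == 1)
              MPI_Recv(&matrix[0][3], 1, column, 0, 0, MPI_COMM_WORLD,
                       MPI_STATUS_IGNORE);

          MPI_Type_free(&column);
          MPI_Finalize();
          return 0;
      }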

  12. GASFLOW-MPI. A scalable computational fluid dynamics code for gases, aerosols and combustion. Vol. 1. Theory and computational model (Revision 1.0)

    Energy Technology Data Exchange (ETDEWEB)

    Xiao, Jianjun; Travis, Jack; Royl, Peter; Necker, Gottfried; Svishchev, Anatoly; Jordan, Thomas

    2016-07-01

    Karlsruhe Institute of Technology (KIT) is developing the parallel computational fluid dynamics code GASFLOW-MPI as a best-estimate tool for predicting transport, mixing, and combustion of hydrogen and other gases in nuclear reactor containments and other facility buildings. GASFLOW-MPI is a finite-volume code based on proven computational fluid dynamics methodology that solves the compressible Navier-Stokes equations for three-dimensional volumes in Cartesian or cylindrical coordinates.

  13. Exploiting Efficient Transpacking for One-Sided Communication and MPI-IO

    Science.gov (United States)

    Mir, Faisal Ghias; Träff, Jesper Larsson

    Based on a construction of so-called input-output datatypes that define a mapping between non-consecutive input and output buffers, we outline an efficient method for copying structured data. We term this operation transpacking, and show how transpacking can be applied in the MPI implementation of one-sided communication and MPI-IO. For one-sided communication via shared memory, we demonstrate the expected performance improvements of up to a factor of two. For individual MPI-IO, the time to read or write from file dominates the overall time, but even here efficient transpacking can in some scenarios reduce file I/O time considerably. The reported results were achieved on a single NEC SX-8 vector node.

  14. Benchmarking MILC code with OpenMP and MPI

    International Nuclear Information System (INIS)

    Gottlieb, Steven; Tamhankar, Sonali

    2001-01-01

    A trend in high performance computers that is becoming increasingly popular is the use of symmetric multi-processing (SMP) rather than the older paradigm of MPP. MPI codes that ran and scaled well on MPP machines can often be run on an SMP machine using the vendor's version of MPI. However, this approach may not make optimal use of the (expensive) SMP hardware. More significantly, there are machines like Blue Horizon, an IBM SP with 8-way SMP nodes at the San Diego Supercomputer Center, that can only support 4 MPI processes per node (with the current switch). On such a machine it is imperative to be able to use OpenMP parallelism on the node, and MPI between nodes. We describe the challenges of converting MILC MPI code to use a second level of OpenMP parallelism, and present benchmarks on IBM and Sun computers.

  15. Magnetic Particle / Magnetic Resonance Imaging: In-Vitro MPI-Guided Real Time Catheter Tracking and 4D Angioplasty Using a Road Map and Blood Pool Tracer Approach.

    Science.gov (United States)

    Salamon, Johannes; Hofmann, Martin; Jung, Caroline; Kaul, Michael Gerhard; Werner, Franziska; Them, Kolja; Reimer, Rudolph; Nielsen, Peter; Vom Scheidt, Annika; Adam, Gerhard; Knopp, Tobias; Ittrich, Harald

    2016-01-01

    In-vitro evaluation of the feasibility of 4D real time tracking of endovascular devices and stenosis treatment with a magnetic particle imaging (MPI) / magnetic resonance imaging (MRI) road map approach and an MPI-guided approach using a blood pool tracer. A guide wire and angioplasty-catheter were labeled with a thin layer of magnetic lacquer. For real time MPI a custom-made software framework was developed. A stenotic vessel phantom filled with saline or superparamagnetic iron oxide nanoparticles (MM4) was equipped with bimodal fiducial markers for co-registration in preclinical 7T MRI and MPI. In-vitro angioplasty was performed inflating the balloon with saline or MM4. MPI data were acquired using a field of view of 37.3×37.3×18.6 mm3 and a frame rate of 46 volumes/sec. Analysis of the magnetic lacquer-marks on the devices was performed with electron microscopy, atomic absorption spectrometry and micro-computed tomography. Magnetic marks allowed for MPI/MRI guidance of interventional devices. Bimodal fiducial markers enable MPI/MRI image fusion for MRI based roadmapping. MRI roadmapping and the blood pool tracer approach facilitate MPI real time monitoring of in-vitro angioplasty. Successful angioplasty was verified with MPI and MRI. Magnetic marks consist of micrometer sized ferromagnetic plates mainly composed of iron and iron oxide. 4D real time MP imaging, tracking and guiding of endovascular instruments and in-vitro angioplasty is feasible. In addition to an approach that requires a blood pool tracer, MRI based roadmapping might emerge as a promising tool for radiation free 4D MPI-guided interventions.

  16. Magnetic Particle / Magnetic Resonance Imaging: In-Vitro MPI-Guided Real Time Catheter Tracking and 4D Angioplasty Using a Road Map and Blood Pool Tracer Approach.

    Directory of Open Access Journals (Sweden)

    Johannes Salamon

    In-vitro evaluation of the feasibility of 4D real time tracking of endovascular devices and stenosis treatment with a magnetic particle imaging (MPI) / magnetic resonance imaging (MRI) road map approach and an MPI-guided approach using a blood pool tracer. A guide wire and angioplasty-catheter were labeled with a thin layer of magnetic lacquer. For real time MPI a custom-made software framework was developed. A stenotic vessel phantom filled with saline or superparamagnetic iron oxide nanoparticles (MM4) was equipped with bimodal fiducial markers for co-registration in preclinical 7T MRI and MPI. In-vitro angioplasty was performed inflating the balloon with saline or MM4. MPI data were acquired using a field of view of 37.3×37.3×18.6 mm3 and a frame rate of 46 volumes/sec. Analysis of the magnetic lacquer-marks on the devices was performed with electron microscopy, atomic absorption spectrometry and micro-computed tomography. Magnetic marks allowed for MPI/MRI guidance of interventional devices. Bimodal fiducial markers enable MPI/MRI image fusion for MRI based roadmapping. MRI roadmapping and the blood pool tracer approach facilitate MPI real time monitoring of in-vitro angioplasty. Successful angioplasty was verified with MPI and MRI. Magnetic marks consist of micrometer sized ferromagnetic plates mainly composed of iron and iron oxide. 4D real time MP imaging, tracking and guiding of endovascular instruments and in-vitro angioplasty is feasible. In addition to an approach that requires a blood pool tracer, MRI based roadmapping might emerge as a promising tool for radiation free 4D MPI-guided interventions.

  17. pupyMPI - MPI implemented in pure Python

    DEFF Research Database (Denmark)

    Bromer, Rune; Hantho, Frederik; Vinter, Brian

    2011-01-01

    As distributed memory systems have become common, the de facto standard for communication is still the Message Passing Interface (MPI). pupyMPI is a pure Python implementation of a broad subset of the MPI 1.3 specification that allows Python programmers to utilize multiple CPUs with datatypes...

  18. SKaMPI: A Comprehensive Benchmark for Public Benchmarking of MPI

    Directory of Open Access Journals (Sweden)

    Ralf Reussner

    2002-01-01

    The main objective of the MPI communication library is to enable portable parallel programming with high performance within the message-passing paradigm. Since the MPI standard has no associated performance model, and makes no performance guarantees, comprehensive, detailed and accurate performance figures for different hardware platforms and MPI implementations are important for the application programmer, both for understanding and possibly improving the behavior of a given program on a given platform, as well as for assuring a degree of predictable behavior when switching to another hardware platform and/or MPI implementation. We term this latter goal performance portability, and address the problem of attaining performance portability by benchmarking. We describe the SKaMPI benchmark which covers a large fraction of MPI, and incorporates well-accepted mechanisms for ensuring accuracy and reliability. SKaMPI is distinguished among other MPI benchmarks by an effort to maintain a public performance database with performance data from different hardware platforms and MPI implementations.
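
    To make concrete what such a benchmark measures, a bare-bones ping-pong timing loop is sketched below; SKaMPI itself adds the statistical controls for accuracy and reliability that this sketch omits (message size and repetition count are arbitrary).

      #include <mpi.h>
      #include <stdio.h>
      #include <string.h>

      int main(int argc, char **argv)
      {
          MPI_Init(&argc, &argv);
          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          enum { REPS = 1000, BYTES = 1024 };
          char buf[BYTES];
          memset(buf, 0, BYTES);

          MPI_Barrier(MPI_COMM_WORLD);
          double t0 = MPI_Wtime();
          for (int i = 0; i < REPS; i++) {
              if (rank == 0) {
                  MPI_Send(buf, BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                  MPI_Recv(buf, BYTES, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                           MPI_STATUS_IGNORE);
              } else if (rank == 1) {
                  MPI_Recv(buf, BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                           MPI_STATUS_IGNORE);
                  MPI_Send(buf, BYTES, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
              }
          }
          double t1 = MPI_Wtime();
          if (rank == 0)
              printf("avg round trip: %.2f us\n",
                     (t1 - t0) / REPS * 1e6);
          MPI_Finalize();
          return 0;
      }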

  19. Scalable High Performance Message Passing over InfiniBand for Open MPI

    Energy Technology Data Exchange (ETDEWEB)

    Friedley, A; Hoefler, T; Leininger, M L; Lumsdaine, A

    2007-10-24

    InfiniBand (IB) is a popular network technology for modern high-performance computing systems. MPI implementations traditionally support IB using a reliable, connection-oriented (RC) transport. However, per-process resource usage that grows linearly with the number of processes makes this approach prohibitive for large-scale systems. IB provides an alternative in the form of a connectionless unreliable datagram transport (UD), which allows for near-constant resource usage and initialization overhead as the process count increases. This paper describes a UD-based implementation for IB in Open MPI as a scalable alternative to existing RC-based schemes. We use the software reliability capabilities of Open MPI to provide the guaranteed delivery semantics required by MPI. Results show that UD not only requires fewer resources at scale, but also allows for shorter MPI startup times. A connectionless model also improves performance for applications that tend to send small messages to many different processes.

  20. GASFLOW-MPI. A scalable computational fluid dynamics code for gases, aerosols and combustion. Vol. 2. Users' manual (Revision 1.0)

    Energy Technology Data Exchange (ETDEWEB)

    Xiao, Jianjun; Travis, Jack; Royl, Peter; Necker, Gottfried; Svishchev, Anatoly; Jordan, Thomas

    2016-07-01

    Karlsruhe Institute of Technology (KIT) is developing the parallel computational fluid dynamics code GASFLOW-MPI as a best-estimate tool for predicting transport, mixing, and combustion of hydrogen and other gases in nuclear reactor containments and other facility buildings. GASFLOW-MPI is a finite-volume code based on proven computational fluid dynamics methodology that solves the compressible Navier-Stokes equations for three-dimensional volumes in Cartesian or cylindrical coordinates.

  1. MSAProbs-MPI: parallel multiple sequence aligner for distributed-memory systems.

    Science.gov (United States)

    González-Domínguez, Jorge; Liu, Yongchao; Touriño, Juan; Schmidt, Bertil

    2016-12-15

    MSAProbs is a state-of-the-art protein multiple sequence alignment tool based on hidden Markov models. It can achieve high alignment accuracy at the expense of relatively long runtimes for large-scale input datasets. In this work we present MSAProbs-MPI, a distributed-memory parallel version of the multithreaded MSAProbs tool that is able to reduce runtimes by exploiting the compute capabilities of common multicore CPU clusters. Our performance evaluation on a cluster with 32 nodes (each containing two Intel Haswell processors) shows reductions in execution time of over one order of magnitude for typical input datasets. Furthermore, MSAProbs-MPI using eight nodes is faster than the GPU-accelerated QuickProbs running on a Tesla K20. Another strong point is that MSAProbs-MPI can deal with large datasets for which MSAProbs and QuickProbs might fail due to time and memory constraints, respectively. Source code in C++ and MPI running on Linux systems as well as a reference manual are available at http://msaprobs.sourceforge.net. Contact: jgonzalezd@udc.es. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  2. Compiled MPI: Cost-Effective Exascale Applications Development

    Energy Technology Data Exchange (ETDEWEB)

    Bronevetsky, G; Quinlan, D; Lumsdaine, A; Hoefler, T

    2012-04-10

    The complexity of petascale and exascale machines makes it increasingly difficult to develop applications that can take advantage of them. Future systems are expected to feature billion-way parallelism, complex heterogeneous compute nodes and poor availability of memory (Peter Kogge, 2008). This new challenge for application development is motivating a significant amount of research and development on new programming models and runtime systems designed to simplify large-scale application development. Unfortunately, DoE has a significant multi-decadal investment in a large family of mission-critical scientific applications. Scaling these applications to exascale machines will require a significant investment that will dwarf the costs of hardware procurement. A key reason for the difficulty in transitioning today's applications to exascale hardware is their reliance on explicit programming techniques, such as the Message Passing Interface (MPI) programming model to enable parallelism. MPI provides a portable and high performance message-passing system that enables scalable performance on a wide variety of platforms. However, it also forces developers to lock the details of parallelization together with application logic, making it very difficult to adapt the application to significant changes in the underlying system. Further, MPI's explicit interface makes it difficult to separate the application's synchronization and communication structure, reducing the amount of support that can be provided by compiler and run-time tools. This is in contrast to the recent research on more implicit parallel programming models such as Chapel, OpenMP and OpenCL, which promise to provide significantly more flexibility at the cost of reimplementing significant portions of the application. We are developing CoMPI, a novel compiler-driven approach to enable existing MPI applications to scale to exascale systems with minimal modifications that can be made incrementally over

  3. MPI Debugging with Handle Introspection

    DEFF Research Database (Denmark)

    Brock-Nannestad, Laust; DelSignore, John; Squyres, Jeffrey M.

    The Message Passing Interface, MPI, is the standard programming model for high performance computing clusters. However, debugging applications on large scale clusters is difficult. The widely used Message Queue Dumping interface enables inspection of message queue state but there is no general in...

  4. MPI_XSTAR: MPI-based parallelization of XSTAR program

    Science.gov (United States)

    Danehkar, A.

    2017-12-01

    MPI_XSTAR parallelizes execution of multiple XSTAR runs using Message Passing Interface (MPI). XSTAR (ascl:9910.008), part of the HEASARC's HEAsoft (ascl:1408.004) package, calculates the physical conditions and emission spectra of ionized gases. MPI_XSTAR invokes XSTINITABLE from HEASoft to generate a job list of XSTAR commands for given physical parameters. The job list is used to make directories in ascending order, where each individual XSTAR is spawned on each processor and outputs are saved. HEASoft's XSTAR2TABLE program is invoked upon the contents of each directory in order to produce table model FITS files for spectroscopy analysis tools.
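
    A hedged sketch of this embarrassingly parallel pattern is shown below: a list of independent commands distributed round-robin over MPI ranks. The echo commands stand in for the generated XSTAR job list; this is not MPI_XSTAR's actual code.

      #include <mpi.h>
      #include <stdio.h>
      #include <stdlib.h>

      int main(int argc, char **argv)
      {
          MPI_Init(&argc, &argv);
          int rank, size;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          const char *jobs[] = {   /* stand-ins for XSTAR commands */
              "echo run0", "echo run1", "echo run2", "echo run3",
          };
          int njobs = sizeof(jobs) / sizeof(jobs[0]);

          /* each rank executes every size-th job from the list */
          for (int j = rank; j < njobs; j += size)
              if (system(jobs[j]) != 0)
                  fprintf(stderr, "rank %d: job %d failed\n", rank, j);

          MPI_Barrier(MPI_COMM_WORLD);  /* wait for all runs */
          MPI_Finalize();
          return 0;
      }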

  5. MPI support in the DIRAC Pilot Job Workload Management System

    International Nuclear Information System (INIS)

    Tsaregorodtsev, A; Hamar, V

    2012-01-01

    Parallel job execution in the grid environment using MPI technology presents a number of challenges for the sites providing this support. Multiple flavors of the MPI libraries, shared working directories required by certain applications, and special settings for the batch systems make MPI support difficult for the site managers. On the other hand, workload management systems with Pilot Jobs have become ubiquitous, although support for MPI applications in the Pilot frameworks was not available. This support was recently added in the DIRAC Project in the context of the GISELA Latin American Grid Initiative. Special services for dynamic allocation of virtual computer pools on the grid sites were developed in order to deploy MPI rings corresponding to the requirements of the jobs in the central task queue of the DIRAC Workload Management System. Pilot Jobs using user space file system techniques install the required MPI software automatically. The same technique is used to emulate shared working directories for the parallel MPI processes. This makes it possible to execute MPI jobs even on sites not supporting them officially. Reusing MPI rings constructed in this way for the execution of a series of parallel jobs dramatically increases their efficiency and turnaround. In this contribution we describe the design and implementation of the DIRAC MPI Service as well as its support for various types of MPI libraries. Advantages of coupling the MPI support with the Pilot frameworks are outlined and examples of usage with real applications are presented.

  6. What does fault tolerant Deep Learning need from MPI?

    Energy Technology Data Exchange (ETDEWEB)

    Amatya, Vinay C.; Vishnu, Abhinav; Siegel, Charles M.; Daily, Jeffrey A.

    2017-09-25

    Deep Learning (DL) algorithms have become the de facto Machine Learning (ML) algorithm for large scale data analysis. DL algorithms are computationally expensive -- even distributed DL implementations which use MPI require days of training (model learning) time on commonly studied datasets. Long running DL applications become susceptible to faults -- requiring development of a fault tolerant system infrastructure, in addition to fault tolerant DL algorithms. This raises an important question: What is needed from MPI for designing fault tolerant DL implementations? In this paper, we address this problem for permanent faults. We motivate the need for a fault tolerant MPI specification by an in-depth consideration of recent innovations in DL algorithms and their properties, which drive the need for specific fault tolerance features. We present an in-depth discussion on the suitability of different parallelism types (model, data and hybrid); a need (or lack thereof) for check-pointing of any critical data structures; and most importantly, consideration for several fault tolerance proposals (user-level fault mitigation (ULFM), Reinit) in MPI and their applicability to fault tolerant DL implementations. We leverage a distributed memory implementation of Caffe, currently available under the Machine Learning Toolkit for Extreme Scale (MaTEx). We implement our approaches by extending MaTEx-Caffe to use the ULFM-based implementation. Our evaluation using the ImageNet dataset and the AlexNet neural network topology demonstrates the effectiveness of the proposed fault tolerant DL implementation using Open MPI-based ULFM.
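
    For orientation, the ULFM proposal mentioned above lets an application observe a peer failure and shrink the communicator to the survivors. The rough sketch below uses the MPIX_* names from the ULFM proposal as shipped in ULFM-enabled Open MPI builds; availability and exact semantics vary, and a full implementation would also revoke the communicator so every rank enters the shrink.

      #include <mpi.h>
      #include <mpi-ext.h>   /* ULFM extensions: MPIX_Comm_shrink */

      /* placeholder for one data-parallel training step */
      static int train_step(MPI_Comm comm)
      {
          double loss = 0.0, global;
          return MPI_Allreduce(&loss, &global, 1, MPI_DOUBLE,
                               MPI_SUM, comm);
      }

      int main(int argc, char **argv)
      {
          MPI_Init(&argc, &argv);
          MPI_Comm comm = MPI_COMM_WORLD;
          /* report failures as errors instead of aborting */
          MPI_Comm_set_errhandler(comm, MPI_ERRORS_RETURN);

          for (int epoch = 0; epoch < 10; epoch++) {
              if (train_step(comm) != MPI_SUCCESS) {
                  /* a peer died: continue with the survivors,
                     as data-parallel DL often can */
                  MPI_Comm survivors;
                  MPIX_Comm_shrink(comm, &survivors);
                  if (comm != MPI_COMM_WORLD) MPI_Comm_free(&comm);
                  comm = survivors;
              }
          }
          MPI_Finalize();
          return 0;
      }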

  7. An approach to computing discrete adjoints for MPI-parallelized models applied to Ice Sheet System Model 4.11

    Directory of Open Access Journals (Sweden)

    E. Larour

    2016-11-01

    Within the framework of sea-level rise projections, there is a strong need for hindcast validation of the evolution of polar ice sheets in a way that tightly matches observational records (from radar, gravity, and altimetry observations mainly). However, the computational requirements for making hindcast reconstructions possible are severe and rely mainly on the evaluation of the adjoint state of transient ice-flow models. Here, we look at the computation of adjoints in the context of the NASA/JPL/UCI Ice Sheet System Model (ISSM), written in C++ and designed for parallel execution with MPI. We present the adaptations required in the way the software is designed and written, but also generic adaptations in the tools facilitating the adjoint computations. We concentrate on the use of operator overloading coupled with the AdjoinableMPI library to achieve the adjoint computation of the ISSM. We present a comprehensive approach to (1) carry out type changing through the ISSM, hence facilitating operator overloading, (2) bind to external solvers such as MUMPS and GSL-LU, and (3) handle MPI-based parallelism to scale the capability. We demonstrate the success of the approach by computing sensitivities of hindcast metrics such as the misfit to observed records of surface altimetry on the northeastern Greenland Ice Stream, or the misfit to observed records of surface velocities on Upernavik Glacier, central West Greenland. We also provide metrics for the scalability of the approach, and the expected performance. This approach has the potential to enable a new generation of hindcast-validated projections that make full use of the wealth of datasets currently being collected, or already collected, in Greenland and Antarctica.

  8. Cell verification of parallel burnup calculation program MCBMPI based on MPI

    International Nuclear Information System (INIS)

    Yang Wankui; Liu Yaoguang; Ma Jimin; Wang Guanbo; Yang Xin; She Ding

    2014-01-01

    The parallel burnup calculation program MCBMPI was developed and modularized. The parallel MCNP5 program MCNP5MPI was employed as the neutron transport calculation module, and a composite of three solution methods was used to solve the burnup equation: the matrix exponential technique, the TTA analytical solution, and Gauss-Seidel iteration. An MPI parallel zone decomposition strategy was incorporated in the program. The program system consists only of MCNP5MPI and the burnup subroutine; the latter achieves three main functions, i.e. zone decomposition, nuclide transfer and decay, and data exchange with MCNP5MPI. The program was verified with the pressurized water reactor (PWR) cell burnup benchmark. The results show that the program can be applied to burnup calculation of multiple zones and that the computation efficiency could be significantly improved with the development of computer hardware. (authors)

  9. Exposing MPI Objects for Debugging

    DEFF Research Database (Denmark)

    Brock-Nannestad, Laust; DelSignore, John; Squyres, Jeffrey M.

    Developers rely on debuggers to inspect application state. In applications that use MPI, the Message Passing Interface, the MPI runtime contains an important part of this state. The MPI Tools Working Group has proposed an interface for MPI Handle Introspection. It allows debuggers and MPI impleme...

  10. Fault Tolerance Assistant (FTA): An Exception Handling Programming Model for MPI Applications

    Energy Technology Data Exchange (ETDEWEB)

    Fang, Aiman [Univ. of Chicago, IL (United States). Dept. of Computer Science; Laguna, Ignacio [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Sato, Kento [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Islam, Tanzima [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Mohror, Kathryn [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2016-05-23

    Future high-performance computing systems may face frequent failures with their rapid increase in scale and complexity. Resilience to faults has become a major challenge for large-scale applications running on supercomputers, which demands fault tolerance support for prevalent MPI applications. Among failure scenarios, process failures are one of the most severe issues as they usually lead to termination of applications. However, the widely used MPI implementations do not provide mechanisms for fault tolerance. We propose FTA-MPI (Fault Tolerance Assistant MPI), a programming model that provides support for failure detection, failure notification and recovery. Specifically, FTA-MPI exploits a try/catch model that enables failure localization and transparent recovery of process failures in MPI applications. We demonstrate FTA-MPI with synthetic applications and a molecular dynamics code CoMD, and show that FTA-MPI provides high programmability for users and enables convenient and flexible recovery of process failures.

  11. Performance of MPI parallel processing implemented by MCNP5/ MCNPX for criticality benchmark problems

    International Nuclear Information System (INIS)

    Mark Dennis Usang; Mohd Hairie Rabir; Mohd Amin Sharifuldin Salleh; Mohamad Puad Abu

    2012-01-01

    MPI parallelism is implemented on a SUN workstation for running MCNPX and on the High Performance Computing Facility (HPC) for running MCNP5. 23 input files obtained from the MCNP Criticality Validation Suite are utilized for the purpose of evaluating the amount of speedup achievable by using the parallel capabilities of MPI. More importantly, we study the economics of using more processors and the types of problem where the performance gains are obvious. This is important to enable better practices of resource sharing, especially of the HPC facility's processing time. Future endeavours in this direction might even reveal clues for best MCNP5/MCNPX coding practices for optimum performance of MPI parallelism. (author)

  12. The particular prediction of normal MPI in diabetic patients

    International Nuclear Information System (INIS)

    Wu, Z.-F.; Li, S.-J.; Liu, H.-Y.; Liu, J.Z.; Li, X.F.; Cheng, Y.; Zhang, Y.W.; Wang, J.

    2007-01-01

    Full text: Objectives: To explore the prognostic value of normal SPECT MPI in diabetic pts. Methods: 1371 consecutively registered pts suspected of CAD were studied using rest SPECT MPI, and 1047 cases (76.37%) were followed up successfully. The mean follow-up interval was 33.25±14.95 (1∼56) months, and even longer than 18 months for pts with no cardiac events (CE). Results: Of 1047 pts, 172 were diabetic. During the follow-up period, there were 42 cardiac events in the 172 diabetic patients, and 86 in the 857 non-diabetics. Diabetic pts had significantly higher rates of cardiac events (24.4% versus 9.8%; chi-square 28.5, P<0.0001). Among the 567 pts with normal MPI, there were 4 cardiac events in 54 diabetic pts and 6 in 513 nondiabetic pts. The diabetic pts had significantly higher rates of cardiac events compared with the non-diabetic pts (7.41% versus 1.17%, Fisher's Exact Test, P=0.01). Conclusions: A normal SPECT has a high negative predictive value, but diabetic patients had significantly higher cardiac event rates compared with non-diabetic patients, whether the MPI is normal or abnormal. (author)

  13. Towards deductive verification of MPI programs against session types

    Directory of Open Access Journals (Sweden)

    Eduardo R. B. Marques

    2013-12-01

    The Message Passing Interface (MPI) is the de facto standard message-passing infrastructure for developing parallel applications. Two decades after the first version of the library specification, MPI-based applications are nowadays routinely deployed on super and cluster computers. These applications, written in C or Fortran, exhibit intricate message passing behaviours, making it hard to statically verify important properties such as the absence of deadlocks. Our work builds on session types, a theory for describing protocols that provides for correct-by-construction guarantees in this regard. We annotate MPI primitives and C code with session type contracts, written in the language of a software verifier for C. Annotated code is then checked for correctness with the software verifier. We present preliminary results and discuss the challenges that lie ahead for verifying realistic MPI program compliance against session types.

  14. MPI-AMRVAC FOR SOLAR AND ASTROPHYSICS

    Energy Technology Data Exchange (ETDEWEB)

    Porth, O. [Department of Applied Mathematics, The University of Leeds, Leeds LS2 9JT (United Kingdom); Xia, C.; Hendrix, T.; Moschou, S. P.; Keppens, R., E-mail: o.porth@leeds.ac.uk [Centre for mathematical Plasma Astrophysics, Department of Mathematics, KU Leuven, Celestijnenlaan 200B, B-3001 Leuven (Belgium)

    2014-09-01

    In this paper, we present an update to the open source MPI-AMRVAC simulation toolkit where we focus on solar and non-relativistic astrophysical magnetofluid dynamics. We highlight recent developments in terms of physics modules, such as hydrodynamics with dust coupling and the conservative implementation of Hall magnetohydrodynamics. A simple conservative high-order finite difference scheme that works in combination with all available physics modules is introduced and demonstrated with the example of monotonicity-preserving fifth-order reconstruction. Strong stability-preserving high-order Runge-Kutta time steppers are used to obtain stable evolutions in multi-dimensional applications, realizing up to fourth-order accuracy in space and time. With the new distinction between active and passive grid cells, MPI-AMRVAC is ideally suited to simulate evolutions where parts of the solution are controlled analytically or have a tendency to progress into or out of a stationary state. Typical test problems and representative applications are discussed with an outlook toward follow-up research. Finally, we discuss the parallel scaling of the code and demonstrate excellent weak scaling up to 30,000 processors, allowing us to exploit modern peta-scale infrastructure.

  15. WImpiBLAST: web interface for mpiBLAST to help biologists perform large-scale annotation using high performance computing.

    Directory of Open Access Journals (Sweden)

    Parichit Sharma

    The function of a newly sequenced gene can be discovered by determining its sequence homology with known proteins. BLAST is the most extensively used sequence analysis program for sequence similarity search in large databases of sequences. With the advent of next generation sequencing technologies it has now become possible to study genes and their expression at a genome-wide scale through RNA-seq and metagenome sequencing experiments. Functional annotation of all the genes is done by sequence similarity search against multiple protein databases. This annotation task is computationally very intensive and can take days to obtain complete results. The program mpiBLAST, an open-source parallelization of BLAST that achieves superlinear speedup, can be used to accelerate large-scale annotation by using supercomputers and high performance computing (HPC) clusters. Although many parallel bioinformatics applications using the Message Passing Interface (MPI) are available in the public domain, researchers are reluctant to use them due to lack of expertise in the Linux command line and relevant programming experience. With these limitations, it becomes difficult for biologists to use mpiBLAST for accelerating annotation. No web interface is available in the open-source domain for mpiBLAST. We have developed WImpiBLAST, a user-friendly open-source web interface for parallel BLAST searches. It is implemented in Struts 1.3 using a Java backbone and runs atop the open-source Apache Tomcat Server. WImpiBLAST supports script creation and job submission features and also provides a robust job management interface for system administrators. It combines script creation and modification features with job monitoring and management through the Torque resource manager on a Linux-based HPC cluster. Use case information highlights the acceleration of annotation analysis achieved by using WImpiBLAST. Here, we describe the WImpiBLAST web interface features and architecture

  16. WImpiBLAST: web interface for mpiBLAST to help biologists perform large-scale annotation using high performance computing.

    Science.gov (United States)

    Sharma, Parichit; Mantri, Shrikant S

    2014-01-01

    The function of a newly sequenced gene can be discovered by determining its sequence homology with known proteins. BLAST is the most extensively used sequence analysis program for sequence similarity search in large databases of sequences. With the advent of next generation sequencing technologies it has now become possible to study genes and their expression at a genome-wide scale through RNA-seq and metagenome sequencing experiments. Functional annotation of all the genes is done by sequence similarity search against multiple protein databases. This annotation task is computationally very intensive and can take days to obtain complete results. The program mpiBLAST, an open-source parallelization of BLAST that achieves superlinear speedup, can be used to accelerate large-scale annotation by using supercomputers and high performance computing (HPC) clusters. Although many parallel bioinformatics applications using the Message Passing Interface (MPI) are available in the public domain, researchers are reluctant to use them due to lack of expertise in the Linux command line and relevant programming experience. With these limitations, it becomes difficult for biologists to use mpiBLAST for accelerating annotation. No web interface is available in the open-source domain for mpiBLAST. We have developed WImpiBLAST, a user-friendly open-source web interface for parallel BLAST searches. It is implemented in Struts 1.3 using a Java backbone and runs atop the open-source Apache Tomcat Server. WImpiBLAST supports script creation and job submission features and also provides a robust job management interface for system administrators. It combines script creation and modification features with job monitoring and management through the Torque resource manager on a Linux-based HPC cluster. Use case information highlights the acceleration of annotation analysis achieved by using WImpiBLAST. Here, we describe the WImpiBLAST web interface features and architecture, explain design

  17. MPI to Coarray Fortran: Experiences with a CFD Solver for Unstructured Meshes

    Directory of Open Access Journals (Sweden)

    Anuj Sharma

    2017-01-01

    High-resolution numerical methods and unstructured meshes are required in many applications of Computational Fluid Dynamics (CFD). These methods are quite computationally expensive and hence benefit from being parallelized. Message Passing Interface (MPI) has been utilized traditionally as a parallelization strategy. However, the inherent complexity of MPI contributes further to the existing complexity of the CFD scientific codes. The Partitioned Global Address Space (PGAS) parallelization paradigm was introduced in an attempt to improve the clarity of the parallel implementation. We present our experiences of converting an unstructured high-resolution compressible Navier-Stokes CFD solver from MPI to PGAS Coarray Fortran. We present the challenges, methodology, and performance measurements of our approach using Coarray Fortran. With the Cray compiler, we observe Coarray Fortran as a viable alternative to MPI. We are hopeful that Intel and open-source implementations could be utilized in the future.

  18. Coupling Computer Codes for The Analysis of Severe Accident Using A Pseudo Shared Memory Based on MPI

    International Nuclear Information System (INIS)

    Cho, Young Chul; Park, Chang-Hwan; Kim, Dong-Min

    2016-01-01

    As there are four codes for the analysis of severe accidents, namely the in-vessel analysis code (CSPACE), the ex-vessel analysis code (SACAP), the corium behavior analysis code (COMPASS), and the fission product behavior analysis code, it is complex to implement the coupling of these codes with methodologies similar to those used for RELAP and CONTEMPT or for SPACE and CAP. Because of that, an efficient coupling, the so-called pseudo shared memory architecture, was introduced. In this paper, coupling methodologies are compared and the methodology used for the analysis of severe accidents is discussed in detail. The barrier between in-vessel and ex-vessel has been removed for the analysis of severe accidents by coupling the computer codes through a pseudo shared memory architecture based on MPI. What remains is the proper choice and checking of variables and values for the selected severe accident scenarios, e.g., the TMI accident. Even though it is possible to couple more than two computer codes with the pseudo shared memory architecture, the methodology should be revised to couple parallel codes, especially when they are programmed using MPI.
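
    The record does not spell out the pseudo shared memory design. As a rough illustration only, a shared variable pool between two coupled codes can be emulated with MPI one-sided communication as below; the pool layout and the pressure variable are invented for the example.

      #include <mpi.h>

      int main(int argc, char **argv)
      {
          MPI_Init(&argc, &argv);
          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          double pool[16] = {0};   /* rank 0 hosts the shared pool */
          MPI_Win win;
          MPI_Win_create(pool, sizeof(pool), sizeof(double),
                         MPI_INFO_NULL, MPI_COMM_WORLD, &win);

          MPI_Win_fence(0, win);
          if (rank == 1) {              /* e.g. the ex-vessel code */
              double pressure = 1.5e5;  /* illustrative variable */
              MPI_Put(&pressure, 1, MPI_DOUBLE,
                      0, 3, 1, MPI_DOUBLE, win);  /* write slot 3 */
          }
          MPI_Win_fence(0, win);   /* pool[3] now visible on rank 0 */

          MPI_Win_free(&win);
          MPI_Finalize();
          return 0;
      }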

  19. Coupling Computer Codes for The Analysis of Severe Accident Using A Pseudo Shared Memory Based on MPI

    Energy Technology Data Exchange (ETDEWEB)

    Cho, Young Chul; Park, Chang-Hwan; Kim, Dong-Min [FNC Technology Co., Yongin (Korea, Republic of)

    2016-10-15

    As there are four codes for the analysis of severe accidents, namely the in-vessel analysis code (CSPACE), the ex-vessel analysis code (SACAP), the corium behavior analysis code (COMPASS), and the fission product behavior analysis code, it is complex to implement the coupling of these codes with methodologies similar to those used for RELAP and CONTEMPT or for SPACE and CAP. Because of that, an efficient coupling, the so-called pseudo shared memory architecture, was introduced. In this paper, coupling methodologies are compared and the methodology used for the analysis of severe accidents is discussed in detail. The barrier between in-vessel and ex-vessel has been removed for the analysis of severe accidents by coupling the computer codes through a pseudo shared memory architecture based on MPI. What remains is the proper choice and checking of variables and values for the selected severe accident scenarios, e.g., the TMI accident. Even though it is possible to couple more than two computer codes with the pseudo shared memory architecture, the methodology should be revised to couple parallel codes, especially when they are programmed using MPI.

  20. Conflict Detection Algorithm to Minimize Locking for MPI-IO Atomicity

    Science.gov (United States)

    Sehrish, Saba; Wang, Jun; Thakur, Rajeev

    Many scientific applications require high-performance concurrent I/O accesses to a file by multiple processes. Those applications rely indirectly on atomic I/O capabilities in order to perform updates to structured datasets, such as those stored in HDF5 format files. Current support for atomicity in MPI-IO is provided by locking around the operations, imposing lock overhead in all situations, even though in many cases these operations are non-overlapping in the file. We propose to isolate non-overlapping accesses from overlapping ones in independent I/O cases, allowing the non-overlapping ones to proceed without imposing lock overhead. To enable this, we have implemented an efficient conflict detection algorithm in MPI-IO using MPI file views and datatypes. We show that our conflict detection scheme incurs minimal overhead on I/O operations, making it an effective mechanism for avoiding locks when they are not needed.
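
    As a point of reference for the file-view machinery involved, the sketch below expresses non-overlapping concurrent writes through an MPI-IO file view, the case the proposed scheme allows to proceed without locks; the file name and layout are illustrative, and this is not the authors' detection algorithm.

      #include <mpi.h>

      int main(int argc, char **argv)
      {
          MPI_Init(&argc, &argv);
          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          MPI_File fh;
          MPI_File_open(MPI_COMM_WORLD, "data.out",
                        MPI_MODE_CREATE | MPI_MODE_WRONLY,
                        MPI_INFO_NULL, &fh);

          /* each rank sees only its own disjoint block of the file */
          enum { NLOCAL = 1024 };
          double buf[NLOCAL];
          for (int i = 0; i < NLOCAL; i++) buf[i] = rank;
          MPI_Offset disp = (MPI_Offset)rank * NLOCAL * sizeof(double);
          MPI_File_set_view(fh, disp, MPI_DOUBLE, MPI_DOUBLE,
                            "native", MPI_INFO_NULL);

          /* disjoint accesses like this need no lock for atomicity */
          MPI_File_write_all(fh, buf, NLOCAL, MPI_DOUBLE,
                             MPI_STATUS_IGNORE);

          MPI_File_close(&fh);
          MPI_Finalize();
          return 0;
      }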

  1. Hybrid x-space: a new approach for MPI reconstruction.

    Science.gov (United States)

    Tateo, A; Iurino, A; Settanni, G; Andrisani, A; Stifanelli, P F; Larizza, P; Mazzia, F; Mininni, R M; Tangaro, S; Bellotti, R

    2016-06-07

    Magnetic particle imaging (MPI) is a new medical imaging technique capable of recovering the distribution of superparamagnetic particles from their measured induced signals. In the literature there are two main MPI reconstruction techniques: measurement-based (MB) and x-space (XS). The MB method is expensive because it requires a long calibration procedure as well as a reconstruction phase that can be numerically costly. On the other hand, the XS method is simpler than MB, but exact knowledge of the field free point (FFP) motion is essential for its implementation. Our simulation work focuses on the implementation of a new approach for MPI reconstruction: it is called hybrid x-space (HXS), representing a combination of the previous methods. Specifically, our approach is based on XS reconstruction because it requires the knowledge of the FFP position and velocity at each time instant. The difference with respect to the original XS formulation is how the FFP velocity is computed: we estimate it from the experimental measurements of the calibration scans, typical of the MB approach. Moreover, a compressive sensing technique is applied in order to reduce the calibration time, setting a fewer number of sampling positions. Simulations highlight that the HXS and XS methods give similar results. Furthermore, an appropriate use of compressive sensing is crucial for obtaining a good balance between time reduction and reconstructed image quality. Our proposal is suitable for open geometry configurations of human size devices, where incidental factors could make the currents, the fields and the FFP trajectory irregular.

  2. Hybrid MPI/OpenMP parallelization of the explicit Volterra integral equation solver for multi-core computer architectures

    KAUST Repository

    Al Jarro, Ahmed; Bagci, Hakan

    2011-01-01

    A hybrid MPI/OpenMP scheme for efficiently parallelizing the explicit marching-on-in-time (MOT)-based solution of the time-domain volume (Volterra) integral equation (TD-VIE) is presented. The proposed scheme equally distributes tested field values

  3. Relationship Between Coronary Contrast-Flow Quantitative Flow Ratio and Myocardial Ischemia Assessed by SPECT MPI.

    Science.gov (United States)

    Smit, Jeff M; Koning, Gerhard; van Rosendael, Alexander R; Dibbets-Schneider, Petra; Mertens, Bart J; Jukema, J Wouter; Delgado, Victoria; Reiber, Johan H C; Bax, Jeroen J; Scholte, Arthur J

    2017-10-01

    A new method has been developed to calculate fractional flow reserve (FFR) from invasive coronary angiography, the so-called "contrast-flow quantitative flow ratio (cQFR)". Recently, cQFR was compared to invasive FFR in intermediate coronary lesions showing an overall diagnostic accuracy of 85%. The purpose of this study was to investigate the relationship between cQFR and myocardial ischemia assessed by single-photon emission computed tomography myocardial perfusion imaging (SPECT MPI). Patients who underwent SPECT MPI and coronary angiography within 3 months were included. The cQFR computation was performed offline, using dedicated software. The cQFR computation was based on 3-dimensional quantitative coronary angiography (QCA) and computational fluid dynamics. The standard 17-segment model was used to determine the vascular territories. Myocardial ischemia was defined as a summed difference score ≥2 in a vascular territory. A cQFR of ≤0.80 was considered abnormal. Two hundred and twenty-four coronary arteries were analysed in 85 patients. Overall accuracy of cQFR to detect ischemia on SPECT MPI was 90%. In multivariable analysis, cQFR was independently associated with ischemia on SPECT MPI (OR per 0.01 decrease of cQFR: 1.10; 95% CI 1.04-1.18, p = 0.002), whereas clinical and QCA parameters were not. Furthermore, cQFR showed incremental value for the detection of ischemia compared to clinical and QCA parameters (global chi square 48.7 to 62.6; p < 0.001). A good relationship between cQFR and SPECT MPI was found. cQFR was independently associated with ischemia on SPECT MPI and showed incremental value to detect ischemia compared to clinical and QCA parameters.

  4. Parallelizing AT with MatlabMPI

    International Nuclear Information System (INIS)

    2011-01-01

    The Accelerator Toolbox (AT) is a high-level collection of tools and scripts specifically oriented toward solving problems dealing with computational accelerator physics. It is integrated into the MATLAB environment, which provides an accessible, intuitive interface for accelerator physicists, allowing researchers to focus the majority of their efforts on simulations and calculations, rather than programming and debugging difficulties. Efforts toward parallelization of AT have been put in place to upgrade its performance to modern standards of computing. We utilized the packages MatlabMPI and pMatlab, which were developed by MIT Lincoln Laboratory, to set up a message-passing environment that could be called within MATLAB, establishing the prerequisites for multithreaded processing capabilities. On local quad-core CPUs, we were able to demonstrate processor efficiencies of roughly 95% and speed increases of nearly 380%. By exploiting the efficacy of modern-day parallel computing, we were able to demonstrate incredibly efficient speed increments per processor in AT's beam-tracking functions. Extrapolating from these predictions, we expect to reduce week-long computation runtimes to less than 15 minutes. This is a huge performance improvement and has enormous implications for the future computing power of the accelerator physics group at SSRL. However, one of the downfalls of parringpass is its current lack of transparency; the pMatlab and MatlabMPI packages must first be well understood by the user before the system can be configured to run the scripts. In addition, the instantiation of argument parameters requires internal modification of the source code. Thus, parringpass cannot be directly run from the MATLAB command line, which detracts from its flexibility and user-friendliness. Future work in AT's parallelization will focus on development of external functions and scripts that can be called from within MATLAB and configured on multiple nodes, while

  5. Relationship between coronary contrast-flow quantitative flow ratio and myocardial ischemia assessed by SPECT MPI

    Energy Technology Data Exchange (ETDEWEB)

    Smit, Jeff M.; Rosendael, Alexander R. van; Jukema, J.W.; Delgado, Victoria; Bax, Jeroen J.; Scholte, Arthur J. [Leiden University Medical Center, Department of Cardiology, Leiden (Netherlands); Koning, Gerhard [Medis Medical Imaging Systems B.V., Leiden (Netherlands); Dibbets-Schneider, Petra [Leiden University Medical Center, Department of Nuclear Medicine, Leiden (Netherlands); Mertens, Bart J. [Leiden University Medical Center, Department of Medical Statistics, Leiden (Netherlands); Reiber, Johan H.C. [Medis Medical Imaging Systems B.V., Leiden (Netherlands); Leiden University Medical Center, Department of Radiology, Leiden (Netherlands)

    2017-10-15

    A new method has been developed to calculate fractional flow reserve (FFR) from invasive coronary angiography, the so-called "contrast-flow quantitative flow ratio (cQFR)". Recently, cQFR was compared to invasive FFR in intermediate coronary lesions, showing an overall diagnostic accuracy of 85%. The purpose of this study was to investigate the relationship between cQFR and myocardial ischemia assessed by single-photon emission computed tomography myocardial perfusion imaging (SPECT MPI). Patients who underwent SPECT MPI and coronary angiography within 3 months were included. The cQFR computation was performed offline, using dedicated software. The cQFR computation was based on 3-dimensional quantitative coronary angiography (QCA) and computational fluid dynamics. The standard 17-segment model was used to determine the vascular territories. Myocardial ischemia was defined as a summed difference score ≥2 in a vascular territory. A cQFR of ≤0.80 was considered abnormal. Two hundred and twenty-four coronary arteries were analysed in 85 patients. Overall accuracy of cQFR to detect ischemia on SPECT MPI was 90%. In multivariable analysis, cQFR was independently associated with ischemia on SPECT MPI (OR per 0.01 decrease of cQFR: 1.10; 95% CI 1.04-1.18, p = 0.002), whereas clinical and QCA parameters were not. Furthermore, cQFR showed incremental value for the detection of ischemia compared to clinical and QCA parameters (global chi square 48.7 to 62.6; p < 0.001). A good relationship between cQFR and SPECT MPI was found. cQFR was independently associated with ischemia on SPECT MPI and showed incremental value to detect ischemia compared to clinical and QCA parameters. (orig.)

  6. Safety and efficacy of Regadenoson in myocardial perfusion imaging (MPI) stress tests: A review

    Science.gov (United States)

    Ahmed, Ambereen

    2018-02-01

    Myocardial perfusion imaging (MPI) tests are often used to help diagnose coronary artery disease (CAD). The tests usually involve applying stress to the patient, such as hard physical exercise together with administration of vasodilators. To date, many of these tests use non-selective A2A adenosine receptor agonists which, however, can be associated with highly undesirable and life-threatening side effects such as chest pain, dyspnea, severe bronchoconstriction and atrioventricular conduction anomalies. Regadenoson is a relatively new, highly selective A2A adenosine receptor agonist suitable for use in MPI tests; it exhibits far fewer adverse side effects and, unlike other testing agents, can be used without the necessity of excessive concomitant exercise. Also, the dose of regadenoson required does not depend on patient weight or renal impairment, and it can be rapidly administered by i.v. injection. Regadenoson use in MPI testing thus has potential as a simplified, relatively safe, time-saving and cost-effective method for helping diagnose CAD. The present study was designed to review several articles on the safety, efficacy, and suitability of regadenoson in MPI testing for CAD. Overall, the combined studies demonstrated that use of regadenoson in conjunction with low-level exercise in MPI is a highly efficient and relatively safe test for CAD, especially for patients with more severely compromised health.

  7. Development of an MPI benchmark program library

    Energy Technology Data Exchange (ETDEWEB)

    Uehara, Hitoshi

    2001-03-01

    Distributed parallel simulation software with message passing interfaces has been developed to realize large-scale and high performance numerical simulations. The most popular API for message communication is MPI, which will be provided on the Earth Simulator. It is known that the performance of message communication using MPI libraries has a significant influence on the overall performance of simulation programs. We developed an MPI benchmark program library named MBL in order to measure the performance of message communication precisely. The MBL measures the performance of major MPI functions, such as point-to-point communications and collective communications, and of major communication patterns that are often found in application programs. In this report, the description of the MBL and the performance analysis of MPI/SX measured on the SX-4 are presented. (author)
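
    As a rough illustration of the kind of point-to-point measurement such a library performs, the following C sketch times a ping-pong exchange with MPI_Wtime. This is illustrative only, not the MBL code; the message size and iteration count are arbitrary assumptions.

      /* Ping-pong timing sketch: rank 0 and rank 1 bounce a buffer back and
         forth; the averaged round-trip time yields latency and bandwidth.
         Run with at least two ranks; extra ranks stay idle. */
      #include <mpi.h>
      #include <stdio.h>
      #include <stdlib.h>

      int main(int argc, char **argv)
      {
          int rank, iters = 1000, nbytes = 1 << 20;   /* 1 MiB messages */
          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          char *buf = malloc(nbytes);

          MPI_Barrier(MPI_COMM_WORLD);
          double t0 = MPI_Wtime();
          for (int i = 0; i < iters; i++) {
              if (rank == 0) {
                  MPI_Send(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                  MPI_Recv(buf, nbytes, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                           MPI_STATUS_IGNORE);
              } else if (rank == 1) {
                  MPI_Recv(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                           MPI_STATUS_IGNORE);
                  MPI_Send(buf, nbytes, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
              }
          }
          double t1 = MPI_Wtime();

          if (rank == 0)
              printf("round trip: %.1f us, bandwidth: %.1f MB/s\n",
                     (t1 - t0) / iters * 1e6,
                     2.0 * nbytes * iters / (t1 - t0) / 1e6);
          free(buf);
          MPI_Finalize();
          return 0;
      }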

  8. X-space MPI: magnetic nanoparticles for safe medical imaging.

    Science.gov (United States)

    Goodwill, Patrick William; Saritas, Emine Ulku; Croft, Laura Rose; Kim, Tyson N; Krishnan, Kannan M; Schaffer, David V; Conolly, Steven M

    2012-07-24

    One quarter of all iodinated contrast X-ray clinical imaging studies are now performed on Chronic Kidney Disease (CKD) patients. Unfortunately, the iodine contrast agent used in X-ray is often toxic to CKD patients' weak kidneys, leading to significant morbidity and mortality. Hence, we are pioneering a new medical imaging method, called Magnetic Particle Imaging (MPI), to replace X-ray and CT iodinated angiography, especially for CKD patients. MPI uses magnetic nanoparticle contrast agents that are much safer than iodine for CKD patients. MPI already offers superb contrast and extraordinary sensitivity. The iron oxide nanoparticle tracers required for MPI are also used in MRI, and some are already approved for human use, but the contrast agents are far more effective at illuminating blood vessels when used in the MPI modality. We have recently developed a systems theoretic framework for MPI called x-space MPI, which has already dramatically improved the speed and robustness of MPI image reconstruction. X-space MPI has allowed us to optimize the hardware for five MPI scanners. Moreover, x-space MPI provides a powerful framework for optimizing the size and magnetic properties of the iron oxide nanoparticle tracers used in MPI. Currently MPI nanoparticles have diameters in the 10-20 nanometer range, enabling millimeter-scale resolution in small animals. X-space MPI theory predicts that larger nanoparticles could enable up to 250 micrometer resolution imaging, which would represent a major breakthrough in safe imaging for CKD patients.

  9. Fortran code for SU(3) lattice gauge theory with and without MPI checkerboard parallelization

    Science.gov (United States)

    Berg, Bernd A.; Wu, Hao

    2012-10-01

    We document plain Fortran and Fortran MPI checkerboard code for Markov chain Monte Carlo simulations of pure SU(3) lattice gauge theory with the Wilson action in D dimensions. The Fortran code uses periodic boundary conditions and is suitable for pedagogical purposes and small scale simulations. For the Fortran MPI code two geometries are covered: the usual torus with periodic boundary conditions and the double-layered torus as defined in the paper. Parallel computing is performed on checkerboards of sublattices, which partition the full lattice in one, two, and so on, up to D directions (depending on the parameters set). For updating, the Cabibbo-Marinari heatbath algorithm is used. We present validations and test runs of the code. Performance is reported for a number of currently used Fortran compilers and, when applicable, MPI versions. For the parallelized code, performance is studied as a function of the number of processors.
    Program summary
    Program title: STMC2LSU3MPI
    Catalogue identifier: AEMJ_v1_0
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEMJ_v1_0.html
    Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
    Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
    No. of lines in distributed program, including test data, etc.: 26666
    No. of bytes in distributed program, including test data, etc.: 233126
    Distribution format: tar.gz
    Programming language: Fortran 77 compatible with the use of Fortran 90/95 compilers, in part with MPI extensions.
    Computer: Any capable of compiling and executing Fortran 77 or Fortran 90/95, when needed with MPI extensions.
    Operating system: Red Hat Enterprise Linux Server 6.1 with OpenMPI + pgf77 11.8-0, Centos 5.3 with OpenMPI + gfortran 4.1.2, Cray XT4 with MPICH2 + pgf90 11.2-0.
    Has the code been vectorised or parallelized?: Yes, parallelized using MPI extensions.
    Number of processors used: 2 to 11664
    RAM: 200 Mega bytes per process.
    Classification: 11
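
    To make the checkerboard idea concrete, here is a hedged 2-D sketch in C (the distributed program itself is Fortran and updates SU(3) link matrices in up to D dimensions; the scalar field and averaging update below are illustrative stand-ins):

      /* Checkerboard sweep: sites are coloured like a chessboard, so all
         sites of one colour can be updated independently, because each
         site's neighbours all carry the other colour. */
      #define NX 16
      #define NY 16

      static double field[NX][NY];          /* stand-in for SU(3) links  */

      static void update_site(int x, int y) /* stand-in for the heatbath */
      {
          field[x][y] = 0.25 * (field[(x + 1) % NX][y]
                              + field[(x + NX - 1) % NX][y]
                              + field[x][(y + 1) % NY]
                              + field[x][(y + NY - 1) % NY]);
      }

      void checkerboard_sweep(void)
      {
          for (int colour = 0; colour < 2; colour++)   /* even, then odd */
              for (int x = 0; x < NX; x++)
                  for (int y = 0; y < NY; y++)
                      if ((x + y) % 2 == colour)
                          update_site(x, y);
      }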

  10. Multiphoton ionization of H2+ at critical internuclear separations: non-Hermitian Floquet analysis

    International Nuclear Information System (INIS)

    Likhatov, P V; Telnov, D A

    2009-01-01

    We present ab initio time-dependent non-Hermitian Floquet calculations of multiphoton ionization (MPI) rates of hydrogen molecular ions subject to an intense linearly polarized monochromatic laser field with a wavelength of 800 nm. The orientation of the molecular axis is parallel to the polarization vector of the laser field. The MPI rates are computed for a wide range of internuclear separations R with high resolution in R and reproduce resonance and near-threshold structures. We show that enhancement of ionization at critical internuclear separations is related to resonance series with higher electronic states. The effect of two-centre interference on the MPI signal is discussed.

  11. Development of a parallel genetic algorithm using MPI and its application in a nuclear reactor core. Design optimization

    International Nuclear Information System (INIS)

    Waintraub, Marcel; Pereira, Claudio M.N.A.; Baptista, Rafael P.

    2005-01-01

    This work presents the development of a distributed parallel genetic algorithm applied to nuclear reactor core design optimization. The parallelism was implemented with the Message Passing Interface (MPI) library, the standard for parallel computation on distributed-memory platforms. Another important characteristic of MPI is its portability across various architectures. The main objectives of this paper are: validation of the results obtained by the application of this algorithm to a nuclear reactor core optimization problem, through comparisons with previous results presented by Pereira et al.; and a performance test of the Brazilian Nuclear Engineering Institute (IEN) cluster on reactor physics optimization problems. The experiments demonstrated that the parallel genetic algorithm developed with the MPI library achieved significant gains in the obtained results and a marked reduction in processing time. Such results support the use of parallel genetic algorithms for the solution of nuclear reactor core optimization problems. (author)
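
    As a hedged sketch of how such fitness evaluation can be distributed with MPI (the fitness function and population layout below are illustrative assumptions, not the authors' code):

      /* Master-worker fitness evaluation: rank 0 holds the population,
         chromosomes are scattered, evaluated locally, and the fitness
         values are gathered back for selection and crossover. */
      #include <mpi.h>
      #include <stdlib.h>

      #define GENES    32   /* chromosome length (illustrative)      */
      #define PER_RANK 4    /* individuals evaluated per MPI process */

      static double fitness(const double *ind)   /* placeholder metric */
      {
          double f = 0.0;
          for (int g = 0; g < GENES; g++) f += ind[g];
          return f;
      }

      int main(int argc, char **argv)
      {
          int rank, size;
          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          int pop = PER_RANK * size;
          double *population = NULL, *fits = NULL;
          double local[PER_RANK * GENES], local_fit[PER_RANK];

          if (rank == 0) {  /* master creates a random initial population */
              population = malloc(pop * GENES * sizeof(double));
              fits = malloc(pop * sizeof(double));
              for (int i = 0; i < pop * GENES; i++)
                  population[i] = rand() / (double)RAND_MAX;
          }

          MPI_Scatter(population, PER_RANK * GENES, MPI_DOUBLE,
                      local, PER_RANK * GENES, MPI_DOUBLE, 0, MPI_COMM_WORLD);
          for (int i = 0; i < PER_RANK; i++)
              local_fit[i] = fitness(&local[i * GENES]);
          MPI_Gather(local_fit, PER_RANK, MPI_DOUBLE,
                     fits, PER_RANK, MPI_DOUBLE, 0, MPI_COMM_WORLD);

          /* rank 0 would now apply selection, crossover and mutation,
             then repeat the scatter/evaluate/gather cycle */
          if (rank == 0) { free(population); free(fits); }
          MPI_Finalize();
          return 0;
      }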

  12. High-Level Topology-Oblivious Optimization of MPI Broadcast Algorithms on Extreme-Scale Platforms

    KAUST Repository

    Hasanov, Khalid; Quintin, Jean-Noë l; Lastovetsky, Alexey

    2014-01-01

    There has been significant research in collective communication operations, in particular in MPI broadcast, on distributed memory platforms. Most of the research works are done to optimize the collective operations for particular architectures by taking into account either their topology or platform parameters. In this work we propose a very simple and at the same time general approach to optimize legacy MPI broadcast algorithms, which are widely used in MPICH and OpenMPI. Theoretical analysis and experimental results on IBM BlueGene/P and a cluster of the Grid’5000 platform are presented.

  13. Design of Superparamagnetic Nanoparticles for Magnetic Particle Imaging (MPI)

    Directory of Open Access Journals (Sweden)

    Philip W. T. Pong

    2013-09-01

    Magnetic particle imaging (MPI) is a promising medical imaging technique producing quantitative images of the distribution of tracer materials (superparamagnetic nanoparticles) without interference from the anatomical background of the imaging objects (either phantoms or lab animals). Theoretically, the MPI platform can image with relatively high temporal and spatial resolution and sensitivity. In practice, the quality of the MPI images hinges on both the applied magnetic field and the properties of the tracer nanoparticles. Langevin theory can model the performance of superparamagnetic nanoparticles and predict the crucial influence of nanoparticle core size on the MPI signal. In addition, the core size distribution, anisotropy of the magnetic core and surface modification of the superparamagnetic nanoparticles also determine the spatial resolution and sensitivity of the MPI images. As a result, through rational design of superparamagnetic nanoparticles, the performance of MPI could be effectively optimized. In this review, the performance of superparamagnetic nanoparticles in MPI is investigated. Rational synthesis and modification of superparamagnetic nanoparticles are discussed and summarized. The potential medical application areas for MPI, including the cardiovascular system, oncology, stem cell tracking and immune-related imaging, are also analyzed and forecast.
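
    For reference, the Langevin model mentioned above gives the equilibrium magnetization of a superparamagnetic tracer as (standard notation, not taken from this review):

      M(H) = M_s \left( \coth \xi - \frac{1}{\xi} \right),
      \qquad \xi = \frac{\mu_0 m H}{k_B T},

    where M_s is the saturation magnetization and m the magnetic moment of a single core. Since m grows with the cube of the core diameter, the argument ξ, and with it the steepness of the magnetization curve that generates the MPI signal, depends strongly on core size.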

  14. How to use MPI communication in highly parallel climate simulations more easily and more efficiently.

    Science.gov (United States)

    Behrens, Jörg; Hanke, Moritz; Jahns, Thomas

    2014-05-01

    In this talk we present a way to facilitate efficient use of MPI communication for developers of climate models. Exploiting the performance potential of today's highly parallel supercomputers with real-world simulations is a complex task. This is partly caused by the low-level nature of the MPI communication library, which is the dominant communication tool, at least for inter-node communication. In order to manage the complexity of the task, climate simulations with non-trivial communication patterns often use an internal abstraction layer above MPI without exploiting the benefits of communication aggregation or MPI datatypes. The solution we propose for this complexity and performance problem is the communication library YAXT. This library is built on top of MPI and takes high-level descriptions of arbitrary domain decompositions and automatically derives an efficient collective data exchange. Several exchanges can be aggregated in order to reduce latency costs. Examples are given which demonstrate the simplicity and the performance gains for selected climate applications.
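
    As a plain-MPI illustration of the datatype-based aggregation mentioned above (this is not the YAXT API; the field layout is an assumption), a strided column of a 2-D array can be sent as a single message instead of one message per element:

      /* Describe a column of an nrows-by-ncols row-major field as a
         derived datatype, so one send carries the whole column. */
      #include <mpi.h>

      void send_column(double *field, int nrows, int ncols,
                       int col, int dest, MPI_Comm comm)
      {
          MPI_Datatype column;
          /* nrows blocks of 1 double, stride of ncols doubles apart */
          MPI_Type_vector(nrows, 1, ncols, MPI_DOUBLE, &column);
          MPI_Type_commit(&column);
          MPI_Send(&field[col], 1, column, dest, 0, comm);
          MPI_Type_free(&column);
      }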

  15. Development of Mixed Mode MPI / OpenMP Applications

    Directory of Open Access Journals (Sweden)

    Lorna Smith

    2001-01-01

    MPI/OpenMP mixed mode codes could potentially offer the most effective parallelisation strategy for an SMP cluster, as well as allowing the different characteristics of both paradigms to be exploited to give the best performance on a single SMP. This paper discusses the implementation, development and performance of mixed mode MPI/OpenMP applications. The results demonstrate that this style of programming will not always be the most effective mechanism on SMP systems and cannot be regarded as the ideal programming model for all codes. In some situations, however, significant benefit may be obtained from a mixed mode implementation. For example, benefit may be obtained if the parallel (MPI) code suffers from poor scaling with MPI processes due to load imbalance or too fine-grained a problem size, memory limitations due to the use of a replicated data strategy, or a restriction on the number of MPI processes. In addition, if the system has a poorly optimised or limited-scaling MPI implementation, then a mixed mode code may increase the code performance.
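
    A minimal mixed-mode skeleton in C, assuming one MPI process per SMP node with OpenMP threads inside it (a sketch of the programming style discussed above, not code from the paper):

      #include <mpi.h>
      #include <omp.h>
      #include <stdio.h>

      int main(int argc, char **argv)
      {
          int provided, rank;
          /* MPI_THREAD_FUNNELED: only the main thread makes MPI calls */
          MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          #pragma omp parallel
          printf("MPI rank %d, OpenMP thread %d of %d\n",
                 rank, omp_get_thread_num(), omp_get_num_threads());

          MPI_Finalize();
          return 0;
      }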

  16. Static Deadlock Detection in MPI Synchronization Communication

    OpenAIRE

    Ming-Xue, Liao; Xiao-Xin, He; Zhi-Hua, Fan

    2007-01-01

    It is very common to use dynamic methods to detect deadlocks in MPI programs because static methods have some restrictions. To guarantee high reliability of some important MPI-based application software, a model of MPI synchronization communication is abstracted and a type of static method is devised to examine deadlocks in such models. The model has three forms of differing complexity: a sequential model, a single-loop model and a nested-loop model. The sequential model is a base for all ...
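
    A classic instance of the kind of deadlock such static analyses target, sketched in C: both ranks post a blocking send first, so if the MPI implementation does not buffer the large message, neither rank ever reaches its receive.

      /* Head-to-head send deadlock (assumes exactly two ranks). */
      #include <mpi.h>

      int main(int argc, char **argv)
      {
          static int buf[1 << 16];   /* large enough to defeat buffering */
          int rank, peer;
          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          peer = 1 - rank;

          /* both ranks block here: the matching receives are never reached;
             a safe variant would use MPI_Sendrecv or nonblocking calls */
          MPI_Send(buf, 1 << 16, MPI_INT, peer, 0, MPI_COMM_WORLD);
          MPI_Recv(buf, 1 << 16, MPI_INT, peer, 0, MPI_COMM_WORLD,
                   MPI_STATUS_IGNORE);

          MPI_Finalize();
          return 0;
      }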

  17. Hybrid MPI-OpenMP Parallelism in the ONETEP Linear-Scaling Electronic Structure Code: Application to the Delamination of Cellulose Nanofibrils.

    Science.gov (United States)

    Wilkinson, Karl A; Hine, Nicholas D M; Skylaris, Chris-Kriton

    2014-11-11

    We present a hybrid MPI-OpenMP implementation of Linear-Scaling Density Functional Theory within the ONETEP code. We illustrate its performance on a range of high performance computing (HPC) platforms comprising shared-memory nodes with fast interconnect. Our work has focused on applying OpenMP parallelism to the routines which dominate the computational load, attempting where possible to parallelize different loops from those already parallelized within MPI. This includes 3D FFT box operations, sparse matrix algebra operations, calculation of integrals, and Ewald summation. While the underlying numerical methods are unchanged, these developments represent significant changes to the algorithms used within ONETEP to distribute the workload across CPU cores. The new hybrid code exhibits much-improved strong scaling relative to the MPI-only code and permits calculations with a much higher ratio of cores to atoms. These developments result in a significantly shorter time to solution than was possible using MPI alone and facilitate the application of the ONETEP code to systems larger than previously feasible. We illustrate this with benchmark calculations from an amyloid fibril trimer containing 41,907 atoms. We use the code to study the mechanism of delamination of cellulose nanofibrils when undergoing sonication, a process which is controlled by a large number of interactions that collectively determine the structural properties of the fibrils. Many energy evaluations were needed for these simulations, and as these systems comprise up to 21,276 atoms this would not have been feasible without the developments described here.

  18. High-Level Topology-Oblivious Optimization of MPI Broadcast Algorithms on Extreme-Scale Platforms

    KAUST Repository

    Hasanov, Khalid

    2014-01-01

    There has been significant research in collective communication operations, in particular in MPI broadcast, on distributed memory platforms. Most of the research works are done to optimize the collective operations for particular architectures by taking into account either their topology or platform parameters. In this work we propose a very simple and at the same time general approach to optimize legacy MPI broadcast algorithms, which are widely used in MPICH and OpenMPI. Theoretical analysis and experimental results on IBM BlueGene/P and a cluster of the Grid’5000 platform are presented.
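
    A hedged sketch of one such hierarchical optimization built on top of a legacy MPI_Bcast, in C: the ranks are split into groups of an assumed tunable size, the message is broadcast across group leaders, and then within each group (illustrative of the general approach, not the authors' implementation; the root is assumed to be rank 0):

      #include <mpi.h>

      void twolevel_bcast(void *buf, int count, MPI_Datatype type,
                          MPI_Comm comm, int group_size)
      {
          int rank;
          MPI_Comm_rank(comm, &rank);

          MPI_Comm group, leaders;
          /* stage communicators: my group, and the set of group leaders */
          MPI_Comm_split(comm, rank / group_size, rank, &group);
          MPI_Comm_split(comm, rank % group_size == 0 ? 0 : MPI_UNDEFINED,
                         rank, &leaders);

          if (leaders != MPI_COMM_NULL) {
              MPI_Bcast(buf, count, type, 0, leaders); /* across leaders */
              MPI_Comm_free(&leaders);
          }
          MPI_Bcast(buf, count, type, 0, group);       /* within groups  */
          MPI_Comm_free(&group);
      }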

  19. LUCKY-TD code for solving the time-dependent transport equation with the use of parallel computations

    Energy Technology Data Exchange (ETDEWEB)

    Moryakov, A. V., E-mail: sailor@orc.ru [National Research Centre Kurchatov Institute (Russian Federation)

    2016-12-15

    An algorithm for solving the time-dependent transport equation in the P_mS_n group approximation with the use of parallel computations is presented. The algorithm is implemented in the LUCKY-TD code for supercomputers employing the MPI standard for data exchange between parallel processes.

  20. Final report: Compiled MPI. Cost-Effective Exascale Application Development

    Energy Technology Data Exchange (ETDEWEB)

    Gropp, William Douglas [Univ. of Illinois, Urbana-Champaign, IL (United States)

    2015-12-21

    This is the final report on Compiled MPI: Cost-Effective Exascale Application Development, and it summarizes the results of this project. The project investigated runtime environments that improve the performance of MPI (Message-Passing Interface) programs; work at Illinois in the last period of the project looked at optimizing data accesses expressed with MPI datatypes.

  1. High-order finite-element seismic wave propagation modeling with MPI on a large GPU cluster

    International Nuclear Information System (INIS)

    Komatitsch, Dimitri; Erlebacher, Gordon; Goeddeke, Dominik; Michea, David

    2010-01-01

    We implement a high-order finite-element application, which performs the numerical simulation of seismic wave propagation resulting for instance from earthquakes at the scale of a continent or from active seismic acquisition experiments in the oil industry, on a large cluster of NVIDIA Tesla graphics cards using the CUDA programming environment and non-blocking message passing based on MPI. Contrary to many finite-element implementations, ours is implemented successfully in single precision, maximizing the performance of current generation GPUs. We discuss the implementation and optimization of the code and compare it to an existing very optimized implementation in C language and MPI on a classical cluster of CPU nodes. We use mesh coloring to efficiently handle summation operations over degrees of freedom on an unstructured mesh, and non-blocking MPI messages in order to overlap the communications across the network and the data transfer to and from the device via PCIe with calculations on the GPU. We perform a number of numerical tests to validate the single-precision CUDA and MPI implementation and assess its accuracy. We then analyze performance measurements and depending on how the problem is mapped to the reference CPU cluster, we obtain a speedup of 20x or 12x.
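
    The overlap described above follows the standard nonblocking pattern; a hedged C sketch (the function names and buffer layout are illustrative assumptions, not the application's code):

      #include <mpi.h>

      static void compute_interior(void) { /* work needing no halo data */ }
      static void compute_boundary(void) { /* work that needed the halo */ }

      void exchange_and_compute(double *halo_out, double *halo_in, int n,
                                int left, int right, MPI_Comm comm)
      {
          MPI_Request reqs[2];

          /* post the halo exchange without blocking */
          MPI_Irecv(halo_in, n, MPI_DOUBLE, left, 0, comm, &reqs[0]);
          MPI_Isend(halo_out, n, MPI_DOUBLE, right, 0, comm, &reqs[1]);

          compute_interior();   /* overlaps with the transfers above */

          MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
          compute_boundary();   /* edges can be finished only now    */
      }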

  2. Performance modeling of hybrid MPI/OpenMP scientific applications on large-scale multicore supercomputers

    KAUST Repository

    Wu, Xingfu; Taylor, Valerie

    2013-01-01

    In this paper, we present a performance modeling framework based on memory bandwidth contention time and a parameterized communication model to predict the performance of OpenMP, MPI and hybrid applications with weak scaling on three large-scale multicore supercomputers: IBM POWER4, POWER5+ and BlueGene/P, and analyze the performance of these MPI, OpenMP and hybrid applications. We use STREAM memory benchmarks and Intel's MPI benchmarks to provide initial performance analysis and model validation of MPI and OpenMP applications on these multicore supercomputers because the measured sustained memory bandwidth can provide insight into the memory bandwidth that a system should sustain on scientific applications with the same amount of workload per core. In addition to using these benchmarks, we also use a weak-scaling hybrid MPI/OpenMP large-scale scientific application: Gyrokinetic Toroidal Code (GTC) in magnetic fusion to validate our performance model of the hybrid application on these multicore supercomputers. The validation results for our performance modeling method show less than 7.77% error rate in predicting the performance of hybrid MPI/OpenMP GTC on up to 512 cores on these multicore supercomputers. © 2013 Elsevier Inc.

  3. Performance modeling of hybrid MPI/OpenMP scientific applications on large-scale multicore supercomputers

    KAUST Repository

    Wu, Xingfu

    2013-12-01

    In this paper, we present a performance modeling framework based on memory bandwidth contention time and a parameterized communication model to predict the performance of OpenMP, MPI and hybrid applications with weak scaling on three large-scale multicore supercomputers: IBM POWER4, POWER5+ and BlueGene/P, and analyze the performance of these MPI, OpenMP and hybrid applications. We use STREAM memory benchmarks and Intel's MPI benchmarks to provide initial performance analysis and model validation of MPI and OpenMP applications on these multicore supercomputers because the measured sustained memory bandwidth can provide insight into the memory bandwidth that a system should sustain on scientific applications with the same amount of workload per core. In addition to using these benchmarks, we also use a weak-scaling hybrid MPI/OpenMP large-scale scientific application: Gyrokinetic Toroidal Code (GTC) in magnetic fusion to validate our performance model of the hybrid application on these multicore supercomputers. The validation results for our performance modeling method show less than 7.77% error rate in predicting the performance of hybrid MPI/OpenMP GTC on up to 512 cores on these multicore supercomputers. © 2013 Elsevier Inc.

  4. A comprehensive study of MPI parallelism in three-dimensional discrete element method (DEM) simulation of complex-shaped granular particles

    Science.gov (United States)

    Yan, Beichuan; Regueiro, Richard A.

    2018-02-01

    A three-dimensional (3D) DEM code for simulating complex-shaped granular particles is parallelized using the message-passing interface (MPI). The concepts of link-block, ghost/border layer, and migration layer are put forward for the design of the parallel algorithm, and theoretical functions for 3-D DEM scalability and memory usage are derived. Many performance-critical implementation details are managed optimally to achieve high performance and scalability, such as minimizing communication overhead, maintaining dynamic load balance, handling particle migrations across block borders, transmitting C++ dynamic objects of particles between MPI processes efficiently, and eliminating redundant contact information between adjacent MPI processes. The code executes on multiple US Department of Defense (DoD) supercomputers and was tested on up to 2048 compute nodes for simulating 10 million three-axis ellipsoidal particles. Performance analyses of the code, including speedup, efficiency, scalability, and granularity across five orders of magnitude of simulation scale (number of particles), are provided, and they demonstrate high speedup and excellent scalability. It is also discovered that communication time is a decreasing function of the number of compute nodes in strong scaling measurements. The code's capability of simulating a large number of complex-shaped particles on modern supercomputers will be of value in both laboratory studies on micromechanical properties of granular materials and many realistic engineering applications involving granular materials.

  5. CORBA and MPI-based 'backbone' for coupling advanced simulation tools

    International Nuclear Information System (INIS)

    Seydaliev, M.; Caswell, D.

    2014-01-01

    There is a growing international interest in using coupled, multidisciplinary computer simulations for a variety of purposes, including nuclear reactor safety analysis. Reactor behaviour can be modeled using a suite of computer programs simulating phenomena or predicting parameters that can be categorized into disciplines such as Thermalhydraulics, Neutronics, Fuel, Fuel Channels, Fission Product Release and Transport, Containment and Atmospheric Dispersion, and Severe Accident Analysis. Traditionally, simulations used for safety analysis individually addressed only the behaviour within a single discipline, based upon static input data from other simulation programs. The limitation of using a suite of stand-alone simulations is that phenomenological interdependencies or temporal feedback between the parameters calculated within individual simulations cannot be adequately captured. To remove this shortcoming, multiple computer simulations for different disciplines must exchange data during runtime to address these interdependencies. This article describes the concept of a new framework, which we refer to as the 'Backbone', to provide the necessary runtime exchange of data. The Backbone, currently under development at AECL for a preliminary feasibility study, is a hybrid design using features taken from the Common Object Request Broker Architecture (CORBA), a standard defined by the Object Management Group, and the Message Passing Interface (MPI), a standard developed by a group of researchers from academia and industry. Both have well-tested and efficient implementations, including some that are freely available under the GNU public licenses. The CORBA component enables individual programs written in different languages and running on different platforms within a network to exchange data with each other, thus behaving like a single application. MPI provides the process-to-process intercommunication between these programs. This paper outlines the different CORBA and

  6. Time Domain Terahertz Axial Computed Tomography Non Destructive Evaluation, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — We propose to demonstrate key elements of feasibility for a high speed automated time domain terahertz computed axial tomography (TD-THz CT) non destructive...

  7. Automatic Migration from PARMACS to MPI in Parallel Fortran Applications

    Directory of Open Access Journals (Sweden)

    Rolf Hempel

    1999-01-01

    The PARMACS message passing interface has been in widespread use by application projects, especially in Europe. With the new MPI standard for message passing, many projects face the problem of replacing PARMACS with MPI. An automatic translation tool has been developed which replaces all PARMACS 6.0 calls in an application program with their corresponding MPI calls. In this paper we describe the mapping of the PARMACS programming model onto MPI. We then present some implementation details of the converter tool.

  8. Time Domain Terahertz Axial Computed Tomography Non Destructive Evaluation, Phase II

    Data.gov (United States)

    National Aeronautics and Space Administration — In this Phase 2 project, we propose to develop, construct, and deliver to NASA a computed axial tomography time-domain terahertz (CT TD-THz) non destructive...

  9. Parallel Fortran-MPI software for numerical inversion of the Laplace transform and its application to oscillatory water levels in groundwater environments

    Science.gov (United States)

    Zhan, X.

    2005-01-01

    A parallel Fortran-MPI (Message Passing Interface) software for numerical inversion of the Laplace transform based on a Fourier series method is developed to meet the need of solving intensive computational problems involving oscillatory water levels' response to hydraulic tests in a groundwater environment. The software is a parallel version of ACM (The Association for Computing Machinery) Transactions on Mathematical Software (TOMS) Algorithm 796. Running 38 test examples indicated that implementation of MPI techniques with a distributed memory architecture speeds up the processing and improves the efficiency. Applications to oscillatory water levels in a well during aquifer tests are presented to illustrate how this package can be applied to solve complicated environmental problems involving differential and integral equations. The package is free and is easy to use for people with little or no previous experience in using MPI but who wish to get off to a quick start in parallel computing. © 2004 Elsevier Ltd. All rights reserved.

  10. Psychometric evaluation of the Spanish version of the MPI-SCI.

    Science.gov (United States)

    Soler, M D; Cruz-Almeida, Y; Saurí, J; Widerström-Noga, E G

    2013-07-01

    Postal surveys. To confirm the factor structure of the Spanish version of the MPI-SCI (MPI-SCI-S, Multidimensional Pain Inventory in the SCI population) and to test its internal consistency and construct validity in a Spanish population. Guttmann Institute, Barcelona, Spain. The MPI-SCI-S along with Spanish measures of pain intensity (Numerical Rating Scale), pain interference (Brief Pain Inventory), functional independence (Functional Independence Measure), depression (Beck Depression Inventory), locus of control (Multidimensional health Locus of Control), support (Functional Social Support Questionnaire (Duke-UNC)), psychological well-being (Psychological Global Well-Being Index) and demographic/injury characteristics were assessed in persons with spinal cord injury (SCI) and chronic pain (n=126). Confirmatory factor analysis suggested an adequate factor structure for the MPI-SCI-S. The internal consistency of the MPI-SCI-S subscales ranged from acceptable (r=0.66, Life Control) to excellent (r=0.94, Life Interference). All MPI-SCI-S subscales showed adequate construct validity, with the exception of the Negative and Solicitous Responses subscales. The Spanish version of the MPI-SCI is adequate for evaluating chronic pain impact following SCI in a Spanish-speaking population. Future studies should include additional measures of pain-related support in the Spanish-speaking SCI population.

  11. Performance Comparison of OpenMP, MPI, and MapReduce in Practical Problems

    Directory of Open Access Journals (Sweden)

    Sol Ji Kang

    2015-01-01

    With problem size and complexity increasing, several parallel and distributed programming models and frameworks have been developed to efficiently handle such problems. This paper briefly reviews the parallel computing models and describes three widely recognized parallel programming frameworks: OpenMP, MPI, and MapReduce. OpenMP is the de facto standard for parallel programming on shared memory systems. MPI is the de facto industry standard for distributed memory systems. The MapReduce framework has become the de facto standard for large-scale data-intensive applications. Qualitative pros and cons of each framework are known, but quantitative performance indexes help to get a clear picture of which framework to use for a given application. As benchmark problems to compare the frameworks, two problems are chosen: the all-pairs-shortest-path problem and a data join problem. This paper presents parallel programs for the problems implemented on the three frameworks, respectively. It shows the experimental results on a cluster of computers. It also discusses which is the right tool for the job by analyzing the characteristics and performance of the paradigms.
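
    For instance, the all-pairs-shortest-path benchmark is commonly implemented as the Floyd-Warshall algorithm; a minimal OpenMP version in C might look as follows (a sketch under an assumed data layout, not the paper's code):

      #include <omp.h>

      #define N 1024
      static double dist[N][N];   /* adjacency matrix, "infinity" where
                                     no edge exists                     */

      void floyd_warshall(void)
      {
          for (int k = 0; k < N; k++) {
              /* for a fixed k the rows are independent, so split them
                 across threads */
              #pragma omp parallel for
              for (int i = 0; i < N; i++)
                  for (int j = 0; j < N; j++)
                      if (dist[i][k] + dist[k][j] < dist[i][j])
                          dist[i][j] = dist[i][k] + dist[k][j];
          }
      }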

  12. Parallel computing in cluster of GPU applied to a problem of nuclear engineering

    International Nuclear Information System (INIS)

    Moraes, Sergio Ricardo S.; Heimlich, Adino; Resende, Pedro

    2013-01-01

    Cluster computing has been widely used as a low-cost alternative for parallel processing in scientific applications. With the use of the Message-Passing Interface (MPI) protocol, development became even more accessible and widespread in the scientific community. A more recent trend is the use of the Graphics Processing Unit (GPU), a powerful co-processor able to perform hundreds of instructions in parallel, reaching hundreds of times the processing capacity of a CPU. However, a standard PC does not allow, in general, more than two GPUs. Hence, this work proposes the development and evaluation of a hybrid, low-cost parallel approach to the solution of a typical nuclear engineering problem. The idea is to use cluster parallelism technology (MPI) together with GPU programming techniques (CUDA - Compute Unified Device Architecture) to simulate neutron transport through a slab using the Monte Carlo method. Using a cluster comprising four quad-core computers with two GPUs each, programs were developed with MPI and CUDA technologies. Experiments applying different configurations, from 1 to 8 GPUs, were performed and the results were compared with the sequential (non-parallel) version. A speedup of about 2,000 times was observed when comparing the 8-GPU configuration with the sequential version. The results presented here are discussed and analyzed with the objective of outlining gains and possible limitations of the proposed approach. (author)
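
    A hedged C sketch of the MPI layer of such a hybrid code: each rank simulates its share of neutron histories (delegated to a CUDA kernel in the actual setup, stubbed out here) and a reduction combines the tallies.

      #include <mpi.h>
      #include <stdio.h>

      /* stand-in for the per-rank transport kernel */
      static double simulate_histories(long n, int seed)
      {
          (void)seed;             /* a real kernel would seed its RNG */
          return (double)n;       /* placeholder tally                */
      }

      int main(int argc, char **argv)
      {
          int rank, size;
          long total = 10000000L; /* histories, split across ranks */
          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          double local = simulate_histories(total / size, rank);
          double tally = 0.0;
          MPI_Reduce(&local, &tally, 1, MPI_DOUBLE, MPI_SUM, 0,
                     MPI_COMM_WORLD);

          if (rank == 0)
              printf("combined tally: %f\n", tally);
          MPI_Finalize();
          return 0;
      }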

  13. Experiences Using Hybrid MPI/OpenMP in the Real World: Parallelization of a 3D CFD Solver for Multi-Core Node Clusters

    Directory of Open Access Journals (Sweden)

    Gabriele Jost

    2010-01-01

    Today most systems in high-performance computing (HPC) feature a hierarchical hardware design: shared-memory nodes with several multi-core CPUs are connected via a network infrastructure. When parallelizing an application for these architectures it seems natural to employ a hierarchical programming model such as combining MPI and OpenMP. Nevertheless, there is the general lore that pure MPI outperforms the hybrid MPI/OpenMP approach. In this paper, we describe the hybrid MPI/OpenMP parallelization of IR3D (Incompressible Realistic 3-D) code, a full-scale real-world application, which simulates the environmental effects on the evolution of vortices trailing behind control surfaces of underwater vehicles. We discuss performance, scalability and limitations of the pure MPI version of the code on a variety of hardware platforms and show how the hybrid approach can help to overcome certain limitations.

  14. Performance Characteristics of Hybrid MPI/OpenMP Implementations of NAS Parallel Benchmarks SP and BT on Large-Scale Multicore Clusters

    KAUST Repository

    Wu, X.; Taylor, V.

    2011-01-01

    The NAS Parallel Benchmarks (NPB) are well-known applications with fixed algorithms for evaluating parallel systems and tools. Multicore clusters provide a natural programming paradigm for hybrid programs, whereby OpenMP can be used for data sharing among the cores that comprise a node, and MPI can be used for the communication between nodes. In this paper, we use the Scalar Pentadiagonal (SP) and Block Tridiagonal (BT) benchmarks of MPI NPB 3.3 as a basis for a comparative approach to implement hybrid MPI/OpenMP versions of SP and BT. In particular, we compare the performance of the hybrid SP and BT with their MPI counterparts on large-scale multicore clusters, Intrepid (BlueGene/P) at Argonne National Laboratory and Jaguar (Cray XT4/5) at Oak Ridge National Laboratory. Our performance results indicate that the hybrid SP outperforms the MPI SP by up to 20.76%, and the hybrid BT outperforms the MPI BT by up to 8.58%, on up to 10 000 cores on Intrepid and Jaguar. We also use performance tools and MPI trace libraries available on these clusters to further investigate the performance characteristics of the hybrid SP and BT. © 2011 The Author. Published by Oxford University Press on behalf of The British Computer Society. All rights reserved.

  15. Performance Characteristics of Hybrid MPI/OpenMP Implementations of NAS Parallel Benchmarks SP and BT on Large-Scale Multicore Clusters

    KAUST Repository

    Wu, X.

    2011-07-18

    The NAS Parallel Benchmarks (NPB) are well-known applications with fixed algorithms for evaluating parallel systems and tools. Multicore clusters provide a natural programming paradigm for hybrid programs, whereby OpenMP can be used for data sharing among the cores that comprise a node, and MPI can be used for the communication between nodes. In this paper, we use the Scalar Pentadiagonal (SP) and Block Tridiagonal (BT) benchmarks of MPI NPB 3.3 as a basis for a comparative approach to implement hybrid MPI/OpenMP versions of SP and BT. In particular, we compare the performance of the hybrid SP and BT with their MPI counterparts on large-scale multicore clusters, Intrepid (BlueGene/P) at Argonne National Laboratory and Jaguar (Cray XT4/5) at Oak Ridge National Laboratory. Our performance results indicate that the hybrid SP outperforms the MPI SP by up to 20.76%, and the hybrid BT outperforms the MPI BT by up to 8.58%, on up to 10 000 cores on Intrepid and Jaguar. We also use performance tools and MPI trace libraries available on these clusters to further investigate the performance characteristics of the hybrid SP and BT. © 2011 The Author. Published by Oxford University Press on behalf of The British Computer Society. All rights reserved.

  16. Memory Compression Techniques for Network Address Management in MPI

    Energy Technology Data Exchange (ETDEWEB)

    Guo, Yanfei; Archer, Charles J.; Blocksome, Michael; Parker, Scott; Bland, Wesley; Raffenetti, Ken; Balaji, Pavan

    2017-05-29

    MPI allows applications to treat processes as a logical collection of integer ranks for each MPI communicator, while internally translating these logical ranks into actual network addresses. In current MPI implementations the management and lookup of such network addresses use memory sizes that are proportional to the number of processes in each communicator. In this paper, we propose a new mechanism, called AV-Rankmap, for managing such translation. AV-Rankmap takes advantage of logical patterns in rank-address mapping that most applications naturally tend to have, and it exploits the fact that some parts of network address structures are naturally more performance critical than others. It uses this information to compress the memory used for network address management. We demonstrate that AV-Rankmap can achieve performance similar to or better than that of other MPI implementations while using significantly less memory.
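
    The underlying idea can be sketched as follows (a hedged illustration of the general compression technique, not the AV-Rankmap implementation): when the rank-to-address table happens to follow a regular pattern, a closed form replaces the per-rank entries.

      #include <stdbool.h>
      #include <stdlib.h>

      typedef struct {
          bool  affine;        /* table follows base + rank * stride?   */
          long  base, stride;  /* compressed representation             */
          long *table;         /* fallback: one explicit entry per rank */
          int   n;
      } rankmap;

      /* replace the table by (base, stride) if the mapping is affine */
      bool try_compress(rankmap *m)
      {
          if (m->n < 2) return false;
          long base = m->table[0], stride = m->table[1] - m->table[0];
          for (int r = 2; r < m->n; r++)
              if (m->table[r] != base + r * stride) return false;
          m->affine = true;
          m->base = base;
          m->stride = stride;
          free(m->table);
          m->table = NULL;
          return true;
      }

      long lookup(const rankmap *m, int rank)
      {
          return m->affine ? m->base + rank * m->stride : m->table[rank];
      }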

  17. MPI@LHC Talk.

    CERN Document Server

    AUTHOR|(INSPIRE)INSPIRE-00392933; The ATLAS collaboration

    2016-01-01

    Draft version of a talk for MPI@LHC on the topic of "Monte Carlo Tuning @ ATLAS". The talk introduces the event generator chain, concepts of tuning, and issues with overtuning, and then proceeds to explain 3 (4) tunes performed at ATLAS. A 4th tune known as A15-MG5aMC@NLO(-TTBAR) is also included, but is awaiting note approval.

  18. Practical Formal Verification of MPI and Thread Programs

    Science.gov (United States)

    Gopalakrishnan, Ganesh; Kirby, Robert M.

    Large-scale simulation codes in science and engineering are written using the Message Passing Interface (MPI). Shared memory threads are widely used directly, or to implement higher level programming abstractions. Traditional debugging methods for MPI or thread programs are incapable of providing useful formal guarantees about coverage. They get bogged down in the sheer number of interleavings (schedules), often missing shallow bugs. In this tutorial we will introduce two practical formal verification tools: ISP (for MPI C programs) and Inspect (for Pthread C programs). Unlike other formal verification tools, ISP and Inspect run directly on user source codes (much like a debugger). They pursue only the relevant set of process interleavings, using our own customized Dynamic Partial Order Reduction (DPOR) algorithms. For a given test harness, DPOR allows these tools to guarantee the absence of deadlocks, instrumented MPI object leaks and communication races (using ISP), and shared memory races (using Inspect). ISP and Inspect have been used to verify large pieces of code: in excess of 10,000 lines of MPI/C for ISP in under 5 seconds, and about 5,000 lines of Pthread/C code in a few hours (and much faster with the use of a cluster or by exploiting special cases such as symmetry) for Inspect. We will also demonstrate the Microsoft Visual Studio and Eclipse Parallel Tools Platform integrations of ISP (these will be available on the LiveCD).

  19. McMPI – a managed-code message passing interface library for high performance communication in C#

    OpenAIRE

    Holmes, Daniel John

    2012-01-01

    This work endeavours to achieve technology transfer between established best-practice in academic high-performance computing and current techniques in commercial high-productivity computing. It shows that a credible high-performance message-passing communication library, with semantics and syntax following the Message-Passing Interface (MPI) Standard, can be built in pure C# (one of the .Net suite of computer languages). Message-passing has been the dominant paradigm in high-pe...

  20. Performance Modeling of Hybrid MPI/OpenMP Scientific Applications on Large-scale Multicore Cluster Systems

    KAUST Repository

    Wu, Xingfu; Taylor, Valerie

    2011-01-01

    In this paper, we present a performance modeling framework based on memory bandwidth contention time and a parameterized communication model to predict the performance of OpenMP, MPI and hybrid applications with weak scaling on three large-scale multicore clusters: IBM POWER4, POWER5+ and Blue Gene/P, and analyze the performance of these MPI, OpenMP and hybrid applications. We use STREAM memory benchmarks to provide initial performance analysis and model validation of MPI and OpenMP applications on these multicore clusters because the measured sustained memory bandwidth can provide insight into the memory bandwidth that a system should sustain on scientific applications with the same amount of workload per core. In addition to using these benchmarks, we also use a weak-scaling hybrid MPI/OpenMP large-scale scientific application: the Gyrokinetic Toroidal Code (GTC) in magnetic fusion to validate our performance model of the hybrid application on these multicore clusters. The validation results for our performance modeling method show less than 7.77% error rate in predicting the performance of hybrid MPI/OpenMP GTC on up to 512 cores on these multicore clusters. © 2011 IEEE.

  1. Performance Modeling of Hybrid MPI/OpenMP Scientific Applications on Large-scale Multicore Cluster Systems

    KAUST Repository

    Wu, Xingfu

    2011-08-01

    In this paper, we present a performance modeling framework based on memory bandwidth contention time and a parameterized communication model to predict the performance of OpenMP, MPI and hybrid applications with weak scaling on three large-scale multicore clusters: IBM POWER4, POWER5+ and Blue Gene/P, and analyze the performance of these MPI, OpenMP and hybrid applications. We use STREAM memory benchmarks to provide initial performance analysis and model validation of MPI and OpenMP applications on these multicore clusters because the measured sustained memory bandwidth can provide insight into the memory bandwidth that a system should sustain on scientific applications with the same amount of workload per core. In addition to using these benchmarks, we also use a weak-scaling hybrid MPI/OpenMP large-scale scientific application: the Gyrokinetic Toroidal Code (GTC) in magnetic fusion to validate our performance model of the hybrid application on these multicore clusters. The validation results for our performance modeling method show less than 7.77% error rate in predicting the performance of hybrid MPI/OpenMP GTC on up to 512 cores on these multicore clusters. © 2011 IEEE.

  2. Numerical discrepancy between serial and MPI parallel computations

    Directory of Open Access Journals (Sweden)

    Sang Bong Lee

    2016-09-01

    Numerical simulations of the 1D Burgers equation and a 2D sloshing problem were carried out to study the numerical discrepancy between serial and parallel computations. The numerical domain was decomposed into 2 and 4 subdomains for parallel computations with the message passing interface. The numerical solution of the Burgers equation disclosed that the fully explicit boundary conditions used on the subdomains of the parallel computation were responsible for the numerical discrepancy of the transient solution between serial and parallel computations. Two-dimensional sloshing problems in a rectangular domain were solved using OpenFOAM. After a lapse of the initial transient time, the sloshing patterns of water were significantly different in serial and parallel computations, although the same numerical conditions were given. Based on the histograms of pressure measured at two points near the wall, the statistical characteristics of the numerical solution were not affected by the number of subdomains as much as the transient solution was.

  3. Solution of finite element problems using hybrid parallelization with MPI and OpenMP

    Directory of Open Access Journals (Sweden)

    José Miguel Vargas-Félix

    2012-11-01

    The Finite Element Method (FEM) is used to solve problems like solid deformation and heat diffusion in domains with complex geometries. Such geometries require discretization with millions of elements, which is equivalent to solving systems of equations with sparse matrices and tens or hundreds of millions of variables. The aim is to use computer clusters to solve these systems. The solution method used is Schur substructuring, which makes it possible to divide a large system of equations into many small ones that can be solved more efficiently. This method allows parallelization. MPI (Message Passing Interface) is used to distribute the systems of equations, each being solved on one computer of a cluster. Each system of equations is solved using a solver implemented with OpenMP as a local parallelization method.
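
    As a brief illustration of the Schur substructuring step described above (standard notation, assumed rather than taken from the paper): partitioning a subdomain's unknowns into interior (I) and boundary (B) blocks gives

      \begin{pmatrix} A_{II} & A_{IB} \\ A_{BI} & A_{BB} \end{pmatrix}
      \begin{pmatrix} x_I \\ x_B \end{pmatrix}
      = \begin{pmatrix} f_I \\ f_B \end{pmatrix},

    and eliminating the interior unknowns yields the interface system

      \left( A_{BB} - A_{BI} A_{II}^{-1} A_{IB} \right) x_B
      = f_B - A_{BI} A_{II}^{-1} f_I .

    Each MPI process can form its subdomain's contribution to the interface operator independently, applying A_{II}^{-1} with its OpenMP-parallel local solver, so only the much smaller interface system couples the processes.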

  4. Magnetic particle imaging: advancements and perspectives for real-time in vivo monitoring and image-guided therapy

    Science.gov (United States)

    Pablico-Lansigan, Michele H.; Situ, Shu F.; Samia, Anna Cristina S.

    2013-05-01

    Magnetic particle imaging (MPI) is an emerging biomedical imaging technology that allows the direct quantitative mapping of the spatial distribution of superparamagnetic iron oxide nanoparticles. MPI's increased sensitivity and short image acquisition times foster the creation of tomographic images with high temporal and spatial resolution. The contrast and sensitivity of MPI is envisioned to transcend those of other medical imaging modalities presently used, such as magnetic resonance imaging (MRI), X-ray scans, ultrasound, computed tomography (CT), positron emission tomography (PET) and single photon emission computed tomography (SPECT). In this review, we present an overview of the recent advances in the rapidly developing field of MPI. We begin with a basic introduction of the fundamentals of MPI, followed by some highlights over the past decade of the evolution of strategies and approaches used to improve this new imaging technique. We also examine the optimization of iron oxide nanoparticle tracers used for imaging, underscoring the importance of size homogeneity and surface engineering. Finally, we present some future research directions for MPI, emphasizing the novel and exciting opportunities that it offers as an important tool for real-time in vivo monitoring. All these opportunities and capabilities that MPI presents are now seen as potential breakthrough innovations in timely disease diagnosis, implant monitoring, and image-guided therapeutics.

  5. Automatic translation of MPI source into a latency-tolerant, data-driven form

    International Nuclear Information System (INIS)

    Nguyen, Tan; Cicotti, Pietro; Bylaska, Eric; Quinlan, Dan; Baden, Scott

    2017-01-01

    Hiding communication behind useful computation is an important performance programming technique but remains an inscrutable programming exercise even for the expert. We present Bamboo, a code transformation framework that can realize communication overlap in applications written in MPI without the need to intrusively modify the source code. We reformulate MPI source into a task dependency graph representation, which partially orders the tasks, enabling the program to execute in a data-driven fashion under the control of an external runtime system. Experimental results demonstrate that Bamboo significantly reduces communication delays while requiring only modest amounts of programmer annotation for a variety of applications and platforms, including those employing co-processors and accelerators. Moreover, Bamboo’s performance meets or exceeds that of labor-intensive hand coding. As a result, the translator is more than a means of hiding communication costs automatically; it demonstrates the utility of semantic level optimization against a well-known library.

  6. The Research of the Parallel Computing Development from the Angle of Cloud Computing

    Science.gov (United States)

    Peng, Zhensheng; Gong, Qingge; Duan, Yanyu; Wang, Yun

    2017-10-01

    Cloud computing is the development of parallel computing, distributed computing and grid computing. The development of cloud computing has brought parallel computing into people's lives. First, this paper expounds the concept of cloud computing and introduces several traditional parallel programming models. Second, it analyzes and studies the principles, advantages and disadvantages of OpenMP, MPI and MapReduce, respectively. Finally, it compares the MPI and OpenMP models with MapReduce from the angle of cloud computing. The results of this paper are intended to provide a reference for the development of parallel computing.

  7. Performance Analysis of Ivshmem for High-Performance Computing in Virtual Machines

    Science.gov (United States)

    Ivanovic, Pavle; Richter, Harald

    2018-01-01

    High-performance computing (HPC) is rarely accomplished via virtual machines (VMs). In this paper, we present a remake of ivshmem which can change this. Ivshmem provided a shared memory (SHM) between virtual machines on the same server, with SHM-access synchronization included, until about 5 years ago, when newer versions of Linux and its virtualization library libvirt evolved and the feature was lost. We restored that SHM-access synchronization feature because it is indispensable for HPC, and made ivshmem runnable with contemporary versions of Linux, libvirt, KVM, QEMU and especially MPICH, which is an implementation of MPI, the standard HPC communication library. Additionally, MPICH was transparently modified by us to get ivshmem included, resulting in a three- to ten-times performance improvement compared to TCP/IP. Furthermore, we have transparently replaced MPI_Put, a one-sided MPICH communication mechanism, with our own MPI_Put wrapper. As a result, our ivshmem even surpasses non-virtualized SHM data transfers for block lengths greater than 512 KBytes, showing the benefits of virtualization. All improvements were possible without using SR-IOV.
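
    For context, MPI_Put is the one-sided operation the authors wrapped; a minimal standard-MPI usage sketch (plain MPI, not the modified MPICH; run with at least two ranks):

      #include <mpi.h>
      #include <stdio.h>

      int main(int argc, char **argv)
      {
          int rank;
          double local = 3.14, winbuf = 0.0;
          MPI_Win win;

          MPI_Init(&argc, &argv);
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          /* every rank exposes one double through the window */
          MPI_Win_create(&winbuf, sizeof(double), sizeof(double),
                         MPI_INFO_NULL, MPI_COMM_WORLD, &win);

          MPI_Win_fence(0, win);
          if (rank == 0)  /* write into rank 1's memory; no receive needed */
              MPI_Put(&local, 1, MPI_DOUBLE, 1, 0, 1, MPI_DOUBLE, win);
          MPI_Win_fence(0, win);

          if (rank == 1)
              printf("received %f via MPI_Put\n", winbuf);

          MPI_Win_free(&win);
          MPI_Finalize();
          return 0;
      }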

  8. Accelerating Climate Simulations Through Hybrid Computing

    Science.gov (United States)

    Zhou, Shujia; Sinno, Scott; Cruz, Carlos; Purcell, Mark

    2009-01-01

    Unconventional multi-core processors (e.g., IBM Cell B/E and NVIDIA GPUs) have emerged as accelerators in climate simulation. However, climate models typically run on parallel computers with conventional processors (e.g., Intel and AMD) using MPI. Connecting accelerators to this architecture efficiently and easily becomes a critical issue. When using MPI for the connection, we identified two challenges: (1) an identical MPI implementation is required on both systems, and (2) existing MPI code must be modified to accommodate the accelerators. In response, we have extended and deployed IBM Dynamic Application Virtualization (DAV) in a hybrid computing prototype system (one blade with two Intel quad-core processors, two IBM QS22 Cell blades, connected with Infiniband), allowing for seamlessly offloading compute-intensive functions to remote, heterogeneous accelerators in a scalable, load-balanced manner. Currently, a climate solar radiation model running with multiple MPI processes has been offloaded to multiple Cell blades with approximately 10% network overhead.

  9. Non-Causal Computation

    Directory of Open Access Journals (Sweden)

    Ämin Baumeler

    2017-07-01

    Computation models such as circuits describe sequences of computation steps that are carried out one after the other. In other words, algorithm design is traditionally subject to the restriction imposed by a fixed causal order. We address a novel computing paradigm beyond quantum computing, replacing this assumption by mere logical consistency: We study non-causal circuits, where a fixed time structure within a gate is locally assumed whilst the global causal structure between the gates is dropped. We present examples of logically consistent non-causal circuits outperforming all causal ones; they imply that suppressing loops entirely is more restrictive than just avoiding the contradictions they can give rise to. That fact is already known for correlations as well as for communication, and we here extend it to computation.

  10. Hybrid cloud and cluster computing paradigms for life science applications.

    Science.gov (United States)

    Qiu, Judy; Ekanayake, Jaliya; Gunarathne, Thilina; Choi, Jong Youl; Bae, Seung-Hee; Li, Hui; Zhang, Bingjing; Wu, Tak-Lon; Ruan, Yang; Ekanayake, Saliya; Hughes, Adam; Fox, Geoffrey

    2010-12-21

    Clouds and MapReduce have shown themselves to be a broadly useful approach to scientific computing, especially for parallel data-intensive applications. However, they have limited applicability to some areas, such as data mining, because MapReduce performs poorly on problems with the iterative structure present in the linear algebra that underlies much data analysis. Such problems can be run efficiently on clusters using MPI, leading to a hybrid cloud and cluster environment. This motivates the design and implementation of an open source Iterative MapReduce system, Twister. Comparisons of Amazon, Azure, and traditional Linux and Windows environments on common applications have shown encouraging performance and usability in several important non-iterative cases. These are linked to MPI applications for the final stages of the data analysis. Further, we have released the open source Twister Iterative MapReduce and benchmarked it against basic MapReduce (Hadoop) and MPI in information retrieval and life sciences applications. The hybrid cloud (MapReduce) and cluster (MPI) approach offers an attractive production environment, while Twister promises a uniform programming environment for many life sciences applications. We used the commercial clouds Amazon and Azure and the NSF resource FutureGrid to perform detailed comparisons and evaluations of different approaches to data-intensive computing. Several applications were developed in MPI, MapReduce and Twister in these different environments.

  11. Experiences implementing the MPI standard on Sandia`s lightweight kernels

    Energy Technology Data Exchange (ETDEWEB)

    Brightwell, R.; Greenberg, D.S.

    1997-10-01

    This technical report describes some lessons learned from implementing the Message Passing Interface (MPI) standard, and some proposed extensions to MPI, at Sandia. The implementations were developed using Sandia-developed lightweight kernels running on the Intel Paragon and Intel TeraFLOPS platforms. The motivations for this research are discussed, and a detailed analysis of several implementation issues is presented.

  12. Magnetic Particle Imaging for Real-Time Perfusion Imaging in Acute Stroke.

    Science.gov (United States)

    Ludewig, Peter; Gdaniec, Nadine; Sedlacik, Jan; Forkert, Nils D; Szwargulski, Patryk; Graeser, Matthias; Adam, Gerhard; Kaul, Michael G; Krishnan, Kannan M; Ferguson, R Matthew; Khandhar, Amit P; Walczak, Piotr; Fiehler, Jens; Thomalla, Götz; Gerloff, Christian; Knopp, Tobias; Magnus, Tim

    2017-10-24

    The fast and accurate assessment of cerebral perfusion is fundamental for the diagnosis and successful treatment of stroke patients. Magnetic particle imaging (MPI) is a new radiation-free tomographic imaging method with a superior temporal resolution compared to other conventional imaging methods. In addition, MPI scanners can be built as prehospital mobile devices, which require less complex infrastructure than computed tomography (CT) and magnetic resonance imaging (MRI). With these advantages, MPI could accelerate stroke diagnosis and treatment, thereby improving outcomes. Our objective was to investigate the capabilities of MPI to detect perfusion deficits in a murine model of ischemic stroke. Cerebral ischemia was induced by inserting a microfilament into the internal carotid artery of C57BL/6 mice, thereby blocking blood flow into the middle cerebral artery. After the injection of a contrast agent (superparamagnetic iron oxide nanoparticles) specifically tailored for MPI, cerebral perfusion and vascular anatomy were assessed by the MPI scanner within seconds. To validate and compare our MPI data, we performed perfusion imaging with a small-animal MRI scanner. MPI detected the perfusion deficits in the ischemic brain in real time, and they were comparable to those detected with MRI. For the first time, we showed that MPI could be used as a diagnostic tool for relevant diseases in vivo, such as ischemic stroke. Due to its shorter image acquisition times and increased temporal resolution compared with MRI or CT, we expect that MPI offers the potential to improve stroke imaging and treatment.

  13. Proceedings of the first international workshop on multiple partonic interactions at the LHC. MPI'08

    International Nuclear Information System (INIS)

    Bartalini, Paolo; Fano, Livio

    2009-06-01

    The objective of this first workshop on Multiple Partonic Interactions (MPI) at the LHC, which can be regarded as a continuation and extension of the dedicated meetings held at DESY in 2006 and 2007, is to raise the profile of MPI studies, summarizing the legacy of the older phenomenology at hadronic colliders and fostering further specific contacts between the theory and experimental communities. MPI are experiencing growing popularity and are currently widely invoked to account for observations that would otherwise not be explained: the activity of the Underlying Event, the cross sections for multiple heavy-flavour production, the survival probability of large rapidity gaps in hard diffraction, etc. At the same time, the implementation of MPI effects in Monte Carlo models is quickly proceeding through increasing levels of sophistication and complexity, with deep and general implications for LHC physics in prospect. The ultimate ambition of this workshop is to promote MPI as a unifying concept between seemingly heterogeneous research lines and to profit from the complete experimental picture in order to constrain their implementation in the models, evaluating the spin-offs for the LHC physics program. The workshop is structured in five sections, with the first one dedicated to a few selected highlights in high energy physics directly connected to the other ones: Multiple Parton Interactions (in both the soft and the hard regimes), Diffraction, Monte Carlo Generators and Heavy Ions. (orig.)

  14. Proceedings of the first international workshop on multiple partonic interactions at the LHC. MPI'08

    Energy Technology Data Exchange (ETDEWEB)

    Bartalini, Paolo [National Taiwan Univ., Taipei (China); Fano, Livio [Istituto Nazionale di Fisica Nucleare, Perugia (Italy)

    2009-06-15

    The objective of this first workshop on Multiple Partonic Interactions (MPI) at the LHC, which can be regarded as a continuation and extension of the dedicated meetings held at DESY in 2006 and 2007, is to raise the profile of MPI studies, summarizing the legacy of the older phenomenology at hadronic colliders and fostering further specific contacts between the theory and experimental communities. MPI are experiencing growing popularity and are currently widely invoked to account for observations that would otherwise not be explained: the activity of the Underlying Event, the cross sections for multiple heavy-flavour production, the survival probability of large rapidity gaps in hard diffraction, etc. At the same time, the implementation of MPI effects in Monte Carlo models is quickly proceeding through increasing levels of sophistication and complexity, with deep and general implications for LHC physics in prospect. The ultimate ambition of this workshop is to promote MPI as a unifying concept between seemingly heterogeneous research lines and to profit from the complete experimental picture in order to constrain their implementation in the models, evaluating the spin-offs for the LHC physics program. The workshop is structured in five sections, with the first one dedicated to a few selected highlights in high energy physics directly connected to the other ones: Multiple Parton Interactions (in both the soft and the hard regimes), Diffraction, Monte Carlo Generators and Heavy Ions. (orig.)

  15. Implementation and validation of a model of the MPI Stewart platform

    NARCIS (Netherlands)

    Nieuwenhuizen, F.M.; Van Paasen, M.M.; Mulder, M.; Bülthoff, H.H.

    2010-01-01

    A simulated model of the MPI Stewart platform can be used to identify the influence of motion system characteristics on human control behaviour in active closed-loop control experiments on the SIMONA Research Simulator. In this paper, a previously identified model of the MPI Stewart platform was

  16. A Non-Linear Digital Computer Model Requiring Short Computation Time for Studies Concerning the Hydrodynamics of the BWR

    Energy Technology Data Exchange (ETDEWEB)

    Reisch, F; Vayssier, G

    1969-05-15

    This non-linear model serves as one of the blocks in a series of codes to study the transient behaviour of BWR or PWR type reactors. This program is intended to be the hydrodynamic part of the BWR core representation or the hydrodynamic part of the PWR heat exchanger secondary side representation. The equations have been prepared for the CSMP digital simulation language. By using the most suitable integration routine available, the ratio of simulation time to real time is about one on an IBM 360/75 digital computer. Use of the slightly different language DSL/40 on an IBM 7044 computer takes about four times longer. The code has been tested against the Eindhoven loop with satisfactory agreement.

  17. Stampi: a message passing library for distributed parallel computing. User's guide

    International Nuclear Information System (INIS)

    Imamura, Toshiyuki; Koide, Hiroshi; Takemiya, Hiroshi

    1998-11-01

    A new message passing library, Stampi, has been developed to realize computations that span different kinds of parallel computers, using MPI (Message Passing Interface) as the unique communication interface. Stampi is based on the MPI-2 specification. It realizes dynamic process creation on different machines and communication with the spawned processes within the scope of the MPI semantics. Vendors implemented MPI as a closed system within one parallel machine and supported neither function: process creation on, and communication with, external machines. Stampi supports both functions and thus enables distributed parallel computing. Currently Stampi has been implemented on COMPACS (COMplex PArallel Computer System) introduced in CCSE (five parallel computers and one graphic workstation), and any communication among them can be processed. (author)
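
    For illustration, dynamic process creation in the MPI-2 style that Stampi follows looks like the C sketch below; the executable name "child" and the process count are arbitrary assumptions, and in Stampi the spawn may target a different machine.

        /* spawn_parent.c - MPI-2 dynamic process creation (sketch).
           "child" is an assumed executable name; communication with the
           spawned processes goes through an intercommunicator. */
        #include <mpi.h>

        int main(int argc, char **argv)
        {
            MPI_Comm children;
            int rank, errcodes[4], msg = 42;

            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_spawn("child", MPI_ARGV_NULL, 4, MPI_INFO_NULL,
                           0, MPI_COMM_WORLD, &children, errcodes);
            /* Intercommunicator broadcast: the root parent passes MPI_ROOT,
               the other parents MPI_PROC_NULL; children pass root 0. */
            MPI_Bcast(&msg, 1, MPI_INT,
                      rank == 0 ? MPI_ROOT : MPI_PROC_NULL, children);
            MPI_Finalize();
            return 0;
        }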

  18. Automatic Transformation of MPI Programs to Asynchronous, Graph-Driven Form

    Energy Technology Data Exchange (ETDEWEB)

    Baden, Scott B [University of California, San Diego; Weare, John H [University of California, San Diego; Bylaska, Eric J [Pacific Northwest National Laboratory

    2013-04-30

    The goals of this project are to develop new, scalable, high-fidelity algorithms for atomic-level simulations and program transformations that automatically restructure existing applications, enabling them to scale forward to petascale systems and beyond. The techniques enable legacy MPI application code to exploit greater parallelism through increased latency hiding and improved workload assignment. The techniques were successfully demonstrated on high-end scalable systems located at DOE laboratories. Besides the automatic MPI program transformation efforts, the project also developed several new scalable algorithms for ab-initio molecular dynamics, including new massively parallel algorithms for hybrid DFT and new parallel-in-time algorithms for molecular dynamics and ab-initio molecular dynamics. These algorithms were shown to scale to very large numbers of cores, and they were designed to work in the latency-hiding framework developed in this project. The effectiveness of the developments was enhanced by the direct application to real grand-challenge simulation problems covering a wide range of technologically important applications, time scales and accuracies. These included the simulation of the electronic structure of mineral/fluid interfaces, the very accurate simulation of chemical reactions in microsolvated environments, and the simulation of chemical behavior in very large enzyme reactions.
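
    The latency-hiding idea referred to above is, at its core, the overlap of non-blocking communication with computation that does not depend on the incoming data. A minimal hand-written C sketch of the pattern follows (the transformations in this project automate restructurings of this kind; the stencil helpers here are placeholders):

        /* halo_overlap.c - the basic latency-hiding pattern (sketch).
           Post non-blocking halo exchanges first, compute the interior
           while messages are in flight, then finish the boundary. */
        #include <mpi.h>

        /* Placeholders for the real stencil updates. */
        static void compute_interior(double *u, int n) { (void)u; (void)n; }
        static void compute_boundary(double *u, double *halo, int n)
        { (void)u; (void)halo; (void)n; }

        /* left/right are neighbour ranks (MPI_PROC_NULL at the ends). */
        void step(double *u, double *halo, int n,
                  int left, int right, MPI_Comm comm)
        {
            MPI_Request reqs[4];
            MPI_Irecv(&halo[0], 1, MPI_DOUBLE, left,  0, comm, &reqs[0]);
            MPI_Irecv(&halo[1], 1, MPI_DOUBLE, right, 1, comm, &reqs[1]);
            MPI_Isend(&u[0],    1, MPI_DOUBLE, left,  1, comm, &reqs[2]);
            MPI_Isend(&u[n-1],  1, MPI_DOUBLE, right, 0, comm, &reqs[3]);

            compute_interior(u, n);       /* overlap: no halo data needed */

            MPI_Waitall(4, reqs, MPI_STATUSES_IGNORE);
            compute_boundary(u, halo, n); /* now the halo values are valid */
        }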

  19. Proceedings of the first international workshop on multiple partonic interactions at the LHC. MPI'08

    Energy Technology Data Exchange (ETDEWEB)

    Bartalini, Paolo [National Taiwan Univ., Taipei (China); Fano, Livio (eds.) [Istituto Nazionale di Fisica Nucleare, Perugia (Italy)

    2009-06-15

    The objective of this first workshop on Multiple Partonic Interactions (MPI) at the LHC, which can be regarded as a continuation and extension of the dedicated meetings held at DESY in 2006 and 2007, is to raise the profile of MPI studies, summarizing the legacy of the older phenomenology at hadronic colliders and fostering further specific contacts between the theory and experimental communities. MPI are experiencing growing popularity and are currently widely invoked to account for observations that would otherwise not be explained: the activity of the Underlying Event, the cross sections for multiple heavy-flavour production, the survival probability of large rapidity gaps in hard diffraction, etc. At the same time, the implementation of MPI effects in Monte Carlo models is quickly proceeding through increasing levels of sophistication and complexity, with deep and general implications for LHC physics in prospect. The ultimate ambition of this workshop is to promote MPI as a unifying concept between seemingly heterogeneous research lines and to profit from the complete experimental picture in order to constrain their implementation in the models, evaluating the spin-offs for the LHC physics program. The workshop is structured in five sections, with the first one dedicated to a few selected highlights in high energy physics directly connected to the other ones: Multiple Parton Interactions (in both the soft and the hard regimes), Diffraction, Monte Carlo Generators and Heavy Ions. (orig.)

  20. Implementing the PM Programming Language using MPI and OpenMP - a New Tool for Programming Geophysical Models on Parallel Systems

    Science.gov (United States)

    Bellerby, Tim

    2015-04-01

    PM (Parallel Models) is a new parallel programming language specifically designed for writing environmental and geophysical models. The language is intended to enable implementers to concentrate on the science behind the model rather than the details of running on parallel hardware. At the same time PM leaves the programmer in control - all parallelisation is explicit and the parallel structure of any given program may be deduced directly from the code. This paper describes a PM implementation based on the Message Passing Interface (MPI) and Open Multi-Processing (OpenMP) standards, looking at issues involved with translating the PM parallelisation model to MPI/OpenMP protocols and considering performance in terms of the competing factors of finer-grained parallelisation and increased communication overhead. In order to maximise portability, the implementation stays within the MPI 1.3 standard as much as possible, with MPI-2 MPI-IO file handling the only significant exception. Moreover, it does not assume a thread-safe implementation of MPI. PM adopts a two-tier abstract representation of parallel hardware. A PM processor is a conceptual unit capable of efficiently executing a set of language tasks, with a complete parallel system consisting of an abstract N-dimensional array of such processors. PM processors may map to single cores executing tasks using cooperative multi-tasking, to multiple cores or even to separate processing nodes, efficiently sharing tasks using algorithms such as work stealing. While tasks may move between hardware elements within a PM processor, they may not move between processors without specific programmer intervention. Tasks are assigned to processors using a nested parallelism approach, building on ideas from Reyes et al. (2009). The main program owns all available processors. When the program enters a parallel statement then either processors are divided out among the newly generated tasks (number of new tasks number of processors

  1. Thermodynamic behavior of binary mixtures CnMpyNTf2 ionic liquids with primary and secondary alcohols

    International Nuclear Information System (INIS)

    Calvar, N.; Gómez, E.; Domínguez, Á.; Macedo, E.A.

    2012-01-01

    Highlights: ► Osmotic coefficients of alcohols with CnMpyNTf2 (n = 2, 3, 4) are determined. ► Experimental data were correlated with the Extended Pitzer model of Archer and MNRTL. ► Mean molal activity coefficients and excess Gibbs free energies were calculated. ► The results have been interpreted in terms of interactions. - Abstract: In this paper, the osmotic and activity coefficients and vapor pressures of the binary mixtures containing the ionic liquids 1-ethyl-3-methylpyridinium bis(trifluoromethylsulfonyl)imide, C2MpyNTf2, and 1-methyl-3-propylpyridinium bis(trifluoromethylsulfonyl)imide, C3MpyNTf2, with 1-propanol or 2-propanol, and the ionic liquid 1-butyl-3-methylpyridinium bis(trifluoromethylsulfonyl)imide, C4MpyNTf2, with 1-propanol, 2-propanol, 1-butanol or 2-butanol, were determined at T = 323.15 K using the vapor pressure osmometry technique. The influence of the structure of the alcohol and of the ionic liquid on both coefficients and on the vapor pressures is discussed, and a comparison with literature data on binary mixtures containing ionic liquids with different cations and anions is also performed. Besides, the results have been interpreted in terms of solute–solvent and ion–ion interactions. The experimental osmotic coefficients were correlated using the Extended Pitzer model of Archer and the Modified Non-Random Two Liquids model, obtaining standard deviations lower than 0.059 and 0.102, respectively, and the mean molal activity coefficients and the excess Gibbs free energy for the studied mixtures were calculated.

  2. Topology-oblivious optimization of MPI broadcast algorithms on extreme-scale platforms

    KAUST Repository

    Hasanov, Khalid

    2015-11-01

    © 2015 Elsevier B.V. All rights reserved. Significant research has been conducted in collective communication operations, in particular in MPI broadcast, on distributed memory platforms. Most of the research efforts aim to optimize the collective operations for particular architectures by taking into account either their topology or platform parameters. In this work we propose a simple but general approach to optimization of the legacy MPI broadcast algorithms, which are widely used in MPICH and Open MPI. The proposed optimization technique is designed to address the challenge of extreme scale of future HPC platforms. It is based on hierarchical transformation of the traditionally flat logical arrangement of communicating processors. Theoretical analysis and experimental results on IBM BlueGene/P and a cluster of the Grid'5000 platform are presented.
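
    The hierarchical transformation can be pictured as follows: the flat set of processes is split into groups, the message is first broadcast among group leaders and then inside each group. A minimal C sketch under the assumptions that the root is rank 0 and the group size G is a tuning parameter (this illustrates the general idea only, not the paper's exact algorithm):

        /* hier_bcast.c - two-stage broadcast over groups of size G (sketch).
           Assumes the broadcast root is rank 0 of comm. */
        #include <mpi.h>

        int hier_bcast(void *buf, int count, MPI_Datatype type,
                       MPI_Comm comm, int G)
        {
            int rank;
            MPI_Comm group, leaders;

            MPI_Comm_rank(comm, &rank);
            /* Ranks with the same rank/G form one group; the lowest rank
               in each group (rank % G == 0) is its leader. */
            MPI_Comm_split(comm, rank / G, rank, &group);
            MPI_Comm_split(comm, rank % G == 0 ? 0 : MPI_UNDEFINED,
                           rank, &leaders);

            if (leaders != MPI_COMM_NULL) {            /* stage 1: leaders */
                MPI_Bcast(buf, count, type, 0, leaders);
                MPI_Comm_free(&leaders);
            }
            MPI_Bcast(buf, count, type, 0, group);     /* stage 2: groups */
            MPI_Comm_free(&group);
            return MPI_SUCCESS;
        }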

  3. A computational test facility for distributed analysis of gravitational wave signals

    International Nuclear Information System (INIS)

    Amico, P; Bosi, L; Cattuto, C; Gammaitoni, L; Punturo, M; Travasso, F; Vocca, H

    2004-01-01

    In the gravitational wave detector Virgo, the in-time detection of a gravitational wave signal from a coalescing binary stellar system is an intensive computational task. A parallel computing scheme using the message passing interface (MPI) is described. Performance results on a small-scale cluster are reported

  4. Parallel MCNP Monte Carlo transport calculations with MPI

    International Nuclear Information System (INIS)

    Wagner, J.C.; Haghighat, A.

    1996-01-01

    The steady increase in computational performance has made Monte Carlo calculations for large/complex systems possible. However, in order to make these calculations practical, order-of-magnitude increases in performance are necessary. The Monte Carlo method is inherently parallel (particles are simulated independently) and thus has the potential for near-linear speedup with respect to the number of processors. Further, the ever-increasing accessibility of parallel computers, such as workstation clusters, facilitates the practical use of parallel Monte Carlo. Recognizing the nature of the Monte Carlo method and the trends in available computing, the code developers at Los Alamos National Laboratory implemented a message-passing version of the general-purpose Monte Carlo radiation transport code MCNP (version 4A). The PVM package was chosen by the MCNP code developers because it supports a variety of communication networks, several UNIX platforms, and heterogeneous computer systems. This PVM version of MCNP has been shown to produce speedups that approach the number of processors and thus is a very useful tool for transport analysis. Due to software incompatibilities on the local IBM SP2, PVM has not been available, and thus it is not possible to take advantage of this useful tool. Hence, it became necessary to implement an alternative message-passing library package in MCNP. Because the message-passing interface (MPI) is supported on the local system, takes advantage of the high-speed communication switches in the SP2, and is considered to be the emerging standard, it was selected.
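
    The near-linear speedup argument - independent particle histories with only a final reduction - can be illustrated with a toy MPI program in C (a Monte Carlo estimate of pi, not MCNP code; sample counts and seeding are arbitrary):

        /* mc_pi.c - toy parallel Monte Carlo (pi estimation), illustrating
           independent histories plus a single final reduction. Not MCNP. */
        #include <mpi.h>
        #include <stdio.h>
        #include <stdlib.h>

        int main(int argc, char **argv)
        {
            int rank, size;
            long i, n = 1000000, local_hits = 0, hits = 0;

            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            srand(1234u + (unsigned)rank);   /* decorrelate the streams */
            for (i = 0; i < n; ++i) {
                double x = rand() / (double)RAND_MAX;
                double y = rand() / (double)RAND_MAX;
                if (x * x + y * y <= 1.0) ++local_hits;
            }
            /* The only communication: one global sum. */
            MPI_Reduce(&local_hits, &hits, 1, MPI_LONG, MPI_SUM,
                       0, MPI_COMM_WORLD);
            if (rank == 0)
                printf("pi ~ %f (%ld samples)\n",
                       4.0 * (double)hits / ((double)n * size), n * size);
            MPI_Finalize();
            return 0;
        }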

  5. Parallel Monte Carlo simulations on an ARC-enabled computing grid

    International Nuclear Information System (INIS)

    Nilsen, Jon K; Samset, Bjørn H

    2011-01-01

    Grid computing opens new possibilities for running heavy Monte Carlo simulations of physical systems in parallel. The presentation gives an overview of GaMPI, a system for running an MPI-based random walker simulation on grid resources. Integrating the ARC middleware and the new storage system Chelonia with the Ganga grid job submission and control system, we show that MPI jobs can be run on a world-wide computing grid with good performance and promising scaling properties. Results for relatively communication-heavy Monte Carlo simulations run on multiple heterogeneous, ARC-enabled computing clusters in several countries are presented.

  6. Novel magnetic multicore nanoparticles designed for MPI and other biomedical applications: From synthesis to first in vivo studies.

    Directory of Open Access Journals (Sweden)

    Harald Kratz

    Full Text Available Synthesis of novel magnetic multicore particles (MCP) in the nano range involves alkaline precipitation of iron(II) chloride in the presence of atmospheric oxygen. This step yields green rust, which is oxidized to obtain magnetic nanoparticles that probably consist of a magnetite/maghemite mixed phase. Final growth and annealing at 90°C in the presence of a large excess of carboxymethyl dextran gives MCP with very promising magnetic properties for magnetic particle imaging (MPI), an emerging medical imaging modality, and for magnetic resonance imaging (MRI). The magnetic nanoparticles are biocompatible and thus potential candidates for future biomedical applications such as cardiovascular imaging, sentinel lymph node mapping in cancer patients, and stem cell tracking. The new MCP that we introduce here have three times higher magnetic particle spectroscopy (MPS) performance at lower and middle harmonics and five times higher MPS signal strength at higher harmonics compared with Resovist®. In addition, the new MCP also have an improved in vivo MPI performance compared to Resovist®, and we here report the first in vivo MPI investigation of this new generation of magnetic nanoparticles.

  7. Coupling of THALES and FROST using MPI Method

    International Nuclear Information System (INIS)

    Park, Jin Woo; Ryu, Seok Hee; Jung, Chan Do; Jung, Jee Hoon; Um, Kil Sup; Lee, Jae Il

    2013-01-01

    This paper presents the coupling method between THALES and FROST and the simulation results obtained with the coupled code system. In this study, the subchannel analysis code THALES and the transient fuel performance code FROST were coupled using the MPI method as the first stage of the development of the multi-dimensional safety analysis methodology. As a part of the validation, the CEA ejection accident was simulated using the coupled THALES-FROST code and the results were compared with the ShinKori 3 and 4 FSAR. The comparison revealed that the coupled code using the MPI method predicts fuel temperatures and heat flux quantitatively well. Thus it was confirmed that THALES and FROST are properly coupled. In the near future, ASTRA, a multi-dimensional core neutron kinetics code, will be linked to the THALES-FROST code for detailed three-dimensional CEA ejection analysis. The current safety analysis methodology for a CEA ejection accident, based on numerous conservative assumptions with the point kinetics model, results in quite adverse consequences. Thus, KNF is developing the multi-dimensional safety analysis methodology to improve the predicted consequences of the CEA ejection accident. For this purpose, the three-dimensional core neutron kinetics code ASTRA, the subchannel analysis code THALES, and the transient fuel performance analysis code FROST are being coupled using the message passing interface (MPI). As the first step, THALES and FROST have been coupled and tested.
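
    As an illustration of what such a coupling boils down to at the MPI level, each code can exchange boundary/feedback fields with its peer once per time step. The C sketch below is generic; the array names and the use of a single MPI_Sendrecv are assumptions, not the actual THALES-FROST interface:

        /* couple_step.c - generic per-time-step field exchange between two
           coupled codes (sketch; array names are assumptions). */
        #include <mpi.h>

        /* Send our heat-flux array, receive fuel temperatures from the
           peer code, then advance the local solver one time step. */
        void couple_step(double *q, double *tfuel, int n,
                         int peer, MPI_Comm comm)
        {
            MPI_Sendrecv(q,     n, MPI_DOUBLE, peer, 0,
                         tfuel, n, MPI_DOUBLE, peer, 0,
                         comm, MPI_STATUS_IGNORE);
            /* ...advance the local solution using the received field... */
        }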

  8. Evaluation of iron oxide nanoparticle micelles for Magnetic Particle Imaging (MPI) of thrombosis

    NARCIS (Netherlands)

    Starmans, L.W.E.; Moonen, R.P.M.; Aussems-Custers, E.; Daemen, M.J.A.P.; Strijkers, G. J.; Nicolay, K.; Grüll, H.

    2015-01-01

    Magnetic particle imaging (MPI) is an emerging medical imaging modality that directly visualizes magnetic particles in a hot-spot like fashion. We recently developed an iron oxide nanoparticle-micelle (ION-Micelle) platform that allows highly sensitive MPI. The goal of this study was to assess the

  9. Study on MPI/OpenMP hybrid parallelism for Monte Carlo neutron transport code

    International Nuclear Information System (INIS)

    Liang Jingang; Xu Qi; Wang Kan; Liu Shiwen

    2013-01-01

    Parallel programming with a mixed mode of message passing and shared memory has several advantages when used in a Monte Carlo neutron transport code, such as fitting the hardware of distributed-shared clusters, economizing the memory demand of Monte Carlo transport, and improving parallel performance. MPI/OpenMP hybrid parallelism was implemented based on a one-dimensional Monte Carlo neutron transport code. Some critical factors affecting the parallel performance were analyzed, and solutions were proposed for several problems such as access contention, lock contention and false sharing. After optimization the code was tested. It is shown that the hybrid parallel code can reach performance as good as a pure MPI parallel program while saving a large amount of memory at the same time. Therefore hybrid parallelism is efficient for achieving large-scale parallelism in Monte Carlo neutron transport. (authors)
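
    A minimal C sketch of the mixed-mode pattern discussed here - MPI between processes, OpenMP threads within each process - might look as follows (the reduction loop is a placeholder for the particle-history loop; MPI_THREAD_FUNNELED suffices because only the master thread calls MPI):

        /* hybrid.c - mixed-mode skeleton: MPI between processes, OpenMP
           threads inside each process. Compile with e.g.
           mpicc -fopenmp hybrid.c */
        #include <mpi.h>
        #include <omp.h>
        #include <stdio.h>

        int main(int argc, char **argv)
        {
            int provided, rank;
            double local = 0.0, total = 0.0;

            /* FUNNELED: only the master thread will make MPI calls. */
            MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            /* Shared-memory level: threads split the loop. */
            #pragma omp parallel for reduction(+:local)
            for (int i = 0; i < 1000000; ++i)
                local += 1.0 / (1.0 + i);

            /* Message-passing level: combine results across processes. */
            MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM,
                       0, MPI_COMM_WORLD);
            if (rank == 0) printf("total = %f\n", total);
            MPI_Finalize();
            return 0;
        }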

  10. 32 CFR 637.8 - Identification of MPI.

    Science.gov (United States)

    2010-07-01

    ... CRIMINAL INVESTIGATIONS MILITARY POLICE INVESTIGATION Investigations § 637.8 Identification of MPI. (a... referring to themselves as “INVESTIGATOR.” When signing military police records the title “Military Police Investigator” may be used in lieu of military titles. Civilian personnel will refer to themselves as...

  11. Specification of Fenix MPI Fault Tolerance library version 1.0.1

    Energy Technology Data Exchange (ETDEWEB)

    Gamble, Marc [Rutgers Univ., New Brunswick, NJ (United States); Van Der Wijngaart, Rob [Intel Corps., Mountain View, CA (United States); Teranishi, Keita [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Parashar, Manish [Rutgers Univ., New Brunswick, NJ (United States)

    2016-10-01

    This document provides a specification of Fenix, a software library compatible with the Message Passing Interface (MPI) to support fault recovery without application shutdown. The library consists of two modules. The first, termed process recovery, restores an application to a consistent state after it has suffered a loss of one or more MPI processes (ranks). The second specifies functions the user can invoke to store application data in Fenix-managed redundant storage, and to retrieve it from that storage after process recovery.

  12. Analysis of parallel computing performance of the code MCNP

    International Nuclear Information System (INIS)

    Wang Lei; Wang Kan; Yu Ganglin

    2006-01-01

    Parallel computing can effectively reduce the running time of the code MCNP. With MPI message-passing software, MCNP5 can perform parallel computations on a PC cluster running the Windows operating system. The parallel computing performance of MCNP is influenced by factors such as the type, the complexity level and the parameter configuration of the computing problem. This paper analyzes the parallel computing performance of MCNP with regard to these factors and gives measures to improve the MCNP parallel computing performance. (authors)

  13. 3D streamers simulation in a pin to plane configuration using massively parallel computing

    Science.gov (United States)

    Plewa, J.-M.; Eichwald, O.; Ducasse, O.; Dessante, P.; Jacobs, C.; Renon, N.; Yousfi, M.

    2018-03-01

    This paper concerns the 3D simulation of corona discharge using high performance computing (HPC) managed with the message passing interface (MPI) library. In the field of finite volume methods applied on non-adaptive mesh grids and in the case of a specific 3D dynamic benchmark test devoted to streamer studies, the great efficiency of the iterative R&B SOR and BiCGSTAB methods versus the direct MUMPS method was clearly demonstrated in solving the Poisson equation using HPC resources. The optimization of the parallelization and the resulting scalability was undertaken as a function of the HPC architecture for a number of mesh cells ranging from 8 to 512 million and a number of cores ranging from 20 to 1600. The R&B SOR method remains at least about four times faster than the BiCGSTAB method and requires significantly less memory for all tested situations. The R&B SOR method was then implemented in a 3D MPI parallelized code that solves the classical first order model of an atmospheric pressure corona discharge in air. The 3D code capabilities were tested by following the development of one, two and four coplanar streamers generated by initial plasma spots for 6 ns. The preliminary results obtained allowed us to follow in detail the formation of the tree structure of a corona discharge and the effects of the mutual interactions between the streamers in terms of streamer velocity, trajectory and diameter. The computing time for 64 million mesh cells distributed over 1000 cores using the MPI procedures is about 30 min ns⁻¹, regardless of the number of streamers.
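
    For readers unfamiliar with the R&B (red-black) SOR scheme favoured here, a serial C sketch of one sweep for a 2-D Poisson problem is given below; the colouring by the parity of i+j decouples the updates within each colour, which is what makes the method parallelize and distribute so well (grid layout, array names and the relaxation factor are assumptions, not the authors' code):

        /* rb_sor.c - one red-black SOR sweep for the 2-D Poisson equation
           laplacian(phi) = rho on an n-by-n grid with spacing h (serial
           sketch). Points of equal parity of i+j are independent, so each
           half-sweep parallelizes trivially. */
        void rb_sor_sweep(int n, double h, double omega,
                          double phi[n][n], double rho[n][n])
        {
            for (int color = 0; color < 2; ++color)      /* two half-sweeps */
                for (int i = 1; i < n - 1; ++i)
                    for (int j = 1 + (i + color) % 2; j < n - 1; j += 2) {
                        /* Gauss-Seidel value from the 5-point stencil... */
                        double gs = 0.25 * (phi[i-1][j] + phi[i+1][j]
                                          + phi[i][j-1] + phi[i][j+1]
                                          - h * h * rho[i][j]);
                        /* ...pushed further by the over-relaxation factor. */
                        phi[i][j] += omega * (gs - phi[i][j]);
                    }
        }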

  14. Stampi: a message passing library for distributed parallel computing. User's guide, second edition

    International Nuclear Information System (INIS)

    Imamura, Toshiyuki; Koide, Hiroshi; Takemiya, Hiroshi

    2000-02-01

    A new message passing library, Stampi, has been developed to realize computations that span different kinds of parallel computers, using MPI (Message Passing Interface) as the unique communication interface. Stampi is based on the MPI-2 specification, and it realizes dynamic process creation on different machines and communication with the spawned processes within the scope of the MPI semantics. The main features of Stampi are summarized as follows: (i) automatic switching between external and internal communications, (ii) message routing/relaying with a routing module, (iii) dynamic process creation, (iv) support of two types of connection, Master/Slave and Client/Server, (v) support of communication with Java applets. Indeed, vendors implemented MPI libraries as closed systems within one parallel machine or their own systems, and did not support process creation on, or communication with, external machines. Stampi supports both functions and enables distributed parallel computing. Currently Stampi has been implemented on COMPACS (COMplex PArallel Computer System) introduced in CCSE (five parallel computers and one graphic workstation), and moreover on eight kinds of parallel machines, fourteen systems in total. Stampi provides MPI communication functionality on all of them. This report mainly describes the usage of Stampi. (author)

  15. An efficient MPI/OpenMP parallelization of the Hartree–Fock–Roothaan method for the first generation of Intel® Xeon Phi™ processor architecture

    International Nuclear Information System (INIS)

    Mironov, Vladimir; Moskovsky, Alexander; D’Mello, Michael; Alexeev, Yuri

    2017-01-01

    The Hartree-Fock (HF) method in the quantum chemistry package GAMESS represents one of the most irregular algorithms in computation today. Major steps in the calculation are the irregular computation of electron repulsion integrals (ERIs) and the building of the Fock matrix. These are the central components of the main Self-Consistent Field (SCF) loop, the key hotspot in Electronic Structure (ES) codes. By threading the MPI ranks in the official release of the GAMESS code, we not only speed up the main SCF loop (4x to 6x for large systems), but also achieve a significant (>2x) reduction in the overall memory footprint. These improvements are a direct consequence of memory access optimizations within the MPI ranks. We benchmark our implementation against the official release of the GAMESS code on the Intel® Xeon Phi™ supercomputer. Here, scaling numbers are reported on up to 7,680 cores on Intel Xeon Phi coprocessors.

  16. Technologies and tools for high-performance distributed computing. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Karonis, Nicholas T.

    2000-05-01

    In this project we studied the practical use of the MPI message-passing interface in advanced distributed computing environments. We built on the existing software infrastructure provided by the Globus Toolkit™, the MPICH portable implementation of MPI, and the MPICH-G integration of MPICH with Globus. As a result of this project we have replaced MPICH-G with its successor MPICH-G2, which is also an integration of MPICH with Globus. MPICH-G2 delivers significant improvements in message passing performance when compared to its predecessor MPICH-G and was based on superior software design principles, resulting in a software base in which it was much easier to make the functional extensions and improvements we did. Using Globus services we replaced the default implementation of MPI's collective operations in MPICH-G2 with more efficient multilevel topology-aware collective operations which, in turn, led to the development of a new timing methodology for broadcasts [8]. MPICH-G2 was extended to include client/server functionality from the MPI-2 standard [23] to facilitate remote visualization applications and, through the use of MPI idioms, MPICH-G2 provided application-level control of quality-of-service parameters as well as application-level discovery of underlying Grid-topology information. Finally, MPICH-G2 was successfully used in a number of applications including an award-winning record-setting computation in numerical relativity. In the sections that follow we describe in detail the accomplishments of this project, we present experimental results quantifying the performance improvements, and conclude with a discussion of our applications experiences. This project resulted in a significant increase in the utility of MPICH-G2.

  17. Fenix, A Fault Tolerant Programming Framework for MPI Applications

    Energy Technology Data Exchange (ETDEWEB)

    2016-10-05

    Fenix provides APIs to allow the users to add fault tolerance capability to MPI-based parallel programs in a transparent manner. Fenix-enabled programs can run through process failures during program execution using a pool of spare processes accommodated by Fenix.

  18. IB: A Monte Carlo simulation tool for neutron scattering instrument design under PVM and MPI

    International Nuclear Information System (INIS)

    Zhao Jinkui

    2011-01-01

    Design of modern neutron scattering instruments relies heavily on Monte Carlo simulation tools for optimization. IB is one such tool written in C++ and implemented under Parallel Virtual Machine and the Message Passing Interface. The program was initially written for the design and optimization of the EQ-SANS instrument at the Spallation Neutron Source. One of its features is the ability to group simple instrument components into more complex ones at the user input level, e.g. grouping neutron mirrors into neutron guides and curved benders. The simulation engine manages the grouped components such that neutrons entering a group are properly operated upon by all components, multiple times if needed, before exiting the group. Thus, only a few basic optical modules are needed at the programming level. For simulations that require higher computer speeds, the program can be compiled and run in parallel modes using either the PVM or the MPI architectures.

  19. Performance characteristics of hybrid MPI/OpenMP implementations of NAS parallel benchmarks SP and BT on large-scale multicore supercomputers

    KAUST Repository

    Wu, Xingfu; Taylor, Valerie

    2011-01-01

    The NAS Parallel Benchmarks (NPB) are well-known applications with fixed algorithms for evaluating parallel systems and tools. Multicore supercomputers provide a natural programming paradigm for hybrid programs, whereby OpenMP can be used for data sharing within the multicore nodes and MPI for the communication between nodes. In this paper, we use the SP and BT benchmarks of MPI NPB 3.3 as a basis for a comparative approach to implement hybrid MPI/OpenMP versions of SP and BT. In particular, we compare the performance of the hybrid SP and BT with their MPI counterparts on large-scale multicore supercomputers. Our performance results indicate that the hybrid SP outperforms the MPI SP by up to 20.76%, and the hybrid BT outperforms the MPI BT by up to 8.58%, on up to 10,000 cores on the BlueGene/P at Argonne National Laboratory and Jaguar (Cray XT4/5) at Oak Ridge National Laboratory. We also use performance tools and MPI trace libraries available on these supercomputers to further investigate the performance characteristics of the hybrid SP and BT.

  20. Performance characteristics of hybrid MPI/OpenMP implementations of NAS parallel benchmarks SP and BT on large-scale multicore supercomputers

    KAUST Repository

    Wu, Xingfu

    2011-03-29

    The NAS Parallel Benchmarks (NPB) are well-known applications with fixed algorithms for evaluating parallel systems and tools. Multicore supercomputers provide a natural programming paradigm for hybrid programs, whereby OpenMP can be used for data sharing within the multicore nodes and MPI for the communication between nodes. In this paper, we use the SP and BT benchmarks of MPI NPB 3.3 as a basis for a comparative approach to implement hybrid MPI/OpenMP versions of SP and BT. In particular, we compare the performance of the hybrid SP and BT with their MPI counterparts on large-scale multicore supercomputers. Our performance results indicate that the hybrid SP outperforms the MPI SP by up to 20.76%, and the hybrid BT outperforms the MPI BT by up to 8.58%, on up to 10,000 cores on the BlueGene/P at Argonne National Laboratory and Jaguar (Cray XT4/5) at Oak Ridge National Laboratory. We also use performance tools and MPI trace libraries available on these supercomputers to further investigate the performance characteristics of the hybrid SP and BT.

  1. Development of small scale cluster computer for numerical analysis

    Science.gov (United States)

    Zulkifli, N. H. N.; Sapit, A.; Mohammed, A. N.

    2017-09-01

    In this study, two personal computers were successfully networked together to form a small-scale cluster. Each of the processors involved is a multicore processor with four cores, giving the cluster eight cores in total. The cluster runs the Ubuntu 14.04 Linux environment with an MPI implementation (MPICH2). Two main tests were conducted on the cluster: a communication test and a performance test. The communication test, which verifies that the computers can pass the required information without any problem, was done by running a simple MPI "Hello" program written in C (a sketch of this kind of program is given below). Additionally, a performance test was done to show that the calculation performance of the cluster is much better than that of a single-CPU computer. In this performance test, the same code was run four times, using a single node, 2 processors, 4 processors, and 8 processors. The results show that with additional processors the time required to solve the problem decreases; the calculation time roughly halves when the number of processors is doubled. To conclude, we successfully developed a small-scale cluster computer using common hardware which is capable of higher computing performance than a single-CPU computer, and this can be beneficial for research that requires high computing power, especially numerical analysis such as finite element analysis, computational fluid dynamics, and computational physics.
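
    A typical MPI "Hello" communication test of the kind described above, in C (the exact program used in the study is not given, so this is a representative sketch):

        /* hello_mpi.c - each rank reports its identity and host.
           Build/run: mpicc hello_mpi.c -o hello && mpirun -np 8 ./hello */
        #include <mpi.h>
        #include <stdio.h>

        int main(int argc, char **argv)
        {
            int rank, size, len;
            char host[MPI_MAX_PROCESSOR_NAME];

            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);
            MPI_Get_processor_name(host, &len);
            printf("Hello from rank %d of %d on %s\n", rank, size, host);
            MPI_Finalize();
            return 0;
        }

    If a line is printed for every core on both machines, the nodes can exchange messages and the cluster interconnect is working.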

  2. Feasibility of using auto Mod-MPI system, a novel technique for automated measurement of fetal modified myocardial performance index.

    Science.gov (United States)

    Lee, M-Y; Won, H-S; Jeon, E-J; Yoon, H C; Choi, J Y; Hong, S J; Kim, M-J

    2014-06-01

    To evaluate the reproducibility of measurement of the fetal left modified myocardial performance index (Mod-MPI) determined using a novel automated system. This was a prospective study of 116 ultrasound examinations from 110 normal singleton pregnancies at 12 + 1 to 37 + 1 weeks' gestation. Two experienced operators each measured the left Mod-MPI twice manually and twice automatically using the Auto Mod-MPI system. Intra- and interoperator reproducibility were assessed using intraclass correlation coefficients (ICCs) and the manual and automated measurements obtained by the more experienced operator were compared using Bland-Altman plots and ICCs. Both operators successfully measured the left Mod-MPI in all cases using the Auto Mod-MPI system. For both operators, intraoperator reproducibility was higher when performing automated measurements (ICC = 0.967 and 0.962 for Operators 1 and 2, respectively) than when performing manual measurements (ICC = 0.857 and 0.856 for Operators 1 and 2, respectively). Interoperator agreement was also better for automated than for manual measurements (ICC = 0.930 vs 0.723, respectively). There was good agreement between the automated and manual values measured by the more experienced operator. The Auto Mod-MPI system is a reliable technique for measuring fetal left Mod-MPI and demonstrates excellent reproducibility. Copyright © 2013 ISUOG. Published by John Wiley & Sons Ltd.

  3. The MPI facial expression database--a validated database of emotional and conversational facial expressions.

    Directory of Open Access Journals (Sweden)

    Kathrin Kaulard

    Full Text Available The ability to communicate is one of the core aspects of human life. For this, we use not only verbal but also nonverbal signals of remarkable complexity. Among the latter, facial expressions belong to the most important information channels. Despite the large variety of facial expressions we use in daily life, research on facial expressions has so far mostly focused on the emotional aspect. Consequently, most databases of facial expressions available to the research community also include only emotional expressions, neglecting the largely unexplored aspect of conversational expressions. To fill this gap, we present the MPI facial expression database, which contains a large variety of natural emotional and conversational expressions. The database contains 55 different facial expressions performed by 19 German participants. Expressions were elicited with the help of a method-acting protocol, which guarantees both well-defined and natural facial expressions. The method-acting protocol was based on everyday scenarios, which are used to define the necessary context information for each expression. All facial expressions are available in three repetitions, in two intensities, as well as from three different camera angles. A detailed frame annotation is provided, from which a dynamic and a static version of the database have been created. In addition to describing the database in detail, we also present the results of an experiment with two conditions that serve to validate the context scenarios as well as the naturalness and recognizability of the video sequences. Our results provide clear evidence that conversational expressions can be recognized surprisingly well from visual information alone. The MPI facial expression database will enable researchers from different research fields (including the perceptual and cognitive sciences, but also affective computing, as well as computer vision) to investigate the processing of a wider range of natural

  4. Computation cluster for Monte Carlo calculations

    Energy Technology Data Exchange (ETDEWEB)

    Petriska, M.; Vitazek, K.; Farkas, G.; Stacho, M.; Michalek, S. [Dep. Of Nuclear Physics and Technology, Faculty of Electrical Engineering and Information, Technology, Slovak Technical University, Ilkovicova 3, 81219 Bratislava (Slovakia)

    2010-07-01

    Two computation clusters based on the Rocks Clusters 5.1 Linux distribution, with Intel Core Duo and Intel Core Quad based computers, were built at the Department of Nuclear Physics and Technology. The clusters were used for Monte Carlo calculations, specifically for MCNP calculations applied in nuclear reactor core simulations. Optimization for computation speed was performed on both a hardware and a software basis. Hardware cluster parameters, such as the size of the memory, network speed, CPU speed, number of processors per computation, and number of processors in one computer, were tested for their effect on shortening the calculation time. For the software optimization, different Fortran compilers, MPI implementations and CPU multi-core libraries were tested. Finally, the computer cluster was used in finding the weighting functions of the neutron ex-core detectors of a VVER-440. (authors)

  5. Computation cluster for Monte Carlo calculations

    International Nuclear Information System (INIS)

    Petriska, M.; Vitazek, K.; Farkas, G.; Stacho, M.; Michalek, S.

    2010-01-01

    Two computation clusters based on the Rocks Clusters 5.1 Linux distribution, with Intel Core Duo and Intel Core Quad based computers, were built at the Department of Nuclear Physics and Technology. The clusters were used for Monte Carlo calculations, specifically for MCNP calculations applied in nuclear reactor core simulations. Optimization for computation speed was performed on both a hardware and a software basis. Hardware cluster parameters, such as the size of the memory, network speed, CPU speed, number of processors per computation, and number of processors in one computer, were tested for their effect on shortening the calculation time. For the software optimization, different Fortran compilers, MPI implementations and CPU multi-core libraries were tested. Finally, the computer cluster was used in finding the weighting functions of the neutron ex-core detectors of a VVER-440. (authors)

  6. A Combined MPI-CUDA Parallel Solution of Linear and Nonlinear Poisson-Boltzmann Equation

    Directory of Open Access Journals (Sweden)

    José Colmenares

    2014-01-01

    Full Text Available The Poisson-Boltzmann equation models the electrostatic potential generated by fixed charges on a polarizable solute immersed in an ionic solution. This approach is often used in computational structural biology to estimate the electrostatic energetic component of the assembly of molecular biological systems. In the last decades, the amount of data concerning proteins and other biological macromolecules has remarkably increased. To fruitfully exploit these data, a huge computational power is needed as well as software tools capable of exploiting it. It is therefore necessary to move towards high performance computing and to develop proper parallel implementations of already existing and of novel algorithms. Nowadays, workstations can provide an amazing computational power: up to 10 TFLOPS on a single machine equipped with multiple CPUs and accelerators such as Intel Xeon Phi or GPU devices. The actual obstacle to the full exploitation of modern heterogeneous resources is efficient parallel coding and porting of software on such architectures. In this paper, we propose the implementation of a full Poisson-Boltzmann solver based on a finite-difference scheme using different and combined parallel schemes and in particular a mixed MPI-CUDA implementation. Results show great speedups when combining the two schemes, achieving an 18.9x speedup with three GPUs.

  7. The MPI Facial Expression Database — A Validated Database of Emotional and Conversational Facial Expressions

    Science.gov (United States)

    Kaulard, Kathrin; Cunningham, Douglas W.; Bülthoff, Heinrich H.; Wallraven, Christian

    2012-01-01

    The ability to communicate is one of the core aspects of human life. For this, we use not only verbal but also nonverbal signals of remarkable complexity. Among the latter, facial expressions belong to the most important information channels. Despite the large variety of facial expressions we use in daily life, research on facial expressions has so far mostly focused on the emotional aspect. Consequently, most databases of facial expressions available to the research community also include only emotional expressions, neglecting the largely unexplored aspect of conversational expressions. To fill this gap, we present the MPI facial expression database, which contains a large variety of natural emotional and conversational expressions. The database contains 55 different facial expressions performed by 19 German participants. Expressions were elicited with the help of a method-acting protocol, which guarantees both well-defined and natural facial expressions. The method-acting protocol was based on everyday scenarios, which are used to define the necessary context information for each expression. All facial expressions are available in three repetitions, in two intensities, as well as from three different camera angles. A detailed frame annotation is provided, from which a dynamic and a static version of the database have been created. In addition to describing the database in detail, we also present the results of an experiment with two conditions that serve to validate the context scenarios as well as the naturalness and recognizability of the video sequences. Our results provide clear evidence that conversational expressions can be recognized surprisingly well from visual information alone. The MPI facial expression database will enable researchers from different research fields (including the perceptual and cognitive sciences, but also affective computing, as well as computer vision) to investigate the processing of a wider range of natural facial expressions

  8. Aminophylline and caffeine for reversal of adverse symptoms associated with regadenoson SPECT MPI.

    Science.gov (United States)

    Doran, Jesse A; Sajjad, Waseem; Schneider, Marabel D; Gupta, Rohit; Mackin, Maria L; Schwartz, Ronald G

    2017-06-01

    Aminophylline shortages led us to compare intravenous (IV) aminophylline with IV and oral (PO) caffeine during routine pharmacologic stress testing with SPECT MPI. We measured the presence, duration, and reversal of adverse symptoms and cardiac events following regadenoson administration in consecutive patients randomized to IV aminophylline (100 mg administered over 30-60 seconds), IV caffeine citrate (60 mg infused over 3-5 minutes), or PO caffeine as coffee or diet cola. Of 241 patients, 152 (63%) received a regadenoson reversal intervention. Complete (CR), predominant (PRE), or partial (PR) reversal was observed in 99%. CR rates with IV aminophylline (87%), IV caffeine (87%), and PO caffeine (78%) were similar (P = NS). Time to CR (162 ± 12.6 seconds, mean ± SD) was similar across treatment arms. PO caffeine was inferior to IV aminophylline for CR + PRE. IV aminophylline and IV caffeine provide rapid, safe reversal of regadenoson-induced adverse effects during SPECT MPI. Oral caffeine appeared similarly effective for CR but not for the combined CR + PRE. Our results suggest PO caffeine may be an effective initial strategy for reversal of regadenoson, but IV aminophylline or IV caffeine should be available to optimize symptom reversal as needed.

  9. Topology-oblivious optimization of MPI broadcast algorithms on extreme-scale platforms

    KAUST Repository

    Hasanov, Khalid; Quintin, Jean-Noë l; Lastovetsky, Alexey

    2015-01-01

    operations for particular architectures by taking into account either their topology or platform parameters. In this work we propose a simple but general approach to optimization of the legacy MPI broadcast algorithms, which are widely used in MPICH and Open

  10. Machine-learning model observer for detection and localization tasks in clinical SPECT-MPI

    Science.gov (United States)

    Parages, Felipe M.; O'Connor, J. Michael; Pretorius, P. Hendrik; Brankov, Jovan G.

    2016-03-01

    In this work we propose a machine-learning MO based on Naive-Bayes classification (NB-MO) for the diagnostic tasks of detection, localization and assessment of perfusion defects in clinical SPECT Myocardial Perfusion Imaging (MPI), with the goal of evaluating several image reconstruction methods used in clinical practice. NB-MO uses image features extracted from polar maps in order to predict the lesion-detection, localization and severity scores given by human readers in a series of 3D SPECT-MPI images. The population used to tune (i.e. train) the NB-MO consisted of simulated SPECT-MPI cases - either normal or with lesions of variable sizes and locations - reconstructed using the filtered backprojection (FBP) method. An ensemble of five human specialists (physicians) read a subset of the simulated reconstructed images and assigned a perfusion score to each region of the left ventricle (LV). Polar maps generated from the simulated volumes, along with their corresponding human scores, were used to train five NB-MOs (one per human reader), which were subsequently applied (i.e. tested) on three sets of clinical SPECT-MPI polar maps in order to predict human detection and localization scores. The clinical "testing" population comprises healthy individuals and patients suffering from coronary artery disease (CAD) in three possible regions, namely LAD, LCx and RCA. Each clinical case was reconstructed using three reconstruction strategies, namely: FBP with no SC (i.e. scatter compensation), OSEM with the Triple Energy Window (TEW) SC method, and OSEM with Effective Source Scatter Estimation (ESSE) SC. Alternative Free-Response (AFROC) analysis of perfusion scores shows that NB-MO predicts a higher human performance for scatter-compensated reconstructions, in agreement with what has been reported in the published literature. These results suggest that NB-MO has good potential to generalize well to reconstruction methods not used during training, even for reasonably dissimilar datasets (i

  11. Computed tomography angiography and perfusion to assess coronary artery stenosis causing perfusion defects by single photon emission computed tomography

    DEFF Research Database (Denmark)

    Rochitte, Carlos E; George, Richard T; Chen, Marcus Y

    2014-01-01

    AIMS: To evaluate the diagnostic power of integrating the results of computed tomography angiography (CTA) and CT myocardial perfusion (CTP) to identify coronary artery disease (CAD), defined as a flow-limiting coronary artery stenosis causing a perfusion defect by single photon emission computed tomography (SPECT). METHODS AND RESULTS: We conducted a multicentre study to evaluate the accuracy of integrated CTA-CTP for the identification of patients with flow-limiting CAD defined by ≥50% stenosis by invasive coronary angiography (ICA) with a corresponding perfusion deficit on stress single photon emission computed tomography (SPECT/MPI). Sixteen centres enrolled 381 patients who underwent combined CTA-CTP and SPECT/MPI prior to conventional coronary angiography. All four image modalities were analysed in blinded independent core laboratories. The prevalence of obstructive CAD defined by combined ICA...

  12. CT myocardial perfusion imaging. Ready for prime time?

    Energy Technology Data Exchange (ETDEWEB)

    Takx, Richard A.P.; Celeng, Csilla [University Medical Center Utrecht, Department of Radiology, Utrecht (Netherlands); Schoepf, U.J. [Medical University of South Carolina, Division of Cardiovascular Imaging, Department of Radiology and Radiological Science, Charleston, SC (United States); Medical University of South Carolina, Ashley River Tower, Heart and Vascular Center, Charleston, SC (United States)

    2018-03-15

    The detection of functional coronary artery stenosis with coronary CT angiography (CCTA) is suboptimal. Additional CT myocardial perfusion imaging (CT-MPI) may be helpful to identify patients with myocardial ischaemia in whom coronary revascularization therapy would be beneficial. CT-MPI adds incremental diagnostic and prognostic value over obstructive disease on CCTA. It allows for the quantitation of myocardial blood flow and calculation of coronary flow reserve and shows good correlation with {sup 15}O-H{sub 2}O positron emission tomography and invasive fractional flow reserve. In addition, patients prefer CCTA/CT-MPI over SPECT, MRI and invasive coronary angiography. CT-MPI is ready for clinical use for detecting myocardial ischaemia caused by obstructive disease. Nevertheless, the clinical utility of CT-MPI to identify ischaemia in patients with non-obstructive/microvascular disease still has to be established. (orig.)

  13. Resource-Efficient, Hierarchical Auto-Tuning of a Hybrid Lattice Boltzmann Computation on the Cray XT4

    International Nuclear Information System (INIS)

    Williams, Samuel; Carter, Jonathan; Oliker, Leonid; Shalf, John; Yelick, Katherine

    2009-01-01

    We apply auto-tuning to a hybrid MPI-pthreads lattice Boltzmann computation running on the Cray XT4 at the National Energy Research Scientific Computing Center (NERSC). Previous work showed that multicore-specific auto-tuning can improve the performance of lattice Boltzmann magnetohydrodynamics (LBMHD) by a factor of 4 when running on dual- and quad-core Opteron dual-socket SMPs. We extend these studies to the distributed memory arena via a hybrid MPI/pthreads implementation. In addition to conventional auto-tuning at the local SMP node, we tune at the message-passing level to determine the optimal aspect ratio as well as the correct balance between MPI tasks and threads per MPI task. Our study presents a detailed performance analysis when moving along an isocurve of constant hardware usage: fixed total memory, total cores, and total nodes. Overall, our work points to approaches for improving intra- and inter-node efficiency on large-scale multicore systems for demanding scientific applications.

  14. Performance comparison analysis library communication cluster system using merge sort

    Science.gov (United States)

    Wulandari, D. A. R.; Ramadhan, M. E.

    2018-04-01

    Computing began with single processors; to reduce computation time, multi-processor systems were introduced. This second paradigm is known as parallel computing, of which clusters are one example. A cluster requires a communication protocol for its processing, one of which is the Message Passing Interface (MPI). MPI has several library implementations, among them OpenMPI and MPICH2. The performance of a cluster machine depends on how well the performance characteristics of the communication library suit the characteristics of the problem, so this study analyses the comparative performance of these libraries in handling a parallel computing process. The case studies in this research are MPICH2 and OpenMPI, which are exercised on a sorting problem to measure the performance of the cluster system. The sorting problem uses the merge sort method. The research method is to implement OpenMPI and MPICH2 on a Linux-based cluster of five virtual computers and then analyse the performance of the system under different test scenarios using three parameters: execution time, speedup and efficiency. The results of this study show that with each increase in data size, OpenMPI and MPICH2 exhibit average speedup and efficiency that tend to increase but then decrease at large data sizes. An increased data size does not necessarily increase speedup and efficiency, only execution time, for example at a data size of 100000. The execution times of the two libraries also differ: at a data size of 1000, for example, the average execution time is 0.009721 with MPICH2 and 0.003895 with OpenMPI. OpenMPI can customize communication needs.
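
    To make the experimental setup concrete, the following is a minimal C sketch of a scatter/local-sort/gather merge sort with MPI timing, runnable under either MPICH or Open MPI. The problem size, and the assumption that it divides evenly across ranks, are illustrative choices, not values taken from the study.

      #include <mpi.h>
      #include <stdio.h>
      #include <stdlib.h>
      #include <string.h>

      static int cmp_int(const void *a, const void *b)
      {
          int x = *(const int *)a, y = *(const int *)b;
          return (x > y) - (x < y);
      }

      /* Merge two sorted runs a[0..na) and b[0..nb) into out[]. */
      static void merge(const int *a, int na, const int *b, int nb, int *out)
      {
          int i = 0, j = 0, k = 0;
          while (i < na && j < nb) out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
          while (i < na) out[k++] = a[i++];
          while (j < nb) out[k++] = b[j++];
      }

      int main(int argc, char **argv)
      {
          MPI_Init(&argc, &argv);
          int rank, size;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          const int n = 100000;   /* illustrative size, assumed divisible by size */
          int chunk = n / size;
          int *data = NULL, *local = malloc(chunk * sizeof *local);
          if (rank == 0) {
              data = malloc(n * sizeof *data);
              for (int i = 0; i < n; i++) data[i] = rand();
          }

          double t0 = MPI_Wtime();
          /* scatter, sort locally, gather the sorted runs back to rank 0 */
          MPI_Scatter(data, chunk, MPI_INT, local, chunk, MPI_INT, 0, MPI_COMM_WORLD);
          qsort(local, chunk, sizeof *local, cmp_int);
          MPI_Gather(local, chunk, MPI_INT, data, chunk, MPI_INT, 0, MPI_COMM_WORLD);

          if (rank == 0) {        /* sequential merge of the P sorted runs */
              int *tmp = malloc(n * sizeof *tmp);
              int merged = chunk;
              for (int p = 1; p < size; p++) {
                  merge(data, merged, data + p * chunk, chunk, tmp);
                  merged += chunk;
                  memcpy(data, tmp, merged * sizeof *data);
              }
              printf("sorted %d ints in %.6f s on %d ranks\n",
                     n, MPI_Wtime() - t0, size);
              free(tmp);
          }
          free(local); free(data);
          MPI_Finalize();
          return 0;
      }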

  15. Combined use of 64-slice computed tomography angiography and gated myocardial perfusion SPECT for the detection of functionally relevant coronary artery stenoses. First results in a clinical setting concerning patients with stable angina

    International Nuclear Information System (INIS)

    Hacker, M.; Hack, N.; Tiling, R.; Jakobs, T.; Nikolaou, K.; Becker, C.; Ziegler, F. von; Knez, A.; Koenig, A.; Klauss, V.

    2007-01-01

    Aim: In patients with stable angina pectoris both morphological and functional information about the coronary artery tree should be present before revascularization therapy is performed. High accuracy was shown for spiral computed tomography (MDCT) angiography acquired with a 64-slice CT scanner compared to invasive coronary angiography (ICA) in detecting "obstructive" coronary artery disease (CAD). Gated myocardial SPECT (MPI) is an established method for the noninvasive assessment of the functional significance of coronary stenoses. Aim of the study was to evaluate the combination of 64-slice CT angiography plus MPI in comparison to ICA plus MPI in the detection of hemodynamically relevant coronary artery stenoses in a clinical setting. Patients, methods: 30 patients (63 ± 10.8 years, 23 men) with stable angina (21 with suspected, 9 with known CAD) were investigated. MPI, 64-slice CT angiography and ICA were performed; reversible and fixed perfusion defects were allocated to determining lesions separately for MDCT angiography and ICA. The combination of MDCT angiography plus MPI was compared to the results of ICA plus MPI. Results: Sensitivity, specificity, negative and positive predictive value for the combination of MDCT angiography plus MPI were 85%, 97%, 98% and 79%, respectively, on a vessel-based and 93%, 87%, 93% and 88%, respectively, on a patient-based level. 19 coronary arteries with stenoses ≥50% in both ICA and MDCT angiography showed no ischemia in MPI. Conclusion: The combination of 64-slice CT angiography and gated myocardial SPECT enabled a comprehensive non-invasive view of the anatomical and functional status of the coronary artery tree. (orig.)

  16. Combined use of 64-slice computed tomography angiography and gated myocardial perfusion SPECT for the detection of functionally relevant coronary artery stenoses. First results in a clinical setting concerning patients with stable angina

    Energy Technology Data Exchange (ETDEWEB)

    Hacker, M.; Hack, N.; Tiling, R. [Klinikum Grosshadern (Germany). Dept. of Nuclear Medicine; Jakobs, T.; Nikolaou, K.; Becker, C. [Klinikum Grosshadern (Germany). Dept. of Clinical Radiology; Ziegler, F. von; Knez, A. [Klinikum Grosshadern (Germany). Dept. of Cardiology; Koenig, A.; Klauss, V. [Medizinische Poliklinik-Innenstadt, Univ. of Munich (Germany). Dept. of Cardiology

    2007-07-01

    Aim: In patients with stable angina pectoris both morphological and functional information about the coronary artery tree should be present before revascularization therapy is performed. High accuracy was shown for spiral computed tomography (MDCT) angiography acquired with a 64-slice CT scanner compared to invasive coronary angiography (ICA) in detecting "obstructive" coronary artery disease (CAD). Gated myocardial SPECT (MPI) is an established method for the noninvasive assessment of the functional significance of coronary stenoses. Aim of the study was to evaluate the combination of 64-slice CT angiography plus MPI in comparison to ICA plus MPI in the detection of hemodynamically relevant coronary artery stenoses in a clinical setting. Patients, methods: 30 patients (63 {+-} 10.8 years, 23 men) with stable angina (21 with suspected, 9 with known CAD) were investigated. MPI, 64-slice CT angiography and ICA were performed; reversible and fixed perfusion defects were allocated to determining lesions separately for MDCT angiography and ICA. The combination of MDCT angiography plus MPI was compared to the results of ICA plus MPI. Results: Sensitivity, specificity, negative and positive predictive value for the combination of MDCT angiography plus MPI were 85%, 97%, 98% and 79%, respectively, on a vessel-based and 93%, 87%, 93% and 88%, respectively, on a patient-based level. 19 coronary arteries with stenoses ≥50% in both ICA and MDCT angiography showed no ischemia in MPI. Conclusion: The combination of 64-slice CT angiography and gated myocardial SPECT enabled a comprehensive non-invasive view of the anatomical and functional status of the coronary artery tree. (orig.)

  17. High Performance Computing Multicast

    Science.gov (United States)

    2012-02-01

    [Abstract not indexed; the record text consists of report front-matter fragments: a citation to "A History of the Virtual Synchrony Replication Model," in Replication: Theory and Practice, Charron-Bost, B., Pedone, F., and Schiper, A. (Eds.), and part of an abbreviations list: High Performance Computing; IP/IPv4, Internet Protocol (version 4.0); IPMC, Internet Protocol Multicast; LAN, Local Area Network; MCMD, Dr. Multicast; MPI, ...]

  18. Optimization approaches to mpi and area merging-based parallel buffer algorithm

    Directory of Open Access Journals (Sweden)

    Junfu Fan

    Full Text Available On buffer zone construction, the rasterization-based dilation method inevitably introduces errors, and the double-sided parallel line method involves a series of complex operations. In this paper, we propose a parallel buffer algorithm based on area merging and MPI (Message Passing Interface) to improve the performance of buffer analyses on large datasets. Experimental results reveal three major performance bottlenecks which significantly impact serial and parallel buffer construction efficiency: the area merging strategy, the task load balancing method and the MPI inter-process results merging strategy. Corresponding optimization approaches, involving a tree-like area merging strategy, a vertex-number-oriented parallel task partition method and an improved inter-process results merging strategy, are suggested to overcome these bottlenecks. Experiments were carried out to examine the performance of the optimized parallel algorithm. The results suggest that the optimization approaches can provide high performance and processing capability for buffer construction in a cluster parallel environment. Our method could provide insights into the parallelization of spatial analysis algorithms.
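
    The tree-like merging strategy proposed for the inter-process stage can be pictured as a pairwise reduction: rather than rank 0 merging P-1 result sets one after another, the number of active ranks is halved each round, for log2 P rounds in total. The C sketch below shows only the communication pattern; the payload (a long standing in for a merged geometry buffer) and the merge function are hypothetical placeholders, not the authors' geometric union.

      #include <mpi.h>

      /* Placeholder for the geometric union of two partial results;
       * here the payload is just a count (hypothetical stand-in). */
      static long merge_results(long a, long b) { return a + b; }

      int main(int argc, char **argv)
      {
          MPI_Init(&argc, &argv);
          int rank, size;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);

          long local = rank + 1;   /* each rank's locally merged result */

          /* Tree-like merging: in round s, a rank with rank % 2s != 0
           * ships its result to rank - s and drops out; survivors merge,
           * halving the number of active ranks each round. */
          for (int s = 1; s < size; s <<= 1) {
              if (rank % (2 * s) != 0) {
                  MPI_Send(&local, 1, MPI_LONG, rank - s, 0, MPI_COMM_WORLD);
                  break;
              } else if (rank + s < size) {
                  long other;
                  MPI_Recv(&other, 1, MPI_LONG, rank + s, 0, MPI_COMM_WORLD,
                           MPI_STATUS_IGNORE);
                  local = merge_results(local, other);
              }
          }
          /* rank 0 now holds the fully merged result */
          MPI_Finalize();
          return 0;
      }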

  19. Space/time non-commutative field theories and causality

    International Nuclear Information System (INIS)

    Bozkaya, H.; Fischer, P.; Pitschmann, M.; Schweda, M.; Grosse, H.; Putz, V.; Wulkenhaar, R.

    2003-01-01

    As argued previously, amplitudes of quantum field theories on non-commutative space and time cannot be computed using naive path integral Feynman rules. One of the proposals is to use the Gell-Mann-Low formula with time-ordering applied before performing the integrations. We point out that the previously given prescription should rather be regarded as an interaction-point time-ordering. Causality is explicitly violated inside the region of interaction. It is nevertheless a consistent procedure, which seems to be related to the interaction picture of quantum mechanics. In this framework we compute the one-loop self-energy for a space/time non-commutative φ⁴ theory. Although in all intermediate steps only three-momenta play a role, the final result is manifestly Lorentz covariant and agrees with the naive calculation. Deriving the Feynman rules for general graphs, we show, however, that such a picture holds for tadpole lines only. (orig.)

  20. Diagnostic accuracy of combined coronary angiography and adenosine stress myocardial perfusion imaging using 320-detector computed tomography: pilot study

    International Nuclear Information System (INIS)

    Nasis, Arthur; Ko, Brian S.; Leung, Michael C.; Antonis, Paul R.; Wong, Dennis T.; Kyi, Leo; Cameron, James D.; Meredith, Ian T.; Seneviratne, Sujith K.; Nandurkar, Dee; Troupis, John M.

    2013-01-01

    To determine the diagnostic accuracy of combined 320-detector row computed tomography coronary angiography (CTA) and adenosine stress CT myocardial perfusion imaging (CTP) in detecting perfusion abnormalities caused by obstructive coronary artery disease (CAD). Twenty patients with suspected CAD who underwent initial investigation with single-photon-emission computed tomography myocardial perfusion imaging (SPECT-MPI) were recruited and underwent prospectively-gated 320-detector CTA/CTP and invasive angiography. Two blinded cardiologists evaluated invasive angiography images quantitatively (QCA). A blinded nuclear physician analysed SPECT-MPI images for fixed and reversible perfusion defects. Two blinded cardiologists assessed CTA/CTP studies qualitatively. Vessels/territories with both >50 % stenosis on QCA and corresponding perfusion defect on SPECT-MPI were defined as ischaemic and formed the reference standard. All patients completed the CTA/CTP protocol with diagnostic image quality. Of 60 vessels/territories, 17 (28 %) were ischaemic according to QCA/SPECT-MPI criteria. Sensitivity, specificity, PPV, NPV and area under the ROC curve for CTA/CTP was 94 %, 98 %, 94 %, 98 % and 0.96 (P < 0.001) on a per-vessel/territory basis. Mean CTA/CTP radiation dose was 9.2 ± 7.4 mSv compared with 13.2 ± 2.2 mSv for SPECT-MPI (P < 0.001). Combined 320-detector CTA/CTP is accurate in identifying obstructive CAD causing perfusion abnormalities compared with combined QCA/SPECT-MPI, achieved with lower radiation dose than SPECT-MPI. (orig.)

  1. Diagnostic accuracy of combined coronary angiography and adenosine stress myocardial perfusion imaging using 320-detector computed tomography: pilot study

    Energy Technology Data Exchange (ETDEWEB)

    Nasis, Arthur; Ko, Brian S.; Leung, Michael C.; Antonis, Paul R.; Wong, Dennis T.; Kyi, Leo; Cameron, James D.; Meredith, Ian T.; Seneviratne, Sujith K. [Southern Health and Monash University, Monash Cardiovascular Research Centre, Monash Heart, Department of Medicine Monash Medical Centre (MMC), Melbourne (Australia); Nandurkar, Dee; Troupis, John M. [MMC, Southern Health, Department of Diagnostic Imaging, Melbourne (Australia)

    2013-07-15

    To determine the diagnostic accuracy of combined 320-detector row computed tomography coronary angiography (CTA) and adenosine stress CT myocardial perfusion imaging (CTP) in detecting perfusion abnormalities caused by obstructive coronary artery disease (CAD). Twenty patients with suspected CAD who underwent initial investigation with single-photon-emission computed tomography myocardial perfusion imaging (SPECT-MPI) were recruited and underwent prospectively-gated 320-detector CTA/CTP and invasive angiography. Two blinded cardiologists evaluated invasive angiography images quantitatively (QCA). A blinded nuclear physician analysed SPECT-MPI images for fixed and reversible perfusion defects. Two blinded cardiologists assessed CTA/CTP studies qualitatively. Vessels/territories with both >50 % stenosis on QCA and corresponding perfusion defect on SPECT-MPI were defined as ischaemic and formed the reference standard. All patients completed the CTA/CTP protocol with diagnostic image quality. Of 60 vessels/territories, 17 (28 %) were ischaemic according to QCA/SPECT-MPI criteria. Sensitivity, specificity, PPV, NPV and area under the ROC curve for CTA/CTP was 94 %, 98 %, 94 %, 98 % and 0.96 (P < 0.001) on a per-vessel/territory basis. Mean CTA/CTP radiation dose was 9.2 {+-} 7.4 mSv compared with 13.2 {+-} 2.2 mSv for SPECT-MPI (P < 0.001). Combined 320-detector CTA/CTP is accurate in identifying obstructive CAD causing perfusion abnormalities compared with combined QCA/SPECT-MPI, achieved with lower radiation dose than SPECT-MPI. (orig.)

  2. Lemon : An MPI parallel I/O library for data encapsulation using LIME

    NARCIS (Netherlands)

    Deuzeman, Albert; Reker, Siebren; Urbach, Carsten

    We introduce Lemon, an MPI parallel I/O library that provides efficient parallel I/O of both binary and metadata on massively parallel architectures. Motivated by the demands of the lattice Quantum Chromodynamics community, the data is stored in the SciDAC Lattice QCD Interchange Message

  3. Enabling Highly-Scalable Remote Memory Access Programming with MPI-3 One Sided

    Directory of Open Access Journals (Sweden)

    Robert Gerstenberger

    2014-01-01

    Full Text Available Modern interconnects offer remote direct memory access (RDMA) features. Yet, most applications rely on explicit message passing for communication, despite its unwanted overheads. The MPI-3.0 standard defines a programming interface for exploiting RDMA networks directly; however, its scalability and practicability have to be demonstrated in practice. In this work, we develop scalable bufferless protocols that implement the MPI-3.0 specification. Our protocols support scaling to millions of cores with negligible memory consumption while providing the highest performance and minimal overheads. To arm programmers, we provide a spectrum of performance models for all critical functions and demonstrate the usability of our library and models with several application studies with up to half a million processes. We show that our design is comparable to, or better than, UPC and Fortran Coarrays in terms of latency, bandwidth and message rate. We also demonstrate application performance improvements with comparable programming complexity.
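
    As background, the one-sided operations the paper builds on let one rank access another rank's exposed memory without a matching receive. Below is a minimal MPI-3 RMA example in C using passive-target locking (two or more ranks assumed); it illustrates the standard interface only, not the authors' bufferless protocols.

      #include <mpi.h>
      #include <stdio.h>

      int main(int argc, char **argv)
      {
          MPI_Init(&argc, &argv);
          int rank, size;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);
          if (size < 2) { MPI_Finalize(); return 0; }   /* needs 2+ ranks */

          /* Expose one integer per process as an RMA window. */
          int value = -1;
          MPI_Win win;
          MPI_Win_create(&value, sizeof value, sizeof value,
                         MPI_INFO_NULL, MPI_COMM_WORLD, &win);

          /* Passive-target epoch: rank 0 writes straight into rank 1's
           * window; rank 1 posts no matching receive. */
          if (rank == 0) {
              int payload = 42;
              MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 1, 0, win);
              MPI_Put(&payload, 1, MPI_INT, 1, 0, 1, MPI_INT, win);
              MPI_Win_unlock(1, win);   /* put is complete at the target */
          }
          MPI_Barrier(MPI_COMM_WORLD);  /* order the access before the read */
          if (rank == 1)
              printf("rank 1 sees value = %d\n", value);

          MPI_Win_free(&win);
          MPI_Finalize();
          return 0;
      }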

  4. Portable and Transparent Message Compression in MPI Libraries to Improve the Performance and Scalability of Parallel Applications

    Energy Technology Data Exchange (ETDEWEB)

    Albonesi, David; Burtscher, Martin

    2009-04-17

    The goal of this project has been to develop a lossless compression algorithm for message-passing libraries that can accelerate HPC systems by reducing the communication time. Because both compression and decompression have to be performed in software in real time, the algorithm has to be extremely fast while still delivering a good compression ratio. During the first half of this project, they designed a new compression algorithm called FPC for scientific double-precision data, made the source code available on the web, and published two papers describing its operation, the first in the proceedings of the Data Compression Conference and the second in the IEEE Transactions on Computers. At comparable average compression ratios, this algorithm compresses and decompresses 10 to 100 times faster than BZIP2, DFCM, FSD, GZIP, and PLMI on the three architectures tested. With prediction tables that fit into the CPU's L1 data cache, FPC delivers a guaranteed throughput of six gigabits per second on a 1.6 GHz Itanium 2 system. The C source code and documentation of FPC are posted on-line and have already been downloaded hundreds of times. To evaluate FPC, they gathered 13 real-world scientific datasets from around the globe, including satellite data, crash-simulation data, and messages from HPC systems. Based on the large number of requests they received, they also made these datasets available to the community (with permission of the original sources). While FPC represents a great step forward, it soon became clear that its throughput was too slow for the emerging 10 gigabits per second networks. Hence, no speedup can be gained by including this algorithm in an MPI library. They therefore changed the aim of the second half of the project. Instead of implementing FPC in an MPI library, they refocused their efforts to develop a parallel compression algorithm to further boost the throughput. After all, all modern high-end microprocessors contain multiple CPUs on a single chip.

  5. Pthreads vs MPI Parallel Performance of Angular-Domain Decomposed Sₙ

    International Nuclear Information System (INIS)

    Azmy, Y.Y.; Barnett, D.A.

    2000-01-01

    Two programming models for parallelizing the Angular Domain Decomposition (ADD) of the discrete ordinates (Sₙ) approximation of the neutron transport equation are examined. These are the shared memory model based on the POSIX threads (Pthreads) standard, and the message passing model based on the Message Passing Interface (MPI) standard. These standard libraries are available on most multiprocessor platforms, thus making the resulting parallel codes widely portable. The question is: on a fixed platform, and for a particular code solving a given test problem, which of the two programming models delivers better parallel performance? Such a comparison is possible on Symmetric Multi-Processor (SMP) architectures in which several CPUs physically share a common memory, and in addition are capable of emulating message passing functionality. Implementation of the two-dimensional Sₙ Arbitrarily High Order Transport (AHOT) code for solving neutron transport problems using these two parallelization models is described. Measured parallel performance of each model on the COMPAQ AlphaServer 8400 and the SGI Origin 2000 platforms is described, and a comparison of the observed speedup for the two programming models is reported. For the case presented in this paper it appears that the MPI implementation scales better than the Pthreads implementation on both platforms.

  6. Non-conforming finite-element formulation for cardiac electrophysiology: an effective approach to reduce the computation time of heart simulations without compromising accuracy

    Science.gov (United States)

    Hurtado, Daniel E.; Rojas, Guillermo

    2018-04-01

    Computer simulations constitute a powerful tool for studying the electrical activity of the human heart, but computational effort remains prohibitively high. In order to recover accurate conduction velocities and wavefront shapes, the mesh size in linear element (Q1) formulations cannot exceed 0.1 mm. Here we propose a novel non-conforming finite-element formulation for the non-linear cardiac electrophysiology problem that results in accurate wavefront shapes and lower mesh-dependence in the conduction velocity, while retaining the same number of global degrees of freedom as Q1 formulations. As a result, coarser discretizations of cardiac domains can be employed in simulations without significant loss of accuracy, thus reducing the overall computational effort. We demonstrate the applicability of our formulation in biventricular simulations using a coarse mesh size of ~1 mm, and show that the activation wave pattern closely follows that obtained in fine-mesh simulations at a fraction of the computation time, thus improving the accuracy-efficiency trade-off of cardiac simulations.

  7. Computing a Non-trivial Lower Bound on the Joint Entropy between Two Images

    Energy Technology Data Exchange (ETDEWEB)

    Perumalla, Kalyan S. [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2017-03-01

    In this report, a non-trivial lower bound on the joint entropy of two non-identical images is developed, which is greater than the individual entropies of the images. The lower bound is the least joint entropy possible among all pairs of images that have the same histograms as those of the given images. New algorithms are presented to compute the joint entropy lower bound with a computation time proportional to S log S, where S is the number of histogram bins of the images. This is faster than the traditional methods of computing the exact joint entropy with a computation time that is quadratic in S.
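
    To make the histogram-based flavor of such a computation concrete, the sketch below pairs off probability mass from two descending-sorted histograms and reports the entropy of the resulting joint distribution; with the sorts, it costs O(S log S). This greedy coupling is only an illustration of an S log S computation over histograms: it is not necessarily the exact bound algorithm of the cited report, and the example histograms are invented.

      #include <math.h>
      #include <stdio.h>
      #include <stdlib.h>

      static int cmp_desc(const void *a, const void *b)
      {
          double x = *(const double *)a, y = *(const double *)b;
          return (x < y) - (x > y);
      }

      /* Greedy coupling of two histograms (each normalized to sum 1):
       * sort both in descending order and repeatedly pair off
       * min(p[i], q[j]) of probability mass. The result is a joint
       * distribution consistent with both marginals; sorting dominates,
       * so the cost is O(S log S). */
      static double coupled_entropy(double *p, double *q, int s)
      {
          qsort(p, s, sizeof *p, cmp_desc);
          qsort(q, s, sizeof *q, cmp_desc);
          double h = 0.0;
          int i = 0, j = 0;
          while (i < s && j < s) {
              double m = p[i] < q[j] ? p[i] : q[j];
              if (m > 0.0) h -= m * log2(m);
              p[i] -= m; q[j] -= m;
              if (p[i] <= 0.0) i++;
              if (q[j] <= 0.0) j++;
          }
          return h;   /* joint entropy of the greedy coupling, in bits */
      }

      int main(void)
      {
          double p[] = {0.5, 0.25, 0.25, 0.0};   /* made-up histograms */
          double q[] = {0.4, 0.3, 0.2, 0.1};
          printf("greedy joint entropy: %.4f bits\n", coupled_entropy(p, q, 4));
          return 0;
      }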

  8. Multi-GPU-based acceleration of the explicit time domain volume integral equation solver using MPI-OpenACC

    KAUST Repository

    Feki, Saber

    2013-07-01

    An explicit marching-on-in-time (MOT)-based time-domain volume integral equation (TDVIE) solver has recently been developed for characterizing transient electromagnetic wave interactions on arbitrarily shaped dielectric bodies (A. Al-Jarro et al., IEEE Trans. Antennas Propag., vol. 60, no. 11, 2012). The solver discretizes the spatio-temporal convolutions of the source fields with the background medium's Green function using nodal discretization in space and linear interpolation in time. The Green tensor, which involves second order spatial and temporal derivatives, is computed using finite differences on the temporal and spatial grid. A predictor-corrector algorithm is used to maintain the stability of the MOT scheme. The simplicity of the discretization scheme permits the computation of the discretized spatio-temporal convolutions on the fly during time marching; no 'interaction' matrices are pre-computed or stored, resulting in a memory-efficient scheme. As a result, most often the applicability of this solver to the characterization of wave interactions on electrically large structures is limited by the computation time but not the memory. © 2013 IEEE.

  9. A lightweight communication library for distributed computing

    NARCIS (Netherlands)

    Groen, D.; Rieder, S.; Grosso, P.; de Laat, C.; Portegies Zwart, S.

    2010-01-01

    We present MPWide, a platform-independent communication library for performing message passing between computers. Our library allows coupling of several local message passing interface (MPI) applications through a long-distance network and is specifically optimized for such communications.

  10. Development of a real time imaging-based guidance system of magnetic nanoparticles for targeted drug delivery

    International Nuclear Information System (INIS)

    Zhang, Xingming; Le, Tuan-Anh; Yoon, Jungwon

    2017-01-01

    Targeted drug delivery using magnetic nanoparticles is an efficient technique as molecules can be directed toward specific tissues inside a human body. For the first time, we implemented a real-time imaging-based guidance system of nanoparticles using untethered electro-magnetic devices for simultaneous guiding and tracking. In this paper a low-amplitude-excitation-field magnetic particle imaging (MPI) is introduced. Based on this imaging technology, a hybrid system comprised of an electromagnetic actuator and MPI was used to navigate nanoparticles in a non-invasive way. The real-time low-amplitude-excitation-field MPI and electromagnetic actuator of this navigation system are achieved by applying a time-division multiplexing scheme to the coil topology. A one dimensional nanoparticle navigation system was built to demonstrate the feasibility of the proposed approach and it could achieve a 2 Hz navigation update rate with the field gradient of 3.5 T/m during the imaging mode and 8.75 T/m during the actuation mode. Particles with both 90 nm and 5 nm diameters could be successfully manipulated and monitored in a tube through the proposed system, which can significantly enhance targeting efficiency and allow precise analysis in a real drug delivery. - Highlights: • A real-time system comprised of an electromagnetic actuator and a low-amplitude-excitation-field MPI can navigate magnetic nanoparticles. • The imaging scheme is feasible to enlarge field of view size. • The proposed navigation system can be cost efficient, compact, and optimized for targeting of the nanoparticles.

  11. Development of a real time imaging-based guidance system of magnetic nanoparticles for targeted drug delivery

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Xingming [School of Naval Architecture and Ocean Engineering, Harbin Institute of Technology at Weihai, Weihai, Shandong (China); School of Mechanical and Aerospace Engineering & ReCAPT, Gyeongsang National University, Jinju 660-701 (Korea, Republic of); Le, Tuan-Anh [School of Mechanical and Aerospace Engineering & ReCAPT, Gyeongsang National University, Jinju 660-701 (Korea, Republic of); Yoon, Jungwon, E-mail: jwyoon@gnu.ac.kr [School of Mechanical and Aerospace Engineering & ReCAPT, Gyeongsang National University, Jinju 660-701 (Korea, Republic of)

    2017-04-01

    Targeted drug delivery using magnetic nanoparticles is an efficient technique as molecules can be directed toward specific tissues inside a human body. For the first time, we implemented a real-time imaging-based guidance system of nanoparticles using untethered electro-magnetic devices for simultaneous guiding and tracking. In this paper a low-amplitude-excitation-field magnetic particle imaging (MPI) is introduced. Based on this imaging technology, a hybrid system comprised of an electromagnetic actuator and MPI was used to navigate nanoparticles in a non-invasive way. The real-time low-amplitude-excitation-field MPI and electromagnetic actuator of this navigation system are achieved by applying a time-division multiplexing scheme to the coil topology. A one dimensional nanoparticle navigation system was built to demonstrate the feasibility of the proposed approach and it could achieve a 2 Hz navigation update rate with the field gradient of 3.5 T/m during the imaging mode and 8.75 T/m during the actuation mode. Particles with both 90 nm and 5 nm diameters could be successfully manipulated and monitored in a tube through the proposed system, which can significantly enhance targeting efficiency and allow precise analysis in a real drug delivery. - Highlights: • A real-time system comprised of an electromagnetic actuator and a low-amplitude-excitation-field MPI can navigate magnetic nanoparticles. • The imaging scheme is feasible to enlarge field of view size. • The proposed navigation system can be cost efficient, compact, and optimized for targeting of the nanoparticles.

  12. Cluster Computing For Real Time Seismic Array Analysis.

    Science.gov (United States)

    Martini, M.; Giudicepietro, F.

    A seismic array is an instrument composed of a dense distribution of seismic sensors that allows measurement of the directional properties of the wavefield (slowness or wavenumber vector) radiated by a seismic source. Over the last years arrays have been widely used in different fields of seismological research. In particular they are applied in the investigation of seismic sources on volcanoes, where they can be successfully used for studying the volcanic microtremor and long-period events which are critical for getting information on the evolution of volcanic systems. For this reason arrays could be usefully employed for volcano monitoring; however, the huge amount of data produced by this type of instrument and the quite time-consuming processing techniques have limited their potential for this application. In order to favour a direct application of array techniques to continuous volcano monitoring, we designed and built a small PC cluster able to compute in near real time the kinematic properties of the wavefield (slowness or wavenumber vector) produced by a local seismic source. The cluster is composed of 8 Intel Pentium-III bi-processor PCs working at 550 MHz and has 4 Gigabytes of RAM memory. It runs under the Linux operating system. The analysis software package developed is based on the MUltiple SIgnal Classification (MUSIC) algorithm and is written in Fortran. The message-passing part is based upon the LAM programming environment package, an open-source implementation of the Message Passing Interface (MPI). The software system includes modules devoted to receiving data via the internet and graphical applications for continuously displaying the processing results. The system has been tested with a data set collected during a seismic experiment conducted on Etna in 1999, when two dense seismic arrays were deployed on the northeast and the southeast flanks of the volcano. A real-time continuous acquisition system was simulated.

  13. The reliable solution and computation time of variable parameters Logistic model

    OpenAIRE

    Pengfei, Wang; Xinnong, Pan

    2016-01-01

    The reliable computation time (RCT, denoted Tc) of a double-precision computation of a variable-parameter logistic map (VPLM) is studied. First, using the proposed method, reliable solutions for the logistic map are obtained. Second, for a time-dependent, non-stationary-parameter VPLM, 10000 samples of reliable experiments are constructed, and the mean Tc is then computed. The results indicate that for each different initial value, the Tcs of the VPLM are generally different...
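
    The notion of a reliable computation time can be illustrated by iterating the same map at two floating-point precisions and recording when the trajectories part ways. The C sketch below uses a fixed-parameter logistic map with illustrative values (r = 4, tolerance 1e-6, 1000 steps), not the variable-parameter setup or thresholds of the cited study.

      #include <math.h>
      #include <stdio.h>

      /* Iterate x <- r x (1 - x) in double and in long double and report
       * the first step at which the trajectories differ by more than a
       * tolerance -- a rough proxy for the reliable computation time Tc. */
      int main(void)
      {
          double xd = 0.1;
          long double xl = 0.1L;       /* slightly different rounding of 0.1 */
          const double r = 4.0;        /* fully chaotic regime */
          const double tol = 1e-6;

          for (int n = 1; n <= 1000; n++) {
              xd = r * xd * (1.0 - xd);
              xl = 4.0L * xl * (1.0L - xl);
              if (fabsl((long double)xd - xl) > tol) {
                  printf("trajectories diverge at step %d\n", n);
                  return 0;
              }
          }
          printf("no divergence within 1000 steps\n");
          return 0;
      }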

  14. PhyloBayes MPI: phylogenetic reconstruction with infinite mixtures of profiles in a parallel environment.

    Science.gov (United States)

    Lartillot, Nicolas; Rodrigue, Nicolas; Stubbs, Daniel; Richer, Jacques

    2013-07-01

    Modeling across site variation of the substitution process is increasingly recognized as important for obtaining more accurate phylogenetic reconstructions. Both finite and infinite mixture models have been proposed and have been shown to significantly improve on classical single-matrix models. Compared with their finite counterparts, infinite mixtures have a greater expressivity. However, they are computationally more challenging. This has resulted in practical compromises in the design of infinite mixture models. In particular, a fast but simplified version of a Dirichlet process model over equilibrium frequency profiles implemented in PhyloBayes has often been used in recent phylogenomics studies, while more refined model structures, more realistic and empirically more fit, have been practically out of reach. We introduce a message passing interface version of PhyloBayes, implementing the Dirichlet process mixture models as well as more classical empirical matrices and finite mixtures. The parallelization is made efficient thanks to the combination of two algorithmic strategies: a partial Gibbs sampling update of the tree topology and the use of a truncated stick-breaking representation for the Dirichlet process prior. The implementation shows close to linear gains in computational speed for up to 64 cores, thus allowing faster phylogenetic reconstruction under complex mixture models. PhyloBayes MPI is freely available from our website www.phylobayes.org.

  15. Multi-GPU hybrid programming accelerated three-dimensional phase-field model in binary alloy

    Directory of Open Access Journals (Sweden)

    Changsheng Zhu

    2018-03-01

    Full Text Available In the process of dendritic growth simulation, the computational efficiency and the problem scale have an extremely important influence on the simulation efficiency of a three-dimensional phase-field model. Thus, seeking a high-performance calculation method to improve computational efficiency and to expand the problem scale is of great significance for research on the microstructure of materials. A high-performance calculation method based on the MPI+CUDA hybrid programming model is introduced. Multiple GPUs are used to implement quantitative numerical simulations of a three-dimensional phase-field model in a binary alloy under the condition of coupled multi-physical processes. The acceleration effect of different GPU node counts on different calculation scales is explored. On the foundation of the multi-GPU calculation model that has been introduced, two optimization schemes, non-blocking communication optimization and overlap of MPI and GPU computing optimization, are proposed. The results of the two optimization schemes and the basic multi-GPU model are compared. The calculation results show that the multi-GPU calculation model can obviously improve the computational efficiency of the three-dimensional phase-field model, being 13 times faster than a single GPU, and the problem scale has been expanded to 8193. The feasibility of the two optimization schemes is shown, and the overlap of MPI and GPU computing optimization has the better performance, 1.7 times faster than the basic multi-GPU model when 21 GPUs are used.
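
    The overlap optimization described above, hiding MPI transfers behind computation, follows a standard pattern: post non-blocking halo exchanges, update the interior that needs no remote data, then wait and finish the boundary. A one-dimensional C sketch of that pattern follows; the grid size and update rule are illustrative, and a plain loop stands in for the paper's GPU kernels.

      #include <mpi.h>
      #include <stdlib.h>

      #define N 4096   /* local slab size, illustrative */

      int main(int argc, char **argv)
      {
          MPI_Init(&argc, &argv);
          int rank, size;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);
          MPI_Comm_size(MPI_COMM_WORLD, &size);
          int up = (rank + 1) % size, down = (rank - 1 + size) % size;

          double *f = calloc(N + 2, sizeof *f);   /* cells 0 and N+1 are halos */
          MPI_Request reqs[4];

          for (int step = 0; step < 100; step++) {
              /* 1. start the halo exchange of the boundary cells */
              MPI_Irecv(&f[0],     1, MPI_DOUBLE, down, 0, MPI_COMM_WORLD, &reqs[0]);
              MPI_Irecv(&f[N + 1], 1, MPI_DOUBLE, up,   1, MPI_COMM_WORLD, &reqs[1]);
              MPI_Isend(&f[1],     1, MPI_DOUBLE, down, 1, MPI_COMM_WORLD, &reqs[2]);
              MPI_Isend(&f[N],     1, MPI_DOUBLE, up,   0, MPI_COMM_WORLD, &reqs[3]);

              /* 2. update interior points that need no halo data; this is
               * where the GPU kernel would run in the paper's scheme */
              for (int i = 2; i < N; i++)
                  f[i] = 0.5 * (f[i - 1] + f[i + 1]);

              /* 3. finish communication, then update the boundary points */
              MPI_Waitall(4, reqs, MPI_STATUSES_IGNORE);
              f[1] = 0.5 * (f[0] + f[2]);
              f[N] = 0.5 * (f[N - 1] + f[N + 1]);
          }
          free(f);
          MPI_Finalize();
          return 0;
      }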

  16. Time dependency of the prediction skill for the North Atlantic subpolar gyre in initialized decadal hindcasts with MPI-ESM

    Science.gov (United States)

    Brune, Sebastian; Düsterhus, Andre; Pohlmann, Holger; Müller, Wolfgang; Baehr, Johanna

    2017-04-01

    We analyze the time dependency of decadal hindcast skill in the North Atlantic subpolar gyre within the time period 1961-2013. We compare anomaly correlation coefficients and interquartile ranges of total upper ocean heat content and sea surface temperature for three differently initialized sets of hindcast simulations with the global coupled model MPI-ESM. All initializations use weakly coupled assimilation with the same full-field nudging in the atmospheric component and different assimilation techniques for oceanic temperature and salinity: (1) ensemble Kalman filter assimilating EN4 and HadISST observations, (2) nudging of anomalies to ORAS4 reanalysis, (3) nudging of full values to ORAS4 reanalysis. We find that hindcast skill depends strongly on the evaluation time period, with higher hindcast skill during strong multiyear trends and lower hindcast skill in the absence of such trends. While there may only be small differences between the prediction systems in the analysis focusing on the entire hindcast period, these differences between the hindcast systems are much more pronounced when investigating any 20-year subperiod within the entire hindcast period. For the ensemble Kalman filter high skill in the assimilation experiment is generally linked to high skill in the initialized hindcasts. Such direct link does not seem to exist in the hindcasts initialized by either nudged system. In the ensemble Kalman filter initialized hindcasts, we find significant hindcast skill for up to 5 to 8 lead years, except for the 1970s. In the nudged system initialized hindcasts, hindcast skill is consistently diminished in lead years 2 and 3 with lowest skill in the 1970s as well. Overall, we find that a model-consistent assimilation technique can improve hindcast skill. Further, the evaluation of 20 year subperiods within the full hindcast period provides essential insights to judge the success of both the assimilation and the subsequent hindcast skill.

  17. Patient-specific non-linear finite element modelling for predicting soft organ deformation in real-time: application to non-rigid neuroimage registration.

    Science.gov (United States)

    Wittek, Adam; Joldes, Grand; Couton, Mathieu; Warfield, Simon K; Miller, Karol

    2010-12-01

    Long computation times of non-linear (i.e. accounting for geometric and material non-linearity) biomechanical models have been regarded as one of the key factors preventing application of such models in predicting organ deformation for image-guided surgery. This contribution presents real-time patient-specific computation of the deformation field within the brain for six cases of brain shift induced by craniotomy (i.e. surgical opening of the skull) using specialised non-linear finite element procedures implemented on a graphics processing unit (GPU). In contrast to commercial finite element codes that rely on an updated Lagrangian formulation and implicit integration in time domain for steady state solutions, our procedures utilise the total Lagrangian formulation with explicit time stepping and dynamic relaxation. We used patient-specific finite element meshes consisting of hexahedral and non-locking tetrahedral elements, together with realistic material properties for the brain tissue and appropriate contact conditions at the boundaries. The loading was defined by prescribing deformations on the brain surface under the craniotomy. Application of the computed deformation fields to register (i.e. align) the preoperative and intraoperative images indicated that the models very accurately predict the intraoperative deformations within the brain. For each case, computing the brain deformation field took less than 4 s using an NVIDIA Tesla C870 GPU, which is two orders of magnitude reduction in computation time in comparison to our previous study in which the brain deformation was predicted using a commercial finite element solver executed on a personal computer. Copyright © 2010 Elsevier Ltd. All rights reserved.

  18. SeMPI: a genome-based secondary metabolite prediction and identification web server.

    Science.gov (United States)

    Zierep, Paul F; Padilla, Natàlia; Yonchev, Dimitar G; Telukunta, Kiran K; Klementz, Dennis; Günther, Stefan

    2017-07-03

    The secondary metabolism of bacteria, fungi and plants yields a vast number of bioactive substances. The constantly increasing amount of published genomic data provides the opportunity for an efficient identification of gene clusters by genome mining. Conversely, for many natural products with resolved structures, the encoding gene clusters have not been identified yet. Even though genome mining tools have become significantly more efficient in the identification of biosynthetic gene clusters, structural elucidation of the actual secondary metabolite is still challenging, especially due to as yet unpredictable post-modifications. Here, we introduce SeMPI, a web server providing a prediction and identification pipeline for natural products synthesized by type I modular polyketide synthases (PKS). In order to limit the possible structures of PKS products and to include putative tailoring reactions, a structural comparison with annotated natural products was introduced. Furthermore, a benchmark was designed based on 40 gene clusters with annotated PKS products. The web server of the pipeline (SeMPI) is freely available at: http://www.pharmaceutical-bioinformatics.de/sempi. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  19. Scalable space-time adaptive simulation tools for computational electrocardiology

    OpenAIRE

    Krause, Dorian; Krause, Rolf

    2013-01-01

    This work is concerned with the development of computational tools for the solution of reaction-diffusion equations from the field of computational electrocardiology. We designed lightweight spatially and space-time adaptive schemes for large-scale parallel simulations. We propose two different adaptive schemes based on locally structured meshes, managed either via a conforming coarse tessellation or a forest of shallow trees. A crucial ingredient of our approach is a non-conforming mortar...

  20. Numerical Modeling of 3D Seismic Wave Propagation around Yogyakarta, the Southern Part of Central Java, Indonesia, Using Spectral-Element Method on MPI-GPU Cluster

    Science.gov (United States)

    Sudarmaji; Rudianto, Indra; Eka Nurcahya, Budi

    2018-04-01

    A strong tectonic earthquake with a magnitude of 5.9 on the Richter scale occurred in Yogyakarta and Central Java on May 26, 2006. The earthquake caused severe damage in Yogyakarta and the southern part of Central Java, Indonesia. Understanding the seismic response to an earthquake, in terms of ground shaking and the level of building damage, is important. We present numerical modeling of 3D seismic wave propagation around Yogyakarta and the southern part of Central Java using the spectral-element method on an MPI-GPU (Graphics Processing Unit) computer cluster to observe the seismic response due to the earthquake. A homogeneous 3D realistic model is generated with a detailed topography surface. The influences of the free-surface topography and the layer discontinuities of the 3D model on the seismic response are observed. The seismic wave field is discretized using the spectral-element method, which is solved on a mesh of hexahedral elements adapted to the free-surface topography and the internal discontinuities of the model. To increase the data processing capability, the simulation is performed on a GPU cluster with an implementation of MPI (Message Passing Interface).

  1. How to stir a revolution as a reluctant rebel: Rudolf Trümpy in the Alps

    Science.gov (United States)

    Şengör, A. M. Celâl; Bernoulli, Daniel

    2011-07-01

    Rudolf Trümpy (1921-2009) was one of the great Alpine geologists of the twentieth century and an influential figure in the international geological community. He played a dominant role in the change of opinion concerning the Alpine evolution by showing that normal faulting dominated the early development of the Alpine realm from the Triassic to the early Cretaceous. This provided a convenient model for later plate-tectonic interpretations of collisional mountain belts. His further recognition of strike-slip faulting during all stages of the Alpine evolution presaged the realisation that the Alps were not built by a simple open-and-shut mechanism. Trümpy was educated during an intellectual lull, a time when simplistic models of the earth behaviour inherited from the middle of the nineteenth century became prevalent under the influence of a close-minded, positivist approach to geological problems. This period, which we term the Dark Intermezzo, lasted from about 1925 to 1965. The grand syntheses of Suess and Argand which preceded this period were viewed from this narrow angle and consequently misunderstood. It was thought that earth history was punctuated by global orogenic events of short duration taking place within and among continents and oceans whose relative positions had remained fixed since the origin of the planet. These views, summarised under the term `fixism', were developed when the ocean floors were almost totally unknown. When data began coming in from the post World War II oceanographic surveys, the world geological community was slow to receive and digest them. Trümpy followed these developments closely, realising that his work was important in placing the geology of the mountain belts within the emerging, new theoretical framework. He adopted the position of a critic and emphasised where detailed knowledge of the Alps, unquestionably the best known mountain belt in the world, supported and where it contradicted the new ideas.

  2. Advanced computational simulations of water waves interacting with wave energy converters

    Science.gov (United States)

    Pathak, Ashish; Freniere, Cole; Raessi, Mehdi

    2017-03-01

    Wave energy converter (WEC) devices harness the renewable ocean wave energy and convert it into useful forms of energy, e.g. mechanical or electrical. This paper presents an advanced 3D computational framework to study the interaction between water waves and WEC devices. The computational tool solves the full Navier-Stokes equations and considers all important effects impacting the device performance. To enable large-scale simulations in fast turnaround times, the computational solver was developed in an MPI parallel framework. A fast multigrid preconditioned solver is introduced to solve the computationally expensive pressure Poisson equation. The computational solver was applied to two surface-piercing WEC geometries: bottom-hinged cylinder and flap. Their numerically simulated response was validated against experimental data. Additional simulations were conducted to investigate the applicability of Froude scaling in predicting full-scale WEC response from the model experiments.

  3. astroABC : An Approximate Bayesian Computation Sequential Monte Carlo sampler for cosmological parameter estimation

    Energy Technology Data Exchange (ETDEWEB)

    Jennings, E.; Madigan, M.

    2017-04-01

    Given the complexity of modern cosmological parameter inference, where we are faced with non-Gaussian data and noise, correlated systematics and multi-probe correlated data sets, the Approximate Bayesian Computation (ABC) method is a promising alternative to traditional Markov Chain Monte Carlo approaches in the case where the Likelihood is intractable or unknown. The ABC method is called "Likelihood free" as it avoids explicit evaluation of the Likelihood by using a forward model simulation of the data which can include systematics. We introduce astroABC, an open source ABC Sequential Monte Carlo (SMC) sampler for parameter estimation. A key challenge in astrophysics is the efficient use of large multi-probe datasets to constrain high dimensional, possibly correlated parameter spaces. With this in mind astroABC allows for massive parallelization using MPI, a framework that handles spawning of jobs across multiple nodes. A key new feature of astroABC is the ability to create MPI groups with different communicators, one for the sampler and several others for the forward model simulation, which speeds up sampling time considerably. For smaller jobs the Python multiprocessing option is also available. Other key features include: a Sequential Monte Carlo sampler, a method for iteratively adapting tolerance levels, local covariance estimate using scikit-learn's KDTree, modules for specifying optimal covariance matrix for a component-wise or multivariate normal perturbation kernel, output and restart files are backed up every iteration, user defined metric and simulation methods, a module for specifying heterogeneous parameter priors including non-standard prior PDFs, a module for specifying a constant, linear, log or exponential tolerance level, well-documented examples and sample scripts. This code is hosted online at https://github.com/EliseJ/astroABC.
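
    astroABC itself is a Python package, but the communicator-grouping idea it describes maps directly onto MPI_Comm_split: ranks are colored into disjoint groups, one for the sampler and the rest for forward-model simulations, and collectives on a group communicator involve only that group. A generic C sketch follows; the group size of 4 is an arbitrary illustration, not astroABC's configuration.

      #include <mpi.h>
      #include <stdio.h>

      int main(int argc, char **argv)
      {
          MPI_Init(&argc, &argv);
          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          /* Group 0: the sampler; groups 1..n: forward-model simulations.
           * Four ranks per group, purely for illustration. */
          int color = rank / 4;
          MPI_Comm group_comm;
          MPI_Comm_split(MPI_COMM_WORLD, color, rank, &group_comm);

          int grank, gsize;
          MPI_Comm_rank(group_comm, &grank);
          MPI_Comm_size(group_comm, &gsize);
          printf("world rank %d -> group %d, local rank %d/%d\n",
                 rank, color, grank, gsize);

          /* collectives on group_comm now involve only that group, so each
           * forward-model simulation runs independently of the sampler */
          MPI_Comm_free(&group_comm);
          MPI_Finalize();
          return 0;
      }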

  4. Computationally efficient real-time interpolation algorithm for non-uniform sampled biosignals.

    Science.gov (United States)

    Guven, Onur; Eftekhar, Amir; Kindt, Wilko; Constandinou, Timothy G

    2016-06-01

    This Letter presents a novel, computationally efficient interpolation method that has been optimised for use in electrocardiogram baseline drift removal. In the authors' previous Letter three isoelectric baseline points per heartbeat are detected, and here utilised as interpolation points. As an extension of linear interpolation, their algorithm segments the interpolation interval and utilises different piecewise linear equations. Thus, the algorithm produces a linear curvature that is computationally efficient while interpolating non-uniform samples. The proposed algorithm is tested using sinusoids with different fundamental frequencies from 0.05 to 0.7 Hz and also validated with real baseline wander data acquired from the Massachusetts Institute of Technology and Boston's Beth Israel Hospital (MIT-BIH) Noise Stress Database. The synthetic data results show a root mean square (RMS) error of 0.9 μV (mean), 0.63 μV (median) and 0.6 μV (standard deviation) per heartbeat on a 1 mVp-p 0.1 Hz sinusoid. On real data, they obtain an RMS error of 10.9 μV (mean), 8.5 μV (median) and 9.0 μV (standard deviation) per heartbeat. Cubic spline interpolation and linear interpolation, on the other hand, show 10.7 μV, 11.6 μV (mean), 7.8 μV, 8.9 μV (median) and 9.8 μV, 9.3 μV (standard deviation) per heartbeat.
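
    The core operation, evaluating a baseline through a handful of isoelectric knots and reading off the drift, can be sketched with ordinary piecewise-linear interpolation. The C example below is a generic version of that idea, not the segmented multi-equation variant of the Letter, and the knot values are invented for illustration.

      #include <stdio.h>

      /* Evaluate a piecewise-linear baseline through knots (tk, vk) at
       * time t; queries outside the knot range clamp to the end values. */
      static double pw_linear(const double *tk, const double *vk, int n, double t)
      {
          if (t <= tk[0]) return vk[0];
          for (int i = 1; i < n; i++) {
              if (t <= tk[i]) {
                  double w = (t - tk[i - 1]) / (tk[i] - tk[i - 1]);
                  return vk[i - 1] + w * (vk[i] - vk[i - 1]);
              }
          }
          return vk[n - 1];
      }

      int main(void)
      {
          /* hypothetical isoelectric knots: (time in s, baseline in mV) */
          double tk[] = {0.0, 0.3, 0.7, 1.0};
          double vk[] = {0.00, 0.05, 0.02, -0.01};

          for (double t = 0.0; t <= 1.0; t += 0.25)
              printf("t=%.2f  estimated drift=%.4f mV\n",
                     t, pw_linear(tk, vk, 4, t));
          return 0;
      }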

  5. Non-Abelian Kubo formula and the multiple time-scale method

    International Nuclear Information System (INIS)

    Zhang, X.; Li, J.

    1996-01-01

    The non-Abelian Kubo formula is derived from the kinetic theory. That expression is compared with the one obtained using the eikonal for a Chern–Simons theory. The multiple time-scale method is used to study the non-Abelian Kubo formula, and the damping rate for longitudinal color waves is computed. copyright 1996 Academic Press, Inc

  6. Cpu/gpu Computing for AN Implicit Multi-Block Compressible Navier-Stokes Solver on Heterogeneous Platform

    Science.gov (United States)

    Deng, Liang; Bai, Hanli; Wang, Fang; Xu, Qingxin

    2016-06-01

    CPU/GPU computing allows scientists to tremendously accelerate their numerical codes. In this paper, we port and optimize a double precision alternating direction implicit (ADI) solver for three-dimensional compressible Navier-Stokes equations from our in-house Computational Fluid Dynamics (CFD) software on heterogeneous platform. First, we implement a full GPU version of the ADI solver to remove a lot of redundant data transfers between CPU and GPU, and then design two fine-grain schemes, namely “one-thread-one-point” and “one-thread-one-line”, to maximize the performance. Second, we present a dual-level parallelization scheme using the CPU/GPU collaborative model to exploit the computational resources of both multi-core CPUs and many-core GPUs within the heterogeneous platform. Finally, considering the fact that memory on a single node becomes inadequate when the simulation size grows, we present a tri-level hybrid programming pattern MPI-OpenMP-CUDA that merges fine-grain parallelism using OpenMP and CUDA threads with coarse-grain parallelism using MPI for inter-node communication. We also propose a strategy to overlap the computation with communication using the advanced features of CUDA and MPI programming. We obtain speedups of 6.0 for the ADI solver on one Tesla M2050 GPU in contrast to two Xeon X5670 CPUs. Scalability tests show that our implementation can offer significant performance improvement on heterogeneous platform.
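
    The coarse/fine split of the tri-level MPI-OpenMP-CUDA pattern described above can be reduced to a few lines: MPI ranks own nodes, OpenMP threads share each rank's work, and (in the paper) CUDA kernels sit inside the threaded region. A minimal MPI+OpenMP skeleton in C showing the two outer levels, with a stand-in for the per-thread work:

      #include <mpi.h>
      #include <omp.h>
      #include <stdio.h>

      int main(int argc, char **argv)
      {
          /* request an MPI library that tolerates threaded callers */
          int provided;
          MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
          if (provided < MPI_THREAD_FUNNELED)
              MPI_Abort(MPI_COMM_WORLD, 1);

          int rank;
          MPI_Comm_rank(MPI_COMM_WORLD, &rank);

          double local = 0.0;
          /* coarse grain: one MPI rank per node; fine grain: OpenMP
           * threads; the paper adds CUDA kernels inside this region */
          #pragma omp parallel reduction(+:local)
          {
              int tid = omp_get_thread_num();
              local += tid + 1;     /* stand-in for per-thread work */
          }

          double global = 0.0;      /* inter-node reduction via MPI */
          MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
          if (rank == 0)
              printf("global result: %f\n", global);

          MPI_Finalize();
          return 0;
      }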

  7. Matrix Factorizations at Scale: a Comparison of Scientific Data Analytics in Spark and C+MPI Using Three Case Studies

    Energy Technology Data Exchange (ETDEWEB)

    Gittens, Alex; Devarakonda, Aditya; Racah, Evan; Ringenburg, Michael; Gerhardt, Lisa; Kottalam, Jey; Liu, Jialin; Maschhoff, Kristyn; Canon, Shane; Chhugani, Jatin; Sharma, Pramod; Yang, Jiyan; Demmel, James; Harrell, Jim; Krishnamurthy, Venkat; Mahoney, Michael; Prabhat, Mr

    2016-05-12

    We explore the trade-offs of performing linear algebra using Apache Spark, compared to traditional C and MPI implementations on HPC platforms. Spark is designed for data analytics on cluster computing platforms with access to local disks and is optimized for data-parallel tasks. We examine three widely-used and important matrix factorizations: NMF (for physical plausibility), PCA (for its ubiquity) and CX (for data interpretability). We apply these methods to 1.6TB particle physics, 2.2TB and 16TB climate modeling and 1.1TB bioimaging data. The data matrices are tall-and-skinny which enable the algorithms to map conveniently into Spark’s data parallel model. We perform scaling experiments on up to 1600 Cray XC40 nodes, describe the sources of slowdowns, and provide tuning guidance to obtain high performance.

  8. The ELPA library: scalable parallel eigenvalue solutions for electronic structure theory and computational science.

    Science.gov (United States)

    Marek, A; Blum, V; Johanni, R; Havu, V; Lang, B; Auckenthaler, T; Heinecke, A; Bungartz, H-J; Lederer, H

    2014-05-28

    Obtaining the eigenvalues and eigenvectors of large matrices is a key problem in electronic structure theory and many other areas of computational science. The computational effort formally scales as O(N³) with the size of the investigated problem, N (e.g. the electron count in electronic structure theory), and thus often defines the system size limit that practical calculations cannot overcome. In many cases, more than just a small fraction of the possible eigenvalue/eigenvector pairs is needed, so that iterative solution strategies that focus only on a few eigenvalues become ineffective. Likewise, it is not always desirable or practical to circumvent the eigenvalue solution entirely. We here review some current developments regarding dense eigenvalue solvers and then focus on the Eigenvalue soLvers for Petascale Applications (ELPA) library, which facilitates the efficient algebraic solution of symmetric and Hermitian eigenvalue problems for dense matrices that have real-valued and complex-valued matrix entries, respectively, on parallel computer platforms. ELPA addresses standard as well as generalized eigenvalue problems, relying on the well documented matrix layout of the Scalable Linear Algebra PACKage (ScaLAPACK) library but replacing all actual parallel solution steps with subroutines of its own. For these steps, ELPA significantly outperforms the corresponding ScaLAPACK routines and proprietary libraries that implement the ScaLAPACK interface (e.g. Intel's MKL). The most time-critical step is the reduction of the matrix to tridiagonal form and the corresponding backtransformation of the eigenvectors. ELPA offers both a one-step tridiagonalization (successive Householder transformations) and a two-step transformation that is more efficient especially towards larger matrices and larger numbers of CPU cores. ELPA is based on the MPI standard, with an early hybrid MPI-OpenMP implementation available as well. Scalability extends beyond 10,000 CPU cores.

  9. Non-Determinism: An Abstract Concept in Computer Science Studies

    Science.gov (United States)

    Armoni, Michal; Gal-Ezer, Judith

    2007-01-01

    Non-determinism is one of the most important, yet abstract, recurring concepts of Computer Science. It plays an important role in Computer Science areas such as formal language theory, computability theory, distributed computing, and operating systems. We conducted a series of studies on the perception of non-determinism. In the current research,…

  10. Massively parallel computation of PARASOL code on the Origin 3800 system

    International Nuclear Information System (INIS)

    Hosokawa, Masanari; Takizuka, Tomonori

    2001-10-01

    The divertor particle simulation code named PARASOL simulates open-field plasmas between divertor walls self-consistently by using an electrostatic PIC method and a binary collision Monte Carlo model. PARASOL, parallelized with MPI-1.1 for scalar parallel computers, originally ran on an Intel Paragon XP/S system. An SGI Origin 3800 system was newly installed in May 2001, and the parallel programming was improved at this switchover. As a result of the high-performance new hardware and this improvement, PARASOL was sped up by about 60 times with the same number of processors. (author)

  11. 12 CFR 1102.27 - Computing time.

    Science.gov (United States)

    2010-01-01

    ... 12 Banks and Banking 7 2010-01-01 2010-01-01 false Computing time. 1102.27 Section 1102.27 Banks... for Proceedings § 1102.27 Computing time. (a) General rule. In computing any period of time prescribed... time begins to run is not included. The last day so computed is included, unless it is a Saturday...

  12. Non-unitary probabilistic quantum computing

    Science.gov (United States)

    Gingrich, Robert M.; Williams, Colin P.

    2004-01-01

    We present a method for designing quantum circuits that perform non-unitary quantum computations on n-qubit states probabilistically, and give analytic expressions for the success probability and fidelity.

  13. Time-domain seismic modeling in viscoelastic media for full waveform inversion on heterogeneous computing platforms with OpenCL

    Science.gov (United States)

    Fabien-Ouellet, Gabriel; Gloaguen, Erwan; Giroux, Bernard

    2017-03-01

    Full Waveform Inversion (FWI) aims at recovering the elastic parameters of the Earth by matching recordings of the ground motion with the direct solution of the wave equation. Modeling the wave propagation for realistic scenarios is computationally intensive, which limits the applicability of FWI. The current hardware evolution brings increasing parallel computing power that can speed up the computations in FWI. However, to take advantage of the diversity of parallel architectures presently available, new programming approaches are required. In this work, we explore the use of OpenCL to develop a portable code that can take advantage of the many parallel processor architectures now available. We present a program called SeisCL for 2D and 3D viscoelastic FWI in the time domain. The code computes the forward and adjoint wavefields using finite differences and outputs the gradient of the misfit function given by the adjoint state method. To demonstrate the code portability on different architectures, the performance of SeisCL is tested on three different devices: Intel CPUs, NVIDIA GPUs and Intel Xeon Phi. Results show that the use of GPUs with OpenCL can speed up the computations by nearly two orders of magnitude over a single-threaded application on the CPU. Although OpenCL allows code portability, we show that some device-specific optimization is still required to get the best performance out of a specific architecture. Using OpenCL in conjunction with MPI allows the domain decomposition of large models on several devices located on different nodes of a cluster. For large enough models, the speedup of the domain decomposition varies quasi-linearly with the number of devices. Finally, we investigate two different approaches to compute the gradient by the adjoint state method and show the significant advantages of using OpenCL for FWI.
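
    The MPI domain decomposition mentioned above typically reduces to a halo (ghost-cell) exchange between neighbouring subdomains at each time step. A minimal 1-D sketch of that pattern follows (field and sizes are illustrative, not SeisCL's actual code):

        /* halo.c: 1-D domain decomposition with ghost-cell exchange.
           Compile: mpicc halo.c; run: mpirun -n 4 ./a.out */
        #include <mpi.h>
        #include <stdio.h>
        #define NLOC 8                 /* interior cells per rank */

        int main(int argc, char **argv) {
            MPI_Init(&argc, &argv);
            int rank, size;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            double u[NLOC + 2];        /* one ghost cell on each side */
            for (int i = 1; i <= NLOC; i++) u[i] = rank * 100.0 + i;

            int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
            int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

            /* swap boundary cells with both neighbours; MPI_PROC_NULL
               turns the exchange into a no-op at the physical edges */
            MPI_Sendrecv(&u[1], 1, MPI_DOUBLE, left, 0,
                         &u[NLOC + 1], 1, MPI_DOUBLE, right, 0,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);
            MPI_Sendrecv(&u[NLOC], 1, MPI_DOUBLE, right, 1,
                         &u[0], 1, MPI_DOUBLE, left, 1,
                         MPI_COMM_WORLD, MPI_STATUS_IGNORE);

            printf("rank %d ghosts: %g %g\n", rank, u[0], u[NLOC + 1]);
            MPI_Finalize();
            return 0;
        }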

  14. Fault-tolerant quantum computation for local non-Markovian noise

    International Nuclear Information System (INIS)

    Terhal, Barbara M.; Burkard, Guido

    2005-01-01

    We derive a threshold result for fault-tolerant quantum computation for local non-Markovian noise models. The role of error amplitude in our analysis is played by the product of the elementary gate time t_0 and the spectral width of the interaction Hamiltonian between system and bath. We discuss extensions of our model and the applicability of our analysis.

  15. Depth-Averaged Non-Hydrostatic Hydrodynamic Model Using a New Multithreading Parallel Computing Method

    Directory of Open Access Journals (Sweden)

    Ling Kang

    2017-03-01

    Compared to the hydrostatic hydrodynamic model, the non-hydrostatic hydrodynamic model can accurately simulate flows that feature vertical accelerations, but its low computational efficiency severely restricts its wider application. This paper proposes a non-hydrostatic hydrodynamic model based on a multithreading parallel computing method. The horizontal momentum equation is obtained by integrating the Navier–Stokes equations from the bottom to the free surface. The vertical momentum equation is approximated by the Keller-box scheme. A two-step method is used to solve the model equations. A parallel strategy based on block-decomposition computation is utilized: the original computational domain is subdivided into two subdomains that are physically connected via a virtual boundary technique, and two sub-threads are created and tasked with the computation of the two subdomains. The producer–consumer model and the thread-lock technique are used to achieve synchronous communication between sub-threads. The validity of the model was verified by solitary wave propagation experiments over a flat bottom and a slope, followed by two sinusoidal wave propagation experiments over a submerged breakwater. The parallel computing method proposed here was found to effectively enhance computational efficiency, saving 20%–40% of the computation time compared to serial computing. The parallel acceleration rate and acceleration efficiency are approximately 1.45 and 72%, respectively. The parallel computing method makes a contribution to the popularization of non-hydrostatic models.
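
    As a rough sketch of the block-decomposition idea (two sub-threads, each owning one subdomain, synchronizing at the virtual boundary), the following C fragment uses a pthread barrier in place of the paper's producer-consumer scheme; the Jacobi-style update and all names are illustrative:

        /* blocks.c: two subdomains, two threads, barrier-synchronized
           update of a shared 1-D field (compile: cc -pthread blocks.c) */
        #include <pthread.h>
        #include <stdio.h>

        #define NX 16            /* interior points, split in two */
        #define NTHREADS 2
        #define NSTEPS 100

        static double u[NX + 2]; /* shared field, fixed end values */
        static pthread_barrier_t bar;

        static void *worker(void *arg) {
            int id = *(int *)arg;
            int lo = 1 + id * (NX / NTHREADS);    /* my block */
            int hi = lo + NX / NTHREADS - 1;
            double tmp[NX + 2];
            for (int s = 0; s < NSTEPS; s++) {
                for (int i = lo; i <= hi; i++)    /* read phase */
                    tmp[i] = 0.5 * (u[i - 1] + u[i + 1]);
                pthread_barrier_wait(&bar);       /* virtual boundary */
                for (int i = lo; i <= hi; i++)    /* write phase */
                    u[i] = tmp[i];
                pthread_barrier_wait(&bar);
            }
            return NULL;
        }

        int main(void) {
            u[0] = 1.0;                           /* boundary values */
            u[NX + 1] = 0.0;
            pthread_barrier_init(&bar, NULL, NTHREADS);
            pthread_t t[NTHREADS];
            int id[NTHREADS];
            for (int i = 0; i < NTHREADS; i++) {
                id[i] = i;
                pthread_create(&t[i], NULL, worker, &id[i]);
            }
            for (int i = 0; i < NTHREADS; i++)
                pthread_join(t[i], NULL);
            pthread_barrier_destroy(&bar);
            printf("u[1] = %f, u[NX] = %f\n", u[1], u[NX]);
            return 0;
        }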

  16. Time dependent non-extinction probability for prompt critical systems

    International Nuclear Information System (INIS)

    Gregson, M. W.; Prinja, A. K.

    2009-01-01

    The time dependent non-extinction probability equation is presented for slab geometry. Numerical solutions are provided for a nested inner/outer iteration routine where the fission terms (both linear and non-linear) are updated and then held fixed over the inner scattering iteration. Time dependent results are presented highlighting the importance of the injection position and angle. The iteration behavior is also described as the steady state probability of initiation is approached for both small and large time steps. Theoretical analysis of the nested iteration scheme is shown and highlights poor numerical convergence for marginally prompt critical systems. An acceleration scheme for the outer iterations is presented to improve convergence of such systems. Theoretical analysis of the acceleration scheme is also provided and the associated decrease in computational run time addressed. (authors)

  17. 12 CFR 622.21 - Computing time.

    Science.gov (United States)

    2010-01-01

    ... 12 Banks and Banking 6 2010-01-01 2010-01-01 false Computing time. 622.21 Section 622.21 Banks and... Formal Hearings § 622.21 Computing time. (a) General rule. In computing any period of time prescribed or... run is not to be included. The last day so computed shall be included, unless it is a Saturday, Sunday...

  18. Performance Characteristics of Hybrid MPI/OpenMP Scientific Applications on a Large-Scale Multithreaded BlueGene/Q Supercomputer

    KAUST Repository

    Wu, Xingfu; Taylor, Valerie

    2013-01-01

    In this paper, we investigate the performance characteristics of five hybrid MPI/OpenMP scientific applications (two NAS Parallel Benchmarks Multi-Zone, SP-MZ and BT-MZ; an earthquake simulation, PEQdyna; an aerospace application, PMLB; and a 3D particle-in-cell application, GTC) on a large-scale multithreaded Blue Gene/Q supercomputer at Argonne National Laboratory, and quantify the performance gap resulting from using different numbers of threads per node. We use performance tools and MPI profile and trace libraries available on the supercomputer to analyze and compare the performance of these hybrid applications as the number of OpenMP threads per node increases, and find that beyond some point, adding threads saturates or worsens performance. For the strong-scaling applications SP-MZ, BT-MZ, PEQdyna and PMLB, using 32 threads per node results in much better application efficiency than using 64 threads per node; as the number of threads per node increases, the FPU (floating point unit) percentage decreases, while the MPI percentage (except for PMLB) and the IPC (instructions per cycle) per core (except for BT-MZ) increase. For the weak-scaling application GTC, the performance trend (relative speedup) is very similar with increasing numbers of threads per node no matter how many nodes (32, 128, 512) are used. © 2013 IEEE.
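
    The hybrid structure under test looks, in skeleton form, like the following (a generic sketch, not the benchmarks' code): MPI ranks across nodes, an OpenMP thread team inside each rank, and OMP_NUM_THREADS controlling the threads-per-node trade-off studied above.

        /* hybrid.c: hybrid MPI + OpenMP skeleton.
           Compile: mpicc -fopenmp hybrid.c
           Run: OMP_NUM_THREADS=32 mpirun -n <nodes> ./a.out */
        #include <mpi.h>
        #include <omp.h>
        #include <stdio.h>

        int main(int argc, char **argv) {
            int provided, rank;
            /* FUNNELED: only the master thread calls MPI */
            MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            double local = 0.0;
            #pragma omp parallel reduction(+:local)
            {
                /* each OpenMP thread contributes a partial result */
                local += 1.0 + omp_get_thread_num();
            }

            double global = 0.0;
            MPI_Reduce(&local, &global, 1, MPI_DOUBLE, MPI_SUM,
                       0, MPI_COMM_WORLD);
            if (rank == 0)
                printf("sum over all ranks and threads: %g\n", global);
            MPI_Finalize();
            return 0;
        }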

  19. Time-dependent ionization balance model for non-LTE plasma

    International Nuclear Information System (INIS)

    Lee, Y.T.; Zimmerman, G.B.; Bailey, D.S.; Dickson, D.; Kim, D.

    1986-01-01

    We have developed a detailed configuration-accounting kinetic model for calculating time-dependent ionization-balance and ion-level populations in non-local thermal-equilibrium (non-LTE) plasmas. We use these population estimates in computing spectral line intensities, line ratios, and synthetic spectra, and in fitting these calculated values to experimental measurements. The model is also used to design laboratory x-ray laser experiments. For this purpose, it is self-consistently coupled to the hydrodynamics code LASNEX. 20 refs., 14 figs

  1. Computation of reactor control rod drop time under accident conditions

    International Nuclear Information System (INIS)

    Dou Yikang; Yao Weida; Yang Renan; Jiang Nanyan

    1998-01-01

    The computation of reactor control rod drop time under accident conditions lies mainly in establishing forced-vibration equations for the components of the control rod drive line under external forces, and a motion equation for the control rod moving in the vertical direction. These two kinds of equations are coupled through the impact effects between the control rod and its surrounding components. The finite difference method is adopted to discretize the vibration equations, and the Wilson-θ method is applied to the time-history problem. The non-linearity caused by impact is treated iteratively with a modified Newton method. Experimental results are used to confirm the validity and reliability of the computational method. Theoretical and experimental test problems show that the computer program based on this method is applicable and reliable, and that it can act as an effective tool for design-by-analysis and safety analysis of the relevant components.
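
    For reference, a single-degree-of-freedom sketch of the Wilson-θ scheme named above (linear case only; the impact non-linearity and the modified Newton iteration are omitted, and all numerical values are illustrative):

        /* wilson.c: Wilson-theta integration of m*a + c*v + k*u = p
           for one DOF under a step load; theta >= 1.37 gives
           unconditional stability */
        #include <stdio.h>

        int main(void) {
            const double m = 1.0, c = 0.1, k = 4.0, p0 = 1.0;
            const double dt = 0.01, theta = 1.4;
            const double tau = theta * dt;
            const double keff = k + 3.0 * c / tau + 6.0 * m / (tau * tau);

            double u = 0.0, v = 0.0;
            double a = (p0 - c * v - k * u) / m;  /* initial acceleration */

            for (int n = 0; n < 1000; n++) {
                /* effective load at t + theta*dt (constant step load) */
                double peff = p0
                    + m * (6.0 * u / (tau * tau) + 6.0 * v / tau + 2.0 * a)
                    + c * (3.0 * u / tau + 2.0 * v + 0.5 * tau * a);
                double u_tau = peff / keff;       /* state at t + tau */
                /* interpolate back to t + dt */
                double a1 = 6.0 / (theta * tau * tau) * (u_tau - u)
                          - 6.0 / (theta * tau) * v
                          + (1.0 - 3.0 / theta) * a;
                double v1 = v + 0.5 * dt * (a1 + a);
                double u1 = u + dt * v + dt * dt / 6.0 * (a1 + 2.0 * a);
                u = u1; v = v1; a = a1;
            }
            printf("u(10 s) = %g (static deflection p0/k = %g)\n", u, p0 / k);
            return 0;
        }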

  2. Computations on Wings With Full-Span Oscillating Control Surfaces Using Navier-Stokes Equations

    Science.gov (United States)

    Guruswamy, Guru P.

    2013-01-01

    A dual-level parallel procedure is presented for computing large databases to support aerospace vehicle design. The procedure has been developed as a single Unix script within the Parallel Batch Submission environment, using mpiexec to run MPI-based analysis software. It provides a process for aerospace designers to generate data for large numbers of cases with the highest possible fidelity and reasonable wall-clock time. A single job-submission environment avoids keeping track of multiple jobs and the associated system-administration overhead. The process has been demonstrated for computing large databases for the design of typical aerospace configurations, a launch vehicle and a rotorcraft.

  3. Real-time definition of non-randomness in the distribution of genomic events.

    Directory of Open Access Journals (Sweden)

    Ulrich Abel

    Features such as mutations or structural characteristics can be non-randomly or non-uniformly distributed within a genome. Until now, computer simulations were required for statistical inferences on the distribution of sequence motifs. Here, we show that these analyses are possible using an analytical, mathematical approach. For the assessment of non-randomness, our calculations only require information including genome size, the number of (sampled) sequence motifs, and distance parameters. We have developed computer programs evaluating our analytical formulas for the real-time determination of expected values and p-values. This approach permits a flexible cluster definition that can be applied to identify non-random or non-uniform sequence motif distributions most effectively. As an example, we show the effectiveness and reliability of our mathematical approach in clinical retroviral vector integration site distribution.

  4. Polyphyly and gene flow between non-sibling Heliconius species

    Directory of Open Access Journals (Sweden)

    Jiggins Chris D

    2006-04-01

    Background: The view that gene flow between related animal species is rare and evolutionarily unimportant largely antedates sensitive molecular techniques. Here we use DNA sequencing to investigate a pair of morphologically and ecologically divergent, non-sibling butterfly species, Heliconius cydno and H. melpomene (Lepidoptera: Nymphalidae), whose distributions overlap in Central and Northwestern South America. Results: In these taxa, we sequenced 30–45 haplotypes per locus of a mitochondrial region containing the genes for cytochrome oxidase subunits I and II (CoI/CoII), and intron-spanning fragments of three unlinked nuclear loci: the triose-phosphate isomerase (Tpi), mannose-6-phosphate isomerase (Mpi) and cubitus interruptus (Ci) genes. A fifth gene, dopa decarboxylase (Ddc), produced sequence data likely to be from different duplicate loci in some of the taxa, and so was excluded. Mitochondrial and Tpi genealogies are consistent with reciprocal monophyly, whereas sympatric populations of the species in Panama share identical or similar Mpi and Ci haplotypes, giving rise to genealogical polyphyly at the species level despite evidence for rapid sequence divergence at these genes between geographic races of H. melpomene. Conclusion: Recent transfer of Mpi haplotypes between species is strongly supported, but there is no evidence for introgression at the other three loci. Our results demonstrate that the boundaries between animal species can remain selectively porous to gene flow long after speciation, and that introgression, even between non-sibling species, can be an important factor in animal evolution. Interspecific gene flow is demonstrated here for the first time in Heliconius and may provide a route for the transfer of switch-gene adaptations for Müllerian mimicry. The results also forcefully demonstrate how reliance on a single locus may give an erroneous picture of the overall genealogical history of speciation and gene flow.

  5. Non-Mechanism in Quantum Oracle Computing

    OpenAIRE

    Castagnoli, Giuseppe

    1999-01-01

    A typical oracle problem is finding which software program is installed on a computer, by running the computer and testing its input-output behaviour. The program is randomly chosen from a set of programs known to the problem solver. As is well known, some oracle problems are solved more efficiently by using quantum algorithms; this naturally implies changing the computer to quantum, while the choice of the software program remains sharp. In order to highlight the non-mechanistic origin of this ...

  6. Utilization of the MPI Process for in-tank solidification of heel material in large-diameter cylindrical tanks

    Energy Technology Data Exchange (ETDEWEB)

    Kauschinger, J.L.; Lewis, B.E.

    2000-01-01

    A major problem faced by the US Department of Energy is remediation of sludge and supernatant waste in underground storage tanks. Exhumation of the waste is currently the preferred remediation method. However, exhumation cannot completely remove all of the contaminated materials from the tanks. For large-diameter tanks, amounts of highly contaminated "heel" material approaching 20,000 gal can remain. Often sludge containing zeolite particles leaves "sand bars" of locally contaminated material across the floor of the tank. The best management practices for in-tank treatment (stabilization and immobilization) of wastes require an integrated approach to develop appropriate treatment agents that can be safely delivered and mixed uniformly with sludge. Ground Environmental Services has developed and demonstrated a remotely controlled, high-velocity jet delivery system termed Multi-Point-Injection (MPI). This robust jet delivery system has been field-deployed to create homogeneous monoliths containing shallow buried miscellaneous waste in trenches [fiscal year (FY) 1995] and surrogate sludge in cylindrical (FY 1998) and long, horizontal tanks (FY 1999). During the FY 1998 demonstration, the MPI process successfully formed a 32-ton uniform monolith of grout and waste surrogates in about 8 min. Analytical data indicated that 10 tons of zeolite-type physical surrogate were uniformly mixed within a 40-in.-thick monolith without lifting the MPI jetting tools off the tank floor. Over 1,000 lb of cohesive surrogates, with consistencies similar to Gunite and Associated Tank (GAAT) TH-4 and Hanford tank sludges, were easily intermixed into the monolith without exceeding a core temperature of 100°F during curing.

  7. On a numerical strategy to compute gravity currents of non-Newtonian fluids

    International Nuclear Information System (INIS)

    Vola, D.; Babik, F.; Latche, J.-C.

    2004-01-01

    This paper presents a numerical scheme for the simulation of gravity currents of non-Newtonian fluids. The two-dimensional computational grid is fixed, and the free surface is described as a polygonal interface, independent of the grid, that is advanced in time by a Lagrangian technique. The Navier-Stokes equations are semi-discretized in time by the characteristic-Galerkin method, which finally leads to solving a generalized Stokes problem posed on a physical domain that the free surface limits to only part of the computational grid. To this purpose, we implement a Galerkin technique with a particular approximation space, defined as the restriction to the fluid domain of functions of a finite element space. The decomposition-coordination method allows us to deal, without any regularization, with a variety of non-linear and possibly non-differentiable constitutive laws. Besides more analytical tests, we revisit with this numerical method some simulations of gravity currents from the literature, up to now investigated within the simplified thin-flow approximation framework.

  8. Speeding up image reconstruction in computed tomography

    CERN Multimedia

    CERN. Geneva

    2018-01-01

    Computed tomography (CT) is a technique for imaging cross-sections of an object using X-ray measurements taken from different angles. In recent decades significant progress has been made here: today, advanced algorithms allow fast image reconstruction and high-quality images even with missing or dirty data, modern detectors provide high resolution without increasing the radiation dose, and high-performance multi-core computing devices help us solve such tasks even faster. I will start with CT basics, then briefly present the existing classes of reconstruction algorithms and their differences. After that I will proceed to employing distinctive architectural features of modern multi-core devices (CPUs and GPUs) and popular programming interfaces (OpenMP, MPI, CUDA, OpenCL) to develop effective parallel realizations of image reconstruction algorithms. Decreasing full reconstruction time from long hours to minutes or even seconds has a revolutionary impact in diagnostic medicine and industria...
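
    As a flavour of the loop-level parallelism involved, here is a toy parallel-beam backprojection with OpenMP (unfiltered, nearest-neighbour detector sampling; the geometry and all names are illustrative, not from the talk):

        /* backproj.c: naive backprojection over NANG angles,
           parallelized over pixels (compile: cc -fopenmp backproj.c) */
        #include <math.h>
        #include <stdio.h>
        #define NANG 180
        #define NDET 128
        #define NPIX 128

        static double sino[NANG][NDET];   /* input sinogram */
        static double img[NPIX][NPIX];    /* reconstruction */

        int main(void) {
            for (int a = 0; a < NANG; a++)        /* dummy data */
                for (int d = 0; d < NDET; d++)
                    sino[a][d] = sin(0.1 * a) + 0.01 * d;

            #pragma omp parallel for collapse(2)
            for (int i = 0; i < NPIX; i++)
                for (int j = 0; j < NPIX; j++) {
                    double x = i - NPIX / 2.0, y = j - NPIX / 2.0, s = 0.0;
                    for (int a = 0; a < NANG; a++) {
                        double th = 3.14159265358979 * a / NANG;
                        int d = (int)(x * cos(th) + y * sin(th)) + NDET / 2;
                        if (d >= 0 && d < NDET) s += sino[a][d];
                    }
                    img[i][j] = s / NANG;         /* angular average */
                }
            printf("img[%d][%d] = %g\n", NPIX / 2, NPIX / 2,
                   img[NPIX / 2][NPIX / 2]);
            return 0;
        }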

  9. 12 CFR 908.27 - Computing time.

    Science.gov (United States)

    2010-01-01

    ... 12 Banks and Banking 7 2010-01-01 2010-01-01 false Computing time. 908.27 Section 908.27 Banks and... PRACTICE AND PROCEDURE IN HEARINGS ON THE RECORD General Rules § 908.27 Computing time. (a) General rule. In computing any period of time prescribed or allowed by this subpart, the date of the act or event...

  10. Mixing thermodynamic properties of 1-butyl-4-methylpyridinium tetrafluoroborate [b4mpy][BF4] with water and with an alkan-1-ol (methanol to pentanol)

    Energy Technology Data Exchange (ETDEWEB)

    Ortega, J. [Laboratorio de Termodinamica y Fisicoquimica de Fluidos, Parque Cientifico-Tecnologico, Campus Universitario de Tafira, Universidad de Las Palmas de Gran Canaria, 35071-Las Palmas de Gran Canaria, Canary Islands (Spain)], E-mail: jortega@dip.ulpgc.es; Vreekamp, R.; Penco, E.; Marrero, E. [Laboratorio de Termodinamica y Fisicoquimica de Fluidos, Parque Cientifico-Tecnologico, Campus Universitario de Tafira, Universidad de Las Palmas de Gran Canaria, 35071-Las Palmas de Gran Canaria, Canary Islands (Spain)

    2008-07-15

    This article presents a study of the solution behaviour of 1-butyl-4-methylpyridinium tetrafluoroborate [b4mpy][BF4] in water and in the first five alkanols of the series methanol to pentan-1-ol. The excess enthalpies, H_m^E, and volumes, V_m^E, were determined at the temperatures (298.15 and 318.15) K. At these temperatures, [b4mpy][BF4] was completely miscible in water, methanol, and ethanol, but only partially miscible in the other alkanols. A solubility study was carried out and the (liquid + liquid) equilibria of the ([b4mpy][BF4] + alkanol) systems were experimentally determined, evaluating zones of complete miscibility and determining the UCST in each case. The mixtures with water gave positive values of H_m^E and V_m^E, the changes of these quantities with temperature also being positive. The mixtures with alkanols gave values of H_m^E > 0 and V_m^E < 0, and for these binary mixtures (dH_m^E/dT)_p > 0 and (dV_m^E/dT)_p < 0. In all cases, results were interpreted and compared with data obtained for mixtures with the other isomer, [b3mpy][BF4]. Excess properties were correlated with a suitable equation and the area and volume parameters were calculated for [b4mpy][BF4].

  11. Warm Paleocene/Eocene climate as simulated in ECHAM5/MPI-OM

    Directory of Open Access Journals (Sweden)

    M. Heinemann

    2009-12-01

    We investigate the late Paleocene/early Eocene (PE) climate using the coupled atmosphere-ocean-sea ice model ECHAM5/MPI-OM. The surface in our PE control simulation averages 297 K and is ice-free, despite a moderate atmospheric CO2 concentration of 560 ppm. Compared to a pre-industrial reference simulation (PR), low latitudes are 5 to 8 K warmer, while high latitudes are up to 40 K warmer. This high-latitude amplification is in line with proxy data, yet a comparison to sea surface temperature proxy data suggests that the Arctic surface temperatures are still too low in our PE simulation.

    To identify the mechanisms that cause the PE-PR surface temperature differences, we fit two simple energy balance models to the ECHAM5/MPI-OM results. We find that about 2/3 of the PE-PR global mean surface temperature difference is caused by a smaller clear-sky emissivity due to higher atmospheric CO2 and water vapour concentrations in PE compared to PR; 1/3 is due to a smaller planetary albedo. The reduction of the pole-to-equator temperature gradient in PE compared to PR is due to (1) the large high-latitude effect of the higher CO2 and water vapour concentrations in PE compared to PR, (2) the lower Antarctic orography, (3) the smaller surface albedo at high latitudes, and (4) longwave cloud radiative effects. Our results support the hypothesis that local radiative effects rather than increased meridional heat transports were responsible for the "equable" PE climate.

  12. Distributed multiscale computing with MUSCLE 2, the Multiscale Coupling Library and Environment

    NARCIS (Netherlands)

    Borgdorff, J.; Mamonski, M.; Bosak, B.; Kurowski, K.; Ben Belgacem, M.; Chopard, B.; Groen, D.; Coveney, P.V.; Hoekstra, A.G.

    2014-01-01

    We present the Multiscale Coupling Library and Environment: MUSCLE 2. This multiscale component-based execution environment has a simple-to-use Java, C++, C, Python and Fortran API, compatible with MPI, OpenMP and threading codes. We demonstrate its local and distributed computing capabilities and

  13. Hot Chips and Hot Interconnects for High End Computing Systems

    Science.gov (United States)

    Saini, Subhash

    2005-01-01

    I will discuss several processors: 1. the Cray proprietary processor used in the Cray X1; 2. the IBM Power 3 and Power 4 used in the IBM SP 3 and IBM SP 4 systems; 3. the Intel Itanium and Xeon, used in SGI Altix systems and clusters respectively; 4. the IBM System-on-a-Chip used in the IBM BlueGene/L; 5. the HP Alpha EV68 processor used in the DOE ASCI Q cluster; 6. the SPARC64 V processor, used in the Fujitsu PRIMEPOWER HPC2500; 7. an NEC proprietary processor used in the NEC SX-6/7; 8. the Power 4+ processor, used in the Hitachi SR11000; 9. the NEC proprietary processor used in the Earth Simulator. The IBM POWER5 and Red Storm computing systems will also be discussed. The architectures of these processors will first be presented, followed by the interconnection networks and a description of high-end computer systems based on these processors and networks. The performance of various hardware/programming model combinations will then be compared, based on the latest NAS Parallel Benchmark results (MPI, OpenMP/HPF and hybrid (MPI + OpenMP)). The tutorial will conclude with a discussion of general trends in the field of high performance computing (quantum computing, DNA computing, cellular engineering, and neural networks).

  14. Domain decomposition parallel computing for transient two-phase flow of nuclear reactors

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Jae Ryong; Yoon, Han Young [KAERI, Daejeon (Korea, Republic of); Choi, Hyoung Gwon [Seoul National University, Seoul (Korea, Republic of)

    2016-05-15

    KAERI (Korea Atomic Energy Research Institute) has been developing a multi-dimensional two-phase flow code named CUPID for multi-physics and multi-scale thermal-hydraulics analysis of light water reactors (LWRs). The CUPID code has been validated against a set of conceptual problems and experimental data. In this work, the CUPID code has been parallelized based on the domain decomposition method with the Message Passing Interface (MPI) library. For domain decomposition, the CUPID code provides both manual and automatic methods using the METIS library. For effective memory management, the Compressed Sparse Row (CSR) format is adopted, one of the standard ways to represent a sparse asymmetric matrix: it stores only the non-zero values and their positions (row and column), as sketched below. Verification on the fundamental problem set successfully confirmed the parallelization of CUPID. Since the scalability of a parallel simulation is generally known to be better for fine mesh systems, three different scales of mesh system are considered: 40,000 meshes for the coarse system, 320,000 meshes for the mid-size system, and 2,560,000 meshes for the fine system. In the given geometry, both single- and two-phase calculations were conducted. In addition, two types of preconditioners for the matrix solver were compared: diagonal and incomplete LU. To further enhance parallel performance, hybrid OpenMP+MPI parallel computing for the pressure solver was examined; the scalability of the hybrid calculation was enhanced for multi-core parallel computation.
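
    A minimal illustration of the CSR idea (toy matrix, not CUPID code): only the non-zero values, their column indices, and per-row offsets are stored, and a sparse matrix-vector product, the kernel a pressure solver applies repeatedly, simply walks these three arrays.

        /* csr.c: CSR storage of A = [10 0 2; 0 5 0; 3 0 8]
           plus the sparse matrix-vector product y = A*x */
        #include <stdio.h>

        int main(void) {
            double val[]    = {10.0, 2.0, 5.0, 3.0, 8.0}; /* non-zeros */
            int    col[]    = {0, 2, 1, 0, 2};  /* their columns */
            int    rowptr[] = {0, 2, 3, 5};     /* row i spans
                                                   val[rowptr[i]..rowptr[i+1]) */
            const int n = 3;

            double x[] = {1.0, 2.0, 3.0}, y[3];
            for (int i = 0; i < n; i++) {       /* walk row by row */
                y[i] = 0.0;
                for (int j = rowptr[i]; j < rowptr[i + 1]; j++)
                    y[i] += val[j] * x[col[j]];
            }
            printf("y = (%g, %g, %g)\n", y[0], y[1], y[2]); /* 16 10 27 */
            return 0;
        }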

  15. Performing three-dimensional neutral particle transport calculations on tera scale computers

    International Nuclear Information System (INIS)

    Woodward, C.S.; Brown, P.N.; Chang, B.; Dorr, M.R.; Hanebutte, U.R.

    1999-01-01

    A scalable, parallel code system to perform neutral particle transport calculations in three dimensions is presented. To utilize the hyper-cluster architecture of emerging tera-scale computers, the parallel code successfully combines the MPI message-passing and threading paradigms. The code's capabilities are demonstrated by a shielding calculation containing over 14 billion unknowns. This calculation was accomplished on the IBM SP "ASCI Blue-Pacific" computer located at Lawrence Livermore National Laboratory (LLNL)

  16. A hybrid version of swan for fast and efficient practical wave modelling

    NARCIS (Netherlands)

    M. Genseberger (Menno); J. Donners

    2016-01-01

    In the Netherlands, for coastal and inland water applications, wave modelling with SWAN has become a main ingredient. However, computational times are relatively high. Therefore we investigated the parallel efficiency of the current MPI and OpenMP versions of SWAN. The MPI version is

  17. Computation Reduction Oriented Circular Scanning SAR Raw Data Simulation on Multi-GPUs

    Directory of Open Access Journals (Sweden)

    Hu Chen

    2016-08-01

    As a special working mode, circular scanning Synthetic Aperture Radar (SAR) is widely used in earth observation. With increasing resolution and swath width, the volume of simulated raw data grows massively, which raises new efficiency requirements. By analyzing the redundancy in GPU-based raw data simulation, a fast simulation method that reduces redundant computation is realized on multiple GPUs with the Message Passing Interface (MPI). The results show that the efficiency with 4 GPUs increases 2 times through the redundancy reduction, while the hardware cost decreases by 50%; the overall speedup reaches 350 times that of the traditional CPU simulation.
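
    The usual skeleton behind such multi-GPU MPI simulations binds one rank to one local device and combines per-rank results at the end. A hedged sketch of that binding (the simulation kernel and all names are illustrative, not the paper's code):

        /* mpigpu.c: one MPI rank per local GPU (compile with an
           MPI compiler plus -lcudart; the CUDA kernel is omitted) */
        #include <mpi.h>
        #include <cuda_runtime.h>
        #include <stdio.h>

        int main(int argc, char **argv) {
            MPI_Init(&argc, &argv);
            int rank, size;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            int ndev = 0;
            cudaGetDeviceCount(&ndev);
            if (ndev > 0)
                cudaSetDevice(rank % ndev);   /* bind rank to a device */

            /* each rank would simulate its own block of radar pulses
               on its GPU here; a dummy per-rank result stands in: */
            double local = rank + 1.0, total = 0.0;
            MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM,
                       0, MPI_COMM_WORLD);
            if (rank == 0)
                printf("gathered results from %d ranks\n", size);
            MPI_Finalize();
            return 0;
        }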

  18. Temporal trends in compliance with appropriateness criteria for stress single-photon emission computed tomography sestamibi studies in an academic medical center.

    Science.gov (United States)

    Gibbons, Raymond J; Askew, J Wells; Hodge, David; Miller, Todd D

    2010-03-01

    The purpose of this study was to apply published appropriateness criteria for single-photon emission computed tomography (SPECT) myocardial perfusion imaging (MPI) in a single academic medical center to determine if the percentage of inappropriate studies was changing over time. In a previous study, we applied the American College of Cardiology Foundation/American Society of Nuclear Cardiology (ASNC) appropriateness criteria for stress SPECT MPI and reported that 14% of stress SPECT studies were performed for inappropriate reasons. Using similar methodology, we retrospectively examined 284 patients who underwent stress SPECT MPI in October 2006 and compared the findings to the previous cohort of 284 patients who underwent stress SPECT MPI in May 2005. The indications for testing in the 2 cohorts were very similar. The overall level of agreement in characterizing categories of appropriateness between 2 experienced cardiovascular nurse abstractors was good (kappa = 0.68), which represented an improvement from our previous study (kappa = 0.56). There was a significant change between May 2005 and October 2006 in the overall classification of categories for appropriateness (P = .024 by the chi-square statistic). There were modest, but insignificant, increases in the number of patients who were unclassified (15% in the current study vs 11% previously), appropriate (66% vs 64%), and uncertain (12% vs 11%). Only 7% of the studies in the current study were inappropriate, which represented a significant (P = .004) decrease from the 14% reported in the 2005 cohort. In the absence of any specific intervention, there was a significant change in the overall classification of SPECT appropriateness in an academic medical center over 17 months. The only significant difference in individual categories was a decrease in inappropriate studies. Additional measurements over time will be required to determine if this trend is sustainable or generalizable.

  19. Characterization of Sendai virus persistently infected L929 cells and Sendai virus pi strain: recombinant Sendai viruses having Mpi protein shows lower cytotoxicity and are incapable of establishing persistent infection

    International Nuclear Information System (INIS)

    Nishio, Machiko; Tsurudome, Masato; Ito, Morihiro; Kawano, Mitsuo; Komada, Hiroshi; Ito, Yasuhiko

    2003-01-01

    It is commonly accepted that the temperature-sensitive phenotype of Sendai virus (SeV) persistently infected cells is caused by the M and/or HN proteins. Expression level of the L, M, HN, and V proteins is extremely low in L929 cells persistently infected with SeVpi (L929/SeVpi cells) incubated at 38 deg. C. The HN protein quickly disappears in L929/SeVpi cells following a temperature shift up to 38 deg. C, and pulse-chase experiments show that the Lpi, HNpi, and Mpi proteins are unstable at 38 deg. C. Following a temperature shift either upward or downward, M protein is translocated into the nucleus and then localizes to the perinuclear region. None of virus-specific polypeptides are detected in the cells primarily infected with SeVpi and incubated at 38 deg. C and virus proteins are not pulse-labeled at 38 deg. C, indicating that temperature-sensitive step is at an early stage of infection. The Mpi protein is transiently located in the nucleus of the SeVpi primarily infected cells. Recombinant SeVs possessing the HNpi or/and Mpi proteins are not temperature-sensitive. The HN protein is expressed at very low levels and the F protein localizes to the perinuclear region in rSeV(Mpi)-infected cells incubated at 38 deg. C for 18 h. rSeVs having the Mpi protein exhibit lower cytotoxicity and are incapable of establishing persistent infection. Amino acid 116 of the Mpi protein is related to the nuclear translocation and lower cytopathogenesis, whereas aa183 is involved in the interaction between M protein and viral glycoproteins

  20. Traffic Flow Prediction Model for Large-Scale Road Network Based on Cloud Computing

    Directory of Open Access Journals (Sweden)

    Zhaosheng Yang

    2014-01-01

    To increase the efficiency and precision of large-scale road network traffic flow prediction, a genetic algorithm-support vector machine (GA-SVM) model based on cloud computing is proposed in this paper, based on an analysis of the characteristics and defects of the genetic algorithm and the support vector machine. In a cloud computing environment, the SVM parameters are first optimized by a parallel genetic algorithm, and then this optimized parallel SVM model is used to predict traffic flow. On the basis of traffic flow data from Haizhu District in Guangzhou City, the proposed model was verified and compared with the serial GA-SVM model and a parallel GA-SVM model based on MPI (Message Passing Interface). The results demonstrate that the parallel GA-SVM model based on cloud computing has higher prediction accuracy, shorter running time, and higher speedup.

  1. High performance simulation for the Silva project using the tera computer

    International Nuclear Information System (INIS)

    Bergeaud, V.; La Hargue, J.P.; Mougery, F.; Boulet, M.; Scheurer, B.; Le Fur, J.F.; Comte, M.; Benisti, D.; Lamare, J. de; Petit, A.

    2003-01-01

    In the context of the SILVA Project (Atomic Vapor Laser Isotope Separation), numerical simulation of the plant-scale propagation of laser beams through uranium vapour was a great challenge. The PRODIGE code has been developed to achieve this goal. Here we focus on the task of achieving high-performance simulation on the TERA computer. We describe the main issues in optimizing the parallelization of the PRODIGE code on TERA, discussing advantages and drawbacks of the implemented diagonal parallelization scheme. As a consequence, it has been found fruitful to fit out the code in three respects: memory allocation, MPI communications, and interconnection network bandwidth usage. We stress the interest of MPI/IO in this context and the benefit obtained for production computations on TERA. Finally, we illustrate our developments and indicate some performance measurements reflecting the good parallelization properties of PRODIGE on the TERA computer. The code is currently used for demonstrating the feasibility of laser propagation at a plant enrichment level and for preparing the 2003 Menphis experiment. We conclude by emphasizing the contribution of high-performance TERA simulation to the project. (authors)
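
    The MPI/IO benefit mentioned here comes from patterns like the following collective write, in which every rank deposits its block of a distributed array into a single shared file (file name and sizes are illustrative, not PRODIGE's code):

        /* mpiio.c: collective MPI-IO write of a distributed array */
        #include <mpi.h>
        #include <stdio.h>
        #define NLOC 4

        int main(int argc, char **argv) {
            MPI_Init(&argc, &argv);
            int rank;
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);

            double buf[NLOC];                 /* this rank's block */
            for (int i = 0; i < NLOC; i++) buf[i] = rank + 0.1 * i;

            MPI_File fh;
            MPI_File_open(MPI_COMM_WORLD, "field.dat",
                          MPI_MODE_CREATE | MPI_MODE_WRONLY,
                          MPI_INFO_NULL, &fh);
            MPI_Offset off = (MPI_Offset)rank * NLOC * sizeof(double);
            /* collective call: all ranks write their blocks at once */
            MPI_File_write_at_all(fh, off, buf, NLOC, MPI_DOUBLE,
                                  MPI_STATUS_IGNORE);
            MPI_File_close(&fh);
            if (rank == 0) printf("wrote field.dat\n");
            MPI_Finalize();
            return 0;
        }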

  2. An Application-Based Performance Evaluation of NASA's Nebula Cloud Computing Platform

    Science.gov (United States)

    Saini, Subhash; Heistand, Steve; Jin, Haoqiang; Chang, Johnny; Hood, Robert T.; Mehrotra, Piyush; Biswas, Rupak

    2012-01-01

    The high performance computing (HPC) community has shown tremendous interest in exploring cloud computing as it promises high potential. In this paper, we examine the feasibility, performance, and scalability of production-quality scientific and engineering applications of interest to NASA on NASA's cloud computing platform, called Nebula, hosted at Ames Research Center. This work represents a comprehensive evaluation of Nebula using NUTTCP, HPCC, NPB, I/O, and MPI function benchmarks as well as four applications representative of the NASA HPC workload. Specifically, we compare Nebula performance on some of these benchmarks and applications to that of NASA's Pleiades supercomputer, a traditional HPC system. We also investigate the impact of virtIO and jumbo frames on interconnect performance. Overall results indicate that on Nebula (i) virtIO and jumbo frames improve network bandwidth by a factor of 5x, (ii) there is a significant virtualization layer overhead of about 10% to 25%, (iii) write performance is lower by a factor of 25x, (iv) latency for short MPI messages is very high, and (v) overall performance is 15% to 48% lower than that on Pleiades for NASA HPC applications. We also comment on the usability of the cloud platform.

  3. High performance simulation for the Silva project using the tera computer

    Energy Technology Data Exchange (ETDEWEB)

    Bergeaud, V.; La Hargue, J.P.; Mougery, F. [CS Communication and Systemes, 92 - Clamart (France); Boulet, M.; Scheurer, B. [CEA Bruyeres-le-Chatel, 91 - Bruyeres-le-Chatel (France); Le Fur, J.F.; Comte, M.; Benisti, D.; Lamare, J. de; Petit, A. [CEA Saclay, 91 - Gif sur Yvette (France)

    2003-07-01

    In the context of the SILVA Project (Atomic Vapor Laser Isotope Separation), numerical simulation of the plant-scale propagation of laser beams through uranium vapour was a great challenge. The PRODIGE code has been developed to achieve this goal. Here we focus on the task of achieving high-performance simulation on the TERA computer. We describe the main issues in optimizing the parallelization of the PRODIGE code on TERA, discussing advantages and drawbacks of the implemented diagonal parallelization scheme. As a consequence, it has been found fruitful to fit out the code in three respects: memory allocation, MPI communications, and interconnection network bandwidth usage. We stress the interest of MPI/IO in this context and the benefit obtained for production computations on TERA. Finally, we illustrate our developments and indicate some performance measurements reflecting the good parallelization properties of PRODIGE on the TERA computer. The code is currently used for demonstrating the feasibility of laser propagation at a plant enrichment level and for preparing the 2003 Menphis experiment. We conclude by emphasizing the contribution of high-performance TERA simulation to the project. (authors)

  4. INTRANS. A computer code for the non-linear structural response analysis of reactor internals under transient loads

    International Nuclear Information System (INIS)

    Ramani, D.T.

    1977-01-01

    The 'INTRANS' system is a general-purpose computer code designed to perform linear and non-linear structural stress and deflection analysis of impacting or non-impacting nuclear reactor internals components coupled with the reactor vessel, shield building, and external as well as internal gapped spring support systems. This paper describes a computational procedure for evaluating the dynamic response of reactor internals, discretised as a beam and lumped-mass structural system and subjected to external transient loads such as seismic and LOCA time-history forces. The procedure is implemented in the INTRANS code, which computes component flexibilities of a discrete lumped-mass planar model of reactor internals by idealising an assemblage of finite elements consisting of linear elastic beams with bending, torsional and shear stiffnesses, interacting with an external or internal, linear as well as non-linear, multi-gapped spring support system. The method of analysis is based on the displacement method, and the code uses the fourth-order Runge-Kutta numerical integration technique as the basis for solving the dynamic equilibrium equations of motion for the system. During the computing process, the dynamic response of each lumped mass is calculated at a specific instant of time using the well-known step-by-step procedure. At any instant of time, the transient dynamic motions of the system are held stationary, based on the predicted motions and internal forces of the previous instant, from which the complete response at any time step of interest may then be computed. Using this iterative process, the relationship between motions and internal forces is satisfied step by step throughout the time interval.
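
    A minimal sketch of the fourth-order Runge-Kutta kernel on a single degree of freedom (free vibration only; the impact springs and transient loads of the real code are omitted, and all values are illustrative):

        /* rk4.c: RK4 stepping of m*u'' + c*u' + k*u = 0 as a
           first-order system y' = f(y); y[0] = u, y[1] = u' */
        #include <stdio.h>

        static const double m = 1.0, c = 0.05, k = 10.0;

        static void f(const double y[2], double dy[2]) {
            dy[0] = y[1];
            dy[1] = (-c * y[1] - k * y[0]) / m;
        }

        static void rk4_step(double y[2], double h) {
            double k1[2], k2[2], k3[2], k4[2], t[2];
            f(y, k1);
            for (int i = 0; i < 2; i++) t[i] = y[i] + 0.5 * h * k1[i];
            f(t, k2);
            for (int i = 0; i < 2; i++) t[i] = y[i] + 0.5 * h * k2[i];
            f(t, k3);
            for (int i = 0; i < 2; i++) t[i] = y[i] + h * k3[i];
            f(t, k4);
            for (int i = 0; i < 2; i++)
                y[i] += h / 6.0 * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i]);
        }

        int main(void) {
            double y[2] = {0.01, 0.0};        /* small initial offset */
            for (int n = 0; n < 1000; n++) rk4_step(y, 0.001);
            printf("u(1 s) = %g, v(1 s) = %g\n", y[0], y[1]);
            return 0;
        }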

  5. Computationally determining the salience of decision points for real-time wayfinding support

    Directory of Open Access Journals (Sweden)

    Makoto Takemiya

    2012-06-01

    This study introduces the concept of computational salience to explain the discriminatory efficacy of decision points, which in turn may have applications in providing real-time assistance to users of navigational aids. This research compared algorithms for calculating the computational salience of decision points and validated the results via three methods: high-salience decision points were used to classify wayfinders; salience scores were used to weight a conditional probabilistic scoring function for real-time wayfinder performance classification; and salience scores were correlated with wayfinding-performance metrics. As an exploratory step towards linking computational and cognitive salience, a photograph-recognition experiment was conducted. Results reveal a distinction between algorithms useful for determining computational and cognitive salience. For computational salience, information about the structural integration of decision points is effective, while information about the probability of decision-point traversal shows promise for determining cognitive salience. Limitations of using only structural information and motivations for future work that includes non-structural information are elicited.

  6. One-loop calculation in time-dependent non-equilibrium thermo field dynamics

    International Nuclear Information System (INIS)

    Umezawa, H.; Yamanaka, Y.

    1989-01-01

    This paper is a review of the structure of thermo field dynamics (TFD), in which the basic concepts such as thermal doublets, quasi-particles and the self-consistent renormalization are presented in detail. A strong emphasis is put on the computational scheme, whose detailed structure is illustrated by the one-loop calculation in a non-equilibrium, time-dependent process; a detailed account of this one-loop calculation has never been reported anywhere. The role of the self-consistent renormalization is explained. Equilibrium TFD is obtained as the long-time limit of non-equilibrium TFD. (author)

  7. Real-Time Simulation of Ship-Structure and Ship-Ship Interaction

    DEFF Research Database (Denmark)

    Lindberg, Ole; Glimberg, Stefan Lemvig; Bingham, Harry B.

    2013-01-01

    … because it is simple, easy to implement and computationally efficient. Multiple many-core graphical processing units (GPUs) are used for parallel execution and the model is implemented using a combination of C/C++, CUDA and MPI. Two ship hydrodynamic cases are presented: Kriso Container Carrier at steady...

  8. Finding Tropical Cyclones on a Cloud Computing Cluster: Using Parallel Virtualization for Large-Scale Climate Simulation Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Hasenkamp, Daren; Sim, Alexander; Wehner, Michael; Wu, Kesheng

    2010-09-30

    Extensive computing power has been used to tackle issues such as climate change, fusion energy, and other pressing scientific challenges. These computations produce a tremendous amount of data; however, many of the data analysis programs currently run on only a single processor. In this work, we explore the possibility of using the emerging cloud computing platform to parallelize such sequential data analysis tasks. As a proof of concept, we wrap a program for analyzing trends of tropical cyclones in a set of virtual machines (VMs). This approach allows users to keep their familiar data analysis environment in the VMs, while we provide the coordination and data transfer services to ensure that the necessary input and output are directed to the desired locations. This work extensively exercises the networking capability of cloud computing systems and has revealed a number of weaknesses in the current cloud system software. In our tests, we are able to scale the parallel data analysis job to a modest number of VMs and achieve a speedup comparable to running the same analysis task using MPI. However, compared to MPI-based parallelization, the cloud-based approach has a number of advantages. It is more flexible because the VMs can capture arbitrary software dependencies without requiring users to rewrite their programs. It is also more resilient to failure: as long as a single VM is running, it can make progress, whereas as soon as one MPI node fails, the whole analysis job fails. In short, this initial work demonstrates that a cloud computing system is a viable platform for distributed scientific data analyses traditionally conducted on dedicated supercomputing systems.

  9. Finding Tropical Cyclones on a Cloud Computing Cluster: Using Parallel Virtualization for Large-Scale Climate Simulation Analysis

    International Nuclear Information System (INIS)

    Hasenkamp, Daren; Sim, Alexander; Wehner, Michael; Wu, Kesheng

    2010-01-01

    Extensive computing power has been used to tackle issues such as climate change, fusion energy, and other pressing scientific challenges. These computations produce a tremendous amount of data; however, many of the data analysis programs currently run on only a single processor. In this work, we explore the possibility of using the emerging cloud computing platform to parallelize such sequential data analysis tasks. As a proof of concept, we wrap a program for analyzing trends of tropical cyclones in a set of virtual machines (VMs). This approach allows users to keep their familiar data analysis environment in the VMs, while we provide the coordination and data transfer services to ensure that the necessary input and output are directed to the desired locations. This work extensively exercises the networking capability of cloud computing systems and has revealed a number of weaknesses in the current cloud system software. In our tests, we are able to scale the parallel data analysis job to a modest number of VMs and achieve a speedup comparable to running the same analysis task using MPI. However, compared to MPI-based parallelization, the cloud-based approach has a number of advantages. It is more flexible because the VMs can capture arbitrary software dependencies without requiring users to rewrite their programs. It is also more resilient to failure: as long as a single VM is running, it can make progress, whereas as soon as one MPI node fails, the whole analysis job fails. In short, this initial work demonstrates that a cloud computing system is a viable platform for distributed scientific data analyses traditionally conducted on dedicated supercomputing systems.

  10. 12 CFR 1780.11 - Computing time.

    Science.gov (United States)

    2010-01-01

    ... 12 Banks and Banking 7 2010-01-01 2010-01-01 false Computing time. 1780.11 Section 1780.11 Banks... time. (a) General rule. In computing any period of time prescribed or allowed by this subpart, the date of the act or event that commences the designated period of time is not included. The last day so...

  11. Some consequences of a non-commutative space-time structure

    International Nuclear Information System (INIS)

    Vilela Mendes, R.

    2005-01-01

    The existence of a fundamental length (or fundamental time) has been conjectured in many contexts. Here we discuss some consequences of a fundamental constant of this type, which emerges as a consequence of deformation-stability considerations leading to a non-commutative space-time structure. This mathematically well defined structure is sufficiently constrained to allow for unambiguous experimental predictions. In particular we discuss the phase-space volume modifications and their relevance for the calculation of the Greisen-Zatsepin-Kuz'min sphere. The (small) corrections to the spectrum of the Coulomb problem are also computed. (orig.)

  12. Instantaneous Non-Local Computation of Low T-Depth Quantum Circuits

    DEFF Research Database (Denmark)

    Speelman, Florian

    2016-01-01

    Instantaneous non-local quantum computation requires multiple parties to jointly perform a quantum operation, using pre-shared entanglement and a single round of simultaneous communication. We study this task for its close connection to position-based quantum cryptography, but it also has natural applications in the context of foundations of quantum physics and in distributed computing. The best known general construction for instantaneous non-local quantum computation requires a pre-shared state which is exponentially large in the number of qubits involved in the operation, while efficient constructions are known only for specific cases. We present a construction whose entanglement cost scales with the T-depth of a quantum circuit, able to perform non-local computation of quantum circuits with a (poly-)logarithmic number of layers of T gates with quasi-polynomial entanglement. Our proofs combine ideas from blind and delegated quantum computation with the garden-hose model, a combinatorial model of communication complexity.

  13. The role of dendritic non-linearities in single neuron computation

    Directory of Open Access Journals (Sweden)

    Boris Gutkin

    2014-05-01

    Experiments have demonstrated that the summation of excitatory post-synaptic potentials (EPSPs) in dendrites is non-linear. The sum of multiple EPSPs can be larger than their arithmetic sum, a superlinear summation due to the opening of voltage-gated channels and similar to somatic spiking: the so-called dendritic spike. The sum of multiple EPSPs can also be smaller than their arithmetic sum, because the synaptic current necessarily saturates at some point. While these observations are well explained by biophysical models, the impact of dendritic spikes on computation remains a matter of debate. One reason is that dendritic spikes may fail to make the neuron spike; similarly, dendritic saturations are sometimes presented as a glitch which should be corrected by dendritic spikes. We provide solid arguments against this claim and show that dendritic saturations as well as dendritic spikes enhance single-neuron computation, even when they cannot directly make the neuron fire. To explore the computational impact of dendritic spikes and saturations, we use a binary neuron model in conjunction with Boolean algebra. We demonstrate using these tools that a single dendritic non-linearity, either spiking or saturating, combined with the somatic non-linearity, enables a neuron to compute linearly non-separable Boolean functions (lnBfs). These functions are impossible to compute when summation is linear, and the exclusive OR is a famous example of an lnBf. Importantly, the implementation of these functions does not require the dendritic non-linearity to make the neuron spike. Next, we show that reduced and realistic biophysical models of the neuron are capable of computing lnBfs. Within these models, and contrary to the binary model, the dendritic and somatic non-linearities are tightly coupled; yet we show that these neuron models are capable of linearly non-separable computations.

  14. Real-time Tsunami Inundation Prediction Using High Performance Computers

    Science.gov (United States)

    Oishi, Y.; Imamura, F.; Sugawara, D.

    2014-12-01

    Recently, off-shore tsunami observation stations based on cabled ocean-bottom pressure gauges are actively being deployed, especially in Japan. These cabled systems are designed to provide real-time tsunami data before tsunamis reach coastlines, for disaster mitigation purposes. To realize the benefits of these observations, real-time analysis techniques that make effective use of the data are necessary. A representative study by Tsushima et al. (2009) proposed a method for instant tsunami source prediction based on arriving tsunami waveform data; as time passes, the prediction is improved using updated waveform data. After a tsunami source is predicted, tsunami waveforms are synthesized from pre-computed tsunami Green functions of the linear long-wave equations. Tsushima et al. (2014) updated the method by combining the tsunami waveform inversion with an instant inversion of coseismic crustal deformation, improving the prediction accuracy and speed in the early stages. For disaster mitigation purposes, real-time prediction of tsunami inundation is also important. In this study, we discuss the possibility of real-time tsunami inundation prediction, which requires faster-than-real-time tsunami inundation simulation in addition to instant tsunami source analysis. Although solving the non-linear shallow water equations for inundation prediction is computationally demanding, it has become feasible through recent developments in high performance computing technologies. We conducted parallel computations of tsunami inundation and achieved 6.0 TFLOPS using 19,000 CPU cores. We employed a leap-frog finite difference method with nested staggered grids whose resolutions range from 405 m to 5 m; the resolution ratio of each nested domain was 1/3. The total number of grid points was 13 million, and the time step was 0.1 seconds. Tsunami sources of the 2011 Tohoku-oki earthquake were tested. The inundation prediction up to 2 hours after the
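
    The numerical core described here, a leap-frog update on a staggered grid, looks in a 1-D linear toy version (flat bottom, reflective ends; the production model is 2-D, non-linear and nested) like this:

        /* swe.c: 1-D linear shallow-water leap-frog on a staggered
           grid; h at cell centres, u at faces */
        #include <stdio.h>
        #include <math.h>
        #define NX 200

        int main(void) {
            const double g = 9.81, H = 100.0;         /* depth in metres */
            const double dx = 405.0;                  /* coarse-grid step */
            const double dt = 0.1 * dx / sqrt(g * H); /* CFL-limited */
            double h[NX] = {0}, u[NX + 1] = {0};

            for (int i = 0; i < NX; i++)              /* Gaussian hump */
                h[i] = exp(-0.5 * pow((i - NX / 2) / 10.0, 2));

            for (int n = 0; n < 500; n++) {
                for (int i = 1; i < NX; i++)          /* momentum update */
                    u[i] -= dt * g * (h[i] - h[i - 1]) / dx;
                for (int i = 0; i < NX; i++)          /* continuity update */
                    h[i] -= dt * H * (u[i + 1] - u[i]) / dx;
            }

            double hmax = 0.0;
            for (int i = 0; i < NX; i++)
                if (fabs(h[i]) > hmax) hmax = fabs(h[i]);
            printf("max |h| after 500 steps: %g m\n", hmax);
            return 0;
        }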

  15. NGScloud: RNA-seq analysis of non-model species using cloud computing.

    Science.gov (United States)

    Mora-Márquez, Fernando; Vázquez-Poletti, José Luis; López de Heredia, Unai

    2018-05-03

    RNA-seq analysis usually requires large computing infrastructures. NGScloud is a bioinformatic system developed to analyze RNA-seq data using the cloud computing services of Amazon, which permit access to ad hoc computing infrastructure scaled to the complexity of the experiment, so that costs and run times can be optimized. The application provides a user-friendly front-end to operate Amazon's hardware resources and to control a workflow of RNA-seq analysis oriented to non-model species, incorporating the cluster concept, which allows parallel runs of common RNA-seq analysis programs in several virtual machines for faster analysis. NGScloud is freely available at https://github.com/GGFHF/NGScloud/. A manual detailing installation and how-to-use instructions is available with the distribution. unai.lopezdeheredia@upm.es.

  16. A non-local computational boundary condition for duct acoustics

    Science.gov (United States)

    Zorumski, William E.; Watson, Willie R.; Hodge, Steve L.

    1994-01-01

    A non-local boundary condition is formulated for acoustic waves in ducts without flow. The ducts are two dimensional with constant area, but with variable impedance wall lining. Extension of the formulation to three dimensional and variable area ducts is straightforward in principle, but requires significantly more computation. The boundary condition simulates a nonreflecting wave field in an infinite duct. It is implemented by a constant matrix operator which is applied at the boundary of the computational domain. An efficient computational solution scheme is developed which allows calculations for high frequencies and long duct lengths. This computational solution utilizes the boundary condition to limit the computational space while preserving the radiation boundary condition. The boundary condition is tested for several sources. It is demonstrated that the boundary condition can be applied close to the sound sources, rendering the computational domain small. Computational solutions with the new non-local boundary condition are shown to be consistent with the known solutions for nonreflecting wavefields in an infinite uniform duct.

  17. Wigner-Smith delay times and the non-Hermitian Hamiltonian for the HOCl molecule

    International Nuclear Information System (INIS)

    Barr, A.M.; Reichl, L.E.

    2013-01-01

    We construct the scattering matrix for a two-dimensional model of a Cl atom scattering from an OH dimer. We show that the scattering matrix can be written in terms of a non-Hermitian Hamiltonian whose complex energy eigenvalues can be used to compute Wigner-Smith delay times for the Cl-OH scattering process. We compute the delay times for a range of energies, and show that the scattering states with the longest delay times are strongly influenced by unstable periodic orbits in the classical dynamics. (Copyright © 2013 WILEY-VCH Verlag GmbH and Co. KGaA, Weinheim)
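
    For orientation, the Wigner-Smith construction referred to here is the standard one from the general scattering literature (quoted for convenience, not from this paper): the delay-time matrix is

        \[ Q(E) = -\,i\hbar\, S^{\dagger}(E)\,\frac{\partial S(E)}{\partial E}, \]

    whose eigenvalues are the proper delay times. An isolated resonance described by a complex eigenvalue E_r - i\Gamma/2 of the non-Hermitian Hamiltonian contributes a Lorentzian delay peak centered at E_r with width \Gamma.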

  18. 6 CFR 13.27 - Computation of time.

    Science.gov (United States)

    2010-01-01

    ... 6 Domestic Security 1 2010-01-01 Computation of time. 13.27 Section 13.27 Domestic Security DEPARTMENT OF HOMELAND SECURITY, OFFICE OF THE SECRETARY PROGRAM FRAUD CIVIL REMEDIES § 13.27 Computation of time. (a) In computing any period of time under this part or in an order issued...

  19. A non-perturbative definition of 2D quantum gravity by the fifth time action

    International Nuclear Information System (INIS)

    Ambjoern, J.; Greensite, J.; Varsted, S.

    1990-07-01

    The general formalism for stabilizing bottomless Euclidean field theories (the 'fifth-time' action) provides a natural non-perturbative definition of matrix models corresponding to 2d quantum gravity. The formalism allows, in principle, the use of lattice Monte Carlo techniques for non-perturbative computation of correlation functions. (orig.)

  20. Hard Real-Time Task Scheduling in Cloud Computing Using an Adaptive Genetic Algorithm

    Directory of Open Access Journals (Sweden)

    Amjad Mahmood

    2017-04-01

    Full Text Available In the Infrastructure-as-a-Service cloud computing model, virtualized computing resources in the form of virtual machines are provided over the Internet. A user can rent an arbitrary number of computing resources to meet their requirements, making cloud computing an attractive choice for executing real-time tasks. Economical task allocation and scheduling on a set of leased virtual machines is an important problem in the cloud computing environment. This paper proposes a greedy algorithm and a genetic algorithm with adaptive selection of suitable crossover and mutation operations (named AGA) to allocate and schedule real-time tasks with precedence constraints on heterogeneous virtual machines. A comprehensive simulation study has been done to evaluate the performance of the proposed algorithms in terms of their solution quality and efficiency. The simulation results show that AGA outperforms both the greedy algorithm and a non-adaptive genetic algorithm in terms of solution quality.
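
    The adaptive ingredient can be pictured with a small sketch (entirely ours; the scores, update rule and constants are illustrative assumptions, not taken from the paper): each crossover or mutation operator keeps a running success score, operators are drawn with probability proportional to their scores, and scores are updated from the fitness gains their offspring achieve.

        #include <stdio.h>
        #include <stdlib.h>

        #define N_OPS 3                      /* e.g. three candidate crossover operators */
        static double score[N_OPS] = {1.0, 1.0, 1.0};

        /* Roulette-wheel choice: recently successful operators are picked more often. */
        static int pick_operator(void)
        {
            double total = 0.0, r;
            for (int i = 0; i < N_OPS; i++) total += score[i];
            r = ((double)rand() / RAND_MAX) * total;
            for (int i = 0; i < N_OPS; i++) {
                if (r < score[i]) return i;
                r -= score[i];
            }
            return N_OPS - 1;
        }

        /* Decay old evidence, reward fitness improvement, and keep a floor so
           that no operator is ever starved of trials. */
        static void update_score(int op, double fitness_gain)
        {
            score[op] = 0.9 * score[op] + (fitness_gain > 0.0 ? fitness_gain : 0.0);
            if (score[op] < 0.05) score[op] = 0.05;
        }

        int main(void)
        {
            for (int gen = 0; gen < 5; gen++) {
                int op = pick_operator();
                double gain = (op == 1) ? 0.3 : 0.0;  /* pretend operator 1 works best */
                update_score(op, gain);
                printf("generation %d: operator %d, gain %.1f\n", gen, op, gain);
            }
            return 0;
        }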

  1. Real time computer control of a nonlinear Multivariable System via Linearization and Stability Analysis

    International Nuclear Information System (INIS)

    Raza, K.S.M.

    2004-01-01

    This paper demonstrates that if a complicated nonlinear, non-square, state-coupled multivariable system is carefully linearized and subjected to a thorough stability analysis, then the design objectives can be achieved via a controller which is quite simple (in terms of resource usage and execution time) and very efficient (in terms of robustness). A further aim is to implement this controller on a computer in a real-time environment. Therefore, a nonlinear mathematical model of the system is first derived. Careful work is done to decouple the multivariable system. Linearization and stability analysis techniques are employed for the development of a linearized and mathematically sound control law. Nonlinearities such as saturation in the actuators are also catered for. The controller is then discretized using Runge-Kutta integration. Finally, the discretized control law is programmed on a computer in a real-time environment. The program is written in RT-Linux using GNU C for the real-time realization of the control scheme. The real-time processes, such as sampling and controlled actuation, and the non-real-time processes, such as the graphical user interface and display, are programmed as different tasks. The issue of inter-process communication between real-time and non-real-time tasks is addressed quite carefully. The results of this research pursuit are presented graphically. (author)
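
    The flavor of such a fixed-rate discretized control task can be sketched as follows (an illustrative stand-in of ours, assuming a toy first-order plant and a PID-style law with actuator saturation; it uses portable POSIX timing rather than the paper's RT-Linux task API):

        #include <stdio.h>
        #include <time.h>

        /* Hypothetical plant interface standing in for real sensor/actuator drivers. */
        static double plant_output = 0.0;
        static double read_sensor(void) { return plant_output; }
        static void apply_actuation(double u) { plant_output += 0.1 * u; }

        int main(void)
        {
            const double dt = 0.01;                   /* 10 ms sampling period */
            const double kp = 2.0, ki = 0.5, kd = 0.05, setpoint = 1.0;
            double integral = 0.0, prev_err = setpoint;
            struct timespec next;
            clock_gettime(CLOCK_MONOTONIC, &next);
            for (int k = 0; k < 200; k++) {
                double err = setpoint - read_sensor();
                integral += err * dt;
                double deriv = (err - prev_err) / dt;
                double u = kp * err + ki * integral + kd * deriv;
                if (u > 1.0) u = 1.0;                 /* actuator saturation */
                if (u < -1.0) u = -1.0;
                apply_actuation(u);
                prev_err = err;
                next.tv_nsec += 10 * 1000 * 1000;     /* absolute-time sleep keeps the rate fixed */
                if (next.tv_nsec >= 1000000000L) { next.tv_nsec -= 1000000000L; next.tv_sec++; }
                clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
            }
            printf("final plant output: %.3f\n", read_sensor());
            return 0;
        }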

  2. X-ray absorption in insulators with non-Hermitian real-time time-dependent density functional theory.

    Science.gov (United States)

    Fernando, Ranelka G; Balhoff, Mary C; Lopata, Kenneth

    2015-02-10

    Non-Hermitian real-time time-dependent density functional theory was used to compute the Si L-edge X-ray absorption spectrum of α-quartz using an embedded finite cluster model and atom-centered basis sets. Using tuned range-separated functionals and molecular orbital-based imaginary absorbing potentials, the excited states spanning the pre-edge to ∼20 eV above the ionization edge were obtained in good agreement with experimental data. This approach is generalizable to TDDFT studies of core-level spectroscopy and dynamics in a wide range of materials.

  3. Cardiovascular outcomes after pharmacologic stress myocardial perfusion imaging.

    Science.gov (United States)

    Lee, Douglas S; Husain, Mansoor; Wang, Xuesong; Austin, Peter C; Iwanochko, Robert M

    2016-04-01

    While pharmacologic stress single photon emission computed tomography myocardial perfusion imaging (SPECT-MPI) is used for noninvasive evaluation of patients who are unable to perform treadmill exercise, its impact on net reclassification improvement (NRI) of prognosis is unknown. We evaluated the prognostic value of pharmacologic stress MPI for prediction of cardiovascular death or non-fatal myocardial infarction (MI) within 1 year at a single-center, university-based laboratory. We examined continuous and categorical NRI of pharmacologic SPECT-MPI for prediction of outcomes beyond clinical factors alone. Six thousand two hundred forty patients (median age 66 years [IQR 56-74], 3466 men) were studied and followed for 5963 person-years. SPECT-MPI variables associated with increased risk of cardiovascular death or non-fatal MI included summed stress score, stress ST-shift, and post-stress resting left ventricular ejection fraction ≤50%. Compared to a clinical model which included age, sex, cardiovascular disease, risk factors, and medications, both the model χ2 (210.5 vs. 281.9) and the c-statistic (0.74 vs. 0.78) improved when the SPECT-MPI predictors (summed stress score, stress ST-shift and post-stress resting left ventricular ejection fraction) were added. SPECT-MPI predictors increased continuous NRI by 49.4%, and categorical NRI, at a threshold of >3% annualized risk of cardiovascular death or non-fatal MI, improved by 15.0% (95% CI 7.6%-27.6%). Pharmacologic stress MPI therefore substantially improved net reclassification of cardiovascular death or MI risk beyond that afforded by clinical factors. Copyright © 2016 Elsevier Inc. All rights reserved.

  4. Non-unitary probabilistic quantum computing circuit and method

    Science.gov (United States)

    Williams, Colin P. (Inventor); Gingrich, Robert M. (Inventor)

    2009-01-01

    A quantum circuit performing quantum computation in a quantum computer. A chosen transformation of an initial n-qubit state is probabilistically obtained. The circuit comprises a unitary quantum operator obtained from a non-unitary quantum operator, operating on an n-qubit state and an ancilla state. When operation on the ancilla state provides a success condition, computation is stopped. When operation on the ancilla state provides a failure condition, computation is performed again on the ancilla state and the n-qubit state obtained in the previous computation, until a success condition is obtained.

  5. Study of MPI based on parallel MOM on PC clusters for EM-beam scattering by 2-D PEC rough surfaces

    International Nuclear Information System (INIS)

    Jun, Ma; Li-Xin, Guo; An-Qi, Wang

    2009-01-01

    This paper first applies finite impulse response (FIR) filter theory combined with the fast Fourier transform (FFT) method to generate a two-dimensional Gaussian rough surface. Using the electric field integral equation (EFIE), it introduces the method of moments (MOM) with RWG vector basis functions and Galerkin's method to investigate electromagnetic beam scattering by a two-dimensional PEC Gaussian rough surface on personal computer (PC) clusters. The details of the parallel conjugate gradient method (CGM) for solving the matrix equation are also presented, and the numerical simulations are obtained through the message passing interface (MPI) platform on the PC clusters. The results show that the parallel MOM provides an effective technique for solving two-dimensional rough-surface electromagnetic-scattering problems. The influences of the root-mean-square height, the correlation length and the polarization on the beam scattering characteristics of two-dimensional PEC Gaussian rough surfaces are finally discussed. (classical areas of phenomenology)

  6. Scalability Dilemma and Statistic Multiplexed Computing — A Theory and Experiment

    Directory of Open Access Journals (Sweden)

    Justin Yuan Shi

    2017-08-01

    Full Text Available For the last three decades, end-to-end computing paradigms, such as MPI (Message Passing Interface), RPC (Remote Procedure Call) and RMI (Remote Method Invocation), have been the de facto paradigms for distributed and parallel programming. Despite these successes, applications built using these paradigms suffer failures whose probability grows in proportion to the application's size. Checkpoint/restore and backup/recovery are the only means to save otherwise lost critical information. The scalability dilemma is the practical challenge that the probability of data loss increases as the application scales in size. The theoretical significance of this practical challenge is that it undermines the fundamental structure of the scientific discovery process and of mission-critical services in production today. In 1997, the direct use of the end-to-end reference model in distributed programming was recognized as a fallacy. The scalability dilemma was predicted. However, this voice was overrun by the passage of time. Today, rapidly growing digitized data demands solving the increasingly critical scalability challenges. Computing architecture scalability, although loosely defined, is now front and center in large-scale computing efforts. Constrained only by the economic law of diminishing returns, this paper proposes a narrow definition of a Scalable Computing Service (SCS). Three scalability tests are also proposed in order to distinguish service architecture flaws from poor application programming. Scalable data-intensive services require additional treatments; thus, data storage is assumed reliable in this paper. A single-sided Statistic Multiplexed Computing (SMC) paradigm is proposed. A UVR (Unidirectional Virtual Ring) SMC architecture is examined under the SCS tests. SMC was designed to circumvent the well-known impossibility of end-to-end paradigms. It relies on the proven statistic multiplexing principle to deliver reliable service

  7. The SQL Server Database for Non Computer Professional Teaching Reform

    Science.gov (United States)

    Liu, Xiangwei

    2012-01-01

    This paper summarizes the teaching methods of the SQL Server database course for non-computer majors and analyzes the current teaching situation of the course. According to the characteristics of database teaching for non-computer majors, it puts forward some teaching reform methods, puts them into practice, and improves the students' analysis ability, practice ability and…

  8. Noise-constrained switching times for heteroclinic computing

    Science.gov (United States)

    Neves, Fabio Schittler; Voit, Maximilian; Timme, Marc

    2017-03-01

    Heteroclinic computing offers a novel paradigm for universal computation by collective system dynamics. In such a paradigm, input signals are encoded as complex periodic orbits approaching specific sequences of saddle states. Without inputs, the relevant states together with the heteroclinic connections between them form a network of states—the heteroclinic network. Systems of pulse-coupled oscillators or spiking neurons naturally exhibit such heteroclinic networks of saddles, thereby providing a substrate for general analog computations. Several challenges need to be resolved before it becomes possible to effectively realize heteroclinic computing in hardware. The time scales on which computations are performed crucially depend on the switching times between saddles, which in turn are jointly controlled by the system's intrinsic dynamics and the level of external and measurement noise. The nonlinear dynamics of pulse-coupled systems often deviate strongly from those of time-continuously coupled (e.g., phase-coupled) systems. The factors impacting switching times in pulse-coupled systems are still not well understood. Here we systematically investigate switching times in dependence on the levels of noise and intrinsic dissipation in the system. We specifically reveal how local responses to pulses coact with external noise. Our findings confirm that, as in time-continuous phase-coupled systems, piecewise-continuous pulse-coupled systems exhibit switching times that transiently increase exponentially with the number of switches up to some order of magnitude set by the noise level. Complementarily, we show that switching times may constitute a good predictor for computation reliability, indicating how often an input signal must be reiterated. By characterizing switching times between two saddles in conjunction with the reliability of a computation, our results provide a first step beyond the coding of input signal identities toward a complementary coding for

  9. Non-equal-time Poisson brackets

    OpenAIRE

    Nikolic, H.

    1998-01-01

    The standard definition of the Poisson brackets is generalized to the non-equal-time Poisson brackets. Their relationship to the equal-time Poisson brackets, as well as to the equal- and non-equal-time commutators, is discussed.
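
    For orientation, the equal-time bracket being generalized is the standard canonical one (quoted from textbook mechanics, not from the paper; per the abstract, the non-equal-time version extends this to observables evaluated at two different times along the dynamical evolution):

        \[ \{A, B\} = \sum_i \left( \frac{\partial A}{\partial q_i}\,\frac{\partial B}{\partial p_i} - \frac{\partial A}{\partial p_i}\,\frac{\partial B}{\partial q_i} \right). \]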

  10. Performance Comparison of a Matrix Solver on a Heterogeneous Network Using Two Implementations of MPI: MPICH and LAM

    Science.gov (United States)

    Phillips, Jennifer K.

    1995-01-01

    Two of the most popular current implementations of the Message-Passing Interface (MPI) standard were contrasted: MPICH, by Argonne National Laboratory, and LAM, by the Ohio Supercomputer Center at Ohio State University. A parallel skyline matrix solver was adapted to run in a heterogeneous environment using MPI. The Message-Passing Interface Forum, held in May 1994, led to a specification of library functions that implement the message-passing model of parallel communication. LAM, which creates its own environment, is more robust in a highly heterogeneous network. MPICH uses the environment native to the machine architecture. While neither of these free implementations provides the performance of native message-passing or of vendors' implementations, MPICH begins to approach that performance on the SP-2. The machines used in this study were: an IBM RS6000, 3 Sun4s, an SGI, and the IBM SP-2. Each machine is unique and a few machines required specific modifications during installation. When installed correctly, both implementations worked well with only minor problems.

  11. Porting of Bio-Informatics Tools for Plant Virology on a Computational Grid

    International Nuclear Information System (INIS)

    Lanzalone, G.; Lombardo, A.; Muoio, A.; Iacono-Manno, M.

    2007-01-01

    The goal of the Tri Grid Project and PI2S2 is the creation of the first Sicilian regional computational Grid. In particular, it aims to build various software-hardware interfaces between the infrastructure and some scientific and industrial applications. In this context, we have integrated some of the most innovative computing applications in virology research into this Grid infrastructure. In particular, we have integrated, into a complete workflow, various tools for pairwise or multiple sequence alignment and phylogeny tree construction (ClustalW-MPI), phylogenetic networks (Splits Tree), detection of recombination by phylogenetic methods (TOPALi) and prediction of DNA or RNA secondary consensus structures (KnetFold). This work shows how the ported applications decrease the execution time of the analysis programs, improve accessibility to the data storage system and allow the use of metadata for data processing. (Author)

  12. Effects of computing time delay on real-time control systems

    Science.gov (United States)

    Shin, Kang G.; Cui, Xianzhong

    1988-01-01

    The reliability of a real-time digital control system depends not only on the reliability of the hardware and software used, but also on the speed in executing control algorithms. The latter is due to the negative effects of computing time delay on control system performance. For a given sampling interval, the effects of computing time delay are classified into the delay problem and the loss problem. Analysis of these two problems is presented as a means of evaluating real-time control systems. As an example, both the self-tuning predicted (STP) control and Proportional-Integral-Derivative (PID) control are applied to the problem of tracking robot trajectories, and their respective effects of computing time delay on control performance are comparatively evaluated. For this example, the STP (PID) controller is shown to outperform the PID (STP) controller in coping with the delay (loss) problem.

  13. Addressing the challenges of standalone multi-core simulations in molecular dynamics

    Science.gov (United States)

    Ocaya, R. O.; Terblans, J. J.

    2017-07-01

    Computational modelling in materials science involves mathematical abstractions of force fields between particles, with the aim of postulating, developing and understanding materials by simulation. The aggregated pairwise interactions of the material's particles lead to a deduction of its macroscopic behaviours. For practically meaningful macroscopic scales, a large amount of data is generated, leading to vast execution times. Simulation times of hours, days or weeks for moderately sized problems are not uncommon. The reduction of simulation times, improved result accuracy and the associated software and hardware engineering challenges are the main motivations for much of the ongoing research in the computational sciences. This contribution is concerned mainly with simulations that can be done on a "standalone" computer using Message Passing Interface (MPI) parallel code running on hardware platforms with wide specifications, such as single/multi-processor, multi-core machines, with minimal reconfiguration for upward scaling of computational power. The widely available, documented and standardized MPI library provides this functionality through the MPI_Comm_size(), MPI_Comm_rank() and MPI_Reduce() functions. A survey of the literature shows that relatively little is written with respect to the efficient extraction of the inherent computational power in a cluster. In this work, we discuss the main avenues available to tap into this extra power without compromising computational accuracy. We also present methods to overcome the high inertia encountered in single-node-based computational molecular dynamics. We begin by surveying the current state of the art and discuss what it takes to achieve parallelism, efficiency and enhanced computational accuracy through program threads and message passing interfaces. Several code illustrations are given. The pros and cons of writing raw code as opposed to using heuristic, third-party code are also discussed. The growing trend
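
    As a concrete illustration of the three MPI calls named above (a minimal sketch of ours; the pair-interaction kernel is a placeholder, not a real force field): each rank computes a partial sum over a cyclic slice of the work, and rank 0 reduces the partial results.

        #include <stdio.h>
        #include <mpi.h>

        int main(int argc, char **argv)
        {
            int rank, size;
            MPI_Init(&argc, &argv);
            MPI_Comm_size(MPI_COMM_WORLD, &size);   /* how many processes there are */
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* which one this process is */

            const long n = 1000000;
            double local = 0.0, total = 0.0;
            for (long i = rank; i < n; i += size)   /* cyclic work split across ranks */
                local += 1.0 / (1.0 + (double)i);   /* placeholder for a pair term */

            MPI_Reduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
            if (rank == 0)
                printf("sum = %f over %d ranks\n", total, size);
            MPI_Finalize();
            return 0;
        }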

  14. Perceived problems with computer gaming and internet use among adolescents: measurement tool for non-clinical survey studies

    Science.gov (United States)

    2014-01-01

    Background: Existing instruments for measuring problematic computer and console gaming and internet use are often lengthy and often based on a pathological perspective. The objective was to develop and present a new, short, non-clinical measurement tool for perceived problems related to computer use and gaming among adolescents, and to study the association between screen time and perceived problems. Methods: A cross-sectional school survey of 11-, 13-, and 15-year-old students in thirteen schools in the City of Aarhus, Denmark; participation rate 89%, n = 2100. The main exposure was time spent on weekdays on computer and console gaming and on internet use for communication and surfing. The outcome measures were three indexes of perceived problems related to computer and console gaming and internet use. Results: The three new indexes showed high face validity and acceptable internal consistency. Most schoolchildren with high screen time did not experience problems related to computer use. Still, there was a strong and graded association between time use and perceived problems related to computer gaming, console gaming (only boys) and internet use, with odds ratios ranging from 6.90 to 10.23. Conclusion: The three new measures of perceived problems related to computer and console gaming and internet use among adolescents are appropriate, reliable and valid for use in non-clinical surveys about young people's everyday life and behaviour. These new measures do not assess Internet Gaming Disorder as it is listed in the DSM and therefore have no parity with DSM criteria. We found an increasing risk of perceived problems with increasing time spent on gaming and internet use. Nevertheless, most schoolchildren who spent much time on gaming and internet use did not experience problems. PMID:24731270

  15. The relationship between TV/computer time and adolescents' health-promoting behavior: a secondary data analysis.

    Science.gov (United States)

    Chen, Mei-Yen; Liou, Yiing-Mei; Wu, Jen-Yee

    2008-03-01

    Television and computers provide significant benefits for learning about the world. Some studies have linked excessive television (TV) watching or computer game playing to poorer health status or unhealthy behaviours among adolescents. However, research on the relationship between watching TV or playing computer games and adolescents' adoption of health-promoting behaviour is limited. This study aimed to discover the relationship between time spent watching TV and on leisure use of computers and adolescents' health-promoting behaviour, and associated factors. This paper used secondary data analysis from part of a health promotion project in Taoyuan County, Taiwan. A cross-sectional design was used and purposive sampling was conducted among adolescents in the original project. A total of 660 participants answered the questions appropriately for this work between January and June 2004. Findings showed the mean age of the respondents was 15.0 +/- 1.7 years. The mean numbers of TV watching hours were 2.28 and 4.07 on weekdays and weekends, respectively. The mean hours of leisure (non-academic) computer use were 1.64 and 3.38 on weekdays and weekends, respectively. Results indicated that adolescents spent significant time watching TV and using the computer, which was negatively associated with adopting health-promoting behaviours such as life appreciation, health responsibility, social support and exercise behaviour. Moreover, being a boy, being overweight, living in a rural area, and being a middle-school student were significantly associated with spending long periods watching TV and using the computer. Therefore, primary health care providers should record the TV and non-academic computer time of youths when conducting health promotion programs, and educate parents on how to become good and healthy electronic media users.

  16. Robust second-order scheme for multi-phase flow computations

    Science.gov (United States)

    Shahbazi, Khosro

    2017-06-01

    A robust high-order scheme for multi-phase flow computations featuring jumps and discontinuities due to shock waves and phase interfaces is presented. The scheme is based on high-order weighted essentially non-oscillatory (WENO) finite volume schemes and high-order limiters that ensure the maximum principle or positivity of the various field variables, including the density, pressure, and the order parameters identifying each phase. The two-phase flow model considered consists, besides the Euler equations of gas dynamics, of advection equations for the two parameters of the stiffened-gas equation of state characterizing each phase. The design of the high-order limiter is guided by the findings of Zhang and Shu (2011) [36], and is based on limiting the quadrature values of the density, pressure and order parameters reconstructed using a high-order WENO scheme. Proofs of positivity preservation and accuracy are given, and the convergence and robustness of the scheme are illustrated using the smooth isentropic vortex problem with very small density and pressure. The effectiveness and robustness of the scheme in computing the challenging problem of shock wave interaction with a cluster of tightly packed air or helium bubbles placed in a body of liquid water is also demonstrated. The superior performance of the high-order schemes over the first-order Lax-Friedrichs scheme for computations of shock-bubble interaction is also shown. The scheme is implemented in two-dimensional space on parallel computers using the message passing interface (MPI). The proposed scheme with the limiter requires approximately 50% more inter-processor messages than the corresponding scheme without the limiter, but only 10% more total CPU time. The scheme is provably second-order accurate in regions requiring positivity enforcement and higher order in the rest of the domain.
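
    For reference, the Zhang-Shu style linear scaling limiter mentioned above takes the following standard form (quoted from the general literature for orientation; this paper extends the idea to the pressure and order parameters):

        \[ \tilde{p}_j(x) = \theta_j \left( p_j(x) - \bar{p}_j \right) + \bar{p}_j, \qquad \theta_j = \min\left\{ 1,\; \frac{\bar{p}_j - \varepsilon}{\bar{p}_j - m_j} \right\}, \qquad m_j = \min_{x \in S_j} p_j(x), \]

    where p_j is the reconstructed polynomial in cell j, \bar{p}_j its cell average, S_j the set of quadrature points, and \varepsilon a small positive floor. The polynomial is shrunk toward its cell average just enough to keep all quadrature values positive, which preserves both the cell average and the order of accuracy.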

  17. Dual computations of non-Abelian Yang-Mills theories on the lattice

    International Nuclear Information System (INIS)

    Cherrington, J. Wade; Khavkine, Igor; Christensen, J. Daniel

    2007-01-01

    In the past several decades there have been a number of proposals for computing with dual forms of non-Abelian Yang-Mills theories on the lattice. Motivated by the gauge-invariant, geometric picture offered by dual models and successful applications of duality in the U(1) case, we revisit the question of whether it is practical to perform numerical computation using non-Abelian dual models. Specifically, we consider three-dimensional SU(2) pure Yang-Mills as an accessible yet nontrivial case in which the gauge group is non-Abelian. Using methods developed recently in the context of spin foam quantum gravity, we derive an algorithm for efficiently computing the dual amplitude and describe Metropolis moves for sampling the dual ensemble. We relate our algorithms to prior work in non-Abelian dual computations of Hari Dass and his collaborators, addressing several problems that have been left open. We report results of spin expectation value computations over a range of lattice sizes and couplings that are in agreement with our conventional lattice computations. We conclude with an outlook on further development of dual methods and their application to problems of current interest

  18. Diagnostic value of 123I-betamethyl-p-iodophenyl-pentadecanoic acid (BMIPP) single photon emission computed tomography (SPECT) in patients with chest pain. Comparison with rest-stress 99mTc-tetrofosmin SPECT and coronary angiography

    International Nuclear Information System (INIS)

    Kawai, Yuko; Nozaki, Yoichi; Ohkusa, Takanori; Sakurai, Masayuki; Morita, Koichi; Tamaki, Nagara

    2004-01-01

    Basic and clinical studies have indicated that 15-(p-[123I]iodophenyl)-3-(R,S)-methylpentadecanoic acid (BMIPP) single photon emission computed tomography (SPECT) can identify ischemic myocardium without evidence of myocardial infarction by the regional decline of tracer uptake. The present study compared BMIPP SPECT with rest-stress myocardial perfusion imaging (MPI) findings and coronary angiography (CAG) in 150 patients with acute chest pain. Patients with acute chest pain who underwent all of the following tests were selected: MPI at rest and stress, BMIPP SPECT at rest, and CAG. Organic coronary artery stenosis (≥75%) was observed in 46 patients, 27 patients had total or subtotal coronary occlusion by spasm in the spasm provocation test on CAG, and the remaining 77 patients had no significant coronary artery stenosis or spasm. The sensitivity of BMIPP at rest to detect organic stenosis was significantly higher (54%) than that of rest-MPI (33%, p<0.005), but lower than that of stress-MPI (76%, p=0.05). The sensitivity of BMIPP at rest to detect spasm was significantly higher (63%) than that of both rest-MPI (15%; p<0.001) and stress-MPI (19%; p<0.001). Overall, the sensitivity of BMIPP at rest to detect both organic stenosis and spasm was significantly higher (58%) than that of rest-MPI (26%; p<0.001), though not significantly different from that of stress-MPI (55%). The specificity was not significantly different among the three imaging techniques. Resting BMIPP SPECT is an alternative to stress MPI for identifying patients with not only organic stenosis but also spasm, without the need for a stress examination. (author)

  19. On efficiency of fire simulation realization: parallelization with greater number of computational meshes

    Science.gov (United States)

    Valasek, Lukas; Glasa, Jan

    2017-12-01

    Current fire simulation systems are capable of utilizing the advantages of available high-performance computing (HPC) platforms and of modelling fires efficiently in parallel. In this paper, the efficiency of a corridor fire simulation on an HPC computer cluster is discussed. The parallel MPI version of the Fire Dynamics Simulator is used to test the efficiency of selected strategies for allocating the cluster's computational resources when using a greater number of computational cores. Simulation results indicate that if the number of cores used is not equal to a multiple of the total number of cores per cluster node, there are allocation strategies which provide more efficient calculations.

  20. New method for model coupling using Stampi. Application to the coupling of atmosphere model (MM5) and land-surface model (SOLVEG)

    International Nuclear Information System (INIS)

    Nagai, Haruyasu

    2003-12-01

    A new method to couple atmosphere and land-surface models using the message passing interface (MPI) was proposed to develop an atmosphere-land model for studies on heat, water, and material exchanges around the land surface. A non-hydrostatic atmospheric dynamic model of Pennsylvania State University and the National Center for Atmospheric Research (PSU/NCAR MM5) and a detailed land surface model (SOLVEG) including the surface-layer atmosphere, soil, and vegetation, developed at the Japan Atomic Energy Research Institute (JAERI), are used as the atmosphere and land-surface models, respectively. For the MPI layer, a message passing library named Stampi, developed at JAERI, that can be used between different parallel computers is employed. The models are coupled by exchanging calculation results via MPI while each runs its own parallel calculation. The modifications for this model coupling are easy: some modules for data exchange are simply added to each model code without changing each model's original structure. Moreover, this coupling method is flexible and allows the use of an independent time step and grid interval for each model. (author)
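
    The exchange pattern described can be pictured with plain MPI calls (a sketch of ours; Stampi supplies MPI semantics across different parallel computers, and all names below are illustrative assumptions):

        #include <mpi.h>

        /* Called once per coupling interval by the atmosphere model; the
           land-surface model issues the mirror-image call with the tags
           swapped.  Surface fluxes go out, surface state comes back, and
           each model keeps its own time step between exchanges. */
        void exchange_with_land(MPI_Comm comm, int peer_rank,
                                double *flux_out, double *state_in, int n)
        {
            MPI_Sendrecv(flux_out, n, MPI_DOUBLE, peer_rank, 100,
                         state_in, n, MPI_DOUBLE, peer_rank, 101,
                         comm, MPI_STATUS_IGNORE);
        }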

  1. Multilevel Parallelization of AutoDock 4.2

    Directory of Open Access Journals (Sweden)

    Norgan Andrew P

    2011-04-01

    Full Text Available Abstract Background Virtual (computational) screening is an increasingly important tool for drug discovery. AutoDock is a popular open-source application for performing molecular docking, the prediction of ligand-receptor interactions. AutoDock is a serial application, though several previous efforts have parallelized various aspects of the program. In this paper, we report on a multi-level parallelization of AutoDock 4.2 (mpAD4). Results Using MPI and OpenMP, AutoDock 4.2 was parallelized for use on MPI-enabled systems and to multithread the execution of individual docking jobs. In addition, code was implemented to reduce input/output (I/O) traffic by reusing grid maps at each node from docking to docking. Performance of mpAD4 was examined on two multiprocessor computers. Conclusions Using MPI with OpenMP multithreading, mpAD4 scales with near linearity on the multiprocessor systems tested. In situations where I/O is limiting, reuse of grid maps reduces both system I/O and overall screening time. Multithreading of AutoDock's Lamarckian Genetic Algorithm with OpenMP increases the speed of execution of individual docking jobs, and when combined with MPI parallelization can significantly reduce the execution time of virtual screens. This work is significant in that mpAD4 speeds the execution of certain molecular docking workloads and allows the user to optimize the degree of system-level (MPI) and node-level (OpenMP) parallelization to best fit both workloads and computational resources.

  2. Multilevel Parallelization of AutoDock 4.2.

    Science.gov (United States)

    Norgan, Andrew P; Coffman, Paul K; Kocher, Jean-Pierre A; Katzmann, David J; Sosa, Carlos P

    2011-04-28

    Virtual (computational) screening is an increasingly important tool for drug discovery. AutoDock is a popular open-source application for performing molecular docking, the prediction of ligand-receptor interactions. AutoDock is a serial application, though several previous efforts have parallelized various aspects of the program. In this paper, we report on a multi-level parallelization of AutoDock 4.2 (mpAD4). Using MPI and OpenMP, AutoDock 4.2 was parallelized for use on MPI-enabled systems and to multithread the execution of individual docking jobs. In addition, code was implemented to reduce input/output (I/O) traffic by reusing grid maps at each node from docking to docking. Performance of mpAD4 was examined on two multiprocessor computers. Using MPI with OpenMP multithreading, mpAD4 scales with near linearity on the multiprocessor systems tested. In situations where I/O is limiting, reuse of grid maps reduces both system I/O and overall screening time. Multithreading of AutoDock's Lamarckian Genetic Algorithm with OpenMP increases the speed of execution of individual docking jobs, and when combined with MPI parallelization can significantly reduce the execution time of virtual screens. This work is significant in that mpAD4 speeds the execution of certain molecular docking workloads and allows the user to optimize the degree of system-level (MPI) and node-level (OpenMP) parallelization to best fit both workloads and computational resources.
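
    The two-level pattern used by mpAD4 can be outlined as follows (a generic hybrid sketch of ours with a fake energy kernel; it is not AutoDock code): MPI distributes docking jobs across processes, while OpenMP threads parallelize the work inside each job.

        #include <stdio.h>
        #include <mpi.h>
        #include <omp.h>

        int main(int argc, char **argv)
        {
            int provided, rank, size;
            MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            int n_jobs = 64;
            for (int job = rank; job < n_jobs; job += size) {    /* system level: MPI */
                double best = 1e30;
                #pragma omp parallel for reduction(min:best)     /* node level: OpenMP */
                for (int trial = 0; trial < 100; trial++) {
                    double e = (double)((job * 100 + trial) % 97);  /* placeholder energy */
                    if (e < best) best = e;
                }
                printf("rank %d: job %d, best energy %.1f\n", rank, job, best);
            }
            MPI_Finalize();
            return 0;
        }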

  3. Polynomial-time computability of the edge-reliability of graphs using Gilbert's formula

    Directory of Open Access Journals (Sweden)

    Marlowe Thomas J.

    1998-01-01

    Full Text Available Reliability is an important consideration in analyzing computer and other communication networks, but current techniques are extremely limited in the classes of graphs which can be analyzed efficiently. While Gilbert's formula establishes a theoretically elegant recursive relationship between the edge reliability of a graph and the reliability of its subgraphs, naive evaluation requires consideration of all sequences of deletions of individual vertices, and for many graphs has time complexity essentially Θ(N!). We discuss a general approach which significantly reduces complexity, encoding subgraph isomorphism in a finer partition by invariants, and recursing through the set of invariants. We illustrate this approach using threshold graphs, and show that any computation of reliability using Gilbert's formula will be polynomial-time if and only if the number of invariants considered is polynomial; we then show families of graphs with polynomial-time, and non-polynomial, reliability computation, and show that these encompass most previously known results. We then codify our approach to indicate how it can be used for other classes of graphs, and suggest several classes to which the technique can be applied.

  4. The reliable solution and computation time of variable parameters logistic model

    Science.gov (United States)

    Wang, Pengfei; Pan, Xinnong

    2018-05-01

    The study investigates the reliable computation time (RCT, termed Tc) obtained by applying double-precision computation to a variable-parameters logistic map (VPLM). First, using the proposed method, we obtain reliable solutions for the logistic map. Second, we construct 10,000 samples of reliable experiments from a time-dependent, non-stationary-parameters VPLM and then calculate the mean Tc. The results indicate that, for each different initial value, the Tc values of the VPLM are generally different. However, the mean Tc tends to a constant value when the sample number is large enough. The maximum, minimum, and probable distribution functions of Tc are also obtained, which can help us to identify the robustness of applying nonlinear time series theory to forecasting with the VPLM output. In addition, the Tc of the fixed-parameter experiments of the logistic map is obtained, and the results suggest that this Tc matches the value predicted by the theoretical formula.
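
    The idea of a reliable computation time can be illustrated with a small experiment (our sketch, not the paper's method: here long double stands in for a higher-precision reference orbit, and the tolerance is an arbitrary choice):

        #include <stdio.h>
        #include <math.h>

        int main(void)
        {
            double x = 0.2;            /* double-precision orbit */
            long double y = 0.2L;      /* higher-precision reference orbit */
            const double r = 4.0, tol = 1e-3;
            int n;
            for (n = 0; n < 10000; n++) {
                if (fabs(x - (double)y) > tol) break;  /* orbits have decorrelated */
                x = r * x * (1.0 - x);
                y = 4.0L * y * (1.0L - y);
            }
            printf("orbits agree for %d iterations\n", n);
            return 0;
        }

    Because the map is chaotic, the representation error grows roughly exponentially, so the count printed above indicates how many iterations of the double-precision orbit can be trusted.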

  5. New computational method for non-LTE, the linear response matrix

    International Nuclear Information System (INIS)

    Fournier, K.B.; Grasiani, F.R.; Harte, J.A.; Libby, S.B.; More, R.M.; Zimmerman, G.B.

    1998-01-01

    My coauthors have done extensive theoretical and computational work that lays the groundwork for a linear response matrix method to calculate non-LTE (local thermodynamic equilibrium) opacities. I will briefly review some of their work and list references. Then I will describe what has been done to utilize this theory to create a computational package that rapidly calculates mild non-LTE emission and absorption opacities suitable for use in hydrodynamic calculations. The opacities are obtained by performing table look-ups on data that have been generated with a non-LTE package. This scheme is currently under development. We can see that it offers a significant computational speed advantage. It is suitable for mild non-LTE, quasi-steady conditions. And it offers a new insertion path for high-quality non-LTE data. Currently, the linear response matrix data file is created using XSN. These data files could be generated by more detailed and rigorous calculations without changing any part of the implementation in the hydro code. The scheme is running in Lasnex and is being tested and developed

  6. Parallel computation of fluid-structural interactions using high resolution upwind schemes

    Science.gov (United States)

    Hu, Zongjun

    An efficient and accurate solver is developed to simulate the non-linear fluid-structural interactions in turbomachinery flutter flows. A new low-diffusion E-CUSP scheme, the Zha CUSP scheme, is developed to improve the efficiency and accuracy of the inviscid flux computation. The 3D unsteady Navier-Stokes equations with the Baldwin-Lomax turbulence model are solved using the finite volume method with the dual-time stepping scheme. The linearized equations are solved with Gauss-Seidel line iterations. The parallel computation is implemented using the MPI protocol. The solver is validated with 2D cases for its turbulence modeling, parallel computation and unsteady calculation. The Zha CUSP scheme is validated with 2D cases, including a supersonic flat plate boundary layer, a transonic converging-diverging nozzle and a transonic inlet diffuser. The Zha CUSP2 scheme is tested with 3D cases, including a circular-to-rectangular nozzle, a subsonic compressor cascade and a transonic channel. The Zha CUSP schemes are proved to be accurate, robust and efficient in these tests. The steady and unsteady separation flows in a 3D stationary cascade under high incidence and three inlet Mach numbers are calculated to study the steady-state separation flow patterns and their unsteady oscillation characteristics. Leading-edge vortex shedding is the mechanism behind the unsteady characteristics of the high-incidence separated flows. The separation flow characteristics are affected by the inlet Mach number. The blade aeroelasticity of a linear cascade with forced oscillating blades is studied using parallel computation. A simplified two-passage cascade with periodic boundary conditions is first calculated under a medium frequency and a low incidence. The full-scale cascade with 9 blades and two end walls is then studied more extensively under three oscillation frequencies and two incidence angles. The end-wall influence and the blade stability are studied and compared under different

  7. Diagnostic value of thallium-201 myocardial perfusion IQ-SPECT without and with computed tomography-based attenuation correction to predict clinically significant and insignificant fractional flow reserve

    Science.gov (United States)

    Tanaka, Haruki; Takahashi, Teruyuki; Ohashi, Norihiko; Tanaka, Koichi; Okada, Takenori; Kihara, Yasuki

    2017-01-01

    Abstract The aim of this study was to clarify the predictive value of fractional flow reserve (FFR) determined by myocardial perfusion imaging (MPI) using thallium (Tl)-201 IQ-SPECT without and with computed tomography-based attenuation correction (CT-AC) for patients with stable coronary artery disease (CAD). We assessed 212 angiographically identified diseased vessels using adenosine-stress Tl-201 MPI-IQ-SPECT/CT in 84 consecutive, prospectively identified patients with stable CAD. We compared the FFR in 136 of the 212 diseased vessels using visual semiquantitative interpretations of corresponding territories on MPI-IQ-SPECT images without and with CT-AC. FFR inversely correlated most accurately with regional summed difference scores (rSDS) in images without and with CT-AC (r = −0.584 and r = −0.568, respectively). The system can predict FFR at an optimal cut-off of <0.80, and we propose a novel application of CT-AC to MPI-IQ-SPECT for predicting clinically significant and insignificant FFR even in nonobese patients. PMID:29390486

  8. Cluster Computing for Embedded/Real-Time Systems

    Science.gov (United States)

    Katz, D.; Kepner, J.

    1999-01-01

    Embedded and real-time systems, like other computing systems, seek to maximize computing power for a given price, and thus can significantly benefit from the advancing capabilities of cluster computing.

  9. Large Scale Frequent Pattern Mining using MPI One-Sided Model

    Energy Technology Data Exchange (ETDEWEB)

    Vishnu, Abhinav; Agarwal, Khushbu

    2015-09-08

    In this paper, we propose a work-stealing runtime, the Library for Work Stealing (LibWS), using the MPI one-sided model for designing a scalable FP-Growth (the de facto frequent pattern mining algorithm) on large scale systems. LibWS provides locality-efficient and highly scalable work-stealing techniques for load balancing on a variety of data distributions. We also propose a novel communication algorithm for the FP-Growth data exchange phase, which reduces the communication complexity from the state-of-the-art O(p) to O(f + p/f) for p processes and f frequent attribute ids. FP-Growth is implemented using LibWS and evaluated on several work distributions and support counts. An experimental evaluation of FP-Growth on LibWS using 4096 processes on an InfiniBand cluster demonstrates excellent efficiency for several work distributions (87% efficiency for Power-law and 91% for Poisson). The proposed distributed FP-Tree merging algorithm provides a 38x communication speedup on 4096 cores.

  10. A Modularized Efficient Framework for Non-Markov Time Series Estimation

    Science.gov (United States)

    Schamberg, Gabriel; Ba, Demba; Coleman, Todd P.

    2018-06-01

    We present a compartmentalized approach to finding the maximum a-posteriori (MAP) estimate of a latent time series that obeys a dynamic stochastic model and is observed through noisy measurements. We specifically consider modern signal processing problems with non-Markov signal dynamics (e.g. group sparsity) and/or non-Gaussian measurement models (e.g. point process observation models used in neuroscience). Through the use of auxiliary variables in the MAP estimation problem, we show that a consensus formulation of the alternating direction method of multipliers (ADMM) enables iteratively computing separate estimates based on the likelihood and prior and subsequently "averaging" them in an appropriate sense using a Kalman smoother. As such, this can be applied to a broad class of problem settings and only requires modular adjustments when interchanging various aspects of the statistical model. Under broad log-concavity assumptions, we show that the separate estimation problems are convex optimization problems and that the iterative algorithm converges to the MAP estimate. As such, this framework can capture non-Markov latent time series models and non-Gaussian measurement models. We provide example applications involving (i) group-sparsity priors, within the context of electrophysiologic spectrotemporal estimation, and (ii) non-Gaussian measurement models, within the context of dynamic analyses of learning with neural spiking and behavioral observations.
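
    In generic consensus-ADMM notation (the standard updates, quoted for orientation rather than the authors' exact algorithm), splitting the posterior into per-term objectives f_i gives the iteration

        \[ x_i^{k+1} = \arg\min_{x_i}\; f_i(x_i) + \tfrac{\rho}{2}\,\lVert x_i - z^k + u_i^k \rVert_2^2, \qquad z^{k+1} = \arg\min_{z}\; g(z) + \tfrac{\rho}{2}\sum_i \lVert x_i^{k+1} - z + u_i^k \rVert_2^2, \qquad u_i^{k+1} = u_i^k + x_i^{k+1} - z^{k+1}, \]

    where the likelihood and the prior play the roles of the f_i (and g). When the dynamic prior is linear-Gaussian, the quadratically regularized averaging step is exactly the kind of problem a Kalman smoother solves, which is the "averaging" referred to in the abstract.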

  11. Fast algorithms for computing phylogenetic divergence time.

    Science.gov (United States)

    Crosby, Ralph W; Williams, Tiffani L

    2017-12-06

    The inference of species divergence time is a key step in most phylogenetic studies. Methods have been available for the last ten years to perform the inference, but the performance of the methods does not yet scale well to studies with hundreds of taxa and thousands of DNA base pairs. For example a study of 349 primate taxa was estimated to require over 9 months of processing time. In this work, we present a new algorithm, AncestralAge, that significantly improves the performance of the divergence time process. As part of AncestralAge, we demonstrate a new method for the computation of phylogenetic likelihood and our experiments show a 90% improvement in likelihood computation time on the aforementioned dataset of 349 primates taxa with over 60,000 DNA base pairs. Additionally, we show that our new method for the computation of the Bayesian prior on node ages reduces the running time for this computation on the 349 taxa dataset by 99%. Through the use of these new algorithms we open up the ability to perform divergence time inference on large phylogenetic studies.

  12. Computational commutative and non-commutative algebraic geometry

    CERN Document Server

    Cojocaru, S; Ufnarovski, V

    2005-01-01

    This publication gives a good insight into the interplay between commutative and non-commutative algebraic geometry. The theoretical and computational aspects are the central theme in this study. The topic is looked at from different perspectives in over 20 lecture reports. It emphasizes the current trends in commutative and non-commutative algebraic geometry and algebra. The contributors to this publication present the most recent and state-of-the-art progress, reflecting the topics discussed in this publication. Both researchers and graduate students will find this book a good source of information on commutative and non-commutative algebraic geometry.

  13. FRANTIC: a computer code for time dependent unavailability analysis

    International Nuclear Information System (INIS)

    Vesely, W.E.; Goldberg, F.F.

    1977-03-01

    The FRANTIC computer code evaluates the time-dependent and average unavailability for any general system model. The code is written in FORTRAN IV for the IBM 370 computer. Non-repairable components, monitored components, and periodically tested components are handled. One unique feature of FRANTIC is the detailed, time-dependent modeling of periodic testing, which includes the effects of test downtimes, test overrides, detection inefficiencies, and test-caused failures. The exponential distribution is used for the component failure times, and periodic equations are developed for the testing and repair contributions. Human errors and common mode failures can be included by assigning an appropriate constant probability for these contributors. The output from FRANTIC consists of tables and plots of the system unavailability along with a breakdown of the unavailability contributions. Sensitivity studies can be performed simply, and a wide range of tables and plots can be obtained for reporting purposes. The FRANTIC code represents a first step in the development of an approach that can be of direct value in future system evaluations. Modifications resulting from use of the code, along with the development of reliability data based on operating reactor experience, can be expected to provide increased confidence in its use and potential application to the licensing process
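
    For a flavour of the quantities such a code tracks (textbook relations for a single periodically tested component, not FRANTIC's full model): with constant failure rate \lambda and test interval T, the instantaneous unavailability between tests and its average over a test interval are

        \[ q(t) = 1 - e^{-\lambda \tau}, \quad \tau = t \bmod T, \qquad \bar{q} = \frac{1}{T}\int_0^T \bigl(1 - e^{-\lambda\tau}\bigr)\,d\tau = 1 - \frac{1 - e^{-\lambda T}}{\lambda T} \approx \frac{\lambda T}{2} \quad (\lambda T \ll 1), \]

    before the test downtime, override, detection inefficiency and repair contributions described above are layered on top.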

  14. Computer network time synchronization the network time protocol

    CERN Document Server

    Mills, David L

    2006-01-01

    What started with the sundial has, thus far, been refined to a level of precision based on atomic resonance: Time. Our obsession with time is evident in this continued scaling down to nanosecond resolution and beyond. But this obsession is not without warrant. Precision and time synchronization are critical in many applications, such as air traffic control and stock trading, and pose complex and important challenges in modern information networks.Penned by David L. Mills, the original developer of the Network Time Protocol (NTP), Computer Network Time Synchronization: The Network Time Protocol

  15. Teaching Scientific Computing: A Model-Centered Approach to Pipeline and Parallel Programming with C

    Directory of Open Access Journals (Sweden)

    Vladimiras Dolgopolovas

    2015-01-01

    Full Text Available The aim of this study is to present an approach to introducing pipeline and parallel computing, using a model of a multiphase queueing system. Pipeline computing, including software pipelines, is among the key concepts in modern computing and electronics engineering. Modern computer science and engineering education requires a comprehensive curriculum, so an introduction to pipeline and parallel computing is an essential topic to include. At the same time, the topic is among the most motivating tasks due to its comprehensive multidisciplinary and technical requirements. To enhance the educational process, the paper proposes a novel model-centered framework and develops the relevant learning objects. It allows implementing an educational platform for a constructivist learning process, thus enabling learners' experimentation with the provided programming models, developing learners' competences in modern scientific research and computational thinking, and capturing the relevant technical knowledge. It also provides an integral platform that allows a simultaneous and comparative introduction to pipelining and parallel computing. The programming language C, for developing the programming models, and the message passing interface (MPI) and OpenMP parallelization tools have been chosen for implementation.
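
    A minimal example of the pipeline pattern such a course builds toward (our sketch in C with MPI; the per-stage transformation is a placeholder): rank 0 produces items, every intermediate rank transforms and forwards them, and the last rank consumes the results.

        #include <stdio.h>
        #include <mpi.h>

        int main(int argc, char **argv)
        {
            int rank, size;
            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            for (int item = 0; item < 8; item++) {
                double v;
                if (rank == 0)
                    v = (double)item;                     /* produce */
                else
                    MPI_Recv(&v, 1, MPI_DOUBLE, rank - 1, 0,
                             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                v = v * 2.0 + 1.0;                        /* this stage's work */
                if (rank < size - 1)
                    MPI_Send(&v, 1, MPI_DOUBLE, rank + 1, 0, MPI_COMM_WORLD);
                else
                    printf("item %d -> %.1f after %d stages\n", item, v, size);
            }
            MPI_Finalize();
            return 0;
        }

    With P ranks, up to P items are in flight at once in steady state, which is precisely the throughput behaviour that multiphase queueing models are used to analyze.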

  16. Applicability of Time-Averaged Holography for Micro-Electro-Mechanical System Performing Non-Linear Oscillations

    Directory of Open Access Journals (Sweden)

    Paulius Palevicius

    2014-01-01

    Full Text Available Optical investigation of movable microsystem components using time-averaged holography is presented in this paper. It is shown that even a harmonic excitation of a non-linear microsystem may result in unpredictable chaotic motion. Analytical relationships between the parameters of the chaotic oscillations and the formation of time-averaged fringes provide a deeper insight into the computational and experimental interpretation of time-averaged MEMS holograms.

  17. Applicability of Time-Averaged Holography for Micro-Electro-Mechanical System Performing Non-Linear Oscillations

    Science.gov (United States)

    Palevicius, Paulius; Ragulskis, Minvydas; Palevicius, Arvydas; Ostasevicius, Vytautas

    2014-01-01

    Optical investigation of movable microsystem components using time-averaged holography is presented in this paper. It is shown that even a harmonic excitation of a non-linear microsystem may result in unpredictable chaotic motion. Analytical relationships between the parameters of the chaotic oscillations and the formation of time-averaged fringes provide a deeper insight into the computational and experimental interpretation of time-averaged MEMS holograms. PMID:24451467

  18. Applicability of time-averaged holography for micro-electro-mechanical system performing non-linear oscillations.

    Science.gov (United States)

    Palevicius, Paulius; Ragulskis, Minvydas; Palevicius, Arvydas; Ostasevicius, Vytautas

    2014-01-21

    Optical investigation of movable microsystem components using time-averaged holography is presented in this paper. It is shown that even a harmonic excitation of a non-linear microsystem may result in unpredictable chaotic motion. Analytical relationships between the parameters of the chaotic oscillations and the formation of time-averaged fringes provide a deeper insight into the computational and experimental interpretation of time-averaged MEMS holograms.

  19. Exploring Infiniband Hardware Virtualization in OpenNebula towards Efficient High-Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Pais Pitta de Lacerda Ruivo, Tiago [IIT, Chicago; Bernabeu Altayo, Gerard [Fermilab; Garzoglio, Gabriele [Fermilab; Timm, Steven [Fermilab; Kim, Hyun-Woo [Fermilab; Noh, Seo-Young [KISTI, Daejeon; Raicu, Ioan [IIT, Chicago

    2014-11-11

    It has been widely accepted that software virtualization has a big negative impact on high-performance computing (HPC) application performance. This work explores the potential use of InfiniBand hardware virtualization in an OpenNebula cloud towards the efficient support of MPI-based workloads. We have implemented, deployed, and tested an InfiniBand network on the FermiCloud private Infrastructure-as-a-Service (IaaS) cloud. To avoid software virtualization and thereby minimize the virtualization overhead, we employed a technique called Single Root Input/Output Virtualization (SR-IOV). Our solution spanned modifications to the Linux hypervisor as well as to the OpenNebula manager. We evaluated the performance of the hardware virtualization on up to 56 virtual machines connected by up to 8 DDR InfiniBand network links, with micro-benchmarks (latency and bandwidth) as well as with an MPI-intensive application (the HPL Linpack benchmark).

  20. Optimizing NEURON Simulation Environment Using Remote Memory Access with Recursive Doubling on Distributed Memory Systems

    OpenAIRE

    Shehzad, Danish; Bozkuş, Zeki

    2016-01-01

    Increase in complexity of neuronal network models has escalated the efforts to make the NEURON simulation environment efficient. Computational neuroscientists divide the equations into subnets among multiple processors to achieve better hardware performance. On parallel machines for neuronal networks, interprocessor spike exchange consumes a large share of the overall simulation time. NEURON uses the Message Passing Interface (MPI) for communication between processors, and the MPI_Allgather collective is exercised for spike exchange after each interval across distributed memory systems. Increasing the number of processors achieves concurrency and better performance, but it inversely affects MPI_Allgather, which increases the communication time between processors. This necessitates improving the communication methodology to decrease the spike-exchange time over distributed memory systems. This work improves the MPI_Allgather method using Remote Memory Access (RMA) by moving two-sided communication to one-sided communication; the use of a recursive doubling mechanism facilitates efficient communication between the processors in precise steps. This approach enhances communication concurrency and improves overall runtime, making NEURON more efficient for simulation of large neuronal network models.
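
    The combination of one-sided communication and recursive doubling described above can be illustrated with a short C/MPI sketch (our illustration, assuming a power-of-two number of ranks and one integer per rank; this is not the authors' code): at each step, every rank doubles the contiguous block it holds by putting it directly into its partner's RMA window.

        #include <mpi.h>
        #include <stdio.h>

        /* Allgather of one int per rank via recursive doubling over an RMA
         * window; assumes the number of ranks is a power of two. Each step
         * doubles the contiguous block every rank holds, using one-sided
         * MPI_Put instead of paired send/recv. */
        int main(int argc, char **argv) {
            int rank, size;
            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            int *buf;
            MPI_Win win;
            MPI_Win_allocate(size * sizeof(int), sizeof(int), MPI_INFO_NULL,
                             MPI_COMM_WORLD, &buf, &win);
            buf[rank] = 100 + rank;                  /* my contribution      */

            for (int dist = 1; dist < size; dist *= 2) {
                int partner = rank ^ dist;           /* differs in one bit   */
                int start = rank & ~(dist - 1);      /* block I hold now     */
                MPI_Win_fence(0, win);
                MPI_Put(&buf[start], dist, MPI_INT, partner,
                        start, dist, MPI_INT, win);  /* same offset remotely */
                MPI_Win_fence(0, win);
            }
            if (rank == 0) {
                for (int i = 0; i < size; i++) printf("%d ", buf[i]);
                printf("\n");
            }
            MPI_Win_free(&win);
            MPI_Finalize();
            return 0;
        }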

  1. 5 CFR 890.101 - Definitions; time computations.

    Science.gov (United States)

    2010-01-01

    5 CFR 890.101, Administrative Personnel. § 890.101 Definitions; time computations. (a) In this part, the terms annuitant, carrier, employee, employee... in section 8901 of title 5, United States Code, and supplement the following definitions: Appropriate...

  2. Development of real-time visualization system for Computational Fluid Dynamics on parallel computers

    International Nuclear Information System (INIS)

    Muramatsu, Kazuhiro; Otani, Takayuki; Matsumoto, Hideki; Takei, Toshifumi; Doi, Shun

    1998-03-01

    A real-time visualization system for computational fluid dynamics was developed on a network connecting a parallel computing server and a client terminal. Using the system, a user at the client terminal can visualize the results of a CFD (Computational Fluid Dynamics) simulation running on the parallel computer while the computation is still in progress on the server. Through a GUI (Graphical User Interface) on the client terminal, the user is also able to change parameters of the analysis and visualization in real time during the calculation. The system carries out both the CFD simulation and the generation of pixel image data on the parallel computer and compresses the data, so the amount of data sent from the parallel computer to the client is much smaller than without compression, and images appear swiftly on the client. Parallelization of image-data generation is based on the owner-computes rule. The GUI on the client is built as a Java applet, so real-time visualization is possible on a client PC provided only that a Web browser is installed on it. (author)

  3. 29 CFR 1921.22 - Computation of time.

    Science.gov (United States)

    2010-07-01

    29 CFR 1921.22, Labor Regulations Relating to Labor (Continued), Occupational Safety and Health Administration, Department of Labor, ... Workers' Compensation Act, Miscellaneous. § 1921.22 Computation of time. Sundays and holidays shall be...

  4. How many accelerograms to use and how to deal with scattering for transient non-linear seismic computations?

    International Nuclear Information System (INIS)

    Viallet, E.; Heinfling, G.

    2005-01-01

    Thanks to the increased power of computers, it is nowadays possible to perform dynamic non-linear computations of structures to evaluate their ultimate behavior under seismic loads using refined finite element models. Nevertheless, one key parameter for such complex computations is the input load (i.e., the input time histories), which may lead to important discrepancies in the results and is therefore difficult to deal with for engineering purposes (variability, number of time histories to use, ...). In this situation, the number of accelerograms to be used and the way to deal with the results must be carefully assessed. The objective of this study is to give some elements concerning (i) the number of accelerograms to be used for transient non-linear computations and (ii) the way to account for scattering of the results. For this purpose, some simplified non-linear models are used. These models represent characteristic types of non-linearities such as: - a reinforced concrete (RC) structure model (with plastic non-linearity), - a PWR core model (with impact non-linearity). For each type of non-linearity, different sets of accelerograms are used (artificial and natural ones). Each set is composed of a relatively high number of accelerograms in order to obtain proper trends. The results are expressed in terms of average and standard deviation values of the characteristic parameter for each non-linearity (i.e., ductility drift for the RC structure model and impact force for the PWR core model). The results show that a relatively large number of time histories may be necessary to get proper predictions of the average value of the characteristic non-linear parameter under consideration. In that situation, it would be difficult to deal with such a result for complex studies on real structures. Nevertheless, it may be necessary to perform transient non-linear seismic computations for design analyses, but with a reduced number of calculations. For this purpose, the previous results are analyzed

  5. Theoretical and computational studies of non-equilibrium and non-statistical dynamics in the gas phase, in the condensed phase and at interfaces.

    Science.gov (United States)

    Spezia, Riccardo; Martínez-Nuñez, Emilio; Vazquez, Saulo; Hase, William L

    2017-04-28

    In this Introduction, we outline the basic problems of non-statistical and non-equilibrium phenomena related to the papers collected in this themed issue. Over the past few years, significant advances in both computing power and the development of theories have allowed the study of larger systems, increasing the time length of simulations and improving the quality of potential energy surfaces. In particular, the possibility of using quantum chemistry to calculate energies and forces 'on the fly' has paved the way to directly studying chemical reactions. This has provided a valuable tool to explore molecular mechanisms at given temperatures and energies and to see whether these reactive trajectories follow statistical laws and/or minimum energy pathways. This themed issue collects different aspects of the problem and gives an overview of recent works and developments in different contexts, from the gas phase to the condensed phase to excited states. This article is part of the themed issue 'Theoretical and computational studies of non-equilibrium and non-statistical dynamics in the gas phase, in the condensed phase and at interfaces'.

  6. A lightweight communication library for distributed computing

    International Nuclear Information System (INIS)

    Groen, Derek; Rieder, Steven; Zwart, Simon Portegies; Grosso, Paola; Laat, Cees de

    2010-01-01

    We present MPWide, a platform-independent communication library for performing message passing between computers. Our library allows coupling of several local message passing interface (MPI) applications through a long-distance network and is specifically optimized for such communications. The implementation is deliberately kept lightweight and platform independent, and the library can be installed and used without administrative privileges. The only requirements are a C++ compiler and at least one open port to a wide-area network on each site. In this paper we present the library, describe the user interface, present performance tests and apply MPWide in a large-scale cosmological N-body simulation on a network of two computers, one in Amsterdam and the other in Tokyo.

  7. Non-Linear Interactive Stories in Computer Games

    DEFF Research Database (Denmark)

    Bangsø, Olav; Jensen, Ole Guttorm; Kocka, Tomas

    2003-01-01

    The paper introduces non-linear interactive stories (NOLIST) as a means to generate varied and interesting stories for computer games automatically. We give a compact representation of a NOLIST based on the specification of atomic stories, and show how to build an object-oriented Bayesian network...

  8. General purpose computers in real time

    International Nuclear Information System (INIS)

    Biel, J.R.

    1989-01-01

    I see three main trends in the use of general purpose computers in real time. The first is more processing power. The second is the use of higher-speed interconnects between computers (allowing more data to be delivered to the processors). The third is the use of larger programs running in the computers. Although there is still work that needs to be done, I believe that all indications are that general purpose computers will be able to meet the online needs of the SSC and LHC machines. 2 figs

  9. Computational imaging with multi-camera time-of-flight systems

    KAUST Repository

    Shrestha, Shikhar

    2016-07-11

    Depth cameras are a ubiquitous technology used in a wide range of applications, including robotic and machine vision, human computer interaction, autonomous vehicles as well as augmented and virtual reality. In this paper, we explore the design and applications of phased multi-camera time-of-flight (ToF) systems. We develop a reproducible hardware system that allows for the exposure times and waveforms of up to three cameras to be synchronized. Using this system, we analyze waveform interference between multiple light sources in ToF applications and propose simple solutions to this problem. Building on the concept of orthogonal frequency design, we demonstrate state-of-the-art results for instantaneous radial velocity capture via Doppler time-of-flight imaging and we explore new directions for optically probing global illumination, for example by de-scattering dynamic scenes and by non-line-of-sight motion detection via frequency gating.

  10. Parallel computing by Monte Carlo codes MVP/GMVP

    International Nuclear Information System (INIS)

    Nagaya, Yasunobu; Nakagawa, Masayuki; Mori, Takamasa

    2001-01-01

    General-purpose Monte Carlo codes MVP/GMVP are well vectorized and thus enable high-speed Monte Carlo calculations. To achieve further speedups, we parallelized the codes on different types of parallel computing platforms, either with platform-specific facilities or by using the standard parallelization library MPI. The platforms used for benchmark calculations were a distributed-memory vector-parallel computer (Fujitsu VPP500), a distributed-memory massively parallel computer (Intel Paragon) and distributed-memory scalar-parallel computers (Hitachi SR2201, IBM SP2). As is generally the case, linear speedup could be obtained for large-scale problems, but parallelization efficiency decreased as the batch size per processing element (PE) became smaller. It was also found that the statistical uncertainty for assembly powers was less than 0.1% in a PWR full-core calculation with more than 10 million histories, which took about 1.5 hours with massively parallel computing. (author)
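
    The underlying parallelization pattern, independent histories per processor followed by a tally reduction, can be sketched in C with MPI as follows (our sketch; a trivial hit-or-miss estimate of pi stands in for the actual particle-transport tally):

        #include <mpi.h>
        #include <stdio.h>
        #include <stdlib.h>

        /* Trivial Monte Carlo with MPI: the histories are split evenly across
         * ranks, each rank tallies independently with its own RNG stream
         * (simplistic seeding; a real code would use a parallel RNG), and
         * MPI_Reduce merges the partial tallies on rank 0. */
        int main(int argc, char **argv) {
            int rank, size;
            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            const long n_total = 10000000L;
            long n_local = n_total / size, hits = 0, hits_sum = 0;
            unsigned int seed = 12345u + (unsigned int)rank;

            for (long i = 0; i < n_local; i++) {
                double x = rand_r(&seed) / (double)RAND_MAX;
                double y = rand_r(&seed) / (double)RAND_MAX;
                if (x * x + y * y <= 1.0) hits++;
            }
            MPI_Reduce(&hits, &hits_sum, 1, MPI_LONG, MPI_SUM, 0, MPI_COMM_WORLD);
            if (rank == 0) {
                long n_done = n_local * size;
                printf("pi ~ %f from %ld histories\n",
                       4.0 * hits_sum / (double)n_done, n_done);
            }
            MPI_Finalize();
            return 0;
        }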

  11. A Modular Environment for Geophysical Inversion and Run-time Autotuning using Heterogeneous Computing Systems

    Science.gov (United States)

    Myre, Joseph M.

    Heterogeneous computing systems have recently come to the forefront of the High-Performance Computing (HPC) community's interest. HPC computer systems that incorporate special-purpose accelerators, such as Graphics Processing Units (GPUs), are said to be heterogeneous. Large-scale heterogeneous computing systems have consistently ranked highly on the Top500 list since the beginning of the heterogeneous computing trend. By using heterogeneous computing systems that consist of both general-purpose processors and special-purpose accelerators, the speed and problem size of many simulations could be dramatically increased. Ultimately this results in enhanced simulation capabilities that allow, in some cases for the first time, the execution of parameter space and uncertainty analyses, model optimizations, and other inverse modeling techniques that are critical for scientific discovery and engineering analysis. However, simplifying the usage and optimization of codes for heterogeneous computing systems remains a challenge. This is particularly true for scientists and engineers for whom understanding HPC architectures and undertaking performance analysis may not be primary research objectives. To enable scientists and engineers to remain focused on their primary research objectives, a modular environment for geophysical inversion and run-time autotuning on heterogeneous computing systems is presented. This environment is composed of three major components: 1) CUSH, a framework for reducing the complexity of programming heterogeneous computer systems; 2) geophysical inversion routines which can be used to characterize physical systems; and 3) run-time autotuning routines designed to determine configurations of heterogeneous computing systems in an attempt to maximize the performance of scientific and engineering codes. Using three case studies, a lattice-Boltzmann method, a non-negative least-squares inversion, and a finite-difference fluid flow method, it is shown that

  12. Parallel computing and networking; Heiretsu keisanki to network

    Energy Technology Data Exchange (ETDEWEB)

    Asakawa, E; Tsuru, T [Japan National Oil Corp., Tokyo (Japan); Matsuoka, T [Japan Petroleum Exploration Co. Ltd., Tokyo (Japan)

    1996-05-01

    This paper describes the trend of parallel computers used in geophysical exploration. Around 1993 was the early period when parallel computers began to be used for geophysical exploration. In those days these computers were classified mainly as MIMD (multiple instruction stream, multiple data stream), SIMD (single instruction stream, multiple data stream) and the like. Parallel computers were publicized in the 1994 meeting of the Geophysical Exploration Society as a `high precision imaging technology`. Concerning libraries for parallel computers, there was a shift to PVM (parallel virtual machine) in 1993 and to MPI (message passing interface) in 1995. In addition, FORTRAN90 compilers were released with support for data-parallel and vector computers. In 1993, the networks used were Ethernet, FDDI, CDDI and HIPPI. In 1995, OC-3 products based on ATM began to propagate. However, ATM remains an inter-office high-speed network because ATM service has not yet spread to the public network. 1 ref.

  13. GPU-based stochastic-gradient optimization for non-rigid medical image registration in time-critical applications

    NARCIS (Netherlands)

    Staring, M.; Al-Ars, Z.; Berendsen, Floris; Angelini, Elsa D.; Landman, Bennett A.

    2018-01-01

    Currently, non-rigid image registration algorithms are too computationally intensive to use in time-critical applications. Existing implementations that focus on speed typically address this by either parallelization on GPU-hardware, or by introducing methodically novel techniques into

  14. Studies of the wavelength dependence of non-sequential double ionization of xenon in strong fields

    International Nuclear Information System (INIS)

    Kaminski, P.; Wiehle, R.; Kamke, W.; Helm, H.; Witzele, B.

    2005-01-01

    The non-sequential double ionization of noble gases in strong fields is still a process which is not completely understood. The most challenging question is: what is the dominant physical process behind the knee structure in the yield of doubly charged ions produced in the focus of an ultrashort laser pulse as a function of intensity? Numerous studies can be explained with the so-called rescattering model, in which an electron is freed by the strong laser field and then driven back to its parent ion by the oscillation of the field. Through this backscattering process it is possible to knock out a second electron. However, in the low-intensity or multiphoton-ionization (MPI) region this model predicts that the first electron cannot gain enough energy in the oscillating electric field to further ionize or excite the ion. We present experimental results for xenon in the MPI region which show a significant contribution of doubly charged ions. A Ti:sapphire laser system (800 nm, 100 fs) is used to ionize the atoms. Coincident detection of the momentum distribution of the photoelectrons with an imaging spectrometer and of the time-of-flight spectrum of the ions allows a detailed view into the ionization process. For the first time we also show a systematic study of the wavelength dependence (780-830 nm and 1180-1550 nm) of the non-sequential double ionization. The ratio Xe²⁺/Xe⁺ shows a surprising oscillatory behavior with varying wavelength. Ref. 1 (author)

  15. Parallelization characteristics of the DeCART code

    International Nuclear Information System (INIS)

    Cho, J. Y.; Joo, H. G.; Kim, H. Y.; Lee, C. C.; Chang, M. H.; Zee, S. Q.

    2003-12-01

    This report describes the parallelization characteristics of the DeCART code and examines its parallel performance. Parallel computing algorithms were implemented in DeCART to reduce the tremendous computational burden and memory requirement involved in three-dimensional whole-core transport calculations. In the parallelization of the DeCART code, axial domain decomposition is realized first by using MPI (Message Passing Interface), and then azimuthal-angle domain decomposition by using either MPI or OpenMP. When MPI is used for both the axial and the angle domain decomposition, the concept of MPI grouping is employed for convenient communication within each communication world. Almost all computing modules except for the thermal-hydraulic module are parallelized; these include the MOC ray tracing, CMFD, NEM, region-wise cross-section preparation and cell-homogenization modules. For distributed allocation, most of the MOC and CMFD/NEM variables are allocated only for the assigned planes, which reduces the required memory by the ratio of the number of assigned planes to the number of all planes. The parallel performance of the DeCART code is evaluated by solving two problems, a rodded variation of the C5G7 MOX three-dimensional benchmark problem and a simplified three-dimensional SMART PWR core problem. The DeCART code shows a good speedup of about 40.1 and 22.4 in the ray-tracing module and about 37.3 and 20.2 in total computing time when using 48 CPUs on the IBM Regatta and 24 CPUs on the Linux cluster, respectively. In the comparison between MPI and OpenMP, OpenMP shows somewhat better performance than MPI. Therefore, it is concluded that the first priority in the parallel computation of the DeCART code is the axial domain decomposition using MPI, then the angular domain using OpenMP, and finally the angular
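
    The two-level decomposition described above can be outlined in a hybrid C sketch (ours, with placeholder names such as sweep_plane_angle; this is not the DeCART source): MPI ranks own axial planes, while OpenMP threads split the azimuthal angles within each plane.

        #include <mpi.h>
        #include <omp.h>
        #include <stdio.h>

        /* Two-level decomposition in the DeCART style (sketch): each MPI rank
         * owns a contiguous range of axial planes (assumes size divides
         * N_PLANES), and within a rank the loop over azimuthal angles is
         * split among OpenMP threads. The sweep body is a placeholder. */
        #define N_PLANES 24
        #define N_ANGLES 16

        static void sweep_plane_angle(int plane, int angle) {
            /* placeholder for the MOC ray trace of one (plane, angle) pair */
        }

        int main(int argc, char **argv) {
            int rank, size, provided;
            MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            int per = N_PLANES / size;              /* planes per rank         */
            int lo = rank * per, hi = lo + per;     /* this rank's plane range */

            for (int p = lo; p < hi; p++) {
                #pragma omp parallel for schedule(static)
                for (int a = 0; a < N_ANGLES; a++)  /* threads split angles    */
                    sweep_plane_angle(p, a);
            }
            /* axial coupling terms would be exchanged here with MPI calls */
            MPI_Barrier(MPI_COMM_WORLD);
            MPI_Finalize();
            return 0;
        }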

  16. Time-Predictable Computer Architecture

    Directory of Open Access Journals (Sweden)

    Schoeberl Martin

    2009-01-01

    Today's general-purpose processors are optimized for maximum throughput. Real-time systems need a processor with both a reasonable and a known worst-case execution time (WCET). Features such as pipelines with instruction dependencies, caches, branch prediction, and out-of-order execution complicate WCET analysis and lead to very conservative estimates. In this paper, we evaluate the issues of current architectures with respect to WCET analysis. Then, we propose solutions for a time-predictable computer architecture. The proposed architecture is evaluated by implementing some of its features in a Java processor. The resulting processor is a good target for WCET analysis and still performs well in the average case.

  17. Recent achievements in real-time computational seismology in Taiwan

    Science.gov (United States)

    Lee, S.; Liang, W.; Huang, B.

    2012-12-01

    Real-time computational seismology is now achievable; it requires a tight connection between seismic databases and high-performance computing. We have developed a real-time moment tensor monitoring system (RMT) using continuous BATS records and a moment tensor inversion (CMT) technique. A real-time online earthquake simulation service is also open to researchers and to public earthquake-science education (ROS). Combining RMT with ROS, an earthquake report based on computational seismology can be provided within 5 minutes after an earthquake occurs (RMT obtains the point-source information; ROS completes a 3D simulation) in real time. For more information, visit the real-time computational seismology earthquake report webpage (RCS).

  18. A method of non-contact reading code based on computer vision

    Science.gov (United States)

    Zhang, Chunsen; Zong, Xiaoyu; Guo, Bingxuan

    2018-03-01

    To guarantee the security of computer information exchange between internal and external networks (trusted and untrusted networks), a non-contact code-reading method based on machine vision is proposed, which differs from existing physical network-isolation methods. Using a computer monitor, a camera and other equipment, the information to be exchanged is processed as follows: the data are encoded into an image, the standard image is generated and displayed, the actual image is captured, the homography matrix is calculated, and the image is distortion-corrected and decoded using the calibration. This achieves secure, non-contact, one-way transmission of computer information between the internal and external networks. The effectiveness of the proposed method is verified by experiments on real computer text data; a data transfer speed of 24 kb/s can be achieved. The experiments show that the algorithm offers high security, high speed and little loss of information, which can meet the daily needs of confidentiality departments to update data effectively and reliably. It solves the difficulty of computer information exchange between secret and non-secret networks, with distinctive originality, practicality, and practical research value.

  19. Differences in prevalence of self-reported musculoskeletal symptoms among computer and non-computer users in a Nigerian population: a cross-sectional study

    Directory of Open Access Journals (Sweden)

    Ayanniyi O

    2010-08-01

    Background: Literature abounds on the prevalent nature of self-reported musculoskeletal symptoms (SRMS) among computer users, but studies that actually compare this with non-computer users are meagre, thereby reducing the strength of the evidence. This study compared the prevalence of SRMS between computer and non-computer users and assessed the risk factors associated with SRMS. Methods: A total of 472 participants comprising equal numbers of age- and sex-matched computer and non-computer users were assessed for the presence of SRMS. Information concerning musculoskeletal symptoms and discomfort in the neck, shoulders, upper back, elbows, wrists/hands, low back, hips/thighs, knees and ankles/feet was obtained using the standardized Nordic questionnaire. Results: The prevalence of SRMS was significantly higher in the computer users than in the non-computer users both over the past 7 days (χ² = 39.11, p = 0.001) and during the past 12 months (χ² = 53.56, p = 0.001). The odds of reporting musculoskeletal symptoms were lowest for participants above the age of 40 years (OR = 0.42, 95% CI = 0.31-0.64 over the past 7 days; OR = 0.61, 95% CI = 0.47-0.77 during the past 12 months) and were also reduced in female participants. Increasing daily hours and accumulated years of computer use and tasks of data processing and designs/graphics significantly increased the odds. Conclusion: The prevalence of SRMS was significantly higher in the computer users than in the non-computer users, and younger age, being male, working longer hours daily, increasing years of computer use, data-entry tasks and computer designs/graphics were the significant risk factors for reporting musculoskeletal symptoms among the computer users. Computer use may explain the increased prevalence of SRMS among the computer users.

  20. Finite-volume effects due to spatially non-local operators arXiv

    CERN Document Server

    Briceño, Raúl A.; Hansen, Maxwell T.; Monahan, Christopher J.

    Spatially non-local matrix elements are useful lattice-QCD observables in a variety of contexts, for example in determining hadron structure. To quote credible estimates of the systematic uncertainties in these calculations, one must understand, among other things, the size of the finite-volume effects when such matrix elements are extracted from numerical lattice calculations. In this work, we estimate finite-volume effects for matrix elements of non-local operators, composed of two currents displaced in a spatial direction by a distance $\xi$. We find that the finite-volume corrections depend on the details of the matrix element. If the external state is the lightest degree of freedom in the theory, e.g. the pion in QCD, then the volume corrections scale as $e^{-m_\pi (L-\xi)}$, where $m_\pi$ is the mass of the light state. For heavier external states the usual $e^{-m_\pi L}$ form is recovered, but with a polynomial prefactor of the form $L^m/|L-\xi|^n$ that can lead to enhanced volume effects. These ...

  1. Covariant non-commutative space–time

    Directory of Open Access Journals (Sweden)

    Jonathan J. Heckman

    2015-05-01

    We introduce a covariant non-commutative deformation of 3+1-dimensional conformal field theory. The deformation introduces a short-distance scale ℓp, and thus breaks scale invariance, but preserves all space–time isometries. The non-commutative algebra is defined on space–times with non-zero constant curvature, i.e. dS4 or AdS4. The construction makes essential use of the representation of CFT tensor operators as polynomials in an auxiliary polarization tensor. The polarization tensor takes active part in the non-commutative algebra, which for dS4 takes the form of so(5,1), while for AdS4 it assembles into so(4,2). The structure of the non-commutative correlation functions hints that the deformed theory contains gravitational interactions and a Regge-like trajectory of higher spin excitations.

  2. Distributed computing feasibility in a non-dedicated homogeneous distributed system

    Science.gov (United States)

    Leutenegger, Scott T.; Sun, Xian-He

    1993-01-01

    The low cost and availability of clusters of workstations have led researchers to re-explore distributed computing using independent workstations. This approach may provide better cost/performance than tightly coupled multiprocessors. In practice, this approach often utilizes wasted cycles to run parallel jobs. The feasibility of such a non-dedicated parallel processing environment, assuming workstation processes have preemptive priority over parallel tasks, is addressed. An analytical model is developed to predict parallel job response times. Our model provides insight into how significantly workstation-owner interference degrades parallel program performance. A new term, task ratio, which relates the parallel task demand to the mean service demand of non-parallel workstation processes, is introduced. It is proposed that the task ratio is a useful metric for determining how large the demand of a parallel application must be in order to make efficient use of a non-dedicated distributed system.

  3. 7 CFR 1.603 - How are time periods computed?

    Science.gov (United States)

    2010-01-01

    7 CFR 1.603, Agriculture, ... Licenses, General Provisions. § 1.603 How are time periods computed? (a) General. Time periods are computed as follows: (1) The day of the act or event from which the period begins to run is not included. (2...

  4. Finite magnetic relaxation in x-space magnetic particle imaging: Comparison of measurements and ferrohydrodynamic models.

    Science.gov (United States)

    Dhavalikar, R; Hensley, D; Maldonado-Camargo, L; Croft, L R; Ceron, S; Goodwill, P W; Conolly, S M; Rinaldi, C

    2016-08-03

    Magnetic Particle Imaging (MPI) is an emerging tomographic imaging technology that detects magnetic nanoparticle tracers by exploiting their non-linear magnetization properties. In order to predict the behavior of nanoparticles in an imager, it is possible to use a non-imaging MPI relaxometer or spectrometer to characterize the behavior of nanoparticles in a controlled setting. In this paper we explore the use of ferrohydrodynamic magnetization equations for predicting the response of particles in an MPI relaxometer. These include a magnetization equation developed by Shliomis (Sh) which has a constant relaxation time and a magnetization equation which uses a field-dependent relaxation time developed by Martsenyuk, Raikher and Shliomis (MRSh). We compare the predictions from these models with measurements and with the predictions based on the Langevin function that assumes instantaneous magnetization response of the nanoparticles. The results show good qualitative and quantitative agreement between the ferrohydrodynamic models and the measurements without the use of fitting parameters and provide further evidence of the potential of ferrohydrodynamic modeling in MPI.
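
    For orientation, the models being compared can be written compactly; the forms below are the standard textbook ones for a quiescent fluid, quoted as an aid and not transcribed from the paper. The Langevin model assumes the magnetization M tracks its equilibrium value instantaneously, whereas a Shliomis-type relaxation equation lets M lag with relaxation time τ (constant for Sh, field-dependent for MRSh):

        $$ \frac{dM}{dt} = -\frac{1}{\tau}\bigl(M - M_{eq}(H)\bigr), \qquad M_{eq}(H) = M_s\,L(\xi), \quad L(\xi) = \coth\xi - \frac{1}{\xi}, \quad \xi = \frac{\mu_0 m H}{k_B T} $$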

  5. Improving appropriate use of echocardiography and single-photon emission computed tomographic myocardial perfusion imaging: a continuous quality improvement initiative.

    Science.gov (United States)

    Johnson, Thomas V; Rose, Geoffrey A; Fenner, Deborah J; Rozario, Nigel L

    2014-07-01

    Appropriate use criteria for cardiovascular imaging have been published, but compliance in practice has been incomplete, with persistent high rates of inappropriate use. The aim of this study was to show the efficacy of a continuous quality improvement (CQI) initiative to favorably influence the appropriate use of outpatient transthoracic echocardiography and single-photon emission computed tomographic (SPECT) myocardial perfusion imaging (MPI) in a large cardiovascular practice. In this prospective study, a multiphase CQI initiative was implemented, and its impact on ordering patterns for outpatient transthoracic echocardiography and SPECT MPI was assessed. Between November and December 2010, a baseline analysis of the application of appropriate use criteria to indications for outpatient transthoracic echocardiographic studies (n = 203) and SPECT MPI studies (n = 205) was performed, with studies categorized as "appropriate," "inappropriate," "uncertain," or "unclassified." The CQI initiative was then begun, with (1) clinician education, including didactic lectures and case-based presentations with audience participation; (2) system changes in ordering processes, with redesigned image ordering forms; and (3) peer review and feedback. A follow-up analysis was then performed between June and August 2012, with categorization of indications for transthoracic echocardiographic studies (n = 206) and SPECT MPI studies (n = 206). At baseline, 73.9% of echocardiographic studies were categorized as appropriate, 16.7% as inappropriate, 5.9% as uncertain, and 3.4% as unclassified. Similarly, for SPECT MPI studies 71.7% were categorized as appropriate, 18.5% as inappropriate, 7.8% as uncertain, and 1.9% as unclassified. Separate analysis of the two most important categories, appropriate and inappropriate, demonstrated a significant improvement after the CQI initiative, with a 63% reduction in inappropriate echocardiographic studies (18.5% vs 6.9%, P = .0010) and a 46% reduction

  6. Instruction timing for the CDC 7600 computer

    International Nuclear Information System (INIS)

    Lipps, H.

    1975-01-01

    This report provides timing information for all instructions of the Control Data 7600 computer, except for instructions of type 01X, to enable the optimization of 7600 programs. The timing rules serve as background information for timing charts which are produced by a program (TIME76) of the CERN Program Library. The rules that co-ordinate the different sections of the CPU are stated in as much detail as is necessary to time the flow of instructions for a given sequence of code. Instruction fetch, instruction issue, and access to small core memory are treated at length, since details are not available from the computer manuals. Annotated timing charts are given for 24 examples, chosen to display the full range of timing considerations. (Author)

  7. 50 CFR 221.3 - How are time periods computed?

    Science.gov (United States)

    2010-10-01

    50 CFR 221.3, Wildlife and Fisheries, ... Provisions. § 221.3 How are time periods computed? (a) General. Time periods are computed as follows: (1) The day of the act or event from which the period begins to run is not included. (2) The last day of the...

  8. Timing analysis for embedded systems using non-preemptive EDF scheduling under bounded error arrivals

    Directory of Open Access Journals (Sweden)

    Michael Short

    2017-07-01

    Embedded systems consist of one or more processing units which are completely encapsulated by the devices under their control, and they often have stringent timing constraints associated with their functional specification. Previous research has considered the performance of different types of task-scheduling algorithms and developed associated timing-analysis techniques for such systems. Although preemptive scheduling techniques have traditionally been favored, rapid increases in processor speeds combined with improved insights into the behavior of non-preemptive scheduling techniques have led to increased interest in their use for real-time applications such as multimedia, automation and control. However, when non-preemptive scheduling techniques are employed, there is a potential lack of error confinement should any timing errors occur in individual software tasks. In this paper, the focus is upon adding fault tolerance in systems using non-preemptive deadline-driven scheduling. Schedulability conditions are derived for fault-tolerant periodic and sporadic task sets experiencing bounded error arrivals under non-preemptive deadline scheduling. A timing-analysis algorithm is presented based upon these conditions and its run-time properties are studied. Computational experiments show it to be highly efficient in terms of run-time complexity and competitive ratio when compared to previous approaches.
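
    The scheduling model in question is easy to state in code. The C fragment below (our illustration with made-up job parameters) dispatches jobs by non-preemptive EDF: at each decision point the released job with the earliest absolute deadline is chosen and then runs to completion, so a long-running faulty job cannot be preempted, which is exactly the error-confinement concern raised above.

        #include <stdio.h>

        /* Non-preemptive EDF dispatch (sketch): among the jobs released by
         * time t, pick the one with the earliest absolute deadline and run
         * it to completion; nothing can preempt it once started. The job
         * parameters are illustrative only. */
        typedef struct { double release, wcet, deadline; int done; } Job;

        int main(void) {
            Job jobs[] = { {0.0, 2.0, 5.0, 0}, {1.0, 1.0, 3.0, 0}, {1.5, 2.0, 9.0, 0} };
            int n = 3, left = n;
            double t = 0.0;

            while (left > 0) {
                int pick = -1;
                for (int i = 0; i < n; i++)   /* earliest deadline among released */
                    if (!jobs[i].done && jobs[i].release <= t &&
                        (pick < 0 || jobs[i].deadline < jobs[pick].deadline))
                        pick = i;
                if (pick < 0) { t += 0.1; continue; } /* idle until next release */
                t += jobs[pick].wcet;                 /* runs non-preemptively   */
                jobs[pick].done = 1; left--;
                printf("job %d finished at %.1f (deadline %.1f)%s\n", pick, t,
                       jobs[pick].deadline, t > jobs[pick].deadline ? " MISS" : "");
            }
            return 0;
        }

    Running the sketch shows the characteristic blocking effect: the job released at t = 1.0 with the tightest deadline must wait for the already-running job to finish before it can start.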

  9. Super computer made with Linux cluster

    International Nuclear Information System (INIS)

    Lee, Jeong Hun; Oh, Yeong Eun; Kim, Jeong Seok

    2002-01-01

    This book consists of twelve chapters, which introduce supercomputers made with Linux clusters. The contents of this book are: Linux clusters; the principles of clustering; the design of a Linux cluster; Linux basics; building up a terminal server and clients; a Beowulf cluster with Debian GNU/Linux; a cluster system with Red Hat; monitoring systems; application programming with MPI; set-up, installation and application programming with PVM, including PVM programming and XPVM; OpenPBS, with its composition, installation and set-up; and Grid computing, covering Grid systems, GSI, GRAM and MDS with their installation and the use of toolkits.

  10. Simplified non-linear time-history analysis based on the Theory of Plasticity

    DEFF Research Database (Denmark)

    Costa, Joao Domingues

    2005-01-01

    This paper aims at giving a contribution to the problem of developing simplified non-linear time-history (NLTH) analysis of structures whose dynamic response is mainly governed by plastic deformations, able to provide designers with sufficiently accurate results. The method to be presented is based on the Theory of Plasticity. Firstly, the formulation and the computational procedure to perform time-history analysis of a rigid-plastic single-degree-of-freedom (SDOF) system are presented. The necessary conditions for the method to incorporate pinching as well as strength degradation ...

  11. Computational model for real-time determination of tritium inventory in a detritiation installation

    International Nuclear Information System (INIS)

    Bornea, Anisia; Stefanescu, Ioan; Zamfirache, Marius; Stefan, Iuliana; Sofalca, Nicolae; Bidica, Nicolae

    2008-01-01

    At ICIT Rm. Valcea an experimental pilot plant was built with the main objective of developing a technology for detritiation of the heavy water processed in the CANDU-type reactors of the nuclear power plant at Cernavoda, Romania. Since the aspects related to safeguards and safety of such a detritiation installation are of great importance, a complex computational model has been developed. The model allows real-time calculation of the tritium inventory in a working installation. The applied detritiation technology is catalyzed isotopic exchange coupled with cryogenic distillation. Computational models for non-steady working conditions have been developed for each isotopic-exchange process; by coupling these processes, the tritium inventory can be determined in real time. The computational model was developed based on the experience gained with the pilot installation. The model uses a set of parameters specific to the isotopic-exchange processes, which were experimentally determined in the pilot installation. The model is included in the monitoring system and uses as input data the parameters acquired in real time from the automation system of the pilot installation. A friendly interface has been created to visualize the final results as data or graphs. (authors)

  12. Parallelization of simulation code for liquid-gas model of lattice-gas fluid

    International Nuclear Information System (INIS)

    Kawai, Wataru; Ebihara, Kenichi; Kume, Etsuo; Watanabe, Tadashi

    2000-03-01

    A simulation code for hydrodynamical phenomena based on the liquid-gas model of lattice-gas fluids was parallelized using the MPI (Message Passing Interface) library. The parallelized code can be applied to larger simulations than the non-parallelized code. The calculation times of the parallelized code on VPP500 (a vector-parallel supercomputer with distributed memory), AP3000 (a scalar-parallel server with distributed memory), and a workstation cluster decreased in inverse proportion to the number of processors. (author)
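
    A parallelization of this kind typically uses a one-dimensional domain decomposition with ghost cells; the C/MPI sketch below (our generic illustration, not the code described in the record) shows the per-step halo exchange between neighbouring ranks.

        #include <mpi.h>
        #include <string.h>

        /* 1-D domain decomposition with ghost cells (sketch): each rank owns
         * NLOC lattice rows plus one ghost row on each side, and swaps ghost
         * rows with its periodic neighbours every step using MPI_Sendrecv. */
        #define NLOC 64
        #define NX   128

        int main(int argc, char **argv) {
            int rank, size;
            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            /* rows 0 and NLOC+1 are ghosts; rows 1..NLOC are owned */
            static double cell[NLOC + 2][NX];
            memset(cell, 0, sizeof cell);
            int up   = (rank + 1) % size;            /* periodic neighbours */
            int down = (rank - 1 + size) % size;

            for (int step = 0; step < 100; step++) {
                MPI_Sendrecv(cell[NLOC], NX, MPI_DOUBLE, up,   0,
                             cell[0],    NX, MPI_DOUBLE, down, 0,
                             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                MPI_Sendrecv(cell[1],        NX, MPI_DOUBLE, down, 1,
                             cell[NLOC + 1], NX, MPI_DOUBLE, up,   1,
                             MPI_COMM_WORLD, MPI_STATUS_IGNORE);
                /* local propagation/collision update on rows 1..NLOC here */
            }
            MPI_Finalize();
            return 0;
        }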

  13. Self-Motion Perception: Assessment by Real-Time Computer Generated Animations

    Science.gov (United States)

    Parker, Donald E.

    1999-01-01

    Our overall goal is to develop materials and procedures for assessing vestibular contributions to spatial cognition. The specific objective of the research described in this paper is to evaluate computer-generated animations as potential tools for studying self-orientation and self-motion perception. Specific questions addressed in this study included the following. First, does a non-verbal perceptual reporting procedure using real-time animations improve assessment of spatial orientation? Are reports reliable? Second, do reports confirm expectations based on stimuli to the vestibular apparatus? Third, can reliable reports be obtained when self-motion description vocabulary training is omitted?

  14. Diagnostic value of thallium-201 myocardial perfusion IQ-SPECT without and with computed tomography-based attenuation correction to predict clinically significant and insignificant fractional flow reserve: A single-center prospective study.

    Science.gov (United States)

    Tanaka, Haruki; Takahashi, Teruyuki; Ohashi, Norihiko; Tanaka, Koichi; Okada, Takenori; Kihara, Yasuki

    2017-12-01

    The aim of this study was to clarify the predictive value of fractional flow reserve (FFR) determined by myocardial perfusion imaging (MPI) using thallium (Tl)-201 IQ-SPECT without and with computed tomography-based attenuation correction (CT-AC) for patients with stable coronary artery disease (CAD). We assessed 212 angiographically identified diseased vessels using adenosine-stress Tl-201 MPI-IQ-SPECT/CT in 84 consecutive, prospectively identified patients with stable CAD. We compared the FFR in 136 of the 212 diseased vessels with visual semiquantitative interpretations of the corresponding territories on MPI-IQ-SPECT images without and with CT-AC. FFR inversely correlated most accurately with regional summed difference scores (rSDS) in images both without and with CT-AC (r = -0.584 and r = -0.568, respectively). The system can predict FFR at an optimal rSDS cut-off.

  15. TESLA GPUs versus MPI with OpenMP for the Forward Modeling of Gravity and Gravity Gradient of Large Prisms Ensemble

    Directory of Open Access Journals (Sweden)

    Carlos Couder-Castañeda

    2013-01-01

    An implementation with CUDA technology on one and on several graphics processing units (GPUs) is presented for the calculation of the forward modeling of gravitational fields from a three-dimensional volumetric ensemble composed of unitary prisms of constant density. We compared the performance results obtained with the GPUs against a previous version coded in OpenMP with MPI, and we analyzed the results on both platforms. Today, the use of GPUs represents a breakthrough in parallel computing, which has led to the development of applications in various fields. Nevertheless, in some applications the decomposition of the tasks is not trivial, as can be appreciated in this paper. Unlike a trivial domain decomposition, we propose to decompose the problem by sets of prisms and to use a separate memory space per CUDA processing core, avoiding the performance decay that would result from the constant kernel calls needed in a parallelization over observation points. The design and implementation created are the main contributions of this work, because the parallelization scheme implemented is not trivial. The performance results obtained are comparable to those of a small processing cluster.
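
    The decomposition by sets of prisms has a natural MPI analogue, which also conveys the idea behind the OpenMP/MPI reference version: each rank accumulates the field of its share of the sources at all observation points, and a reduction sums the partial fields. The C sketch below is ours; a point-mass kernel and synthetic geometry stand in for the closed-form prism formula.

        #include <mpi.h>
        #include <math.h>
        #include <stdio.h>

        /* Decomposition by prisms (sketch): each rank accumulates the field
         * of its subset of sources at ALL observation points, and MPI_Reduce
         * sums the partial fields. A point-mass kernel stands in for the
         * closed-form prism formula. */
        #define NP 100000   /* prisms (sources)   */
        #define NO 1024     /* observation points */

        int main(int argc, char **argv) {
            int rank, size;
            MPI_Init(&argc, &argv);
            MPI_Comm_rank(MPI_COMM_WORLD, &rank);
            MPI_Comm_size(MPI_COMM_WORLD, &size);

            static double gz[NO], gz_sum[NO];        /* zero-initialized */
            for (int p = rank; p < NP; p += size) {  /* my share of prisms */
                double sx = p % 100, sy = (p / 100) % 100, sz = 10.0 + p % 7;
                for (int o = 0; o < NO; o++) {
                    double dx = sx - o, dy = sy - 0.5 * o, dz = sz;
                    double r = sqrt(dx * dx + dy * dy + dz * dz);
                    gz[o] += dz / (r * r * r);       /* point-mass kernel */
                }
            }
            MPI_Reduce(gz, gz_sum, NO, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
            if (rank == 0) printf("gz[0] = %g\n", gz_sum[0]);
            MPI_Finalize();
            return 0;
        }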

  16. Real-time computational photon-counting LiDAR

    Science.gov (United States)

    Edgar, Matthew; Johnson, Steven; Phillips, David; Padgett, Miles

    2018-03-01

    The availability of compact, low-cost, and high-speed MEMS-based spatial light modulators has generated widespread interest in alternative sampling strategies for imaging systems utilizing single-pixel detectors. The development of compressed sensing schemes for real-time computational imaging may have promising commercial applications for high-performance detectors, where the availability of focal plane arrays is expensive or otherwise limited. We discuss the research and development of a prototype light detection and ranging (LiDAR) system via direct time of flight, which utilizes a single high-sensitivity photon-counting detector and fast-timing electronics to recover millimeter accuracy three-dimensional images in real time. The development of low-cost real time computational LiDAR systems could have importance for applications in security, defense, and autonomous vehicles.

  17. The Fourier decomposition method for nonlinear and non-stationary time series analysis.

    Science.gov (United States)

    Singh, Pushpendra; Joshi, Shiv Dutt; Patney, Rakesh Kumar; Saha, Kaushik

    2017-03-01

    For many decades, there has been a general perception in the literature that Fourier methods are not suitable for the analysis of nonlinear and non-stationary data. In this paper, we propose a novel and adaptive Fourier decomposition method (FDM), based on Fourier theory, and demonstrate its efficacy for the analysis of nonlinear and non-stationary time series. The proposed FDM decomposes any data into a small number of 'Fourier intrinsic band functions' (FIBFs). The FDM presents a generalized Fourier expansion with variable amplitudes and variable frequencies of a time series by the Fourier method itself. We propose an idea of a zero-phase filter-bank-based multivariate FDM (MFDM) for the analysis of multivariate nonlinear and non-stationary time series, using the FDM. We also present an algorithm to obtain the cut-off frequencies for the MFDM. The proposed MFDM generates a finite number of band-limited multivariate FIBFs (MFIBFs). The MFDM preserves some intrinsic physical properties of the multivariate data, such as scale alignment, trend and instantaneous frequency. The proposed methods provide a time-frequency-energy (TFE) distribution that reveals the intrinsic structure of the data. Numerical computations and simulations have been carried out and comparisons are made with empirical mode decomposition algorithms.
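
    The central object can be stated compactly; the following is a paraphrase of the expansion as it is usually presented, not a transcription from the paper. The FDM writes a signal as a finite sum of FIBFs, i.e. AM-FM components with non-negative instantaneous frequency:

        $$ x(t) = a_0 + \sum_{i=1}^{M} a_i(t)\cos\phi_i(t), \qquad \omega_i(t) = \frac{d\phi_i(t)}{dt} \ge 0 $$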

  18. How to Build an AppleSeed: A Parallel Macintosh Cluster for Numerically Intensive Computing

    Science.gov (United States)

    Decyk, V. K.; Dauger, D. E.

    We have constructed a parallel cluster consisting of a mixture of Apple Macintosh G3 and G4 computers running the Mac OS, and have achieved very good performance on numerically intensive, parallel plasma particle-in-cell simulations. A subset of the MPI message-passing library was implemented in Fortran77 and C. This library enabled us to port code, without modification, from other parallel processors to the Macintosh cluster. Unlike Unix-based clusters, no special expertise in operating systems is required to build and run the cluster. This enables us to move parallel computing from the realm of experts to the mainstream of computing.

  19. Optimizing NEURON Simulation Environment Using Remote Memory Access with Recursive Doubling on Distributed Memory Systems.

    Science.gov (United States)

    Shehzad, Danish; Bozkuş, Zeki

    2016-01-01

    Increase in complexity of neuronal network models has escalated the efforts to make the NEURON simulation environment efficient. Computational neuroscientists divide the equations into subnets among multiple processors to achieve better hardware performance. On parallel machines for neuronal networks, interprocessor spike exchange consumes a large share of the overall simulation time. NEURON uses the Message Passing Interface (MPI) for communication between processors, and the MPI_Allgather collective is exercised for spike exchange after each interval across distributed memory systems. Increasing the number of processors achieves concurrency and better performance, but it inversely affects MPI_Allgather, which increases the communication time between processors. This necessitates improving the communication methodology to decrease the spike-exchange time over distributed memory systems. This work improves the MPI_Allgather method using Remote Memory Access (RMA) by moving two-sided communication to one-sided communication; the use of a recursive doubling mechanism facilitates efficient communication between the processors in precise steps. This approach enhances communication concurrency and improves overall runtime, making NEURON more efficient for simulation of large neuronal network models.

  1. Myocardial scintigraphy. Clinical use and consequence in a non-invasive cardiological department

    DEFF Research Database (Denmark)

    Dümcke, Christine Elisabeth; Graff, J; Rasmussen, SPL

    2006-01-01

    To analyse the clinical use of MPI in a university hospital without an invasive cardiological laboratory. MATERIAL AND METHODS: In the period 01.01.2002 to 31.12.2003, 259 patients (141 women, 118 men) were referred for MPI from our department of cardiology. RESULTS: Normal MPI was seen in 111 patients (43...

  2. Evolution of perturbed dynamical systems: analytical computation with time independent accuracy

    Energy Technology Data Exchange (ETDEWEB)

    Gurzadyan, A.V. [Russian-Armenian (Slavonic) University, Department of Mathematics and Mathematical Modelling, Yerevan (Armenia); Kocharyan, A.A. [Monash University, School of Physics and Astronomy, Clayton (Australia)

    2016-12-15

    An analytical method for investigating the evolution of dynamical systems with time-independent accuracy is developed for perturbed Hamiltonian systems. Error-free estimation using computer algebra enables the application of the method to complex multi-dimensional Hamiltonian and dissipative systems. It also opens principal opportunities for the qualitative study of chaotic trajectories. The performance of the method is demonstrated on perturbed two-oscillator systems. It can be applied to various non-linear physical and astrophysical systems, e.g. to long-term planetary dynamics. (orig.)

  3. NMF-mGPU: non-negative matrix factorization on multi-GPU systems.

    Science.gov (United States)

    Mejía-Roa, Edgardo; Tabas-Madrid, Daniel; Setoain, Javier; García, Carlos; Tirado, Francisco; Pascual-Montano, Alberto

    2015-02-13

    In the last few years, the Non-negative Matrix Factorization (NMF) technique has gained great interest among the Bioinformatics community, since it is able to extract interpretable parts from high-dimensional datasets. However, the computing time required to process large data matrices may become impractical, even for a parallel application running on a multiprocessor cluster. In this paper, we present NMF-mGPU, an efficient and easy-to-use implementation of the NMF algorithm that takes advantage of the high computing performance delivered by Graphics Processing Units (GPUs). Driven by the ever-growing demands of the video-games industry, graphics cards usually provided in PCs and laptops have evolved from simple graphics-drawing platforms into high-performance programmable systems that can be used as coprocessors for linear-algebra operations. However, these devices may have a limited amount of on-board memory, which is not considered by other NMF implementations on GPU. NMF-mGPU is based on CUDA (Compute Unified Device Architecture), NVIDIA's framework for GPU computing. On devices with little available memory, large input matrices are blockwise transferred from the system's main memory to the GPU's memory and processed accordingly. In addition, NMF-mGPU has been explicitly optimized for the different CUDA architectures. Finally, platforms with multiple GPUs can be synchronized through MPI (Message Passing Interface). In a four-GPU system, this implementation is about 120 times faster than a single conventional processor, and more than four times faster than a single GPU device (i.e., a super-linear speedup). Applications of GPUs in Bioinformatics are getting more and more attention due to their outstanding performance when compared to traditional processors. In addition, their relatively low price represents a highly cost-effective alternative to conventional clusters. In life sciences, this results in an excellent opportunity to facilitate the
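
    The kernel that NMF-mGPU accelerates is, in essence, a pair of dense matrix products iterated to convergence. The classical multiplicative update rules (quoted here from the general NMF literature, not from this paper) factor V ≈ WH with non-negative factors:

        $$ H \leftarrow H \circ \frac{W^{\mathsf{T}} V}{W^{\mathsf{T}} W H}, \qquad W \leftarrow W \circ \frac{V H^{\mathsf{T}}}{W H H^{\mathsf{T}}} $$

    where ∘ and the fractions denote element-wise multiplication and division. Each update is dominated by the matrix products, which is what maps so well onto GPUs and, on low-memory devices, is what motivates the blockwise transfers described above.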

  4. Non-perturbative analytical solutions of the space- and time-fractional Burgers equations

    International Nuclear Information System (INIS)

    Momani, Shaher

    2006-01-01

    Non-perturbative analytical solutions for the generalized Burgers equation with time- and space-fractional derivatives of order α and β, 0 < α, β ≤ 1, are derived using the Adomian decomposition method. The fractional derivatives are considered in the Caputo sense. The solutions are given in the form of series with easily computable terms. Numerical solutions are calculated for the fractional Burgers equation to show the nature of the solution as the fractional derivative parameter is changed.
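
    For reference, the Caputo fractional derivative used here has the standard definition (not specific to this paper): for 0 < α ≤ 1,

        $$ D_t^{\alpha} f(t) = \frac{1}{\Gamma(1-\alpha)} \int_0^t \frac{f'(\tau)}{(t-\tau)^{\alpha}}\, d\tau, $$

    which reduces to the ordinary first derivative as α → 1 and, unlike the Riemann-Liouville form, assigns a constant function a derivative of zero.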

  5. TimeSet: A computer program that accesses five atomic time services on two continents

    Science.gov (United States)

    Petrakis, P. L.

    1993-01-01

    TimeSet is a shareware program for accessing digital time services by telephone. At its initial release, it was capable of capturing time signals only from the U.S. Naval Observatory to set a computer's clock. Later the ability to synchronize with the National Institute of Standards and Technology was added. Now, in Version 7.10, TimeSet is able to access three additional telephone time services in Europe - in Sweden, Austria, and Italy - making a total of five official services addressable by the program. A companion program, TimeGen, provides yet another source of telephone time data strings for callers equipped with TimeSet version 7.10. TimeGen synthesizes UTC time data strings in the Naval Observatory's format from an accurately set and maintained DOS computer clock, and transmits them to callers. This allows an unlimited number of 'freelance' time-generating stations to be created. Timesetting from TimeGen is made feasible by the advent of Becker's RighTime, a shareware program that learns the drift characteristics of a computer's clock and continuously applies a correction to keep it accurate, and also brings 0.01-second resolution to the DOS clock. With clock regulation by RighTime and periodic update calls by the TimeGen station to an official time source via TimeSet, TimeGen offers the same degree of accuracy within the resolution of the computer clock as any official atomic time source.

  6. Robust Non-Local TV-L1 Optical Flow Estimation with Occlusion Detection.

    Science.gov (United States)

    Zhang, Congxuan; Chen, Zhen; Wang, Mingrun; Li, Ming; Jiang, Shaofeng

    2017-06-05

    In this paper, we propose a robust non-local TV-L1 optical flow method with occlusion detection to address the weak robustness of optical flow estimation under motion occlusion. Firstly, a TV-L1 form for flow estimation is defined using a combination of the brightness-constancy and gradient-constancy assumptions in the data term and by varying the weight under the Charbonnier function in the smoothing term. Secondly, to handle the potential risk of outliers in the flow field, a general non-local term is added to the TV-L1 optical flow model to yield the typical non-local TV-L1 form. Thirdly, an occlusion detection method based on triangulation is presented to detect the occlusion regions of the sequence. The proposed non-local TV-L1 optical flow model is solved in a linearized iterative scheme using improved median filtering and a coarse-to-fine computing strategy. The results of extensive experiments indicate that the proposed method can overcome the significant influence of non-rigid motion, motion occlusion, and large-displacement motion. Comparisons with existing state-of-the-art methods on the Middlebury and MPI Sintel test sequences show that the proposed method has higher accuracy and better robustness.
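
    As background, the baseline functional that the method extends is the classical TV-L1 optical flow energy (quoted in its textbook form; the paper's variant adds gradient constancy, Charbonnier weighting, the non-local term and occlusion handling):

        $$ E(u) = \int_{\Omega} \lambda \,\bigl|I_1(\mathbf{x} + u(\mathbf{x})) - I_0(\mathbf{x})\bigr| + |\nabla u_1| + |\nabla u_2| \, d\mathbf{x}, $$

    where u = (u_1, u_2) is the flow field, I_0 and I_1 are the two frames, and λ balances data fidelity against total-variation smoothness.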

  7. High-Performance Data Analysis Tools for Sun-Earth Connection Missions

    Science.gov (United States)

    Messmer, Peter

    2011-01-01

    The data analysis tool of choice for many Sun-Earth Connection missions is the Interactive Data Language (IDL) by ITT VIS. The increasing amount of data produced by these missions and the increasing complexity of image processing algorithms requires access to higher computing power. Parallel computing is a cost-effective way to increase the speed of computation, but algorithms oftentimes have to be modified to take advantage of parallel systems. Enhancing IDL to work on clusters gives scientists access to increased performance in a familiar programming environment. The goal of this project was to enable IDL applications to benefit from both computing clusters as well as graphics processing units (GPUs) for accelerating data analysis tasks. The tool suite developed in this project enables scientists now to solve demanding data analysis problems in IDL that previously required specialized software, and it allows them to be solved orders of magnitude faster than on conventional PCs. The tool suite consists of three components: (1) TaskDL, a software tool that simplifies the creation and management of task farms, collections of tasks that can be processed independently and require only small amounts of data communication; (2) mpiDL, a tool that allows IDL developers to use the Message Passing Interface (MPI) inside IDL for problems that require large amounts of data to be exchanged among multiple processors; and (3) GPULib, a tool that simplifies the use of GPUs as mathematical coprocessors from within IDL. mpiDL is unique in its support for the full MPI standard and its support of a broad range of MPI implementations. GPULib is unique in enabling users to take advantage of an inexpensive piece of hardware, possibly already installed in their computer, and achieve orders of magnitude faster execution time for numerically complex algorithms. TaskDL enables the simple setup and management of task farms on compute clusters. The products developed in this project have the

  8. TV time but not computer time is associated with cardiometabolic risk in Dutch young adults.

    Science.gov (United States)

    Altenburg, Teatske M; de Kroon, Marlou L A; Renders, Carry M; Hirasing, Remy; Chinapaw, Mai J M

    2013-01-01

    TV time and total sedentary time have been positively related to biomarkers of cardiometabolic risk in adults. We aim to examine the association of TV time and computer time separately with cardiometabolic biomarkers in young adults. Additionally, the mediating role of waist circumference (WC) is studied. Data of 634 Dutch young adults (18-28 years; 39% male) were used. Cardiometabolic biomarkers included indicators of overweight, blood pressure, blood levels of fasting plasma insulin, cholesterol, glucose, triglycerides and a clustered cardiometabolic risk score. Linear regression analyses were used to assess the cross-sectional association of self-reported TV and computer time with cardiometabolic biomarkers, adjusting for demographic and lifestyle factors. Mediation by WC was checked using the product-of-coefficient method. TV time was significantly associated with triglycerides (B = 0.004; CI = [0.001;0.05]) and insulin (B = 0.10; CI = [0.01;0.20]). Computer time was not significantly associated with any of the cardiometabolic biomarkers. We found no evidence for WC to mediate the association of TV time or computer time with cardiometabolic biomarkers. We found a significantly positive association of TV time with cardiometabolic biomarkers. In addition, we found no evidence for WC as a mediator of this association. Our findings suggest a need to distinguish between TV time and computer time within future guidelines for screen time.
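
    For readers unfamiliar with the product-of-coefficients method mentioned above, a minimal sketch with synthetic data follows; the variable names and effect sizes are illustrative only, not the study's.

        # Product-of-coefficients mediation: the indirect effect of X (TV time)
        # on Y (a biomarker) through M (waist circumference) is a*b, where a is
        # the X coefficient in M ~ X and b the M coefficient in Y ~ X + M.
        import numpy as np

        rng = np.random.default_rng(0)
        n = 634                                     # sample size as in the study
        x = rng.normal(size=n)                      # TV time (synthetic)
        m = 0.3 * x + rng.normal(size=n)            # waist circumference
        y = 0.1 * x + 0.5 * m + rng.normal(size=n)  # biomarker

        def ols(cols, target):
            X = np.column_stack([np.ones(n)] + cols)
            return np.linalg.lstsq(X, target, rcond=None)[0]

        a = ols([x], m)[1]        # effect of X on the mediator
        b = ols([x, m], y)[2]     # effect of the mediator on Y, adjusting for X
        print("indirect (mediated) effect a*b =", a * b)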

  9. 1995 CERN school of computing. Proceedings

    Energy Technology Data Exchange (ETDEWEB)

    Vandoni, C E [ed.

    1995-10-25

    These proceedings contain a written account of the majority of the lectures given at the 1995 CERN School of Computing. The Scientific Programme was articulated on 8 main themes: Human Computer Interfaces; Collaborative Software Engineering; Information Super Highways; Trends in Computer Architecture/Industry; Parallel Architectures (MPP); Mathematical Computing; Data Acquisition Systems; World-Wide Web for Physics. A number of lectures dealt with general aspects of computing, in particular in the area of Human Computer Interfaces (computer graphics, user interface tools and virtual reality). Applications in HEP of computer graphics (event display) were the subject of two lectures. The main theme of Mathematical Computing covered Mathematica and the usage of statistics packages. The important subject of Data Acquisition Systems was covered by lectures on switching techniques and simulation and modelling tools. A series of lectures dealt with the Information Super Highways and World-Wide Web Technology and its applications to High Energy Physics. Different aspects of Object Oriented Information Engineering Methodology and Object Oriented Programming in HEP were dealt with in detail, also in connection with data acquisition systems. On the theme `Trends in Computer Architecture and Industry` lectures were given on ATM Switching, FORTRAN90 and High Performance FORTRAN. The Parallel Architectures (MPP) lectures dealt with very large scale open systems, the history and future of computer system architecture, the message passing paradigm, and the features of PVM and MPI. (orig.).

  10. 1995 CERN school of computing. Proceedings

    International Nuclear Information System (INIS)

    Vandoni, C.E.

    1995-01-01

    These proceedings contain a written account of the majority of the lectures given at the 1995 CERN School of Computing. The Scientific Programme was articulated on 8 main themes: Human Computer Interfaces; Collaborative Software Engineering; Information Super Highways; Trends in Computer Architecture/Industry; Parallel Architectures (MPP); Mathematical Computing; Data Acquisition Systems; World-Wide Web for Physics. A number of lectures dealt with general aspects of computing, in particular in the area of Human Computer Interfaces (computer graphics, user interface tools and virtual reality). Applications in HEP of computer graphics (event display) were the subject of two lectures. The main theme of Mathematical Computing covered Mathematica and the usage of statistics packages. The important subject of Data Acquisition Systems was covered by lectures on switching techniques and simulation and modelling tools. A series of lectures dealt with the Information Super Highways and World-Wide Web Technology and its applications to High Energy Physics. Different aspects of Object Oriented Information Engineering Methodology and Object Oriented Programming in HEP were dealt with in detail, also in connection with data acquisition systems. On the theme 'Trends in Computer Architecture and Industry' lectures were given on ATM Switching, FORTRAN90 and High Performance FORTRAN. The Parallel Architectures (MPP) lectures dealt with very large scale open systems, the history and future of computer system architecture, the message passing paradigm, and the features of PVM and MPI. (orig.)

  11. Factorizable S-matrix for SO(D)/SO(2) circle times SO(D - 2) non-linear σ models with fermions

    International Nuclear Information System (INIS)

    Abdalla, E.; Lima-Santos, A.

    1988-01-01

    The authors compute the exact S matrix for the non-linear sigma model with symmetry SO(D)/SO(2) circle times SO(D-2) coupled to fermions in a minimal or supersymmetric way. The model has some relevance in string theory with non-zero external curvature.

  12. Fast and accurate CMB computations in non-flat FLRW universes

    Science.gov (United States)

    Lesgourgues, Julien; Tram, Thomas

    2014-09-01

    We present a new method for calculating CMB anisotropies in a non-flat Friedmann universe, relying on a very stable algorithm for the calculation of hyperspherical Bessel functions, that can be pushed to arbitrary precision levels. We also introduce a new approximation scheme which gradually takes over in the flat space limit and leads to significant reductions of the computation time. Our method is implemented in the Boltzmann code class. It can be used to benchmark the accuracy of the camb code in curved space, which is found to match expectations. For default precision settings, corresponding to 0.1% for scalar temperature spectra and 0.2% for scalar polarisation spectra, our code is two to three times faster, depending on curvature. We also simplify the temperature and polarisation source terms significantly, so the different contributions to the Cℓ's are easy to identify inside the code.

  13. Fast and accurate CMB computations in non-flat FLRW universes

    International Nuclear Information System (INIS)

    Lesgourgues, Julien; Tram, Thomas

    2014-01-01

    We present a new method for calculating CMB anisotropies in a non-flat Friedmann universe, relying on a very stable algorithm for the calculation of hyperspherical Bessel functions, that can be pushed to arbitrary precision levels. We also introduce a new approximation scheme which gradually takes over in the flat space limit and leads to significant reductions of the computation time. Our method is implemented in the Boltzmann code class. It can be used to benchmark the accuracy of the camb code in curved space, which is found to match expectations. For default precision settings, corresponding to 0.1% for scalar temperature spectra and 0.2% for scalar polarisation spectra, our code is two to three times faster, depending on curvature. We also simplify the temperature and polarisation source terms significantly, so the different contributions to the Cℓ's are easy to identify inside the code.

  14. Time-of-Flight Cameras in Computer Graphics

    DEFF Research Database (Denmark)

    Kolb, Andreas; Barth, Erhardt; Koch, Reinhard

    2010-01-01

    Computer Graphics, Computer Vision and Human Machine Interaction (HMI). These technologies are starting to have an impact on research and commercial applications. The upcoming generation of ToF sensors, however, will be even more powerful and will have the potential to become “ubiquitous real-time geometry...

  15. 29 CFR 4245.8 - Computation of time.

    Science.gov (United States)

    2010-07-01

    29 CFR 4245.8, Computation of time. Title 29, Labor; Regulations Relating to Labor (Continued); Pension Benefit Guaranty Corporation; Insolvency, Reorganization, Termination, and Other Rules Applicable to Multiemployer Plans; Notice of Insolvency; § 4245.8 Computation of time.

  16. Computation Offloading for Frame-Based Real-Time Tasks under Given Server Response Time Guarantees

    Directory of Open Access Journals (Sweden)

    Anas S. M. Toma

    2014-11-01

    Computation offloading has been adopted to improve the performance of embedded systems by offloading the computation of some tasks, especially computation-intensive tasks, to servers or clouds. This paper explores computation offloading for real-time tasks in embedded systems, given response-time guarantees from the servers, to decide which tasks should be offloaded so that their results arrive in time. We consider frame-based real-time tasks with the same period and relative deadline. When the execution order of the tasks is given, the problem can be solved in linear time. However, when the execution order is not specified, we prove that the problem is NP-complete. We develop a pseudo-polynomial-time algorithm for deriving feasible schedules, if they exist. An approximation scheme is also developed to trade off the approximation error against computational complexity. Our algorithms are extended to minimize the period/relative deadline of the tasks for performance maximization. The algorithms are evaluated with a case study of a surveillance system and with synthesized benchmarks.
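
    The abstract leaves the task model implicit; under some simplifying assumptions (each task i either runs locally for time local[i], or is dispatched in send[i] units of local time with its result guaranteed back response[i] after dispatch, all tasks sharing one deadline), the fixed-order feasibility test can be sketched in linear time as follows:

        # Feasibility of one frame of tasks with a common deadline, fixed order.
        # An offloaded task occupies the processor only while being dispatched,
        # but its result must return before the deadline. Illustrative model only.
        def feasible(order, offload, local, send, response, deadline):
            t = 0.0                          # elapsed local processor time
            for i in order:
                if i in offload:
                    t += send[i]             # dispatch the task to the server
                    if t + response[i] > deadline:
                        return False         # result would arrive too late
                else:
                    t += local[i]            # execute on the embedded device
            return t <= deadline             # local work must also finish in time

        local    = {0: 4.0, 1: 6.0, 2: 3.0}
        send     = {0: 1.0, 1: 1.5, 2: 0.5}
        response = {0: 5.0, 1: 4.0, 2: 6.0}
        print(feasible([0, 1, 2], offload={1}, local=local, send=send,
                       response=response, deadline=10.0))   # -> True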

  17. Increased accuracy of single photon emission computed tomography (SPECT) myocardial perfusion scintigraphy using iterative reconstruction of images

    Directory of Open Access Journals (Sweden)

    Stević Miloš

    2016-01-01

    Background/Aim. Filtered back projection (FBP) is a common way of processing myocardial perfusion imaging (MPI) studies. FBP is prone to artifacts that can cause false-positive results. Iterative reconstruction (IR) was developed to reduce false-positive findings in MPI studies. The aim of this study was to evaluate the difference in the number of false-positive findings in MPI studies between FBP and IR processing. Methods. We examined 107 patients with angina pectoris (77 men and 30 women, aged 32-82) with MPI and coronary angiography (CAG). MPI studies were processed with both FBP and IR. A positive finding at MPI was visualization of a perfusion defect; a positive finding at CAG was stenosis of a coronary artery. A perfusion defect at MPI without coronary artery stenosis at CAG was considered false-positive. The results were statistically analyzed with bivariate correlation and the one-sample t-test. Results. There were 20.6% normal and 79.4% pathologic findings with FBP, 30.8% normal and 69.2% pathologic with IR, and 37.4% normal and 62.6% pathologic at CAG. FBP produced 19 false-positive findings, IR 11. The correlation with CAG was 0.658 for FBP (p < 0.01) and 0.784 for IR (p < 0.01). The number of false-positive findings at MPI with IR was significantly lower than with FBP (p < 0.01). Conclusion. Our study shows that IR processing of MPI scintigraphy yields fewer false-positive findings; it is therefore our method of choice for processing MPI studies.

  18. (Re)engineering Earth System Models to Expose Greater Concurrency for Ultrascale Computing: Practice, Experience, and Musings

    Science.gov (United States)

    Mills, R. T.

    2014-12-01

    As the high performance computing (HPC) community pushes towards the exascale horizon, the importance and prevalence of fine-grained parallelism in new computer architectures is increasing. This is perhaps most apparent in the proliferation of so-called "accelerators" such as the Intel Xeon Phi or NVIDIA GPGPUs, but the trend also holds for CPUs, where serial performance has grown slowly and effective use of hardware threads and vector units is becoming increasingly important to realizing high performance. This has significant implications for weather, climate, and Earth system modeling codes, many of which display impressive scalability across MPI ranks but take relatively little advantage of threading and vector processing. In addition to increasing parallelism, next generation codes will also need to address increasingly deep hierarchies for data movement: NUMA/cache levels, on node vs. off node, local vs. wide neighborhoods on the interconnect, and even in the I/O system. We will discuss some approaches (grounded in experiences with the Intel Xeon Phi architecture) for restructuring Earth science codes to maximize concurrency across multiple levels (vectors, threads, MPI ranks), and also discuss some novel approaches for minimizing expensive data movement/communication.

  19. Computer Aided Process Planning for Non-Axisymmetric Deep Drawing Products

    Science.gov (United States)

    Park, Dong Hwan; Yarlagadda, Prasad K. D. V.

    2004-06-01

    In general, deep drawing products have various cross-section shapes, such as cylindrical, rectangular, and non-axisymmetric shapes. No application of surface area calculation to the non-axisymmetric deep drawing process has been published yet. In this research, a surface area calculation for non-axisymmetric deep drawing products with elliptical shape was constructed for the design of the blank shape of deep drawing products, using an AutoLISP function of the AutoCAD software. A computer-aided process planning (CAPP) system for rotationally symmetric deep drawing products has been developed previously; however, its application to non-axisymmetric components has not been reported. Thus, a CAPP system for non-axisymmetric deep drawing products with elliptical shape was constructed using process sequence design. The system developed in this work consists of four modules. The first is a shape recognition module to recognize non-axisymmetric products. The second is a three-dimensional (3-D) modeling module to calculate the surface area of non-axisymmetric products. The third is a blank design module that creates an oval-shaped blank with an identical surface area. The fourth is a process planning module based on production rules, which play the most important role in an expert system for manufacturing. The production rules are generated and upgraded by interviewing field engineers. In particular, the drawing coefficient and the punch and die radii for elliptical-shape products are considered the main design parameters. The suitability of this system was verified by applying it to a real deep drawing product. The CAPP system constructed here should be very useful for reducing manufacturing lead time and improving product accuracy.
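
    As a toy illustration of the blank design step, the sketch below sizes an elliptical blank whose area equals the product's computed surface area, for a chosen blank aspect ratio; the ellipse-area formula A = πab is standard, while the surface-area value itself would come from the 3-D modeling module described above.

        # Size an elliptical blank with area equal to the product surface area.
        import math

        def elliptical_blank(surface_area, aspect):
            # Solve pi * a * b = surface_area with b = aspect * a.
            a = math.sqrt(surface_area / (math.pi * aspect))
            return a, aspect * a          # semi-major and semi-minor axes

        a, b = elliptical_blank(surface_area=5000.0, aspect=0.7)  # mm^2 -> mm
        print(f"blank semi-axes: a = {a:.1f} mm, b = {b:.1f} mm")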

  20. ClustalXeed: a GUI-based grid computation version for high performance and terabyte size multiple sequence alignment

    Directory of Open Access Journals (Sweden)

    Kim Taeho

    2010-09-01

    Background: There is an increasing demand to assemble and align large-scale biological sequence data sets. The commonly used multiple sequence alignment programs are still limited in their ability to handle very large amounts of sequences because they lack a scalable high-performance computing (HPC) environment with greatly extended data storage capacity. Results: We designed ClustalXeed, a software system for multiple sequence alignment with incremental improvements over previous versions of the ClustalX and ClustalW-MPI software. The primary advantage of ClustalXeed over other multiple sequence alignment software is its ability to align a large family of protein or nucleic acid sequences. To solve the conventional memory-dependency problem, ClustalXeed uses both physical random access memory (RAM) and a distributed file-allocation system for distance matrix construction and pair-alignment computation. The computation efficiency of the disk-storage system was markedly improved by implementing an efficient load-balancing algorithm, called the "idle node-seeking task algorithm" (INSTA). The new editing option and the graphical user interface (GUI) provide ready access to a parallel-computing environment for users who seek fast and easy alignment of large DNA and protein sequence sets. Conclusions: ClustalXeed can now compute large volumes of biological sequence data that were not tractable in any other parallel or single MSA program. The main developments are: (1) the ability to tackle larger sequence alignment problems than possible with previous systems, through markedly improved storage-handling capabilities; (2) an efficient task load-balancing algorithm, INSTA, which improves overall processing times for multiple sequence alignment with input sequences of non-uniform length; and (3) support for both single-PC and distributed cluster systems.

  1. Stochastic nonlinear time series forecasting using time-delay reservoir computers: performance and universality.

    Science.gov (United States)

    Grigoryeva, Lyudmila; Henriques, Julie; Larger, Laurent; Ortega, Juan-Pablo

    2014-07-01

    Reservoir computing is a recently introduced machine learning paradigm that has already shown excellent performance in the processing of empirical data. We study a particular kind of reservoir computers called time-delay reservoirs, which are constructed out of the sampling of the solution of a time-delay differential equation, and show their good performance in forecasting the conditional covariances associated with multivariate discrete-time nonlinear stochastic processes of VEC-GARCH type, as well as in predicting factual daily market realized volatilities computed with intraday quotes, using daily log-return series of moderate size as training input. We tackle some problems associated with the lack of task-universality for individually operating reservoirs and propose a solution based on the use of parallel arrays of time-delay reservoirs. Copyright © 2014 Elsevier Ltd. All rights reserved.
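
    The abstract does not reproduce the reservoir equations; as a rough illustration of the general paradigm, here is a minimal echo-state-style reservoir (a fixed random recurrent network with a ridge-regression readout) trained for one-step-ahead forecasting, rather than the time-delay differential-equation reservoir the paper actually studies:

        # Minimal echo-state reservoir: fixed random dynamics, trained readout.
        import numpy as np

        rng = np.random.default_rng(1)
        n_res, T = 200, 1000
        u = np.sin(0.1 * np.arange(T + 1)) + 0.1 * rng.normal(size=T + 1)

        W_in = rng.uniform(-0.5, 0.5, n_res)
        W = rng.normal(size=(n_res, n_res))
        W *= 0.9 / max(abs(np.linalg.eigvals(W)))   # spectral radius below 1

        x, states = np.zeros(n_res), np.empty((T, n_res))
        for t in range(T):                          # drive the reservoir
            x = np.tanh(W @ x + W_in * u[t])
            states[t] = x

        washout, lam = 100, 1e-6
        S, y = states[washout:], u[washout + 1 : T + 1]     # 1-step-ahead target
        W_out = np.linalg.solve(S.T @ S + lam * np.eye(n_res), S.T @ y)
        print("train MSE:", np.mean((S @ W_out - y) ** 2))  # ridge readout fit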

  2. Dependence of Brownian and Néel relaxation times on magnetic field strength

    International Nuclear Information System (INIS)

    Deissler, Robert J.; Wu, Yong; Martens, Michael A.

    2014-01-01

    Purpose: In magnetic particle imaging (MPI) and magnetic particle spectroscopy (MPS) the relaxation time of the magnetization in response to externally applied magnetic fields is determined by the Brownian and Néel relaxation mechanisms. Here the authors investigate the dependence of the relaxation times on the magnetic field strength and the implications for MPI and MPS. Methods: The Fokker–Planck equation with Brownian relaxation and the Fokker–Planck equation with Néel relaxation are solved numerically for a time-varying externally applied magnetic field, including a step-function, a sinusoidally varying, and a linearly ramped magnetic field. For magnetic fields that are applied as a step function, an eigenvalue approach is used to directly calculate both the Brownian and Néel relaxation times for a range of magnetic field strengths. For Néel relaxation, the eigenvalue calculations are compared to Brown's high-barrier approximation formula. Results: The relaxation times due to the Brownian or Néel mechanisms depend on the magnitude of the applied magnetic field. In particular, the Néel relaxation time is sensitive to the magnetic field strength, and varies by many orders of magnitude for nanoparticle properties and magnetic field strengths relevant for MPI and MPS. Therefore, the well-known zero-field relaxation times underestimate the actual relaxation times and, in particular, can underestimate the Néel relaxation time by many orders of magnitude. When only Néel relaxation is present—if the particles are embedded in a solid for instance—the authors found that there can be a strong magnetization response to a sinusoidal driving field, even if the period is much less than the zero-field relaxation time. For a ferrofluid in which both Brownian and Néel relaxation are present, only one relaxation mechanism may dominate depending on the magnetic field strength, the driving frequency (or ramp time), and the phase of the magnetization relative to the
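
    For reference, the zero-field relaxation times that the abstract says can badly underestimate the field-dependent values are conventionally written as (standard textbook expressions, not taken from this paper):

        \tau_B = \frac{3\eta V_h}{k_B T}, \qquad
        \tau_N = \tau_0 \exp\!\left(\frac{K V_c}{k_B T}\right)

    where η is the carrier-fluid viscosity, V_h the hydrodynamic volume, K the anisotropy constant, V_c the magnetic core volume, and τ_0 an attempt time of order 10⁻⁹-10⁻¹⁰ s; the exponential makes τ_N extremely sensitive to particle size, consistent with the many orders of magnitude quoted above.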

  3. Non-infectious complications of continuous ambulatory peritoneal dialysis: evaluation with peritoneal computed tomography

    International Nuclear Information System (INIS)

    Camsari, T.; Celik, A.; Ozaksoy, D.; Salman, S.; Cavdar, C.; Sifil, A.

    1998-01-01

    The purpose of the study was to evaluate the non-infectious complications of continuous ambulatory peritoneal dialysis (CAPD) using peritoneal computed tomography (PCT). Twenty symptomatic patients were included in the study. Initially, 2000 ml of dialysate fluid was infused into the peritoneal cavity, and standard peritoneal computed tomography (SPCT) serial scans of 10 mm thickness were performed from the mid-thoracic region to the genital organs. Afterwards, 100 ml of non-ionic contrast material containing 300 mg/ml iodine was injected through the catheter and distributed homogeneously in the intra-abdominal dialysate fluid by changing the positions of the patients; after waiting 2-4 h, the CT scan was repeated as peritoneal contrast computed tomography (PCCT). In the 20 patients, both SPCT and PCCT revealed pathological findings in 90% (n = 18), but PCCT showed additional pathological findings in 60% (n = 12). We believe that PCT is beneficial for the evaluation of non-infectious complications of CAPD, and that PCCT is superior to SPCT in evaluating non-infectious complications encountered in patients on CAPD treatment. (author)

  4. Computation of a long-time evolution in a Schroedinger system

    International Nuclear Information System (INIS)

    Girard, R.; Kroeger, H.; Labelle, P.; Bajzer, Z.

    1988-01-01

    We compare different techniques for the computation of a long-time evolution and the S matrix in a Schroedinger system. As an application we consider a two-nucleon system interacting via the Yamaguchi potential. We suggest computation of the time evolution for a very short time using Pade approximants, the long-time evolution being obtained by iterative squaring. Within the technique of strong approximation of Moller wave operators (SAM) we compare our calculation with computation of the time evolution in the eigenrepresentation of the Hamiltonian and with the standard Lippmann-Schwinger solution for the S matrix. We find numerical agreement between these alternative methods for time-evolution computation up to half the number of digits of internal machine precision, and fairly rapid convergence of both techniques towards the Lippmann-Schwinger solution
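
    The suggested scheme is easy to prototype: approximate the short-time propagator with a Padé-based matrix exponential, then obtain long times by repeated squaring, U(2ⁿ·dt) = U(dt)^(2ⁿ). Below is a toy sketch with a random Hermitian matrix standing in for the Yamaguchi-potential Hamiltonian (scipy's expm is itself Padé-based):

        # Short-time propagator via Pade approximation, long time by squaring.
        import numpy as np
        from scipy.linalg import expm

        rng = np.random.default_rng(2)
        A = rng.normal(size=(6, 6)) + 1j * rng.normal(size=(6, 6))
        H = (A + A.conj().T) / 2           # toy Hermitian "Hamiltonian"

        dt, n = 1e-3, 12                   # evolve to t = dt * 2**12
        U = expm(-1j * H * dt)             # Pade-based matrix exponential
        for _ in range(n):
            U = U @ U                      # iterative squaring

        U_direct = expm(-1j * H * dt * 2**n)
        print("max deviation:", np.abs(U - U_direct).max())  # ~ machine precision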

  5. A comparison of observed extreme water levels at the German Bight elaborated through an extreme value analysis (EVA) with extremes derived from a regionally coupled ocean-atmospheric climate model (MPI-OM)

    Science.gov (United States)

    Möller, Jens; Heinrich, Hartmut

    2017-04-01

    As a consequence of climate change, atmospheric and oceanographic extremes and their potential impacts on coastal regions are of growing concern for the governmental authorities responsible for transportation infrastructure. The highest risks for shipping as well as for rail and road traffic originate from combined effects of extreme storm surges and heavy rainfall, which sometimes lead to insufficient dewatering of inland waterways. The German Ministry of Transport and Digital Infrastructure has therefore tasked its Network of Experts to investigate the possible evolution of extreme threats for low-lying land and especially for the Kiel Canal, an important shortcut for shipping between the North and Baltic Seas. In this study we present results of a comparison of an extreme value analysis (EVA) carried out on gauge observations with values derived from a coupled regional ocean-atmosphere climate model (MPI-OM). High water levels at the coasts of the North and Baltic Seas are among the most important hazards: they increase the risk of flooding of low-lying land and prevent such areas from dewatering adequately. In this study, changes in the intensity (magnitude of the extremes) and duration of extreme water levels (above a selected threshold) are investigated for several gauge stations, with data partly reaching back to 1843. Two methods are used for the extreme value statistics: (1) a stationary generalized Pareto distribution (GPD) model, and (2) a non-stationary statistical model that better reproduces the impact of climate change. Most gauge stations show an increase of the mean water level of about 1-2 mm/year, with a stronger increase of the highest water levels and a decrease (or weaker increase) of the lowest water levels. Also, the duration of possible dewatering time intervals for the Kiel Canal was analysed. The results for the historical gauge station observations are compared to the statistics of modelled water levels from the coupled
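
    A minimal sketch of the stationary step (1), fitting a generalized Pareto distribution to threshold exceedances of a synthetic daily water-level series and reading off a return level; the study's non-stationary model is beyond this sketch:

        # Peaks-over-threshold with a generalized Pareto distribution (GPD).
        import numpy as np
        from scipy.stats import genpareto

        rng = np.random.default_rng(3)
        levels = rng.gumbel(loc=500, scale=30, size=50 * 365)  # daily levels, cm

        threshold = np.quantile(levels, 0.98)
        exceed = levels[levels > threshold] - threshold
        shape, _, scale = genpareto.fit(exceed, floc=0)        # location fixed at 0

        rate = len(exceed) / len(levels)       # daily exceedance probability
        p = 1.0 / (100 * 365)                  # "once per 100 years" per day
        ret = threshold + genpareto.ppf(1 - p / rate, shape, loc=0, scale=scale)
        print(f"threshold {threshold:.0f} cm, 100-year level ~ {ret:.0f} cm")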

  6. Outcomes and challenges of global high-resolution non-hydrostatic atmospheric simulations using the K computer

    Science.gov (United States)

    Satoh, Masaki; Tomita, Hirofumi; Yashiro, Hisashi; Kajikawa, Yoshiyuki; Miyamoto, Yoshiaki; Yamaura, Tsuyoshi; Miyakawa, Tomoki; Nakano, Masuo; Kodama, Chihiro; Noda, Akira T.; Nasuno, Tomoe; Yamada, Yohei; Fukutomi, Yoshiki

    2017-12-01

    This article reviews the major outcomes of a 5-year (2011-2016) project using the K computer to perform global numerical atmospheric simulations based on the non-hydrostatic icosahedral atmospheric model (NICAM). The K computer was made available to the public in September 2012 and was used as a primary resource for Japan's Strategic Programs for Innovative Research (SPIRE), an initiative to investigate five strategic research areas; the NICAM project fell under the research area of climate and weather simulation sciences. Combining NICAM with high-performance computing has created new opportunities in three areas of research: (1) higher-resolution global simulations that produce more realistic representations of convective systems, (2) multi-member ensemble simulations that are able to perform extended-range forecasts 10-30 days in advance, and (3) multi-decadal simulations for climatology and variability. Before the K computer era, NICAM was used to demonstrate realistic simulations of intra-seasonal oscillations including the Madden-Julian oscillation (MJO), merely as a case-study approach. Thanks to the big leap in the computational performance of the K computer, we could greatly increase the number of MJO events covered by numerical simulations, in addition to extending the integration time and horizontal resolution. We conclude that the high-resolution global non-hydrostatic model, as used in this five-year project, improves the ability to forecast intra-seasonal oscillations and associated tropical cyclogenesis compared with that of the relatively coarser operational models currently in use. The impacts of the sub-kilometer resolution simulation and the multi-decadal simulations using NICAM are also reviewed.

  7. Imprecise results: Utilizing partial computations in real-time systems

    Science.gov (United States)

    Lin, Kwei-Jay; Natarajan, Swaminathan; Liu, Jane W.-S.

    1987-01-01

    In real-time systems, a computation may not have time to complete its execution because of deadline requirements. In such cases, no result except the approximate results produced by the computation up to that point will be available. It is desirable to utilize these imprecise results if possible. Two approaches are proposed to enable computations to return imprecise results when execution cannot be completed normally. The milestone approach records results periodically, and if a deadline is reached, returns the last recorded result. The sieve approach demarcates sections of code which can be skipped if the available time is insufficient. By using these approaches, the system is able to produce imprecise results when deadlines are reached. The design of the Concord project, which supports imprecise computations using these techniques, is described. Also presented is a general model of imprecise computations using these techniques, as well as one which takes into account the influence of the environment, showing where the latter approach fits into this model.
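
    A toy sketch of the milestone approach follows: the computation periodically records its best result so far, and when the deadline arrives the last milestone is returned as the imprecise result. Names and structure are illustrative only, not taken from the Concord design.

        # Milestone technique: record partial results; on deadline, return the
        # last milestone instead of failing. Leibniz series for pi as the task.
        import time

        def estimate_pi(deadline, milestone_every=10_000):
            total, sign, milestone, k = 0.0, 1.0, None, 0
            while True:
                total += sign / (2 * k + 1)
                sign, k = -sign, k + 1
                if k % milestone_every == 0:
                    milestone = 4 * total             # record a milestone
                    if time.monotonic() >= deadline:  # out of time
                        return milestone, k           # imprecise but usable

        result, terms = estimate_pi(deadline=time.monotonic() + 0.05)
        print(f"pi ~ {result:.6f} after {terms} terms (deadline reached)")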

  8. Computing wave functions in multichannel collisions with non-local potentials using the R-matrix method

    Science.gov (United States)

    Bonitati, Joey; Slimmer, Ben; Li, Weichuan; Potel, Gregory; Nunes, Filomena

    2017-09-01

    The calculable form of the R-matrix method has been previously shown to be a useful tool in approximately solving the Schrodinger equation in nuclear scattering problems. We use this technique combined with the Gauss quadrature for the Lagrange-mesh method to efficiently solve for the wave functions of projectile nuclei in low energy collisions (1-100 MeV) involving an arbitrary number of channels. We include the local Woods-Saxon potential, the non-local potential of Perey and Buck, a Coulomb potential, and a coupling potential to computationally solve for the wave function of two nuclei at short distances. Object oriented programming is used to increase modularity, and parallel programming techniques are introduced to reduce computation time. We conclude that the R-matrix method is an effective method to predict the wave functions of nuclei in scattering problems involving both multiple channels and non-local potentials. Michigan State University iCER ACRES REU.

  9. Time-frequency analysis of non-stationary fusion plasma signals using an improved Hilbert-Huang transform

    International Nuclear Information System (INIS)

    Liu, Yangqing; Tan, Yi; Xie, Huiqiao; Wang, Wenhao; Gao, Zhe

    2014-01-01

    An improved Hilbert-Huang transform method is developed for the time-frequency analysis of non-stationary signals in tokamak plasmas. Maximal overlap discrete wavelet packet transform, rather than wavelet packet transform, is proposed as a preprocessor to decompose a signal into various narrow-band components. Then, a correlation-coefficient-based selection method is utilized to eliminate the irrelevant intrinsic mode functions obtained from empirical mode decomposition of those narrow-band components. Subsequently, a time-varying vector autoregressive moving average model, instead of Hilbert spectral analysis, is used to compute the Hilbert spectrum, i.e., a three-dimensional time-frequency distribution of the signal. The feasibility and effectiveness of the improved Hilbert-Huang transform method is demonstrated by analyzing a non-stationary simulated signal and actual experimental signals in fusion plasmas.
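
    The conventional Hilbert spectral step that the authors replace with a time-varying VARMA model computes instantaneous amplitude and frequency from the analytic signal of each narrow-band component; a minimal version of that conventional step looks like this:

        # Instantaneous amplitude/frequency of one narrow-band component via
        # the analytic signal (the step the paper's VARMA model replaces).
        import numpy as np
        from scipy.signal import hilbert

        fs = 1000.0                                      # sampling rate, Hz
        t = np.arange(0, 1, 1 / fs)
        imf = np.cos(2 * np.pi * (50 * t + 20 * t**2))   # chirp: 50 -> 90 Hz

        z = hilbert(imf)                                 # analytic signal
        amplitude = np.abs(z)
        phase = np.unwrap(np.angle(z))
        inst_freq = np.gradient(phase, t) / (2 * np.pi)
        print("mean instantaneous frequency:", inst_freq[50:-50].mean())  # ~70 Hz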

  10. Real-Time Thevenin Impedance Computation

    DEFF Research Database (Denmark)

    Sommer, Stefan Horst; Jóhannsson, Hjörtur

    2013-01-01

    operating state, and strict time constraints are difficult to adhere to as the complexity of the grid increases. Several suggested approaches for real-time stability assessment require Thevenin impedances to be determined for the observed system conditions. By combining matrix factorization, graph reduction, and parallelization, we develop an algorithm for computing Thevenin impedances an order of magnitude faster than previous approaches. We test the factor-and-solve algorithm with data from several power grids of varying complexity, and we show how the algorithm allows real-time stability assessment of complex power...
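
    The record is truncated, but the factor-and-solve idea can be illustrated: factor the sparse bus admittance matrix Y once, then each bus's Thevenin impedance is the diagonal entry Z_ii of Y⁻¹, obtained with one cheap solve per bus. A toy version follows (the paper's algorithm adds graph reduction and parallelization, and real grids use complex admittances):

        # Thevenin impedance per bus from one sparse LU factorization.
        import numpy as np
        from scipy.sparse import csc_matrix
        from scipy.sparse.linalg import splu

        Y = csc_matrix(np.array([[ 3.0, -1.0, -1.0,  0.0],    # toy 4-bus grid
                                 [-1.0,  3.0,  0.0, -1.0],
                                 [-1.0,  0.0,  3.0, -1.0],
                                 [ 0.0, -1.0, -1.0,  3.0]]))

        lu = splu(Y)                       # factor once, reuse for every bus
        n = Y.shape[0]
        for i in range(n):
            e = np.zeros(n); e[i] = 1.0
            z_col = lu.solve(e)            # column i of Y^-1
            print(f"bus {i}: Z_th = {z_col[i]:.4f}")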

  11. Thoracoscopic anatomical lung segmentectomy using 3D computed tomography simulation without tumour markings for non-palpable and non-visualized small lung nodules.

    Science.gov (United States)

    Kato, Hirohisa; Oizumi, Hiroyuki; Suzuki, Jun; Hamada, Akira; Watarai, Hikaru; Sadahiro, Mitsuaki

    2017-09-01

    Although wedge resection can be curative for small lung tumours, tumour marking is sometimes required to identify non-palpable or visually undetectable lung nodules for resection. Tumour marking sometimes fails and occasionally causes serious complications. We have performed many thoracoscopic segmentectomies using 3D computed tomography simulation for undetectable small lung tumours without any tumour marking. The aim of this study was to investigate whether thoracoscopic segmentectomy planned with 3D computed tomography simulation could precisely remove non-palpable and visually undetectable tumours. Between January 2012 and March 2016, 58 patients underwent thoracoscopic segmentectomy using 3D computed tomography simulation for non-palpable, visually undetectable tumours. Surgical outcomes were evaluated. A total of 35, 14 and 9 patients underwent segmentectomy, subsegmentectomy and segmentectomy combined with adjacent subsegmentectomy, respectively. All tumours were correctly resected without tumour marking. The median tumour size and distance from the visceral pleura were 14 ± 5.2 mm (range 5-27 mm) and 11.6 mm (range 1-38.8 mm), respectively. Median values related to the procedures were: operative time, 176 min (range 83-370 min); blood loss, 43 ml (range 0-419 ml); duration of chest tube placement, 1 day (range 1-8 days); and postoperative hospital stay, 5 days (range 3-12 days). Two cases were converted to open thoracotomy because of bleeding. Three cases required pleurodesis for pleural fistula. No recurrences occurred during the mean follow-up period of 44.4 months (range 5-53 months). Thoracoscopic segmentectomy using 3D computed tomography simulation was feasible and could be performed to resect undetectable tumours without tumour marking. © The Author 2017. Published by Oxford University Press on behalf of the European Association for Cardio-Thoracic Surgery. All rights reserved.

  12. Harvey Mudd 2014-2015 Computer Science Conduit Clinic Final Report

    Energy Technology Data Exchange (ETDEWEB)

    Aspesi, G [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Bai, J [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Deese, R [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Shin, L [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)

    2015-05-12

    Conduit, a new open-source library developed at Lawrence Livermore National Laboratory, provides a C++ application programming interface (API) to describe and access scientific data. Conduit's primary use is for in-memory data exchange in high performance computing (HPC) applications. Our team tested and improved Conduit to make it more appealing to potential adopters in the HPC community. We extended Conduit's capabilities by prototyping four libraries: one for parallel communication using MPI, one for I/O functionality, one for aggregating performance data, and one for data visualization.

  13. Computational electromagnetics—retrospective and outlook in honor of Wolfgang J.R. Hoefer

    CERN Document Server

    Chen, Zhizhang

    2015-01-01

    The book covers the past, present and future developments of field theory and computational electromagnetics. The first two chapters give an overview of the historical developments and present the state of the art in computational electromagnetics. These two chapters set the stage for discussing recent progress, new developments, challenges, trends, and major directions in computational electromagnetics, with three main emphases: (a) modeling of ever larger structures with multi-scale dimensions and multi-level descriptions (behavioral, circuit, network and field levels) and transient behaviours; (b) inclusion of physical effects other than electromagnetic: quantum effects, thermal effects, mechanical effects and nanoscale features; (c) new developments in available computer hardware, programming paradigms (MPI, OpenMP, CUDA, and OpenCL) and the associated new modeling approaches. These are the current emerging topics in the area of computational electromagnetics and may provide reader...

  14. Assessment of prognostic value of semiquantitative parameters on gated single photon emission computed tomography myocardial perfusion scintigraphy in a large middle eastern population

    International Nuclear Information System (INIS)

    Chavoshi, Maryam; Fard-Esfahani, Armaghan; Fallahi, Babak; Emami-Ardekani, Alireza; Beiki, Davood; Hassanzadeh-Rad, Arman; Eftekhari, Mohammad

    2005-01-01

    Coronary artery disease is the leading cause of mortality worldwide. The goal of this study was to determine the prognostic value of semiquantitative parameters of electrocardiogram-gated single photon emission computed tomography (SPECT) myocardial perfusion imaging (MPI) in a large Middle Eastern (Iranian) population. This prospective study included all patients referred to our center for a myocardial perfusion scan. The patients were followed annually for up to 24 months and their survival information was collected. Of 1148 patients, 473 (41.2%) men and 675 (58.8%) women, 40.6% had normal MPI, 13.3% near-normal and 46.1% abnormal MPI. After follow-up of 929 patients, 97.4% of patients were alive, and 2.6% succumbed to cardiac death. Abnormal ejection fraction was related to cardiac events (P = 0.001), but neither transient ischemic dilation (TID) (P = 0.09) nor lung/heart ratio (P = 0.92) showed such a relationship. The association between summed difference score (SDS) and soft cardiac events (P < 0.001) was significant. Summed motion score (SMS) and summed thickening score (STS) showed a significant relation with hard cardiac events, including myocardial infarction and cardiac death (P < 0.001 and P = 0.001, respectively). Overall, the risk of all cardiac events was significantly higher in the abnormal MPI group than in the normal group (P < 0.001, 0.02, and 0.025, respectively). No significant relationship was found between TID and total cardiac events (P = 0.478). Semiquantitative variables derived from gated SPECT MPI have independent prognostic value. The rate of total cardiac events is higher in patients with higher summed stress score and SDS. Total and hard cardiac events are higher at higher scores of the functional parameters (SMS and STS). Total cardiac events are higher in patients with lower left ventricular ejection fraction.

  15. Spying on real-time computers to improve performance

    International Nuclear Information System (INIS)

    Taff, L.M.

    1975-01-01

    The sampled program-counter histogram, an established technique for shortening the execution times of programs, is described for a real-time computer. The use of a real-time clock allows particularly easy implementation. (Auth.)

  16. Performance Evaluation of Computation and Communication Kernels of the Fast Multipole Method on Intel Manycore Architecture

    KAUST Repository

    AbdulJabbar, Mustafa Abdulmajeed; Al Farhan, Mohammed; Yokota, Rio; Keyes, David E.

    2017-01-01

    Manycore optimizations are essential for achieving performance worthy of anticipated exascale systems. Utilization of manycore chips is inevitable to attain the desired floating point performance of these energy-austere systems. In this work, we revisit ExaFMM, the open source Fast Multipole Method (FMM) library, in light of highly tuned shared-memory parallelization and detailed performance analysis on the new highly parallel Intel manycore architecture, Knights Landing (KNL). We assess scalability and performance gain using task-based parallelism of the FMM tree traversal. We also provide an in-depth analysis of the most computationally intensive part of the traversal kernel (i.e., the particle-to-particle (P2P) kernel), by comparing its performance across KNL and Broadwell architectures. We quantify different configurations that exploit the on-chip 512-bit vector units within different task-based threading paradigms. MPI communication-reducing and NUMA-aware approaches for the FMM’s global tree data exchange are examined with different cluster modes of KNL. By applying several algorithm- and architecture-aware optimizations for FMM, we show that the N-Body kernel on 256 threads of KNL achieves on average 2.8× speedup compared to the non-vectorized version, whereas on 56 threads of Broadwell, it achieves on average 2.9× speedup. In addition, the tree traversal kernel on KNL scales monotonically up to 256 threads with task-based programming models. The MPI-based communication-reducing algorithms show expected improvements of the data locality across the KNL on-chip network.

  17. Performance Evaluation of Computation and Communication Kernels of the Fast Multipole Method on Intel Manycore Architecture

    KAUST Repository

    AbdulJabbar, Mustafa Abdulmajeed

    2017-07-31

    Manycore optimizations are essential for achieving performance worthy of anticipated exascale systems. Utilization of manycore chips is inevitable to attain the desired floating point performance of these energy-austere systems. In this work, we revisit ExaFMM, the open source Fast Multipole Method (FMM) library, in light of highly tuned shared-memory parallelization and detailed performance analysis on the new highly parallel Intel manycore architecture, Knights Landing (KNL). We assess scalability and performance gain using task-based parallelism of the FMM tree traversal. We also provide an in-depth analysis of the most computationally intensive part of the traversal kernel (i.e., the particle-to-particle (P2P) kernel), by comparing its performance across KNL and Broadwell architectures. We quantify different configurations that exploit the on-chip 512-bit vector units within different task-based threading paradigms. MPI communication-reducing and NUMA-aware approaches for the FMM’s global tree data exchange are examined with different cluster modes of KNL. By applying several algorithm- and architecture-aware optimizations for FMM, we show that the N-Body kernel on 256 threads of KNL achieves on average 2.8× speedup compared to the non-vectorized version, whereas on 56 threads of Broadwell, it achieves on average 2.9× speedup. In addition, the tree traversal kernel on KNL scales monotonically up to 256 threads with task-based programming models. The MPI-based communication-reducing algorithms show expected improvements of the data locality across the KNL on-chip network.
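
    To make the vectorization point concrete, here is a toy comparison of a scalar loop against an array-vectorized P2P (direct particle-to-particle) kernel; the actual ExaFMM kernels use 512-bit SIMD units and task-based threading, which this numpy sketch does not attempt to reproduce.

        # Direct P2P potential: phi_i = sum_{j != i} m_j / |r_i - r_j|.
        import numpy as np

        rng = np.random.default_rng(4)
        n = 500
        pos, mass = rng.uniform(size=(n, 3)), rng.uniform(size=n)

        def p2p_loop(pos, mass):                     # scalar reference version
            phi = np.zeros(len(mass))
            for i in range(len(mass)):
                for j in range(len(mass)):
                    if i != j:
                        phi[i] += mass[j] / np.linalg.norm(pos[i] - pos[j])
            return phi

        def p2p_vec(pos, mass):                      # vectorized version
            d = pos[:, None, :] - pos[None, :, :]    # pairwise displacements
            r = np.sqrt((d * d).sum(-1))
            np.fill_diagonal(r, np.inf)              # exclude self-interaction
            return (mass[None, :] / r).sum(axis=1)

        assert np.allclose(p2p_loop(pos, mass), p2p_vec(pos, mass))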

  18. Development of a new technic for breast attenuation correction in myocardial perfusion scintigraphy using computational methods

    International Nuclear Information System (INIS)

    Oliveira, Anderson de

    2015-01-01

    Introduction: One of the limitations of nuclear medicine studies is false-positive results, which lead to unnecessary exams and procedures associated with morbidity and costs to the individual and society. One of the most frequent causes of reduced specificity of myocardial perfusion imaging (MPI) is photon attenuation, especially by the breasts in women. Objective: To develop a new technique to compensate for photon attenuation by women's breasts in myocardial perfusion imaging with 99mTc-sestamibi, using computational methods. Materials and methods: A procedure was proposed that integrates Monte Carlo simulation, computational methods and experimental techniques. Initially, chest attenuation correction percentages were obtained using a Jaszczak phantom, and breast attenuation percentages by the Monte Carlo simulation method, using the EGS4 program. The percentages of attenuation correction were linked to individual patients' characteristics by an artificial neural network and a multivariate analysis. A preliminary technical validation was done by comparing the results of MPI and catheterism (CAT), before and after applying the technique to 4 patients. The t-test was used for parametric data, and the Wilcoxon, Mann-Whitney and χ² tests for the others. Probability values less than 0.05 were considered statistically significant. Results: Each increment of 1 cm in breast thickness was associated with an average increment of 6% in photon attenuation, while the maximum increase related to breast composition was about 2%. The average chest attenuation percentage per unit was 2.9%. Both the artificial neural network and linear regression showed an error of less than 3% as predictive models for the percentage of breast attenuation. The anatomical-functional correlation between MPI and CAT was maintained after the use of the technique. Conclusion: The results suggest that the proposed technique is promising and could be a possible alternative to other conventional methods employed

  19. On the Super-Turing Computational Power of Non-Uniform Families of Neuromata

    Czech Academy of Sciences Publication Activity Database

    Wiedermann, Jiří

    2002-01-01

    Vol. 12, No. 5 (2002), pp. 509-516. ISSN 1210-0552. [SOFSEM 2002 Workshop on Soft Computing. Milovy, 28.11.2002-29.11.2002] R&D Projects: GA ČR GA201/00/1489. Institutional research plan: AV0Z1030915. Keywords: neuromata; Turing machines with advice; non-uniform computational complexity; super-Turing computational power. Subject RIV: BA - General Mathematics

  20. A real-time extension of density matrix embedding theory for non-equilibrium electron dynamics

    Science.gov (United States)

    Kretchmer, Joshua S.; Chan, Garnet Kin-Lic

    2018-02-01

    We introduce real-time density matrix embedding theory (DMET), a dynamical quantum embedding theory for computing non-equilibrium electron dynamics in strongly correlated systems. As in the previously developed static DMET, real-time DMET partitions the system into an impurity corresponding to the region of interest coupled to the surrounding environment, which is efficiently represented by a quantum bath of the same size as the impurity. In this work, we focus on a simplified single-impurity time-dependent formulation as a first step toward a multi-impurity theory. The equations of motion of the coupled impurity and bath embedding problem are derived using the time-dependent variational principle. The accuracy of real-time DMET is compared to that of time-dependent complete active space self-consistent field (TD-CASSCF) theory and time-dependent Hartree-Fock (TDHF) theory for a variety of quantum quenches in the single impurity Anderson model (SIAM), in which the Hamiltonian is suddenly changed (quenched) to induce a non-equilibrium state. Real-time DMET shows a marked improvement over the mean-field TDHF, converging to the exact answer even in the non-trivial Kondo regime of the SIAM. However, as expected from analogous behavior in static DMET, the constrained structure of the real-time DMET wavefunction leads to a slower convergence with respect to active space size, in the single-impurity formulation, relative to TD-CASSCF. Our initial results suggest that real-time DMET provides a promising framework to simulate non-equilibrium electron dynamics in which strong electron correlation plays an important role, and lays the groundwork for future multi-impurity formulations.

  1. Robust Forecasting of Non-Stationary Time Series

    NARCIS (Netherlands)

    Croux, C.; Fried, R.; Gijbels, I.; Mahieu, K.

    2010-01-01

    This paper proposes a robust forecasting method for non-stationary time series. The time series is modelled using non-parametric heteroscedastic regression, and fitted by a localized MM-estimator, combining high robustness and large efficiency. The proposed method is shown to produce reliable

  2. NDL-v2.0: A new version of the numerical differentiation library for parallel architectures

    Science.gov (United States)

    Hadjidoukas, P. E.; Angelikopoulos, P.; Voglis, C.; Papageorgiou, D. G.; Lagaris, I. E.

    2014-07-01

    Does the new version supersede the previous version?: Yes. Nature of problem: The numerical estimation of derivatives at several accuracy levels is a common requirement in many computational tasks, such as optimization, solution of nonlinear systems, and sensitivity analysis. For a large number of scientific and engineering applications, the underlying functions correspond to simulation codes for which analytical estimation of derivatives is difficult or almost impossible. A parallel implementation that exploits systems with multiple CPUs is very important for large-scale and computationally expensive problems. Solution method: Finite differencing is used with a carefully chosen step that minimizes the sum of the truncation and round-off errors. The parallel versions employ both the OpenMP and MPI libraries. Reasons for new version: The updated version was motivated by our endeavors to extend a parallel Bayesian uncertainty quantification framework [1] by incorporating higher-order derivative information, as in most state-of-the-art stochastic simulation methods such as Stochastic Newton MCMC [2] and Riemannian Manifold Hamiltonian MC [3]. The function evaluations are simulations with significant time-to-solution, which also varies with the input parameters, as in [1, 4]. The runtime of an N-body-type problem changes considerably with the introduction of a longer cut-off between the bodies. In the first version of the library, the OpenMP-parallel subroutines spawn a new team of threads and distribute the function evaluations with a PARALLEL DO directive. This limits the functionality of the library, as multiple concurrent calls require nested-parallelism support from the OpenMP environment. Therefore, either their function evaluations will be serialized or processor oversubscription is likely to occur due to the increased number of OpenMP threads. In addition, the Hessian calculations include two explicit parallel regions that compute first the diagonal and then the
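
    The step-selection rule at the heart of the library can be sketched for a first-order central difference: balancing the O(h²) truncation error against the O(ε/h) round-off error gives a near-optimal step h proportional to ε^(1/3). A minimal version follows (the library's actual step selection and parallel machinery are more elaborate):

        # Central difference with a step balancing truncation and round-off.
        import math

        EPS = 2.0 ** -52                  # double-precision machine epsilon

        def central_diff(f, x):
            h = EPS ** (1.0 / 3.0) * max(1.0, abs(x))   # near-optimal step
            return (f(x + h) - f(x - h)) / (2.0 * h)

        approx = central_diff(math.sin, 1.0)
        print(f"df/dx ~ {approx:.12f}, error = {abs(approx - math.cos(1.0)):.1e}")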

  3. The effect of time-dependent coupling on non-equilibrium steady states

    DEFF Research Database (Denmark)

    Cornean, Horia; Neidhardt, Hagen; Zagrebnov, Valentin

    Consider (for simplicity) two one-dimensional semi-infinite leads coupled to a quantum well via time-dependent point interactions. In the remote past the system is decoupled, and each of its components is at thermal equilibrium. In the remote future the system is fully coupled. We define and compute the non-equilibrium steady state (NESS) generated by this evolution. We show that when restricted to the subspace of absolute continuity of the fully coupled system, the state does not depend at all on the switching. Moreover, we show that the stationary charge current has the same invariant...

  4. The effect of time-dependent coupling on non-equilibrium steady states

    DEFF Research Database (Denmark)

    Cornean, Horia; Neidhardt, Hagen; Zagrebnov, Valentin A.

    2009-01-01

    Consider (for simplicity) two one-dimensional semi-infinite leads coupled to a quantum well via time-dependent point interactions. In the remote past the system is decoupled, and each of its components is at thermal equilibrium. In the remote future the system is fully coupled. We define and compute the non-equilibrium steady state (NESS) generated by this evolution. We show that when restricted to the subspace of absolute continuity of the fully coupled system, the state does not depend at all on the switching. Moreover, we show that the stationary charge current has the same invariant...

  5. MPI version of NJOY and its application to multigroup cross-section generation

    Energy Technology Data Exchange (ETDEWEB)

    Alpan, A.; Haghighat, A.

    1999-07-01

    Multigroup cross-section libraries are needed in performing neutronics calculations. These libraries are referred to as broad-group libraries. The number of energy groups and the group structure are highly dependent on the application and/or the user's objectives. For example, for shielding calculations, broad-group libraries such as SAILOR and BUGLE with 47-neutron and 20-gamma energy groups are used. The common procedure to obtain a broad-group library is a three-step process: (1) processing pointwise ENDF (PENDF) format cross sections; (2) generating fine-group cross sections; and (3) collapsing fine-group cross sections to broad-group. The NJOY code is used to prepare fine-group cross sections by processing pointwise ENDF data. The code has several modules, each one performing a specific task. For instance, the module RECONR performs linearization and reconstruction of the cross sections, and the module GROUPR generates multigroup self-shielded cross sections. After the fine-group, i.e., groupwise ENDF (GENDF), cross sections are produced, they are self-shielded, and a one-dimensional transport calculation is performed to obtain flux spectra at specific regions in the model. These fluxes are then used as weighting functions to collapse the fine-group cross sections into a broad-group cross-section library. The third step described is commonly performed by the AMPX code system. SMILER converts NJOY GENDF files to AMPX master libraries, AJAX collects the master libraries, BONAMI performs self-shielding calculations, NITAWL converts the AMPX master library to a working library, XSDRNPM performs one-dimensional transport calculations, and MALOCS collapses fine-group cross sections to broad-group. Finally, ALPO is used to generate ANISN format libraries. In this three-step procedure, NJOY generally requires the largest amount of CPU time. This time varies depending on the user's specified parameters for each module, such as reconstruction tolerances

  6. MPI version of NJOY and its application to multigroup cross-section generation

    International Nuclear Information System (INIS)

    Alpan, A.; Haghighat, A.

    1999-01-01

    Multigroup cross-section libraries are needed in performing neutronics calculations. These libraries are referred to as broad-group libraries. The number of energy groups and the group structure are highly dependent on the application and/or the user's objectives. For example, for shielding calculations, broad-group libraries such as SAILOR and BUGLE with 47-neutron and 20-gamma energy groups are used. The common procedure to obtain a broad-group library is a three-step process: (1) processing pointwise ENDF (PENDF) format cross sections; (2) generating fine-group cross sections; and (3) collapsing fine-group cross sections to broad-group. The NJOY code is used to prepare fine-group cross sections by processing pointwise ENDF data. The code has several modules, each one performing a specific task. For instance, the module RECONR performs linearization and reconstruction of the cross sections, and the module GROUPR generates multigroup self-shielded cross sections. After the fine-group, i.e., groupwise ENDF (GENDF), cross sections are produced, they are self-shielded, and a one-dimensional transport calculation is performed to obtain flux spectra at specific regions in the model. These fluxes are then used as weighting functions to collapse the fine-group cross sections into a broad-group cross-section library. The third step described is commonly performed by the AMPX code system. SMILER converts NJOY GENDF files to AMPX master libraries, AJAX collects the master libraries, BONAMI performs self-shielding calculations, NITAWL converts the AMPX master library to a working library, XSDRNPM performs one-dimensional transport calculations, and MALOCS collapses fine-group cross sections to broad-group. Finally, ALPO is used to generate ANISN format libraries. In this three-step procedure, NJOY generally requires the largest amount of CPU time. This time varies depending on the user's specified parameters for each module, such as reconstruction tolerances, temperatures

  7. A Distributed Computing Network for Real-Time Systems.

    Science.gov (United States)

    1980-11-03

    A Distributed Computing Network for Real-Time Systems. Naval Underwater Systems Center, Newport, RI, Report TD 5932, November 1980.

  8. Cardiac Time Intervals by Tissue Doppler Imaging M-Mode

    DEFF Research Database (Denmark)

    Biering-Sørensen, Tor; Mogelvang, Rasmus; de Knegt, Martina Chantal

    2016-01-01

    PURPOSE: To define normal values of the cardiac time intervals obtained by tissue Doppler imaging (TDI) M-mode through the mitral valve (MV). Furthermore, to evaluate the association of the myocardial performance index (MPI) obtained by TDI M-mode (MPITDI) and the conventional method of obtaining...

  9. Multiphoton ionization of (Xe)n and (NO)n clusters using a picosecond laser

    International Nuclear Information System (INIS)

    Smith, D.B.; Miller, J.C.

    1989-01-01

    Mass-resolved multiphoton ionization (MPI) spectroscopy is an established technique for detecting and analyzing van der Waals molecules and larger clusters. MPI spectroscopy provides excellent detection sensitivity, moderately high resolution, and selectivity among cluster species. In addition to information provided by the analysis of photoions following MPI, photoelectron spectroscopy can reveal details regarding the structure of ionic states. Unfortunately, the technique is limited by its tendency to produce extensive fragmentation. Fragmentation is also a problem with other ionization techniques (e.g., electron impact ionization), but the intense laser beams required for MPI cause additional dissociation channels to become available. These channels include absorption of additional photons by parent ions (ion ladder mechanism), absorption of additional photons by fragment ions (ladder switching mechanism), and resonances with dissociative states in the neutral manifold. The existence of these dissociation channels can preclude the use of MPI spectroscopy in many situations. Recently, MPI studies of stable molecules using picosecond lasers (pulse length = 1 - 10 ps) have indicated that limitations due to fragmentation might be subdued. With picosecond lasers, dissociation mechanisms can be altered and in some cases fragmentation can be eliminated or reduced. Additional photon absorption competes effectively with dissociation channels when a very short laser pulse or, perhaps more importantly, a sufficiently high peak-power is used. In the case where ionic absorption and fragmentation occurs, it has been shown that picosecond MPI might favor the ion ladder mechanism rather than the ladder switching mechanism

  10. Efficient Transfer Entropy Analysis of Non-Stationary Neural Time Series

    Science.gov (United States)

    Vicente, Raul; Díaz-Pernas, Francisco J.; Wibral, Michael

    2014-01-01

    Information theory allows us to investigate information processing in neural systems in terms of information transfer, storage and modification. Especially the measure of information transfer, transfer entropy, has seen a dramatic surge of interest in neuroscience. Estimating transfer entropy from two processes requires the observation of multiple realizations of these processes to estimate associated probability density functions. To obtain these necessary observations, available estimators typically assume stationarity of processes to allow pooling of observations over time. This assumption, however, is a major obstacle to the application of these estimators in neuroscience, as observed processes are often non-stationary. As a solution, Gomez-Herrero and colleagues theoretically showed that the stationarity assumption may be avoided by estimating transfer entropy from an ensemble of realizations. Such an ensemble of realizations is often readily available in neuroscience experiments in the form of experimental trials. Thus, in this work we combine the ensemble method with a recently proposed transfer entropy estimator to make transfer entropy estimation applicable to non-stationary time series. We present an efficient implementation of the approach that is suitable for the increased computational demand of the ensemble method's practical application. In particular, we use a massively parallel implementation for a graphics processing unit to handle the most computationally demanding aspects of the ensemble method for transfer entropy estimation. We test the performance and robustness of our implementation on data from numerical simulations of stochastic processes. We also demonstrate the applicability of the ensemble method to magnetoencephalographic data. While we mainly evaluate the proposed method for neuroscience data, we expect it to be applicable in a variety of fields that are concerned with the analysis of information transfer in complex biological, social, and
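
    A minimal sketch of the ensemble idea, under simplifying assumptions: transfer entropy X → Y at a single time point t is estimated by pooling one observation per trial, so no stationarity over time is required. This toy version uses plug-in estimation on coarsely discretized data; the implementation described in the abstract instead uses nearest-neighbour estimators on a GPU.

```python
import numpy as np
from collections import Counter

def transfer_entropy_ensemble(x, y, t, bins=4):
    """Plug-in TE(X -> Y) in bits at time t, pooled over trials.
    x, y: arrays of shape (n_trials, n_samples)."""
    xq = np.digitize(x, np.linspace(x.min(), x.max(), bins + 1)[1:-1])
    yq = np.digitize(y, np.linspace(y.min(), y.max(), bins + 1)[1:-1])
    obs = list(zip(yq[:, t + 1], yq[:, t], xq[:, t]))  # one sample per trial
    n = len(obs)
    c_yyx = Counter(obs)                               # (y_{t+1}, y_t, x_t)
    c_yx = Counter((yp, xp) for _, yp, xp in obs)      # (y_t, x_t)
    c_yy = Counter((yn, yp) for yn, yp, _ in obs)      # (y_{t+1}, y_t)
    c_y = Counter(yp for _, yp, _ in obs)              # y_t
    te = 0.0
    for (yn, yp, xp), k in c_yyx.items():
        te += (k / n) * np.log2(k * c_y[yp] / (c_yx[(yp, xp)] * c_yy[(yn, yp)]))
    return te

# Toy example: Y follows X with a one-step lag, estimated over 500 trials.
rng = np.random.default_rng(0)
x = rng.normal(size=(500, 20))
y = np.roll(x, 1, axis=1) + 0.5 * rng.normal(size=(500, 20))
print(transfer_entropy_ensemble(x, y, t=10))  # clearly positive
```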

  11. Robust Forecasting of Non-Stationary Time Series

    OpenAIRE

    Croux, C.; Fried, R.; Gijbels, I.; Mahieu, K.

    2010-01-01

    This paper proposes a robust forecasting method for non-stationary time series. The time series is modelled using non-parametric heteroscedastic regression, and fitted by a localized MM-estimator, combining high robustness and large efficiency. The proposed method is shown to produce reliable forecasts in the presence of outliers, non-linearity, and heteroscedasticity. In the absence of outliers, the forecasts are only slightly less precise than those based on a localized Least Squares estima...

  12. Error Correction for Non-Abelian Topological Quantum Computation

    Directory of Open Access Journals (Sweden)

    James R. Wootton

    2014-03-01

    Full Text Available The possibility of quantum computation using non-Abelian anyons has been considered for over a decade. However, the question of how to obtain and process information about what errors have occurred in order to negate their effects has not yet been considered. This is in stark contrast with quantum computation proposals for Abelian anyons, for which decoding algorithms have been tailor-made for many topological error-correcting codes and error models. Here, we address this issue by considering the properties of non-Abelian error correction in general. We also choose a specific anyon model and error model to probe the problem in more detail. The anyon model is the charge submodel of D(S_{3}). This shares many properties with important models such as the Fibonacci anyons, making our method more generally applicable. The error model is a straightforward generalization of those used in the case of Abelian anyons for initial benchmarking of error correction methods. It is found that error correction is possible under a threshold value of 7% for the total probability of an error on each physical spin. This is remarkably comparable with the thresholds for Abelian models.

  13. Real-time computing platform for spiking neurons (RT-spike).

    Science.gov (United States)

    Ros, Eduardo; Ortigosa, Eva M; Agís, Rodrigo; Carrillo, Richard; Arnold, Michael

    2006-07-01

    A computing platform is described for simulating arbitrary networks of spiking neurons in real time. A hybrid computing scheme is adopted that uses both software and hardware components to manage the tradeoff between flexibility and computational power; the neuron model is implemented in hardware and the network model and the learning are implemented in software. The incremental transition of the software components into hardware is supported. We focus on a spike response model (SRM) for a neuron where the synapses are modeled as input-driven conductances. The temporal dynamics of the synaptic integration process are modeled with a synaptic time constant that results in a gradual injection of charge. This type of model is computationally expensive and is not easily amenable to existing software-based event-driven approaches. As an alternative we have designed an efficient time-based computing architecture in hardware, where the different stages of the neuron model are processed in parallel. Further improvements occur by computing multiple neurons in parallel using multiple processing units. This design is tested using reconfigurable hardware and its scalability and performance evaluated. Our overall goal is to investigate biologically realistic models for the real-time control of robots operating within closed action-perception loops, and so we evaluate the performance of the system on simulating a model of the cerebellum where the emulation of the temporal dynamics of the synaptic integration process is important.
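
    A toy time-driven update of the kind the abstract argues is needed — a synaptic conductance that decays with its own time constant and injects charge gradually into a leaky membrane — is sketched below; all constants are illustrative assumptions, not values taken from the paper.

```python
# All constants below are illustrative, not the paper's values.
dt, t_end = 1e-4, 0.05                # 0.1 ms steps, 50 ms horizon (s)
tau_s = 2e-3                          # synaptic conductance time constant (s)
v_rest, v_thresh, v_reset = -70e-3, -54e-3, -70e-3   # volts
e_syn, w = 0.0, 40e-9                 # excitatory reversal (V), weight (S)
c_m, g_leak = 200e-12, 10e-9          # capacitance (F), leak conductance (S)

spike_times = [0.005, 0.012, 0.013]   # presynaptic input spikes (s)

v, g = v_rest, 0.0
for step in range(int(t_end / dt)):
    t = step * dt
    if any(abs(t - ts) < dt / 2 for ts in spike_times):
        g += w                        # an input spike steps the conductance up
    g -= dt * g / tau_s               # ...which then decays exponentially
    # leaky membrane driven by the synaptic current (forward Euler):
    v += dt * (g_leak * (v_rest - v) + g * (e_syn - v)) / c_m
    if v >= v_thresh:                 # threshold crossing emits a spike
        print(f"output spike at {t * 1e3:.1f} ms")
        v = v_reset
```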

  14. Differences on the Level of Social Skills between Freshman Computer Gamers and Non-Gamers

    Directory of Open Access Journals (Sweden)

    Joseph B. Campit

    2015-02-01

    Full Text Available Computer games play a large role in socialization, and the consequences of playing them have been a topic of debate. This observation led the researcher to conduct a study on the influence of computer games on the social skills of the BSIT first-year students of Pangasinan State University, Bayambang Campus, during school year 2012-2013. The study determined the profile of the 115 BSIT first-year students according to preferred computer games and frequency of playing, and investigated the level of social skills among gamers and non-gamers, using the descriptive-comparative method of research. It was found that Crossfire was the most preferred computer game, played at least once a week. Computer gamers had lower social skills than non-gamers: gamers showed more negative social behaviors, and playing computer games had a negative effect on the level of social skills among first-year students. There was a significant difference in the level of social skills of the students when grouped according to frequency of playing: students who played computer games every day had significantly lower social skills than those who played once a week. Thus, parents and teachers should give proper guidance in limiting the playing of computer games and in the choice of games. Teachers should organize seminars to raise awareness of the influence and negative effects of violent computer games on social skills, and students should choose educational over violent games to enhance their knowledge and social skills.

  15. 43 CFR 45.3 - How are time periods computed?

    Science.gov (United States)

    2010-10-01

    43 Public Lands: Interior (2010-10-01). How are time periods computed? 45.3... IN FERC HYDROPOWER LICENSES, General Provisions § 45.3 How are time periods computed? (a) General... run is not included. (2) The last day of the period is included. (i) If that day is a Saturday, Sunday...
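
    As an illustration of the computation rule quoted above (the first day excluded, the last day included, and — as the truncated text suggests — a deadline landing on a weekend rolled forward), a small sketch follows; holiday handling is omitted.

```python
from datetime import date, timedelta

def deadline(trigger: date, period_days: int) -> date:
    d = trigger + timedelta(days=period_days)  # day 0 excluded, last day included
    while d.weekday() >= 5:                    # 5 = Saturday, 6 = Sunday
        d += timedelta(days=1)                 # roll forward to a business day
    return d

print(deadline(date(2010, 10, 1), 30))  # 30 days from a Friday -> 2010-11-01
```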

  16. A Comparison of the Computation Times of Thermal Equilibrium and Non-equilibrium Models of Droplet Field in a Two-Fluid Three-Field Model

    Energy Technology Data Exchange (ETDEWEB)

    Park, Ik Kyu; Cho, Heong Kyu; Kim, Jong Tae; Yoon, Han Young; Jeong, Jae Jun

    2007-12-15

    A computational model for transient, three-dimensional two-phase flows was developed using an 'unstructured-FVM-based, non-staggered, semi-implicit numerical scheme' that treats the droplets as thermally non-equilibrium. The assumption of thermal equilibrium between the liquid and the droplets made in previous studies is no longer used; instead, three energy conservation equations are set up, for vapor, liquid, and liquid droplets. Thus, nine conservation equations for mass, momentum, and energy are established to simulate two-phase flows. In this report, the governing equations and a semi-implicit numerical scheme for transient one-dimensional two-phase flows are described, considering the thermal non-equilibrium between the liquid and the liquid droplets. A comparison with the previous model, which assumed thermal equilibrium between the liquid and the liquid droplets, is also reported.

  17. A non-discrete method for computation of residence time in fluid mechanics simulations.

    Science.gov (United States)

    Esmaily-Moghadam, Mahdi; Hsia, Tain-Yen; Marsden, Alison L

    2013-11-01

    Cardiovascular simulations provide a promising means to predict risk of thrombosis in grafts, devices, and surgical anatomies in adult and pediatric patients. Although the pathways for platelet activation and clot formation are not yet fully understood, recent findings suggest that thrombosis risk is increased in regions of flow recirculation and high residence time (RT). Current approaches for calculating RT are typically based on releasing a finite number of Lagrangian particles into the flow field and calculating RT by tracking their positions. However, special care must be taken to achieve temporal and spatial convergence, often requiring repeated simulations. In this work, we introduce a non-discrete method in which RT is calculated in an Eulerian framework using the advection-diffusion equation. We first present the formulation for calculating residence time in a given region of interest using two alternate definitions. The physical significance and sensitivity of the two measures of RT are discussed and their mathematical relation is established. An extension to a point-wise value is also presented. The methods presented here are then applied in a 2D cavity and two representative clinical scenarios, involving shunt placement for single ventricle heart defects and Kawasaki disease. In the second case study, we explored the relationship between RT and wall shear stress, a parameter of particular importance in cardiovascular disease.
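
    A minimal one-dimensional sketch of the Eulerian idea, under illustrative assumptions: residence time is treated as a scalar that is advected and diffused, with a unit source inside the region of interest, so high-RT zones emerge without tracking particles. The boundary conditions and source definition here are simplifications of the formulations discussed in the paper.

```python
import numpy as np

nx, L = 200, 1.0
dx = L / (nx - 1)
u, D = 1.0, 1e-3                         # advection velocity, diffusivity
dt = 0.4 * min(dx / u, dx**2 / (2 * D))  # stable explicit time step

tau = np.zeros(nx)                       # residence time field
source = np.zeros(nx)
source[80:120] = 1.0                     # unit source in the region of interest

for _ in range(20_000):                  # march to (near) steady state
    adv = -u * (tau[1:-1] - tau[:-2]) / dx                  # upwind, u > 0
    dif = D * (tau[2:] - 2 * tau[1:-1] + tau[:-2]) / dx**2
    tau[1:-1] += dt * (adv + dif + source[1:-1])
    tau[0] = 0.0                         # fresh fluid enters with zero RT
    tau[-1] = tau[-2]                    # simple outflow condition

print("peak residence time:", tau.max())
```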

  18. Assessment of global and gene-specific DNA methylation in rat liver and kidney in response to non-genotoxic carcinogen exposure

    Energy Technology Data Exchange (ETDEWEB)

    Ozden, Sibel, E-mail: stopuz@istanbul.edu.tr [Department of Pharmaceutical Toxicology, Faculty of Pharmacy, Istanbul University, Istanbul (Turkey); Turgut Kara, Neslihan [Department of Molecular Biology and Genetics, Faculty of Science, Istanbul University, Istanbul (Turkey); Sezerman, Osman Ugur [Department of Biostatistics and Medical Informatics, Acibadem University, Istanbul (Turkey); Durasi, İlknur Melis [Biological Sciences and Bioengineering, Faculty of Engineering and Natural Sciences, Sabancı University, Istanbul (Turkey); Chen, Tao [Department of Toxicology, School of Public Health, Soochow University, Suzhou (China); Demirel, Goksun; Alpertunga, Buket [Department of Pharmaceutical Toxicology, Faculty of Pharmacy, Istanbul University, Istanbul (Turkey); Chipman, J. Kevin [School of Biosciences, The University of Birmingham, Birmingham (United Kingdom); Mally, Angela [Department of Toxicology, University of Würzburg, Würzburg (Germany)

    2015-12-01

    Altered expression of tumor suppressor genes and oncogenes, which is regulated in part at the level of DNA methylation, is an important event involved in non-genotoxic carcinogenesis. This may serve as a marker for early detection of non-genotoxic carcinogens. Therefore, we evaluated the effects of the non-genotoxic hepatocarcinogens 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), hexachlorobenzene (HCB) and methapyrilene (MPY), and the male rat kidney carcinogens d-limonene, p-dichlorobenzene (DCB), chloroform and ochratoxin A (OTA), on global and CpG island promoter methylation in their respective target tissues in rats. No significant dose-related effects on global DNA hypomethylation were observed in tissues of rats compared to vehicle controls using LC–MS/MS in response to short-term non-genotoxic carcinogen exposure. Initial experiments investigating gene-specific methylation using methylation-specific PCR and bisulfite sequencing revealed partial methylation of p16 in the liver of rats treated with HCB and TCDD. However, no treatment-related effects on the methylation status of Cx32, e-cadherin, VHL, c-myc, Igfbp2, and p15 were observed. We therefore applied genome-wide DNA methylation analysis using methylated DNA immunoprecipitation combined with microarrays to identify alterations in gene-specific methylation. Under the conditions of our study, some genes were differentially methylated in response to MPY and TCDD, whereas d-limonene, DCB and chloroform did not induce any methylation changes. 90-day OTA treatment revealed enrichment of several categories of genes important in protein kinase activity and the mTOR cell signaling process, which are related to OTA nephrocarcinogenicity. - Highlights: • Studied non-genotoxic carcinogens caused no change on global DNA hypomethylation. • d-Limonene, DCB and chloroform did not show any genome-wide methylation changes. • Some genes were differentially methylated in response to MPY, TCDD and OTA. • Protein kinase activity

  19. Assessment of global and gene-specific DNA methylation in rat liver and kidney in response to non-genotoxic carcinogen exposure

    International Nuclear Information System (INIS)

    Ozden, Sibel; Turgut Kara, Neslihan; Sezerman, Osman Ugur; Durasi, İlknur Melis; Chen, Tao; Demirel, Goksun; Alpertunga, Buket; Chipman, J. Kevin; Mally, Angela

    2015-01-01

    Altered expression of tumor suppressor genes and oncogenes, which is regulated in part at the level of DNA methylation, is an important event involved in non-genotoxic carcinogenesis. This may serve as a marker for early detection of non-genotoxic carcinogens. Therefore, we evaluated the effects of the non-genotoxic hepatocarcinogens 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD), hexachlorobenzene (HCB) and methapyrilene (MPY), and the male rat kidney carcinogens d-limonene, p-dichlorobenzene (DCB), chloroform and ochratoxin A (OTA), on global and CpG island promoter methylation in their respective target tissues in rats. No significant dose-related effects on global DNA hypomethylation were observed in tissues of rats compared to vehicle controls using LC–MS/MS in response to short-term non-genotoxic carcinogen exposure. Initial experiments investigating gene-specific methylation using methylation-specific PCR and bisulfite sequencing revealed partial methylation of p16 in the liver of rats treated with HCB and TCDD. However, no treatment-related effects on the methylation status of Cx32, e-cadherin, VHL, c-myc, Igfbp2, and p15 were observed. We therefore applied genome-wide DNA methylation analysis using methylated DNA immunoprecipitation combined with microarrays to identify alterations in gene-specific methylation. Under the conditions of our study, some genes were differentially methylated in response to MPY and TCDD, whereas d-limonene, DCB and chloroform did not induce any methylation changes. 90-day OTA treatment revealed enrichment of several categories of genes important in protein kinase activity and the mTOR cell signaling process, which are related to OTA nephrocarcinogenicity. - Highlights: • Studied non-genotoxic carcinogens caused no change on global DNA hypomethylation. • d-Limonene, DCB and chloroform did not show any genome-wide methylation changes. • Some genes were differentially methylated in response to MPY, TCDD and OTA. • Protein kinase activity

  20. FAMOUS, faster: using parallel computing techniques to accelerate the FAMOUS/HadCM3 climate model with a focus on the radiative transfer algorithm

    Directory of Open Access Journals (Sweden)

    P. Hanappe

    2011-09-01

    Full Text Available We have optimised the atmospheric radiation algorithm of the FAMOUS climate model on several hardware platforms. The optimisation involved translating the Fortran code to C and restructuring the algorithm around the computation of a single air column. Instead of the existing MPI-based domain decomposition, we used a task queue and a thread pool to schedule the computation of individual columns on the available processors. Finally, four air columns are packed together in a single data structure and computed simultaneously using Single Instruction Multiple Data operations.

    The modified algorithm runs more than 50 times faster on the CELL's Synergistic Processing Element than on its main PowerPC processing element. On Intel-compatible processors, the new radiation code runs 4 times faster. On the tested graphics processor, using OpenCL, we find a speed-up of more than 2.5 times as compared to the original code on the main CPU. Because the radiation code takes more than 60 % of the total CPU time, FAMOUS executes more than twice as fast. Our version of the algorithm returns bit-wise identical results, which demonstrates the robustness of our approach. We estimate that this project required around two and a half man-years of work.
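
    The task-queue/thread-pool scheduling described above can be sketched as follows; compute_radiation is a hypothetical stand-in for the single-column radiative transfer kernel. Note that in pure Python the GIL limits true thread parallelism, whereas the original implementation used native threads in C.

```python
from concurrent.futures import ThreadPoolExecutor
import math

def compute_radiation(column_id):
    # Hypothetical stand-in for the single-column radiative transfer kernel.
    return column_id, sum(math.sin(i) for i in range(1000 + column_id % 500))

columns = range(10_000)                  # columns are mutually independent
with ThreadPoolExecutor(max_workers=8) as pool:
    # map() feeds the shared queue; idle workers pull the next column, which
    # balances load when columns take different amounts of time.
    results = dict(pool.map(compute_radiation, columns))
print(len(results), "columns computed")
```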

  1. Comparison of conventional and cadmium-zinc-telluride single-photon emission computed tomography for analysis of thallium-201 myocardial perfusion imaging: an exploratory study in normal databases for different ethnicities.

    Science.gov (United States)

    Ishihara, Masaru; Onoguchi, Masahisa; Taniguchi, Yasuyo; Shibutani, Takayuki

    2017-12-01

    The aim of this study was to clarify the differences in thallium-201-chloride (thallium-201) myocardial perfusion imaging (MPI) scans evaluated by conventional Anger-type single-photon emission computed tomography (conventional SPECT) versus cadmium-zinc-telluride SPECT (CZT SPECT) imaging in normal databases for different ethnic groups. MPI scans from 81 consecutive Japanese patients were examined using conventional SPECT and CZT SPECT and analyzed with the pre-installed quantitative perfusion SPECT (QPS) software. We compared the summed stress score (SSS), summed rest score (SRS), and summed difference score (SDS) for the two SPECT devices. For a normal MPI reference, we usually use the Japanese databases for MPI created by the Japanese Society of Nuclear Medicine, which can be used with conventional SPECT but not with CZT SPECT. In this study, we used new Japanese normal databases constructed in our institution to compare conventional and CZT SPECT. Compared with conventional SPECT, CZT SPECT showed lower SSS (p < 0.001), SRS (p = 0.001), and SDS (p = 0.189) using the pre-installed SPECT database. In contrast, CZT SPECT showed no significant difference from conventional SPECT in QPS analysis using the normal databases from our institution. Myocardial perfusion analyses by CZT SPECT should be evaluated using normal databases based on the ethnic group being evaluated.

  2. Relativistic Photoionization Computations with the Time Dependent Dirac Equation

    Science.gov (United States)

    2016-10-12

    [Scanned NRL report; OCR largely unrecoverable. Recoverable details: Naval Research Laboratory, Washington, DC 20375-5320; memorandum report NRL/MR/6795--16-9698: Relativistic Photoionization Computations with the Time Dependent Dirac Equation, by Daniel F. Gordon and Bahman Hafizi; keywords: tunneling, photoionization; the abstract concerns ionization of inner-shell electrons by laser fields.]

  3. Parallelism at Cern: real-time and off-line applications in the GP-MIMD2 project

    International Nuclear Information System (INIS)

    Calafiura, P.

    1997-01-01

    A wide range of general purpose high-energy physics applications, ranging from Monte Carlo simulation to data acquisition, from interactive data analysis to on-line filtering, have been ported, or developed, and run in parallel on the IBM SP-2 and Meiko CS-2 large multi-processor machines at CERN. The ESPRIT project GP-MIMD2 has been a catalyst for the interest in parallel computing at CERN. The project provided the 128-processor Meiko CS-2 system that is now successfully integrated in the CERN computing environment. The CERN experiment NA48 has been involved in the GP-MIMD2 project since the beginning. NA48 physicists run, as part of their day-to-day work, simulation and analysis programs parallelized using the Message Passing Interface (MPI). The CS-2 is also a vital component of the experiment's data acquisition system and will be used to calibrate, in real time, the 13000-channel liquid krypton calorimeter. (orig.)

  4. Marcus canonical integral for non-Gaussian processes and its computation: pathwise simulation and tau-leaping algorithm.

    Science.gov (United States)

    Li, Tiejun; Min, Bin; Wang, Zhiming

    2013-03-14

    The stochastic integral ensuring the Newton-Leibniz chain rule is essential in stochastic energetics. The Marcus canonical integral has this property and can be understood as the Wong-Zakai type smoothing limit when the driving process is non-Gaussian. However, this important concept seems not to be well known among physicists. In this paper, we discuss the Marcus integral for non-Gaussian processes and its computation in the context of stochastic energetics. We give a comprehensive introduction to the Marcus integral and compare three equivalent definitions in the literature. We introduce the exact pathwise simulation algorithm and give its error analysis. We show how to compute thermodynamic quantities based on the pathwise simulation algorithm, and highlight the information hidden in the Marcus mapping, which plays the key role in determining thermodynamic quantities. We further propose the tau-leaping algorithm, which advances the process with deterministic time steps when the tau-leaping condition is satisfied. Numerical experiments and efficiency analysis show that it is very promising.
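
    A pathwise sketch for a scalar Marcus SDE driven by a compound Poisson process, under illustrative coefficients: between jumps the drift is integrated as usual, and each jump of size dL propagates the state through the Marcus map, i.e. the ODE dphi/ds = g(phi) * dL on s in [0, 1]. This mirrors the chain-rule-preserving behaviour the abstract describes, but it is not the paper's algorithm verbatim.

```python
import numpy as np

rng = np.random.default_rng(1)
g = lambda x: 0.5 * x      # multiplicative noise coefficient; drift f(x) = -x

def marcus_map(x, dL, n_sub=50):
    # Propagate the state through a jump: dphi/ds = g(phi) * dL on s in [0, 1].
    # For g(x) = c*x the exact map is x * exp(c * dL); Euler substeps approximate it.
    ds = 1.0 / n_sub
    for _ in range(n_sub):
        x += ds * g(x) * dL
    return x

T, rate, jump_sd = 5.0, 2.0, 1.0
x, t = 1.0, 0.0
jump_times = np.cumsum(rng.exponential(1 / rate, size=50))
for tj in jump_times[jump_times < T]:
    x *= np.exp(-(tj - t))             # drift between jumps (exact for f = -x)
    x = marcus_map(x, rng.normal(0.0, jump_sd))
    t = tj
x *= np.exp(-(T - t))                  # drift from the last jump to T
print("X(T) =", x)
```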

  5. Development and benchmark verification of a parallelized Monte Carlo burnup calculation program MCBMPI

    International Nuclear Information System (INIS)

    Yang Wankui; Liu Yaoguang; Ma Jimin; Yang Xin; Wang Guanbo

    2014-01-01

    MCBMPI, a parallelized burnup calculation program, was developed. The program is modularized: the neutron transport calculation module employs the parallelized MCNP5 program MCNP5MPI, and the burnup calculation module employs ORIGEN2, with an MPI parallel zone decomposition strategy. The program system consists only of MCNP5MPI and an interface subroutine. The interface subroutine performs three main functions, i.e., zone decomposition, nuclide transfer and decay, and data exchange with MCNP5MPI. The program was verified against the pressurized water reactor (PWR) cell burnup benchmark; the results showed that it can be applied to burnup calculations with multiple zones, and that the computational efficiency can be significantly improved as computer hardware develops. (authors)
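
    A hedged sketch of the coupling structure the abstract describes: ranks share the transport solution, deplete the burnup zones they own, and exchange updated densities each step. transport_solve and deplete_zone are toy stand-ins for MCNP5MPI and the ORIGEN2 call, not the program's actual interfaces.

```python
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_zones, n_steps = 8, 3
densities = np.ones(n_zones)            # one nuclide density per zone (toy)

def transport_solve(dens):              # placeholder: flux ~ inverse density
    return 1.0 / dens

def deplete_zone(dens, flux, dt):       # placeholder: exponential depletion
    return dens * np.exp(-0.01 * flux * dt)

for step in range(n_steps):
    flux = transport_solve(densities)   # identical on all ranks (toy)
    for z in range(rank, n_zones, size):        # zone decomposition
        densities[z] = deplete_zone(densities[z], flux[z], dt=30.0)
    # every rank broadcasts the zones it owns so all copies stay consistent
    for z in range(n_zones):
        densities[z] = comm.bcast(densities[z], root=z % size)

if rank == 0:
    print(densities)
```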

  6. Computing the non-Markovian coarse-grained interactions derived from the Mori-Zwanzig formalism in molecular systems: Application to polymer melts

    Science.gov (United States)

    Li, Zhen; Lee, Hee Sun; Darve, Eric; Karniadakis, George Em

    2017-01-01

    Memory effects are often introduced during coarse-graining of a complex dynamical system. In particular, a generalized Langevin equation (GLE) for the coarse-grained (CG) system arises in the context of Mori-Zwanzig formalism. Upon a pairwise decomposition, GLE can be reformulated into its pairwise version, i.e., non-Markovian dissipative particle dynamics (DPD). GLE models the dynamics of a single coarse particle, while DPD considers the dynamics of many interacting CG particles, with both CG systems governed by non-Markovian interactions. We compare two different methods for the practical implementation of the non-Markovian interactions in GLE and DPD systems. More specifically, a direct evaluation of the non-Markovian (NM) terms is performed in LE-NM and DPD-NM models, which requires the storage of historical information that significantly increases computational complexity. Alternatively, we use a few auxiliary variables in LE-AUX and DPD-AUX models to replace the non-Markovian dynamics with a Markovian dynamics in a higher dimensional space, leading to a much reduced memory footprint and computational cost. In our numerical benchmarks, the GLE and non-Markovian DPD models are constructed from molecular dynamics (MD) simulations of star-polymer melts. Results show that a Markovian dynamics with auxiliary variables successfully generates equivalent non-Markovian dynamics consistent with the reference MD system, while maintaining a tractable computational cost. Also, transient subdiffusion of the star-polymers observed in the MD system can be reproduced by the coarse-grained models. The non-interacting particle models, LE-NM/AUX, are computationally much cheaper than the interacting particle models, DPD-NM/AUX. However, the pairwise models with momentum conservation are more appropriate for correctly reproducing the long-time hydrodynamics characterised by an algebraic decay in the velocity autocorrelation function.
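
    For a single exponential memory kernel the auxiliary-variable construction can be made concrete: the history integral in the GLE is replaced by one extra Ornstein-Uhlenbeck-like variable, giving a Markovian (x, v, z) system with no stored trajectory. The sketch below uses illustrative parameters, not values fitted to polymer melts; its stationary state satisfies m<v^2> = kT, which the final print statement checks.

```python
import numpy as np

rng = np.random.default_rng(2)
gamma, tau, kT, m = 1.0, 0.5, 1.0, 1.0   # kernel K(t) = (gamma/tau) exp(-t/tau)
dt, n = 1e-3, 200_000

x, v, z = 0.0, 0.0, 0.0
v2_sum = 0.0
for _ in range(n):
    # z carries the memory: it relaxes with rate 1/tau, integrates
    # -(gamma/tau)*v, and receives white noise scaled so the
    # fluctuation-dissipation relation holds for this kernel.
    z += dt * (-z / tau - (gamma / tau) * v) \
         + np.sqrt(2.0 * kT * gamma * dt) / tau * rng.normal()
    v += dt * z / m                       # the memory force drives the momentum
    x += dt * v
    v2_sum += v * v

print("m<v^2> =", m * v2_sum / n, "(should approach kT = 1)")
```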

  7. Real-time management (RTM) by cloud computing system dynamics (CCSD) for risk analysis of Fukushima nuclear power plant (NPP) accident

    Energy Technology Data Exchange (ETDEWEB)

    Cho, Hyo Sung [Yonsei Univ., Wonju Gangwon-do (Korea, Republic of). Dept. of Radiation Convergence Engineering; Woo, Tae Ho [Yonsei Univ., Wonju Gangwon-do (Korea, Republic of). Dept. of Radiation Convergence Engineering; The Cyber Univ. of Korea, Seoul (Korea, Republic of). Dept. of Mechanical and Control Engineering

    2017-03-15

    The earthquake- and tsunami-induced nuclear power plant (NPP) accident in the Fukushima disaster is investigated using the real-time management (RTM) method. This non-linear safety-management logic is applied to enhance methodological confidence in NPP reliability. A case study of the earthquake is modeled to exploit the fast reaction characteristics of RTM. System dynamics (SD) modeling simulations and cloud computing are applied in the RTM method, where real-time simulation provides fast and effective communication for accident remediation and prevention. Current tablet computing systems can improve the safety standard of an NPP. Finally, the procedure of cloud computing system dynamics (CCSD) modeling is constructed.

  8. Real-time management (RTM) by cloud computing system dynamics (CCSD) for risk analysis of Fukushima nuclear power plant (NPP) accident

    International Nuclear Information System (INIS)

    Cho, Hyo Sung; Woo, Tae Ho; The Cyber Univ. of Korea, Seoul

    2017-01-01

    The earthquake- and tsunami-induced nuclear power plant (NPP) accident in the Fukushima disaster is investigated using the real-time management (RTM) method. This non-linear safety-management logic is applied to enhance methodological confidence in NPP reliability. A case study of the earthquake is modeled to exploit the fast reaction characteristics of RTM. System dynamics (SD) modeling simulations and cloud computing are applied in the RTM method, where real-time simulation provides fast and effective communication for accident remediation and prevention. Current tablet computing systems can improve the safety standard of an NPP. Finally, the procedure of cloud computing system dynamics (CCSD) modeling is constructed.

  9. STICK: Spike Time Interval Computational Kernel, a Framework for General Purpose Computation Using Neurons, Precise Timing, Delays, and Synchrony.

    Science.gov (United States)

    Lagorce, Xavier; Benosman, Ryad

    2015-11-01

    There has been significant research over the past two decades in developing new platforms for spiking neural computation. Current neural computers are primarily developed to mimic biology. They use neural networks, which can be trained to perform specific tasks, mainly to solve pattern recognition problems. These machines can do more than simulate biology; they allow us to rethink our current paradigm of computation. The ultimate goal is to develop brain-inspired general purpose computation architectures that can breach the current bottleneck introduced by the von Neumann architecture. This work proposes a new framework for such a machine. We show that the use of neuron-like units with precise timing representation, synaptic diversity, and temporal delays allows us to set up a complete, scalable, compact computation framework. The framework provides both linear and nonlinear operations, allowing us to represent and solve any function. We show usability in solving real use cases, from simple differential equations to sets of nonlinear differential equations leading to chaotic attractors.
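
    The framework's key primitive is representing a number by the precise time between two spikes. A toy version of such an interval code is sketched below; the constants and the exact mapping are illustrative assumptions, not the paper's specification.

```python
T_MIN, T_COD = 1.0, 10.0   # hypothetical timing constants, in milliseconds

def encode(x: float, t0: float = 0.0) -> tuple[float, float]:
    # A value x in [0, 1] becomes the interval between two precisely timed spikes.
    return t0, t0 + T_MIN + x * T_COD

def decode(t0: float, t1: float) -> float:
    return ((t1 - t0) - T_MIN) / T_COD

t0, t1 = encode(0.37)
print(decode(t0, t1))  # 0.37; jitter in spike timing directly limits precision
```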

  10. Computer-controlled neutron time-of-flight spectrometer. Part II

    International Nuclear Information System (INIS)

    Merriman, S.H.

    1979-12-01

    A time-of-flight spectrometer for neutron inelastic scattering research has been interfaced to a PDP-15/30 computer. The computer is used for experimental data acquisition and analysis and for apparatus control. This report was prepared to summarize the functions of the computer and to act as a users' guide to the software system.

  11. Computer-assisted determination of left ventricular endocardial borders reduces variability in the echocardiographic assessment of ejection fraction

    Directory of Open Access Journals (Sweden)

    Lindstrom Lena

    2008-11-01

    Full Text Available Abstract Background Left ventricular size and function are important prognostic factors in heart disease. Their measurement is the most frequent reason for sending patients to the echo lab. These measurements have important implications for therapy but are sensitive to the skill of the operator. Earlier automated echo-based methods have not become widely used. The aim of our study was to evaluate an automatic echocardiographic method (with manual correction if needed) for determining left ventricular ejection fraction (LVEF), based on an active appearance model of the left ventricle (syngo®AutoEF, Siemens Medical Solutions). Comparisons were made with manual planimetry (manual Simpson), visual assessment, and automatically determined LVEF from quantitative myocardial gated single photon emission computed tomography (SPECT). Methods 60 consecutive patients referred for myocardial perfusion imaging (MPI) were included in the study. Two-dimensional echocardiography was performed within one hour of MPI at rest. Image quality did not constitute an exclusion criterion. Analysis was performed by five experienced observers and by two novices. Results LVEF (%), end-diastolic and end-systolic volume/BSA (ml/m2) were: for uncorrected AutoEF 54 ± 10, 51 ± 16, 24 ± 13; for corrected AutoEF 53 ± 10, 53 ± 18, 26 ± 14; for manual Simpson 51 ± 11, 56 ± 20, 28 ± 15; and for MPI 52 ± 12, 67 ± 26, 35 ± 23. The required time for analysis was significantly different for all four echocardiographic methods and was for uncorrected AutoEF 79 ± 5 s, for corrected AutoEF 159 ± 46 s, for manual Simpson 177 ± 66 s, and for visual assessment 33 ± 14 s. Compared with the expert manual Simpson, limits of agreement for novice corrected AutoEF were lower than for novice manual Simpson (0.8 ± 10.5 vs. -3.2 ± 11.4 LVEF percentage points). Calculated for experts and with LVEF (%) categorized into ... Conclusion Corrected AutoEF reduces the variation in measurements compared with

  12. MPI-AMRVAC 2.0 for Solar and Astrophysical Applications

    Science.gov (United States)

    Xia, C.; Teunissen, J.; El Mellah, I.; Chané, E.; Keppens, R.

    2018-02-01

    We report on the development of MPI-AMRVAC version 2.0, which is an open-source framework for parallel, grid-adaptive simulations of hydrodynamic and magnetohydrodynamic (MHD) astrophysical applications. The framework now supports radial grid stretching in combination with adaptive mesh refinement (AMR). The advantages of this combined approach are demonstrated with one-dimensional, two-dimensional, and three-dimensional examples of spherically symmetric Bondi accretion, steady planar Bondi–Hoyle–Lyttleton flows, and wind accretion in supergiant X-ray binaries. Another improvement is support for the generic splitting of any background magnetic field. We present several tests relevant for solar physics applications to demonstrate the advantages of field splitting on accuracy and robustness in extremely low-plasma β environments: a static magnetic flux rope, a magnetic null-point, and magnetic reconnection in a current sheet with either uniform or anomalous resistivity. Our implementation for treating anisotropic thermal conduction in multi-dimensional MHD applications is also described, which generalizes the original slope-limited symmetric scheme from two to three dimensions. We perform ring diffusion tests that demonstrate its accuracy and robustness, and show that it prevents the unphysical thermal flux present in traditional schemes. The improved parallel scaling of the code is demonstrated with three-dimensional AMR simulations of solar coronal rain, which show satisfactory strong scaling up to 2000 cores. Other framework improvements are also reported: the modernization and reorganization into a library, the handling of automatic regression tests, the use of inline/online Doxygen documentation, and a new future-proof data format for input/output.

  13. Time series modeling, computation, and inference

    CERN Document Server

    Prado, Raquel

    2010-01-01

    The authors systematically develop a state-of-the-art analysis and modeling of time series. … this book is well organized and well written. The authors present various statistical models for engineers to solve problems in time series analysis. Readers no doubt will learn state-of-the-art techniques from this book.-Hsun-Hsien Chang, Computing Reviews, March 2012My favorite chapters were on dynamic linear models and vector AR and vector ARMA models.-William Seaver, Technometrics, August 2011… a very modern entry to the field of time-series modelling, with a rich reference list of the current lit

  14. Heterogeneous real-time computing in radio astronomy

    Science.gov (United States)

    Ford, John M.; Demorest, Paul; Ransom, Scott

    2010-07-01

    Modern computer architectures suited for general purpose computing are often not the best choice for either I/O-bound or compute-bound problems. Sometimes the best choice is not to choose a single architecture, but to take advantage of the best characteristics of different computer architectures to solve your problems. This paper examines the tradeoffs between using computer systems based on the ubiquitous X86 Central Processing Units (CPU's), Field Programmable Gate Array (FPGA) based signal processors, and Graphical Processing Units (GPU's). We will show how a heterogeneous system can be produced that blends the best of each of these technologies into a real-time signal processing system. FPGA's tightly coupled to analog-to-digital converters connect the instrument to the telescope and supply the first level of computing to the system. These FPGA's are coupled to other FPGA's to continue to provide highly efficient processing power. Data is then packaged up and shipped over fast networks to a cluster of general purpose computers equipped with GPU's, which are used for floating-point intensive computation. Finally, the data is handled by the CPU and written to disk, or further processed. Each of the elements in the system has been chosen for its specific characteristics and the role it can play in creating a system that does the most for the least, in terms of power, space, and money.

  15. Ubiquitous computing technology for just-in-time motivation of behavior change.

    Science.gov (United States)

    Intille, Stephen S

    2004-01-01

    This paper describes a vision of health care where "just-in-time" user interfaces are used to transform people from passive to active consumers of health care. Systems that use computational pattern recognition to detect points of decision, behavior, or consequences automatically can present motivational messages to encourage healthy behavior at just the right time. Further, new ubiquitous computing and mobile computing devices permit information to be conveyed to users at just the right place. In combination, computer systems that present messages at the right time and place can be developed to motivate physical activity and healthy eating. Computational sensing technologies can also be used to measure the impact of the motivational technology on behavior.

  16. Temporal trends in non-occupational sedentary behaviours from Australian Time Use Surveys 1992, 1997 and 2006

    Directory of Open Access Journals (Sweden)

    Chau Josephine Y

    2012-06-01

    Full Text Available Abstract Background Current epidemiological data highlight the potential detrimental associations between sedentary behaviours and health outcomes, yet little is known about temporal trends in adult sedentary time. This study used time use data to examine population trends in sedentary behaviours in non-occupational domains and more specifically during leisure time. Methods We conducted secondary analysis of population representative data from the Australian Time Use Surveys 1992, 1997 and 2006 involving respondents aged 20 years and over with completed time use diaries for two days. Weighted samples for each survey year were: n = 5851 (1992), n = 6419 (1997) and n = 5505 (2006). We recoded all primary activities by domain (sleep, occupational, transport, leisure, household, education) and intensity (sedentary, light, moderate). Adjusted multiple linear regressions tested for differences in time spent in non-occupational sedentary behaviours in 1992 and 1997, with 2006 as the reference year. Results Total non-occupational sedentary time was slightly lower in 1997 than in 2006 (mean = 894 min/2d and 906 min/2d, respectively; B = −11.2; 95%CI: −21.5, −0.9). Compared with 2006, less time was spent in 1997 in sedentary transport (B = −6.7; 95%CI: −10.4, −3.0) and sedentary education (B = −6.3; 95%CI: −10.5, −2.2), while household and leisure sedentary time remained stable. Time engaged in different types of leisure-time sedentary activities changed between 1997 and 2006: leisure-time computer use increased (B = −26.7; 95%CI: −29.5, −23.8), while other leisure-time sedentary behaviours (e.g., reading, listening to music, hobbies and crafts) showed small concurrent reductions. In 1992, leisure screen time was lower than in 2006: TV-viewing (B = −24.2; 95%CI: −31.2, −17.2), computer use (B = −35.3; 95%CI: −37.7, −32.8). In 2006, 90% of leisure time was spent sedentary, of which 53% was screen time. Conclusions Non

  17. ALGORITHMS AND PROGRAMS FOR STRONG GRAVITATIONAL LENSING IN KERR SPACE-TIME INCLUDING POLARIZATION

    Energy Technology Data Exchange (ETDEWEB)

    Chen, Bin; Maddumage, Prasad [Research Computing Center, Department of Scientific Computing, Florida State University, Tallahassee, FL 32306 (United States); Kantowski, Ronald; Dai, Xinyu; Baron, Eddie, E-mail: bchen3@fsu.edu [Homer L. Dodge Department of Physics and Astronomy, University of Oklahoma, Norman, OK 73019 (United States)

    2015-05-15

    Active galactic nuclei (AGNs) and quasars are important astrophysical objects to understand. Recently, microlensing observations have constrained the size of the quasar X-ray emission region to be of the order of 10 gravitational radii of the central supermassive black hole. For distances within a few gravitational radii, light paths are strongly bent by the strong gravity field of the central black hole. If the central black hole has nonzero angular momentum (spin), then a photon’s polarization plane will be rotated by the gravitational Faraday effect. The observed X-ray flux and polarization will then be influenced significantly by the strong gravity field near the source. Consequently, linear gravitational lensing theory is inadequate for such extreme circumstances. We present simple algorithms computing the strong lensing effects of Kerr black holes, including the effects on polarization. Our algorithms are realized in a program “KERTAP” in two versions: MATLAB and Python. The key ingredients of KERTAP are a graphic user interface, a backward ray-tracing algorithm, a polarization propagator dealing with gravitational Faraday rotation, and algorithms computing observables such as flux magnification and polarization angles. Our algorithms can be easily realized in other programming languages such as FORTRAN, C, and C++. The MATLAB version of KERTAP is parallelized using the MATLAB Parallel Computing Toolbox and the Distributed Computing Server. The Python code was sped up using Cython and supports full implementation of MPI using the “mpi4py” package. As an example, we investigate the inclination angle dependence of the observed polarization and the strong lensing magnification of AGN X-ray emission. We conclude that it is possible to perform complex numerical-relativity related computations using interpreted languages such as MATLAB and Python.

  18. ALGORITHMS AND PROGRAMS FOR STRONG GRAVITATIONAL LENSING IN KERR SPACE-TIME INCLUDING POLARIZATION

    International Nuclear Information System (INIS)

    Chen, Bin; Maddumage, Prasad; Kantowski, Ronald; Dai, Xinyu; Baron, Eddie

    2015-01-01

    Active galactic nuclei (AGNs) and quasars are important astrophysical objects to understand. Recently, microlensing observations have constrained the size of the quasar X-ray emission region to be of the order of 10 gravitational radii of the central supermassive black hole. For distances within a few gravitational radii, light paths are strongly bent by the strong gravity field of the central black hole. If the central black hole has nonzero angular momentum (spin), then a photon’s polarization plane will be rotated by the gravitational Faraday effect. The observed X-ray flux and polarization will then be influenced significantly by the strong gravity field near the source. Consequently, linear gravitational lensing theory is inadequate for such extreme circumstances. We present simple algorithms computing the strong lensing effects of Kerr black holes, including the effects on polarization. Our algorithms are realized in a program “KERTAP” in two versions: MATLAB and Python. The key ingredients of KERTAP are a graphic user interface, a backward ray-tracing algorithm, a polarization propagator dealing with gravitational Faraday rotation, and algorithms computing observables such as flux magnification and polarization angles. Our algorithms can be easily realized in other programming languages such as FORTRAN, C, and C++. The MATLAB version of KERTAP is parallelized using the MATLAB Parallel Computing Toolbox and the Distributed Computing Server. The Python code was sped up using Cython and supports full implementation of MPI using the “mpi4py” package. As an example, we investigate the inclination angle dependence of the observed polarization and the strong lensing magnification of AGN X-ray emission. We conclude that it is possible to perform complex numerical-relativity related computations using interpreted languages such as MATLAB and Python

  19. Continuous-Time Symmetric Hopfield Nets are Computationally Universal

    Czech Academy of Sciences Publication Activity Database

    Šíma, Jiří; Orponen, P.

    2003-01-01

    Roč. 15, č. 3 (2003), s. 693-733 ISSN 0899-7667 R&D Projects: GA AV ČR IAB2030007; GA ČR GA201/02/1456 Institutional research plan: AV0Z1030915 Keywords : continuous-time Hopfield network * Liapunov function * analog computation * computational power * Turing universality Subject RIV: BA - General Mathematics Impact factor: 2.747, year: 2003

  20. Real-time dynamics of lattice gauge theories with a few-qubit quantum computer

    Science.gov (United States)

    Martinez, Esteban A.; Muschik, Christine A.; Schindler, Philipp; Nigg, Daniel; Erhard, Alexander; Heyl, Markus; Hauke, Philipp; Dalmonte, Marcello; Monz, Thomas; Zoller, Peter; Blatt, Rainer

    2016-06-01

    Gauge theories are fundamental to our understanding of interactions between the elementary constituents of matter as mediated by gauge bosons. However, computing the real-time dynamics in gauge theories is a notorious challenge for classical computational methods. This has recently stimulated theoretical effort, using Feynman’s idea of a quantum simulator, to devise schemes for simulating such theories on engineered quantum-mechanical devices, with the difficulty that gauge invariance and the associated local conservation laws (Gauss laws) need to be implemented. Here we report the experimental demonstration of a digital quantum simulation of a lattice gauge theory, by realizing (1 + 1)-dimensional quantum electrodynamics (the Schwinger model) on a few-qubit trapped-ion quantum computer. We are interested in the real-time evolution of the Schwinger mechanism, describing the instability of the bare vacuum due to quantum fluctuations, which manifests itself in the spontaneous creation of electron-positron pairs. To make efficient use of our quantum resources, we map the original problem to a spin model by eliminating the gauge fields in favour of exotic long-range interactions, which can be directly and efficiently implemented on an ion trap architecture. We explore the Schwinger mechanism of particle-antiparticle generation by monitoring the mass production and the vacuum persistence amplitude. Moreover, we track the real-time evolution of entanglement in the system, which illustrates how particle creation and entanglement generation are directly related. Our work represents a first step towards quantum simulation of high-energy theories using atomic physics experiments—the long-term intention is to extend this approach to real-time quantum simulations of non-Abelian lattice gauge theories.

  1. Multiscale Space-Time Computational Methods for Fluid-Structure Interactions

    Science.gov (United States)

    2015-09-13

    [Scanned report; OCR largely unrecoverable. Recoverable details: topics include thermo-fluid analysis of a ground vehicle and its tires, ST-SI computational analysis of a vertical-axis wind turbine, multiscale compressible-flow computation with particle tracking, and space-time VMS computation of wind-turbine rotor and tower aerodynamics; names mentioned: Tezduyar, Spenser McIntyre, Nikolay Kostov, Ryan Kolesar, Casey Habluetzel.]

  2. Oak Ridge Institutional Cluster Autotune Test Drive Report

    Energy Technology Data Exchange (ETDEWEB)

    Jibonananda, Sanyal [ORNL; New, Joshua Ryan [ORNL

    2014-02-01

    The Oak Ridge Institutional Cluster (OIC) provides general purpose computational resources for ORNL staff to run computation-heavy jobs that are larger than desktop applications but do not quite require the scale and power of the Oak Ridge Leadership Computing Facility (OLCF). This report details the efforts made and conclusions derived in performing a short test drive of the cluster resources on Phase 5 of the OIC. EnergyPlus was used in the analysis as a candidate user program, and the overall software environment was evaluated against challenges anticipated from experience with resources such as the shared-memory Nautilus (JICS) and Titan (OLCF). The OIC performed within reason and was found to be acceptable in the context of running EnergyPlus simulations. The number of cores per node and the availability of scratch space per node allow non-traditional, desktop-focused applications to leverage parallel ensemble execution. Although only individual runs of EnergyPlus were executed, the software environment on the OIC appeared suitable for running ensemble simulations with some modifications to the Autotune workflow. From a standpoint of general usability, the system supports common Linux libraries, compilers, standard job scheduling software (Torque/Moab), and the OpenMPI library (the only MPI library) for MPI communications. The file system is a Panasas file system, which the literature indicates to be efficient.

  3. Diagnostic value of early post-exercise 99Tcm-MIBI ECG-gated myocardial perfusion imaging in severe coronary artery disease

    International Nuclear Information System (INIS)

    Li Dianfu; Huang Jun; Feng Jianlin; Cheng Xu; Li Xinli; Cao Kejiang

    2005-01-01

    Objective: To study and compare the diagnostic value in severe coronary artery disease (CAD) of 99Tcm-methoxyisobutylisonitrile (MIBI) electrocardiogram (ECG)-gated early post-exercise myocardial perfusion imaging (G-MPI) with that of non-ECG-gated myocardial perfusion imaging (NG-MPI). Methods: Two hundred and fifteen suspected CAD patients who had undergone G-MPI and coronary artery angiography (CAG) within one month were enrolled and divided into three-vessel and non-three-vessel CAD groups according to the CAG results (≥70%); the diagnostic values in severe CAD of G-MPI and NG-MPI were obtained and compared to determine which of the two protocols is superior in identifying severe three-vessel CAD. Results: When CAG showing ≥70% diameter stenosis was used as the diagnostic standard for severe CAD, the sensitivities of G-MPI and NG-MPI in the diagnosis of severe CAD were 95.3% (143/150) and 90.7% (136/150, χ² = 2.509, P = 0.113), but when the comparison was restricted to severe three-vessel CAD, there was a significant difference between G-MPI [100% (51/51)] and NG-MPI [92.2% (47/51), χ² = 4.163, P = 0.041]. The diagnostic specificity of G-MPI was 80.0% and that of NG-MPI was 72.3% (χ² = 1.059, P = 0.303). Conclusions: The incremental diagnostic sensitivity of G-MPI over NG-MPI in the diagnosis of severe CAD came mainly from the three-vessel subgroup patients. Exercise stress G-MPI has better diagnostic value than NG-MPI in severe three-vessel CAD patients. (authors)

  4. Polynomial approximation of non-Gaussian unitaries by counting one photon at a time

    Science.gov (United States)

    Arzani, Francesco; Treps, Nicolas; Ferrini, Giulia

    2017-05-01

    In quantum computation with continuous-variable systems, quantum advantage can only be achieved if some non-Gaussian resource is available. Yet, non-Gaussian unitary evolutions and measurements suited for computation are challenging to realize in the laboratory. We propose and analyze two methods to apply a polynomial approximation of any unitary operator diagonal in the amplitude quadrature representation, including non-Gaussian operators, to an unknown input state. Our protocols use as a primary non-Gaussian resource a single-photon counter. We use the fidelity of the transformation with the target one on Fock and coherent states to assess the quality of the approximate gate.

  5. Prognostic Value of Cardiac Time Intervals by Tissue Doppler Imaging M-Mode in Patients With Acute ST-Segment-Elevation Myocardial Infarction Treated With Primary Percutaneous Coronary Intervention

    DEFF Research Database (Denmark)

    Biering-Sørensen, Tor; Mogelvang, Rasmus; Søgaard, Peter

    2013-01-01

    Background- Color tissue Doppler imaging M-mode through the mitral leaflet is an easy and precise method to estimate all cardiac time intervals from 1 cardiac cycle and thereby obtain the myocardial performance index (MPI). However, the prognostic value of the cardiac time intervals and the MPI...... assessed by color tissue Doppler imaging M-mode through the mitral leaflet in patients with ST-segment-elevation myocardial infarction (MI) is unknown. Methods and Results- In total, 391 patients were admitted with an ST-segment-elevation MI, treated with primary percutaneous coronary intervention...

  6. Specialized Computer Systems for Environment Visualization

    Science.gov (United States)

    Al-Oraiqat, Anas M.; Bashkov, Evgeniy A.; Zori, Sergii A.

    2018-06-01

    The need for real-time image generation of landscapes arises in various fields as part of tasks solved by virtual and augmented reality systems, as well as geographic information systems. Such systems provide opportunities for collecting, storing, analyzing and graphically visualizing geographic data. Algorithmic and hardware/software tools for increasing the realism and efficiency of environment visualization in 3D visualization systems are proposed. This paper discusses a modified path tracing algorithm with a two-level hierarchy of bounding volumes that finds intersections with axis-aligned bounding boxes. The proposed algorithm eliminates branching and is hence better suited to implementation on multi-threaded CPUs and GPUs. A modified ROAM algorithm is used to solve the problem of high-quality visualization of reliefs and landscapes. The algorithm is implemented on parallel systems: clusters and Compute Unified Device Architecture networks. Results show that the implementation on MPI clusters is more efficient than on Graphics Processing Units/Graphics Processing Clusters and allows real-time synthesis. The organization and algorithms of a parallel GPU system for 3D pseudo-stereo image/video synthesis are proposed. After analyzing the feasibility of implementing each stage on a parallel GPU architecture, 3D pseudo-stereo synthesis is performed. An experimental prototype of a specialized hardware-software system for 3D pseudo-stereo imaging and video was developed on the CPU/GPU. The experimental results show that the proposed adaptation of 3D pseudo-stereo imaging to the architecture of GPU systems is efficient, and that it accelerates the computational procedures of 3D pseudo-stereo synthesis for the anaglyph and anamorphic formats of the 3D stereo frame without performing optimization procedures. The acceleration is on average 11 and 54 times for the test GPUs.
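
    The branch-free ray/axis-aligned-bounding-box test alluded to above is commonly done with the slab method; a sketch follows. Using min/max instead of conditionals keeps all lanes doing the same work, which suits SIMD CPUs and GPUs, and division by a zero direction component yields ±inf, which the min/max logic absorbs naturally.

```python
import numpy as np

def ray_aabb_hit(origin, inv_dir, box_min, box_max):
    # origin, inv_dir: (3,) arrays; inv_dir = 1 / direction (precomputed).
    t1 = (box_min - origin) * inv_dir
    t2 = (box_max - origin) * inv_dir
    t_near = np.max(np.minimum(t1, t2))   # latest entry across the three slabs
    t_far = np.min(np.maximum(t1, t2))    # earliest exit across the three slabs
    return (t_near <= t_far) & (t_far >= 0.0)

origin = np.array([0.0, 0.0, -5.0])
direction = np.array([0.0, 0.0, 1.0])
with np.errstate(divide="ignore"):
    inv_dir = 1.0 / direction             # zero components become +/- inf
print(ray_aabb_hit(origin, inv_dir, np.array([-1.0, -1.0, -1.0]),
                   np.array([1.0, 1.0, 1.0])))   # True: the ray hits the unit box
```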

  7. Applied time series analysis and innovative computing

    CERN Document Server

    Ao, Sio-Iong

    2010-01-01

    This text is a systematic, state-of-the-art introduction to the use of innovative computing paradigms as an investigative tool for applications in time series analysis. It includes frontier case studies based on recent research.

  8. Experiences in the parallelization of the discrete ordinates method using OpenMP and MPI

    Energy Technology Data Exchange (ETDEWEB)

    Pautz, A. [TUV Hannover/Sachsen-Anhalt e.V. (Germany); Langenbuch, S. [Gesellschaft fur Anlagen- und Reaktorsicherheit (GRS) mbH (Germany)

    2003-07-01

    The method of Discrete Ordinates is in principle parallelizable to a high degree, since the transport 'mesh sweeps' are mutually independent for all angular directions. However, in the well-known production code Dort such a type of angular domain decomposition has to be done on a spatial line-by-line basis, causing the parallelism in the code to be very fine-grained. The construction of scalar fluxes and moments requires a large effort for inter-thread or inter-process communication. We have implemented two different parallelization approaches in Dort: firstly, we have used a shared-memory model suitable for SMP (Symmetric Multiprocessor) machines based on the standard OpenMP. The second approach uses the well-known Message Passing Interface (MPI) to establish communication between parallel processes running in a distributed-memory environment. We investigate the benefits and drawbacks of both models and show first results on performance and scaling behaviour of the parallel Dort code. (authors)
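
    The communication pattern at issue, building scalar fluxes from per-angle sweeps, can be hedged into a few lines of mpi4py. This is our sketch, not the Dort code; the "sweep" is a placeholder and the quadrature weights are uniform.

```python
# Each rank sweeps its own subset of angular directions, then the scalar flux
# is assembled by a global sum: exactly the inter-process communication step
# the abstract identifies as costly.
# Run with e.g.: mpiexec -n 4 python sweep_reduce.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_cells, n_angles = 1000, 64
my_angles = range(rank, n_angles, size)        # cyclic angular decomposition

local_flux = np.zeros(n_cells)
for a in my_angles:
    angular_flux = np.full(n_cells, 1.0 + a)   # placeholder for a real mesh sweep
    local_flux += angular_flux / n_angles      # uniform quadrature weight

scalar_flux = np.empty(n_cells)
comm.Allreduce(local_flux, scalar_flux, op=MPI.SUM)
if rank == 0:
    print("scalar flux sample:", scalar_flux[:3])
```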

  9. Experiences in the parallelization of the discrete ordinates method using OpenMP and MPI

    International Nuclear Information System (INIS)

    Pautz, A.; Langenbuch, S.

    2003-01-01

    The method of Discrete Ordinates is in principle parallelizable to a high degree, since the transport 'mesh sweeps' are mutually independent for all angular directions. However, in the well-known production code Dort such a type of angular domain decomposition has to be done on a spatial line-by-line basis, causing the parallelism in the code to be very fine-grained. The construction of scalar fluxes and moments requires a large effort for inter-thread or inter-process communication. We have implemented two different parallelization approaches in Dort: firstly, we have used a shared-memory model suitable for SMP (Symmetric Multiprocessor) machines based on the standard OpenMP. The second approach uses the well-known Message Passing Interface (MPI) to establish communication between parallel processes running in a distributed-memory environment. We investigate the benefits and drawbacks of both models and show first results on performance and scaling behaviour of the parallel Dort code. (authors)

  10. Non-adaptive measurement-based quantum computation and multi-party Bell inequalities

    International Nuclear Information System (INIS)

    Hoban, Matty J; Campbell, Earl T; Browne, Dan E; Loukopoulos, Klearchos

    2011-01-01

    Quantum correlations exhibit behaviour that cannot be resolved with a local hidden variable picture of the world. In quantum information, they are also used as resources for information processing tasks, such as measurement-based quantum computation (MQC). In MQC, universal quantum computation can be achieved via adaptive measurements on a suitable entangled resource state. In this paper, we look at a version of MQC in which we remove the adaptivity of measurements and aim to understand what computational abilities remain in the resource. We show that there are explicit connections between this model of computation and the question of non-classicality in quantum correlations. We demonstrate this by focusing on deterministic computation of Boolean functions, in which natural generalizations of the Greenberger-Horne-Zeilinger paradox emerge; we then explore probabilistic computation, via which multipartite Bell inequalities can be defined. We use this correspondence to define families of multi-party Bell inequalities, which we show to have a number of interesting contrasting properties.
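
    The GHZ-type paradox the authors generalize can be checked directly in numpy; the following sketch (ours) evaluates the four relevant correlators on the three-qubit GHZ state.

```python
# On the GHZ state, non-adaptive local X/Y measurements give the correlators
# XXX = +1 and XYY = YXY = YYX = -1. The product of the four quantum values is
# -1, while any local hidden variable assignment forces the product to be +1.
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]])

ghz = np.zeros(8, dtype=complex)
ghz[0] = ghz[7] = 1 / np.sqrt(2)          # (|000> + |111>)/sqrt(2)

def corr(A, B, C):
    op = np.kron(np.kron(A, B), C)
    return np.real(ghz.conj() @ op @ ghz)

for label, ops in [("XXX", (X, X, X)), ("XYY", (X, Y, Y)),
                   ("YXY", (Y, X, Y)), ("YYX", (Y, Y, X))]:
    print(label, round(corr(*ops)))       # +1, -1, -1, -1
```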

  11. Non-adaptive measurement-based quantum computation and multi-party Bell inequalities

    Energy Technology Data Exchange (ETDEWEB)

    Hoban, Matty J; Campbell, Earl T; Browne, Dan E [Department of Physics and Astronomy, University College London, Gower Street, London WC1E 6BT (United Kingdom); Loukopoulos, Klearchos, E-mail: m.hoban@ucl.ac.uk [Department of Materials, Oxford University, Parks Road, Oxford OX1 4PH (United Kingdom)

    2011-02-15

    Quantum correlations exhibit behaviour that cannot be resolved with a local hidden variable picture of the world. In quantum information, they are also used as resources for information processing tasks, such as measurement-based quantum computation (MQC). In MQC, universal quantum computation can be achieved via adaptive measurements on a suitable entangled resource state. In this paper, we look at a version of MQC in which we remove the adaptivity of measurements and aim to understand what computational abilities remain in the resource. We show that there are explicit connections between this model of computation and the question of non-classicality in quantum correlations. We demonstrate this by focusing on deterministic computation of Boolean functions, in which natural generalizations of the Greenberger-Horne-Zeilinger paradox emerge; we then explore probabilistic computation, via which multipartite Bell inequalities can be defined. We use this correspondence to define families of multi-party Bell inequalities, which we show to have a number of interesting contrasting properties.

  12. Project APhiD: A Lorenz-gauged A-Φ decomposition for parallelized computation of ultra-broadband electromagnetic induction in a fully heterogeneous Earth

    Science.gov (United States)

    Weiss, Chester J.

    2013-08-01

    An essential element for computational hypothesis testing, data inversion and experiment design for electromagnetic geophysics is a robust forward solver, capable of easily and quickly evaluating the electromagnetic response of arbitrary geologic structure. The usefulness of such a solver hinges on the balance among competing desires like ease of use, speed of forward calculation, scalability to large problems or compute clusters, parsimonious use of memory access, accuracy and by necessity, the ability to faithfully accommodate a broad range of geologic scenarios over extremes in length scale and frequency content. This is indeed a tall order. The present study addresses recent progress toward the development of a forward solver with these properties. Based on the Lorenz-gauged Helmholtz decomposition, a new finite volume solution over Cartesian model domains endowed with complex-valued electrical properties is shown to be stable over the frequency range 10^-2 to 10^10 Hz and over length scales of 10^-3 to 10^5 m. Benchmark examples are drawn from magnetotellurics, exploration geophysics, geotechnical mapping and laboratory-scale analysis, showing excellent agreement with reference analytic solutions. Computational efficiency is achieved through use of a matrix-free implementation of the quasi-minimum-residual (QMR) iterative solver, which eliminates explicit storage of finite volume matrix elements in favor of "on the fly" computation as needed by the iterative Krylov sequence. Further efficiency is achieved through sparse coupling matrices between the vector and scalar potentials whose non-zero elements arise only in those parts of the model domain where the conductivity gradient is non-zero. Multi-thread parallelization in the QMR solver through OpenMP pragmas is used to reduce the computational cost of its most expensive step: the single matrix-vector product at each iteration. High-level MPI communicators farm independent processes to available compute nodes for
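
    The matrix-free idea, handing the Krylov solver a routine that applies the operator rather than a stored matrix, can be sketched with SciPy's QMR. This is our toy: a real-coefficient 1D stencil stands in for the APhiD finite-volume system.

```python
# Matrix-free QMR in the spirit described above: matrix entries are computed
# "on the fly" inside matvec, so nothing is stored beyond a few work vectors.
import numpy as np
from scipy.sparse.linalg import LinearOperator, qmr

n = 10_000
sigma = 0.5                        # stand-in material coefficient (real here)

def matvec(x):
    # (2 + sigma) * x_i - x_{i-1} - x_{i+1}, homogeneous Dirichlet ends
    y = (2.0 + sigma) * x
    y[1:] -= x[:-1]
    y[:-1] -= x[1:]
    return y

# The toy operator is symmetric, so the transpose application is matvec itself.
A = LinearOperator((n, n), matvec=matvec, rmatvec=matvec, dtype=float)
b = np.ones(n)
x, info = qmr(A, b)
print("info =", info, " residual =", np.linalg.norm(matvec(x) - b))
```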

  13. 22 CFR 1429.21 - Computation of time for filing papers.

    Science.gov (United States)

    2010-04-01

    ... 22 Foreign Relations 2 2010-04-01 2010-04-01 true Computation of time for filing papers. 1429.21... MISCELLANEOUS AND GENERAL REQUIREMENTS General Requirements § 1429.21 Computation of time for filing papers. In... subchapter requires the filing of any paper, such document must be received by the Board or the officer or...

  14. Thallium-201 is comparable to technetium-99m-sestamibi for estimating cardiac function in patients with abnormal myocardial perfusion imaging

    Directory of Open Access Journals (Sweden)

    Ming-Che Wu

    2015-11-01

    We analyzed the left-ventricular functional data obtained by cardiac-gated single-photon emission computed tomography myocardial perfusion imaging (MPI) with thallium-201 (Tl-201) and technetium-99m-sestamibi (MIBI) protocols in different groups of patients, and compared the data between Tl-201 and MIBI. Two hundred and seventy-two patients undergoing dipyridamole stress/redistribution Tl-201 MPI and 563 patients undergoing 1-day rest/dipyridamole stress MIBI MPI were included. Higher mean stress ejection fraction (EF), rest EF, and change in EF (ΔEF) were noticed in the normal MPI groups by both Tl-201 and MIBI protocols. Higher mean EF was observed in the females with normal MPI results despite their higher mean age. Comparisons between the Tl-201 and MIBI groups suggested a significant difference in all functional parameters, except for the rest end diastolic volume/end systolic volume and ΔEF between groups with negative MPI results. For the positive MPI groups, there was no significant difference in all parameters, except for the change in end diastolic volume and change in end systolic volume after stress between both protocols. The Tl-201 provides comparable left-ventricular functional data to MIBI cardiac-gated single-photon emission computed tomography in patients with positive MPI results, and may therefore be undertaken routinely for incremental functional information that is especially valuable to this patient group.

  15. Highly reliable computer network for real time system

    International Nuclear Information System (INIS)

    Mohammed, F.A.; Omar, A.A.; Ayad, N.M.A.; Madkour, M.A.I.; Ibrahim, M.K.

    1988-01-01

    Many computer networks have been studied with respect to network architecture and the various protocols that govern data transfers and guarantee reliable communication among all nodes. A hierarchical network structure has been proposed to provide a simple and inexpensive way to realize a reliable real-time computer network. In such an architecture, all computers on the same level are connected to a common serial channel through intelligent nodes that collectively control data transfers over that channel. This level of the network can be considered a local area computer network (LACN) suitable for a nuclear power plant control system, since such a plant has geographically dispersed subsystems. Network expansion is straightforward: each added computer (host) is attached to the common channel. All the nodes are designed around a microprocessor chip to provide the required intelligence. A node can be divided into two sections: a common section that interfaces with the serial data channel, and a private section that interfaces with the host computer. The private section naturally tends to show some variation in hardware details to match the requirements of the individual host computers. 7 figs

  16. Computer simulations of long-time tails: what's new?

    NARCIS (Netherlands)

    Hoef, van der M.A.; Frenkel, D.

    1995-01-01

    Twenty-five years ago Alder and Wainwright discovered, by simulation, the 'long-time tails' in the velocity autocorrelation function of a single particle in a fluid [1]. Since then, few qualitatively new results on long-time tails have been obtained by computer simulations. However, within the

  17. Spike-timing-based computation in sound localization.

    Directory of Open Access Journals (Sweden)

    Dan F M Goodman

    2010-11-01

    Spike timing is precise in the auditory system and it has been argued that it conveys information about auditory stimuli, in particular about the location of a sound source. However, beyond simple time differences, the way in which neurons might extract this information is unclear and the potential computational advantages are unknown. The computational difficulty of this task for an animal is to locate the source of an unexpected sound from two monaural signals that are highly dependent on the unknown source signal. In neuron models consisting of spectro-temporal filtering and spiking nonlinearity, we found that the binaural structure induced by spatialized sounds is mapped to synchrony patterns that depend on source location rather than on source signal. Location-specific synchrony patterns would then result in the activation of location-specific assemblies of postsynaptic neurons. We designed a spiking neuron model which exploited this principle to locate a variety of sound sources in a virtual acoustic environment using measured human head-related transfer functions. The model was able to accurately estimate the location of previously unknown sounds in both azimuth and elevation (including front/back discrimination) in a known acoustic environment. We found that multiple representations of different acoustic environments could coexist as sets of overlapping neural assemblies which could be associated with spatial locations by Hebbian learning. The model demonstrates the computational relevance of relative spike timing to extract spatial information about sources independently of the source signal.

  18. DISPATCH: A Numerical Simulation Framework for the Exa-scale Era. I. Fundamentals

    Science.gov (United States)

    Nordlund, Åke; P Ramsey, Jon; Popovas, Andrius; Küffmeier, Michael

    2018-03-01

    We introduce a high-performance simulation framework that permits the semi-independent, task-based solution of sets of partial differential equations, typically manifesting as updates to a collection of `patches' in space-time. A hybrid MPI/OpenMP execution model is adopted, where work tasks are controlled by a rank-local `dispatcher' which selects, from a set of tasks generally much larger than the number of physical cores (or hardware threads), tasks that are ready for updating. The definition of a task can vary, for example, with some solving the equations of ideal magnetohydrodynamics (MHD), others non-ideal MHD, radiative transfer, or particle motion, and yet others applying particle-in-cell (PIC) methods. Tasks do not have to be grid-based, while tasks that are, may use either Cartesian or orthogonal curvilinear meshes. Patches may be stationary or moving. Mesh refinement can be static or dynamic. A feature of decisive importance for the overall performance of the framework is that time steps are determined and applied locally; this allows potentially large reductions in the total number of updates required in cases when the signal speed varies greatly across the computational domain, and therefore a corresponding reduction in computing time. Another feature is a load balancing algorithm that operates `locally' and aims to simultaneously minimise load and communication imbalance. The framework generally relies on already existing solvers, whose performance is augmented when run under the framework, due to more efficient cache usage, vectorisation, local time-stepping, plus near-linear and, in principle, unlimited OpenMP and MPI scaling.
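
    The two framework features singled out above, a dispatcher that advances whichever task is furthest behind and locally determined time steps, can be caricatured in a few lines. This is our sketch; patch names and step sizes are invented.

```python
# A dispatcher pops the task with the smallest local time, advances it by its
# own dt, and requeues it. Regions with slow signal speeds take large steps,
# so the total update count can be far below that of a global smallest-dt loop.
import heapq

patches = {"quiet": 1.0, "active": 0.05}     # patch name -> local time step
horizon = 1.0

queue = [(0.0, name) for name in patches]    # (current time, patch)
heapq.heapify(queue)
updates = 0

while queue:
    t, name = heapq.heappop(queue)           # task that is furthest behind
    t += patches[name]                       # one local-time-step update
    updates += 1
    if t < horizon:
        heapq.heappush(queue, (t, name))

global_dt_updates = len(patches) * round(horizon / min(patches.values()))
print(f"local stepping: {updates} updates vs global dt: {global_dt_updates}")
```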

  19. DISPATCH: a numerical simulation framework for the exa-scale era - I. Fundamentals

    Science.gov (United States)

    Nordlund, Åke; Ramsey, Jon P.; Popovas, Andrius; Küffmeier, Michael

    2018-06-01

    We introduce a high-performance simulation framework that permits the semi-independent, task-based solution of sets of partial differential equations, typically manifesting as updates to a collection of `patches' in space-time. A hybrid MPI/OpenMP execution model is adopted, where work tasks are controlled by a rank-local `dispatcher' which selects, from a set of tasks generally much larger than the number of physical cores (or hardware threads), tasks that are ready for updating. The definition of a task can vary, for example, with some solving the equations of ideal magnetohydrodynamics (MHD), others non-ideal MHD, radiative transfer, or particle motion, and yet others applying particle-in-cell (PIC) methods. Tasks do not have to be grid based, while tasks that are, may use either Cartesian or orthogonal curvilinear meshes. Patches may be stationary or moving. Mesh refinement can be static or dynamic. A feature of decisive importance for the overall performance of the framework is that time-steps are determined and applied locally; this allows potentially large reductions in the total number of updates required in cases when the signal speed varies greatly across the computational domain, and therefore a corresponding reduction in computing time. Another feature is a load balancing algorithm that operates `locally' and aims to simultaneously minimize load and communication imbalance. The framework generally relies on already existing solvers, whose performance is augmented when run under the framework, due to more efficient cache usage, vectorization, local time-stepping, plus near-linear and, in principle, unlimited OpenMP and MPI scaling.

  20. 5 CFR 831.703 - Computation of annuities for part-time service.

    Science.gov (United States)

    2010-01-01

    ... 5 Administrative Personnel 2 2010-01-01 2010-01-01 false Computation of annuities for part-time... part-time service. (a) Purpose. The computational method in this section shall be used to determine the annuity for an employee who has part-time service on or after April 7, 1986. (b) Definitions. In this...

  1. Real-time data acquisition and feedback control using Linux Intel computers

    International Nuclear Information System (INIS)

    Penaflor, B.G.; Ferron, J.R.; Piglowski, D.A.; Johnson, R.D.; Walker, M.L.

    2006-01-01

    This paper describes the experiences of the DIII-D programming staff in adapting Linux based Intel computing hardware for use in real-time data acquisition and feedback control systems. Due to the highly dynamic and unstable nature of magnetically confined plasmas in tokamak fusion experiments, real-time data acquisition and feedback control systems are in routine use with all major tokamaks. At DIII-D, plasmas are created and sustained using a real-time application known as the digital plasma control system (PCS). During each experiment, the PCS periodically samples data from hundreds of diagnostic signals and provides these data to control algorithms implemented in software. These algorithms compute the necessary commands to send to various actuators that affect plasma performance. The PCS consists of a group of rack mounted Intel Xeon computer systems running an in-house customized version of the Linux operating system tailored specifically to meet the real-time performance needs of the plasma experiments. This paper provides a more detailed description of the real-time computing hardware and custom developed software, including recent work to utilize dual Intel Xeon equipped computers within the PCS

  2. Long-time and large-distance asymptotic behavior of the current-current correlators in the non-linear Schroedinger model

    Energy Technology Data Exchange (ETDEWEB)

    Kozlowski, K.K. [Deutsches Elektronen-Synchrotron (DESY), Hamburg (Germany); Terras, V. [CNRS, ENS Lyon (France). Lab. de Physique

    2010-12-15

    We present a new method allowing us to derive the long-time and large-distance asymptotic behavior of the correlation functions of quantum integrable models from their exact representations. Starting from the form factor expansion of the correlation functions in finite volume, we explain how to reduce the complexity of the computation in the so-called interacting integrable models to the one appearing in free fermion equivalent models. We apply our method to the time-dependent zero-temperature current-current correlation function in the non-linear Schroedinger model and compute the first few terms in its asymptotic expansion. Our result goes beyond the conformal field theory based predictions: in the time-dependent case, other types of excitations than the ones on the Fermi surface contribute to the leading orders of the asymptotics. (orig.)

  3. Long-time and large-distance asymptotic behavior of the current-current correlators in the non-linear Schroedinger model

    International Nuclear Information System (INIS)

    Kozlowski, K.K.; Terras, V.

    2010-12-01

    We present a new method allowing us to derive the long-time and large-distance asymptotic behavior of the correlation functions of quantum integrable models from their exact representations. Starting from the form factor expansion of the correlation functions in finite volume, we explain how to reduce the complexity of the computation in the so-called interacting integrable models to the one appearing in free fermion equivalent models. We apply our method to the time-dependent zero-temperature current-current correlation function in the non-linear Schroedinger model and compute the first few terms in its asymptotic expansion. Our result goes beyond the conformal field theory based predictions: in the time-dependent case, other types of excitations than the ones on the Fermi surface contribute to the leading orders of the asymptotics. (orig.)

  4. Computational Performance of a Parallelized Three-Dimensional High-Order Spectral Element Toolbox

    Science.gov (United States)

    Bosshard, Christoph; Bouffanais, Roland; Clémençon, Christian; Deville, Michel O.; Fiétier, Nicolas; Gruber, Ralf; Kehtari, Sohrab; Keller, Vincent; Latt, Jonas

    In this paper, a comprehensive performance review of an MPI-based high-order three-dimensional spectral element method C++ toolbox is presented. The focus is put on the performance evaluation of several aspects with a particular emphasis on the parallel efficiency. The performance evaluation is analyzed with the help of a time prediction model based on a parameterization of the application and the hardware resources. A tailor-made CFD computation benchmark case is introduced and used to carry out this review, with particular interest in clusters with up to 8192 cores. Some problems in the parallel implementation have been detected and corrected. The theoretical complexities with respect to the number of elements, to the polynomial degree, and to communication needs are correctly reproduced. It is concluded that this type of code has a nearly perfect speed-up on machines with thousands of cores, and is ready to make the step to next-generation petaflop machines.

  5. Architecture and method for a burst buffer using flash technology

    Science.gov (United States)

    Tzelnic, Percy; Faibish, Sorin; Gupta, Uday K.; Bent, John; Grider, Gary Alan; Chen, Hsing-bung

    2016-03-15

    A parallel supercomputing cluster includes compute nodes interconnected in a mesh of data links for executing an MPI job, and solid-state storage nodes each linked to a respective group of the compute nodes for receiving checkpoint data from the respective compute nodes, and magnetic disk storage linked to each of the solid-state storage nodes for asynchronous migration of the checkpoint data from the solid-state storage nodes to the magnetic disk storage. Each solid-state storage node presents a file system interface to the MPI job, and multiple MPI processes of the MPI job write the checkpoint data to a shared file in the solid-state storage in a strided fashion, and the solid-state storage node asynchronously migrates the checkpoint data from the shared file in the solid-state storage to the magnetic disk storage and writes the checkpoint data to the magnetic disk storage in a sequential fashion.
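
    The strided shared-file checkpoint write described in the claim maps naturally onto collective MPI-IO; below is a hedged mpi4py sketch (ours; file name and sizes invented, the burst-buffer migration to disk is out of scope).

```python
# Every MPI process writes its blocks of one shared file in a strided layout
# using collective MPI-IO writes.
# Run with e.g.: mpiexec -n 4 python ckpt.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

block = np.full(1024, rank, dtype=np.float64)   # this rank's checkpoint block
nblocks = 8                                     # blocks per process

fh = MPI.File.Open(comm, "checkpoint.dat",
                   MPI.MODE_CREATE | MPI.MODE_WRONLY)
for i in range(nblocks):
    # Strided layout: block i of rank r lands at offset (i*size + r) * blocksize.
    offset = (i * size + rank) * block.nbytes
    fh.Write_at_all(offset, block)              # collective write
fh.Close()
```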

  6. New Flutter Analysis Technique for Time-Domain Computational Aeroelasticity

    Science.gov (United States)

    Pak, Chan-Gi; Lung, Shun-Fat

    2017-01-01

    A new time-domain approach for computing flutter speed is presented. Based on the time-history result of aeroelastic simulation, the unknown unsteady aerodynamics model is estimated using a system identification technique. The full aeroelastic model is generated via coupling the estimated unsteady aerodynamic model with the known linear structure model. The critical dynamic pressure is computed and used in the subsequent simulation until the convergence of the critical dynamic pressure is achieved. The proposed method is applied to a benchmark cantilevered rectangular wing.

  7. Revisiting Newtonian and Non-Newtonian Fluid Mechanics Using Computer Algebra

    Science.gov (United States)

    Knight, D. G.

    2006-01-01

    This article illustrates how a computer algebra system, such as Maple[R], can assist in the study of theoretical fluid mechanics, for both Newtonian and non-Newtonian fluids. The continuity equation, the stress equations of motion, the Navier-Stokes equations, and various constitutive equations are treated, using a full, but straightforward,…

  8. Project Energise: Using participatory approaches and real time computer prompts to reduce occupational sitting and increase work time physical activity in office workers.

    Science.gov (United States)

    Gilson, Nicholas D; Ng, Norman; Pavey, Toby G; Ryde, Gemma C; Straker, Leon; Brown, Wendy J

    2016-11-01

    This efficacy study assessed the added impact real time computer prompts had on a participatory approach to reduce occupational sedentary exposure and increase physical activity. Quasi-experimental. 57 Australian office workers (mean [SD]; age=47 [11] years; BMI=28 [5] kg/m2; 46 men) generated a menu of 20 occupational 'sit less and move more' strategies through participatory workshops, and were then tasked with implementing strategies for five months (July-November 2014). During implementation, a sub-sample of workers (n=24) used a chair sensor/software package (Sitting Pad) that gave real time prompts to interrupt desk sitting. Baseline and intervention sedentary behaviour and physical activity (GENEActiv accelerometer; mean work time percentages), and minutes spent sitting at desks (Sitting Pad; mean total time and longest bout) were compared between non-prompt and prompt workers using a two-way ANOVA. Workers spent close to three quarters of their work time sedentary, mostly sitting at desks (mean [SD]; total desk sitting time=371 [71] min/day; longest bout spent desk sitting=104 [43] min/day). Intervention effects were four times greater in workers who used real time computer prompts (8% decrease in work time sedentary behaviour and increase in light intensity physical activity). Real time computer prompts facilitated the impact of a participatory approach on reductions in occupational sedentary exposure, and increases in physical activity. Copyright © 2016 Sports Medicine Australia. Published by Elsevier Ltd. All rights reserved.

  9. Proposal for nanoscale cascaded plasmonic majority gates for non-Boolean computation.

    Science.gov (United States)

    Dutta, Sourav; Zografos, Odysseas; Gurunarayanan, Surya; Radu, Iuliana; Soree, Bart; Catthoor, Francky; Naeemi, Azad

    2017-12-19

    Surface-plasmon-polariton waves propagating at the interface between a metal and a dielectric hold the key to future high-bandwidth, dense on-chip integrated logic circuits overcoming the diffraction limitation of photonics. While recent advances in plasmonic logic have witnessed the demonstration of basic and universal logic gates, these CMOS-oriented digital logic gates cannot fully utilize the expressive power of this novel technology. Here, we aim at unraveling the true potential of plasmonics by exploiting an enhanced native functionality - the majority voter. Contrary to the state-of-the-art plasmonic logic devices, we use the phase of the wave instead of the intensity as the state or computational variable. We propose and demonstrate, via numerical simulations, a comprehensive scheme for building a nanoscale cascadable plasmonic majority logic gate along with a novel referencing scheme that can directly translate the information encoded in the amplitude and phase of the wave into electric field intensity at the output. Our MIM-based 3-input majority gate displays a highly improved overall area of only 0.636 μm² for a single stage compared with previous works on plasmonic logic. The proposed device demonstrates non-Boolean computational capability and can find direct utility in highly parallel real-time signal processing applications like pattern recognition.
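
    Using phase rather than intensity as the computational variable is what makes a majority gate "native": the sign of a three-wave interference sum is the majority. A toy numpy check of this encoding (ours, no electromagnetics involved):

```python
# Encode bits as wave phases 0/pi, superpose three unit amplitudes, and read
# the majority off the phase of the interference sum (sum is +-1 or +-3, so
# its sign is always well defined).
import numpy as np
from itertools import product

def majority_by_interference(a, b, c):
    s = sum(np.exp(1j * np.pi * bit) for bit in (a, b, c))
    return 0 if s.real > 0 else 1      # output phase 0 -> bit 0, pi -> bit 1

for bits in product((0, 1), repeat=3):
    assert majority_by_interference(*bits) == (sum(bits) >= 2)
print("3-input majority reproduced for all 8 phase-encoded inputs")
```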

  10. Distributed computing for real-time petroleum reservoir monitoring

    Energy Technology Data Exchange (ETDEWEB)

    Ayodele, O. R. [University of Alberta, Edmonton, AB (Canada)

    2004-05-01

    Computer software architecture is presented to illustrate how the concept of distributed computing can be applied to real-time reservoir monitoring processes, permitting the continuous monitoring of the dynamic behaviour of petroleum reservoirs at much shorter intervals. The paper describes the fundamental technologies driving distributed computing, namely Java 2 Platform Enterprise edition (J2EE) by Sun Microsystems, and the Microsoft Dot-Net (Microsoft.Net) initiative, and explains the challenges involved in distributed computing. These are: (1) availability of permanently placed downhole equipment to acquire and transmit seismic data; (2) availability of high bandwidth to transmit the data; (3) security considerations; (4) adaptation of existing legacy codes to run on networks as downloads on demand; and (5) credibility issues concerning data security over the Internet. Other applications of distributed computing in the petroleum industry are also considered, specifically MWD, LWD and SWD (measurement-while-drilling, logging-while-drilling, and simulation-while-drilling), and drill-string vibration monitoring. 23 refs., 1 fig.

  11. Computational modeling of pitching cylinder-type ocean wave energy converters using 3D MPI-parallel simulations

    Science.gov (United States)

    Freniere, Cole; Pathak, Ashish; Raessi, Mehdi

    2016-11-01

    Ocean Wave Energy Converters (WECs) are devices that convert energy from ocean waves into electricity. To aid in the design of WECs, an advanced computational framework has been developed which has advantages over conventional methods. The computational framework simulates the performance of WECs in a virtual wave tank by solving the full Navier-Stokes equations in 3D, capturing the fluid-structure interaction, nonlinear and viscous effects. In this work, we present simulations of the performance of pitching cylinder-type WECs and compare against experimental data. WECs are simulated at both model and full scales. The results are used to determine the role of the Keulegan-Carpenter (KC) number. The KC number is representative of viscous drag behavior on a bluff body in an oscillating flow, and is considered an important indicator of the dynamics of a WEC. Studying the effects of the KC number is important for determining the validity of the Froude scaling and the inviscid potential flow theory, which are heavily relied on in the conventional approaches to modeling WECs. Support from the National Science Foundation is gratefully acknowledged.
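
    For reference, the Keulegan-Carpenter number is KC = U_m T / D (velocity amplitude, oscillation period, cylinder diameter). A small sketch (values invented) shows why it matters for the scaling question raised above: Froude scaling takes U and T to scale with the square root of the length ratio and D linearly, so KC is unchanged, while the Reynolds number grows like the length ratio to the 3/2 power, so viscous effects do not scale along.

```python
# Keulegan-Carpenter number at model and full scale under Froude scaling.
import math

def kc(u_m, period, diameter):
    return u_m * period / diameter

s = 50.0                                  # full-scale / model-scale length ratio
model = kc(0.3, 1.5, 0.1)                 # invented model-scale WEC values
full = kc(0.3 * math.sqrt(s), 1.5 * math.sqrt(s), 0.1 * s)
print(f"KC model = {model:.2f}, KC full scale = {full:.2f}")  # identical
```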

  12. Real Time Animation of Trees Based on BBSC in Computer Games

    Directory of Open Access Journals (Sweden)

    Xuefeng Ao

    2009-01-01

    Full Text Available That researchers in the field of computer games usually find it is difficult to simulate the motion of actual 3D model trees lies in the fact that the tree model itself has very complicated structure, and many sophisticated factors need to be considered during the simulation. Though there are some works on simulating 3D tree and its motion, few of them are used in computer games due to the high demand for real-time in computer games. In this paper, an approach of animating trees in computer games based on a novel tree model representation—Ball B-Spline Curves (BBSCs are proposed. By taking advantage of the good features of the BBSC-based model, physical simulation of the motion of leafless trees with wind blowing becomes easier and more efficient. The method can generate realistic 3D tree animation in real-time, which meets the high requirement for real time in computer games.

  13. MUSIDH, multiple use of simulated demographic histories, a novel method to reduce computation time in microsimulation models of infectious diseases.

    Science.gov (United States)

    Fischer, E A J; De Vlas, S J; Richardus, J H; Habbema, J D F

    2008-09-01

    Microsimulation of infectious diseases requires simulation of many life histories of interacting individuals. In particular, relatively rare infections such as leprosy need to be studied in very large populations. Computation time increases disproportionally with the size of the simulated population. We present a novel method, MUSIDH, an acronym for multiple use of simulated demographic histories, to reduce computation time. Demographic history refers to the processes of birth, death and all other demographic events that should be unrelated to the natural course of an infection; the approach thus applies to non-fatal infections. MUSIDH attaches a fixed number of infection histories to each demographic history, and these infection histories interact as if being the infection history of separate individuals. With two examples, mumps and leprosy, we show that the method can give a factor 50 reduction in computation time at the cost of a small loss in precision. The largest reductions are obtained for rare infections with complex demographic histories.
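
    Schematically (all distributions invented), MUSIDH amounts to simulating each expensive demographic history once and reusing it for K independent infection histories:

```python
# Each demographic history (birth, death) is generated once, then K infection
# histories are attached to it as if they belonged to K separate individuals.
import random

def simulate_demography(rng):
    birth = rng.uniform(0, 50)
    return birth, birth + rng.expovariate(1 / 70)   # (birth, death)

def simulate_infection(demography, rng):
    birth, death = demography
    t = birth + rng.expovariate(1 / 30)             # candidate infection time
    return t if t < death else None                 # non-fatal infection, or none

rng = random.Random(42)
K = 50                                              # infection histories per demography
infections = 0
for _ in range(1000):                               # 1000 demographic histories...
    demo = simulate_demography(rng)                 # ...each simulated once
    for _ in range(K):                              # ...and reused K times
        infections += simulate_infection(demo, rng) is not None
print(f"{infections} infections from 50000 effective individuals")
```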

  14. Computer network time synchronization the network time protocol on earth and in space

    CERN Document Server

    Mills, David L

    2010-01-01

    Carefully coordinated, reliable, and accurate time synchronization is vital to a wide spectrum of fields-from air and ground traffic control, to buying and selling goods and services, to TV network programming. Ill-gotten time could even lead to the unimaginable and cause DNS caches to expire, leaving the entire Internet to implode on the root servers.Written by the original developer of the Network Time Protocol (NTP), Computer Network Time Synchronization: The Network Time Protocol on Earth and in Space, Second Edition addresses the technological infrastructure of time dissemination, distrib

  15. Comparison of neuronal spike exchange methods on a Blue Gene/P supercomputer

    Directory of Open Access Journals (Sweden)

    Michael eHines

    2011-11-01

    The performance of several spike exchange methods using a Blue Gene/P supercomputer has been tested with 8K to 128K cores using randomly connected networks of up to 32M cells with 1k connections per cell and 4M cells with 10k connections per cell. The spike exchange methods used are the standard Message Passing Interface collective, MPI_Allgather, and several variants of the non-blocking multisend method either implemented via non-blocking MPI_Isend, or exploiting the possibility of very low overhead direct memory access communication available on the Blue Gene/P. In all cases the worst performing method was that using MPI_Isend due to the high overhead of initiating a spike communication. The two best performing methods --- the persistent multisend method using the Record-Replay feature of the Deep Computing Messaging Framework (DCMF_Multicast), and a two-phase multisend in which a DCMF_Multicast is used to first send to a subset of phase-1 destination cores which then pass it on to their subset of phase-2 destination cores --- had similar performance with very low overhead for the initiation of spike communication. Departure from ideal scaling for the multisend methods is almost completely due to load imbalance caused by the large variation in the number of cells that fire on each processor in the interval between synchronizations. Spike exchange time itself is negligible since transmission overlaps with computation and is handled by a direct memory access controller. We conclude that ideal performance scaling will be ultimately limited by imbalance between incoming processor spikes between synchronization intervals. Thus, counterintuitively, maximization of load balance requires that the distribution of cells on processors should not reflect neural net architecture but be randomly distributed so that sets of cells which are burst firing together should be on different processors with their targets on as large a set of processors as possible.
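
    The baseline in this comparison, an all-gather of variable-length spike lists, looks as follows in a hedged mpi4py sketch (ours; the multisend/DMA variants are Blue Gene-specific and not reproduced here).

```python
# Each rank gathers every other rank's spike times for the interval with a
# variable-length all-gather: first the counts, then the payload.
# Run with e.g.: mpiexec -n 4 python spikes.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

rng = np.random.default_rng(rank)
my_spikes = rng.uniform(0.0, 1.0, rng.integers(0, 10))  # spike times this interval

counts = np.array(comm.allgather(len(my_spikes)))
displs = np.insert(np.cumsum(counts), 0, 0)[:-1]
all_spikes = np.empty(counts.sum())
comm.Allgatherv(my_spikes, [all_spikes, counts, displs, MPI.DOUBLE])
if rank == 0:
    print(f"gathered {len(all_spikes)} spikes from {size} ranks")
```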

  16. Computer/Mobile Device Screen Time of Children and Their Eye Care Behavior: The Roles of Risk Perception and Parenting.

    Science.gov (United States)

    Chang, Fong-Ching; Chiu, Chiung-Hui; Chen, Ping-Hung; Miao, Nae-Fang; Chiang, Jeng-Tung; Chuang, Hung-Yi

    2018-03-01

    This study assessed the computer/mobile device screen time and eye care behavior of children and examined the roles of risk perception and parental practices. Data were obtained from a sample of 2,454 child-parent dyads recruited from 30 primary schools in Taipei city and New Taipei city, Taiwan, in 2016. Self-administered questionnaires were collected from students and parents. Fifth-grade students spend more time on new media (computer/smartphone/tablet: 16 hours a week) than on traditional media (television: 10 hours a week). The average daily screen time (3.5 hours) for these children exceeded the American Academy of Pediatrics recommendations (≤2 hours). Multivariate analysis results showed that after controlling for demographic factors, the parents with higher levels of risk perception and parental efficacy were more likely to mediate their child's eye care behavior. Children who reported lower academic performance, who were from non-intact families, reported lower levels of risk perception of mobile device use, had parents who spent more time using computers and mobile devices, and had lower levels of parental mediation were more likely to spend more time using computers and mobile devices; whereas children who reported higher academic performance, higher levels of risk perception, and higher levels of parental mediation were more likely to engage in higher levels of eye care behavior. Risk perception by children and parental practices are associated with the amount of screen time that children regularly engage in and their level of eye care behavior.

  17. Non-Hermitian photonics based on parity-time symmetry

    Science.gov (United States)

    Feng, Liang; El-Ganainy, Ramy; Ge, Li

    2017-12-01

    Nearly one century after the birth of quantum mechanics, parity-time symmetry is revolutionizing and extending quantum theories to include a unique family of non-Hermitian Hamiltonians. While conceptually striking, experimental demonstration of parity-time symmetry remains unexplored in quantum electronic systems. The flexibility of photonics allows for creating and superposing non-Hermitian eigenstates with ease using optical gain and loss, which makes it an ideal platform to explore various non-Hermitian quantum symmetry paradigms for novel device functionalities. Such explorations that employ classical photonic platforms not only deepen our understanding of fundamental quantum physics but also facilitate technological breakthroughs for photonic applications. Research into non-Hermitian photonics therefore advances and benefits both fields simultaneously.
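
    The canonical textbook illustration here is the two-mode gain/loss dimer: a non-Hermitian Hamiltonian whose spectrum stays real until the gain/loss rate exceeds the coupling. A short numpy check of this standard result (parameters invented):

```python
# PT-symmetric dimer H = [[i g, k], [k, -i g]]: eigenvalues are
# +/- sqrt(k^2 - g^2), real for k > g and complex past the PT-breaking
# point k = g, even though H is manifestly non-Hermitian.
import numpy as np

k = 1.0                                   # coupling between the two modes
for g in (0.5, 1.0, 1.5):                 # gain/loss rate
    H = np.array([[1j * g, k], [k, -1j * g]])
    vals = np.linalg.eigvals(H)
    print(f"g = {g}: eigenvalues = {np.round(vals, 3)}")
```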

  18. Investigating the influence of eating habits, body weight and television programme preferences on television viewing time and domestic computer usage.

    Science.gov (United States)

    Raptou, Elena; Papastefanou, Georgios; Mattas, Konstadinos

    2017-01-01

    The present study explored the influence of eating habits, body weight and television programme preference on television viewing time and domestic computer usage, after adjusting for sociodemographic characteristics and home media environment indicators. In addition, potential substitution or complementarity in screen time was investigated. Individual level data were collected via questionnaires that were administered to a random sample of 2,946 Germans. The econometric analysis employed a seemingly unrelated bivariate ordered probit model to conjointly estimate television viewing time and time engaged in domestic computer usage. Television viewing and domestic computer usage represent two independent behaviours in both genders and across all age groups. Dietary habits have a significant impact on television watching with less healthy food choices associated with increasing television viewing time. Body weight is found to be positively correlated with television screen time in both men and women, and overweight individuals have a higher propensity for heavy television viewing. Similar results were obtained for age groups where an increasing body mass index (BMI) in adults over 24 years old is more likely to be positively associated with a higher duration of television watching. With respect to dietary habits of domestic computer users, participants aged over 24 years of both genders seem to adopt more healthy dietary patterns. A downward trend in the BMI of domestic computer users was observed in women and adults aged 25-60 years. On the contrary, young domestic computer users 18-24 years old have a higher body weight than non-users. Television programme preferences also affect television screen time with clear differences to be observed between genders and across different age groups. In order to reduce total screen time, health interventions should target different types of screen viewing audiences separately.

  19. Spatial Air Index Based on Largest Empty Rectangles for Non-Flat Wireless Broadcast in Pervasive Computing

    Directory of Open Access Journals (Sweden)

    Jun-Hong Shen

    2016-11-01

    In pervasive computing, location-based services (LBSs) are valuable for mobile clients based on their current locations. LBSs use spatial window queries to enable useful applications for mobile clients. Based on skewed access patterns of mobile clients, non-flat wireless broadcast has been shown to efficiently disseminate spatial objects to mobile clients. In this paper, we consider a scenario in which spatial objects are broadcast to mobile clients over a wireless channel in a non-flat broadcast manner to process window queries. For such a scenario, we propose an efficient spatial air index method to handle window query access in non-flat wireless broadcast environments. The concept of largest empty rectangles is used to avoid unnecessary examination of the broadcast content, thus reducing the processing time for window queries. Simulation results show that the proposed spatial air index method outperforms the existing methods under various settings.

  20. Modeling of Volatility with Non-linear Time Series Model

    OpenAIRE

    Kim Song Yon; Kim Mun Chol

    2013-01-01

    In this paper, non-linear time series models are used to describe volatility in financial time series data. To describe volatility, two non-linear time series models are combined to form a TAR (Threshold Auto-Regressive) model with an AARCH (Asymmetric Auto-Regressive Conditional Heteroskedasticity) error term, and its parameter estimation is studied.

  1. Development of Computer Program for Analysis of Irregular Non Homogenous Radiation Shielding

    International Nuclear Information System (INIS)

    Bang Rozali; Nina Kusumah; Hendro Tjahjono; Darlis

    2003-01-01

    A computer program for radiation shielding analysis has been developed to calculate radiation attenuation in non-homogeneous radiation shielding with irregular geometry. By specifying the radiation source strength, the geometrical shape of the radiation source, and the location, dimensions and geometrical shape of the radiation shielding, the radiation level at a point at a given position from the radiation source can be calculated. Using the program, radiation distribution results can be obtained for several analysis points simultaneously. (author)
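
    The core of such a calculation is the point-kernel formula: inverse-square geometric spreading times exponential attenuation through the layers crossed by the ray, exp(-Σ μ_i x_i). A hedged sketch (ours; coefficients illustrative, build-up factors omitted):

```python
# Uncollided flux at a point behind layered (non-homogeneous) shielding from
# an isotropic point source: inverse-square law times exponential attenuation.
import math

def point_kernel(source_strength, distance_m, layers):
    """layers: list of (mu [1/cm], thickness [cm]) along the ray."""
    attenuation = math.exp(-sum(mu * x for mu, x in layers))
    return source_strength * attenuation / (4 * math.pi * distance_m**2)

# 1 m from the source, through 5 cm lead-like and 20 cm concrete-like layers.
flux = point_kernel(1e9, 1.0, [(0.77, 5.0), (0.15, 20.0)])
print(f"uncollided flux ~ {flux:.3e} (per m^2 s)")
```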

  2. Current Trends in Numerical Simulation for Parallel Engineering Environments New Directions and Work-in-Progress

    International Nuclear Information System (INIS)

    Trinitis, C; Schulz, M

    2006-01-01

    In today's world, the use of parallel programming and architectures is essential for simulating practical problems in engineering and related disciplines. Remarkable progress in CPU architecture, system scalability, and interconnect technology continues to provide new opportunities, as well as new challenges for both system architects and software developers. These trends are paralleled by progress in parallel algorithms, simulation techniques, and software integration from multiple disciplines. ParSim brings together researchers from both application disciplines and computer science and aims at fostering closer cooperation between these fields. Since its successful introduction in 2002, ParSim has established itself as an integral part of the EuroPVM/MPI conference series. In contrast to traditional conferences, emphasis is put on the presentation of up-to-date results with a short turn-around time. This offers a unique opportunity to present new aspects in this dynamic field and discuss them with a wide, interdisciplinary audience. The EuroPVM/MPI conference series, as one of the prime events in parallel computation, serves as an ideal surrounding for ParSim. This combination enables the participants to present and discuss their work within the scope of both the session and the host conference. This year, eleven papers from authors in nine countries were submitted to ParSim, and we selected five of them. They cover a wide range of different application fields including gas flow simulations, thermo-mechanical processes in nuclear waste storage, and cosmological simulations. At the same time, the selected contributions also address the computer science side of their codes and discuss different parallelization strategies, programming models and languages, as well as the use of non-blocking collective operations in MPI. We are confident that this provides an attractive program and that ParSim will be an informal setting for lively discussions and for fostering new

  3. Soft Real-Time PID Control on a VME Computer

    Science.gov (United States)

    Karayan, Vahag; Sander, Stanley; Cageao, Richard

    2007-01-01

    microPID (uPID) is a computer program for real-time proportional + integral + derivative (PID) control of a translation stage in a Fourier-transform ultraviolet spectrometer. microPID implements a PID control loop over a position profile at a sampling rate of 8 kHz (sampling period 125 microseconds). The software runs in a stripped-down Linux operating system on a VersaModule Eurocard (VME) computer operating with a real-time priority queue, using an embedded controller, a 16-bit digital-to-analog converter (D/A) board, and a laser-positioning board (LPB). microPID consists of three main parts: (1) VME device-driver routines, (2) software that administers a custom protocol for serial communication with a control computer, and (3) a loop section that obtains the current position from an LPB-driver routine, calculates the ideal position from the profile, and calculates a new voltage command by use of an embedded PID routine, all within each sampling period. The voltage command is sent to the D/A board to control the stage. microPID uses special kernel headers to obtain microsecond timing resolution. Inasmuch as microPID implements a single-threaded process and all other processes are disabled, the Linux operating system acts as a soft real-time system.
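
    The loop section described above is a conventional discrete PID update executed once per 125 μs sampling period; below is a generic sketch (ours, with an invented toy plant in place of the real stage and D/A hardware, and invented gains).

```python
# Discrete PID loop at 8 kHz: error -> P + I + D terms -> "voltage command".
dt = 125e-6                      # 8 kHz sampling period
kp, ki, kd = 2.0, 50.0, 0.002    # illustrative PID gains

position, velocity = 0.0, 0.0    # toy translation-stage state
integral, prev_err = 0.0, 0.0

for step in range(8000):         # one second of control
    target = 1.0                 # ideal position from the motion profile
    err = target - position
    integral += err * dt
    derivative = (err - prev_err) / dt
    command = kp * err + ki * integral + kd * derivative   # the "D/A voltage"
    prev_err = err
    # Toy first-order plant: velocity follows the command with time constant 10 ms.
    velocity += (command - velocity) * dt / 0.01
    position += velocity * dt

print(f"position after 1 s: {position:.4f}")   # settles near the 1.0 target
```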

  4. Present and future aspects of PROSA - A computer program for near real time accountancy

    International Nuclear Information System (INIS)

    Beedgen, R.

    1987-01-01

    The methods of near real time accountancy (NRTA) for safeguarding nuclear material have received a lot of attention in recent years. PROSA 1.0 was developed as a computer program to evaluate a sequence of material balance data on the basis of three statistical tests for a selected false alarm probability. A new NRTA test procedure will be included and an option for the calculation of detection probabilities of hypothetical loss patterns will be made available in future releases of PROSA. Under a non-loss assumption, PROSA may also be used for the analysis of facility measurement models

  5. DIRProt: a computational approach for discriminating insecticide resistant proteins from non-resistant proteins.

    Science.gov (United States)

    Meher, Prabina Kumar; Sahu, Tanmaya Kumar; Banchariya, Anjali; Rao, Atmakuri Ramakrishna

    2017-03-24

    Insecticide resistance is a major challenge for the control of insect pests in the fields of crop protection, human and animal health, etc. Resistance to different insecticides is conferred by proteins encoded by certain classes of insect genes. To distinguish insecticide resistant proteins from non-resistant proteins, no computational tool has been available to date. Thus, development of such a computational tool will be helpful in predicting the insecticide resistant proteins, which can be targeted for developing appropriate insecticides. Five different feature sets, viz. amino acid composition (AAC), di-peptide composition (DPC), pseudo amino acid composition (PAAC), composition-transition-distribution (CTD) and auto-correlation function (ACF), were used to map the protein sequences into numeric feature vectors. The encoded numeric vectors were then used as input to a support vector machine (SVM) for classification of insecticide resistant and non-resistant proteins. Higher accuracies were obtained with the RBF kernel than with other kernels. Further, accuracies were observed to be higher for the DPC feature set as compared to others. The proposed approach achieved an overall accuracy of >90% in discriminating resistant from non-resistant proteins. Further, the two classes of resistant proteins, i.e., detoxification-based and target-based, were discriminated from non-resistant proteins with >95% accuracy. Besides, >95% accuracy was also observed for discrimination of proteins involved in detoxification- and target-based resistance mechanisms. The proposed approach not only outperformed Blastp, PSI-Blast and Delta-Blast algorithms, but also achieved >92% accuracy when assessed using an independent dataset of 75 insecticide resistant proteins. This paper presents the first computational approach for discriminating the insecticide resistant proteins from non-resistant proteins. Based on the proposed approach, an online prediction server DIRProt has
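
    The feature/classifier pairing described, for example amino acid composition vectors fed to an RBF-kernel SVM, fits into a few lines of scikit-learn. This is our sketch; the sequences and labels are toy data, not the DIRProt training set.

```python
# AAC features (20-dim residue frequencies) + RBF-kernel SVM classifier.
import numpy as np
from sklearn.svm import SVC

AMINO = "ACDEFGHIKLMNPQRSTVWY"

def aac(seq):
    """Amino acid composition: frequency of each of the 20 residues."""
    return np.array([seq.count(a) for a in AMINO]) / len(seq)

train_seqs = ["MKTAYIAKQR", "GGSGGSGGSG", "MKKLLPTAAA", "PPPGGSSSSG"]
labels = [1, 0, 1, 0]            # 1 = "resistant", 0 = "non-resistant" (toy)

X = np.array([aac(s) for s in train_seqs])
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X, labels)
print(clf.predict([aac("MKTAAIAKQR")]))   # classify an unseen toy sequence
```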

  6. The use of computers for the performance and analysis of non-destructive testing

    International Nuclear Information System (INIS)

    Edelmann, X.; Pfister, O.

    1988-01-01

    Examples of the use of computers in non-destructive testing are presented. Ultrasonic testing is especially addressed. The use of computers brings improvements for the user: the possibility of registering the reflector position, storage of test data, and help with documentation. Tests can be automated. The introduction of expert systems is expected in the future. 8 figs., 12 refs

  7. Analysis of 2D Torus and Hub Topologies of 100Mb/s Ethernet for the Whitney Commodity Computing Testbed

    Science.gov (United States)

    Pedretti, Kevin T.; Fineberg, Samuel A.; Kutler, Paul (Technical Monitor)

    1997-01-01

    A variety of different network technologies and topologies are currently being evaluated as part of the Whitney Project. This paper reports on the implementation and performance of a Fast Ethernet network configured in a 4x4 2D torus topology in a testbed cluster of 'commodity' Pentium Pro PCs. Several benchmarks were used for performance evaluation: an MPI point-to-point message passing benchmark, an MPI collective communication benchmark, and the NAS Parallel Benchmarks version 2.2 (NPB2). Our results show that for point-to-point communication on an unloaded network, the hub and 1-hop routes on the torus have about the same bandwidth and latency. However, the bandwidth decreases and the latency increases on the torus for each additional route hop. Collective communication benchmarks show that the torus provides roughly four times more aggregate bandwidth and eight times faster MPI barrier synchronizations than a hub-based network for 16 processor systems. Finally, the SOAPBOX benchmarks, which simulate real-world CFD applications, generally demonstrated substantially better performance on the torus than on the hub. In the few cases where the hub was faster, the difference was negligible. In total, our experimental results lead to the conclusion that for Fast Ethernet networks, the torus topology has better performance and scales better than a hub-based network.
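
    The point-to-point benchmark style used here is essentially a ping-pong; below is a hedged mpi4py equivalent (ours) that reports one-way latency and bandwidth for a given message size.

```python
# Ping-pong between ranks 0 and 1: time round trips, report one-way figures.
# Run with: mpiexec -n 2 python pingpong.py
import numpy as np
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
nbytes, reps = 1 << 20, 50
buf = np.zeros(nbytes, dtype=np.uint8)

comm.Barrier()
t0 = MPI.Wtime()
for _ in range(reps):
    if rank == 0:
        comm.Send(buf, dest=1); comm.Recv(buf, source=1)
    elif rank == 1:
        comm.Recv(buf, source=0); comm.Send(buf, dest=0)
elapsed = MPI.Wtime() - t0

if rank == 0:
    one_way = elapsed / (2 * reps)
    print(f"one-way time {one_way*1e6:.1f} us, "
          f"bandwidth {nbytes / one_way / 1e6:.1f} MB/s")
```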

  8. The prediction of surface temperature in the new seasonal prediction system based on the MPI-ESM coupled climate model

    Science.gov (United States)

    Baehr, J.; Fröhlich, K.; Botzet, M.; Domeisen, D. I. V.; Kornblueh, L.; Notz, D.; Piontek, R.; Pohlmann, H.; Tietsche, S.; Müller, W. A.

    2015-05-01

    A seasonal forecast system is presented, based on the global coupled climate model MPI-ESM as used for CMIP5 simulations. We describe the initialisation of the system and analyse its predictive skill for surface temperature. The presented system is initialised in the atmospheric, oceanic, and sea ice component of the model from reanalysis/observations with full field nudging in all three components. For the initialisation of the ensemble, bred vectors with a vertically varying norm are implemented in the ocean component to generate initial perturbations. In a set of ensemble hindcast simulations, starting each May and November between 1982 and 2010, we analyse the predictive skill. Bias-corrected ensemble forecasts for each start date reproduce the observed surface temperature anomalies at 2-4 months lead time, particularly in the tropics. Niño3.4 sea surface temperature anomalies show a small root-mean-square error and predictive skill up to 6 months. Away from the tropics, predictive skill is mostly limited to the ocean, and to regions which are strongly influenced by ENSO teleconnections. In summary, the presented seasonal prediction system based on a coupled climate model shows predictive skill for surface temperature at seasonal time scales comparable to other seasonal prediction systems using different underlying models and initialisation strategies. As the same model underlying our seasonal prediction system—with a different initialisation—is presently also used for decadal predictions, this is an important step towards seamless seasonal-to-decadal climate predictions.

  9. Computing return times or return periods with rare event algorithms

    Science.gov (United States)

    Lestang, Thibault; Ragone, Francesco; Bréhier, Charles-Edouard; Herbert, Corentin; Bouchet, Freddy

    2018-04-01

    The average time between two occurrences of the same event, referred to as its return time (or return period), is a useful statistical concept for practical applications. For instance, insurers or public agencies may be interested in the return time of a 10 m flood of the Seine river in Paris. However, due to their scarcity, reliably estimating return times for rare events is very difficult using either observational data or direct numerical simulations. For rare events, an estimator for return times can be built from the extrema of the observable on trajectory blocks. Here, we show that this estimator can be improved to remain accurate for return times of the order of the block size. More importantly, we show that this approach can be generalised to estimate return times from numerical algorithms specifically designed to sample rare events. So far those algorithms often compute probabilities, rather than return times. The approach we propose provides a computationally extremely efficient way to estimate numerically the return times of rare events for a dynamical system, saving several orders of magnitude in computational cost. We illustrate the method on two kinds of observables, instantaneous and time-averaged, using two different rare event algorithms, for a simple stochastic process, the Ornstein–Uhlenbeck process. As an example of realistic applications to complex systems, we finally discuss extreme values of the drag on an object in a turbulent flow.
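
    The block-extremum estimator mentioned above is easy to state: split a long trajectory into M blocks of length T_b, let q(a) be the fraction of blocks whose maximum exceeds the level a, and estimate the return time as r(a) = -T_b / ln(1 - q(a)). A sketch on a simulated Ornstein-Uhlenbeck process (ours; parameters and levels are illustrative, and the refinements of the paper are not reproduced):

```python
# Block-maximum return-time estimator on an Euler-Maruyama OU trajectory,
# dX = -X dt + sqrt(2) dW (stationary variance 1).
import numpy as np

rng = np.random.default_rng(0)
dt, n_steps = 1e-2, 1_000_000
x = np.empty(n_steps)
x[0] = 0.0
noise = rng.normal(0.0, np.sqrt(2 * dt), n_steps - 1)
for i in range(n_steps - 1):
    x[i + 1] = x[i] - x[i] * dt + noise[i]

block_len = 10_000                                # T_b = 100 time units
maxima = x.reshape(-1, block_len).max(axis=1)     # one maximum per block
t_b = block_len * dt

for a in (3.0, 3.5, 4.0):
    q = np.mean(maxima > a)                       # fraction of exceeding blocks
    if 0 < q < 1:
        print(f"a = {a}: return time ~ {-t_b / np.log1p(-q):.0f}")
```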

  10. Real-time non-rigid target tracking for ultrasound-guided clinical interventions

    Science.gov (United States)

    Zachiu, C.; Ries, M.; Ramaekers, P.; Guey, J.-L.; Moonen, C. T. W.; de Senneville, B. Denis

    2017-10-01

    Biological motion is a problem for non- or mini-invasive interventions in mobile/deformable organs, since the targeted pathology moves and deforms with the organ. This may lead to high miss rates and/or incomplete treatment of the pathology. Therefore, real-time tracking of the target anatomy during the intervention would be beneficial for such applications. Since the aforementioned interventions are often conducted under B-mode ultrasound (US) guidance, target tracking can be achieved via image registration, by comparing the acquired US images to a separate image established as positional reference. However, such US images are intrinsically altered by speckle noise, which introduces incoherent gray-level intensity variations. This may prove problematic for existing intensity-based registration methods. In the current study we address US-based target tracking by employing the recently proposed EVolution registration algorithm. The method is, by construction, robust to transient gray-level intensities. Instead of directly matching image intensities, EVolution aligns similar contrast patterns in the images. Moreover, the displacement is computed by evaluating a matching criterion for image sub-regions rather than on a point-by-point basis, which typically provides more robust motion estimates. However, unlike similar previously published approaches, which assume rigid displacements in the image sub-regions, the EVolution algorithm integrates the matching criterion in a global functional, allowing the estimation of an elastic dense deformation. The approach was validated for soft tissue tracking under free-breathing conditions on the abdomen of seven healthy volunteers. Contact echography was performed on all volunteers, while three of the volunteers also underwent standoff echography. Each of the two modalities is predominantly specific to a particular type of non- or mini-invasive clinical intervention. The method demonstrated on average an accuracy of

  11. Near real-time digital holographic microscope based on GPU parallel computing

    Science.gov (United States)

    Zhu, Gang; Zhao, Zhixiong; Wang, Huarui; Yang, Yan

    2018-01-01

    A transmission near real-time digital holographic microscope with in-line and off-axis light paths is presented, in which parallel computing based on the compute unified device architecture (CUDA) is combined with digital holographic microscopy. Compared to other holographic microscopes, which have to perform reconstruction in multiple focal planes and are therefore time-consuming, the reconstruction speed of the near real-time digital holographic microscope can be greatly improved with CUDA-based parallel computing, making it especially suitable for measurements of particle fields at the micrometer and nanometer scale. Simulations and experiments show that the proposed transmission digital holographic microscope can accurately measure and display the velocity of a particle field at the micrometer scale, with an average velocity error below 10%. With a graphics processing unit (GPU), the computing time for 100 reconstruction planes (512×512 grids) is under 120 ms, versus 4.9 s for the traditional CPU-based reconstruction method; the reconstruction speed is thus raised by a factor of 40. In other words, the system can handle holograms at 8.3 frames per second, realizing near real-time measurement and display of the particle velocity field. Real-time three-dimensional reconstruction of the particle velocity field is expected to be achieved through further optimization of software and hardware. Keywords: digital holographic microscope,
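
    As an illustration of the per-plane workload that a CUDA implementation batches across focal depths, here is a hedged numpy sketch of angular-spectrum reconstruction for a single plane; the wavelength, pixel pitch and random placeholder hologram are assumptions, and the instrument's actual reconstruction kernel may differ.

    ```python
    import numpy as np

    def angular_spectrum(hologram, z, wavelength=532e-9, pitch=3.45e-6):
        # propagate the recorded field by distance z via the angular spectrum
        n = hologram.shape[0]
        f = np.fft.fftfreq(n, d=pitch)                 # spatial frequencies
        fx, fy = np.meshgrid(f, f)
        arg = 1.0 / wavelength**2 - fx**2 - fy**2
        kernel = np.exp(2j * np.pi * z * np.sqrt(np.maximum(arg, 0.0)))
        kernel[arg < 0] = 0.0                          # drop evanescent waves
        return np.fft.ifft2(np.fft.fft2(hologram) * kernel)

    holo = np.random.rand(512, 512)                    # placeholder hologram
    planes = [angular_spectrum(holo, z) for z in np.linspace(1e-3, 5e-3, 100)]
    ```

    Each of the 100 planes is an independent pair of FFTs plus an element-wise multiply, which is why the workload parallelises so well on a GPU.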

  12. Diagnostic performance of combined noninvasive coronary angiography and myocardial perfusion imaging using 320 row detector computed tomography

    DEFF Research Database (Denmark)

    Vavere, Andrea L; Simon, Gregory G; George, Richard T

    2013-01-01

    Multidetector coronary computed tomography angiography (CTA) is a promising modality for widespread clinical application because of its noninvasive nature and high diagnostic accuracy as found in previous studies using 64 to 320 simultaneous detector rows. It is, however, limited in its ability...... to detect myocardial ischemia. In this article, we describe the design of the CORE320 study ("Combined coronary atherosclerosis and myocardial perfusion evaluation using 320 detector row computed tomography"). This prospective, multicenter, multinational study is unique in that it is designed to assess...... the diagnostic performance of combined 320-row CTA and myocardial CT perfusion imaging (CTP) in comparison with the combination of invasive coronary angiography and single-photon emission computed tomography myocardial perfusion imaging (SPECT-MPI). The trial is being performed at 16 medical centers located in 8...

  13. Increased control and data acquisition capabilities via microprocessor-based timed reading and time plot CAMAC modules

    International Nuclear Information System (INIS)

    Barsotti, E.J.; Purvis, D.M.; Loveless, R.L.; Hance, R.D.

    1977-01-01

    By implementing a microprocessor-based CAMAC module capable of being programmed to function as a time-plot or timed-reading controller, the capabilities of the experimental area serial CAMAC control and data acquisition system at Fermilab have been extensively increased. These modules provide real-time data gathering and pre-processing functions synchronized to the main accelerator cycle clock while adding only a minimal amount to the host computer's CPU time and memory requirements. Critical data requiring a fast system response can be read by the host computer immediately following the request for this data. The vast majority of data, being non-critical, can be read via a block transfer during a non-busy time in the main accelerator cycle. Each of Fermilab's experimental areas, Meson, Neutrino and Proton, is controlled primarily by a Lockheed MAC-16 computer. Each of these three minicomputers is linked to a larger Digital Equipment Corporation PDP-11/50 computer. The PDP-11 computers are used primarily for data analysis and reduction. Presently two PDP-11s are linked to the three MAC-16 computers.

  14. Computer tomographic evaluation of digestive tract non-Hodgkin lymphomas.

    Science.gov (United States)

    Lupescu, Ioana G; Grasu, Mugur; Goldis, Gheorghe; Popa, Gelu; Gheorghe, Cristian; Vasilescu, Catalin; Moicean, Andreea; Herlea, Vlad; Georgescu, Serban A

    2007-09-01

    Computed tomographic (CT) study is crucial for defining the distribution, characteristics and staging of primary gastrointestinal lymphomas. The presence of multifocal sites and wall thickening with diffuse infiltration of the affected gastrointestinal (GI) segment, in association with regional adenopathies, orients the CT diagnosis toward primary GI lymphoma. The gold standard for diagnosis remains, in all cases of digestive tract non-Hodgkin lymphoma (NHL), the histological examination, which allows a tissue diagnosis and is performed preferably by transmural biopsy.

  15. Mean first-passage times in confined media: from Markovian to non-Markovian processes

    International Nuclear Information System (INIS)

    Bénichou, O; Voituriez, R; Guérin, T

    2015-01-01

    We review recent theoretical works that enable the accurate evaluation of the mean first-passage time (MFPT) of a random walker to a target in confinement for Markovian (memory-less) and non-Markovian walkers. For the Markovian problem, we present a general theory which allows one to accurately evaluate the MFPT and its extensions to related first-passage observables such as splitting probabilities and occupation times. We show that this analytical approach provides a universal scaling dependence of the MFPT on both the volume of the confining domain and the source-target distance in the case of general scale-invariant processes. This analysis is applicable to a broad range of stochastic processes characterized by length-scale-invariant properties, and reveals the key role that can be played by the starting position of the random walker. We then present an extension to non-Markovian walks by taking the specific example of a tagged monomer of a polymer chain looking for a target in confinement. We show that the MFPT can be calculated accurately by computing the distribution of the positions of all the monomers in the chain at the instant of reaction. Such a theory can be used to derive asymptotic relations that generalize the scaling dependence on the volume and the initial distance to the target derived for Markovian walks. Finally, we present an application of this theory to the problem of the first contact time between the two ends of a polymer chain, and review the various theoretical approaches to this non-Markovian problem. (topical review)
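
    A Monte Carlo caricature of the Markovian problem helps fix ideas: walkers on a confined lattice, averaged over first-passage times to a fixed target. This is purely an illustrative sketch (the lattice size, start point and periodic confinement are assumptions); the review derives analytical scaling laws rather than simulating.

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    L, target, start, n_walkers = 20, (0, 0), (5, 0), 2000
    moves = np.array([(1, 0), (-1, 0), (0, 1), (0, -1)])

    def first_passage_time():
        # one walker on an L x L periodic (confined) lattice
        pos = np.array(start)
        t = 0
        while tuple(pos) != target:
            pos = (pos + moves[rng.integers(4)]) % L
            t += 1
        return t

    mfpt = np.mean([first_passage_time() for _ in range(n_walkers)])
    # for 2D scale-invariant walks the MFPT grows with the domain volume
    # and logarithmically with the source-target distance
    print(f"MFPT ~ {mfpt:.0f} steps")
    ```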

  16. Distributed Memory Parallel Computing with SEAWAT

    Science.gov (United States)

    Verkaik, J.; Huizer, S.; van Engelen, J.; Oude Essink, G.; Ram, R.; Vuik, K.

    2017-12-01

    Fresh groundwater reserves in coastal aquifers are threatened by sea-level rise, extreme weather conditions, and increasing urbanization with its associated groundwater extraction rates. To counteract these threats, accurate high-resolution numerical models are required to optimize the management of these precious reserves. The major model drawbacks are long run times and large memory requirements, which limit the predictive power of these models. Distributed memory parallel computing is an efficient technique for reducing run times and memory requirements, whereby the problem is divided over multiple processor cores. A new Parallel Krylov Solver (PKS) for SEAWAT is presented. PKS has recently been applied to MODFLOW and includes Conjugate Gradient (CG) and Biconjugate Gradient Stabilized (BiCGSTAB) linear accelerators. Both accelerators are preconditioned by an overlapping additive Schwarz preconditioner in such a way that: a) subdomains are partitioned using Recursive Coordinate Bisection (RCB) load balancing, b) each subdomain uses local memory only and communicates with other subdomains by Message Passing Interface (MPI) within the linear accelerator, and c) the solver is fully integrated into SEAWAT. Within SEAWAT, the PKS-CG solver replaces the Preconditioned Conjugate Gradient (PCG) solver for solving the variable-density groundwater flow equation, and the PKS-BiCGSTAB solver replaces the Generalized Conjugate Gradient (GCG) solver for solving the advection-diffusion equation. PKS supports the third-order Total Variation Diminishing (TVD) scheme for computing advection. Benchmarks were performed on the Dutch national supercomputer (https://userinfo.surfsara.nl/systems/cartesius) using up to 128 cores, for a synthetic 3D Henry model (100 million cells) and the real-life Sand Engine model (~10 million cells). The Sand Engine model was used to investigate the potential effect of the long-term morphological evolution of a large sand replenishment and climate change on fresh groundwater resources.
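
    The communication pattern inside such a domain decomposition can be sketched compactly. The mpi4py code below is a hedged illustration of a 1-D strip partition with ghost-cell (halo) exchange between neighbouring subdomains, the kind of step performed inside each Krylov iteration; it is a generic pattern with an assumed subdomain size, not the PKS implementation.

    ```python
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
    n_local = 1000                                    # interior cells per subdomain
    u = np.full(n_local + 2, rank, dtype=np.float64)  # +2 ghost cells

    left = rank - 1 if rank > 0 else MPI.PROC_NULL
    right = rank + 1 if rank < size - 1 else MPI.PROC_NULL

    # exchange boundary values with both neighbours; Sendrecv pairs the
    # operations so the exchange cannot deadlock
    comm.Sendrecv(u[1:2], dest=left, recvbuf=u[-1:], source=right)
    comm.Sendrecv(u[-2:-1], dest=right, recvbuf=u[0:1], source=left)
    ```

    Launched with, e.g., mpirun -n 4 python halo.py, each rank exchanges one boundary cell with each neighbour before the next local matrix-vector product.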

  17. Time reversibility, computer simulation, algorithms, chaos

    CERN Document Server

    Hoover, William Graham

    2012-01-01

    A small army of physicists, chemists, mathematicians, and engineers has joined forces to attack a classic problem, the "reversibility paradox", with modern tools. This book describes their work from the perspective of computer simulation, emphasizing the author's approach to the problem of understanding the compatibility, and even inevitability, of the irreversible second law of thermodynamics with an underlying time-reversible mechanics. Computer simulation has made it possible to probe reversibility from a variety of directions and "chaos theory" or "nonlinear dynamics" has supplied a useful vocabulary and a set of concepts, which allow a fuller explanation of irreversibility than that available to Boltzmann or to Green, Kubo and Onsager. Clear illustration of concepts is emphasized throughout, and reinforced with a glossary of technical terms from the specialized fields which have been combined here to focus on a common theme. The book begins with a discussion, contrasting the idealized reversibility of ba...

  18. Large-distance and long-time asymptotic behavior of the reduced density matrix in the non-linear Schroedinger model

    Energy Technology Data Exchange (ETDEWEB)

    Kozlowski, K.K.

    2010-12-15

    Starting from the form factor expansion in finite volume, we derive the multidimensional generalization of the so-called Natte series for the zero-temperature, time- and distance-dependent reduced density matrix in the non-linear Schroedinger model. This representation allows one to read off straightforwardly the long-time/large-distance asymptotic behavior of this correlator. Our method of analysis reduces the complexity of computing the asymptotic behavior of correlation functions in so-called interacting integrable models to that appearing in free-fermion-equivalent models. We compute explicitly the first few terms appearing in the asymptotic expansion. Part of these terms stems from excitations lying away from the Fermi boundary, and hence goes beyond what can be obtained using CFT/Luttinger-liquid based predictions. (orig.)

  19. Computational time-resolved and resonant x-ray scattering of strongly correlated materials

    Energy Technology Data Exchange (ETDEWEB)

    Bansil, Arun [Northeastern Univ., Boston, MA (United States)

    2016-11-09

    Basic Energy Sciences of the Department of Energy (BES/DOE) has made large investments in x-ray sources in the U.S. (NSLS-II, LCLS, NGLS, ALS, APS) as powerful enabling tools for opening up unprecedented new opportunities for exploring properties of matter at various length and time scales. The coming online of pulsed photon sources literally allows us to see and follow the dynamics of processes in materials at their natural timescales. There is therefore an urgent need to develop theoretical methodologies and computational models for understanding how x-rays interact with matter and the related spectroscopies of materials. The present project addressed aspects of this grand challenge of x-ray science. In particular, our Collaborative Research Team (CRT) focused on developing viable computational schemes for modeling x-ray scattering and photoemission spectra of strongly correlated materials in the time domain. The vast arsenal of formal/numerical techniques and approaches encompassed by the members of our CRT was brought to bear, through appropriate generalizations and extensions, to model the pumped state and the dynamics of this non-equilibrium state, and how it can be probed via x-ray absorption (XAS), emission (XES), resonant and non-resonant x-ray scattering, and photoemission processes. We explored the conceptual connections between time-domain problems and other second-order spectroscopies, such as resonant inelastic x-ray scattering (RIXS), because RIXS may be effectively thought of as a pump-probe experiment in which the incoming photon acts as the pump and the fluorescent decay is the probe. Alternatively, when the core-valence interactions are strong, one can view K-edge RIXS, for example, as the dynamic response of the material to the transient presence of a strong core-hole potential. Unlike an actual pump-probe experiment, here there is no mechanism for adjusting the time delay between the pump and the probe. However, the core hole

  20. Comparison of the Amount of Time Spent on Computer Games and Aggressive Behavior in Male Middle School Students of Tehran

    Directory of Open Access Journals (Sweden)

    Mehrangiz Shoaa Kazemi

    2016-12-01

    Background and Objectives: Modern technologies have a prominent role in adolescents' daily lives. These technologies carry specific cultural and moral patterns, which can strongly affect adolescents. This research aimed to compare the amount of time spent on computer games and aggressive behavior in male middle school students of Tehran. Materials and Methods: This study had a descriptive design. The study population included all male middle school students of Tehran, and the sample included 120 male students, of whom 60 were dependent on computer games with aggressive behavior and 60 were non-dependent on computer games with normal behavior; the sample was randomly selected from the regions of Tehran (south, north, west, and east) with random multi-stage sampling. Data were gathered using questionnaires, including the Aggression Questionnaire (AGQ) and a researcher-made questionnaire consisting of 10 multiple-choice questions that measure the use or non-use of computer games. Data were analyzed using SPSS-19 statistical software, with Pearson correlation and the t test. Results: The results showed a significant relationship between computer gaming and aggressive behavior, and also between the duration of computer game use and aggressive behavior (P < 0.05). Conclusions: According to the results, it seems that children could be kept safe from the adverse effects of computer games by controlling the duration and type of the games they play.

  1. Hard real-time quick EXAFS data acquisition with all open source software on a commodity personal computer

    International Nuclear Information System (INIS)

    So, I.; Siddons, D.P.; Caliebe, W.A.; Khalid, S.

    2007-01-01

    We describe here the data acquisition subsystem of the Quick EXAFS (QEXAFS) experiment at the National Synchrotron Light Source of Brookhaven National Laboratory. For ease of future growth and flexibility, almost all software components are open source with very active maintainers. Among them are Linux running on an x86 desktop computer, RTAI for real-time response, the COMEDI driver for the data acquisition hardware, Qt and PyQt for the graphical user interface, PyQwt for plotting, and Python for scripting. The signal (A/D) and energy-reading (IK220 encoder) devices in the PCI computer are also EPICS enabled. The control system scans the monochromator energy through a networked EPICS motor. With the real-time kernel, the system is capable of a deterministic data-sampling period of tens of microseconds with typical timing jitter of several microseconds. At the same time, Linux runs other non-real-time processes handling the user interface. A modern Qt-based controls frontend enhances productivity. The fast plotting and zooming of data in time or energy coordinates lets the experimenters verify the quality of the data before detailed analysis. Python scripting is built in for automation. The typical data rate for continuous runs is around 10 Mbytes/min.

  2. Response properties of the refractory auditory nerve fiber.

    Science.gov (United States)

    Miller, C A; Abbas, P J; Robinson, B K

    2001-09-01

    The refractory characteristics of auditory nerve fibers limit their ability to accurately encode temporal information. Therefore, they are relevant to the design of cochlear prostheses. It is also possible that the refractory property could be exploited by prosthetic devices to improve information transfer, as refractoriness may enhance the nerve's stochastic properties. Furthermore, refractory data are needed for the development of accurate computational models of auditory nerve fibers. We applied a two-pulse forward-masking paradigm to a feline model of the human auditory nerve to assess refractory properties of single fibers. Each fiber was driven to refractoriness by a single (masker) current pulse delivered intracochlearly. Properties of firing efficiency, latency, jitter, spike amplitude, and relative spread (a measure of dynamic range and stochasticity) were examined by exciting fibers with a second (probe) pulse and systematically varying the masker-probe interval (MPI). Responses to monophasic cathodic current pulses were analyzed. We estimated the mean absolute refractory period to be about 330 µs and the mean recovery time constant to be about 410 µs. A significant proportion of fibers (13 of 34) responded to the probe pulse with MPIs as short as 500 µs. Spike amplitude decreased with decreasing MPI, a finding relevant to the development of computational nerve-fiber models, interpretation of gross evoked potentials, and models of more central neural processing. A small mean decrement in spike jitter was noted at small MPI values. Some trends (such as spike latency vs. MPI) varied across fibers, suggesting that sites of excitation varied across fibers. Relative spread was found to increase with decreasing MPI values, providing direct evidence that stochastic properties of fibers are altered under conditions of refractoriness.

  3. Neighborhood communication paradigm to increase scalability in large-scale dynamic scientific applications

    KAUST Repository

    Ovcharenko, Aleksandr

    2012-03-01

    This paper introduces a general-purpose communication package built on top of MPI which is aimed at improving inter-processor communications independently of the supercomputer architecture being considered. The package is developed to support parallel applications that rely on computation characterized by a large number of messages of various sizes, often small, that are focused within processor neighborhoods. In some cases, such as solvers having static mesh partitions, the number and size of messages are known a priori. However, in other cases, such as mesh adaptation, the messages evolve and vary in number and size and include the dynamic movement of partition objects. The current package provides a utility for dynamic applications based on two key attributes: (i) explicit consideration of the neighborhood communication pattern to avoid many-to-many calls and to reduce the number of collective calls to a minimum, and (ii) use of non-blocking MPI functions along with message packing to manage message flow control and reduce the number and duration of communication calls. The test application demonstrated is parallel unstructured mesh adaptation. Results on IBM Blue Gene/P and Cray XE6 computers show that the use of neighborhood-based communication control leads to scalable results when executing generally imbalanced mesh adaptation runs. © 2011 Elsevier B.V. All rights reserved.
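
    The two key attributes can be made concrete with a small sketch: talk only to a fixed neighbourhood (no many-to-many collectives), pack multiple small messages per neighbour into one buffer, and use non-blocking sends and receives. The mpi4py code below assumes a ring neighbourhood and toy payloads; it illustrates the pattern, not the package's actual API.

    ```python
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
    neighbours = [(rank - 1) % size, (rank + 1) % size]  # assumed ring topology

    # pack several small messages for each neighbour into a single buffer
    outgoing = {n: np.concatenate([np.full(3, rank, dtype=np.int64),
                                   np.arange(3, dtype=np.int64)])
                for n in neighbours}
    incoming = {n: np.empty(6, dtype=np.int64) for n in neighbours}

    # post all non-blocking receives and sends, then wait once: the message
    # flow overlaps, and only the neighbourhood is ever contacted
    reqs = [comm.Irecv(incoming[n], source=n) for n in neighbours]
    reqs += [comm.Isend(outgoing[n], dest=n) for n in neighbours]
    MPI.Request.Waitall(reqs)
    ```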

  4. A Parallel Supercomputer Implementation of a Biological Inspired Neural Network and its use for Pattern Recognition

    International Nuclear Information System (INIS)

    De Ladurantaye, Vincent; Lavoie, Jean; Bergeron, Jocelyn; Parenteau, Maxime; Lu Huizhong; Pichevar, Ramin; Rouat, Jean

    2012-01-01

    A parallel implementation of a large spiking neural network is proposed and evaluated. The neural network implements the binding-by-synchrony process using the Oscillatory Dynamic Link Matcher (ODLM). Scalability, speed and performance are compared for two implementations: Message Passing Interface (MPI) and Compute Unified Device Architecture (CUDA), running on clusters of multicore supercomputers and on NVIDIA graphics processing units, respectively. A global spiking list that represents at each instant the state of the neural network is described. This list indexes each neuron that fires during the current simulation time step, so that the influence of their spikes is simultaneously processed on all computing units. Our implementation shows good scalability for very large networks. A complex and large spiking neural network has been implemented in parallel with success, thus paving the road towards real-life applications based on networks of spiking neurons. MPI offers better scalability than CUDA, while the CUDA implementation on a GeForce GTX 285 gives the best cost-to-performance ratio. When running the neural network on the GTX 285, the processing speed is comparable to the MPI implementation on RQCHP's Mammouth cluster with 64 nodes (128 cores).
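
    The global spiking list admits a compact MPI illustration: each rank owns a slice of neurons and, at every time step, all ranks gather the indices of the neurons that fired. The mpi4py sketch below assumes toy membrane potentials and a simple threshold rule; it mirrors the idea, not the ODLM code.

    ```python
    from mpi4py import MPI
    import numpy as np

    comm = MPI.COMM_WORLD
    rank, size = comm.Get_rank(), comm.Get_size()
    rng = np.random.default_rng(rank)

    n_local = 1000                                   # neurons owned by this rank
    v = rng.random(n_local)                          # toy membrane potentials
    fired = np.flatnonzero(v > 0.95) + rank * n_local  # global indices of spikes

    # variable-length gather: first share the counts, then the spike indices
    counts = np.array(comm.allgather(fired.size))
    spikes = np.empty(counts.sum(), dtype=np.int64)
    comm.Allgatherv(fired.astype(np.int64), (spikes, counts))
    # every rank now holds the full global spiking list for this time step,
    # and can apply the spikes' influence to its local neurons
    ```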

  5. Gender differences in leisure-time versus non-leisure-time physical activity among Saudi adolescents.

    Science.gov (United States)

    Al-Sobayel, Hana; Al-Hazzaa, Hazzaa M; Abahussain, Nanda A; Qahwaji, Dina M; Musaiger, Abdulrahman O

    2015-01-01

    The aim of the study was to examine the gender differences in, and predictors of, leisure versus non-leisure time physical activity among Saudi adolescents aged 14-19 years. A multistage stratified cluster random sampling technique was used. A sample of 1,388 males and 1,500 females enrolled in secondary schools in three major cities in Saudi Arabia was included. Anthropometric measurements were performed and Body Mass Index was calculated. Physical activity, sedentary behaviours and dietary habits were measured using a self-reported validated questionnaire. The total time spent in leisure and non-leisure physical activity per week was 90 and 77 minutes, respectively. Males spent more time per week in leisure-time physical activity than females. Females in private schools spent more time during the week in leisure-time physical activity than females in state schools. There was a significant gender-by-obesity-status interaction in leisure-time physical activity. Gender and other factors predicted the total duration spent in leisure-time and non-leisure-time physical activity. The study showed that female adolescents are much less active than males, especially in leisure-time physical activity. Programmes to promote physical activity among adolescents are urgently needed, with consideration of gender differences.

  6. Real-Time Accumulative Computation Motion Detectors

    Directory of Open Access Journals (Sweden)

    Saturnino Maldonado-Bascón

    2009-12-01

    The neurally inspired accumulative computation (AC) method and its application to motion detection have been introduced in recent years. This paper revisits the fact that many researchers have explored the relationship between neural networks and finite state machines. Indeed, finite state machines constitute the best-characterized computational model, whereas artificial neural networks have become a very successful tool for modeling and problem solving. The article shows how to reach real-time performance by using a model described as a finite state machine. This paper introduces two steps towards that direction: (a) a simplification of the general AC method, performed by formally transforming it into a finite state machine; (b) a hardware implementation in FPGA of such a designed AC module, as well as of an 8-AC motion detector, providing promising performance results. We also offer two case studies of the use of AC motion detectors in surveillance applications, namely infrared-based people segmentation and color-based people tracking, respectively.

  7. Communication: Influence of external static and alternating electric fields on water from long-time non-equilibrium ab initio molecular dynamics

    Science.gov (United States)

    Futera, Zdenek; English, Niall J.

    2017-07-01

    The response of water to externally applied electric fields is of central relevance in the modern world, where extraneous electric fields are ubiquitous. Historically, the application of external fields in non-equilibrium molecular dynamics has been restricted, by and large, to relatively inexpensive, more or less sophisticated empirical models. Here, we report long-time non-equilibrium ab initio molecular dynamics in both static and oscillating (time-dependent) external electric fields, thereby opening up a new vista in rigorous studies of electric-field effects on dynamical systems with the full arsenal of electronic-structure methods. In so doing, we apply this to liquid water with state-of-the-art non-local treatment of dispersion, and we compute a range of field effects on structural and dynamical properties, such as diffusivities and hydrogen-bond kinetics.
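
    The coupling to an oscillating field is conceptually simple: at each step, every charged site feels an additional force qE(t) with E(t) = E0 cos(wt). The toy sketch below assumes classical point charges and omits all interatomic forces; the study itself computes forces ab initio.

    ```python
    import numpy as np

    E0, omega, dt, n_steps = 0.1, 2 * np.pi * 0.05, 1e-3, 1000
    q = np.array([0.4, -0.8, 0.4])            # assumed partial charges (H, O, H)
    m = np.array([1.0, 16.0, 1.0])            # masses in amu
    x = np.zeros(3)                           # positions along the field axis
    v = np.zeros(3)

    for step in range(n_steps):
        t = step * dt
        f_field = q * E0 * np.cos(omega * t)  # time-dependent field force
        f = f_field                           # intramolecular forces omitted here
        v += f / m * dt                       # simple Euler update, for brevity
        x += v * dt
    ```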

  8. 7 CFR 1484.52 - What are the guidelines for computing the value of non-cash contributions?

    Science.gov (United States)

    2010-01-01

    7 CFR 1484.52 (2010 ed.), Reimbursements: What are the guidelines for computing the value of non-cash contributions? (a) Computing... the value of indirect expenditures. Allocate value on the basis of sound management and accounting...

  9. Television viewing, computer use and total screen time in Canadian youth.

    Science.gov (United States)

    Mark, Amy E; Boyce, William F; Janssen, Ian

    2006-11-01

    Research has linked excessive television viewing and computer use in children and adolescents to a variety of health and social problems. Current recommendations are that screen time in children and adolescents should be limited to no more than 2 h per day. To determine the percentage of Canadian youth meeting the screen time guideline recommendations. The representative study sample consisted of 6942 Canadian youth in grades 6 to 10 who participated in the 2001/2002 World Health Organization Health Behaviour in School-Aged Children survey. Only 41% of girls and 34% of boys in grades 6 to 10 watched 2 h or less of television per day. Once the time of leisure computer use was included and total daily screen time was examined, only 18% of girls and 14% of boys met the guidelines. The prevalence of those meeting the screen time guidelines was higher in girls than boys. Fewer than 20% of Canadian youth in grades 6 to 10 met the total screen time guidelines, suggesting that increased public health interventions are needed to reduce the number of leisure time hours that Canadian youth spend watching television and using the computer.

  10. Neural Computations in a Dynamical System with Multiple Time Scales.

    Science.gov (United States)

    Mi, Yuanyuan; Lin, Xiaohan; Wu, Si

    2016-01-01

    Neural systems display rich short-term dynamics at various levels, e.g., spike-frequency adaptation (SFA) at the single-neuron level, and short-term facilitation (STF) and depression (STD) at the synapse level. These dynamical features typically cover a broad range of time scales and exhibit large diversity in different brain regions. It remains unclear what the computational benefit is for the brain to have such variability in short-term dynamics. In this study, we propose that the brain can exploit such dynamical features to implement multiple seemingly contradictory computations in a single neural circuit. To demonstrate this idea, we use a continuous attractor neural network (CANN) as a working model and include STF, SFA and STD with increasing time constants in its dynamics. Three computational tasks are considered: persistent activity, adaptation, and anticipative tracking. These tasks require conflicting neural mechanisms, and hence cannot be implemented by a single dynamical feature or any combination of features with similar time constants. However, with properly coordinated STF, SFA and STD, we show that the network is able to implement the three computational tasks concurrently. We hope this study will shed light on the understanding of how the brain orchestrates its rich dynamics at various levels to realize diverse cognitive functions.

  11. From experiment to design -- Fault characterization and detection in parallel computer systems using computational accelerators

    Science.gov (United States)

    Yim, Keun Soo

    This dissertation summarizes experimental validation and co-design studies conducted to optimize the fault detection capabilities and overheads in hybrid computer systems (e.g., using CPUs and Graphics Processing Units, or GPUs), and consequently to improve the scalability of parallel computer systems using computational accelerators. The experimental validation studies were conducted to help us understand the failure characteristics of CPU-GPU hybrid computer systems under various types of hardware faults. The main characterization targets were faults that are difficult to detect and/or recover from, e.g., faults that cause long latency failures (Ch. 3), faults in dynamically allocated resources (Ch. 4), faults in GPUs (Ch. 5), faults in MPI programs (Ch. 6), and microarchitecture-level faults with specific timing features (Ch. 7). The co-design studies were based on the characterization results. One of the co-designed systems has a set of source-to-source translators that customize and strategically place error detectors in the source code of target GPU programs (Ch. 5). Another co-designed system uses an extension card to learn the normal behavioral and semantic execution patterns of message-passing processes executing on CPUs, and to detect abnormal behaviors of those parallel processes (Ch. 6). The third co-designed system is a co-processor that has a set of new instructions in order to support software-implemented fault detection techniques (Ch. 7). The work described in this dissertation gains more importance because heterogeneous processors have become an essential component of state-of-the-art supercomputers. GPUs were used in three of the five fastest supercomputers that were operating in 2011. Our work included comprehensive fault characterization studies in CPU-GPU hybrid computers. In CPUs, we monitored the target systems for a long period of time after injecting faults (a temporally comprehensive experiment), and injected faults into various types of

  12. 12 CFR 516.10 - How does OTS compute time periods under this part?

    Science.gov (United States)

    2010-01-01

    12 CFR 516.10 (2010 ed.), Office of Thrift Supervision, Department of the Treasury, Application Processing Procedures: How does OTS compute time periods under this part? In computing...

  13. Non-real-time computed tomography-guided percutaneous ethanol injection therapy for hepatocellular carcinoma undetectable by ultrasonography

    International Nuclear Information System (INIS)

    Ueda, Kazushige; Ohkawara, Tohru; Minami, Masahito; Sawa, Yoshihiko; Morinaga, Osamu; Kohli, Yoshihiro; Ohkawara, Yasuo

    1998-01-01

    The purpose of this study was to evaluate the feasibility of non-real-time CT-guided percutaneous ethanol injection therapy (PEIT) for hepatocellular carcinoma (HCC; 37 lesions) untreatable by ultrasonography-guided PEIT (US-PEIT). The HCC lesion was localized on the lipiodol CT image with a graduated grid system. We advanced a 21 G or 22 G needle in a stepwise fashion with intermittent localization scans, using a tandem method to position the tip of the needle in the lesion. Ethanol containing contrast medium was injected, with monitoring scans obtained after incremental volumes of injection, until perfusion of the lesion was judged to be complete. A total of 44 CT-PEIT procedures were performed. The average number of needle passes from the skin to the liver in each CT-PEIT procedure was 2.3, the average amount of ethanol injected was 14.4 ml, and the average time required was 49.3 minutes. Complete perfusion of the lesion by ethanol on monitoring CT images was achieved in all lesions with only a single or double CT-PEIT procedure and without severe complication. Local recurrence was detected in only 5 lesions. At present, CT-PEIT is more time-consuming to perform than US-PEIT because conventional CT guidance is not real-time imaging. However, it is expected that this limitation of CT-PEIT will be overcome in the near future with the introduction of CT fluoroscopy. In conclusion, CT-PEIT should prove to be a feasible, acceptable treatment for challenging cases of HCC undetectable by US. (author)

  14. Accessible high performance computing solutions for near real-time image processing for time critical applications

    Science.gov (United States)

    Bielski, Conrad; Lemoine, Guido; Syryczynski, Jacek

    2009-09-01

    High Performance Computing (HPC) hardware solutions such as grid computing and general-purpose processing on a graphics processing unit (GPGPU) are now accessible to users with general computing needs. Grid computing infrastructures in the form of computing clusters or blades are becoming commonplace, and GPGPU solutions that leverage the processing power of the video card are quickly being integrated into personal workstations. Our interest in these HPC technologies stems from the need to produce near real-time maps from a combination of pre- and post-event satellite imagery in support of post-disaster management. Faster processing provides a twofold gain in this situation: (1) critical information can be provided faster, and (2) more elaborate automated processing can be performed prior to providing the critical information. In our particular case, we test the use of the PANTEX index, which is based on analysis of image textural measures extracted using anisotropic, rotation-invariant GLCM statistics. The use of this index, applied in a moving window, has been shown to successfully identify built-up areas in remotely sensed imagery. Built-up index image masks are important input to the structuring of damage assessment interpretation because they help optimise the workload. The performance of the PANTEX workflow is compared on two different HPC hardware architectures: (1) a blade server with 4 blades, each having dual quad-core CPUs, and (2) a CUDA-enabled GPU workstation. The reference platform is a dual-CPU quad-core workstation, and the total computing time of the PANTEX workflow is measured. Furthermore, as part of a qualitative evaluation, the differences in setting up and configuring the various hardware solutions and the related software coding effort are presented.
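
    The per-window texture measure at the heart of such an index can be sketched in a few lines: build a grey-level co-occurrence matrix (GLCM) for a window and reduce it to a contrast statistic. The quantisation depth and pixel offset below are assumptions, and PANTEX combines several such anisotropic measures rather than a single one.

    ```python
    import numpy as np

    def glcm_contrast(window, levels=16, offset=(0, 1)):
        # quantise the window to a small number of grey levels
        q = (window * (levels - 1)).astype(int)
        dy, dx = offset
        a = q[max(0, -dy):q.shape[0] - max(0, dy),
              max(0, -dx):q.shape[1] - max(0, dx)]
        b = q[max(0, dy):, max(0, dx):][:a.shape[0], :a.shape[1]]
        glcm = np.zeros((levels, levels))
        np.add.at(glcm, (a.ravel(), b.ravel()), 1)  # count co-occurrences
        glcm /= glcm.sum()
        i, j = np.indices(glcm.shape)
        return np.sum(glcm * (i - j) ** 2)          # GLCM contrast statistic

    print(glcm_contrast(np.random.rand(64, 64)))    # one moving-window unit of work
    ```

    Because every window is independent, the workload maps naturally onto both cluster nodes and GPU threads, which is exactly the comparison the study performs.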

  15. Message Passing Framework for Globally Interconnected Clusters

    International Nuclear Information System (INIS)

    Hafeez, M; Riaz, N; Asghar, S; Malik, U A; Rehman, A

    2011-01-01

    In prevailing technology trends it is apparent that network requirements and technologies will continue to advance. The need for a High Performance Computing (HPC) based implementation for interconnecting clusters is therefore clear for cluster scalability. Grid computing provides a global infrastructure for interconnecting clusters consisting of computing resources dispersed over the Internet. On the other hand, the leading model for HPC programming is the Message Passing Interface (MPI). Compared to Grid computing, MPI is better suited for solving most complex computational problems, but MPI itself is restricted to a single cluster: it does not support message passing over the Internet to use the computing resources of different clusters in an optimal way. We propose a model that provides message passing capabilities between parallel applications over the Internet. The proposed model is based on the Architecture for Java Universal Message Passing (A-JUMP) framework and an Enterprise Service Bus (ESB) named the High Performance Computing Bus. The HPC Bus is built using ActiveMQ and is responsible for communication and message passing in an asynchronous manner. The asynchronous mode of communication offers an assurance of message delivery as well as a fault-tolerance mechanism for message passing. The idea presented in this paper effectively utilizes wide-area intercluster networks. It also provides scheduling, dynamic resource discovery and allocation, and sub-clustering of resources for different jobs. A performance analysis and a comparison study of the proposed framework with P2P-MPI are also presented in this paper.

  16. Effect of the MCNP model definition on the computation time

    International Nuclear Information System (INIS)

    Šunka, Michal

    2017-01-01

    The presented work studies how the method of defining the geometry in the MCNP transport code affects the computation time, including the difficulty of preparing an input file describing the given geometry. Cases using different geometric definitions, including the use of basic 2-dimensional and 3-dimensional objects and their combinations, were studied. The results indicate that an inappropriate definition can increase the computation time by up to 59% (a more realistic case indicates 37%) for the same results and the same statistical uncertainty. (orig.)

  17. The Fetal Modified Myocardial Performance Index: Is Automation the Future?

    Directory of Open Access Journals (Sweden)

    Priya Maheshwari

    2015-01-01

    The fetal modified myocardial performance index (Mod-MPI) is a noninvasive, pulsed-wave Doppler-derived measure of global myocardial function. This review assesses the progress in technical refinements of its measurement and the potential for automation to be the crucial next step. The Mod-MPI is a ratio of isovolumetric to ejection cardiac time intervals, and the potential of the left ventricular Mod-MPI as a tool to clinically assess fetal cardiac function is well established. However, there are wide variations in published reference ranges, as (1) a standardised method of selecting the cardiac time intervals used in Mod-MPI calculation has not been established; (2) cardiac time interval measurement currently requires manual, inherently subjective placement of callipers on Doppler ultrasound waveforms; and (3) ultrasound machine settings and ultrasound system type have been found to affect Mod-MPI measurement. Collectively these factors create potential for significant inter- and intra-observer measurement variability. Automated measurement of the Mod-MPI may be the next key development to propel the Mod-MPI into routine clinical use. A novel automated system of Mod-MPI measurement is briefly presented and its implications for the future of the Mod-MPI in fetal cardiology are discussed.
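
    For reference, the conventional definition of this ratio, stated here for clarity since the review discusses how the intervals are selected rather than restating the formula, uses the isovolumetric contraction time (ICT), isovolumetric relaxation time (IRT) and ejection time (ET):

    $$\mathrm{Mod\text{-}MPI} = \frac{\mathrm{ICT} + \mathrm{IRT}}{\mathrm{ET}}$$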

  18. Iron oxide nanoparticle-micelles (ION-micelles) for sensitive (molecular) magnetic particle imaging and magnetic resonance imaging.

    Directory of Open Access Journals (Sweden)

    Lucas W E Starmans

    BACKGROUND: Iron oxide nanoparticles (IONs) are a promising nanoplatform for contrast-enhanced MRI. Recently, magnetic particle imaging (MPI) was introduced as a new imaging modality, which is able to directly visualize magnetic particles and could serve as a more sensitive and quantitative alternative to MRI. However, MPI requires magnetic particles with specific magnetic properties for optimal use. Current commercially available iron oxide formulations perform suboptimally in MPI, which is triggering research into optimized synthesis strategies. Most synthesis procedures aim at size control of iron oxide nanoparticles rather than control over the magnetic properties. In this study, we report on the synthesis, characterization and application of a novel ION platform for sensitive MPI and MRI. METHODS AND RESULTS: IONs were synthesized using a thermal-decomposition method and subsequently phase-transferred by encapsulation into lipidic micelles (ION-Micelles). Next, the material and magnetic properties of the ION-Micelles were analyzed. Most notably, vibrating sample magnetometry measurements showed that the effective magnetic core size of the IONs is 16 nm. In addition, magnetic particle spectrometry (MPS) measurements were performed. MPS is essentially zero-dimensional MPI and therefore allows one to probe the potential of iron oxide formulations for MPI. ION-Micelles induced up to 200 times higher signal in MPS measurements than commercially available iron oxide formulations (Endorem, Resovist and Sinerem) and thus likely allow for significantly more sensitive MPI. In addition, the potential of the ION-Micelle platform for molecular MPI and MRI was showcased by MPS and MRI measurements of fibrin-binding-peptide functionalized ION-Micelles (FibPep-ION-Micelles) bound to blood clots. CONCLUSIONS: The presented data underline the potential of the ION-Micelle nanoplatform for sensitive (molecular) MPI and warrant further investigation of the Fib

  19. Evaluation of Oceanic Surface Observation for Reproducing the Upper Ocean Structure in ECHAM5/MPI-OM

    Science.gov (United States)

    Luo, Hao; Zheng, Fei; Zhu, Jiang

    2017-12-01

    Better constraints on initial conditions from data assimilation are necessary for climate simulations and predictions, and they are particularly important for the ocean because of its long climate memory; as such, ocean data assimilation (ODA) is regarded as an effective tool for seasonal to decadal predictions. In this work, an ODA system is established for a coupled climate model (ECHAM5/MPI-OM) that can assimilate all available oceanic observations using an ensemble optimal interpolation approach. To validate and isolate the performance of different surface observations in reproducing air-sea climate variations in the model, a set of observing system simulation experiments (OSSEs) was performed over 150 model years. Generally, assimilating sea surface temperature, sea surface salinity, and sea surface height (SSH) can reasonably reproduce the climate variability and vertical structure of the upper ocean, and assimilating SSH achieves the best results compared to the true states. For the El Niño-Southern Oscillation (ENSO), assimilating different surface observations captures true aspects of ENSO well, but assimilating SSH further enhances the accuracy of ENSO-related feedback processes in the coupled model, leading to a more reasonable ENSO evolution and air-sea interaction over the tropical Pacific. For ocean heat content, there are still limitations in reproducing the long-time-scale variability in the North Atlantic, even when SSH is taken into consideration. These results demonstrate the effectiveness of assimilating surface observations in capturing the interannual signal and, to some extent, the decadal signal, but they still highlight the necessity of assimilating profile data to reproduce specific decadal variability.
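
    The analysis step of such a scheme can be sketched generically. Below is a hedged numpy illustration of an ensemble optimal interpolation (EnOI) update, x_a = x_b + K (y - H x_b) with gain K = alpha * B * H^T (alpha * H * B * H^T + R)^(-1) and the background covariance B built from a static ensemble; the dimensions, observation operator and scaling factor alpha are assumptions, not the settings of the ECHAM5/MPI-OM system.

    ```python
    import numpy as np

    rng = np.random.default_rng(3)
    n, m, n_ens, alpha = 50, 5, 20, 0.7
    X = rng.standard_normal((n, n_ens))              # static ensemble states
    A = (X - X.mean(axis=1, keepdims=True)) / np.sqrt(n_ens - 1)  # anomalies

    H = np.zeros((m, n))                             # observation operator:
    H[np.arange(m), np.arange(m) * 10] = 1.0         # sample every 10th cell
    R = 0.1 * np.eye(m)                              # observation error covariance

    x_b = rng.standard_normal(n)                     # background state
    y = H @ x_b + 0.3 * rng.standard_normal(m)       # synthetic observations

    HA = H @ A
    K = alpha * A @ HA.T @ np.linalg.inv(alpha * HA @ HA.T + R)  # EnOI gain
    x_a = x_b + K @ (y - H @ x_b)                    # analysis state
    ```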

  20. Coherence and computational complexity of quantifier-free dependence logic formulas

    NARCIS (Netherlands)

    Kontinen, J.; Kontinen, J.; Väänänen, J.

    2010-01-01

    We study the computational complexity of model checking for quantifier-free dependence logic (D) formulas. We point out three thresholds in the computational complexity: logarithmic space, non-deterministic logarithmic space and non-deterministic polynomial time.

  1. NNSA's Computing Strategy, Acquisition Plan, and Basis for Computing Time Allocation

    Energy Technology Data Exchange (ETDEWEB)

    Nikkel, D J

    2009-07-21

    This report is in response to the Omnibus Appropriations Act, 2009 (H.R. 1105; Public Law 111-8) in its funding of the National Nuclear Security Administration's (NNSA) Advanced Simulation and Computing (ASC) Program. This bill called for a report on ASC's plans for computing and platform acquisition strategy in support of stockpile stewardship. Computer simulation is essential to the stewardship of the nation's nuclear stockpile. Annual certification of the country's stockpile systems, Significant Finding Investigations (SFIs), and execution of Life Extension Programs (LEPs) are dependent on simulations employing the advanced ASC tools developed over the past decade plus; indeed, without these tools, certification would not be possible without a return to nuclear testing. ASC is an integrated program involving investments in computer hardware (platforms and computing centers), software environments, integrated design codes and physical models for these codes, and validation methodologies. The significant progress ASC has made in the past derives from its focus on mission and from its strategy of balancing support across the key investment areas necessary for success. All these investment areas must be sustained for ASC to adequately support current stockpile stewardship mission needs and to meet ever more difficult challenges as the weapons continue to age or undergo refurbishment. The appropriations bill called for this report to address three specific issues, which are responded to briefly here but are expanded upon in the subsequent document: (1) Identify how computing capability at each of the labs will specifically contribute to stockpile stewardship goals, and on what basis computing time will be allocated to achieve the goal of a balanced program among the labs. (2) Explain the NNSA's acquisition strategy for capacity and capability of machines at each of the labs and how it will fit within the existing budget constraints. (3

  2. LHC Computing Grid Project Launches into Action with International Support. A thousand times more computing power by 2006

    CERN Multimedia

    2001-01-01

    The first phase of the LHC Computing Grid project was approved at an extraordinary meeting of the Council on 20 September 2001. CERN is preparing for the unprecedented avalanche of data that will be produced by the Large Hadron Collider experiments. A thousand times more computer power will be needed by 2006! CERN's need for a dramatic advance in computing capacity is urgent. As from 2006, the four giant detectors observing trillions of elementary particle collisions at the LHC will accumulate over ten million Gigabytes of data, equivalent to the contents of about 20 million CD-ROMs, in each year of operation. A thousand times more computing power will be needed than is available to CERN today. The strategy the collaborations have adopted to analyse and store this unprecedented amount of data is the coordinated deployment of Grid technologies at hundreds of institutes, which will be able to search out and analyse information from an interconnected worldwide grid of tens of thousands of computers and storag...

  3. Non-linear corrections to the time-covariance function derived from a multi-state chemical master equation.

    Science.gov (United States)

    Scott, M

    2012-08-01

    The time-covariance function captures the dynamics of biochemical fluctuations and contains important information about the underlying kinetic rate parameters. Intrinsic fluctuations in biochemical reaction networks are typically modelled using a master equation formalism. In general, the equation cannot be solved exactly and approximation methods are required. For small fluctuations close to equilibrium, a linearisation of the dynamics provides a very good description of the relaxation of the time-covariance function. As the number of molecules in the system decrease, deviations from the linear theory appear. Carrying out a systematic perturbation expansion of the master equation to capture these effects results in formidable algebra; however, symbolic mathematics packages considerably expedite the computation. The authors demonstrate that non-linear effects can reveal features of the underlying dynamics, such as reaction stoichiometry, not available in linearised theory. Furthermore, in models that exhibit noise-induced oscillations, non-linear corrections result in a shift in the base frequency along with the appearance of a secondary harmonic.
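
    The object under study, the stationary time-covariance, is easy to estimate by simulation. The sketch below runs a Gillespie simulation of a simple birth-death process (birth at rate k, decay at rate g per molecule) and compares the lag-1 autocovariance with the exponential form C(tau) = (k/g) exp(-g*tau). For this linear network the exponential form is exact; it is non-linear propensities that generate the corrections analysed in the paper. The rates and grid spacing are illustrative assumptions.

    ```python
    import numpy as np

    rng = np.random.default_rng(2)
    k, g = 5.0, 1.0                                  # birth rate, per-molecule decay
    x, t, t_end, grid_dt = 5, 0.0, 5000.0, 0.05
    samples, next_t = [], 0.0

    while t < t_end:                                 # Gillespie (SSA) loop
        a_birth, a_death = k, g * x
        a_tot = a_birth + a_death
        tau = rng.exponential(1.0 / a_tot)
        while next_t < t + tau and next_t < t_end:   # record state on a uniform grid
            samples.append(x)
            next_t += grid_dt
        t += tau
        x += 1 if rng.random() < a_birth / a_tot else -1

    s = np.array(samples, dtype=float) - np.mean(samples)
    lag = int(1.0 / grid_dt)                         # autocovariance at tau = 1
    c = np.mean(s[:-lag] * s[lag:])
    print(f"C(1) ~ {c:.2f}  vs exponential form {k / g * np.exp(-g):.2f}")
    ```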

  4. Real-time brain computer interface using imaginary movements

    DEFF Research Database (Denmark)

    El-Madani, Ahmad; Sørensen, Helge Bjarup Dissing; Kjær, Troels W.

    2015-01-01

    Background: Brain Computer Interface (BCI) is the method of transforming mental thoughts and imagination into actions. A real-time BCI system can improve the quality of life of patients with severe neuromuscular disorders by enabling them to communicate with the outside world. In this paper...

  5. Generic Cospark of a Matrix Can Be Computed in Polynomial Time

    OpenAIRE

    Zhong, Sichen; Zhao, Yue

    2017-01-01

    The cospark of a matrix is the cardinality of the sparsest vector in the column space of the matrix. Computing the cospark of a matrix is well known to be an NP-hard problem. Given the sparsity pattern (i.e., the locations of the non-zero entries) of a matrix, if the non-zero entries are drawn from independently distributed continuous probability distributions, we prove that the cospark of the matrix equals, with probability one, a particular number termed the generic cospark of the matrix...

  6. Non-invasive coronary angiography with multislice computed tomography. Technology, methods, preliminary experience and prospects.

    Science.gov (United States)

    Traversi, Egidio; Bertoli, Giuseppe; Barazzoni, Giancarlo; Baldi, Maurizia; Tramarin, Roberto

    2004-02-01

    The recent technical developments in multislice computed tomography (MSCT), with ECG retro-gated image reconstruction, have elicited great interest in the possibility of accurate non-invasive imaging of the coronary arteries. The latest generation of MSCT systems, with 8-16 rows of detectors, permits acquisition of the whole cardiac volume during a single 15-20 s breath-hold with submillimetric definition of the images and an outstanding signal-to-noise ratio. Thus the race among MSCT, electron beam computed tomography and cardiac magnetic resonance imaging to provide routine, reliable imaging of the coronary arteries in clinical practice has recommenced. Currently available MSCT systems offer different options for both cardiac image acquisition and reconstruction, including multiplanar and curved multiplanar reconstruction, three-dimensional volume rendering, maximum intensity projection, and virtual angioscopy. In our preliminary experience, which included 176 patients with known or suspected coronary artery disease, MSCT was feasible in 161 (91.5%) and showed a sensitivity of 80.4% and a specificity of 80.3%, with respect to standard coronary angiography, in detecting critical stenosis in coronary arteries and arterial or venous bypass grafts. These results correspond to a positive predictive value of 58.6% and a negative predictive value of 92.2%. The true role that MSCT is likely to play in non-invasive coronary imaging remains to be defined. Nevertheless, the huge amount of data obtainable by MSCT, along with rapid technological advances, shorter acquisition times and reconstruction algorithm developments, will strengthen the technique, and applications are expected not only for non-invasive coronary angiography but also for the evaluation of cardiac function and myocardial perfusion, as an all-in-one examination.

  7. Computer codes in particle transport physics

    International Nuclear Information System (INIS)

    Pesic, M.

    2004-01-01

    Simulation of the transport and interaction of various particles in complex media over a wide energy range (from 1 MeV up to 1 TeV) is a very complicated problem that requires a valid model of the real process in nature and appropriate solving tools: a computer code and data libraries. A brief overview of computer codes based on Monte Carlo techniques for simulation of the transport and interaction of hadrons and ions over a wide energy range in three-dimensional (3D) geometry is given. First, attention is paid to the approach to the solution of the problem, a process in nature, through selection of an appropriate 3D model and the corresponding tools: computer codes and cross-section data libraries. The process of collecting and evaluating data from experimental measurements, and the theoretical approach to establishing reliable libraries of evaluated cross-section data, is a long, difficult and not straightforward activity. For this reason, world reference data centers and specialized ones are acknowledged, together with the currently available, state-of-the-art evaluated nuclear data libraries, such as ENDF/B-VI, JEF, JENDL, CENDL and BROND. Codes for experimental and theoretical data evaluation (e.g., SAMMY and GNASH), together with codes for data processing (e.g., NJOY, PREPRO and GRUCON), are briefly described. Examples of data evaluation and data processing to generate computer-usable data libraries are shown. Among the numerous and various computer codes developed for particle transport physics, only the most general ones are described: MCNPX, FLUKA and SHIELD. A short overview of the basic applications of these codes, the physical models implemented with their limitations, and the energy ranges of particles and types of interactions covered is given. General information about the codes also covers programming language, operating system, calculation speed and code availability. An example of increasing the computation speed of the MCNPX code using an MPI cluster, compared to the sequential option of the code, is shown.

  8. Analysis of prognostic value of clinical information and myocardial perfusion imaging in diabetic patients on cardiac events occurrence

    International Nuclear Information System (INIS)

    Wu Zhifang; Li Sijin

    2004-01-01

    Objective: To explore the risk factors of cardiac event (CE) occurrence and evaluate the prognostic value of myocardial perfusion imaging (MPI) in diabetic patients. Methods: We conducted a study with 172 (16.4%) consecutively registered patients with diabetes (132 males, 40 females; age range 16-90 years, mean age 55.94±12.46 years) and 875 (83.6%) patients without diabetes, all with known or suspected coronary artery disease (CAD) undergoing SPECT MPI. Follow-up information was obtained through telephone interviews. Patients were followed up for at least 18 months. End points were defined as death due to a primary cardiac cause, nonfatal acute myocardial infarction, or revascularization. The mean follow-up time was 33.25±14.95 (range 1-56) months. Results: Logistic stepwise regression analysis evaluated history of smoking and drinking, hypertension, hyperlipemia and family history of CAD as predictors. A multiple regression formula was obtained: Y = -5.593 + 0.958X1 + 0.921X2 + 0.428X3 (Y = cardiac events, X1 = diabetes, X2 = family history of CAD, X3 = hypertension). Diabetes, family history of CAD and hypertension were risk factors for cardiac events, whereas hyperlipemia and history of smoking and drinking were protective factors. Over the follow-up period, there were 42 cardiac events in the diabetic group and 86 in the non-diabetic group. Patients with diabetes had significantly higher rates of cardiac events than patients without diabetes (24.4% versus 9.8%; chi-square 28.5, P<0.0001) (table 1). Kaplan-Meier survival curves of the no-CE rates showed that diabetic patients fared significantly worse than non-diabetic ones (log-rank statistic, chi-square 28.75, P<0.0001). Of the 172 diabetic patients, 32.2% of those with abnormal MPI experienced cardiac events, but only 7.4% of those with normal MPI did (chi-square 12.34, P<0.001) (figure 1). Abnormal SPECT MPI was associated with a higher rate
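    A minimal sketch of how the reported formula can be evaluated; mapping the linear predictor Y to a probability with the logistic link 1/(1+exp(-Y)) is assumed here from the phrase "logistic stepwise regression" and is not stated explicitly in the record.

```c
/* Sketch: evaluating the reported model for cardiac-event risk.
 * Coefficients are taken from the abstract; the logistic link
 * p = 1/(1+exp(-Y)) is an assumption, not stated in the record. */
#include <math.h>
#include <stdio.h>

static double cardiac_event_probability(int diabetes,
                                        int family_history_cad,
                                        int hypertension)
{
    double y = -5.593 + 0.958 * diabetes
                      + 0.921 * family_history_cad
                      + 0.428 * hypertension;
    return 1.0 / (1.0 + exp(-y));
}

int main(void)
{
    /* diabetic patient with a family history of CAD, no hypertension */
    printf("p = %.3f\n", cardiac_event_probability(1, 1, 0));
    return 0;
}
```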

  9. Non-cardiac findings on coronary computed tomography and magnetic resonance imaging

    International Nuclear Information System (INIS)

    Dewey, Marc; Schnapauff, Dirk; Teige, Florian; Hamm, Bernd

    2007-01-01

    Both multislice computed tomography (CT) and magnetic resonance imaging (MRI) are emerging as methods to detect coronary artery stenoses and assess cardiac function and morphology. Non-cardiac structures are also amenable to assessment by these non-invasive tests. We investigated the rate of significant and insignificant non-cardiac findings using CT and MRI. A total of 108 consecutive patients suspected of having coronary artery disease and without contraindications to CT and MRI were included in this study. Significant non-cardiac findings were defined as findings that required additional clinical or radiological follow-up. CT and MR images were read independently in a blinded fashion. CT yielded five significant non-cardiac findings in five patients (5%). These included a pulmonary embolism, large pleural effusions, sarcoid, a large hiatal hernia, and a pulmonary nodule (>1.0 cm). Two of these significant non-cardiac findings were also seen on MRI (pleural effusions and sarcoid, 2%). Insignificant non-cardiac findings were more frequent than significant findings on both CT (n = 11, 10%) and MRI (n = 7, 6%). Incidental non-cardiac findings on CT and MRI of the coronary arteries are common, which is why images should be analyzed by radiologists to ensure that important findings are not missed and unnecessary follow-up examinations are avoided. (orig.)

  10. Non-cardiac findings on coronary computed tomography and magnetic resonance imaging

    Energy Technology Data Exchange (ETDEWEB)

    Dewey, Marc; Schnapauff, Dirk; Teige, Florian; Hamm, Bernd [Charite-Universitaetsmedizin Berlin, Humboldt-Universitaet zu Berlin, Department of Radiology, Chariteplatz 1, P.O. Box 10098, Berlin (Germany)

    2007-08-15

    Both multislice computed tomography (CT) and magnetic resonance imaging (MRI) are emerging as methods to detect coronary artery stenoses and assess cardiac function and morphology. Non-cardiac structures are also amenable to assessment by these non-invasive tests. We investigated the rate of significant and insignificant non-cardiac findings using CT and MRI. A total of 108 consecutive patients suspected of having coronary artery disease and without contraindications to CT and MRI were included in this study. Significant non-cardiac findings were defined as findings that required additional clinical or radiological follow-up. CT and MR images were read independently in a blinded fashion. CT yielded five significant non-cardiac findings in five patients (5%). These included a pulmonary embolism, large pleural effusions, sarcoid, a large hiatal hernia, and a pulmonary nodule (>1.0 cm). Two of these significant non-cardiac findings were also seen on MRI (pleural effusions and sarcoid, 2%). Insignificant non-cardiac findings were more frequent than significant findings on both CT (n = 11, 10%) and MRI (n = 7, 6%). Incidental non-cardiac findings on CT and MRI of the coronary arteries are common, which is why images should be analyzed by radiologists to ensure that important findings are not missed and unnecessary follow-up examinations are avoided. (orig.)

  11. Stress-induced ST-segment deviation in relation to the presence and severity of coronary artery disease in patients with normal myocardial perfusion imaging.

    Science.gov (United States)

    Weinsaft, Jonathan W; Manoushagian, Shant J; Patel, Taral; Shakoor, Aqsa; Kim, Robert J; Mirchandani, Sunil; Lin, Fay; Wong, Franklin J; Szulc, Massimiliano; Okin, Peter M; Kligfield, Paul D; Min, James K

    2009-01-01

    To assess the utility of stress electrocardiography (ECG) for identifying the presence and severity of obstructive coronary artery disease (CAD) defined by coronary computed tomographic angiography (CCTA) among patients with normal nuclear myocardial perfusion imaging (MPI). The study population comprised 119 consecutive patients with normal MPI who also underwent CCTA (interval 3.5+/-3.8 months). Stress ECG was performed at the time of MPI. CCTA and MPI were interpreted using established scoring systems, and CCTA was used to define the presence and extent of CAD, which was quantified by a coronary artery jeopardy score. Within this population, 28 patients (24%) had obstructive CAD identified by CCTA. The most common CAD pattern was single-vessel CAD (61%), although proximal vessel involvement was present in 46% of patients. Patients with CAD were nearly three times more likely to have positive standard test responses (≥1 mm ST-segment deviation) than patients with patent coronary arteries (36 vs. 13%, P=0.007). In multivariate analysis, a positive ST-segment test response was an independent marker for CAD (odds ratio: 2.02, confidence interval: 1.09-3.78, P=0.03) even after adjustment for a composite of clinical cardiac risk factors (odds ratio: 1.85, confidence interval: 1.05-3.23, P=0.03). Despite uniformly normal MPI, mean coronary jeopardy score was three-fold higher among patients with positive compared to those with negative ST-segment response to exercise or dobutamine stress (1.9+/-2.7 vs. 0.5+/-1.4, P=0.03). Stress-induced ST-segment deviation is an independent marker for obstructive CAD among patients with normal MPI. A positive stress ECG identifies patients with a greater anatomic extent of CAD as quantified by coronary jeopardy score.

  12. Time-dependent transport of energetic particles in magnetic turbulence: computer simulations versus analytical theory

    Science.gov (United States)

    Arendt, V.; Shalchi, A.

    2018-06-01

    We explore numerically the transport of energetic particles in a turbulent magnetic field configuration. A test-particle code is employed to compute running diffusion coefficients as well as particle distribution functions in the different directions of space. Our numerical findings are compared with models commonly used in diffusion theory such as Gaussian distribution functions and solutions of the cosmic ray Fokker-Planck equation. Furthermore, we compare the running diffusion coefficients across the mean magnetic field with solutions obtained from the time-dependent version of the unified non-linear transport theory. In most cases we find that particle distribution functions are indeed of Gaussian form as long as a two-component turbulence model is employed. For turbulence setups with reduced dimensionality, however, the Gaussian distribution can no longer be obtained. It is also shown that the unified non-linear transport theory agrees with simulated perpendicular diffusion coefficients as long as the pure two-dimensional model is excluded.
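    A minimal sketch of the running-diffusion-coefficient diagnostic such test-particle codes report, assuming the conventional definition d_xx(t) = <(Delta x)^2>/(2t); the unbiased random walk below merely stands in for the actual particle pusher.

```c
/* Sketch of the "running diffusion coefficient" diagnostic:
 *   d_xx(t) = <(x(t) - x(0))^2> / (2 t),
 * averaged over an ensemble of test particles. A unit-step random
 * walk stands in for the equations of motion; for it d_xx -> 0.5. */
#include <stdio.h>
#include <stdlib.h>

#define NPART 10000
#define NSTEP 1000

int main(void)
{
    static double x[NPART];   /* all particles start at x = 0 */
    srand(42);
    for (int t = 1; t <= NSTEP; ++t) {
        double msd = 0.0;     /* mean squared displacement */
        for (int p = 0; p < NPART; ++p) {
            x[p] += (rand() % 2) ? 1.0 : -1.0;  /* stand-in pusher */
            msd  += x[p] * x[p];
        }
        msd /= NPART;
        if (t % 100 == 0)
            printf("t=%4d  d_xx(t)=%.3f\n", t, msd / (2.0 * t));
    }
    return 0;
}
```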

  13. High-performance computational fluid dynamics: a custom-code approach

    International Nuclear Information System (INIS)

    Fannon, James; Náraigh, Lennon Ó; Loiseau, Jean-Christophe; Valluri, Prashant; Bethune, Iain

    2016-01-01

    We introduce a modified and simplified version of the pre-existing fully parallelized three-dimensional Navier–Stokes flow solver known as TPLS. We demonstrate how the simplified version can be used as a pedagogical tool for the study of computational fluid dynamics (CFDs) and parallel computing. TPLS is at its heart a two-phase flow solver, and uses calls to a range of external libraries to accelerate its performance. However, in the present context we narrow the focus of the study to basic hydrodynamics and parallel computing techniques, and the code is therefore simplified and modified to simulate pressure-driven single-phase flow in a channel, using only relatively simple Fortran 90 code with MPI parallelization, but no calls to any other external libraries. The modified code is analysed in order to both validate its accuracy and investigate its scalability up to 1000 CPU cores. Simulations are performed for several benchmark cases in pressure-driven channel flow, including a turbulent simulation, wherein the turbulence is incorporated via the large-eddy simulation technique. The work may be of use to advanced undergraduate and graduate students as an introductory study in CFDs, while also providing insight for those interested in more general aspects of high-performance computing. (paper)
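    A minimal sketch, in C rather than the Fortran 90 of TPLS, of the ghost-cell (halo) exchange pattern that such an MPI domain-decomposed solver performs every iteration; the slab size and data below are illustrative, not taken from the code.

```c
/* Sketch of 1-D domain decomposition with halo exchange: each rank
 * owns a slab of the channel plus one ghost layer on each side,
 * refreshed with MPI_Sendrecv before every stencil update.
 * This mirrors the pattern of solvers like TPLS; it is not TPLS code. */
#include <mpi.h>

#define NLOC 64   /* interior cells per rank */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double u[NLOC + 2];                  /* u[0], u[NLOC+1] are ghosts */
    for (int i = 0; i <= NLOC + 1; ++i) u[i] = rank;

    int left  = (rank > 0)        ? rank - 1 : MPI_PROC_NULL;
    int right = (rank < size - 1) ? rank + 1 : MPI_PROC_NULL;

    /* exchange ghost layers; MPI_PROC_NULL makes the walls no-ops */
    MPI_Sendrecv(&u[1],        1, MPI_DOUBLE, left,  0,
                 &u[NLOC + 1], 1, MPI_DOUBLE, right, 0,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    MPI_Sendrecv(&u[NLOC],     1, MPI_DOUBLE, right, 1,
                 &u[0],        1, MPI_DOUBLE, left,  1,
                 MPI_COMM_WORLD, MPI_STATUS_IGNORE);

    /* a Jacobi-style update may now read u[0..NLOC+1] safely */
    MPI_Finalize();
    return 0;
}
```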

  14. High-performance computational fluid dynamics: a custom-code approach

    Science.gov (United States)

    Fannon, James; Loiseau, Jean-Christophe; Valluri, Prashant; Bethune, Iain; Náraigh, Lennon Ó.

    2016-07-01

    We introduce a modified and simplified version of the pre-existing fully parallelized three-dimensional Navier-Stokes flow solver known as TPLS. We demonstrate how the simplified version can be used as a pedagogical tool for the study of computational fluid dynamics (CFDs) and parallel computing. TPLS is at its heart a two-phase flow solver, and uses calls to a range of external libraries to accelerate its performance. However, in the present context we narrow the focus of the study to basic hydrodynamics and parallel computing techniques, and the code is therefore simplified and modified to simulate pressure-driven single-phase flow in a channel, using only relatively simple Fortran 90 code with MPI parallelization, but no calls to any other external libraries. The modified code is analysed in order to both validate its accuracy and investigate its scalability up to 1000 CPU cores. Simulations are performed for several benchmark cases in pressure-driven channel flow, including a turbulent simulation, wherein the turbulence is incorporated via the large-eddy simulation technique. The work may be of use to advanced undergraduate and graduate students as an introductory study in CFDs, while also providing insight for those interested in more general aspects of high-performance computing.

  15. Influence of the ionic liquid [C4mpy][Tf2N] on the structure of the miniprotein Trp-cage.

    Science.gov (United States)

    Baker, Joseph L; Furbish, Jeffrey; Lindberg, Gerrick E

    2015-11-01

    We examine the effect of the ionic liquid [C4mpy][Tf2N] on the structure of the miniprotein Trp-cage and contrast these results with the behavior of Trp-cage in water. We find the ionic liquid has a dramatic effect on Trp-cage, though many similarities with aqueous Trp-cage are observed. We assess Trp-cage folding by monitoring root mean square deviation from the crystallographic structure, radius of gyration, proline cis/trans isomerization state, protein secondary structure, amino acid contact formation and distance, and native and non-native contact formation. Starting from an unfolded configuration, Trp-cage folds in water at 298 K in less than 500 ns of simulation, but has very little mobility in the ionic liquid at the same temperature, which can be ascribed to the higher ionic liquid viscosity. At 365 K, the mobility of the ionic liquid is increased and initial stages of Trp-cage folding are observed, however Trp-cage does not reach the native folded state in 2 μs of simulation in the ionic liquid. Therefore, in addition to conventional molecular dynamics, we also employ scaled molecular dynamics to expedite sampling, and we demonstrate that Trp-cage in the ionic liquid does closely approach the aqueous folded state. Interestingly, while the reduced mobility of the ionic liquid is found to restrict Trp-cage motion, the ionic liquid does facilitate proline cis/trans isomerization events that are not seen in our aqueous simulations. Copyright © 2015 Elsevier Inc. All rights reserved.
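    One of the folding metrics named above has a simple closed form; here is a minimal sketch of the radius of gyration, Rg^2 = (1/N) sum_i |r_i - r_cm|^2, assuming equal atomic masses (the coordinates below are a toy input, not Trp-cage).

```c
/* Sketch of a folding metric from the abstract: radius of gyration.
 * Equal masses are assumed for brevity. */
#include <math.h>
#include <stdio.h>

static double radius_of_gyration(const double (*r)[3], int n)
{
    double cm[3] = {0, 0, 0};
    for (int i = 0; i < n; ++i)               /* center of mass */
        for (int k = 0; k < 3; ++k) cm[k] += r[i][k] / n;

    double rg2 = 0.0;
    for (int i = 0; i < n; ++i)               /* mean squared distance */
        for (int k = 0; k < 3; ++k) {
            double d = r[i][k] - cm[k];
            rg2 += d * d / n;
        }
    return sqrt(rg2);
}

int main(void)
{
    double coords[2][3] = {{0, 0, 0}, {2, 0, 0}};  /* toy 2-atom "protein" */
    printf("Rg = %f\n", radius_of_gyration(coords, 2));  /* prints 1.0 */
    return 0;
}
```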

  16. Sorting on STAR. [CDC computer algorithm timing comparison

    Science.gov (United States)

    Stone, H. S.

    1978-01-01

    Timing comparisons are given for three sorting algorithms written for the CDC STAR computer. One algorithm is Hoare's (1962) Quicksort, which is the fastest or nearly the fastest sorting algorithm for most computers. A second algorithm is a vector version of Quicksort that takes advantage of the STAR's vector operations. The third algorithm is an adaptation of Batcher's (1968) sorting algorithm, which makes especially good use of vector operations but has a complexity of N (log N)^2 as compared with a complexity of N log N for the Quicksort algorithms. In spite of its worse complexity, Batcher's sorting algorithm is competitive with the serial version of Quicksort for vectors up to the largest that can be treated by STAR. Vector Quicksort outperforms the other two algorithms and is generally preferred. These results indicate that unusual instruction sets can introduce biases in program execution time that counter results predicted by worst-case asymptotic complexity analysis.
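    A minimal sketch of Batcher's method in its common bitonic-network form (n a power of two): every inner pass applies the same data-independent compare-exchange to disjoint pairs, which is what maps so well onto vector hardware despite the N (log N)^2 complexity.

```c
/* Sketch of Batcher's bitonic sorting network (n must be a power of 2).
 * Each inner pass is a data-independent set of compare-exchanges and
 * is therefore vectorizable, the property exploited on the CDC STAR. */
#include <stdio.h>

static void bitonic_sort(int *a, int n)   /* ascending; n = 2^m */
{
    for (int k = 2; k <= n; k <<= 1)          /* size of sorted runs */
        for (int j = k >> 1; j > 0; j >>= 1)  /* compare distance    */
            for (int i = 0; i < n; ++i) {     /* vectorizable pass   */
                int l = i ^ j;
                int up = (i & k) == 0;        /* direction of this run */
                if (l > i &&
                    ((up && a[i] > a[l]) || (!up && a[i] < a[l]))) {
                    int t = a[i]; a[i] = a[l]; a[l] = t;
                }
            }
}

int main(void)
{
    int a[8] = {7, 3, 5, 8, 1, 2, 6, 4};
    bitonic_sort(a, 8);
    for (int i = 0; i < 8; ++i) printf("%d ", a[i]);  /* 1 2 3 4 5 6 7 8 */
    printf("\n");
    return 0;
}
```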

  17. Variation in computer time with geometry prescription in monte carlo code KENO-IV

    International Nuclear Information System (INIS)

    Gopalakrishnan, C.R.

    1988-01-01

    In most studies, the Monte Carlo criticality code KENO-IV has been compared with other Monte Carlo codes, but an evaluation of its performance with different box descriptions has not been done so far. In Monte Carlo computations, any fractional saving of computing time is highly desirable. The variation in computation time with box description in KENO for two different fast reactor fuel subassemblies, of FBTR and PFBR, is studied. The k_eff of an infinite array of fuel subassemblies is calculated by modelling the subassemblies in two different ways: (i) multi-region, (ii) multi-box. In addition to these two cases, excess reactivity calculations of FBTR are also performed in two ways to study this effect in a complex geometry. It is observed that the k_eff values calculated by the multi-region and multi-box models agree very well. However, the increase in computation time from the multi-box to the multi-region model is considerable, while the difference in computer storage requirements for the two models is negligible. This variation in computing time arises from the way the neutron is tracked in the two cases. (author)

  18. SLMRACE: a noise-free RACE implementation with reduced computational time

    Science.gov (United States)

    Chauvin, Juliet; Provenzi, Edoardo

    2017-05-01

    We present a faster and noise-free implementation of the RACE algorithm. RACE has mixed characteristics between the famous Retinex model of Land and McCann and the automatic color equalization (ACE) color-correction algorithm. The original random spray-based RACE implementation suffers from two main problems: its computational time and the presence of noise. Here, we will show that it is possible to adapt two techniques recently proposed by Banić et al. to the RACE framework in order to drastically decrease the computational time and noise generation. The implementation will be called smart-light-memory-RACE (SLMRACE).

  19. Efficient Geo-Computational Algorithms for Constructing Space-Time Prisms in Road Networks

    Directory of Open Access Journals (Sweden)

    Hui-Ping Chen

    2016-11-01

    The space-time prism (STP) is a key concept in time geography for analyzing human activity-travel behavior under various space-time constraints. Most existing time-geographic studies use a straightforward algorithm to construct STPs in road networks by using two one-to-all shortest path searches. However, this straightforward algorithm can introduce considerable computational overhead, given the fact that accessible links in an STP are generally a small portion of the whole network. To address this issue, an efficient geo-computational algorithm, called NTP-A*, is proposed. The proposed NTP-A* algorithm employs the A* and branch-and-bound techniques to discard inaccessible links during the two shortest path searches, and thereby improves the STP construction performance. Comprehensive computational experiments are carried out to demonstrate the computational advantage of the proposed algorithm. Several implementation techniques, including the label-correcting technique and the hybrid link-node labeling technique, are discussed and analyzed. Experimental results show that the proposed NTP-A* algorithm can significantly improve STP construction performance in large-scale road networks by a factor of 100, compared with existing algorithms.
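    A minimal sketch of the straightforward two-search construction that NTP-A* improves on (the proposed pruning itself is not reproduced here): a node lies inside the prism if its shortest time from the origin plus its shortest time to the destination fits within the time budget. The graph and budget values below are made up.

```c
/* Sketch of baseline STP construction via two one-to-all searches:
 * keep node v if  t(origin -> v) + t(v -> destination) <= budget.
 * An O(V^2) Dijkstra on an adjacency matrix keeps the sketch short. */
#include <stdio.h>

#define V 5
#define INF 1000000.0

static void dijkstra(const double g[V][V], int src, double dist[V])
{
    int done[V] = {0};
    for (int i = 0; i < V; ++i) dist[i] = INF;
    dist[src] = 0.0;
    for (int it = 0; it < V; ++it) {
        int u = -1;
        for (int i = 0; i < V; ++i)
            if (!done[i] && (u < 0 || dist[i] < dist[u])) u = i;
        done[u] = 1;
        for (int v = 0; v < V; ++v)
            if (g[u][v] < INF && dist[u] + g[u][v] < dist[v])
                dist[v] = dist[u] + g[u][v];
    }
}

int main(void)
{
    double g[V][V], rev[V][V];
    for (int i = 0; i < V; ++i)
        for (int j = 0; j < V; ++j) g[i][j] = (i == j) ? 0.0 : INF;
    /* made-up travel times */
    g[0][1] = 4; g[1][2] = 3; g[0][3] = 10; g[3][4] = 1; g[2][4] = 2;
    for (int i = 0; i < V; ++i)              /* reversed network */
        for (int j = 0; j < V; ++j) rev[j][i] = g[i][j];

    double from_o[V], to_d[V];
    dijkstra(g, 0, from_o);      /* origin = node 0      */
    dijkstra(rev, 4, to_d);      /* destination = node 4 */

    double budget = 10.0;        /* available travel time */
    for (int v = 0; v < V; ++v)
        if (from_o[v] + to_d[v] <= budget)
            printf("node %d is inside the space-time prism\n", v);
    return 0;
}
```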

  20. Feasibility and diagnostic power of transthoracic coronary Doppler for coronary flow velocity reserve in patients referred for myocardial perfusion imaging

    Directory of Open Access Journals (Sweden)

    Nylander Eva

    2008-03-01

    Background: Myocardial perfusion imaging (MPI) using single photon emission computed tomography (SPECT) is a validated method for detecting coronary artery disease. Transthoracic Doppler echocardiography (TTDE) of flow at rest and during adenosine provocation has previously been evaluated in selected patient groups. We therefore wanted to compare the diagnostic ability of TTDE in the left anterior descending coronary artery (LAD) to that of MPI in an unselected population of patients with chest pain referred for MPI. Our hypothesis was that TTDE would identify healthy individuals with high accuracy and exclude them from the need for further studies, enabling invasive investigations to be reserved for patients with a high probability of disease. Methods: Sixty-nine patients, 44 men and 25 women, age 61 ± 10 years (range 35-82), with a clinical suspicion of stress-induced myocardial ischemia were investigated. TTDE was performed at rest and during adenosine stress for myocardial scintigraphy. Results: We found that coronary flow velocity reserve (CFVR) determined from diastolic measurements separated normal from abnormal MPI findings with statistical significance. TTDE identified coronary artery disease, defined from MPI as reversible ischemia and/or a permanent defect, with a sensitivity of 60% and a specificity of 79%. The positive predictive value was 43% and the negative predictive value was 88%. There was an overlap between groups, which could be due to abnormal endothelial function in patients with normal myocardial perfusion having either hypertension or diabetes. Conclusion: TTDE is an attractive non-invasive method to evaluate chest pain without the use of isotopes, but the diagnostic power is strongly dependent on the population investigated. Even in our heterogeneous clinical cardiac population, we found that CFVR>2 in the LAD excluded significant coronary artery disease detected by MPI.

  1. Computational Procedures for a Class of GI/D/k Systems in Discrete Time

    Directory of Open Access Journals (Sweden)

    Md. Mostafizur Rahman

    2009-01-01

    A class of discrete-time GI/D/k systems is considered for which the interarrival times have finite support and customers are served in first-in first-out (FIFO) order. The system is formulated as a single-server queue with new general independent interarrival times and constant service duration by assuming cyclic assignment of customers to the identical servers. The queue length is then set up as a quasi-birth-death (QBD) type Markov chain. It is shown that this transformed GI/D/1 system has special structures which make the computation of the matrix R simple and efficient, thereby reducing the number of multiplications in each iteration significantly. As a result, the computation time is kept very low. Moreover, use of the resulting structural properties makes the computation of the queue length distribution of the transformed system efficient. The computation of the waiting time distribution is also shown to be simple by exploiting the special structures.
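    For context, here is a generic successive-substitution sketch for the QBD rate matrix R, the minimal non-negative solution of A0 + R*A1 + R^2*A2 = 0; the paper's point is that its transformed chain has structure making each such iteration much cheaper. The 2x2 blocks below are made-up illustration values, not from the paper.

```c
/* Generic (textbook) iteration for the QBD rate matrix:
 *   R <- -(A0 + R^2 * A2) * inv(A1),  starting from R = 0,
 * where A0/A1/A2 are the up/local/down generator blocks.
 * The blocks below are illustrative; rows of A0+A1+A2 sum to 0. */
#include <stdio.h>

typedef double M[2][2];

static void mul(M a, M b, M c)   /* c = a*b, alias-safe via temp */
{
    M t;
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j)
            t[i][j] = a[i][0] * b[0][j] + a[i][1] * b[1][j];
    for (int i = 0; i < 2; ++i)
        for (int j = 0; j < 2; ++j) c[i][j] = t[i][j];
}

static void inv2(M a, M b)       /* b = inv(a) for 2x2 */
{
    double d = a[0][0] * a[1][1] - a[0][1] * a[1][0];
    b[0][0] =  a[1][1] / d;  b[0][1] = -a[0][1] / d;
    b[1][0] = -a[1][0] / d;  b[1][1] =  a[0][0] / d;
}

int main(void)
{
    M A0 = {{0.2, 0.0}, {0.1, 0.1}};     /* level up    */
    M A1 = {{-0.8, 0.2}, {0.1, -0.9}};   /* local       */
    M A2 = {{0.4, 0.0}, {0.2, 0.4}};     /* level down  */
    M A1inv, R = {{0, 0}, {0, 0}};
    inv2(A1, A1inv);

    for (int k = 0; k < 200; ++k) {      /* fixed-point iteration */
        M R2, T;
        mul(R, R, R2);
        mul(R2, A2, T);
        for (int i = 0; i < 2; ++i)
            for (int j = 0; j < 2; ++j) T[i][j] += A0[i][j];
        mul(T, A1inv, R);
        for (int i = 0; i < 2; ++i)
            for (int j = 0; j < 2; ++j) R[i][j] = -R[i][j];
    }
    printf("R = [%.4f %.4f; %.4f %.4f]\n",
           R[0][0], R[0][1], R[1][0], R[1][1]);
    return 0;
}
```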

  2. An efficient implementation of parallel molecular dynamics method on SMP cluster architecture

    International Nuclear Information System (INIS)

    Suzuki, Masaaki; Okuda, Hiroshi; Yagawa, Genki

    2003-01-01

    The authors have applied the MPI/OpenMP hybrid parallel programming model to parallelize a molecular dynamics (MD) method on a symmetric multiprocessor (SMP) cluster architecture. On such an architecture, the hybrid parallel programming model, which uses a message passing library such as MPI for inter-SMP node communication and loop directives such as OpenMP for intra-SMP node parallelization, can be expected to be the most effective one. In this study, the parallel performance of the hybrid style has been compared with that of the conventional flat parallel programming style, which uses only MPI, both with and without the fast multipole method (FMM) for computing long-range interactions. The computing environment used here is the Hitachi SR8000/MPP at the University of Tokyo. The results of the calculations are as follows. Without FMM, the parallel efficiency using 16 SMP nodes (128 PEs) is 90% with the hybrid style and 75% with the flat-MPI style for an MD simulation with 33,402 atoms. With FMM, the parallel efficiency using 16 SMP nodes (128 PEs) is 60% with the hybrid style and 48% with the flat-MPI style for an MD simulation with 117,649 atoms. (author)
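    A minimal C sketch of the hybrid pattern described above - MPI between SMP nodes, OpenMP threads within a node; the pair-energy loop is a stand-in for the MD force computation, not the authors' code.

```c
/* Hybrid MPI+OpenMP sketch: block-decompose atoms over MPI ranks,
 * thread the pair loop with OpenMP inside each node, then reduce.
 * The 1-D Lennard-Jones-like energy is purely illustrative. */
#include <mpi.h>
#include <omp.h>
#include <math.h>
#include <stdio.h>

#define N 4096

int main(int argc, char **argv)
{
    int provided, rank, size;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    static double x[N];
    for (int i = 0; i < N; ++i) x[i] = 0.01 * i;   /* toy positions */

    /* MPI level: each rank owns a contiguous block of atoms */
    int lo = rank * N / size, hi = (rank + 1) * N / size;

    /* OpenMP level: threads share the node's pair loop */
    double local_e = 0.0;
    #pragma omp parallel for reduction(+ : local_e) schedule(dynamic)
    for (int i = lo; i < hi; ++i)
        for (int j = i + 1; j < N; ++j) {
            double r = fabs(x[i] - x[j]) + 1e-9;
            local_e += 1.0 / pow(r, 12) - 1.0 / pow(r, 6);  /* LJ-like */
        }

    double e = 0.0;
    MPI_Reduce(&local_e, &e, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
    if (rank == 0) printf("potential energy = %e\n", e);
    MPI_Finalize();
    return 0;
}
```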

  3. Metric and topology on a non-standard real line and non-standard space-time

    International Nuclear Information System (INIS)

    Tahir Shah, K.

    1981-04-01

    We study the metric and topological properties of the extended real line R* and compare it with the non-standard model of the real line *R. We show that some properties, like the triangle inequality, cannot be carried over to R* from R. This confirms F. Wattenberg's result for measure theory on the Dedekind completion of *R. Based on conclusions from these results, we propose a non-standard model of space-time. This space-time is without undefined objects like singularities. (author)

  4. In-Network Computation is a Dumb Idea Whose Time Has Come

    KAUST Repository

    Sapio, Amedeo; Abdelaziz, Ibrahim; Aldilaijan, Abdulla; Canini, Marco; Kalnis, Panos

    2017-01-01

    Programmable data plane hardware creates new opportunities for infusing intelligence into the network. This raises a fundamental question: what kinds of computation should be delegated to the network? In this paper, we discuss the opportunities and challenges for co-designing data center distributed systems with their network layer. We believe that the time has finally come for offloading part of their computation to execute in-network. However, in-network computation tasks must be judiciously crafted to match the limitations of the network machine architecture of programmable devices. With the help of our experiments on machine learning and graph analytics workloads, we identify that aggregation functions raise opportunities to exploit the limited computation power of networking hardware to lessen network congestion and improve the overall application performance. Moreover, as a proof-of-concept, we propose DAIET, a system that performs in-network data aggregation. Experimental results with an initial prototype show a large data reduction ratio (86.9%-89.3%) and a similar decrease in the workers' computation time.

  5. In-Network Computation is a Dumb Idea Whose Time Has Come

    KAUST Repository

    Sapio, Amedeo

    2017-11-27

    Programmable data plane hardware creates new opportunities for infusing intelligence into the network. This raises a fundamental question: what kinds of computation should be delegated to the network? In this paper, we discuss the opportunities and challenges for co-designing data center distributed systems with their network layer. We believe that the time has finally come for offloading part of their computation to execute in-network. However, in-network computation tasks must be judiciously crafted to match the limitations of the network machine architecture of programmable devices. With the help of our experiments on machine learning and graph analytics workloads, we identify that aggregation functions raise opportunities to exploit the limited computation power of networking hardware to lessen network congestion and improve the overall application performance. Moreover, as a proof-of-concept, we propose DAIET, a system that performs in-network data aggregation. Experimental results with an initial prototype show a large data reduction ratio (86.9%-89.3%) and a similar decrease in the workers' computation time.

  6. A Convex Formulation for Magnetic Particle Imaging X-Space Reconstruction.

    Science.gov (United States)

    Konkle, Justin J; Goodwill, Patrick W; Hensley, Daniel W; Orendorff, Ryan D; Lustig, Michael; Conolly, Steven M

    2015-01-01

    Magnetic particle imaging (MPI) is an emerging imaging modality with exceptional promise for clinical applications in rapid angiography, cell therapy tracking, cancer imaging, and inflammation imaging. Recent publications have demonstrated quantitative MPI across rat-sized fields of view with x-space reconstruction methods. Critical to any medical imaging technology is the reliability and accuracy of image reconstruction. Because the average value of the MPI signal is lost during direct-feedthrough signal filtering, MPI reconstruction algorithms must recover this zero-frequency value. Prior x-space MPI recovery techniques were limited to 1D approaches, which could introduce artifacts when reconstructing a 3D image. In this paper, we formulate x-space reconstruction as a 3D convex optimization problem and apply robust a priori knowledge of image smoothness and non-negativity to reduce non-physical banding and haze artifacts. We conclude with a discussion of the powerful extensibility of the presented formulation for future applications.
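    A minimal 1-D sketch of the kind of convex program described (smoothness penalty plus a non-negativity constraint), solved by projected gradient descent; this only illustrates the idea and is not the authors' 3D x-space reconstruction.

```c
/* Sketch:  minimize 0.5*||x - y||^2 + 0.5*lambda*||Dx||^2  s.t. x >= 0,
 * where D is the first-difference operator (smoothness prior).
 * Projected gradient descent: gradient step, then clamp at zero.
 * y is a made-up noisy 1-D signal, not MPI data. */
#include <stdio.h>

#define N 8

int main(void)
{
    double y[N] = {0.1, -0.2, 0.9, 1.1, 1.0, 0.2, -0.1, 0.0};
    double x[N] = {0};
    double lambda = 1.0, step = 0.2;   /* step <= 1/(1+4*lambda) */

    for (int it = 0; it < 500; ++it) {
        double g[N];
        for (int i = 0; i < N; ++i) {
            g[i] = x[i] - y[i];                      /* data fidelity */
            if (i > 0)     g[i] += lambda * (x[i] - x[i - 1]);
            if (i < N - 1) g[i] += lambda * (x[i] - x[i + 1]);
        }
        for (int i = 0; i < N; ++i) {
            x[i] -= step * g[i];
            if (x[i] < 0.0) x[i] = 0.0;   /* non-negativity projection */
        }
    }
    for (int i = 0; i < N; ++i) printf("%.3f ", x[i]);
    printf("\n");
    return 0;
}
```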

  7. On the possibility of non-invasive multilayer temperature estimation using soft-computing methods.

    Science.gov (United States)

    Teixeira, C A; Pereira, W C A; Ruano, A E; Ruano, M Graça

    2010-01-01

    considered, i.e. five 5-mm spaced spatial points and eight therapeutic intensities (I(SATA)): 0.3, 0.5, 0.7, 1.0, 1.3, 1.5, 1.7 and 2.0 W/cm(2). Models were trained and selected to estimate temperature at only four intensities; then, during the validation phase, the best-fitted models were analyzed on data collected at all eight intensities. This procedure leads to a more realistic evaluation of the generalisation level of the best-obtained structures. At the end of the identification phase, 82 (preferable) estimator models were achieved. The majority of them present an average maximum absolute error (MAE) below 0.5 degrees C. The best-fitted estimator presents a MAE of only 0.4 degrees C across all 40 operating conditions. This means that the gold-standard maximum error (0.5 degrees C) specified for hyperthermia was fulfilled independently of the intensity and spatial position considered, showing the improved generalisation capacity of the identified estimator models. Like the majority of the preferable estimator models, the best one has 6 inputs and 11 neurons. In addition to the appropriate error performance, the estimator models also present reduced computational complexity and thus the possibility of being applied in real time. A non-invasive temperature estimation model, based on a soft-computing technique, was proposed for a three-layered phantom. The best-achieved estimator models presented an appropriate error performance regardless of the spatial point considered (inside or at the interface of the layers) and of the intensity applied. Other methodologies published so far estimate temperature only in homogeneous media. The main drawback of the proposed methodology is the necessity of a priori knowledge of the temperature behavior. Data used for training and optimisation should be representative, i.e., they should cover all possible physical situations of the estimation environment.

  8. Reduced computational cost in the calculation of worst case response time for real time systems

    OpenAIRE

    Urriza, José M.; Schorb, Lucas; Orozco, Javier D.; Cayssials, Ricardo

    2009-01-01

    Modern Real Time Operating Systems require reducing computational costs even though microprocessors become more powerful each day. It is usual for Real Time Operating Systems for embedded systems to have advanced features to administer the resources of the applications that they support. In order to guarantee either the schedulability of the system or the schedulability of a new task in a dynamic Real Time System, it is necessary to know the Worst Case Response Time of the Real Time tasks ...
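    The quantity in question is classically computed with the fixed-point recurrence of Joseph and Pandya, R_i = C_i + sum over higher-priority tasks j of ceil(R_i/T_j)*C_j; a minimal sketch follows (the task set is illustrative, and the paper's cost-reduction technique itself is not reproduced here).

```c
/* Sketch of the classical worst-case response time recurrence for
 * fixed-priority preemptive scheduling (Joseph & Pandya):
 *   R_i = C_i + sum_{j in hp(i)} ceil(R_i / T_j) * C_j,
 * iterated to a fixed point. The task set is an illustration. */
#include <math.h>
#include <stdio.h>

int main(void)
{
    /* tasks sorted by decreasing priority: execution time C, period T */
    double C[] = {1.0, 2.0, 3.0};
    double T[] = {4.0, 8.0, 16.0};
    int n = 3;

    for (int i = 0; i < n; ++i) {
        double r = C[i], prev = 0.0;
        while (r != prev && r <= T[i]) {   /* converged or unschedulable */
            prev = r;
            r = C[i];
            for (int j = 0; j < i; ++j)    /* higher-priority interference */
                r += ceil(prev / T[j]) * C[j];
        }
        printf("task %d: R = %g (%s)\n", i, r,
               r <= T[i] ? "schedulable" : "deadline miss");
    }
    return 0;
}
```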

  9. Relaxation in x-space magnetic particle imaging.

    Science.gov (United States)

    Croft, Laura R; Goodwill, Patrick W; Conolly, Steven M

    2012-12-01

    Magnetic particle imaging (MPI) is a new imaging modality that noninvasively images the spatial distribution of superparamagnetic iron oxide nanoparticles (SPIOs). MPI has demonstrated high contrast and zero attenuation with depth, and MPI promises superior safety compared to current angiography methods, X-ray, computed tomography, and magnetic resonance imaging angiography. Nanoparticle relaxation can delay the SPIO magnetization, and in this work we investigate the open problem of the role relaxation plays in MPI scanning and its effect on the image. We begin by amending the x-space theory of MPI to include nanoparticle relaxation effects. We then validate the amended theory with experiments from a Berkeley x-space relaxometer and a Berkeley x-space projection MPI scanner. Our theory and experimental data indicate that relaxation reduces SNR and asymmetrically blurs the image in the scanning direction. While relaxation effects can have deleterious effects on the MPI scan, we show theoretically and experimentally that x-space reconstruction remains robust in the presence of relaxation. Furthermore, the role of relaxation in x-space theory provides guidance as we develop methods to minimize relaxation-induced blurring. This will be an important future area of research for the MPI community.

  10. The impact of chronic kidney disease as a predictor of major cardiac events in patients with no evidence of coronary artery disease

    International Nuclear Information System (INIS)

    Furuhashi, Tatsuhiko; Moroi, Masao; Joki, Nobuhiko; Hase, Hiroki; Masai, Hirofumi; Kunimasa, Taeko; Nakazato, Ryo; Fukuda, Hiroshi; Sugi, Kaoru

    2010-01-01

    Normal stress myocardial perfusion images (MPI) generally indicate a good prognosis for cardiovascular events. However, chronic kidney disease (CKD) is one of the important risk factors for coronary artery disease (CAD), and the interpretation of normal stress MPI has not been well established in CKD patients with no evidence of CAD. The purpose of this study was to evaluate the long-term prognostic value of stress MPI in CKD patients with no evidence of myocardial ischemia or infarction. Patients who had no history of CAD but were suspected of it, and who had normal stress MPI (n=307, male=208, age=67 years, CKD/non-CKD=46/261), were followed up for 4.5 years. CKD was defined as a glomerular filtration rate <60 ml/min/1.73 m2 and/or persistent proteinuria. Cardiac death, non-fatal myocardial infarction, and unstable angina requiring hospitalization were defined as major cardiac events. Major cardiac events were observed in 3 of 261 (1.1%) non-CKD patients and 6 of 46 (13%) CKD patients (p<0.001, log-rank test). CKD was an independent risk factor for major cardiac events (hazard ratio=13.1, p<0.001, multivariate Cox regression analysis). Normal stress MPI does not always promise a good prognosis for major cardiac events. Even in patients with no evidence of CAD on stress MPI, CKD can be an independent and significant risk factor for major cardiac events. (author)

  11. Parallel hyperbolic PDE simulation on clusters: Cell versus GPU

    Science.gov (United States)

    Rostrup, Scott; De Sterck, Hans

    2010-12-01

    Increasingly, high-performance computing is looking towards data-parallel computational devices to enhance computational performance. Two technologies that have received significant attention are IBM's Cell Processor and NVIDIA's CUDA programming model for graphics processing unit (GPU) computing. In this paper we investigate the acceleration of parallel hyperbolic partial differential equation simulation on structured grids with explicit time integration on clusters with Cell and GPU backends. The message passing interface (MPI) is used for communication between nodes at the coarsest level of parallelism. Optimizations of the simulation code at the several finer levels of parallelism that the data-parallel devices provide are described in terms of data layout, data flow and data-parallel instructions. Optimized Cell and GPU performance are compared with reference code performance on a single x86 central processing unit (CPU) core in single and double precision. We further compare the CPU, Cell and GPU platforms on a chip-to-chip basis, and compare performance on single cluster nodes with two CPUs, two Cell processors or two GPUs in a shared memory configuration (without MPI). We finally compare performance on clusters with 32 CPUs, 32 Cell processors, and 32 GPUs using MPI. Our GPU cluster results use NVIDIA Tesla GPUs with GT200 architecture, but some preliminary results on recently introduced NVIDIA GPUs with the next-generation Fermi architecture are also included. This paper provides computational scientists and engineers who are considering porting their codes to accelerator environments with insight into how structured grid based explicit algorithms can be optimized for clusters with Cell and GPU accelerators. It also provides insight into the speed-up that may be gained on current and future accelerator architectures for this class of applications. Program summary - Program title: SWsolver; Catalogue identifier: AEGY_v1_0; Program summary URL

  12. Time-Domain Terahertz Computed Axial Tomography NDE System

    Science.gov (United States)

    Zimdars, David

    2012-01-01

    NASA has identified the need for advanced non-destructive evaluation (NDE) methods to characterize aging and durability in aircraft materials to improve the safety of the nation's airline fleet. 3D THz tomography can play a major role in detection and characterization of flaws and degradation in aircraft materials, including Kevlar-based composites and Kevlar and Zylon fabric covers for soft-shell fan containment where aging and durability issues are critical. A prototype computed tomography (CT) time-domain (TD) THz imaging system has been used to generate 3D images of several test objects including a TUFI tile (a thermal protection system tile used on the Space Shuttle and possibly the Orion or similar capsules). This TUFI tile had simulated impact damage that was located and the depth of damage determined. The CT motion control gantry was designed and constructed, and then integrated with a T-Ray 4000 control unit and motion controller to create a complete CT TD-THz imaging system prototype. A data collection software script was developed that takes multiple z-axis slices in sequence and saves the data for batch processing. The data collection software was integrated with the ability to batch process the slice data with the CT TD-THz image reconstruction software. The time required to take a single CT slice was decreased from six minutes to approximately one minute by replacing the 320 ps, 100-Hz waveform acquisition system with an 80 ps, 1,000-Hz waveform acquisition system. The TD-THz computed tomography system was built from pre-existing commercial off-the-shelf subsystems. A CT motion control gantry was constructed from COTS components that can handle larger samples. The motion control gantry allows inspection of sample sizes of up to approximately one cubic foot (~0.03 cubic meters). The system reduced to practice a CT TD-THz system incorporating a COTS 80-ps/1-kHz waveform scanner. The incorporation of this scanner in the system allows acquisition of 3D

  13. Short-term effects of air quality and thermal stress on non-accidental morbidity-a multivariate meta-analysis comparing indices to single measures.

    Science.gov (United States)

    Lokys, Hanna Leona; Junk, Jürgen; Krein, Andreas

    2018-01-01

    Air quality and thermal stress lead to increased morbidity and mortality. Studies on morbidity and the combined impact of air pollution and thermal stress are still rare. To analyse the correlations between air quality, thermal stress and morbidity, we used a two-stage meta-analysis approach, consisting of a Poisson regression model combined with distributed lag non-linear models (DLNMs) and a meta-analysis investigating whether latitude or the number of inhabitants significantly influence the correlations. We used air pollution, meteorological and hospital admission data from 28 administrative districts along a north-south gradient in western Germany from 2001 to 2011. We compared the performance of the single measure particulate matter (PM10) and air temperature to air quality indices (MPI and CAQI) and the biometeorological index UTCI. Based on the Akaike information criterion (AIC), it can be shown that using air quality indices instead of single measures increases the model strength. However, using the UTCI in the model does not give additional information compared to mean air temperature. Interaction between the 3-day average of air quality (max PM10, max CAQI and max MPI) and meteorology (mean air temperature and mean UTCI) did not improve the models. Using the mean air temperature, we found immediate effects of heat stress (RR 1.0013, 95% CI: 0.9983-1.0043) and by 3 days delayed effects of cold stress (RR: 1.0184, 95% CI: 1.0117-1.0252). The results for air quality differ between both air quality indices and PM10. CAQI and MPI show a delayed impact on morbidity with a maximum RR after 2 days (MPI 1.0058, 95% CI: 1.0013-1.0102; CAQI 1.0068, 95% CI: 1.0030-1.0107). Latitude was identified as a significant meta-variable, whereas the number of inhabitants was not significant in the model.

  14. Short-term effects of air quality and thermal stress on non-accidental morbidity—a multivariate meta-analysis comparing indices to single measures

    Science.gov (United States)

    Lokys, Hanna Leona; Junk, Jürgen; Krein, Andreas

    2018-01-01

    Air quality and thermal stress lead to increased morbidity and mortality. Studies on morbidity and the combined impact of air pollution and thermal stress are still rare. To analyse the correlations between air quality, thermal stress and morbidity, we used a two-stage meta-analysis approach, consisting of a Poisson regression model combined with distributed lag non-linear models (DLNMs) and a meta-analysis investigating whether latitude or the number of inhabitants significantly influence the correlations. We used air pollution, meteorological and hospital admission data from 28 administrative districts along a north-south gradient in western Germany from 2001 to 2011. We compared the performance of the single measure particulate matter (PM10) and air temperature to air quality indices (MPI and CAQI) and the biometeorological index UTCI. Based on the Akaike information criterion (AIC), it can be shown that using air quality indices instead of single measures increases the model strength. However, using the UTCI in the model does not give additional information compared to mean air temperature. Interaction between the 3-day average of air quality (max PM10, max CAQI and max MPI) and meteorology (mean air temperature and mean UTCI) did not improve the models. Using the mean air temperature, we found immediate effects of heat stress (RR 1.0013, 95% CI: 0.9983-1.0043) and by 3 days delayed effects of cold stress (RR: 1.0184, 95% CI: 1.0117-1.0252). The results for air quality differ between both air quality indices and PM10. CAQI and MPI show a delayed impact on morbidity with a maximum RR after 2 days (MPI 1.0058, 95% CI: 1.0013-1.0102; CAQI 1.0068, 95% CI: 1.0030-1.0107). Latitude was identified as a significant meta-variable, whereas the number of inhabitants was not significant in the model.

  15. Multiscale Methods, Parallel Computation, and Neural Networks for Real-Time Computer Vision.

    Science.gov (United States)

    Battiti, Roberto

    1990-01-01

    This thesis presents new algorithms for low and intermediate level computer vision. The guiding ideas in the presented approach are those of hierarchical and adaptive processing, concurrent computation, and supervised learning. Processing of the visual data at different resolutions is used not only to reduce the amount of computation necessary to reach the fixed point, but also to produce a more accurate estimation of the desired parameters. The presented adaptive multiple scale technique is applied to the problem of motion field estimation. Different parts of the image are analyzed at a resolution that is chosen in order to minimize the error in the coefficients of the differential equations to be solved. Tests with video-acquired images show that velocity estimation is more accurate over a wide range of motion with respect to the homogeneous scheme. In some cases introduction of explicit discontinuities coupled to the continuous variables can be used to avoid propagation of visual information from areas corresponding to objects with different physical and/or kinematic properties. The human visual system uses concurrent computation in order to process the vast amount of visual data in "real-time." Although with different technological constraints, parallel computation can be used efficiently for computer vision. All the presented algorithms have been implemented on medium grain distributed memory multicomputers with a speed-up approximately proportional to the number of processors used. A simple two-dimensional domain decomposition assigns regions of the multiresolution pyramid to the different processors. The inter-processor communication needed during the solution process is proportional to the linear dimension of the assigned domain, so that efficiency is close to 100% if a large region is assigned to each processor. Finally, learning algorithms are shown to be a viable technique to engineer computer vision systems for different applications starting from

  16. Non-Perturbative Formulation of Time-Dependent String Solutions

    CERN Document Server

    Alexandre, J; Mavromatos, Nikolaos E; Alexandre, Jean; Ellis, John; Mavromatos, Nikolaos E.

    2006-01-01

    We formulate here a new world-sheet renormalization-group technique for the bosonic string, which is non-perturbative in the Regge slope alpha' and based on a functional method for controlling the quantum fluctuations, whose magnitudes are scaled by the value of alpha'. Using this technique we exhibit, in addition to the well-known linear-dilaton cosmology, a new, non-perturbative time-dependent background solution. Using the reparametrization invariance of the string S-matrix, we demonstrate that this solution is conformally invariant to order alpha', and we give a heuristic inductive argument that conformal invariance can be maintained to all orders in alpha'. This new time-dependent string solution may be applicable to primordial cosmology or to the exit from linear-dilaton cosmology at large times.

  17. Non-obstructive coronary artery disease assessed by coronary computed tomography angiography

    DEFF Research Database (Denmark)

    Nielsen, L.; Bøtker, H. E.; Sorensen, H.

    2015-01-01

    Introduction: Coronary CT angiography (CTA) detects non-obstructive coronary artery disease (CAD) that may not be recognized by functional testing, but the prognostic impact is not well understood. This study aimed to compare the risk of myocardial infarction (MI) and all-cause mortality in patients without or with non-obstructive and obstructive CAD assessed by coronary CTA. Methods: Consecutive patients without known coronary artery disease (CAD) and with chest pain who underwent coronary CTA (>64-detector row) between January 2007 and December 2012 in the 10 centers participating in the Western Denmark Cardiac Computed Tomography Registry were included. The endpoints were 3-year MI or all-cause mortality. The coronary CTA result was defined as normal (0% luminal stenosis), non-obstructive CAD (1%-49% luminal stenosis) or obstructive CAD (>50% luminal stenosis; 1-vessel, 2-vessel, or 3

  18. Parallelization of Subchannel Analysis Code MATRA

    International Nuclear Information System (INIS)

    Kim, Seongjin; Hwang, Daehyun; Kwon, Hyouk

    2014-01-01

    A stand-alone calculation with the MATRA code takes acceptable computing time for thermal margin calculations, whereas a considerably longer time is needed to solve whole-core pin-by-pin problems. In addition, improving the computation speed of the MATRA code is strongly required to satisfy the overall performance of multi-physics coupling calculations. Therefore, a parallel approach to improve and optimize the computability of the MATRA code is proposed and verified in this study. The parallel algorithm is embodied in the MATRA code using the MPI communication method, and the modification of the previous code structure was minimized. An improvement is confirmed by comparing the results between the single- and multiple-processor algorithms. The speedup and efficiency are also evaluated when increasing the number of processors. The parallel algorithm was implemented in the subchannel code MATRA using MPI. The performance of the parallel algorithm was verified by comparing the results with those from MATRA with a single processor. It is also noticed that the performance of the MATRA code was greatly improved by implementing the parallel algorithm for the 1/8-core and whole-core problems.
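    The scalability metrics quoted throughout these records are simple to compute: speedup S(p) = T(1)/T(p) and parallel efficiency E(p) = S(p)/p. A minimal sketch with made-up timings (not MATRA measurements) follows.

```c
/* Sketch of the usual scalability metrics:
 *   speedup    S(p) = T(1) / T(p)
 *   efficiency E(p) = S(p) / p
 * Timings below are hypothetical illustration values. */
#include <stdio.h>

int main(void)
{
    double t1   = 100.0;                     /* single-processor runtime [s] */
    int    p[]  = {2, 4, 8, 16};
    double tp[] = {52.0, 27.0, 14.5, 8.2};   /* hypothetical parallel runtimes */

    for (int i = 0; i < 4; ++i) {
        double s = t1 / tp[i];
        printf("p=%2d  speedup=%5.2f  efficiency=%.2f\n", p[i], s, s / p[i]);
    }
    return 0;
}
```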

  19. A Non-standard Empirical Likelihood for Time Series

    DEFF Research Database (Denmark)

    Nordman, Daniel J.; Bunzel, Helle; Lahiri, Soumendra N.

    Standard blockwise empirical likelihood (BEL) for stationary, weakly dependent time series requires specifying a fixed block length as a tuning parameter for setting confidence regions. This aspect can be difficult and impacts coverage accuracy. As an alternative, this paper proposes a new version of BEL based on a simple, though non-standard, data-blocking rule which uses a data block of every possible length. Consequently, the method involves no block selection and is also anticipated to exhibit better coverage performance. Its non-standard blocking scheme, however, induces non-standard asymptotics and requires a significantly different development compared to standard BEL. We establish the large-sample distribution of log-ratio statistics from the new BEL method for calibrating confidence regions for mean or smooth function parameters of time series. This limit law is not the usual chi-square.

  20. Review of quantum computation

    International Nuclear Information System (INIS)

    Lloyd, S.

    1992-01-01

    Digital computers are machines that can be programmed to perform logical and arithmetical operations. Contemporary digital computers are ''universal,'' in the sense that a program that runs on one computer can, if properly compiled, run on any other computer that has access to enough memory space and time. Any one universal computer can simulate the operation of any other; and the set of tasks that any such machine can perform is common to all universal machines. Since Bennett's discovery that computation can be carried out in a non-dissipative fashion, a number of Hamiltonian quantum-mechanical systems have been proposed whose time-evolutions over discrete intervals are equivalent to those of specific universal computers. The first quantum-mechanical treatment of computers was given by Benioff, who exhibited a Hamiltonian system with a basis whose members corresponded to the logical states of a Turing machine. In order to make the Hamiltonian local, in the sense that its structure depended only on the part of the computation being performed at that time, Benioff found it necessary to make the Hamiltonian time-dependent. Feynman discovered a way to make the computational Hamiltonian both local and time-independent by incorporating the direction of computation in the initial condition. In Feynman's quantum computer, the program is a carefully prepared wave packet that propagates through different computational states. Deutsch presented a quantum computer that exploits the possibility of existing in a superposition of computational states to perform tasks that a classical computer cannot, such as generating purely random numbers, and carrying out superpositions of computations as a method of parallel processing. In this paper, we show that such computers, by virtue of their common function, possess a common form for their quantum dynamics