WorldWideScience

Sample records for high computational efficiency

  1. Dimensioning storage and computing clusters for efficient High Throughput Computing

    CERN Multimedia

    CERN. Geneva

    2012-01-01

    Scientific experiments are producing huge amounts of data, and they continue increasing the size of their datasets and the total volume of data. These data are then processed by researchers belonging to large scientific collaborations, with the Large Hadron Collider being a good example. The focal point of scientific data centres has shifted from coping efficiently with petabyte-scale storage to delivering quality data processing throughput. The dimensioning of the internal components in High Throughput Computing (HTC) data centres is of crucial importance to cope with all the activities demanded by the experiments, both online (data acceptance) and offline (data processing, simulation and user analysis). This requires a precise setup involving disk and tape storage services, a computing cluster and the internal networking to prevent bottlenecks, overloads and undesired slowness that lead to lost CPU cycles and batch job failures. In this paper we point out relevant features for running a successful data storage and processing service in an intensive HTC environment.

  2. Dimensioning storage and computing clusters for efficient high throughput computing

    International Nuclear Information System (INIS)

    Accion, E; Bria, A; Bernabeu, G; Caubet, M; Delfino, M; Espinal, X; Merino, G; Lopez, F; Martinez, F; Planas, E

    2012-01-01

    Scientific experiments are producing huge amounts of data, and the size of their datasets and the total volume of data continue to increase. These data are then processed by researchers belonging to large scientific collaborations, with the Large Hadron Collider being a good example. The focal point of scientific data centers has shifted from efficiently coping with petabyte-scale storage to delivering quality data processing throughput. The dimensioning of the internal components in High Throughput Computing (HTC) data centers is of crucial importance to cope with all the activities demanded by the experiments, both online (data acceptance) and offline (data processing, simulation and user analysis). This requires a precise setup involving disk and tape storage services, a computing cluster and the internal networking to prevent bottlenecks, overloads and undesired slowness that lead to lost CPU cycles and batch job failures. In this paper we point out relevant features for running a successful data storage and processing service in an intensive HTC environment.

  3. Low-cost, high-performance and efficiency computational photometer design

    Science.gov (United States)

    Siewert, Sam B.; Shihadeh, Jeries; Myers, Randall; Khandhar, Jay; Ivanov, Vitaly

    2014-05-01

    Researchers at the University of Alaska Anchorage and the University of Colorado Boulder have built a low-cost, high-performance and high-efficiency drop-in-place Computational Photometer (CP) to test in field applications ranging from port security and safety monitoring to environmental compliance monitoring and surveying. The CP integrates off-the-shelf visible spectrum cameras with near- to long-wavelength infrared detectors and high-resolution digital snapshots in a single device. The proof of concept combines three or more detectors into a single multichannel imaging system that can time-correlate read-out, capture, and image-process all of the channels concurrently with high performance and energy efficiency. The dual-channel continuous read-out is combined with a third high-definition digital snapshot capability and has been designed using an FPGA (Field Programmable Gate Array) to capture, decimate, down-convert, re-encode, and transform images from two standard-definition CCD (Charge Coupled Device) cameras at 30 Hz. The continuous stereo vision can be time-correlated to megapixel high-definition snapshots. This proof of concept has been fabricated as a four-layer PCB (Printed Circuit Board) suitable for use in education and research for low-cost, high-efficiency field monitoring applications that need multispectral and three-dimensional imaging capabilities. Initial testing is in progress and includes field testing in ports, potential test flights in unmanned aerial systems, and future planned missions to image harsh environments in the Arctic, including volcanic plumes, ice formation, and Arctic marine life.

  4. High-efficiency photorealistic computer-generated holograms based on the backward ray-tracing technique

    Science.gov (United States)

    Wang, Yuan; Chen, Zhidong; Sang, Xinzhu; Li, Hui; Zhao, Linmin

    2018-03-01

    Holographic displays can provide the complete optical wave field of a three-dimensional (3D) scene, including the depth perception. However, producing traditional computer-generated holograms (CGHs) often takes a long computation time, even without complex, photorealistic rendering. The backward ray-tracing technique is able to render photorealistic, high-quality images and noticeably reduces the computation time thanks to its high degree of parallelism. Here, a high-efficiency photorealistic computer-generated hologram method based on the backward ray-tracing technique is presented. Rays are launched and traced in parallel under different illuminations and circumstances. Experimental results demonstrate the effectiveness of the proposed method. Compared with the traditional point-cloud CGH, the computation time is decreased to 24 s to reconstruct a 3D object of 100 × 100 rays with continuous depth change.

  5. Power-efficient computer architectures recent advances

    CERN Document Server

    Själander, Magnus; Kaxiras, Stefanos

    2014-01-01

    As Moore's Law and Dennard scaling trends have slowed, the challenges of building high-performance computer architectures while maintaining acceptable power efficiency levels have heightened. Over the past ten years, architecture techniques for power efficiency have shifted from primarily focusing on module-level efficiencies, toward more holistic design styles based on parallelism and heterogeneity. This work highlights and synthesizes recent techniques and trends in power-efficient computer architecture. Table of Contents: Introduction / Voltage and Frequency Management / Heterogeneity and Sp

  6. Computation of the efficiency distribution of a multichannel focusing collimator

    International Nuclear Information System (INIS)

    Balasubramanian, A.; Venkateswaran, T.V.

    1977-01-01

    This article describes two computer methods for calculating the point-source efficiency distribution functions of a focusing collimator with round tapered holes. The first method, which computes only the geometric efficiency distribution, is adequate for low-energy collimators, while the second method, which computes both geometric and penetration efficiencies, can be used for medium- and high-energy collimators. The scatter contribution to the efficiency is not taken into account. In the first method the efficiency distribution of a single cone of the collimator is obtained and the data are used for computing the distribution of the whole collimator. For high-energy collimators the entire detector region is imagined to be divided into elemental areas. The efficiency of each elemental area is computed after suitable weighting for the penetration within the collimator septa, which is determined by three-dimensional geometric techniques. The method of computing the line-source efficiency distribution from the point-source distribution is also explained. The formulations have been tested by computing the efficiency distribution of several commercial collimators and collimators fabricated by us. (Auth.)

  7. A highly efficient parallel algorithm for solving the neutron diffusion nodal equations on shared-memory computers

    International Nuclear Information System (INIS)

    Azmy, Y.Y.; Kirk, B.L.

    1990-01-01

    Modern parallel computer architectures offer an enormous potential for reducing CPU and wall-clock execution times of large-scale computations commonly performed in various applications in science and engineering. Recently, several authors have reported their efforts in developing and implementing parallel algorithms for solving the neutron diffusion equation on a variety of shared- and distributed-memory parallel computers. Testing of these algorithms for a variety of two- and three-dimensional meshes showed significant speedup of the computation. Even for very large problems (i.e., three-dimensional fine meshes) executed concurrently on a few nodes in serial (nonvector) mode, however, the measured computational efficiency is very low (40 to 86%). In this paper, the authors present a highly efficient (∼85 to 99.9%) algorithm for solving the two-dimensional nodal diffusion equations on the Sequent Balance 8000 parallel computer. Also presented is a model for the performance, represented by the efficiency, as a function of problem size and the number of participating processors. The model is validated through several tests and then extrapolated to larger problems and more processors to predict the performance of the algorithm in more computationally demanding situations
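
    The efficiency figures quoted above follow the standard definition used in parallel computing, E(p) = T(1)/(p·T(p)). As a purely illustrative sketch (not the authors' Sequent Balance performance model), the following Python code derives speedup and efficiency from measured run times and adds an Amdahl-style extrapolation to more processors; the timings and the 1% serial fraction are hypothetical.

```python
# Illustrative speedup/efficiency metrics with an Amdahl-style extrapolation;
# hypothetical numbers, not the performance model described in the paper.

def speedup(t_serial: float, t_parallel: float) -> float:
    """Speedup S(p) = T(1) / T(p)."""
    return t_serial / t_parallel

def efficiency(t_serial: float, t_parallel: float, p: int) -> float:
    """Parallel efficiency E(p) = S(p) / p, reported as a percentage."""
    return 100.0 * speedup(t_serial, t_parallel) / p

def amdahl_efficiency(serial_fraction: float, p: int) -> float:
    """Predicted efficiency if a fraction f of the work is inherently serial."""
    s = 1.0 / (serial_fraction + (1.0 - serial_fraction) / p)
    return 100.0 * s / p

if __name__ == "__main__":
    t1, t8 = 1000.0, 130.0   # hypothetical run times on 1 and 8 processors
    print(f"measured efficiency on 8 CPUs: {efficiency(t1, t8, 8):.1f}%")
    print(f"Amdahl prediction for 64 CPUs: {amdahl_efficiency(0.01, 64):.1f}%")
```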

  8. Efficient quantum computing with weak measurements

    International Nuclear Information System (INIS)

    Lund, A P

    2011-01-01

    Projective measurements with high quantum efficiency are often assumed to be required for efficient circuit-based quantum computing. We argue that this is not the case and show that the fact that they are not required was actually known previously but was not deeply explored. We examine this issue by giving an example of how to perform the quantum order-finding algorithm efficiently using non-local weak measurements, provided that the measurements used are of bounded weakness and that only some fixed but arbitrary probability of success less than unity is required. We also show that it is possible to perform the same computation with only local weak measurements, but this must necessarily introduce an exponential overhead.

  9. Efficient GPU-based skyline computation

    DEFF Research Database (Denmark)

    Bøgh, Kenneth Sejdenfaden; Assent, Ira; Magnani, Matteo

    2013-01-01

    The skyline operator for multi-criteria search returns the most interesting points of a data set with respect to any monotone preference function. Existing work has almost exclusively focused on efficiently computing skylines on one or more CPUs, ignoring the high parallelism possible in GPUs. In...
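
    For context, the skyline (Pareto-optimal) set can be computed sequentially with a simple block-nested-loop scan; the sketch below is only a CPU baseline under the assumption that smaller values are preferred in every dimension, not the GPU algorithm of the paper, and the hotel data are hypothetical.

```python
from typing import List, Sequence

def dominates(p: Sequence[float], q: Sequence[float]) -> bool:
    """p dominates q if p <= q in every dimension and p < q in at least one
    (smaller values preferred)."""
    return all(a <= b for a, b in zip(p, q)) and any(a < b for a, b in zip(p, q))

def skyline(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Block-nested-loop skyline: keep every point not dominated by another."""
    result: List[Sequence[float]] = []
    for p in points:
        if any(dominates(q, p) for q in result):
            continue                                    # p is dominated: drop it
        result = [q for q in result if not dominates(p, q)]  # p may prune others
        result.append(p)
    return result

if __name__ == "__main__":
    hotels = [(50, 3.0), (60, 1.0), (40, 5.0), (55, 0.8)]  # (price, distance)
    print(skyline(hotels))   # -> [(50, 3.0), (40, 5.0), (55, 0.8)]
```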

  10. Computational efficiency for the surface renewal method

    Science.gov (United States)

    Kelley, Jason; Higgins, Chad

    2018-04-01

    Measuring surface fluxes using the surface renewal (SR) method requires programmatic algorithms for tabulation, algebraic calculation, and data quality control. A number of different methods have been published describing automated calibration of SR parameters. Because the SR method utilizes high-frequency (10 Hz+) measurements, some steps in the flux calculation are computationally expensive, especially when automating SR to perform many iterations of these calculations. Several new algorithms were written that perform the required calculations more efficiently and rapidly, and that were tested for sensitivity to the length of the flux averaging period, the ability to measure over a large range of lag timescales, and overall computational efficiency. These algorithms utilize signal processing techniques and algebraic simplifications that demonstrate simple modifications that dramatically improve computational efficiency. The results here complement efforts by other authors to standardize a robust and accurate computational SR method. The increased speed of computation grants flexibility in implementing the SR method, opening new avenues for SR to be used in research, for applied monitoring, and in novel field deployments.
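
    Surface renewal analyses are commonly built on structure functions of the high-frequency scalar trace evaluated at many time lags; the sketch below (an illustration under that assumption, not the authors' algorithms) computes them in a vectorized way for a hypothetical 10 Hz temperature series.

```python
import numpy as np

def structure_functions(x: np.ndarray, lags, orders=(2, 3, 5)) -> np.ndarray:
    """Structure functions S^n(r) = mean((x[i] - x[i-r])**n) for several lags r
    and orders n, computed with vectorized lagged differences.
    Returns an array of shape (len(lags), len(orders))."""
    out = np.empty((len(lags), len(orders)))
    for k, r in enumerate(lags):
        d = x[r:] - x[:-r]                 # all lagged differences at once
        for j, n in enumerate(orders):
            out[k, j] = np.mean(d ** n)
    return out

if __name__ == "__main__":
    fs = 10.0                                   # hypothetical 10 Hz record
    t = np.arange(0.0, 1800.0, 1.0 / fs)        # a 30-minute averaging period
    temp = 25 + 0.3 * np.sin(2 * np.pi * t / 60) + 0.05 * np.random.randn(t.size)
    lags = range(1, 21)                         # lags of 0.1 s ... 2.0 s
    print(structure_functions(temp, lags).shape)   # (20, 3)
```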

  11. Improving the Eco-Efficiency of High Performance Computing Clusters Using EECluster

    Directory of Open Access Journals (Sweden)

    Alberto Cocaña-Fernández

    2016-03-01

    As data and supercomputing centres increase their performance to improve service quality and target more ambitious challenges every day, their carbon footprint also continues to grow, and has already reached the magnitude of the aviation industry. In addition, high power consumption is building up to a remarkable bottleneck for the expansion of these infrastructures in economic terms due to the unavailability of sufficient energy sources. A substantial part of the problem is caused by the current energy consumption of High Performance Computing (HPC) clusters. To alleviate this situation, we present in this work EECluster, a tool that integrates with multiple open-source Resource Management Systems to significantly reduce the carbon footprint of clusters by improving their energy efficiency. EECluster implements a dynamic power management mechanism based on Computational Intelligence techniques, learning a set of rules through multi-criteria evolutionary algorithms. This approach enables cluster operators to find the optimal balance between a reduction in the cluster energy consumption, service quality, and the number of reconfigurations. Experimental studies using both synthetic and actual workloads from a real-world cluster support the adoption of this tool to reduce the carbon footprint of HPC clusters.

  12. Many-core technologies: The move to energy-efficient, high-throughput x86 computing (TFLOPS on a chip)

    CERN Multimedia

    CERN. Geneva

    2012-01-01

    With Moore's Law alive and well, more and more parallelism is introduced into all computing platforms at all levels of integration and programming to achieve higher performance and energy efficiency. Especially in the area of High-Performance Computing (HPC) users can entertain a combination of different hardware and software parallel architectures and programming environments. Those technologies range from vectorization and SIMD computation over shared memory multi-threading (e.g. OpenMP) to distributed memory message passing (e.g. MPI) on cluster systems. We will discuss HPC industry trends and Intel's approach to it from processor/system architectures and research activities to hardware and software tools technologies. This includes the recently announced new Intel® Many Integrated Core (MIC) architecture for highly-parallel workloads and general purpose, energy efficient TFLOPS performance, some of its architectural features and its programming environment. At the end we will have a br...

  13. Interaction Entropy: A New Paradigm for Highly Efficient and Reliable Computation of Protein-Ligand Binding Free Energy.

    Science.gov (United States)

    Duan, Lili; Liu, Xiao; Zhang, John Z H

    2016-05-04

    Efficient and reliable calculation of protein-ligand binding free energy is a grand challenge in computational biology and is of critical importance in drug design and many other molecular recognition problems. The main challenge lies in the calculation of entropic contribution to protein-ligand binding or interaction systems. In this report, we present a new interaction entropy method which is theoretically rigorous, computationally efficient, and numerically reliable for calculating entropic contribution to free energy in protein-ligand binding and other interaction processes. Drastically different from the widely employed but extremely expensive normal mode method for calculating entropy change in protein-ligand binding, the new method calculates the entropic component (interaction entropy or -TΔS) of the binding free energy directly from molecular dynamics simulation without any extra computational cost. Extensive study of over a dozen randomly selected protein-ligand binding systems demonstrated that this interaction entropy method is both computationally efficient and numerically reliable and is vastly superior to the standard normal mode approach. This interaction entropy paradigm introduces a novel and intuitive conceptual understanding of the entropic effect in protein-ligand binding and other general interaction systems as well as a practical method for highly efficient calculation of this effect.
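
    A minimal sketch of the interaction entropy formula described above, -TΔS = kT·ln⟨exp(βΔE_int)⟩ with ΔE_int the fluctuation of the protein-ligand interaction energy about its trajectory mean, applied to a hypothetical series of per-frame MD interaction energies (the numbers and units below are assumptions, not data from the paper):

```python
import numpy as np

KB = 0.0019872041  # Boltzmann constant in kcal/(mol*K)

def interaction_entropy(e_int: np.ndarray, temperature: float = 300.0) -> float:
    """Interaction-entropy term  -T*dS = kT * ln< exp(dE_int / kT) >,
    where dE_int are the fluctuations of the sampled protein-ligand
    interaction energies (kcal/mol) about their mean."""
    kt = KB * temperature
    d_e = e_int - e_int.mean()              # fluctuation about the mean
    return kt * np.log(np.mean(np.exp(d_e / kt)))

if __name__ == "__main__":
    # Hypothetical per-frame interaction energies from an MD trajectory.
    rng = np.random.default_rng(0)
    e_int = -45.0 + 2.5 * rng.standard_normal(5000)
    print(f"-T*dS ~ {interaction_entropy(e_int):.2f} kcal/mol")
```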

  14. Efficient computation of hashes

    International Nuclear Information System (INIS)

    Lopes, Raul H C; Franqueira, Virginia N L; Hobson, Peter R

    2014-01-01

    The sequential computation of hashes that lies at the core of many distributed storage systems, and is found, for example, in grid services, can hinder efficiency in service quality and even pose security challenges that can only be addressed by the use of parallel hash tree modes. The main contributions of this paper are, first, the identification of several efficiency and security challenges posed by the use of sequential hash computation based on the Merkle–Damgård engine. In addition, alternatives for the parallel computation of hash trees are discussed, and a prototype for a new parallel implementation of the Keccak function, the SHA-3 winner, is introduced.
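
    To illustrate why tree modes parallelize where sequential Merkle–Damgård chaining cannot, here is a generic SHA-256 hash-tree sketch (not the Keccak tree mode prototyped in the paper): leaf chunks are hashed independently, so they can be processed concurrently, and the results are combined pairwise up to a single root.

```python
import hashlib
from concurrent.futures import ProcessPoolExecutor

def _leaf_hash(chunk: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + chunk).digest()     # domain-separated leaf

def _node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()

def merkle_root(data: bytes, chunk_size: int = 1 << 20) -> bytes:
    """Hash tree over fixed-size chunks: the leaves are independent of each
    other (unlike a single sequential pass), so they are hashed in parallel."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)] or [b""]
    with ProcessPoolExecutor() as pool:
        level = list(pool.map(_leaf_hash, chunks))
    while len(level) > 1:
        if len(level) % 2:                  # duplicate the last node on odd levels
            level.append(level[-1])
        level = [_node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

if __name__ == "__main__":
    print(merkle_root(b"x" * (5 << 20)).hex())
```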

  15. Efficient computation of argumentation semantics

    CERN Document Server

    Liao, Beishui

    2013-01-01

    Efficient Computation of Argumentation Semantics addresses argumentation semantics and systems, introducing readers to cutting-edge decomposition methods that drive increasingly efficient logic computation in AI and intelligent systems. Such complex and distributed systems are increasingly used in the automation and transportation systems field, and particularly autonomous systems, as well as more generic intelligent computation research. The Series in Intelligent Systems publishes titles that cover state-of-the-art knowledge and the latest advances in research and development in intelligen

  16. Computational design of high efficiency release targets for use at ISOL facilities

    CERN Document Server

    Liu, Y

    1999-01-01

    This report describes efforts made at the Oak Ridge National Laboratory to design high-efficiency-release targets that simultaneously incorporate the short diffusion lengths, high permeabilities, controllable temperatures, and heat-removal properties required for the generation of useful radioactive ion beam (RIB) intensities for nuclear physics and astrophysics research using the isotope separation on-line (ISOL) technique. Short diffusion lengths are achieved either by using thin fibrous target materials or by coating thin layers of selected target material onto low-density carbon fibers such as reticulated-vitreous-carbon fiber (RVCF) or carbon-bonded-carbon fiber (CBCF) to form highly permeable composite target matrices. Computational studies that simulate the generation and removal of primary beam deposited heat from target materials have been conducted to optimize the design of target/heat-sink systems for generating RIBs. The results derived from diffusion release-rate simulation studies for selected t...

  17. Complexity-aware high efficiency video coding

    CERN Document Server

    Correa, Guilherme; Agostini, Luciano; Cruz, Luis A da Silva

    2016-01-01

    This book discusses computational complexity of High Efficiency Video Coding (HEVC) encoders with coverage extending from the analysis of HEVC compression efficiency and computational complexity to the reduction and scaling of its encoding complexity. After an introduction to the topic and a review of the state-of-the-art research in the field, the authors provide a detailed analysis of the HEVC encoding tools compression efficiency and computational complexity.  Readers will benefit from a set of algorithms for scaling the computational complexity of HEVC encoders, all of which take advantage from the flexibility of the frame partitioning structures allowed by the standard.  The authors also provide a set of early termination methods based on data mining and machine learning techniques, which are able to reduce the computational complexity required to find the best frame partitioning structures. The applicability of the proposed methods is finally exemplified with an encoding time control system that emplo...

  18. Computing with memory for energy-efficient robust systems

    CERN Document Server

    Paul, Somnath

    2013-01-01

    This book analyzes energy and reliability as major challenges faced by designers of computing frameworks in the nanometer technology regime.  The authors describe the existing solutions to address these challenges and then reveal a new reconfigurable computing platform, which leverages high-density nanoscale memory for both data storage and computation to maximize the energy-efficiency and reliability. The energy and reliability benefits of this new paradigm are illustrated and the design challenges are discussed. Various hardware and software aspects of this exciting computing paradigm are de

  19. Convolutional networks for fast, energy-efficient neuromorphic computing.

    Science.gov (United States)

    Esser, Steven K; Merolla, Paul A; Arthur, John V; Cassidy, Andrew S; Appuswamy, Rathinakumar; Andreopoulos, Alexander; Berg, David J; McKinstry, Jeffrey L; Melano, Timothy; Barch, Davis R; di Nolfo, Carmelo; Datta, Pallab; Amir, Arnon; Taba, Brian; Flickner, Myron D; Modha, Dharmendra S

    2016-10-11

    Deep networks are now able to achieve human-level performance on a broad spectrum of recognition tasks. Independently, neuromorphic computing has now demonstrated unprecedented energy-efficiency through a new chip architecture based on spiking neurons, low precision synapses, and a scalable communication network. Here, we demonstrate that neuromorphic computing, despite its novel architectural primitives, can implement deep convolution networks that (i) approach state-of-the-art classification accuracy across eight standard datasets encompassing vision and speech, (ii) perform inference while preserving the hardware's underlying energy-efficiency and high throughput, running on the aforementioned datasets at between 1,200 and 2,600 frames/s and using between 25 and 275 mW (effectively >6,000 frames/s per Watt), and (iii) can be specified and trained using backpropagation with the same ease-of-use as contemporary deep learning. This approach allows the algorithmic power of deep learning to be merged with the efficiency of neuromorphic processors, bringing the promise of embedded, intelligent, brain-inspired computing one step closer.

  20. Power-Efficient Computing: Experiences from the COSA Project

    Directory of Open Access Journals (Sweden)

    Daniele Cesini

    2017-01-01

    Energy consumption is today one of the most relevant issues in operating HPC systems for scientific applications. The use of unconventional computing systems is therefore of great interest for several scientific communities looking for a better tradeoff between time-to-solution and energy-to-solution. In this context, the performance assessment of processors with a high ratio of performance per watt is necessary to understand how to realize energy-efficient computing systems for scientific applications using this class of processors. Computing On SOC Architecture (COSA) is a three-year project (2015–2017) funded by the Scientific Commission V of the Italian Institute for Nuclear Physics (INFN), which aims to investigate the performance and the total cost of ownership offered by computing systems based on commodity low-power Systems on Chip (SoCs) and highly energy-efficient systems based on GP-GPUs. In this work, we present the results of the project, analyzing the performance of several scientific applications on several GPU- and SoC-based systems. We also describe the methodology we have used to measure energy performance and the tools we have implemented to monitor the power drained by applications while running.

  1. On efficiency of fire simulation realization: parallelization with greater number of computational meshes

    Science.gov (United States)

    Valasek, Lukas; Glasa, Jan

    2017-12-01

    Current fire simulation systems are capable of utilizing the advantages of available high-performance computing (HPC) platforms and of modelling fires efficiently in parallel. In this paper, the efficiency of a corridor fire simulation on an HPC computer cluster is discussed. The parallel MPI version of the Fire Dynamics Simulator is used for testing the efficiency of selected strategies for allocating the computational resources of the cluster using a greater number of computational cores. Simulation results indicate that if the number of cores used is not equal to a multiple of the total number of cores per cluster node, there are allocation strategies which provide more efficient calculations.

  2. High performance computing in Windows Azure cloud

    OpenAIRE

    Ambruš, Dejan

    2013-01-01

    High performance, security, availability, scalability, flexibility and lower costs of maintenance have essentially contributed to the growing popularity of cloud computing in all spheres of life, especially in business. In fact cloud computing offers even more than this. With usage of virtual computing clusters a runtime environment for high performance computing can be efficiently implemented also in a cloud. There are many advantages but also some disadvantages of cloud computing, some ...

  3. An efficient implementation of 3D high-resolution imaging for large-scale seismic data with GPU/CPU heterogeneous parallel computing

    Science.gov (United States)

    Xu, Jincheng; Liu, Wei; Wang, Jin; Liu, Linong; Zhang, Jianfeng

    2018-02-01

    De-absorption pre-stack time migration (QPSTM) compensates for the absorption and dispersion of seismic waves by introducing an effective Q parameter, thereby making it an effective tool for 3D, high-resolution imaging of seismic data. Although the optimal aperture obtained via stationary-phase migration reduces the computational cost of 3D QPSTM and yields 3D stationary-phase QPSTM, the associated computational efficiency is still the main problem in the processing of 3D, high-resolution images for real large-scale seismic data. In the current paper, we proposed a division method for large-scale, 3D seismic data to optimize the performance of stationary-phase QPSTM on clusters of graphics processing units (GPU). Then, we designed an imaging-point parallel strategy to achieve optimal parallel computing performance. Afterward, we adopted an asynchronous double-buffering scheme for multiple streams to perform the GPU/CPU parallel computing. Moreover, several key optimization strategies for computation and storage based on the compute unified device architecture (CUDA) were adopted to accelerate the 3D stationary-phase QPSTM algorithm. Compared with the initial GPU code, the implementation of the key optimization steps, including thread optimization, shared memory optimization, register optimization and special function units (SFU), greatly improved the efficiency. A numerical example employing real large-scale, 3D seismic data showed that our scheme is nearly 80 times faster than the CPU-QPSTM algorithm. Our GPU/CPU heterogeneous parallel computing framework significantly reduces the computational cost and facilitates 3D high-resolution imaging for large-scale seismic data.

  4. A primer on the energy efficiency of computing

    Energy Technology Data Exchange (ETDEWEB)

    Koomey, Jonathan G. [Research Fellow, Steyer-Taylor Center for Energy Policy and Finance, Stanford University (United States)

    2015-03-30

    The efficiency of computing at peak output has increased rapidly since the dawn of the computer age. This paper summarizes some of the key factors affecting the efficiency of computing in all usage modes. While there is still great potential for improving the efficiency of computing devices, we will need to alter how we do computing in the next few decades because we are finally approaching the limits of current technologies.

  5. Convolutional networks for fast, energy-efficient neuromorphic computing

    Science.gov (United States)

    Esser, Steven K.; Merolla, Paul A.; Arthur, John V.; Cassidy, Andrew S.; Appuswamy, Rathinakumar; Andreopoulos, Alexander; Berg, David J.; McKinstry, Jeffrey L.; Melano, Timothy; Barch, Davis R.; di Nolfo, Carmelo; Datta, Pallab; Amir, Arnon; Taba, Brian; Flickner, Myron D.; Modha, Dharmendra S.

    2016-01-01

    Deep networks are now able to achieve human-level performance on a broad spectrum of recognition tasks. Independently, neuromorphic computing has now demonstrated unprecedented energy-efficiency through a new chip architecture based on spiking neurons, low precision synapses, and a scalable communication network. Here, we demonstrate that neuromorphic computing, despite its novel architectural primitives, can implement deep convolution networks that (i) approach state-of-the-art classification accuracy across eight standard datasets encompassing vision and speech, (ii) perform inference while preserving the hardware’s underlying energy-efficiency and high throughput, running on the aforementioned datasets at between 1,200 and 2,600 frames/s and using between 25 and 275 mW (effectively >6,000 frames/s per Watt), and (iii) can be specified and trained using backpropagation with the same ease-of-use as contemporary deep learning. This approach allows the algorithmic power of deep learning to be merged with the efficiency of neuromorphic processors, bringing the promise of embedded, intelligent, brain-inspired computing one step closer. PMID:27651489

  6. GATE: Improving the computational efficiency

    International Nuclear Information System (INIS)

    Staelens, S.; De Beenhouwer, J.; Kruecker, D.; Maigne, L.; Rannou, F.; Ferrer, L.; D'Asseler, Y.; Buvat, I.; Lemahieu, I.

    2006-01-01

    GATE is a software package dedicated to Monte Carlo simulations in Single Photon Emission Computed Tomography (SPECT) and Positron Emission Tomography (PET). An important disadvantage of those simulations is the fundamental burden of computation time. This manuscript describes three different techniques to improve the efficiency of those simulations. Firstly, the implementation of variance reduction techniques (VRTs), more specifically the incorporation of geometrical importance sampling, is discussed. After this, the newly designed cluster version of the GATE software is described. The experiments have shown that GATE simulations scale very well on a cluster of homogeneous computers. Finally, an elaboration on the deployment of GATE on the Enabling Grids for E-Science in Europe (EGEE) grid concludes the description of efficiency enhancement efforts. The three aforementioned methods improve the efficiency of GATE to a large extent and make realistic patient-specific overnight Monte Carlo simulations achievable.

  7. Spin-neurons: A possible path to energy-efficient neuromorphic computers

    Energy Technology Data Exchange (ETDEWEB)

    Sharad, Mrigank; Fan, Deliang; Roy, Kaushik [School of Electrical and Computer Engineering, Purdue University, West Lafayette, Indiana 47907 (United States)

    2013-12-21

    Recent years have witnessed growing interest in the field of brain-inspired computing based on neural-network architectures. In order to translate the related algorithmic models into powerful, yet energy-efficient cognitive-computing hardware, computing-devices beyond CMOS may need to be explored. The suitability of such devices to this field of computing would strongly depend upon how closely their physical characteristics match with the essential computing primitives employed in such models. In this work, we discuss the rationale of applying emerging spin-torque devices for bio-inspired computing. Recent spin-torque experiments have shown the path to low-current, low-voltage, and high-speed magnetization switching in nano-scale magnetic devices. Such magneto-metallic, current-mode spin-torque switches can mimic the analog summing and “thresholding” operation of an artificial neuron with high energy-efficiency. Comparison with CMOS-based analog circuit-model of a neuron shows that “spin-neurons” (spin based circuit model of neurons) can achieve more than two orders of magnitude lower energy and beyond three orders of magnitude reduction in energy-delay product. The application of spin-neurons can therefore be an attractive option for neuromorphic computers of future.

  8. Efficient Parallel Kernel Solvers for Computational Fluid Dynamics Applications

    Science.gov (United States)

    Sun, Xian-He

    1997-01-01

    Distributed-memory parallel computers dominate today's parallel computing arena. These machines, such as the Intel Paragon, IBM SP2, and Cray Origin2000, have successfully delivered high-performance computing power for solving some of the so-called "grand-challenge" problems. Despite initial success, parallel machines have not been widely accepted in production engineering environments due to the complexity of parallel programming. On a parallel computing system, a task has to be partitioned and distributed appropriately among processors to reduce communication cost and to attain load balance. More importantly, even with careful partitioning and mapping, the performance of an algorithm may still be unsatisfactory, since conventional sequential algorithms may be serial in nature and may not be implemented efficiently on parallel machines. In many cases, new algorithms have to be introduced to increase parallel performance. In order to achieve optimal performance, in addition to partitioning and mapping, a careful performance study should be conducted for a given application to find a good algorithm-machine combination. This process, however, is usually painful and elusive. The goal of this project is to design and develop efficient parallel algorithms for highly accurate Computational Fluid Dynamics (CFD) simulations and other engineering applications. The work plan is 1) developing highly accurate parallel numerical algorithms, 2) conducting preliminary testing to verify the effectiveness and potential of these algorithms, and 3) incorporating newly developed algorithms into actual simulation packages. The work plan has been well achieved. Two highly accurate, efficient Poisson solvers have been developed and tested based on two different approaches: (1) adopting a mathematical geometry which has a better capacity to describe the fluid, and (2) using a compact scheme to gain high-order accuracy in numerical discretization. The previously developed Parallel Diagonal Dominant (PDD) algorithm

  9. Energy efficiency indicators for high electric-load buildings

    Energy Technology Data Exchange (ETDEWEB)

    Aebischer, Bernard; Balmer, Markus A.; Kinney, Satkartar; Le Strat, Pascale; Shibata, Yoshiaki; Varone, Frederic

    2003-06-01

    Energy per unit of floor area is not an adequate indicator for energy efficiency in high electric-load buildings. For two activities, restaurants and computer centres, alternative indicators for energy efficiency are discussed.

  10. Exploring Infiniband Hardware Virtualization in OpenNebula towards Efficient High-Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Pais Pitta de Lacerda Ruivo, Tiago [IIT, Chicago; Bernabeu Altayo, Gerard [Fermilab; Garzoglio, Gabriele [Fermilab; Timm, Steven [Fermilab; Kim, Hyun-Woo [Fermilab; Noh, Seo-Young [KISTI, Daejeon; Raicu, Ioan [IIT, Chicago

    2014-11-11

    It has been widely accepted that software virtualization has a big negative impact on high-performance computing (HPC) application performance. This work explores the potential use of Infiniband hardware virtualization in an OpenNebula cloud towards the efficient support of MPI-based workloads. We have implemented, deployed, and tested an Infiniband network on the FermiCloud private Infrastructure-as-a-Service (IaaS) cloud. To avoid software virtualization and minimize the virtualization overhead, we employed a technique called Single Root Input/Output Virtualization (SRIOV). Our solution spanned modifications to the Linux hypervisor as well as to the OpenNebula manager. We evaluated the performance of the hardware virtualization on up to 56 virtual machines connected by up to 8 DDR Infiniband network links, with micro-benchmarks (latency and bandwidth) as well as with an MPI-intensive application (the HPL Linpack benchmark).

  11. Efficient and Flexible Computation of Many-Electron Wave Function Overlaps.

    Science.gov (United States)

    Plasser, Felix; Ruckenbauer, Matthias; Mai, Sebastian; Oppel, Markus; Marquetand, Philipp; González, Leticia

    2016-03-08

    A new algorithm for the computation of the overlap between many-electron wave functions is described. This algorithm allows for the extensive use of recurring intermediates and thus provides high computational efficiency. Because of the general formalism employed, overlaps can be computed for varying wave function types, molecular orbitals, basis sets, and molecular geometries. This paves the way for efficiently computing nonadiabatic interaction terms for dynamics simulations. In addition, other application areas can be envisaged, such as the comparison of wave functions constructed at different levels of theory. Aside from explaining the algorithm and evaluating the performance, a detailed analysis of the numerical stability of wave function overlaps is carried out, and strategies for overcoming potential severe pitfalls due to displaced atoms and truncated wave functions are presented.
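
    The elementary building block of such overlap codes is the overlap of two single Slater determinants expressed in a common atomic-orbital basis, ⟨Ψ_A|Ψ_B⟩ = det(C_Aᵀ S C_B); the sketch below shows only this step with random orthonormal orbitals and deliberately omits the reuse of recurring intermediates (minors of this matrix) that gives the published algorithm its efficiency.

```python
import numpy as np

def determinant_overlap(c_occ_a: np.ndarray, c_occ_b: np.ndarray,
                        s_ao: np.ndarray) -> float:
    """Overlap of two single Slater determinants in a common AO basis:
    <Psi_A|Psi_B> = det(C_A^T S_AO C_B), where C_* hold the occupied MO
    coefficients as columns. CI wave functions are linear combinations of
    such determinants, so their overlap is a weighted sum of these terms."""
    return float(np.linalg.det(c_occ_a.T @ s_ao @ c_occ_b))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n_ao, n_occ = 10, 4
    s_ao = np.eye(n_ao)                           # pretend-orthonormal AO basis
    c_a, _ = np.linalg.qr(rng.standard_normal((n_ao, n_occ)))
    c_b, _ = np.linalg.qr(rng.standard_normal((n_ao, n_occ)))
    print(determinant_overlap(c_a, c_b, s_ao))
```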

  12. Recovery Act - CAREER: Sustainable Silicon -- Energy-Efficient VLSI Interconnect for Extreme-Scale Computing

    Energy Technology Data Exchange (ETDEWEB)

    Chiang, Patrick [Oregon State Univ., Corvallis, OR (United States)

    2014-01-31

    The research goal of this CAREER proposal is to develop energy-efficient VLSI interconnect circuits and systems that will facilitate future massively-parallel, high-performance computing. Extreme-scale computing will exhibit massive parallelism on multiple vertical levels, from thousands of computational units on a single processor to thousands of processors in a single data center. Unfortunately, the energy required to communicate between these units at every level (on-chip, off-chip, off-rack) will be the critical limitation to energy efficiency. Therefore, the PI's career goal is to become a leading researcher in the design of energy-efficient VLSI interconnect for future computing systems.

  13. A Computationally Efficient Method for Polyphonic Pitch Estimation

    Directory of Open Access Journals (Sweden)

    Ruohua Zhou

    2009-01-01

    This paper presents a computationally efficient method for polyphonic pitch estimation. The method employs the Fast Resonator Time-Frequency Image (RTFI) as the basic time-frequency analysis tool. The approach is composed of two main stages. First, a preliminary pitch estimation is obtained by means of a simple peak-picking procedure in the pitch energy spectrum. This spectrum is calculated from the original RTFI energy spectrum according to harmonic grouping principles. Then the incorrect estimations are removed according to spectral irregularity and knowledge of the harmonic structures of the music notes played on commonly used music instruments. The new approach is compared with a variety of other frame-based polyphonic pitch estimation methods, and the results demonstrate the high performance and computational efficiency of the approach.
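
    A minimal sketch of the preliminary peak-picking stage mentioned above (without the RTFI front end or the harmonic-grouping and spectral-irregularity rules of the paper), applied to a hypothetical pitch-energy spectrum:

```python
import numpy as np

def pick_pitch_peaks(pitch_energy: np.ndarray, freqs: np.ndarray,
                     rel_threshold: float = 0.1, max_notes: int = 6):
    """Preliminary polyphonic pitch candidates by simple peak picking:
    keep local maxima whose energy exceeds a fraction of the strongest peak,
    ordered strongest first."""
    e = pitch_energy
    is_peak = (e[1:-1] > e[:-2]) & (e[1:-1] > e[2:])     # local maxima
    idx = np.where(is_peak)[0] + 1
    idx = idx[e[idx] >= rel_threshold * e.max()]         # energy threshold
    idx = idx[np.argsort(e[idx])[::-1]][:max_notes]      # strongest first
    return freqs[idx], e[idx]

if __name__ == "__main__":
    freqs = np.linspace(50, 2000, 1024)
    # Hypothetical spectrum with energy near A4 (440 Hz) and C5 (~523 Hz).
    spectrum = (np.exp(-((freqs - 440) / 8) ** 2)
                + 0.6 * np.exp(-((freqs - 523) / 8) ** 2))
    print(pick_pitch_peaks(spectrum, freqs)[0])
```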

  14. Efficient Resource Management in Cloud Computing

    OpenAIRE

    Rushikesh Shingade; Amit Patil; Shivam Suryawanshi; M. Venkatesan

    2015-01-01

    Cloud computing is one of the most widely used technologies for providing cloud services to users, who are charged for the services they receive. With a maximum number of resources available, the performance of Cloud resource management policies is difficult to evaluate and optimize efficiently. There are different simulation toolkits available for simulating and modelling the Cloud computing environment, such as GridSim, CloudAnalyst, CloudSim, GreenCloud, CloudAuction, etc. In the proposed Efficient Resource Manage...

  15. Computational screening of new inorganic materials for highly efficient solar energy conversion

    DEFF Research Database (Denmark)

    Kuhar, Korina

    2017-01-01

    ... that we have access to. Despite the vast amounts of energy at our disposal, we are not able to harvest this solar energy efficiently. Currently, there are a few ways of converting solar power into usable energy, such as photovoltaics (PV) or photoelectrochemical generation of fuels (PC). PV processes in solar cells convert solar energy into electricity, and PC uses harvested energy to conduct chemical reactions, such as splitting water into oxygen and, more importantly, hydrogen, also known as the fuel of the future. Further progress in both PV and PC fields is mostly limited by the flaws in materials ... In this work a high-throughput computational search for suitable absorbers for PV and PC applications is presented. A set of descriptors has been developed, such that each descriptor targets an important property or issue of a good solar energy conversion material. The screening study ...

  16. High-efficiency silicon solar cells for low-illumination applications

    OpenAIRE

    Glunz, S.W.; Dicker, J.; Esterle, M.; Hermle, M.; Isenberg, J.; Kamerewerd, F.; Knobloch, J.; Kray, D.; Leimenstoll, A.; Lutz, F.; Oßwald, D.; Preu, R.; Rein, S.; Schäffer, E.; Schetter, C.

    2002-01-01

    At Fraunhofer ISE the fabrication of high-efficiency solar cells was extended from a laboratory scale to a small pilot-line production. Primarily, the fabricated cells are used in small high-efficiency modules integrated in prototypes of solar-powered portable electronic devices such as cellular phones, handheld computers etc. Compared to other applications of high-efficiency cells such as solar cars and planes, the illumination densities found in these mainly indoor applications are signific...

  17. Efficient Multi-Party Computation over Rings

    DEFF Research Database (Denmark)

    Cramer, Ronald; Fehr, Serge; Ishai, Yuval

    2003-01-01

    Secure multi-party computation (MPC) is an active research area, and a wide range of literature can be found nowadays suggesting improvements and generalizations of existing protocols in various directions. However, all current techniques for secure MPC apply to functions that are represented by (boolean or arithmetic) circuits over finite fields. We are motivated by two limitations of these techniques: – Generality. Existing protocols do not apply to computation over more general algebraic structures (except via a brute-force simulation of computation in these structures). – Efficiency. The best ... The paper further demonstrates the usefulness of the above results by presenting a novel application of MPC over (non-field) rings to the round-efficient secure computation of the maximum function. (Basic Research in Computer Science (www.brics.dk), funded by the Danish National Research Foundation.)

  18. Computational model for a high temperature electrolyzer coupled to a HTTR for efficient nuclear hydrogen production

    International Nuclear Information System (INIS)

    Gonzalez, Daniel; Rojas, Leorlen; Rosales, Jesus; Castro, Landy; Gamez, Abel; Brayner, Carlos; Garcia, Lazaro; Garcia, Carlos; Torre, Raciel de la; Sanchez, Danny

    2015-01-01

    The high temperature electrolysis process coupled to a very high temperature reactor (VHTR) is one of the most promising methods for hydrogen production using a nuclear reactor as the primary heat source. However, there are no reports in the scientific literature of a test facility that would allow evaluation of the efficiency of the process and of other physical parameters that have to be taken into consideration for its accurate application in the hydrogen economy as a massive production method. Given this lack of experimental facilities, mathematical models are among the most used tools to study this process and its flowsheets, in which the electrolyzer is the most important component because of its complexity and importance in the process. A computational fluid dynamics (CFD) model for the evaluation and optimization of the electrolyzer of a high temperature electrolysis hydrogen production process flowsheet was developed using ANSYS FLUENT®. The electrolyzer's operational and design parameters will be optimized in order to obtain the maximum hydrogen production and the highest efficiency in the module. This optimized model of the electrolyzer will be incorporated into a chemical process simulation (CPS) code to study the overall high temperature flowsheet coupled to a high temperature accelerator driven system (ADS) that offers advantages in the transmutation of spent fuel. (author)

  19. Computational model for a high temperature electrolyzer coupled to a HTTR for efficient nuclear hydrogen production

    Energy Technology Data Exchange (ETDEWEB)

    Gonzalez, Daniel; Rojas, Leorlen; Rosales, Jesus; Castro, Landy; Gamez, Abel; Brayner, Carlos, E-mail: danielgonro@gmail.com [Universidade Federal de Pernambuco (UFPE), Recife, PE (Brazil); Garcia, Lazaro; Garcia, Carlos; Torre, Raciel de la, E-mail: lgarcia@instec.cu [Instituto Superior de Tecnologias y Ciencias Aplicadas (InSTEC), La Habana (Cuba); Sanchez, Danny [Universidade Estadual de Santa Cruz (UESC), Ilheus, BA (Brazil)

    2015-07-01

    The high temperature electrolysis process coupled to a very high temperature reactor (VHTR) is one of the most promising methods for hydrogen production using a nuclear reactor as the primary heat source. However, there are no reports in the scientific literature of a test facility that would allow evaluation of the efficiency of the process and of other physical parameters that have to be taken into consideration for its accurate application in the hydrogen economy as a massive production method. Given this lack of experimental facilities, mathematical models are among the most used tools to study this process and its flowsheets, in which the electrolyzer is the most important component because of its complexity and importance in the process. A computational fluid dynamics (CFD) model for the evaluation and optimization of the electrolyzer of a high temperature electrolysis hydrogen production process flowsheet was developed using ANSYS FLUENT®. The electrolyzer's operational and design parameters will be optimized in order to obtain the maximum hydrogen production and the highest efficiency in the module. This optimized model of the electrolyzer will be incorporated into a chemical process simulation (CPS) code to study the overall high temperature flowsheet coupled to a high temperature accelerator driven system (ADS) that offers advantages in the transmutation of spent fuel. (author)

  20. Octopus: embracing the energy efficiency of handheld multimedia computers

    NARCIS (Netherlands)

    Havinga, Paul J.M.; Smit, Gerardus Johannes Maria

    1999-01-01

    In the MOBY DICK project we develop and define the architecture of a new generation of mobile hand-held computers called Mobile Digital Companions. The Companions must meet several major requirements: high performance, energy efficiency, a notion of Quality of Service (QoS), small size, and low

  1. Efficient Unsteady Flow Visualization with High-Order Access Dependencies

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Jiang; Guo, Hanqi; Yuan, Xiaoru

    2016-04-19

    We present a novel model based on high-order access dependencies for efficient pathline computation in unsteady flow visualization. By taking longer access sequences into account to model more sophisticated data access patterns in particle tracing, our method greatly improves the accuracy and reliability of data access prediction. In our work, high-order access dependencies are calculated by tracing uniformly-seeded pathlines in both forward and backward directions in a preprocessing stage. The effectiveness of our proposed approach is demonstrated through a parallel particle tracing framework with high-order data prefetching. Results show that our method achieves higher data locality and hence improves the efficiency of pathline computation.
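
    One straightforward way to realize an order-k access-dependency table of the kind described above is to map the last k visited data blocks to frequency counts of the block accessed next, train it offline from recorded access sequences (for example, blocks touched by seed pathlines), and query it at prefetch time; the sketch below is an illustrative reading of that idea, not the authors' implementation.

```python
from collections import Counter, defaultdict, deque
from typing import Dict, Hashable, List, Tuple

class HighOrderAccessModel:
    """Order-k access-dependency table: the last k visited data blocks form
    the key used to predict (and prefetch) the next block."""

    def __init__(self, order: int = 3):
        self.order = order
        self.table: Dict[Tuple[Hashable, ...], Counter] = defaultdict(Counter)

    def train(self, sequence: List[Hashable]) -> None:
        """Count next-block frequencies for every length-k access history."""
        history: deque = deque(maxlen=self.order)
        for block in sequence:
            if len(history) == self.order:
                self.table[tuple(history)][block] += 1
            history.append(block)

    def predict(self, recent: List[Hashable], top: int = 2) -> List[Hashable]:
        """Return the most likely next blocks given the most recent accesses."""
        counts = self.table.get(tuple(recent[-self.order:]), Counter())
        return [block for block, _ in counts.most_common(top)]

if __name__ == "__main__":
    model = HighOrderAccessModel(order=2)
    model.train(["A", "B", "C", "A", "B", "D", "A", "B", "C"])
    print(model.predict(["A", "B"]))   # -> ['C', 'D']
```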

  2. An energy efficient and high speed architecture for convolution computing based on binary resistive random access memory

    Science.gov (United States)

    Liu, Chen; Han, Runze; Zhou, Zheng; Huang, Peng; Liu, Lifeng; Liu, Xiaoyan; Kang, Jinfeng

    2018-04-01

    In this work we present a novel convolution computing architecture based on metal oxide resistive random access memory (RRAM) to process the image data stored in the RRAM arrays. The proposed image-storage architecture shows better speed and device-consumption efficiency compared with the previous kernel-storage architecture. We further improve the architecture for high-accuracy and low-power computing by utilizing binary storage and a series resistor. For a 28 × 28 image and 10 kernels with a size of 3 × 3, compared with the previous kernel-storage approach, the newly proposed architecture shows excellent performance, including: 1) almost 100% accuracy within 20% LRS variation and 90% HRS variation; 2) a more than 67 times speed boost; and 3) 71.4% energy saving.
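
    A plain software analogue of the scheme (not a device-level model) is a valid-mode 2-D convolution over a binary image, where each output value is the sum of kernel weights at positions whose stored bit is 1; in the RRAM array this summation is performed as current accumulation along a column. The 28 × 28 image and 3 × 3 kernel sizes below follow the example in the abstract, while the data themselves are random.

```python
import numpy as np

def binary_conv2d(image_bits: np.ndarray, kernel: np.ndarray) -> np.ndarray:
    """Valid-mode 2-D convolution of a binary image with a small kernel:
    each output is the sum of kernel weights where the stored bit is 1."""
    h, w = image_bits.shape
    kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image_bits[i:i + kh, j:j + kw] * kernel)
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = (rng.random((28, 28)) > 0.5).astype(np.int8)   # random 28x28 bitmap
    kernel = np.ones((3, 3), dtype=np.int8)                # one 3x3 kernel
    print(binary_conv2d(image, kernel).shape)              # (26, 26)
```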

  3. Towards the Automatic Detection of Efficient Computing Assets in a Heterogeneous Cloud Environment

    OpenAIRE

    Iglesias, Jesus Omana; Stokes, Nicola; Ventresque, Anthony; Murphy, Liam, B.E.; Thorburn, James

    2013-01-01

    In a heterogeneous cloud environment, the manual grading of computing assets is the first step in the process of configuring IT infrastructures to ensure optimal utilization of resources. Grading the efficiency of computing assets is, however, a difficult, subjective and time-consuming manual task. Thus, an automatic efficiency grading algorithm is highly desirable. In this paper, we compare the effectiveness of the different criteria used in the manual gr...

  4. Highly efficient separation materials created by computational approach. For the separation of lanthanides and actinides

    International Nuclear Information System (INIS)

    Goto, Masahiro; Uezu, Kazuya; Aoshima, Atsushi; Koma, Yoshikazu

    2002-05-01

    In this study, efficient separation materials have been created by a computational approach. Based on computational calculations, novel organophosphorus extractants, which have two functional moieties in their molecular structure, were developed for a recycling system for transuranium elements using liquid-liquid extraction. Furthermore, molecularly imprinted resins were prepared by the surface-imprint polymerization technique. Through this research project, we obtained two principal results: 1) the design of novel extractants by a computational approach, and 2) the preparation of highly selective resins by the molecular imprinting technique. The synthesized extractants showed extremely high extractability towards rare earth metals compared to commercially available extractants. The results of extraction equilibrium suggested that the structural effect of the extractants is one of the key factors in enhancing the selectivity and extractability in rare earth extractions. Furthermore, a computational analysis was carried out to evaluate the extraction properties for the extraction of rare earth metals by the synthesized extractants. The computer simulation was shown to be very useful for designing new extractants. The new concept of connecting functional moieties with a spacer is very useful and is a promising method for developing novel extractants for the treatment of nuclear fuel. In the second part, we proposed a novel molecular imprinting technique (surface template polymerization) for the separation of lanthanides and actinides. A surface-templated resin is prepared by an emulsion polymerization using an ion-binding (host) monomer, a resin matrix-forming monomer and the target Nd(III) metal ion. A host monomer, which has an amphiphilic nature, forms a complex with a metal ion at the interface, and the complex remains as it is. After the matrix is polymerized, the coordination structure is 'imprinted' at the resin interface. Adsorption of Nd(III) and La(III) ions onto the

  5. Efficient scatter model for simulation of ultrasound images from computed tomography data

    Science.gov (United States)

    D'Amato, J. P.; Lo Vercio, L.; Rubi, P.; Fernandez Vera, E.; Barbuzza, R.; Del Fresno, M.; Larrabide, I.

    2015-12-01

    Background and motivation: Real-time ultrasound simulation refers to the process of computationally creating fully synthetic ultrasound images instantly. Due to the high value of specialized low-cost training for healthcare professionals, there is a growing interest in the use of this technology and in the development of high-fidelity systems that simulate the acquisition of echographic images. The objective is to create an efficient and reproducible simulator that can run either on notebooks or desktops using low-cost devices. Materials and methods: We present an interactive ultrasound simulator based on CT data. This simulator is based on ray-casting and provides real-time interaction capabilities. The simulation of scattering that is coherent with the transducer position in real time is also introduced. Such noise is produced using a simplified model of multiplicative noise and convolution with point spread functions (PSF) tailored for this purpose. Results: The computational efficiency of scattering map generation was revised with improved performance. This allowed a more efficient simulation of coherent scattering in the synthetic echographic images while providing highly realistic results. We describe quality and performance metrics to validate these results, where a performance of up to 55 fps was achieved. Conclusion: The proposed technique for real-time scattering modeling provides realistic yet computationally efficient scatter distributions. The error between the original image and the simulated scattering image was compared for the proposed method and the state of the art, showing negligible differences in its distribution.
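
    A minimal sketch of the simplified scatter model described above, assuming multiplicative Gaussian noise followed by convolution with a separable Gaussian point spread function (the noise level and PSF shape are assumptions, not the simulator's tuned parameters):

```python
import numpy as np
from scipy.signal import fftconvolve

def gaussian_psf(size: int = 9, sigma_ax: float = 1.0, sigma_lat: float = 2.0):
    """Separable Gaussian PSF, elongated laterally as a crude transducer model."""
    ax = np.arange(size) - size // 2
    psf = np.outer(np.exp(-0.5 * (ax / sigma_ax) ** 2),
                   np.exp(-0.5 * (ax / sigma_lat) ** 2))
    return psf / psf.sum()

def simulate_scatter(echo_image: np.ndarray, psf: np.ndarray,
                     noise_sigma: float = 0.4, seed: int = 0) -> np.ndarray:
    """Speckle-like texture: multiply the noise-free echo image by random
    multiplicative noise, then convolve with the PSF so the granularity
    matches the imaging system."""
    rng = np.random.default_rng(seed)
    noise = 1.0 + noise_sigma * rng.standard_normal(echo_image.shape)
    return fftconvolve(echo_image * noise, psf, mode="same")

if __name__ == "__main__":
    ct_slice = np.random.default_rng(1).random((256, 256))   # stand-in CT slice
    print(simulate_scatter(ct_slice, gaussian_psf()).shape)  # (256, 256)
```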

  6. High performance computing in linear control

    International Nuclear Information System (INIS)

    Datta, B.N.

    1993-01-01

    Remarkable progress has been made in both theory and applications of all important areas of control. The theory is rich and very sophisticated. Some beautiful applications of control theory are presently being made in aerospace, biomedical engineering, industrial engineering, robotics, economics, power systems, etc. Unfortunately, the same assessment of progress does not hold in general for computations in control theory. Control Theory is lagging behind other areas of science and engineering in this respect. Nowadays there is a revolution going on in the world of high performance scientific computing. Many powerful computers with vector and parallel processing have been built and have been available in recent years. These supercomputers offer very high speed in computations. Highly efficient software, based on powerful algorithms, has been developed to use on these advanced computers, and has also contributed to increased performance. While workers in many areas of science and engineering have taken great advantage of these hardware and software developments, control scientists and engineers, unfortunately, have not been able to take much advantage of these developments

  7. Efficient Backprojection-Based Synthetic Aperture Radar Computation with Many-Core Processors

    Directory of Open Access Journals (Sweden)

    Jongsoo Park

    2013-01-01

    Tackling computationally challenging problems with high efficiency often requires the combination of algorithmic innovation, advanced architecture, and thorough exploitation of parallelism. We demonstrate this synergy through synthetic aperture radar (SAR) via backprojection, an image reconstruction method that can require hundreds of TFLOPS. Computation cost is significantly reduced by our new algorithm of approximate strength reduction; data movement cost is economized by software locality optimizations facilitated by advanced architecture support; parallelism is fully harnessed in various patterns and granularities. We deliver over 35 billion backprojections per second throughput per compute node on an Intel® Xeon® processor E5-2670-based cluster, equipped with Intel® Xeon Phi™ coprocessors. This corresponds to processing a 3K×3K image within a second using a single node. Our study can be extended to other settings: backprojection is applicable elsewhere, including medical imaging; approximate strength reduction is a general code transformation technique; and many-core processors are emerging as a solution to energy-efficient computing.
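
    For orientation, a textbook time-domain backprojection kernel (nearest-range-bin variant, without the approximate strength reduction or many-core optimizations of the paper) can be sketched as follows; the geometry in the example is synthetic.

```python
import numpy as np

def backproject(range_profiles: np.ndarray, platform_xyz: np.ndarray,
                ranges: np.ndarray, grid_x: np.ndarray, grid_y: np.ndarray,
                wavelength: float) -> np.ndarray:
    """Time-domain SAR backprojection: for every image pixel, accumulate the
    range-compressed echo of each pulse at the pixel's round-trip range after
    removing the expected carrier phase exp(-j*4*pi*R/lambda)."""
    img = np.zeros((grid_y.size, grid_x.size), dtype=complex)
    X, Y = np.meshgrid(grid_x, grid_y)
    dr = ranges[1] - ranges[0]
    for p in range(range_profiles.shape[0]):        # loop over pulses
        px, py, pz = platform_xyz[p]
        R = np.sqrt((X - px) ** 2 + (Y - py) ** 2 + pz ** 2)
        bins = np.clip(np.round((R - ranges[0]) / dr).astype(int),
                       0, ranges.size - 1)          # nearest range bin
        img += range_profiles[p, bins] * np.exp(1j * 4 * np.pi * R / wavelength)
    return img

if __name__ == "__main__":
    pulses, nbins = 64, 256                         # tiny synthetic example
    ranges = np.linspace(900.0, 1100.0, nbins)
    platform = np.stack([np.linspace(-50, 50, pulses),
                         np.full(pulses, -1000.0),
                         np.full(pulses, 300.0)], axis=1)
    data = np.zeros((pulses, nbins), dtype=complex)
    img = backproject(data, platform, ranges,
                      np.linspace(-20, 20, 64), np.linspace(-20, 20, 64),
                      wavelength=0.03)
    print(img.shape)                                # (64, 64)
```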

  8. The Efficient Use of Vector Computers with Emphasis on Computational Fluid Dynamics : a GAMM-Workshop

    CERN Document Server

    Gentzsch, Wolfgang

    1986-01-01

    The GAMM Committee for Numerical Methods in Fluid Mechanics organizes workshops which bring together experts in a narrow field of computational fluid dynamics (CFD) to exchange ideas and experiences in order to speed up development in this field. In this spirit it was suggested that a workshop should treat the solution of CFD problems on vector computers. We therefore organized a workshop with the title "The efficient use of vector computers with emphasis on computational fluid dynamics". The workshop took place at the Computing Centre of the University of Karlsruhe, March 13-15, 1985. Participation was restricted to 22 people from 7 countries, and 18 papers were presented. In the announcement of the workshop we wrote: "Fluid mechanics has actively stimulated the development of superfast vector computers like the CRAY's or CYBER 205. Now these computers in their turn stimulate the development of new algorithms which result in a high degree of vectorization (scalar/vectorized execution time). But w...

  9. A Distributed Snapshot Protocol for Efficient Artificial Intelligence Computation in Cloud Computing Environments

    Directory of Open Access Journals (Sweden)

    JongBeom Lim

    2018-01-01

    Full Text Available Many artificial intelligence applications require a huge amount of computing resources. As a result, cloud computing adoption rates are increasing in the artificial intelligence field. To support the demand for artificial intelligence applications and guarantee the service level agreement, cloud computing should provide not only computing resources but also fundamental mechanisms for efficient computing. In this regard, a snapshot protocol has been used to create a consistent snapshot of the global state in cloud computing environments. However, existing snapshot protocols are not optimized in the context of artificial intelligence applications, where large-scale iterative computation is the norm. In this paper, we present a distributed snapshot protocol for efficient artificial intelligence computation in cloud computing environments. The proposed snapshot protocol is based on a distributed algorithm to run interconnected multiple nodes in a scalable fashion. Our snapshot protocol is able to deal with artificial intelligence applications in which a large number of computing nodes are running. We show that our distributed snapshot protocol guarantees the correctness, safety, and liveness conditions.
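
    For readers unfamiliar with distributed snapshots, the sketch below illustrates the classic Chandy-Lamport marker mechanism over simulated FIFO channels, which is the textbook baseline that such protocols build on. It is not the protocol proposed in this record, and the two-node driver at the end is purely illustrative.

        # Illustrative Chandy-Lamport-style marker snapshot over simulated FIFO channels.
        # This is a generic textbook sketch, not the protocol proposed in the record.
        from collections import deque

        class Node:
            def __init__(self, name, state):
                self.name, self.state = name, state
                self.inbox = {}            # sender name -> FIFO channel (deque of messages)
                self.recorded_state = None
                self.channel_log = {}      # sender name -> messages recorded for the snapshot

            def start_snapshot(self, peers):
                self.recorded_state = self.state
                self.channel_log = {sender: [] for sender in self.inbox}
                for peer in peers:         # send a marker on every outgoing channel
                    peer.inbox[self.name].append(("MARKER", None))

            def receive(self, sender, peers):
                kind, payload = self.inbox[sender].popleft()
                if kind == "MARKER":
                    if self.recorded_state is None:
                        self.start_snapshot(peers)
                    self.channel_log.pop(sender, None)   # stop recording this channel
                else:
                    self.state += payload
                    if self.recorded_state is not None and sender in self.channel_log:
                        self.channel_log[sender].append(payload)

        if __name__ == "__main__":
            a, b = Node("A", 10), Node("B", 20)
            a.inbox["B"], b.inbox["A"] = deque(), deque()
            b.inbox["A"].append(("MSG", 5))   # an application message already in flight
            a.start_snapshot([b])             # A records state 10 and emits a marker
            b.receive("A", [a])               # B applies the in-flight message (state 25)
            b.receive("A", [a])               # B sees the marker and records state 25
            print(a.recorded_state, b.recorded_state, b.channel_log)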

  10. Efficient computation of Laguerre polynomials

    NARCIS (Netherlands)

    A. Gil (Amparo); J. Segura (Javier); N.M. Temme (Nico)

    2017-01-01

    An efficient algorithm and a Fortran 90 module (LaguerrePol) for computing Laguerre polynomials L_n^(α)(z) are presented. The standard three-term recurrence relation satisfied by the polynomials and different types of asymptotic expansions, valid for n large and α small, are used.
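
    The three-term recurrence mentioned here is standard and easy to state concretely. The short Python sketch below evaluates L_n^(α)(z) by upward recurrence; it illustrates only this ingredient of the record, not the LaguerrePol module or its asymptotic expansions.

        # Standard three-term recurrence for generalized Laguerre polynomials L_n^(alpha)(z);
        # the asymptotic expansions used in the record for other parameter regimes are omitted.
        import numpy as np

        def laguerre(n, alpha, z):
            """Evaluate L_n^(alpha)(z) by upward recurrence."""
            z = np.asarray(z, dtype=float)
            l_prev = np.ones_like(z)          # L_0
            l_curr = 1.0 + alpha - z          # L_1
            if n == 0:
                return l_prev
            for k in range(1, n):
                # (k+1) L_{k+1} = (2k + 1 + alpha - z) L_k - (k + alpha) L_{k-1}
                l_prev, l_curr = l_curr, ((2 * k + 1 + alpha - z) * l_curr
                                          - (k + alpha) * l_prev) / (k + 1)
            return l_curr

        # Optional cross-check against scipy:
        # from scipy.special import genlaguerre; print(laguerre(5, 0.3, 2.0), genlaguerre(5, 0.3)(2.0))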

  11. Efficient computation of the joint sample frequency spectra for multiple populations.

    Science.gov (United States)

    Kamm, John A; Terhorst, Jonathan; Song, Yun S

    2017-01-01

    A wide range of studies in population genetics have employed the sample frequency spectrum (SFS), a summary statistic which describes the distribution of mutant alleles at a polymorphic site in a sample of DNA sequences and provides a highly efficient dimensional reduction of large-scale population genomic variation data. Recently, there has been much interest in analyzing the joint SFS data from multiple populations to infer parameters of complex demographic histories, including variable population sizes, population split times, migration rates, admixture proportions, and so on. SFS-based inference methods require accurate computation of the expected SFS under a given demographic model. Although much methodological progress has been made, existing methods suffer from numerical instability and high computational complexity when multiple populations are involved and the sample size is large. In this paper, we present new analytic formulas and algorithms that enable accurate, efficient computation of the expected joint SFS for thousands of individuals sampled from hundreds of populations related by a complex demographic model with arbitrary population size histories (including piecewise-exponential growth). Our results are implemented in a new software package called momi (MOran Models for Inference). Through an empirical study we demonstrate our improvements to numerical stability and computational complexity.

  12. Efficient Secure Multiparty Subset Computation

    Directory of Open Access Journals (Sweden)

    Sufang Zhou

    2017-01-01

    Full Text Available The secure subset problem is important in secure multiparty computation, which is a vital field in cryptography. Most of the existing protocols for this problem can only keep the elements of one set private while leaking the elements of the other set; in other words, they cannot solve the secure subset problem perfectly. The few studies that have addressed truly secure subsets rely mainly on oblivious polynomial evaluation, which is computationally inefficient. In this study, we first design an efficient secure subset protocol for sets whose elements are drawn from a known set, based on a new encoding method and a homomorphic encryption scheme. If the elements of the sets are taken from a large domain, this protocol is inefficient. Using a Bloom filter and a homomorphic encryption scheme, we further present an efficient protocol with computational complexity linear in the cardinality of the large set, which is practical for inputs consisting of a large amount of data. The second protocol may, however, yield a false positive; this probability can be rapidly decreased by re-executing the protocol with different hash functions. Furthermore, we present experimental performance analyses of these protocols.
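
    The second protocol builds on a Bloom filter, whose plaintext behaviour is easy to demonstrate. The sketch below shows only the ordinary Bloom filter membership/subset test and its false-positive caveat; the homomorphic-encryption layer that makes the record's protocol privacy-preserving is deliberately omitted, and the hash construction and sizes are arbitrary choices.

        # Plaintext Bloom filter sketch for approximate subset/membership testing.
        # The homomorphic-encryption layer described in the record is intentionally omitted.
        import hashlib

        class BloomFilter:
            def __init__(self, m_bits=8192, k_hashes=4):
                self.m, self.k = m_bits, k_hashes
                self.bits = bytearray(m_bits)

            def _positions(self, item):
                for i in range(self.k):
                    h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
                    yield int(h, 16) % self.m

            def add(self, item):
                for pos in self._positions(item):
                    self.bits[pos] = 1

            def might_contain(self, item):
                # False positives are possible, false negatives are not.
                return all(self.bits[pos] for pos in self._positions(item))

        def is_probable_subset(small_set, bloom_of_big_set):
            return all(bloom_of_big_set.might_contain(x) for x in small_set)

        if __name__ == "__main__":
            big = BloomFilter()
            for x in range(100):
                big.add(x)
            # Expected: True, then (very likely) False; rare false positives remain possible.
            print(is_probable_subset({1, 2, 3}, big), is_probable_subset({1, 2, 2000}, big))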

  13. A new computationally-efficient two-dimensional model for boron implantation into single-crystal silicon

    International Nuclear Information System (INIS)

    Klein, K.M.; Park, C.; Yang, S.; Morris, S.; Do, V.; Tasch, F.

    1992-01-01

    We have developed a new computationally-efficient two-dimensional model for boron implantation into single-crystal silicon. The model is based on the dual Pearson semi-empirical implant depth profile model and the UT-MARLOWE Monte Carlo boron ion implantation model. It can predict, with very high computational efficiency, two-dimensional as-implanted boron profiles as a function of energy, dose, tilt angle, rotation angle, masking edge orientation, and masking edge thickness.

  14. Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

    CERN Document Server

    Abdurachmanov, David; Elmer, Peter; Eulisse, Giulio; Knight, Robert; Muzaffar, Shahzad

    2014-01-01

    Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost-efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG).

  15. Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

    Science.gov (United States)

    Abdurachmanov, David; Bockelman, Brian; Elmer, Peter; Eulisse, Giulio; Knight, Robert; Muzaffar, Shahzad

    2015-05-01

    Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost-efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG).

  16. Heterogeneous High Throughput Scientific Computing with APM X-Gene and Intel Xeon Phi

    International Nuclear Information System (INIS)

    Abdurachmanov, David; Bockelman, Brian; Elmer, Peter; Eulisse, Giulio; Muzaffar, Shahzad; Knight, Robert

    2015-01-01

    Electrical power requirements will be a constraint on the future growth of Distributed High Throughput Computing (DHTC) as used by High Energy Physics. Performance-per-watt is a critical metric for the evaluation of computer architectures for cost-efficient computing. Additionally, future performance growth will come from heterogeneous, many-core, and high computing density platforms with specialized processors. In this paper, we examine the Intel Xeon Phi Many Integrated Cores (MIC) co-processor and Applied Micro X-Gene ARMv8 64-bit low-power server system-on-a-chip (SoC) solutions for scientific computing applications. We report our experience on software porting, performance and energy efficiency and evaluate the potential for use of such technologies in the context of distributed computing systems such as the Worldwide LHC Computing Grid (WLCG). (paper)

  17. A computationally efficient approach for template matching-based ...

    Indian Academy of Sciences (India)

    In this paper, a new computationally efficient image registration method is ... The proposed method requires less computational time as compared to traditional methods. ... Zitová B and Flusser J 2003 Image registration methods: A survey.

  18. Efficient Skyline Computation in Structured Peer-to-Peer Systems

    DEFF Research Database (Denmark)

    Cui, Bin; Chen, Lijiang; Xu, Linhao

    2009-01-01

    An increasing number of large-scale applications exploit peer-to-peer network architecture to provide highly scalable and flexible services. Among these applications, data management in peer-to-peer systems is one of the interesting domains. In this paper, we investigate the multidimensional skyline computation problem on a structured peer-to-peer network. In order to achieve low communication cost and quick response time, we utilize the iMinMax(θ) method to transform high-dimensional data to one-dimensional values and distribute the data in a structured peer-to-peer network called BATON. Thereafter, we propose a progressive algorithm with an adaptive filter technique for efficient skyline computation in this environment. We further discuss some optimization techniques for the algorithm, and summarize the key principles of our algorithm into a query routing protocol with detailed analysis.

  19. Open source acceleration of wave optics simulations on energy efficient high-performance computing platforms

    Science.gov (United States)

    Beck, Jeffrey; Bos, Jeremy P.

    2017-05-01

    We compare several modifications to the open-source wave optics package, WavePy, intended to improve execution time. Specifically, we compare the relative performance of the Intel MKL, a CPU-based OpenCV distribution, and a GPU-based version. Performance is compared between distributions both on the same compute platform and between a fully-featured computing workstation and the NVIDIA Jetson TX1 platform. Comparisons are drawn in terms of both execution time and power consumption. We have found that substituting the Fast Fourier Transform operation from OpenCV provides a marked improvement on all platforms. In addition, we show that embedded platforms offer the possibility of extensive improvement in terms of efficiency compared to a fully featured workstation.
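
    The kind of drop-in FFT substitution compared above can be prototyped quickly. The Python sketch below times a generic angular-spectrum propagation step with two interchangeable FFT backends (NumPy and SciPy here, standing in for the MKL/OpenCV/GPU variants of the study); it is not WavePy code, and the grid size, wavelength, and propagation distance are arbitrary.

        # Generic angular-spectrum propagation step timed with two FFT backends,
        # illustrating drop-in FFT substitution; this is not WavePy code.
        import time
        import numpy as np
        import scipy.fft as sfft

        def angular_spectrum(field, wavelength, dx, dz, fft2, ifft2):
            n = field.shape[0]
            fx = np.fft.fftfreq(n, d=dx)
            FX, FY = np.meshgrid(fx, fx, indexing="ij")
            # Propagating-wave transfer function; evanescent components are zeroed.
            kz = 2 * np.pi * np.sqrt(np.maximum(0.0, 1.0 / wavelength**2 - FX**2 - FY**2))
            return ifft2(fft2(field) * np.exp(1j * kz * dz))

        if __name__ == "__main__":
            field = np.exp(-np.linspace(-1, 1, 2048)[:, None] ** 2
                           - np.linspace(-1, 1, 2048)[None, :] ** 2).astype(complex)
            for name, fft2, ifft2 in [("numpy", np.fft.fft2, np.fft.ifft2),
                                      ("scipy", sfft.fft2, sfft.ifft2)]:
                t0 = time.perf_counter()
                out = angular_spectrum(field, 1e-6, 1e-5, 0.1, fft2, ifft2)
                print(name, f"{time.perf_counter() - t0:.3f} s", abs(out).max())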

  20. Computer aided drug discovery of highly ligand efficient, low molecular weight imidazopyridine analogs as FLT3 inhibitors.

    Science.gov (United States)

    Frett, Brendan; McConnell, Nick; Smith, Catherine C; Wang, Yuanxiang; Shah, Neil P; Li, Hong-yu

    2015-04-13

    The FLT3 kinase represents an attractive target for the effective treatment of AML. Unfortunately, no FLT3-targeted therapeutic is currently approved. In line with our continued interest in treating kinase-related disease, and aiming for anti-FLT3-mutant activity, we utilized pioneering synthetic methodology in combination with computer-aided drug discovery and identified low molecular weight, highly ligand-efficient FLT3 kinase inhibitors. Compounds were analyzed for biochemical inhibition, their ability to selectively inhibit cell proliferation, FLT3 mutant activity, and preliminary aqueous solubility. Validated hits were discovered that can serve as starting platforms for lead candidates. Copyright © 2015 Elsevier Masson SAS. All rights reserved.

  1. An energy-efficient failure detector for vehicular cloud computing.

    Science.gov (United States)

    Liu, Jiaxi; Wu, Zhibo; Dong, Jian; Wu, Jin; Wen, Dongxin

    2018-01-01

    Failure detectors are one of the fundamental components for maintaining the high availability of vehicular cloud computing. In vehicular cloud computing, many roadside units (RSUs) are deployed along the road to improve connectivity. Many of them are equipped with solar batteries due to the unavailability or excessive expense of wired electrical power, so it is important to reduce the battery consumption of RSUs. However, existing failure detection algorithms are not designed to save the battery consumption of RSUs. To solve this problem, a new energy-efficient failure detector, 2E-FD, is proposed specifically for vehicular cloud computing. 2E-FD not only provides an acceptable failure detection service but also saves the battery consumption of RSUs. Comparative experiments show that our failure detector has better performance in terms of speed, accuracy and battery consumption.
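
    The details of 2E-FD are not reproduced here, but the basic building block of any such detector can be sketched simply. The Python fragment below implements a generic heartbeat/timeout failure detector (class and parameter names are invented for illustration); an energy-aware variant would additionally adapt how often heartbeats are sent and checked.

        # Generic heartbeat-based failure detector sketch; the energy-aware policy of
        # 2E-FD is not reproduced here, only the basic suspect/trust mechanism.
        import time

        class HeartbeatDetector:
            def __init__(self, timeout_s=2.0):
                self.timeout = timeout_s
                self.last_seen = {}

            def heartbeat(self, node_id, now=None):
                self.last_seen[node_id] = now if now is not None else time.monotonic()

            def suspected(self, node_id, now=None):
                now = now if now is not None else time.monotonic()
                last = self.last_seen.get(node_id)
                return last is None or (now - last) > self.timeout

        if __name__ == "__main__":
            fd = HeartbeatDetector(timeout_s=2.0)
            fd.heartbeat("rsu-17", now=100.0)
            print(fd.suspected("rsu-17", now=101.0))  # False: heartbeat within timeout
            print(fd.suspected("rsu-17", now=103.5))  # True: heartbeats missed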

  2. HAlign-II: efficient ultra-large multiple sequence alignment and phylogenetic tree reconstruction with distributed and parallel computing.

    Science.gov (United States)

    Wan, Shixiang; Zou, Quan

    2017-01-01

    Multiple sequence alignment (MSA) plays a key role in biological sequence analyses, especially in phylogenetic tree construction. The extreme growth of next-generation sequencing output has created a shortage of efficient alignment approaches able to cope with ultra-large inputs of different sequence types. Distributed and parallel computing is a crucial technique for accelerating ultra-large (e.g. files of more than 1 GB) sequence analyses. Based on HAlign and the Spark distributed computing system, we implement a highly cost-efficient and time-efficient tool, HAlign-II, to address ultra-large multiple biological sequence alignment and phylogenetic tree construction. Experiments on large-scale DNA and protein datasets of more than 1 GB showed that HAlign-II saves both time and space and outperforms current software tools. HAlign-II can efficiently carry out MSA and construct phylogenetic trees with ultra-large numbers of biological sequences, shows extremely high memory efficiency, and scales well with increases in computing resources. HAlign-II also provides a user-friendly web server based on our distributed computing infrastructure. HAlign-II, with open-source code and datasets, is available at http://lab.malab.cn/soft/halign.

  3. Multi-petascale highly efficient parallel supercomputer

    Science.gov (United States)

    Asaad, Sameh; Bellofatto, Ralph E.; Blocksome, Michael A.; Blumrich, Matthias A.; Boyle, Peter; Brunheroto, Jose R.; Chen, Dong; Cher, Chen -Yong; Chiu, George L.; Christ, Norman; Coteus, Paul W.; Davis, Kristan D.; Dozsa, Gabor J.; Eichenberger, Alexandre E.; Eisley, Noel A.; Ellavsky, Matthew R.; Evans, Kahn C.; Fleischer, Bruce M.; Fox, Thomas W.; Gara, Alan; Giampapa, Mark E.; Gooding, Thomas M.; Gschwind, Michael K.; Gunnels, John A.; Hall, Shawn A.; Haring, Rudolf A.; Heidelberger, Philip; Inglett, Todd A.; Knudson, Brant L.; Kopcsay, Gerard V.; Kumar, Sameer; Mamidala, Amith R.; Marcella, James A.; Megerian, Mark G.; Miller, Douglas R.; Miller, Samuel J.; Muff, Adam J.; Mundy, Michael B.; O'Brien, John K.; O'Brien, Kathryn M.; Ohmacht, Martin; Parker, Jeffrey J.; Poole, Ruth J.; Ratterman, Joseph D.; Salapura, Valentina; Satterfield, David L.; Senger, Robert M.; Smith, Brian; Steinmacher-Burow, Burkhard; Stockdell, William M.; Stunkel, Craig B.; Sugavanam, Krishnan; Sugawara, Yutaka; Takken, Todd E.; Trager, Barry M.; Van Oosten, James L.; Wait, Charles D.; Walkup, Robert E.; Watson, Alfred T.; Wisniewski, Robert W.; Wu, Peng

    2015-07-14

    A Multi-Petascale Highly Efficient Parallel Supercomputer of 100 petaOPS-scale computing, at decreased cost, power and footprint, and that allows for a maximum packaging density of processing nodes from an interconnect point of view. The Supercomputer exploits technological advances in VLSI that enable a computing model where many processors can be integrated into a single Application Specific Integrated Circuit (ASIC). Each ASIC computing node comprises a system-on-chip ASIC utilizing four or more processors integrated into one die, each having full access to all system resources and enabling adaptive partitioning of the processors to functions such as compute or messaging I/O on an application-by-application basis and, preferably, adaptive partitioning of functions in accordance with various algorithmic phases within an application; if I/O or other processors are underutilized, they can participate in computation or communication. Nodes are interconnected by a five-dimensional torus network with DMA that maximizes the throughput of packet communications between nodes and minimizes latency.

  4. The Design of High Efficiency Crossflow Hydro Turbines: A Review and Extension

    Directory of Open Access Journals (Sweden)

    Ram Adhikari

    2018-01-01

    Full Text Available Efficiency is a critical consideration in the design of hydro turbines. The crossflow turbine is the cheapest and easiest hydro turbine to manufacture and so is commonly used in remote power systems for developing countries. A longstanding problem for practical crossflow turbines is their lower maximum efficiency compared to their more advanced counterparts, such as Pelton and Francis turbines. This paper reviews the experimental and computational studies relevant to the design of high efficiency crossflow turbines. We concentrate on the studies that have contributed to designs with efficiencies in the range of 88–90%. Many recent studies have been conducted on turbines of low maximum efficiency, which we believe is due to misunderstanding of design principles for achieving high efficiencies. We synthesize the key results of experimental and computational fluid dynamics studies to highlight the key fundamental design principles for achieving efficiencies of about 90%, as well as future research and development areas to further improve the maximum efficiency. The main finding of this review is that the total conversion of head into kinetic energy in the nozzle and the matching of nozzle and runner designs are the two main design requirements for the design of high efficiency turbines.

  5. Developing a computationally efficient dynamic multilevel hybrid optimization scheme using multifidelity model interactions.

    Energy Technology Data Exchange (ETDEWEB)

    Hough, Patricia Diane (Sandia National Laboratories, Livermore, CA); Gray, Genetha Anne (Sandia National Laboratories, Livermore, CA); Castro, Joseph Pete Jr. (; .); Giunta, Anthony Andrew

    2006-01-01

    Many engineering application problems use optimization algorithms in conjunction with numerical simulators to search for solutions. The formulation of relevant objective functions and constraints dictates the possible optimization algorithms. Often, a gradient-based approach is not possible since objective functions and constraints can be nonlinear, nonconvex, non-differentiable, or even discontinuous, and the simulations involved can be computationally expensive. Moreover, computational efficiency and accuracy are desirable and also influence the choice of solution method. With the advent and increasing availability of massively parallel computers, computational speed has increased tremendously. Unfortunately, the numerical and model complexities of many problems still demand significant computational resources. Moreover, in optimization, these expenses can be a limiting factor since obtaining solutions often requires the completion of numerous computationally intensive simulations. Therefore, we propose a multifidelity optimization (MFO) algorithm designed to improve the computational efficiency of an optimization method for a wide range of applications. In developing the MFO algorithm, we take advantage of the interactions between multifidelity models to develop a dynamic, computation-time-saving optimization algorithm. First, a direct search method is applied to the high fidelity model over a reduced design space. In conjunction with this search, a specialized oracle is employed to map the design space of this high fidelity model to that of a computationally cheaper low fidelity model using space mapping techniques. Then, in the low fidelity space, an optimum is obtained using gradient- or non-gradient-based optimization, and it is mapped back to the high fidelity space. In this paper, we describe the theory and implementation details of our MFO algorithm. We also demonstrate our MFO method on some example problems and on two applications: earth penetrators and

  6. FPGA Compute Acceleration for High-Throughput Data Processing in High-Energy Physics Experiments

    CERN Multimedia

    CERN. Geneva

    2017-01-01

    The upgrades of the four large experiments of the LHC at CERN in the coming years will result in a huge increase in data bandwidth for each experiment, which needs to be processed very efficiently. For example, the LHCb experiment will upgrade its detector in 2019/2020 to a 'triggerless' readout scheme, in which all of the readout electronics and several sub-detector parts will be replaced. The new readout electronics will be able to read out the detector at 40 MHz. This increases the data bandwidth from the detector down to the event filter farm to 40 Tbit/s, which must be processed to select the interesting proton-proton collisions for later storage. Designing the architecture of a computing farm that can process this amount of data as efficiently as possible is a challenging task, and several compute accelerator technologies are being considered.    In the high performance computing sector more and more FPGA compute accelerators are being used to improve the compute performance and reduce the...

  7. Highly efficient computer algorithm for identifying layer thickness of atomically thin 2D materials

    Science.gov (United States)

    Lee, Jekwan; Cho, Seungwan; Park, Soohyun; Bae, Hyemin; Noh, Minji; Kim, Beom; In, Chihun; Yang, Seunghoon; Lee, Sooun; Seo, Seung Young; Kim, Jehyun; Lee, Chul-Ho; Shim, Woo-Young; Jo, Moon-Ho; Kim, Dohun; Choi, Hyunyong

    2018-03-01

    The fields of layered material research, such as transition-metal dichalcogenides (TMDs), have demonstrated that the optical, electrical and mechanical properties of these materials strongly depend on the layer number N. Thus, efficient and accurate determination of N is the most crucial step before the associated device fabrication. The most widely used experimental technique for identifying N relies on an optical microscope; its critical drawback is that it depends on extensive laboratory experience to estimate N and requires a very time-consuming image-searching task assisted by human eyes, together with secondary measurements such as atomic force microscopy and Raman spectroscopy to confirm N. In this work, we introduce a computer algorithm based on the image analysis of a quantized optical contrast. We show that our algorithm applies to a wide variety of layered materials, including graphene, MoS2, and WS2, regardless of the substrate. The algorithm largely consists of two parts. First, it sets up an appropriate boundary between the target flakes and the substrate. Second, to compute N, it automatically calculates the optical contrast using an adaptive RGB estimation process for each target, which results in a matrix with different integer Ns and returns a map of N onto the target flake positions. Using conventional desktop computational power, the time taken to display the final N matrix was 1.8 s on average for an image size of 1280 by 960 pixels, with an accuracy of 90% (six estimation errors among 62 samples) when compared to the other methods. To show the effectiveness of our algorithm, we also apply it to TMD flakes transferred onto optically transparent c-axis sapphire substrates and obtain a similarly high accuracy of 94% (two estimation errors among 34 samples).
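
    The quantized-contrast idea can be illustrated with a toy example. In the Python sketch below, a substrate colour is estimated from a user-supplied mask, a Michelson-like per-pixel contrast is computed against it, and the contrast is rounded to an integer layer count under an assumed per-layer contrast step; the channel averaging, the 0.06 step, and the crude segmentation are illustrative assumptions rather than the adaptive RGB estimation of the published algorithm.

        # Toy sketch of layer-number estimation from quantized optical contrast.
        # The per-layer contrast step and channel weighting are assumptions.
        import numpy as np

        def estimate_layers(rgb_image, substrate_mask, contrast_per_layer=0.06, max_layers=10):
            """rgb_image: float array (H, W, 3) in [0, 1]; substrate_mask: bool (H, W)."""
            substrate_rgb = rgb_image[substrate_mask].mean(axis=0)           # background colour
            # Michelson-like contrast per channel against the substrate.
            contrast = (substrate_rgb - rgb_image) / (substrate_rgb + rgb_image + 1e-9)
            scalar_contrast = contrast.mean(axis=2)                           # simple channel average
            layers = np.rint(scalar_contrast / contrast_per_layer).astype(int)
            return np.clip(layers, 0, max_layers)

        if __name__ == "__main__":
            img = np.full((64, 64, 3), 0.8)
            img[20:40, 20:40] = 0.6286               # contrast ~0.12 -> expect N ~ 2
            mask = img.mean(axis=2) > 0.75           # crude substrate segmentation
            print(np.unique(estimate_layers(img, mask)))   # expected: [0 2]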

  8. Energy efficiency of computer power supply units - Final report

    Energy Technology Data Exchange (ETDEWEB)

    Aebischer, B. [cepe - Centre for Energy Policy and Economics, Swiss Federal Institute of Technology Zuerich, Zuerich (Switzerland); Huser, H. [Encontrol GmbH, Niederrohrdorf (Switzerland)

    2002-11-15

    This final report for the Swiss Federal Office of Energy (SFOE) takes a look at the efficiency of computer power supply units, which decreases rapidly during average computer use. The background and the purpose of the project are examined. The power supplies for personal computers are discussed and the testing arrangement used is described. Efficiency, power-factor and operating points of the units are examined. Potentials for improvement and measures to be taken are discussed. Also, action to be taken by those involved in the design and operation of such power units is proposed. Finally, recommendations for further work are made.

  9. Efficient quantum computation in a network with probabilistic gates and logical encoding

    DEFF Research Database (Denmark)

    Borregaard, J.; Sørensen, A. S.; Cirac, J. I.

    2017-01-01

    An approach to efficient quantum computation with probabilistic gates is proposed and analyzed in both a local and nonlocal setting. It combines heralded gates previously studied for atom or atomlike qubits with logical encoding from linear optical quantum computation in order to perform high-fidelity quantum gates across a quantum network. The error-detecting properties of the heralded operations ensure high fidelity while the encoding makes it possible to correct for failed attempts such that deterministic and high-quality gates can be achieved. Importantly, this is robust to photon loss, which is typically the main obstacle to photonic-based quantum information processing. Overall this approach opens a path toward quantum networks with atomic nodes and photonic links.

  10. Efficient Smoothed Concomitant Lasso Estimation for High Dimensional Regression

    Science.gov (United States)

    Ndiaye, Eugene; Fercoq, Olivier; Gramfort, Alexandre; Leclère, Vincent; Salmon, Joseph

    2017-10-01

    In high dimensional settings, sparse structures are crucial for efficiency, in terms of memory, computation and performance. It is customary to consider an ℓ1 penalty to enforce sparsity in such scenarios. Sparsity-enforcing methods, the Lasso being a canonical example, are popular candidates to address high dimension. For efficiency, they rely on tuning a parameter that trades data fitting against sparsity. For the Lasso theory to hold, this tuning parameter should be proportional to the noise level, yet the latter is often unknown in practice. A possible remedy is to jointly optimize over the regression parameter as well as over the noise level. This has been considered under several names in the literature: Scaled Lasso, Square-root Lasso, or Concomitant Lasso estimation, for instance, and could be of interest for uncertainty quantification. In this work, after illustrating numerical difficulties with the Concomitant Lasso formulation, we propose a modification, coined the Smoothed Concomitant Lasso, aimed at increasing numerical stability. We propose an efficient and accurate solver whose computational cost is no higher than that of the Lasso. We leverage standard ingredients behind the success of fast Lasso solvers: a coordinate descent algorithm, combined with safe screening rules that eliminate irrelevant features early to achieve speed efficiency.
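
    To make the joint optimization concrete, the Python sketch below alternates between a closed-form noise-level update and a standard scikit-learn Lasso fit for the smoothed concomitant objective min over beta and sigma >= sigma0 of ||y - X beta||^2 / (2 n sigma) + sigma / 2 + lambda ||beta||_1. This naive alternating scheme only illustrates the objective; the paper's dedicated coordinate-descent solver with safe screening rules is not reproduced, and the regularization values are arbitrary.

        # Naive alternating-minimization sketch of the smoothed concomitant Lasso,
        # reusing scikit-learn's Lasso for the beta step.
        import numpy as np
        from sklearn.linear_model import Lasso

        def smoothed_concomitant_lasso(X, y, lam=0.1, sigma0=1e-2, n_iter=20):
            n = X.shape[0]
            sigma = max(sigma0, np.std(y))
            beta = np.zeros(X.shape[1])
            for _ in range(n_iter):
                # beta-step: with sigma fixed, this is a standard Lasso with penalty lam * sigma.
                lasso = Lasso(alpha=lam * sigma, fit_intercept=False, max_iter=5000)
                beta = lasso.fit(X, y).coef_
                # sigma-step: closed form, clipped below by sigma0 (the "smoothing").
                sigma = max(sigma0, np.linalg.norm(y - X @ beta) / np.sqrt(n))
            return beta, sigma

        if __name__ == "__main__":
            rng = np.random.default_rng(0)
            X = rng.standard_normal((200, 50))
            true_beta = np.zeros(50); true_beta[:3] = [2.0, -1.5, 1.0]
            y = X @ true_beta + 0.5 * rng.standard_normal(200)
            beta, sigma = smoothed_concomitant_lasso(X, y, lam=0.1)
            print(np.round(beta[:5], 2), round(sigma, 2))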

  11. Efficient computation of smoothing splines via adaptive basis sampling

    KAUST Repository

    Ma, Ping

    2015-06-24

    © 2015 Biometrika Trust. Smoothing splines provide flexible nonparametric regression estimators. However, the high computational cost of smoothing splines for large datasets has hindered their wide application. In this article, we develop a new method, named adaptive basis sampling, for efficient computation of smoothing splines in super-large samples. Except for the univariate case where the Reinsch algorithm is applicable, a smoothing spline for a regression problem with sample size n can be expressed as a linear combination of n basis functions and its computational complexity is generally O(n^3). We achieve a more scalable computation in the multivariate case by evaluating the smoothing spline using a smaller set of basis functions, obtained by an adaptive sampling scheme that uses values of the response variable. Our asymptotic analysis shows that smoothing splines computed via adaptive basis sampling converge to the true function at the same rate as full basis smoothing splines. Using simulation studies and a large-scale deep earth core-mantle boundary imaging study, we show that the proposed method outperforms a sampling method that does not use the values of response variables.

  12. Efficient computation of smoothing splines via adaptive basis sampling

    KAUST Repository

    Ma, Ping; Huang, Jianhua Z.; Zhang, Nan

    2015-01-01

    © 2015 Biometrika Trust. Smoothing splines provide flexible nonparametric regression estimators. However, the high computational cost of smoothing splines for large datasets has hindered their wide application. In this article, we develop a new method, named adaptive basis sampling, for efficient computation of smoothing splines in super-large samples. Except for the univariate case where the Reinsch algorithm is applicable, a smoothing spline for a regression problem with sample size n can be expressed as a linear combination of n basis functions and its computational complexity is generally O(n^3). We achieve a more scalable computation in the multivariate case by evaluating the smoothing spline using a smaller set of basis functions, obtained by an adaptive sampling scheme that uses values of the response variable. Our asymptotic analysis shows that smoothing splines computed via adaptive basis sampling converge to the true function at the same rate as full basis smoothing splines. Using simulation studies and a large-scale deep earth core-mantle boundary imaging study, we show that the proposed method outperforms a sampling method that does not use the values of response variables.

  13. Experimental high energy physics and modern computer architectures

    International Nuclear Information System (INIS)

    Hoek, J.

    1988-06-01

    The paper examines how experimental High Energy Physics can use modern computer architectures efficiently. In this connection parallel and vector architectures are investigated, and the types available at the moment for general use are discussed. A separate section briefly describes some architectures that are either a combination of both, or exemplify other architectures. In an appendix some directions in which computing seems to be developing in the USA are mentioned. (author)

  14. Computer Architecture Techniques for Power-Efficiency

    CERN Document Server

    Kaxiras, Stefanos

    2008-01-01

    In the last few years, power dissipation has become an important design constraint, on par with performance, in the design of new computer systems. Whereas in the past, the primary job of the computer architect was to translate improvements in operating frequency and transistor count into performance, now power efficiency must be taken into account at every step of the design process. While for some time, architects have been successful in delivering 40% to 50% annual improvement in processor performance, costs that were previously brushed aside eventually caught up. The most critical of these

  15. Noise-free high-efficiency photon-number-resolving detectors

    International Nuclear Information System (INIS)

    Rosenberg, Danna; Lita, Adriana E.; Miller, Aaron J.; Nam, Sae Woo

    2005-01-01

    High-efficiency optical detectors that can determine the number of photons in a pulse of monochromatic light have applications in a variety of physics studies, including post-selection-based entanglement protocols for linear optics quantum computing and experiments that simultaneously close the detection and communication loopholes of Bell's inequalities. Here we report on our demonstration of fiber-coupled, noise-free, photon-number-resolving transition-edge sensors with 88% efficiency at 1550 nm. The efficiency of these sensors could be made even higher at any wavelength in the visible and near-infrared spectrum without resulting in a higher dark-count rate or degraded photon-number resolution

  16. In silico evaluation of highly efficient organic light-emitting materials

    Science.gov (United States)

    Kwak, H. Shaun; Giesen, David J.; Hughes, Thomas F.; Goldberg, Alexander; Cao, Yixiang; Gavartin, Jacob; Dixon, Steve; Halls, Mathew D.

    2016-09-01

    Design and development of highly efficient organic and organometallic dopants is one of the central challenges in organic light-emitting diode (OLED) technology. Recent advances in computational materials science have made it possible to apply a computer-aided evaluation and screening framework directly to the OLED design space. In this work, we showcase two major components of the latest in silico framework for the development of organometallic phosphorescent dopants: (1) rapid screening of dopants by machine-learned quantum mechanical models and (2) phosphorescence lifetime predictions with spin-orbit coupled calculations (SOC-TDDFT). The combination of virtual screening and evaluation significantly widens the design space for highly efficient phosphorescent dopants, with unbiased measures to evaluate the performance of the materials from first principles.

  17. Usage of super high speed computer for clarification of complex phenomena

    International Nuclear Information System (INIS)

    Sekiguchi, Tomotsugu; Sato, Mitsuhisa; Nakata, Hideki; Tatebe, Osami; Takagi, Hiromitsu

    1999-01-01

    This study aims to construct an efficient application environment for super high speed computers that supports parallel distributed systems and can easily be ported to different computer systems and scales, by conducting research and development on the super high speed computing technology required for the elucidation of complicated phenomena in the nuclear power field by computational science. In order to realize such an environment, the Electrotechnical Laboratory has developed Ninf, a network numerical information library. The Ninf system can supply a global network infrastructure for high-performance worldwide computing over a wide-range distributed network. (G.K.)

  18. A Computational Framework for Efficient Low Temperature Plasma Simulations

    Science.gov (United States)

    Verma, Abhishek Kumar; Venkattraman, Ayyaswamy

    2016-10-01

    Over the past years, scientific computing has emerged as an essential tool for the investigation and prediction of low temperature plasma (LTP) applications, which include electronics, nanomaterial synthesis, metamaterials, etc. To further explore LTP behavior with greater fidelity, we present a computational toolbox developed to perform LTP simulations. This framework will allow us to enhance our understanding of multiscale plasma phenomena using high performance computing tools, mainly based on the OpenFOAM FVM distribution. Although aimed at microplasma simulations, the modular framework is able to perform multiscale, multiphysics simulations of physical systems involving LTPs. Salient introductory features include the capability to perform parallel, 3D simulations of LTP applications on unstructured meshes. Performance of the solver is tested with numerical results assessing the accuracy and efficiency of benchmark problems in microdischarge devices. Numerical simulation of a microplasma reactor at atmospheric pressure with hemispherical dielectric-coated electrodes will be discussed, providing an overview of the applicability and future scope of this framework.

  19. Efficient high-precision matrix algebra on parallel architectures for nonlinear combinatorial optimization

    KAUST Repository

    Gunnels, John; Lee, Jon; Margulies, Susan

    2010-01-01

    We provide a first demonstration of the idea that matrix-based algorithms for nonlinear combinatorial optimization problems can be efficiently implemented. Such algorithms were mainly conceived by theoretical computer scientists for proving efficiency. We are able to demonstrate the practicality of our approach by developing an implementation on a massively parallel architecture, and exploiting scalable and efficient parallel implementations of algorithms for ultra high-precision linear algebra. Additionally, we have delineated and implemented the necessary algorithmic and coding changes required in order to address problems several orders of magnitude larger, dealing with the limits of scalability from memory footprint, computational efficiency, reliability, and interconnect perspectives. © Springer and Mathematical Programming Society 2010.

  20. Efficient high-precision matrix algebra on parallel architectures for nonlinear combinatorial optimization

    KAUST Repository

    Gunnels, John

    2010-06-01

    We provide a first demonstration of the idea that matrix-based algorithms for nonlinear combinatorial optimization problems can be efficiently implemented. Such algorithms were mainly conceived by theoretical computer scientists for proving efficiency. We are able to demonstrate the practicality of our approach by developing an implementation on a massively parallel architecture, and exploiting scalable and efficient parallel implementations of algorithms for ultra high-precision linear algebra. Additionally, we have delineated and implemented the necessary algorithmic and coding changes required in order to address problems several orders of magnitude larger, dealing with the limits of scalability from memory footprint, computational efficiency, reliability, and interconnect perspectives. © Springer and Mathematical Programming Society 2010.

  1. Efficiency using computer simulation of Reverse Threshold Model Theory on assessing a “One Laptop Per Child” computer versus desktop computer

    Directory of Open Access Journals (Sweden)

    Supat Faarungsang

    2017-04-01

    Full Text Available The Reverse Threshold Model Theory (RTMT) model was introduced based on limiting factor concepts, but its efficiency compared to the Conventional Model (CM) has not been published. This investigation assessed the efficiency of RTMT compared to CM using computer simulation on the "One Laptop Per Child" computer and a desktop computer. Based on probability values, it was found that RTMT was more efficient than CM among eight treatment combinations, and an earlier study verified that RTMT gives complete elimination of random error. Furthermore, RTMT has several advantages over CM and is therefore proposed to be applied to most research data.

  2. Efficient MATLAB computations with sparse and factored tensors.

    Energy Technology Data Exchange (ETDEWEB)

    Bader, Brett William; Kolda, Tamara Gibson (Sandia National Lab, Livermore, CA)

    2006-12-01

    In this paper, the term tensor refers simply to a multidimensional or N-way array, and we consider how specially structured tensors allow for efficient storage and computation. First, we study sparse tensors, which have the property that the vast majority of the elements are zero. We propose storing sparse tensors using coordinate format and describe the computational efficiency of this scheme for various mathematical operations, including those typical to tensor decomposition algorithms. Second, we study factored tensors, which have the property that they can be assembled from more basic components. We consider two specific types: a Tucker tensor can be expressed as the product of a core tensor (which itself may be dense, sparse, or factored) and a matrix along each mode, and a Kruskal tensor can be expressed as the sum of rank-1 tensors. We are interested in the case where the storage of the components is less than the storage of the full tensor, and we demonstrate that many elementary operations can be computed using only the components. All of the efficiencies described in this paper are implemented in the Tensor Toolbox for MATLAB.
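
    The coordinate storage scheme described here is easy to mimic outside MATLAB. The Python sketch below stores a sparse tensor as an index array plus a value array and implements a single mode-n tensor-times-vector product; it is a minimal analogue of the idea, not the Tensor Toolbox API, and the class and method names are invented for illustration.

        # Minimal coordinate-format (COO) sparse tensor with a mode-n tensor-times-vector
        # product, echoing the storage scheme described in the record.
        import numpy as np

        class SparseTensor:
            def __init__(self, coords, values, shape):
                self.coords = np.asarray(coords, dtype=int)    # (nnz, ndim) indices
                self.values = np.asarray(values, dtype=float)  # (nnz,) nonzero entries
                self.shape = tuple(shape)

            def ttv(self, vector, mode):
                """Contract the tensor with a vector along `mode`; the result is returned
                as a dense array of order ndim-1 for simplicity in this sketch."""
                out_shape = self.shape[:mode] + self.shape[mode + 1:]
                out = np.zeros(out_shape)
                weights = self.values * vector[self.coords[:, mode]]
                rest = np.delete(self.coords, mode, axis=1)
                np.add.at(out, tuple(rest.T), weights)
                return out

        if __name__ == "__main__":
            # 3x4x2 tensor with three nonzero entries.
            t = SparseTensor(coords=[[0, 1, 0], [2, 3, 1], [0, 0, 1]],
                             values=[1.0, 2.0, -0.5], shape=(3, 4, 2))
            print(t.ttv(np.ones(4), mode=1))   # sums over the second mode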

  3. The Effect of Computer Automation on Institutional Review Board (IRB) Office Efficiency

    Science.gov (United States)

    Oder, Karl; Pittman, Stephanie

    2015-01-01

    Companies purchase computer systems to make their processes more efficient through automation. Some academic medical centers (AMC) have purchased computer systems for their institutional review boards (IRB) to increase efficiency and compliance with regulations. IRB computer systems are expensive to purchase, deploy, and maintain. An AMC should…

  4. Computationally Efficient Clustering of Audio-Visual Meeting Data

    Science.gov (United States)

    Hung, Hayley; Friedland, Gerald; Yeo, Chuohao

    This chapter presents novel computationally efficient algorithms to extract semantically meaningful acoustic and visual events related to each of the participants in a group discussion using the example of business meeting recordings. The recording setup involves relatively few audio-visual sensors, comprising a limited number of cameras and microphones. We first demonstrate computationally efficient algorithms that can identify who spoke and when, a problem in speech processing known as speaker diarization. We also extract visual activity features efficiently from MPEG4 video by taking advantage of the processing that was already done for video compression. Then, we present a method of associating the audio-visual data together so that the content of each participant can be managed individually. The methods presented in this article can be used as a principal component that enables many higher-level semantic analysis tasks needed in search, retrieval, and navigation.

  5. Automated Development of Accurate Algorithms and Efficient Codes for Computational Aeroacoustics

    Science.gov (United States)

    Goodrich, John W.; Dyson, Rodger W.

    1999-01-01

    The simulation of sound generation and propagation in three space dimensions with realistic aircraft components is a very large time-dependent computation with fine details. Simulations in open domains with embedded objects require accurate and robust algorithms for propagation, for artificial inflow and outflow boundaries, and for the definition of geometrically complex objects. The development, implementation, and validation of methods for solving these demanding problems are being carried out to support the NASA pillar goals for reducing aircraft noise levels. Our goal is to provide algorithms which are sufficiently accurate and efficient to produce usable results rapidly enough to allow design engineers to study the effects on sound levels of design changes in propulsion systems, and in the integration of propulsion systems with airframes. There is a lack of design tools for these purposes at this time. Our technical approach to this problem combines the development of new algorithms with the use of Mathematica and Unix utilities to automate the algorithm development, code implementation, and validation. We use explicit methods to ensure effective implementation by domain decomposition for SPMD parallel computing. There are several orders of magnitude difference in the computational efficiencies of the algorithms which we have considered. We currently have new artificial inflow and outflow boundary conditions that are stable, accurate, and unobtrusive, with implementations that match the accuracy and efficiency of the propagation methods. The artificial numerical boundary treatments have been proven to have solutions which converge to the full open domain problems, so that the error from the boundary treatments can be driven as low as is required. The purpose of this paper is to briefly present a method for developing highly accurate algorithms for computational aeroacoustics, the use of computer automation in this process, and a brief survey of the algorithms that

  6. Acceleration of FDTD mode solver by high-performance computing techniques.

    Science.gov (United States)

    Han, Lin; Xi, Yanping; Huang, Wei-Ping

    2010-06-21

    A two-dimensional (2D) compact finite-difference time-domain (FDTD) mode solver is developed based on a wave equation formalism in combination with the matrix pencil method (MPM). The method is validated for the calculation of both real guided and complex leaky modes of typical optical waveguides against a benchmark finite-difference (FD) eigenmode solver. By taking advantage of the inherent parallel nature of the FDTD algorithm, the mode solver is implemented on graphics processing units (GPUs) using the compute unified device architecture (CUDA). It is demonstrated that the high-performance computing technique leads to significant acceleration of the FDTD mode solver, with more than a 30-fold improvement in computational efficiency in comparison with the conventional FDTD mode solver running on the CPU of a standard desktop computer. The computational efficiency of the accelerated FDTD method is of the same order of magnitude as that of the standard finite-difference eigenmode solver, yet it requires much less memory (e.g., less than 10%). Therefore, the new method may serve as an efficient, accurate and robust tool for mode calculation of optical waveguides even when conventional eigenvalue mode solvers are no longer applicable due to memory limitations.
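
    The FDTD kernel that makes such GPU acceleration attractive is a simple, regular stencil. The Python/NumPy sketch below steps a generic 2D TMz Yee-grid update with a soft Gaussian source in normalized units; it shows only the kind of update loop being accelerated, not the compact mode solver or the matrix-pencil post-processing of the record, and all grid and source parameters are arbitrary.

        # Generic 2D TMz Yee-grid FDTD update kernel (vacuum, normalized units).
        import numpy as np

        def fdtd_2d(nx=200, ny=200, n_steps=300, courant=0.5):
            ez = np.zeros((nx, ny))
            hx = np.zeros((nx, ny - 1))
            hy = np.zeros((nx - 1, ny))
            for t in range(n_steps):
                # Update magnetic fields from the curl of Ez.
                hx -= courant * (ez[:, 1:] - ez[:, :-1])
                hy += courant * (ez[1:, :] - ez[:-1, :])
                # Update Ez in the interior from the curl of H.
                ez[1:-1, 1:-1] += courant * ((hy[1:, 1:-1] - hy[:-1, 1:-1])
                                             - (hx[1:-1, 1:] - hx[1:-1, :-1]))
                # Soft Gaussian point source in the middle of the grid.
                ez[nx // 2, ny // 2] += np.exp(-0.5 * ((t - 30) / 8.0) ** 2)
            return ez

        if __name__ == "__main__":
            field = fdtd_2d()
            print(field.shape, float(np.abs(field).max()))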

  7. A computationally efficient fuzzy control s

    Directory of Open Access Journals (Sweden)

    Abdel Badie Sharkawy

    2013-12-01

    Full Text Available This paper develops a decentralized fuzzy control scheme for MIMO nonlinear second order systems, with application to robot manipulators, via a combination of genetic algorithms (GAs) and fuzzy systems. The controller for each degree of freedom (DOF) consists of a feedforward fuzzy torque-computing system and a feedback fuzzy PD system. The feedforward fuzzy system is trained and optimized off-line using GAs, whereby not only the parameters but also the structure of the fuzzy system are optimized. The feedback fuzzy PD system, on the other hand, is used to keep the closed loop stable. The rule base consists of only four rules per DOF. Furthermore, the fuzzy feedback system is decentralized and simplified, leading to a computationally efficient control scheme. The proposed control scheme has the following advantages: (1) it needs no exact dynamics of the system, and the computation is time-saving because of the simple structure of the fuzzy systems; and (2) the controller is robust against various parameter and payload uncertainties. The computational complexity of the proposed control scheme has been analyzed and compared with previous works. Computer simulations show that this controller is effective in achieving the control goals.

  8. Monitoring SLAC High Performance UNIX Computing Systems

    International Nuclear Information System (INIS)

    Lettsome, Annette K.

    2005-01-01

    Knowledge of the effectiveness and efficiency of computers is important when working with high performance systems. Monitoring such systems is advantageous in order to foresee possible problems or system failures. Ganglia is a software system designed for high performance computing systems to retrieve specific monitoring information. An alternative storage facility for Ganglia's collected data is needed, since its default storage system, the round-robin database (RRD), struggles with data integrity. The creation of a script-driven MySQL database solves this dilemma. This paper describes the process followed in the creation and implementation of the MySQL database for use by Ganglia. Comparisons between data storage by both databases are made using gnuplot and Ganglia's real-time graphical user interface.

  9. Computationally efficient SVM multi-class image recognition with confidence measures

    International Nuclear Information System (INIS)

    Makili, Lazaro; Vega, Jesus; Dormido-Canto, Sebastian; Pastor, Ignacio; Murari, Andrea

    2011-01-01

    Typically, machine learning methods produce non-qualified estimates, i.e. the accuracy and reliability of the predictions are not provided. Transductive predictors are very recent classifiers able to provide, simultaneously with the prediction, a pair of values (confidence and credibility) that reflect the quality of the prediction. Usually, a drawback of transductive techniques for huge datasets and large dimensionality is the high computation time. To overcome this issue, a more efficient classifier has been used in a multi-class image classification problem in the TJ-II stellarator database. It is based on the creation of a hash function to generate several 'one versus the rest' classifiers for every class. Using Support Vector Machines as the underlying classifier, a comparison between the pure transductive approach and the new method has been performed. In both cases, the success rates are high, and the computation time of the new method is as low as 0.4 times that of the old one.
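
    A plain 'one versus the rest' SVM baseline is easy to set up with scikit-learn, as in the sketch below, where the gap between the two largest per-class margins is used as a crude confidence score. This is only a baseline illustration on a standard digits dataset: the record's hash-generated classifiers and the conformal confidence/credibility values are not reproduced.

        # Plain one-vs-rest SVM with margin-based confidence scores; the conformal
        # (transductive) machinery of the record is not reproduced here.
        import numpy as np
        from sklearn.datasets import load_digits
        from sklearn.model_selection import train_test_split
        from sklearn.multiclass import OneVsRestClassifier
        from sklearn.svm import LinearSVC

        X, y = load_digits(return_X_y=True)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

        clf = OneVsRestClassifier(LinearSVC(C=1.0, max_iter=5000)).fit(X_tr, y_tr)
        scores = clf.decision_function(X_te)               # one margin per class
        pred = scores.argmax(axis=1)
        # Crude confidence: gap between the best and second-best margins.
        sorted_scores = np.sort(scores, axis=1)
        confidence = sorted_scores[:, -1] - sorted_scores[:, -2]
        print("accuracy:", (pred == y_te).mean(), "median margin gap:", np.median(confidence))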

  10. Introduction to massively-parallel computing in high-energy physics

    CERN Document Server

    AUTHOR|(CDS)2083520

    1993-01-01

    Ever since computers were first used for scientific and numerical work, there has existed an "arms race" between the technical development of faster computing hardware and the desire of scientists to solve larger problems in shorter time-scales. However, the vast leaps in processor performance achieved through advances in semiconductor science have reached a hiatus as the technology comes up against the physical limits of the speed of light and quantum effects. This has led all high performance computer manufacturers to turn towards a parallel architecture for their new machines. In these lectures we will introduce the history and concepts behind parallel computing, and review the various parallel architectures and software environments currently available. We will then introduce programming methodologies that allow efficient exploitation of parallel machines, and present case studies of the parallelization of typical High Energy Physics codes for the two main classes of parallel computing architecture (S...

  11. Micromagnetics on high-performance workstation and mobile computational platforms

    Science.gov (United States)

    Fu, S.; Chang, R.; Couture, S.; Menarini, M.; Escobar, M. A.; Kuteifan, M.; Lubarda, M.; Gabay, D.; Lomakin, V.

    2015-05-01

    The feasibility of using high-performance desktop and embedded mobile computational platforms is presented, including multi-core Intel central processing units, Nvidia desktop graphics processing units, and the Nvidia Jetson TK1 platform. The FastMag finite-element-based micromagnetic simulator is used as a testbed, showing high efficiency on all the platforms. Optimization aspects of improving the performance of the mobile systems are discussed. The high performance, low cost, low power consumption, and rapid performance increase of embedded mobile systems make them a promising candidate for micromagnetic simulations. Such architectures can be used as standalone systems or combined into low-power computing clusters.

  12. Game-Theoretic Rate-Distortion-Complexity Optimization of High Efficiency Video Coding

    DEFF Research Database (Denmark)

    Ukhanova, Ann; Milani, Simone; Forchhammer, Søren

    2013-01-01

    This paper presents an algorithm for rate-distortion-complexity optimization for the emerging High Efficiency Video Coding (HEVC) standard, whose high computational requirements urge the need for low-complexity optimization algorithms. Optimization approaches need to specify different complexity profiles in order to tailor the computational load to the different hardware and power-supply resources of devices. In this work, we focus on optimizing the quantization parameter and partition depth in HEVC via a game-theoretic approach. The proposed rate control strategy alone provides a 0.2 dB improvement...

  13. High-efficiency wavefunction updates for large scale Quantum Monte Carlo

    Science.gov (United States)

    Kent, Paul; McDaniel, Tyler; Li, Ying Wai; D'Azevedo, Ed

    Within ab initio Quantum Monte Carlo (QMC) simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunctions. The evaluation of each Monte Carlo move requires finding the determinant of a dense matrix, which is traditionally evaluated iteratively using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. For calculations with thousands of electrons, this operation dominates the execution profile. We propose a novel rank-k delayed update scheme. This strategy enables probability evaluation for multiple successive Monte Carlo moves, with the application of accepted moves to the matrices delayed until after a predetermined number of moves, k. Accepted events grouped in this manner are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency. This procedure does not change the underlying Monte Carlo sampling or the sampling efficiency. For large systems and algorithms such as diffusion Monte Carlo, where the acceptance ratio is high, order-of-magnitude speedups can be obtained on both multi-core CPUs and GPUs, making this algorithm highly advantageous for current petascale and future exascale computations.
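
    The linear-algebra core of the delayed-update idea can be checked numerically in a few lines: applying k rank-1 Sherman-Morrison updates one at a time must agree with a single blocked rank-k Woodbury update. The Python sketch below verifies this on a random, well-conditioned matrix; the QMC-specific determinant-ratio bookkeeping and the blocked implementation of the paper are not shown, and the matrix sizes are arbitrary.

        # Numerical check that k rank-1 (Sherman-Morrison) updates applied one by one
        # agree with a single blocked rank-k (Woodbury) update.
        import numpy as np

        def sherman_morrison(Ainv, u, v):
            # (A + u v^T)^{-1} = A^{-1} - (A^{-1} u)(v^T A^{-1}) / (1 + v^T A^{-1} u)
            Au = Ainv @ u
            vA = v @ Ainv
            return Ainv - np.outer(Au, vA) / (1.0 + v @ Au)

        def woodbury(Ainv, U, V):
            # (A + U V^T)^{-1} = A^{-1} - A^{-1} U (I + V^T A^{-1} U)^{-1} V^T A^{-1}
            AU = Ainv @ U
            VA = V.T @ Ainv
            k = U.shape[1]
            return Ainv - AU @ np.linalg.solve(np.eye(k) + V.T @ AU, VA)

        rng = np.random.default_rng(0)
        n, k = 200, 8
        A = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned matrix
        U = rng.standard_normal((n, k))
        V = rng.standard_normal((n, k))

        Ainv = np.linalg.inv(A)
        seq = Ainv.copy()
        for j in range(k):                                 # k sequential rank-1 updates
            seq = sherman_morrison(seq, U[:, j], V[:, j])
        blocked = woodbury(Ainv, U, V)                     # one delayed rank-k update
        print(np.max(np.abs(seq - blocked)))               # should be near machine precision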

  14. Efficient Computation of Transition State Resonances and Reaction Rates from a Quantum Normal Form

    NARCIS (Netherlands)

    Schubert, Roman; Waalkens, Holger; Wiggins, Stephen

    2006-01-01

    A quantum version of a recent formulation of transition state theory in phase space is presented. The theory developed provides an algorithm to compute quantum reaction rates and the associated Gamov-Siegert resonances with very high accuracy. The algorithm is especially efficient for

  15. On efficiently computing multigroup multi-layer neutron reflection and transmission conditions

    International Nuclear Information System (INIS)

    Abreu, Marcos P. de

    2007-01-01

    In this article, we present an algorithm for efficient computation of multigroup discrete ordinates neutron reflection and transmission conditions, which replace a multi-layered boundary region in neutron multiplication eigenvalue computations with no spatial truncation error. In contrast to the independent layer-by-layer algorithm considered thus far in our computations, the algorithm here is based on an inductive approach developed by the present author for deriving neutron reflection and transmission conditions for a nonactive boundary region with an arbitrary number of arbitrarily thick layers. With this new algorithm, we were able to increase significantly the computational efficiency of our spectral diamond-spectral Green's function method for solving multigroup neutron multiplication eigenvalue problems with multi-layered boundary regions. We provide comparative results for a two-group reactor core model to illustrate the increased efficiency of our spectral method, and we conclude this article with a number of general remarks. (author)

  16. Efficient Minimum-Phase Prefilter Computation Using Fast QL-Factorization

    DEFF Research Database (Denmark)

    Hansen, Morten; Christensen, Lars P.B.

    2009-01-01

    This paper presents a novel approach for computing both the minimum-phase filter and the associated all-pass filter in a computationally efficient way using the fast QL-factorization. A desirable property of this approach is that the complexity is independent of the size of the matrix which is QL...

  17. Energy Efficiency in Computing (1/2)

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    As manufacturers improve the silicon process, truly low energy computing is becoming a reality - both in servers and in the consumer space. This series of lectures covers a broad spectrum of aspects related to energy efficient computing - from circuits to datacentres. We will discuss common trade-offs and basic components, such as processors, memory and accelerators. We will also touch on the fundamentals of modern datacenter design and operation. Lecturer's short bio: Andrzej Nowak has 10 years of experience in computing technologies, primarily from CERN openlab and Intel. At CERN, he managed a research lab collaborating with Intel and was part of the openlab Chief Technology Office. Andrzej also worked closely and initiated projects with the private sector (e.g. HP and Google), as well as international research institutes, such as EPFL. Currently, Andrzej acts as a consultant on technology and innovation with TIK Services (http://tik.services), and runs a peer-to-peer lending start-up.

  18. Positive Wigner functions render classical simulation of quantum computation efficient.

    Science.gov (United States)

    Mari, A; Eisert, J

    2012-12-07

    We show that quantum circuits where the initial state and all the following quantum operations can be represented by positive Wigner functions can be classically efficiently simulated. This is true both for continuous-variable as well as discrete variable systems in odd prime dimensions, two cases which will be treated on entirely the same footing. Noting the fact that Clifford and Gaussian operations preserve the positivity of the Wigner function, our result generalizes the Gottesman-Knill theorem. Our algorithm provides a way of sampling from the output distribution of a computation or a simulation, including the efficient sampling from an approximate output distribution in the case of sampling imperfections for initial states, gates, or measurements. In this sense, this work highlights the role of the positive Wigner function as separating classically efficiently simulable systems from those that are potentially universal for quantum computing and simulation, and it emphasizes the role of negativity of the Wigner function as a computational resource.

  19. High-performance computing in accelerating structure design and analysis

    International Nuclear Information System (INIS)

    Li Zenghai; Folwell, Nathan; Ge Lixin; Guetz, Adam; Ivanov, Valentin; Kowalski, Marc; Lee, Lie-Quan; Ng, Cho-Kuen; Schussman, Greg; Stingelin, Lukas; Uplenchwar, Ravindra; Wolf, Michael; Xiao, Liling; Ko, Kwok

    2006-01-01

    Future high-energy accelerators such as the Next Linear Collider (NLC) will accelerate multi-bunch beams of high current and low emittance to obtain high luminosity, which put stringent requirements on the accelerating structures for efficiency and beam stability. While numerical modeling has been quite standard in accelerator R and D, designing the NLC accelerating structure required a new simulation capability because of the geometric complexity and level of accuracy involved. Under the US DOE Advanced Computing initiatives (first the Grand Challenge and now SciDAC), SLAC has developed a suite of electromagnetic codes based on unstructured grids and utilizing high-performance computing to provide an advanced tool for modeling structures at accuracies and scales previously not possible. This paper will discuss the code development and computational science research (e.g. domain decomposition, scalable eigensolvers, adaptive mesh refinement) that have enabled the large-scale simulations needed for meeting the computational challenges posed by the NLC as well as projects such as the PEP-II and RIA. Numerical results will be presented to show how high-performance computing has made a qualitative improvement in accelerator structure modeling for these accelerators, either at the component level (single cell optimization), or on the scale of an entire structure (beam heating and long-range wakefields)

  20. A strategy for improved computational efficiency of the method of anchored distributions

    Science.gov (United States)

    Over, Matthew William; Yang, Yarong; Chen, Xingyuan; Rubin, Yoram

    2013-06-01

    This paper proposes a strategy for improving the computational efficiency of model inversion using the method of anchored distributions (MAD) by "bundling" similar model parametrizations in the likelihood function. Inferring the likelihood function typically requires a large number of forward model (FM) simulations for each possible model parametrization; as a result, the process is quite expensive. To ease this prohibitive cost, we present an approximation for the likelihood function called bundling that relaxes the requirement for high quantities of FM simulations. This approximation redefines the conditional statement of the likelihood function as the probability of a set of similar model parametrizations "bundle" replicating field measurements, which we show is neither a model reduction nor a sampling approach to improving the computational efficiency of model inversion. To evaluate the effectiveness of these modifications, we compare the quality of predictions and computational cost of bundling relative to a baseline MAD inversion of 3-D flow and transport model parameters. Additionally, to aid understanding of the implementation we provide a tutorial for bundling in the form of a sample data set and script for the R statistical computing language. For our synthetic experiment, bundling achieved a 35% reduction in overall computational cost and had a limited negative impact on predicted probability distributions of the model parameters. Strategies for minimizing error in the bundling approximation, for enforcing similarity among the sets of model parametrizations, and for identifying convergence of the likelihood function are also presented.

  1. The thermodynamic efficiency of computations made in cells across the range of life

    Science.gov (United States)

    Kempes, Christopher P.; Wolpert, David; Cohen, Zachary; Pérez-Mercader, Juan

    2017-11-01

    Biological organisms must perform computation as they grow, reproduce and evolve. Moreover, ever since Landauer's bound was proposed, it has been known that all computation has some thermodynamic cost, and that the same computation can be achieved with greater or smaller thermodynamic cost depending on how it is implemented. Accordingly, an important issue concerning the evolution of life is assessing the thermodynamic efficiency of the computations performed by organisms. This issue is interesting both from the perspective of how close life has come to maximally efficient computation (presumably under the pressure of natural selection), and from the practical perspective of what efficiencies we might hope that engineered biological computers might achieve, especially in comparison with current computational systems. Here we show that the computational efficiency of translation, defined as free energy expended per amino acid operation, outperforms the best supercomputers by several orders of magnitude, and is only about an order of magnitude worse than the Landauer bound. However, this efficiency depends strongly on the size and architecture of the cell in question. In particular, we show that the useful efficiency of an amino acid operation, defined as the bulk energy per amino acid polymerization, decreases for increasing bacterial size and converges to the polymerization cost of the ribosome. This cost of the largest bacteria does not change in cells as we progress through the major evolutionary shifts to both single- and multicellular eukaryotes. However, the rates of total computation per unit mass are non-monotonic in bacteria with increasing cell size, and also change across different biological architectures, including the shift from unicellular to multicellular eukaryotes. This article is part of the themed issue 'Reconceptualizing the origins of life'.
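
    For orientation, the Landauer bound referenced here sets the minimum free energy cost of an irreversible one-bit operation at temperature T; at roughly physiological temperature (T ≈ 300 K, an assumed value for this illustration) it evaluates to

    $$E_{\min} = k_B T \ln 2 \approx (1.38 \times 10^{-23}\,\mathrm{J\,K^{-1}})(300\,\mathrm{K})(0.693) \approx 2.9 \times 10^{-21}\,\mathrm{J} \approx 0.018\,\mathrm{eV},$$

    so a process described as "an order of magnitude worse than the Landauer bound" dissipates on the order of $10^{-20}$ J per elementary operation.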

  2. A Power Efficient Exaflop Computer Design for Global Cloud System Resolving Climate Models.

    Science.gov (United States)

    Wehner, M. F.; Oliker, L.; Shalf, J.

    2008-12-01

    Exascale computers would allow routine ensemble modeling of the global climate system at the cloud system resolving scale. Power and cost requirements of traditional architecture systems are likely to delay such capability for many years. We present an alternative route to the exascale using embedded processor technology to design a system optimized for ultra high resolution climate modeling. These power efficient processors, used in consumer electronic devices such as mobile phones, portable music players, cameras, etc., can be tailored to the specific needs of scientific computing. We project that a system capable of integrating a kilometer scale climate model a thousand times faster than real time could be designed and built in a five year time scale for US$75M with a power consumption of 3MW. This is cheaper, more power efficient and sooner than any other existing technology.

  3. Wireless-Uplinks-Based Energy-Efficient Scheduling in Mobile Cloud Computing

    OpenAIRE

    Xing Liu; Chaowei Yuan; Zhen Yang; Enda Peng

    2015-01-01

    Mobile cloud computing (MCC) combines cloud computing and mobile internet to improve the computational capabilities of resource-constrained mobile devices (MDs). In MCC, mobile users could not only improve the computational capability of MDs but also save operation consumption by offloading the mobile applications to the cloud. However, MCC faces the problem of energy efficiency because of time-varying channels when the offloading is being executed. In this paper, we address the issue of ener...

  4. An efficient method for computing the absorption of solar radiation by water vapor

    Science.gov (United States)

    Chou, M.-D.; Arking, A.

    1981-01-01

    Chou and Arking (1980) have developed a fast but accurate method for computing the IR cooling rate due to water vapor. Using a similar approach, the considered investigation develops a method for computing the heating rates due to the absorption of solar radiation by water vapor in the wavelength range from 4 to 8.3 micrometers. The validity of the method is verified by comparison with line-by-line calculations. An outline is provided of an efficient method for transmittance and flux computations based upon actual line parameters. High speed is achieved by employing a one-parameter scaling approximation to convert an inhomogeneous path into an equivalent homogeneous path at suitably chosen reference conditions.

  5. Computer Architecture for Energy Efficient SFQ

    Science.gov (United States)

    2014-08-27

    IBM Corporation (T.J. Watson Research Laboratory), Yorktown Heights, NY. ABSTRACT: ...accomplished during this ARO-sponsored project at IBM Research to identify and model an energy-efficient SFQ-based computer architecture, the IBM Windsor Blue (WB) architecture, illustrated schematically in Figure 2. The basic building block of WB is a "tile" comprised of a 64-bit arithmetic logic unit

  6. Efficient conjugate gradient algorithms for computation of the manipulator forward dynamics

    Science.gov (United States)

    Fijany, Amir; Scheid, Robert E.

    1989-01-01

    The applicability of conjugate gradient algorithms for computation of the manipulator forward dynamics is investigated. The redundancies in the previously proposed conjugate gradient algorithm are analyzed. A new version is developed which, by avoiding these redundancies, achieves a significantly greater efficiency. A preconditioned conjugate gradient algorithm is also presented. A diagonal matrix whose elements are the diagonal elements of the inertia matrix is proposed as the preconditioner. In order to increase the computational efficiency, an algorithm is developed which exploits the synergism between the computation of the diagonal elements of the inertia matrix and that required by the conjugate gradient algorithm.
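
    A minimal sketch of the preconditioning idea described above, in Python/NumPy (the 7x7 symmetric positive-definite "inertia matrix" is synthetic test data, not a manipulator model): the preconditioner is simply the diagonal of the matrix, applied as an element-wise scaling of the residual.

    ```python
    import numpy as np

    def jacobi_pcg(M, b, tol=1e-10, max_iter=200):
        """Conjugate gradient with a diagonal (Jacobi) preconditioner for M x = b."""
        x = np.zeros(len(b))
        r = b - M @ x
        d_inv = 1.0 / np.diag(M)          # preconditioner: diagonal elements of the matrix
        z = d_inv * r
        p = z.copy()
        rz = r @ z
        for _ in range(max_iter):
            Mp = M @ p
            alpha = rz / (p @ Mp)
            x += alpha * p
            r -= alpha * Mp
            if np.linalg.norm(r) < tol:
                break
            z = d_inv * r
            rz_new = r @ z
            p = z + (rz_new / rz) * p
            rz = rz_new
        return x

    # toy SPD system standing in for M(q) qdd = tau
    rng = np.random.default_rng(1)
    L = rng.standard_normal((7, 7))
    M = L @ L.T + 7 * np.eye(7)
    b = rng.standard_normal(7)
    qdd = jacobi_pcg(M, b)
    assert np.allclose(M @ qdd, b, atol=1e-6)
    ```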

  7. Computer proficiency questionnaire: assessing low and high computer proficient seniors.

    Science.gov (United States)

    Boot, Walter R; Charness, Neil; Czaja, Sara J; Sharit, Joseph; Rogers, Wendy A; Fisk, Arthur D; Mitzner, Tracy; Lee, Chin Chin; Nair, Sankaran

    2015-06-01

    Computers and the Internet have the potential to enrich the lives of seniors and aid in the performance of important tasks required for independent living. A prerequisite for reaping these benefits is having the skills needed to use these systems, which is highly dependent on proper training. One prerequisite for efficient and effective training is being able to gauge current levels of proficiency. We developed a new measure (the Computer Proficiency Questionnaire, or CPQ) to measure computer proficiency in the domains of computer basics, printing, communication, Internet, calendaring software, and multimedia use. Our aim was to develop a measure appropriate for individuals with a wide range of proficiencies from noncomputer users to extremely skilled users. To assess the reliability and validity of the CPQ, a diverse sample of older adults, including 276 older adults with no or minimal computer experience, was recruited and asked to complete the CPQ. The CPQ demonstrated excellent reliability (Cronbach's α = .98), with subscale reliabilities ranging from .86 to .97. Age, computer use, and general technology use all predicted CPQ scores. Factor analysis revealed three main factors of proficiency related to Internet and e-mail use; communication and calendaring; and computer basics. Based on our findings, we also developed a short-form CPQ (CPQ-12) with similar properties but 21 fewer questions. The CPQ and CPQ-12 are useful tools to gauge computer proficiency for training and research purposes, even among low computer proficient older adults.

  8. An efficient and general numerical method to compute steady uniform vortices

    Science.gov (United States)

    Luzzatto-Fegiz, Paolo; Williamson, Charles H. K.

    2011-07-01

    Steady uniform vortices are widely used to represent high Reynolds number flows, yet their efficient computation still presents some challenges. Existing Newton iteration methods become inefficient as the vortices develop fine-scale features; in addition, these methods cannot, in general, find solutions with specified Casimir invariants. On the other hand, available relaxation approaches are computationally inexpensive, but can fail to converge to a solution. In this paper, we overcome these limitations by introducing a new discretization, based on an inverse-velocity map, which radically increases the efficiency of Newton iteration methods. In addition, we introduce a procedure to prescribe Casimirs and remove the degeneracies in the steady vorticity equation, thus ensuring convergence for general vortex configurations. We illustrate our methodology by considering several unbounded flows involving one or two vortices. Our method enables the computation, for the first time, of steady vortices that do not exhibit any geometric symmetry. In addition, we discover that, as the limiting vortex state for each flow is approached, each family of solutions traces a clockwise spiral in a bifurcation plot consisting of a velocity-impulse diagram. By the recently introduced "IVI diagram" stability approach [Phys. Rev. Lett. 104 (2010) 044504], each turn of this spiral is associated with a loss of stability for the steady flows. Such spiral structure is suggested to be a universal feature of steady, uniform-vorticity flows.

  9. Efficient computation of spaced seeds

    Directory of Open Access Journals (Sweden)

    Ilie Silvana

    2012-02-01

    Full Text Available Abstract Background The most frequently used tools in bioinformatics are those searching for similarities, or local alignments, between biological sequences. Since the exact dynamic programming algorithm is quadratic, linear-time heuristics such as BLAST are used. Spaced seeds are much more sensitive than the consecutive seed of BLAST and using several seeds represents the current state of the art in approximate search for biological sequences. The most important aspect is computing highly sensitive seeds. Since the problem seems hard, heuristic algorithms are used. The leading software in the common Bernoulli model is the SpEED program. Findings SpEED uses a hill climbing method based on the overlap complexity heuristic. We propose a new algorithm for this heuristic that improves its speed by over one order of magnitude. We use the new implementation to compute improved seeds for several software programs. We also compute multiple seeds of the same weight as MegaBLAST, which greatly improve its sensitivity. Conclusion Multiple spaced seeds are being successfully used in bioinformatics software programs. Enabling researchers to compute very fast high quality seeds will help expand the range of their applications.

  10. On the efficient parallel computation of Legendre transforms

    NARCIS (Netherlands)

    Inda, M.A.; Bisseling, R.H.; Maslen, D.K.

    2001-01-01

    In this article, we discuss a parallel implementation of efficient algorithms for computation of Legendre polynomial transforms and other orthogonal polynomial transforms. We develop an approach to the Driscoll-Healy algorithm using polynomial arithmetic and present experimental results on the

  11. On the efficient parallel computation of Legendre transforms

    NARCIS (Netherlands)

    Inda, M.A.; Bisseling, R.H.; Maslen, D.K.

    1999-01-01

    In this article we discuss a parallel implementation of efficient algorithms for computation of Legendre polynomial transforms and other orthogonal polynomial transforms. We develop an approach to the Driscoll-Healy algorithm using polynomial arithmetic and present experimental results on the

  12. Computationally efficient clustering of audio-visual meeting data

    NARCIS (Netherlands)

    Hung, H.; Friedland, G.; Yeo, C.; Shao, L.; Shan, C.; Luo, J.; Etoh, M.

    2010-01-01

    This chapter presents novel computationally efficient algorithms to extract semantically meaningful acoustic and visual events related to each of the participants in a group discussion using the example of business meeting recordings. The recording setup involves relatively few audio-visual sensors,

  13. Efficient computation method of Jacobian matrix

    International Nuclear Information System (INIS)

    Sasaki, Shinobu

    1995-05-01

    As is well known, the elements of the Jacobian matrix are complex trigonometric functions of the joint angles, resulting in a matrix of staggering complexity when written out in full. This article shows that these difficulties are overcome by using a velocity representation. The main point is that its recursive algorithm and computer algebra technologies allow an analytical formulation to be derived with no human intervention. In particular, it should be noted that, compared with previous results, the elements are greatly simplified through the effective use of frame transformations. Furthermore, in the case of a spherical wrist, it is shown that the present approach is computationally the most efficient. Owing to these advantages, the proposed method is useful for studying kinematically peculiar properties such as singularities. (author)
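
    To make the object being computed concrete, the sketch below builds the 6x3 geometric Jacobian of a planar three-revolute arm column by column using the textbook construction J_i = [z_i x (p_e - p_i); z_i]. This is a generic illustration, not the author's recursive velocity formulation, and the link lengths are hypothetical.

    ```python
    import numpy as np

    def planar_3r_jacobian(q, lengths=(1.0, 0.8, 0.5)):
        """Geometric Jacobian of a planar 3R arm; column i is [z_i x (p_e - p_i); z_i]."""
        l1, l2, l3 = lengths
        q1, q2, q3 = q
        z = np.array([0.0, 0.0, 1.0])                     # every joint axis is the base z axis
        p0 = np.zeros(3)                                  # joint-1 origin
        p1 = p0 + l1 * np.array([np.cos(q1), np.sin(q1), 0.0])
        p2 = p1 + l2 * np.array([np.cos(q1 + q2), np.sin(q1 + q2), 0.0])
        pe = p2 + l3 * np.array([np.cos(q1 + q2 + q3), np.sin(q1 + q2 + q3), 0.0])
        J = np.zeros((6, 3))
        for i, pi in enumerate((p0, p1, p2)):
            J[:3, i] = np.cross(z, pe - pi)               # linear velocity contribution
            J[3:, i] = z                                  # angular velocity contribution
        return J

    q = np.array([0.3, -0.7, 0.4])
    qdot = np.array([0.1, 0.2, -0.05])
    twist = planar_3r_jacobian(q) @ qdot                  # end-effector [v; w] in the base frame
    ```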

  14. Improving robustness and computational efficiency using modern C++

    International Nuclear Information System (INIS)

    Paterno, M; Kowalkowski, J; Green, C

    2014-01-01

    For nearly two decades, the C++ programming language has been the dominant programming language for experimental HEP. The publication of ISO/IEC 14882:2011, the current version of the international standard for the C++ programming language, makes available a variety of language and library facilities for improving the robustness, expressiveness, and computational efficiency of C++ code. However, much of the C++ written by the experimental HEP community does not take advantage of the features of the language to obtain these benefits, either due to lack of familiarity with these features or concern that these features must somehow be computationally inefficient. In this paper, we address some of the features of modern C++, and show how they can be used to make programs that are both robust and computationally efficient. We compare and contrast simple yet realistic examples of some common implementation patterns in C, currently-typical C++, and modern C++, and show (when necessary, down to the level of generated assembly language code) the quality of the executable code produced by recent C++ compilers, with the aim of allowing the HEP community to make informed decisions on the costs and benefits of the use of modern C++.

  15. Efficient quantum circuits for one-way quantum computing.

    Science.gov (United States)

    Tanamoto, Tetsufumi; Liu, Yu-Xi; Hu, Xuedong; Nori, Franco

    2009-03-13

    While Ising-type interactions are ideal for implementing controlled phase flip gates in one-way quantum computing, natural interactions between solid-state qubits are most often described by either the XY or the Heisenberg models. We show an efficient way of generating cluster states directly using either the imaginary SWAP (iSWAP) gate for the XY model, or the sqrt[SWAP] gate for the Heisenberg model. Our approach thus makes one-way quantum computing more feasible for solid-state devices.
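
    For reference, the two entangling gates mentioned above have the following standard matrix representations in the computational basis {|00>, |01>, |10>, |11>}:

    $$ \mathrm{iSWAP} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & i & 0 \\ 0 & i & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad \sqrt{\mathrm{SWAP}} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & \tfrac{1+i}{2} & \tfrac{1-i}{2} & 0 \\ 0 & \tfrac{1-i}{2} & \tfrac{1+i}{2} & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}. $$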

  16. Energy-efficient computing and networking. Revised selected papers

    Energy Technology Data Exchange (ETDEWEB)

    Hatziargyriou, Nikos; Dimeas, Aris [Ethnikon Metsovion Polytechneion, Athens (Greece); Weidlich, Anke (eds.) [SAP Research Center, Karlsruhe (Germany); Tomtsi, Thomai

    2011-07-01

    This book constitutes the postproceedings of the First International Conference on Energy-Efficient Computing and Networking, E-Energy, held in Passau, Germany in April 2010. The 23 revised papers presented were carefully reviewed and selected for inclusion in the post-proceedings. The papers are organized in topical sections on energy market and algorithms, ICT technology for the energy market, implementation of smart grid and smart home technology, microgrids and energy management, and energy efficiency through distributed energy management and buildings. (orig.)

  17. Wireless-Uplinks-Based Energy-Efficient Scheduling in Mobile Cloud Computing

    Directory of Open Access Journals (Sweden)

    Xing Liu

    2015-01-01

    Full Text Available Mobile cloud computing (MCC) combines cloud computing and mobile internet to improve the computational capabilities of resource-constrained mobile devices (MDs). In MCC, mobile users could not only improve the computational capability of MDs but also save operation consumption by offloading the mobile applications to the cloud. However, MCC faces the problem of energy efficiency because of time-varying channels when the offloading is being executed. In this paper, we address the issue of energy-efficient scheduling for wireless uplink in MCC. By introducing Lyapunov optimization, we first propose a scheduling algorithm that can dynamically choose a channel to transmit data based on queue backlog and channel statistics. Then, we show that the proposed scheduling algorithm can make a tradeoff between queue backlog and energy consumption in a channel-aware MCC system. Simulation results show that the proposed scheduling algorithm can reduce the time average energy consumption for offloading compared to the existing algorithm.
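
    The flavor of such a queue-aware scheduler can be conveyed by a generic Lyapunov drift-plus-penalty sketch. This is an illustration of the framework only, not the paper's algorithm: the arrival rate, channel rates, and transmit powers below are made-up numbers. Each slot, the channel minimizing V*power - Q*rate is chosen, and the parameter V trades energy against backlog.

    ```python
    import numpy as np

    rng = np.random.default_rng(2)
    V = 50.0          # Lyapunov tradeoff parameter (larger -> lower energy, larger backlog)
    Q = 0.0           # data queue backlog (bits awaiting offload)
    energy = 0.0
    slots = 10_000

    for t in range(slots):
        arrivals = rng.poisson(3.0)                     # new bits to offload this slot
        rates = rng.uniform(1.0, 8.0, size=4)           # time-varying uplink rate per channel
        powers = np.array([0.5, 1.0, 1.5, 2.0])         # transmit power per channel
        costs = V * powers - Q * rates                  # drift-plus-penalty cost of each channel
        best = int(np.argmin(costs))
        if costs[best] < 0.0:                           # transmitting beats idling this slot
            served = rates[best]
            energy += powers[best]
        else:
            served = 0.0
        Q = max(Q - served, 0.0) + arrivals

    print(f"avg energy/slot = {energy / slots:.3f}, final backlog = {Q:.1f}")
    ```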

  18. Delayed Slater determinant update algorithms for high efficiency quantum Monte Carlo

    Science.gov (United States)

    McDaniel, T.; D'Azevedo, E. F.; Li, Y. W.; Wong, K.; Kent, P. R. C.

    2017-11-01

    Within ab initio Quantum Monte Carlo simulations, the leading numerical cost for large systems is the computation of the values of the Slater determinants in the trial wavefunction. Each Monte Carlo step requires finding the determinant of a dense matrix. This is most commonly iteratively evaluated using a rank-1 Sherman-Morrison updating scheme to avoid repeated explicit calculation of the inverse. The overall computational cost is, therefore, formally cubic in the number of electrons or matrix size. To improve the numerical efficiency of this procedure, we propose a novel multiple rank delayed update scheme. This strategy enables probability evaluation with an application of accepted moves to the matrices delayed until after a predetermined number of moves, K. The accepted events are then applied to the matrices en bloc with enhanced arithmetic intensity and computational efficiency via matrix-matrix operations instead of matrix-vector operations. This procedure does not change the underlying Monte Carlo sampling or its statistical efficiency. For calculations on large systems and algorithms such as diffusion Monte Carlo, where the acceptance ratio is high, order of magnitude improvements in the update time can be obtained on both multi-core central processing units and graphical processing units.

  19. Increasing the computational efficiency of digital cross correlation by a vectorization method

    Science.gov (United States)

    Chang, Ching-Yuan; Ma, Chien-Ching

    2017-08-01

    This study presents a vectorization method for use in MATLAB programming aimed at increasing the computational efficiency of digital cross correlation in sound and images, resulting in a speedup of 6.387 and 36.044 times compared with performance values obtained from looped expression. This work bridges the gap between matrix operations and loop iteration, preserving flexibility and efficiency in program testing. This paper uses numerical simulation to verify the speedup of the proposed vectorization method as well as experiments to measure the quantitative transient displacement response subjected to dynamic impact loading. The experiment involved the use of a high speed camera as well as a fiber optic system to measure the transient displacement in a cantilever beam under impact from a steel ball. Experimental measurement data obtained from the two methods are in excellent agreement in both the time and frequency domain, with discrepancies of only 0.68%. Numerical and experiment results demonstrate the efficacy of the proposed vectorization method with regard to computational speed in signal processing and high precision in the correlation algorithm. We also present the source code with which to build MATLAB-executable functions on Windows as well as Linux platforms, and provide a series of examples to demonstrate the application of the proposed vectorization method.
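
    The kind of speedup reported above comes from replacing explicit loops with array primitives. A small Python analogue (illustrative only; the paper's implementation is in MATLAB, and the data and timings here are arbitrary) compares a doubly looped cross-correlation with the vectorized library call:

    ```python
    import time
    import numpy as np

    def xcorr_loop(a, b):
        """Naive looped full cross-correlation, lags -(len(b)-1) .. len(a)-1."""
        n, m = len(a), len(b)
        out = np.zeros(n + m - 1)
        for lag in range(-(m - 1), n):
            s = 0.0
            for j in range(m):
                i = lag + j
                if 0 <= i < n:
                    s += a[i] * b[j]
            out[lag + m - 1] = s
        return out

    rng = np.random.default_rng(0)
    a, b = rng.standard_normal(4000), rng.standard_normal(400)

    t0 = time.perf_counter()
    c_loop = xcorr_loop(a, b)
    t1 = time.perf_counter()
    c_vec = np.correlate(a, b, mode="full")   # vectorized equivalent
    t2 = time.perf_counter()

    assert np.allclose(c_loop, c_vec)
    print(f"loop: {t1 - t0:.3f} s   vectorized: {t2 - t1:.5f} s")
    ```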

  20. Perspective: Memcomputing: Leveraging memory and physics to compute efficiently

    Science.gov (United States)

    Di Ventra, Massimiliano; Traversa, Fabio L.

    2018-05-01

    It is well known that physical phenomena may be of great help in computing some difficult problems efficiently. A typical example is prime factorization that may be solved in polynomial time by exploiting quantum entanglement on a quantum computer. There are, however, other types of (non-quantum) physical properties that one may leverage to compute efficiently a wide range of hard problems. In this perspective, we discuss how to employ one such property, memory (time non-locality), in a novel physics-based approach to computation: Memcomputing. In particular, we focus on digital memcomputing machines (DMMs) that are scalable. DMMs can be realized with non-linear dynamical systems with memory. The latter property allows the realization of a new type of Boolean logic, one that is self-organizing. Self-organizing logic gates are "terminal-agnostic," namely, they do not distinguish between the input and output terminals. When appropriately assembled to represent a given combinatorial/optimization problem, the corresponding self-organizing circuit converges to the equilibrium points that express the solutions of the problem at hand. In doing so, DMMs take advantage of the long-range order that develops during the transient dynamics. This collective dynamical behavior, reminiscent of a phase transition, or even the "edge of chaos," is mediated by families of classical trajectories (instantons) that connect critical points of increasing stability in the system's phase space. The topological character of the solution search renders DMMs robust against noise and structural disorder. Since DMMs are non-quantum systems described by ordinary differential equations, not only can they be built in hardware with the available technology, they can also be simulated efficiently on modern classical computers. As an example, we will show the polynomial-time solution of the subset-sum problem for the worst cases, and point to other types of hard problems where simulations of DMMs

  1. An accurate and computationally efficient small-scale nonlinear FEA of flexible risers

    OpenAIRE

    Rahmati, MT; Bahai, H; Alfano, G

    2016-01-01

    This paper presents a highly efficient small-scale, detailed finite-element modelling method for flexible risers which can be effectively implemented in a fully-nested (FE2) multiscale analysis based on computational homogenisation. By exploiting cyclic symmetry and applying periodic boundary conditions, only a small fraction of a flexible pipe is used for a detailed nonlinear finite-element analysis at the small scale. In this model, using three-dimensional elements, all layer components are...

  2. An efficient hybrid technique in RCS predictions of complex targets at high frequencies

    Science.gov (United States)

    Algar, María-Jesús; Lozano, Lorena; Moreno, Javier; González, Iván; Cátedra, Felipe

    2017-09-01

    Most computer codes in Radar Cross Section (RCS) prediction use Physical Optics (PO) and Physical Theory of Diffraction (PTD) combined with Geometrical Optics (GO) and Geometrical Theory of Diffraction (GTD). The latter approaches are computationally cheaper and much more accurate for curved surfaces, but not applicable for the computation of the RCS of all surfaces of a complex object due to the presence of caustic problems in the analysis of concave surfaces or flat surfaces in the far field. The main contribution of this paper is the development of a hybrid method based on a new combination of two asymptotic techniques: GTD and PO, considering the advantages and avoiding the disadvantages of each of them. A very efficient and accurate method to analyze the RCS of complex structures at high frequencies is obtained with the new combination. The proposed new method has been validated by comparing RCS results obtained for some simple cases using the proposed approach with RCS results from the rigorous Method of Moments (MoM) technique. Some complex cases have been examined at high frequencies, contrasting the results with PO. This study shows the accuracy and the efficiency of the hybrid method and its suitability for computing the RCS of very large and complex targets at high frequencies.

  3. Computational Properties of the Hippocampus Increase the Efficiency of Goal-Directed Foraging through Hierarchical Reinforcement Learning

    Directory of Open Access Journals (Sweden)

    Eric Chalmers

    2016-12-01

    Full Text Available The mammalian brain is thought to use a version of Model-based Reinforcement Learning (MBRL) to guide goal-directed behavior, wherein animals consider goals and make plans to acquire desired outcomes. However, conventional MBRL algorithms do not fully explain animals’ ability to rapidly adapt to environmental changes, or learn multiple complex tasks. They also require extensive computation, suggesting that goal-directed behavior is cognitively expensive. We propose here that key features of processing in the hippocampus support a flexible MBRL mechanism for spatial navigation that is computationally efficient and can adapt quickly to change. We investigate this idea by implementing a computational MBRL framework that incorporates features inspired by computational properties of the hippocampus: a hierarchical representation of space, forward sweeps through future spatial trajectories, and context-driven remapping of place cells. We find that a hierarchical abstraction of space greatly reduces the computational load (mental effort) required for adaptation to changing environmental conditions, and allows efficient scaling to large problems. It also allows abstract knowledge gained at high levels to guide adaptation to new obstacles. Moreover, a context-driven remapping mechanism allows learning and memory of multiple tasks. Simulating dorsal or ventral hippocampal lesions in our computational framework qualitatively reproduces behavioral deficits observed in rodents with analogous lesions. The framework may thus embody key features of how the brain organizes model-based RL to efficiently solve navigation and other difficult tasks.

  4. Efficient Adjoint Computation of Hybrid Systems of Differential Algebraic Equations with Applications in Power Systems

    Energy Technology Data Exchange (ETDEWEB)

    Abhyankar, Shrirang [Argonne National Lab. (ANL), Argonne, IL (United States); Anitescu, Mihai [Argonne National Lab. (ANL), Argonne, IL (United States); Constantinescu, Emil [Argonne National Lab. (ANL), Argonne, IL (United States); Zhang, Hong [Argonne National Lab. (ANL), Argonne, IL (United States)

    2016-03-31

    Sensitivity analysis is an important tool to describe power system dynamic behavior in response to parameter variations. It is a central component in preventive and corrective control applications. The existing approaches for sensitivity calculations, namely, finite-difference and forward sensitivity analysis, require a computational effort that increases linearly with the number of sensitivity parameters. In this work, we investigate, implement, and test a discrete adjoint sensitivity approach whose computational effort is effectively independent of the number of sensitivity parameters. The proposed approach is highly efficient for calculating trajectory sensitivities of larger systems and is consistent, within machine precision, with the function whose sensitivity we are seeking. This is an essential feature for use in optimization applications. Moreover, our approach includes a consistent treatment of systems with switching, such as DC exciters, by deriving and implementing the adjoint jump conditions that arise from state and time-dependent discontinuities. The accuracy and the computational efficiency of the proposed approach are demonstrated in comparison with the forward sensitivity analysis approach.
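
    The cost advantage comes from the structure of the adjoint equations. For a smooth ODE subsystem $\dot{x} = f(x, p, t)$ with a terminal objective $\psi(x(T))$, the continuous-time analogue of the discrete adjoint used here reads (a standard textbook form, shown without the switching/jump conditions that the paper derives for hybrid systems):

    $$ \dot{\lambda} = -\left(\frac{\partial f}{\partial x}\right)^{\!\top} \lambda, \qquad \lambda(T) = \left(\frac{\partial \psi}{\partial x}\Big|_{x(T)}\right)^{\!\top}, \qquad \frac{d\psi}{dp} = \lambda(0)^{\top}\frac{\partial x(0)}{\partial p} + \int_0^T \lambda^{\top}\frac{\partial f}{\partial p}\,dt, $$

    so a single backward solve for $\lambda$ yields the sensitivities with respect to every parameter at once, which is why the computational effort is essentially independent of the number of parameters.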

  5. Scalable optical packet switch architecture for low latency and high load computer communication networks

    NARCIS (Netherlands)

    Calabretta, N.; Di Lucente, S.; Nazarathy, Y.; Raz, O.; Dorren, H.J.S.

    2011-01-01

    High performance computer and data-centers require PetaFlop/s processing speed and Petabyte storage capacity with thousands of low-latency short link interconnections between computers nodes. Switch matrices that operate transparently in the optical domain are a potential way to efficiently

  6. Dendritic nonlinearities are tuned for efficient spike-based computations in cortical circuits.

    Science.gov (United States)

    Ujfalussy, Balázs B; Makara, Judit K; Branco, Tiago; Lengyel, Máté

    2015-12-24

    Cortical neurons integrate thousands of synaptic inputs in their dendrites in highly nonlinear ways. It is unknown how these dendritic nonlinearities in individual cells contribute to computations at the level of neural circuits. Here, we show that dendritic nonlinearities are critical for the efficient integration of synaptic inputs in circuits performing analog computations with spiking neurons. We developed a theory that formalizes how a neuron's dendritic nonlinearity that is optimal for integrating synaptic inputs depends on the statistics of its presynaptic activity patterns. Based on their in vivo presynaptic population statistics (firing rates, membrane potential fluctuations, and correlations due to ensemble dynamics), our theory accurately predicted the responses of two different types of cortical pyramidal cells to patterned stimulation by two-photon glutamate uncaging. These results reveal a new computational principle underlying dendritic integration in cortical neurons by suggesting a functional link between cellular and systems-level properties of cortical circuits.

  7. Processing-Efficient Distributed Adaptive RLS Filtering for Computationally Constrained Platforms

    Directory of Open Access Journals (Sweden)

    Noor M. Khan

    2017-01-01

    Full Text Available In this paper, a novel processing-efficient architecture of a group of inexpensive and computationally constrained small platforms is proposed for a parallel distributed adaptive signal processing (PDASP) operation. The proposed architecture runs computationally expensive procedures, such as the complex adaptive recursive least squares (RLS) algorithm, cooperatively. The proposed PDASP architecture operates properly even if perfect time alignment among the participating platforms is not available. An RLS algorithm applied to MIMO channel estimation is deployed on the proposed architecture. The complexity and processing time of the PDASP scheme with the MIMO RLS algorithm are compared with those of the sequentially operated MIMO RLS algorithm and the linear Kalman filter. It is observed that the PDASP scheme, when run in parallel, exhibits much lower computational complexity than the sequential MIMO RLS algorithm as well as the Kalman filter. Moreover, when operated in parallel, the proposed architecture provides 95.83% and 82.29% reductions in processing time compared to the sequentially operated Kalman filter and MIMO RLS algorithm, respectively, for a low Doppler rate. Likewise, for a high Doppler rate, the proposed architecture yields 94.12% and 77.28% reductions in processing time compared to the Kalman and RLS algorithms, respectively.
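
    For context, the per-sample work being distributed is the standard exponentially weighted RLS recursion. A compact single-node reference in Python (generic textbook form, not the proposed PDASP partitioning; the 4-tap channel and noise level are made up) looks like this:

    ```python
    import numpy as np

    def rls_update(w, P, x, d, lam=0.99):
        """One exponentially weighted RLS step; returns updated weights, inverse
        correlation matrix estimate, and the a priori error."""
        Px = P @ x
        k = Px / (lam + x @ Px)        # gain vector
        e = d - w @ x                  # a priori estimation error
        w = w + k * e
        P = (P - np.outer(k, Px)) / lam
        return w, P, e

    rng = np.random.default_rng(0)
    h_true = np.array([0.8, -0.4, 0.2, 0.1])   # hypothetical channel taps
    w = np.zeros(4)
    P = 1e3 * np.eye(4)                        # large initial P -> weak prior
    x_hist = np.zeros(4)

    for n in range(2000):
        x_hist = np.roll(x_hist, 1)
        x_hist[0] = rng.standard_normal()
        d = h_true @ x_hist + 0.01 * rng.standard_normal()
        w, P, e = rls_update(w, P, x_hist, d)

    print(np.round(w, 3))                      # converges to h_true
    ```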

  8. Computing in high energy physics

    Energy Technology Data Exchange (ETDEWEB)

    Watase, Yoshiyuki

    1991-09-15

    The increasingly important role played by computing and computers in high energy physics is displayed in the 'Computing in High Energy Physics' series of conferences, bringing together experts in different aspects of computing - physicists, computer scientists, and vendors.

  9. Computing in high energy physics

    International Nuclear Information System (INIS)

    Watase, Yoshiyuki

    1991-01-01

    The increasingly important role played by computing and computers in high energy physics is displayed in the 'Computing in High Energy Physics' series of conferences, bringing together experts in different aspects of computing - physicists, computer scientists, and vendors

  10. Energy-Efficient FPGA-Based Parallel Quasi-Stochastic Computing

    Directory of Open Access Journals (Sweden)

    Ramu Seva

    2017-11-01

    Full Text Available The high performance of FPGA (Field Programmable Gate Array) in image processing applications is justified by its flexible reconfigurability, its inherent parallel nature and the availability of a large amount of internal memories. Lately, the Stochastic Computing (SC) paradigm has been found to be significantly advantageous in certain application domains including image processing because of its lower hardware complexity and power consumption. However, its viability is deemed to be limited due to its serial bitstream processing and excessive run-time requirement for convergence. To address these issues, a novel approach is proposed in this work where an energy-efficient implementation of SC is accomplished by introducing fast-converging Quasi-Stochastic Number Generators (QSNGs) and parallel stochastic bitstream processing, which are well suited to leverage FPGA’s reconfigurability and abundant internal memory resources. The proposed approach has been tested on the Virtex-4 FPGA, and results have been compared with the serial and parallel implementations of conventional stochastic computation using the well-known SC edge detection and multiplication circuits. Results prove that by using this approach, execution time as well as power consumption are decreased by factors of 3.5 and 4.5 for the edge detection circuit and multiplication circuit, respectively.
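
    The core trick that makes SC hardware so small is that arithmetic reduces to bitwise logic on random bitstreams: a single AND gate multiplies two values encoded as bit probabilities. A short Python sketch of this idea (unipolar encoding, plain pseudo-random streams rather than the paper's quasi-stochastic generators) also shows why long streams are needed for convergence:

    ```python
    import numpy as np

    rng = np.random.default_rng(0)

    def sc_encode(value, length):
        """Unipolar stochastic encoding: P(bit = 1) equals the value in [0, 1]."""
        return rng.random(length) < value

    a, b = 0.75, 0.40
    for length in (256, 4096, 65536):
        sa, sb = sc_encode(a, length), sc_encode(b, length)
        product = (sa & sb).mean()        # one AND per bit pair implements a * b
        print(f"N={length:6d}  estimate={product:.4f}  exact={a * b:.4f}")
    ```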

  11. Efficient universal computing architectures for decoding neural activity.

    Directory of Open Access Journals (Sweden)

    Benjamin I Rapoport

    Full Text Available The ability to decode neural activity into meaningful control signals for prosthetic devices is critical to the development of clinically useful brain-machine interfaces (BMIs). Such systems require input from tens to hundreds of brain-implanted recording electrodes in order to deliver robust and accurate performance; in serving that primary function they should also minimize power dissipation in order to avoid damaging neural tissue; and they should transmit data wirelessly in order to minimize the risk of infection associated with chronic, transcutaneous implants. Electronic architectures for brain-machine interfaces must therefore minimize size and power consumption, while maximizing the ability to compress data to be transmitted over limited-bandwidth wireless channels. Here we present a system of extremely low computational complexity, designed for real-time decoding of neural signals, and suited for highly scalable implantable systems. Our programmable architecture is an explicit implementation of a universal computing machine emulating the dynamics of a network of integrate-and-fire neurons; it requires no arithmetic operations except for counting, and decodes neural signals using only computationally inexpensive logic operations. The simplicity of this architecture does not compromise its ability to compress raw neural data by factors greater than [Formula: see text]. We describe a set of decoding algorithms based on this computational architecture, one designed to operate within an implanted system, minimizing its power consumption and data transmission bandwidth; and a complementary set of algorithms for learning, programming the decoder, and postprocessing the decoded output, designed to operate in an external, nonimplanted unit. The implementation of the implantable portion is estimated to require fewer than 5000 operations per second. A proof-of-concept, 32-channel field-programmable gate array (FPGA) implementation of this portion

  12. Investigating the Multi-memetic Mind Evolutionary Computation Algorithm Efficiency

    Directory of Open Access Journals (Sweden)

    M. K. Sakharov

    2017-01-01

    Full Text Available In practically significant global optimization problems, the objective function is often of high dimensionality and computational complexity and has a nontrivial landscape. Studies show that a single optimization method is often not enough to solve such problems efficiently; hybridization of several optimization methods is necessary. One of the most promising contemporary trends in this field is memetic algorithms (MA), which can be viewed as a synergistic combination of a population-based search for a global optimum with procedures for local refinement of solutions (memes). Since there are relatively few theoretical studies on which MA configuration is advisable for black-box optimization problems, many researchers turn to adaptive algorithms, which select the most efficient local optimization methods for particular domains of the search space. The article proposes a multi-memetic modification of the simple SMEC algorithm using random hyper-heuristics. It presents the software implementation of the algorithm and the memes used (the Nelder-Mead method, random search on a hyper-sphere surface, and the Hooke-Jeeves method), and conducts a comparative study of the efficiency of the proposed algorithm depending on the set and the number of memes. The study has been carried out using the Rastrigin, Rosenbrock, and Zakharov multidimensional test functions. Computational experiments have been carried out for all possible combinations of memes and for each meme individually. According to the results of the study, conducted with the multi-start method, the combinations of memes that include the Hooke-Jeeves method were successful. These results reflect the rapid convergence of that method to a local optimum compared with the other memes, since all methods perform at most a fixed number of iterations. The analysis of the average number of iterations shows that using the most efficient sets of memes allows us to find the optimal

  13. Progress of High Efficiency Centrifugal Compressor Simulations Using TURBO

    Science.gov (United States)

    Kulkarni, Sameer; Beach, Timothy A.

    2017-01-01

    Three-dimensional, time-accurate, and phase-lagged computational fluid dynamics (CFD) simulations of the High Efficiency Centrifugal Compressor (HECC) stage were generated using the TURBO solver. Changes to the TURBO Parallel Version 4 source code were made in order to properly model the no-slip boundary condition along the spinning hub region for centrifugal impellers. A startup procedure was developed to generate a converged flow field in TURBO. This procedure initialized computations on a coarsened mesh generated by the Turbomachinery Gridding System (TGS) and relied on a method of systematically increasing wheel speed and backpressure. Baseline design-speed TURBO results generally overpredicted total pressure ratio, adiabatic efficiency, and the choking flow rate of the HECC stage as compared with the design-intent CFD results of Code Leo. Including diffuser fillet geometry in the TURBO computation resulted in a 0.6 percent reduction in the choking flow rate and led to a better match with design-intent CFD. Diffuser fillets reduced annulus cross-sectional area but also reduced corner separation, and thus blockage, in the diffuser passage. It was found that the TURBO computations are somewhat insensitive to inlet total pressure changing from the TURBO default inlet pressure of 14.7 pounds per square inch (101.35 kilopascals) down to 11.0 pounds per square inch (75.83 kilopascals), the inlet pressure of the component test. Off-design tip clearance was modeled in TURBO in two computations: one in which the blade tip geometry was trimmed by 12 mils (0.3048 millimeters), and another in which the hub flow path was moved to reflect a 12-mil axial shift in the impeller hub, creating a step at the hub. The one-dimensional results of these two computations indicate non-negligible differences between the two modeling approaches.

  14. High-efficiency electric motors: An analysis of a feasible tariff policy for Brazil

    International Nuclear Information System (INIS)

    Paiva Delgado, M.A. de; Tolmasquim, M.T.

    1997-01-01

    The main objective is to calculate an average value for an electricity tariff which will facilitate the introduction of high-efficiency electric motors in the production sector. Two computational models will be developed for technical-economic evaluation to assess economic attractiveness by calculating feasible average electricity tariffs in order to create a market for substitution of standard motors by new high-efficiency models (Purchase Decision Model) as well as to determine if retrofitting of standard installed motors by others with high-efficiency characteristics is viable, and, if so, to specify the optimum timing for such substitution (Substitution Decision Model). It should be noted that the Purchase Decision Model takes into account power factor adjustment and the Substitution Decision Model incorporates considerations as to reduction in the electromechanical performance of operating motors. Results indicate that even where average electricity tariffs are low, as in Brazil, high-efficiency motors are economically attractive compared to standard motors. There is an obvious need for complementary instruments to assist massive market penetration

  15. High Efficiency Ka-Band Spatial Combiner

    Directory of Open Access Journals (Sweden)

    D. Passi

    2014-12-01

    Full Text Available A Ka-band, high-efficiency, small-size spatial combiner (SPC) is proposed in this paper, which uses innovatively matched quadruple fin-line to microstrip (FLuS) transitions. To the authors' best knowledge, no such innovative FLuS transitions have been reported in the literature at the time of writing. These transitions are inserted into a WR28 waveguide T-junction in order to allow the integration of 16 Monolithic Microwave Integrated Circuit (MMIC) Solid State Power Amplifiers (SSPAs). A computational electromagnetic model using the finite element method has been implemented. A mean insertion loss of 2 dB is achieved with a return loss better than 10 dB in the 31-37 GHz bandwidth.

  16. Toward high-efficiency and detailed Monte Carlo simulation study of the granular flow spallation target

    Science.gov (United States)

    Cai, Han-Jie; Zhang, Zhi-Lei; Fu, Fen; Li, Jian-Yang; Zhang, Xun-Chao; Zhang, Ya-Ling; Yan, Xue-Song; Lin, Ping; Xv, Jian-Ya; Yang, Lei

    2018-02-01

    The dense granular flow spallation target is a new target concept chosen for the Accelerator-Driven Subcritical (ADS) project in China. For the R&D of this kind of target concept, a dedicated Monte Carlo (MC) program named GMT was developed to perform the simulation study of the beam-target interaction. Owing to the complexity of the target geometry, the MC simulation of particle tracks is computationally very expensive. Thus, improvement of computational efficiency will be essential for the detailed MC simulation studies of the dense granular target. Here we present the special design of the GMT program and its high-efficiency performance. In addition, the speedup potential of the GPU-accelerated spallation models is discussed.

  17. PVT: an efficient computational procedure to speed up next-generation sequence analysis.

    Science.gov (United States)

    Maji, Ranjan Kumar; Sarkar, Arijita; Khatua, Sunirmal; Dasgupta, Subhasis; Ghosh, Zhumur

    2014-06-04

    High-throughput Next-Generation Sequencing (NGS) techniques are advancing genomics and molecular biology research. This technology generates substantially large data, which poses a major challenge to scientists seeking an efficient, cost- and time-effective solution for its analysis. Further, for the different types of NGS data, there are certain common challenging steps involved in analysing those data. Spliced alignment is one such fundamental step in NGS data analysis which is extremely computationally intensive as well as time consuming. Serious problems exist even with the most widely used spliced alignment tools. TopHat is one such widely used spliced alignment tool which, although it supports multithreading, does not efficiently utilize computational resources in terms of CPU utilization and memory. Here we have introduced PVT (Pipelined Version of TopHat), where we take a modular approach by breaking TopHat's serial execution into a pipeline of multiple stages, thereby increasing the degree of parallelization and computational resource utilization. We thus address the shortcomings of TopHat so as to analyze large NGS data efficiently. We analysed the SRA datasets SRX026839 and SRX026838, consisting of single-end reads, and the SRA dataset SRR1027730, consisting of paired-end reads. We used TopHat v2.0.8 to analyse these datasets and noted the CPU usage, memory footprint, and execution time during spliced alignment. With this basic information, we designed PVT, a pipelined version of TopHat that removes the redundant computational steps during spliced alignment and breaks the job into a pipeline of multiple stages (each comprising different steps) to improve resource utilization, thus reducing the execution time. PVT provides an improvement over TopHat for spliced alignment in NGS data analysis. PVT reduced the execution time to ~23% for the single-end read dataset. Further, PVT designed for paired-end reads showed an

  18. Computationally Efficient Power Allocation Algorithm in Multicarrier-Based Cognitive Radio Networks: OFDM and FBMC Systems

    Directory of Open Access Journals (Sweden)

    Shaat Musbah

    2010-01-01

    Full Text Available Cognitive Radio (CR) systems have been proposed to increase spectrum utilization by opportunistically accessing the unused spectrum. Multicarrier communication systems are promising candidates for CR systems. Due to its high spectral efficiency, filter bank multicarrier (FBMC) can be considered as an alternative to conventional orthogonal frequency division multiplexing (OFDM) for transmission over CR networks. This paper addresses the problem of resource allocation in multicarrier-based CR networks. The objective is to maximize the downlink capacity of the network under both a total power constraint and a constraint on the interference introduced to the primary users (PUs). The optimal solution has high computational complexity, which makes it unsuitable for practical applications, and hence a low-complexity suboptimal solution is proposed. The proposed algorithm utilizes the spectrum holes in the PU bands as well as the active PU bands. The performance of the proposed algorithm is investigated for OFDM- and FBMC-based CR systems. Simulation results illustrate that the proposed resource allocation algorithm with low computational complexity achieves near-optimal performance and demonstrates the efficiency of using FBMC in the CR context.
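
    As a point of reference for the suboptimal algorithm discussed above, the classical single-constraint baseline is water-filling over the subcarrier gains. The sketch below is illustrative only: it enforces just the total-power budget, not the per-PU interference constraints that the paper adds, and the gains are made-up numbers. The water level is found by bisection.

    ```python
    import numpy as np

    def waterfill(gains, p_total, iters=60):
        """Allocate p_i = max(0, mu - 1/g_i) with sum(p_i) = p_total, via bisection on mu."""
        inv_g = 1.0 / gains
        lo, hi = 0.0, p_total + inv_g.max()          # the water level mu lies in this bracket
        for _ in range(iters):
            mu = 0.5 * (lo + hi)
            if np.maximum(0.0, mu - inv_g).sum() > p_total:
                hi = mu
            else:
                lo = mu
        return np.maximum(0.0, 0.5 * (lo + hi) - inv_g)

    rng = np.random.default_rng(0)
    gains = rng.exponential(1.0, size=16)            # hypothetical subcarrier SNR gains
    p = waterfill(gains, p_total=10.0)
    rate = np.log2(1.0 + gains * p).sum()            # resulting capacity (bits/s/Hz)
    print(np.round(p, 3), round(rate, 2), round(p.sum(), 3))
    ```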

  19. Turbocharged molecular discovery of OLED emitters: from high-throughput quantum simulation to highly efficient TADF devices

    Science.gov (United States)

    Gómez-Bombarelli, Rafael; Aguilera-Iparraguirre, Jorge; Hirzel, Timothy D.; Ha, Dong-Gwang; Einzinger, Markus; Wu, Tony; Baldo, Marc A.; Aspuru-Guzik, Alán.

    2016-09-01

    Discovering new OLED emitters requires many experiments to synthesize candidates and test performance in devices. Large scale computer simulation can greatly speed this search process but the problem remains challenging enough that brute force application of massive computing power is not enough to successfully identify novel structures. We report a successful High Throughput Virtual Screening study that leveraged a range of methods to optimize the search process. The generation of candidate structures was constrained to contain combinatorial explosion. Simulations were tuned to the specific problem and calibrated with experimental results. Experimentalists and theorists actively collaborated such that experimental feedback was regularly utilized to update and shape the computational search. Supervised machine learning methods prioritized candidate structures prior to quantum chemistry simulation to prevent wasting compute on likely poor performers. With this combination of techniques, each multiplying the strength of the search, this effort managed to navigate an area of molecular space and identify hundreds of promising OLED candidate structures. An experimentally validated selection of this set shows emitters with external quantum efficiencies as high as 22%.

  20. Efficiency of High Order Spectral Element Methods on Petascale Architectures

    KAUST Repository

    Hutchinson, Maxwell; Heinecke, Alexander; Pabst, Hans; Henry, Greg; Parsani, Matteo; Keyes, David E.

    2016-01-01

    High order methods for the solution of PDEs expose a tradeoff between computational cost and accuracy on a per degree of freedom basis. In many cases, the cost increases due to higher arithmetic intensity while affecting data movement minimally. As architectures tend towards wider vector instructions and expect higher arithmetic intensities, the best order for a particular simulation may change. This study highlights preferred orders by identifying the high order efficiency frontier of the spectral element method implemented in Nek5000 and NekBox: the set of orders and meshes that minimize computational cost at fixed accuracy. First, we extract Nek’s order-dependent computational kernels and demonstrate exceptional hardware utilization by hardware-aware implementations. Then, we perform production-scale calculations of the nonlinear single mode Rayleigh-Taylor instability on BlueGene/Q and Cray XC40-based supercomputers to highlight the influence of the architecture. Accuracy is defined with respect to physical observables, and computational costs are measured by the core-hour charge of the entire application. The total number of grid points needed to achieve a given accuracy is reduced by increasing the polynomial order. On the XC40 and BlueGene/Q, polynomial orders as high as 31 and 15 come at no marginal cost per timestep, respectively. Taken together, these observations lead to a strong preference for high order discretizations that use fewer degrees of freedom. From a performance point of view, we demonstrate up to 60% full application bandwidth utilization at scale and achieve ≈1PFlop/s of compute performance in Nek’s most flop-intense methods.

  1. Efficiency of High Order Spectral Element Methods on Petascale Architectures

    KAUST Repository

    Hutchinson, Maxwell

    2016-06-14

    High order methods for the solution of PDEs expose a tradeoff between computational cost and accuracy on a per degree of freedom basis. In many cases, the cost increases due to higher arithmetic intensity while affecting data movement minimally. As architectures tend towards wider vector instructions and expect higher arithmetic intensities, the best order for a particular simulation may change. This study highlights preferred orders by identifying the high order efficiency frontier of the spectral element method implemented in Nek5000 and NekBox: the set of orders and meshes that minimize computational cost at fixed accuracy. First, we extract Nek’s order-dependent computational kernels and demonstrate exceptional hardware utilization by hardware-aware implementations. Then, we perform production-scale calculations of the nonlinear single mode Rayleigh-Taylor instability on BlueGene/Q and Cray XC40-based supercomputers to highlight the influence of the architecture. Accuracy is defined with respect to physical observables, and computational costs are measured by the core-hour charge of the entire application. The total number of grid points needed to achieve a given accuracy is reduced by increasing the polynomial order. On the XC40 and BlueGene/Q, polynomial orders as high as 31 and 15 come at no marginal cost per timestep, respectively. Taken together, these observations lead to a strong preference for high order discretizations that use fewer degrees of freedom. From a performance point of view, we demonstrate up to 60% full application bandwidth utilization at scale and achieve ≈1 PFlop/s of compute performance in Nek’s most flop-intense methods.

  2. Unified commutation-pruning technique for efficient computation of composite DFTs

    Science.gov (United States)

    Castro-Palazuelos, David E.; Medina-Melendrez, Modesto Gpe.; Torres-Roman, Deni L.; Shkvarko, Yuriy V.

    2015-12-01

    An efficient computation of a composite length discrete Fourier transform (DFT), as well as a fast Fourier transform (FFT) of both time and space data sequences in uncertain (non-sparse or sparse) computational scenarios, requires specific processing algorithms. Traditional algorithms typically employ some pruning methods without any commutations, which prevents them from attaining the potential computational efficiency. In this paper, we propose an alternative unified approach with automatic commutations between three computational modalities aimed at efficient computations of the pruned DFTs adapted for variable composite lengths of the non-sparse input-output data. The first modality is an implementation of the direct computation of a composite length DFT, the second one employs the second-order recursive filtering method, and the third one performs the new pruned decomposed transform. The pruned decomposed transform algorithm performs the decimation in time or space (DIT) in the data acquisition domain and, then, decimation in frequency (DIF). The unified combination of these three algorithms is addressed as the DFTCOMM technique. Based on the treatment of the combinational-type hypotheses testing optimization problem of preferable allocations between all feasible commuting-pruning modalities, we have found the global optimal solution to the pruning problem that always requires fewer or, at most, the same number of arithmetic operations as any other feasible modality. The DFTCOMM method therefore outperforms the existing competing pruning techniques in terms of attainable savings in the number of required arithmetic operations: it requires fewer or, at most, the same number of arithmetic operations for its execution than any other competing pruning method reported in the literature. Finally, we provide the comparison of the DFTCOMM with the recently developed sparse fast Fourier transform (SFFT) algorithmic family. We feature that, in the sensing scenarios with
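
    The pruning idea itself can be illustrated in a few lines of NumPy: when only a handful of output bins are needed, they can be computed directly in O(NK) work instead of running a full FFT. This toy sketch shows plain output pruning only; the commutation between direct, recursive-filtering and decomposed (DIT-then-DIF) modalities that defines DFTCOMM is not reproduced here, and the composite length and bin set are arbitrary examples.

        import numpy as np

        def pruned_dft(x, bins):
            """Compute only the requested DFT output bins directly (O(N*K) work)."""
            x = np.asarray(x, dtype=complex)
            n = len(x)
            k = np.asarray(bins)[:, None]              # selected output indices
            t = np.arange(n)[None, :]
            return (x * np.exp(-2j * np.pi * k * t / n)).sum(axis=1)

        x = np.random.default_rng(1).standard_normal(360)   # composite length N = 360
        bins = [0, 5, 12, 100]
        # agrees with the corresponding bins of a full FFT
        assert np.allclose(pruned_dft(x, bins), np.fft.fft(x)[bins])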

  3. A high performance scientific cloud computing environment for materials simulations

    Science.gov (United States)

    Jorissen, K.; Vila, F. D.; Rehr, J. J.

    2012-09-01

    We describe the development of a scientific cloud computing (SCC) platform that offers high performance computation capability. The platform consists of a scientific virtual machine prototype containing a UNIX operating system and several materials science codes, together with essential interface tools (an SCC toolset) that offers functionality comparable to local compute clusters. In particular, our SCC toolset provides automatic creation of virtual clusters for parallel computing, including tools for execution and monitoring performance, as well as efficient I/O utilities that enable seamless connections to and from the cloud. Our SCC platform is optimized for the Amazon Elastic Compute Cloud (EC2). We present benchmarks for prototypical scientific applications and demonstrate performance comparable to local compute clusters. To facilitate code execution and provide user-friendly access, we have also integrated cloud computing capability in a JAVA-based GUI. Our SCC platform may be an alternative to traditional HPC resources for materials science or quantum chemistry applications.

  4. A primary study on the increasing of efficiency in the computer cooling system by means of external air

    Energy Technology Data Exchange (ETDEWEB)

    Kim, S. H.; Kim, M. H. [Silla University, Busan (Korea, Republic of)

    2009-07-01

    In recent years, as the capabilities of personal computers have continued to increase (optimal performance, high quality and high-resolution images), the computer system's components produce large amounts of heat during operation. This study analyzes and investigates the ability and efficiency of the cooling system inside the computer by means of the Central Processing Unit (CPU) and power supply cooling fans. The research was conducted to increase the capability of the cooling system inside the computer by making a structure that produces different air pressures in an air inflow tube. Consequently, when the temperatures of the CPU and of the space inside the computer were compared with those of a general personal computer, the temperatures of the tested CPU, the interior space and the heat sink were lower by 5 °C, 2.5 °C and 7 °C, respectively. In addition, the fan speed was lower by 250 revolutions per minute (RPM) after 1 hour of operation. This research explored the possibility of enhancing the effective cooling of high-performance computer systems.

  5. Achieving high performance in numerical computations on RISC workstations and parallel systems

    Energy Technology Data Exchange (ETDEWEB)

    Goedecker, S. [Max-Planck Inst. for Solid State Research, Stuttgart (Germany); Hoisie, A. [Los Alamos National Lab., NM (United States)

    1997-08-20

    The nominal peak speeds of both serial and parallel computers are rising rapidly. At the same time, however, it is becoming increasingly difficult to extract a significant fraction of this high peak speed from modern computer architectures. In this tutorial the authors give scientists and engineers involved in numerically demanding calculations and simulations the necessary basic knowledge to write reasonably efficient programs. The basic principles are rather simple and the possible rewards large. Writing a program that takes into account optimization techniques related to the computer architecture can significantly speed it up, often by factors of 10--100. As such, optimizing a program can, for instance, be a much better solution than buying a faster computer. If a few basic optimization principles are applied during program development, the additional time needed for obtaining an efficient program is practically negligible. In-depth optimization is usually only needed for a few subroutines or kernels, and the effort involved is therefore also acceptable.

  6. Toward High-Power Klystrons With RF Power Conversion Efficiency on the Order of 90%

    CERN Document Server

    Baikov, Andrey Yu; Syratchev, Igor

    2015-01-01

    The increase in efficiency of RF power generation for future large accelerators is considered a high priority issue. The vast majority of the existing commercial high-power RF klystrons operates in the electronic efficiency range between 40% and 55%. Only a few klystrons available on the market are capable of operating with 65% efficiency or above. In this paper, a new method to achieve 90% RF power conversion efficiency in a klystron amplifier is presented. The essential part of this method is a new bunching technique - bunching with bunch core oscillations. Computer simulations confirm that the RF production efficiency above 90% can be reached with this new bunching method. The results of a preliminary study of an L-band, 20-MW peak RF power multibeam klystron for Compact Linear Collider with the efficiency above 85% are presented.

  7. Computing in high energy physics

    Energy Technology Data Exchange (ETDEWEB)

    Smith, Sarah; Devenish, Robin [Nuclear Physics Laboratory, Oxford University (United Kingdom)

    1989-07-15

    Computing in high energy physics has changed over the years from being something one did on a slide-rule, through early computers, then a necessary evil to the position today where computers permeate all aspects of the subject from control of the apparatus to theoretical lattice gauge calculations. The state of the art, as well as new trends and hopes, were reflected in this year's 'Computing In High Energy Physics' conference held in the dreamy setting of Oxford's spires. The conference aimed to give a comprehensive overview, entailing a heavy schedule of 35 plenary talks plus 48 contributed papers in two afternoons of parallel sessions. In addition to high energy physics computing, a number of papers were given by experts in computing science, in line with the conference's aim – 'to bring together high energy physicists and computer scientists'.

  8. An Efficient Integer Coding and Computing Method for Multiscale Time Segment

    Directory of Open Access Journals (Sweden)

    TONG Xiaochong

    2016-12-01

    Full Text Available This article focuses on the existing problems and status of current time segment coding and proposes a new approach: multi-scale time segment integer coding (MTSIC). The approach utilizes the tree structure and the size ordering formed among integers to reflect the relationships among multi-scale time segments (order, inclusion/containment, intersection, etc.), and thereby achieves a unified integer coding process for multi-scale time. On this foundation, the research also studies computing methods for calculating the temporal relationships of MTSIC, to support efficient calculation and querying based on time segments, and preliminarily discusses application methods and prospects of MTSIC. Tests indicate that the implementation of MTSIC is convenient and reliable, that conversion between MTSIC and the traditional method is straightforward, and that it achieves very high efficiency in querying and calculation.

  9. Efficient O(N) recursive computation of the operational space inertial matrix

    International Nuclear Information System (INIS)

    Lilly, K.W.; Orin, D.E.

    1993-01-01

    The operational space inertia matrix Λ reflects the dynamic properties of a robot manipulator to its tip. In the control domain, it may be used to decouple force and/or motion control about the manipulator workspace axes. The matrix Λ also plays an important role in the development of efficient algorithms for the dynamic simulation of closed-chain robotic mechanisms, including simple closed-chain mechanisms such as multiple manipulator systems and walking machines. The traditional approach used to compute Λ has a computational complexity of O(N^3) for an N degree-of-freedom manipulator. This paper presents the development of a recursive algorithm for computing the operational space inertia matrix (OSIM) that reduces the computational complexity to O(N). This algorithm, the inertia propagation method, is based on a single recursion that begins at the base of the manipulator and progresses out to the last link. Also applicable to redundant systems and mechanisms with multiple-degree-of-freedom joints, the inertia propagation method is the most efficient method known for computing Λ for N ≥ 6. The numerical accuracy of the algorithm is discussed for a PUMA 560 robot with a fixed base.

  10. An energy-efficient high-performance processor with reconfigurable data-paths using RSFQ circuits

    International Nuclear Information System (INIS)

    Takagi, Naofumi

    2013-01-01

    Highlights: (1) An idea for a high-performance computer using RSFQ circuits is shown. (2) An outline of a processor with reconfigurable data-paths (RDPs) is shown. (3) Architectural details of an SFQ-RDP are described. Abstract: We show recent progress in our research on an energy-efficient high-performance processor with reconfigurable data-paths (RDPs) using rapid single-flux-quantum (RSFQ) circuits. We mainly describe the architectural details of an RDP implemented using RSFQ circuits. An RDP consists of many floating-point units (FPUs) and operand routing networks (ORNs) that connect the FPUs. We reconfigure the RDP to fit a computation, i.e., a group of floating-point operations appearing in a ‘for’ loop of a numerical program, by setting the routes in the ORNs before the execution of the loop. In the RDP, many FPUs work in parallel in a pipelined fashion, and hence very high-performance computation is achieved.

  11. Some computational challenges of developing efficient parallel algorithms for data-dependent computations in thermal-hydraulics supercomputer applications

    International Nuclear Information System (INIS)

    Woodruff, S.B.

    1994-01-01

    The Transient Reactor Analysis Code (TRAC), which features a two-fluid treatment of thermal-hydraulics, is designed to model transients in water reactors and related facilities. One of the major computational costs associated with TRAC and similar codes is calculating constitutive coefficients. Although the formulations for these coefficients are local, the costs are flow-regime- or data-dependent; i.e., the computations needed for a given spatial node often vary widely as a function of time. Consequently, a fixed, uniform assignment of nodes to parallel processors will result in degraded computational efficiency due to the poor load balancing. A standard method for treating data-dependent models on vector architectures has been to use gather operations (or indirect addressing) to sort the nodes into subsets that (temporarily) share a common computational model. However, this method is not effective on distributed memory data parallel architectures, where indirect addressing involves expensive communication overhead. Another serious problem with this method involves software engineering challenges in the areas of maintainability and extensibility. For example, an implementation that was hand-tuned to achieve good computational efficiency would have to be rewritten whenever the decision tree governing the sorting was modified. Using an example based on the calculation of the wall-to-liquid and wall-to-vapor heat-transfer coefficients for three nonboiling flow regimes, we describe how the use of the Fortran 90 WHERE construct and automatic inlining of functions can be used to ameliorate this problem while improving both efficiency and software engineering. Unfortunately, a general automatic solution to the load-balancing problem associated with data-dependent computations is not yet available for massively parallel architectures. We discuss why developers should either wait for such solutions or consider alternative numerical algorithms, such as a neural network
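
    The masked-computation idea behind the Fortran 90 WHERE construct can be sketched in NumPy: each node is evaluated with uniform, branch-free work and the per-regime result is selected by a mask, instead of gathering nodes into regime-sorted subsets. The regime formulas below are invented placeholders, not TRAC correlations; this is only an analogue of the approach described above.

        import numpy as np

        # Placeholder "heat-transfer coefficient" formulas, one per flow regime.
        def htc_regime_a(t):
            return 10.0 + 0.5 * t

        def htc_regime_b(t):
            return 2.0 * np.sqrt(t)

        def htc_regime_c(t):
            return 1.0e4 / t

        def wall_htc(temperature, regime):
            # Analogue of Fortran 90 WHERE: each formula is evaluated over the whole
            # array and the per-element result is selected by a mask, giving uniform
            # work per node instead of regime-sorted gather/scatter.
            conditions = [regime == 0, regime == 1, regime == 2]
            choices = [htc_regime_a(temperature),
                       htc_regime_b(temperature),
                       htc_regime_c(temperature)]
            return np.select(conditions, choices)

        rng = np.random.default_rng(2)
        temperature = rng.uniform(300.0, 600.0, size=1_000_000)
        regime = rng.integers(0, 3, size=temperature.size)   # data-dependent regime per node
        print(wall_htc(temperature, regime)[:5])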

  12. A computationally efficient OMP-based compressed sensing reconstruction for dynamic MRI

    International Nuclear Information System (INIS)

    Usman, M; Prieto, C; Schaeffter, T; Batchelor, P G; Odille, F; Atkinson, D

    2011-01-01

    Compressed sensing (CS) methods in MRI are computationally intensive. Thus, designing novel CS algorithms that can perform faster reconstructions is crucial for everyday applications. We propose a computationally efficient orthogonal matching pursuit (OMP)-based reconstruction, specifically suited to cardiac MR data. According to the energy distribution of a y-f space obtained from a sliding window reconstruction, we label the y-f space as static or dynamic. For static y-f space images, a computationally efficient masked OMP reconstruction is performed, whereas for dynamic y-f space images, standard OMP reconstruction is used. The proposed method was tested on a dynamic numerical phantom and two cardiac MR datasets. Depending on the field of view composition of the imaging data, compared to the standard OMP method, reconstruction speedup factors ranging from 1.5 to 2.5 are achieved. (note)
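
    A minimal NumPy sketch of standard orthogonal matching pursuit on a synthetic, real-valued sparse recovery problem; the static/dynamic labelling of y-f space and the masked OMP variant described above are not reproduced, and the measurement matrix and sparsity level are arbitrary.

        import numpy as np

        def omp(A, y, k):
            """Standard orthogonal matching pursuit: recover a k-sparse x from y = A x."""
            residual = y.copy()
            support = []
            x = np.zeros(A.shape[1])
            for _ in range(k):
                # pick the column most correlated with the current residual
                j = int(np.argmax(np.abs(A.T @ residual)))
                if j not in support:
                    support.append(j)
                # least-squares fit on the current support, then update the residual
                coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
                residual = y - A[:, support] @ coef
            x[support] = coef
            return x

        rng = np.random.default_rng(3)
        A = rng.standard_normal((64, 256))
        A /= np.linalg.norm(A, axis=0)                  # unit-norm columns
        x_true = np.zeros(256)
        x_true[rng.choice(256, 5, replace=False)] = rng.standard_normal(5)
        x_hat = omp(A, A @ x_true, k=5)
        print("max abs error:", np.abs(x_hat - x_true).max())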

  13. Resilient and Robust High Performance Computing Platforms for Scientific Computing Integrity

    Energy Technology Data Exchange (ETDEWEB)

    Jin, Yier [Univ. of Central Florida, Orlando, FL (United States)

    2017-07-14

    As technology advances, computer systems are subject to increasingly sophisticated cyber-attacks that compromise both their security and integrity. High performance computing platforms used in commercial and scientific applications involving sensitive, or even classified, data are frequently targeted by powerful adversaries. This situation is made worse by a lack of fundamental security solutions that both perform efficiently and are effective at preventing threats. Current security solutions fail to address the threat landscape and ensure the integrity of sensitive data. As challenges rise, both private and public sectors will require robust technologies to protect their computing infrastructure. The research outcomes from this project try to address all these challenges. For example, we present LAZARUS, a novel technique to harden kernel Address Space Layout Randomization (KASLR) against paging-based side-channel attacks. In particular, our scheme allows for fine-grained protection of the virtual memory mappings that implement the randomization. We demonstrate the effectiveness of our approach by hardening a recent Linux kernel with LAZARUS, mitigating all of the previously presented side-channel attacks on KASLR. Our extensive evaluation shows that LAZARUS incurs only 0.943% overhead for standard benchmarks, and is therefore highly practical. We also introduced HA2lloc, a hardware-assisted allocator that is capable of leveraging an extended memory management unit to detect memory errors in the heap. We also perform testing using HA2lloc in a simulation environment and find that the approach is capable of preventing common memory vulnerabilities.

  14. On the Computation of the Efficient Frontier of the Portfolio Selection Problem

    Directory of Open Access Journals (Sweden)

    Clara Calvo

    2012-01-01

    Full Text Available An easy-to-use procedure is presented for improving the ε-constraint method for computing the efficient frontier of the portfolio selection problem endowed with additional cardinality and semicontinuous variable constraints. The proposed method provides not only a numerical plotting of the frontier but also an analytical description of it, including the explicit equations of the arcs of parabola it comprises and the change points between them. This information is useful for performing a sensitivity analysis as well as for providing additional criteria to the investor in order to select an efficient portfolio. Computational results are provided to test the efficiency of the algorithm and to illustrate its applications. The procedure has been implemented in Mathematica.
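
    The ε-constraint idea can be sketched for the plain long-only mean-variance problem: minimise portfolio variance subject to an expected-return floor ε and sweep ε to trace the frontier. The cardinality and semicontinuous-variable constraints that are the paper's real subject, and the analytical description of the frontier's parabolic arcs, are not handled here; expected returns and the covariance matrix are invented test data.

        import numpy as np
        from scipy.optimize import minimize

        def efficient_frontier(mu, cov, n_points=20):
            """epsilon-constraint sweep: min variance s.t. expected return >= eps."""
            n = len(mu)
            frontier = []
            for eps in np.linspace(mu.min(), mu.max(), n_points):
                res = minimize(
                    lambda w: w @ cov @ w,                       # portfolio variance
                    x0=np.full(n, 1.0 / n),
                    bounds=[(0.0, 1.0)] * n,                     # long-only weights
                    constraints=[
                        {"type": "eq", "fun": lambda w: w.sum() - 1.0},
                        {"type": "ineq", "fun": lambda w, e=eps: w @ mu - e},
                    ],
                )
                frontier.append((res.x @ mu, np.sqrt(res.x @ cov @ res.x)))
            return frontier

        rng = np.random.default_rng(4)
        mu = rng.uniform(0.02, 0.12, size=5)                     # hypothetical expected returns
        B = rng.standard_normal((5, 5))
        cov = 0.01 * (B @ B.T + 5 * np.eye(5))                   # hypothetical covariance
        for ret, risk in efficient_frontier(mu, cov, n_points=5):
            print(f"return {ret:.3f}  risk {risk:.3f}")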

  15. High-level language computer architecture

    CERN Document Server

    Chu, Yaohan

    1975-01-01

    High-Level Language Computer Architecture offers a tutorial on high-level language computer architecture, including von Neumann architecture and syntax-oriented architecture as well as direct and indirect execution architecture. Design concepts of Japanese-language data processing systems are discussed, along with the architecture of stack machines and the SYMBOL computer system. The conceptual design of a direct high-level language processor is also described.Comprised of seven chapters, this book first presents a classification of high-level language computer architecture according to the pr

  16. Computational Biology and High Performance Computing 2000

    Energy Technology Data Exchange (ETDEWEB)

    Simon, Horst D.; Zorn, Manfred D.; Spengler, Sylvia J.; Shoichet, Brian K.; Stewart, Craig; Dubchak, Inna L.; Arkin, Adam P.

    2000-10-19

    The pace of extraordinary advances in molecular biology has accelerated in the past decade due in large part to discoveries coming from genome projects on human and model organisms. The advances in the genome project so far, happening well ahead of schedule and under budget, have exceeded any dreams by its protagonists, let alone formal expectations. Biologists expect the next phase of the genome project to be even more startling in terms of dramatic breakthroughs in our understanding of human biology, the biology of health and of disease. Only today can biologists begin to envision the necessary experimental, computational and theoretical steps necessary to exploit genome sequence information for its medical impact, its contribution to biotechnology and economic competitiveness, and its ultimate contribution to environmental quality. High performance computing has become one of the critical enabling technologies, which will help to translate this vision of future advances in biology into reality. Biologists are increasingly becoming aware of the potential of high performance computing. The goal of this tutorial is to introduce the exciting new developments in computational biology and genomics to the high performance computing community.

  17. An efficient algorithm for nucleolus and prekernel computation in some classes of TU-games

    NARCIS (Netherlands)

    Faigle, U.; Kern, Walter; Kuipers, J.

    1998-01-01

    We consider classes of TU-games. We show that we can efficiently compute an allocation in the intersection of the prekernel and the least core of the game if we can efficiently compute the minimum excess for any given allocation. In the case where the prekernel of the game contains exactly one core

  18. High-performance computing using FPGAs

    CERN Document Server

    Benkrid, Khaled

    2013-01-01

    This book is concerned with the emerging field of High Performance Reconfigurable Computing (HPRC), which aims to harness the high performance and relative low power of reconfigurable hardware–in the form Field Programmable Gate Arrays (FPGAs)–in High Performance Computing (HPC) applications. It presents the latest developments in this field from applications, architecture, and tools and methodologies points of view. We hope that this work will form a reference for existing researchers in the field, and entice new researchers and developers to join the HPRC community.  The book includes:  Thirteen application chapters which present the most important application areas tackled by high performance reconfigurable computers, namely: financial computing, bioinformatics and computational biology, data search and processing, stencil computation e.g. computational fluid dynamics and seismic modeling, cryptanalysis, astronomical N-body simulation, and circuit simulation.     Seven architecture chapters which...

  19. Computing in high energy physics

    International Nuclear Information System (INIS)

    Smith, Sarah; Devenish, Robin

    1989-01-01

    Computing in high energy physics has changed over the years from being something one did on a slide-rule, through early computers, then a necessary evil to the position today where computers permeate all aspects of the subject from control of the apparatus to theoretical lattice gauge calculations. The state of the art, as well as new trends and hopes, were reflected in this year's 'Computing In High Energy Physics' conference held in the dreamy setting of Oxford's spires. The conference aimed to give a comprehensive overview, entailing a heavy schedule of 35 plenary talks plus 48 contributed papers in two afternoons of parallel sessions. In addition to high energy physics computing, a number of papers were given by experts in computing science, in line with the conference's aim – 'to bring together high energy physicists and computer scientists'

  20. High performance computing system in the framework of the Higgs boson studies

    CERN Document Server

    Belyaev, Nikita; The ATLAS collaboration

    2017-01-01

    The Higgs boson physics is one of the most important and promising fields of study in modern High Energy Physics. To perform precision measurements of the Higgs boson properties, the use of fast and efficient instruments of Monte Carlo event simulation is required. Due to the increasing amount of data and to the growing complexity of the simulation software tools, the computing resources currently available for Monte Carlo simulation on the LHC GRID are not sufficient. One of the possibilities to address this shortfall of computing resources is the usage of institutes computer clusters, commercial computing resources and supercomputers. In this paper, a brief description of the Higgs boson physics, the Monte-Carlo generation and event simulation techniques are presented. A description of modern high performance computing systems and tests of their performance are also discussed. These studies have been performed on the Worldwide LHC Computing Grid and Kurchatov Institute Data Processing Center, including Tier...

  1. Overview of Ecological Agriculture with High Efficiency

    OpenAIRE

    Huang, Guo-qin; Zhao, Qi-guo; Gong, Shao-lin; Shi, Qing-hua

    2012-01-01

    Starting from the presentation, connotation, characteristics, principles, patterns, and technologies of ecological agriculture with high efficiency, we conduct a comprehensive and systematic analysis and discussion of its theoretical and practical progress. (i) Ecological agriculture with high efficiency was first advanced in China in 1991. (ii) Ecological agriculture with high efficiency highlights "high efficiency", "ecology", and "combination". (iii) Ecol...

  2. A highly efficient multi-core algorithm for clustering extremely large datasets

    Directory of Open Access Journals (Sweden)

    Kraus Johann M

    2010-04-01

    Full Text Available Abstract Background In recent years, the demand for computational power in computational biology has increased due to rapidly growing data sets from microarray and other high-throughput technologies. This demand is likely to increase. Standard algorithms for analyzing data, such as cluster algorithms, need to be parallelized for fast processing. Unfortunately, most approaches for parallelizing algorithms largely rely on network communication protocols connecting and requiring multiple computers. One answer to this problem is to utilize the intrinsic capabilities in current multi-core hardware to distribute the tasks among the different cores of one computer. Results We introduce a multi-core parallelization of the k-means and k-modes cluster algorithms based on the design principles of transactional memory for clustering gene expression microarray type data and categorical SNP data. Our new shared memory parallel algorithms prove to be highly efficient. We demonstrate their computational power and show their utility in cluster stability and sensitivity analysis employing repeated runs with slightly changed parameters. The computation speed of our Java-based algorithm was increased by a factor of 10 for large data sets while preserving computational accuracy, compared to single-core implementations and a recently published network-based parallelization. Conclusions Most desktop computers and even notebooks provide at least dual-core processors. Our multi-core algorithms show that, using modern algorithmic concepts, parallelization makes it possible to perform even such laborious tasks as cluster sensitivity and cluster number estimation on the laboratory computer.
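
    A rough Python analogue of the shared-memory idea: the assignment step of Lloyd's k-means is embarrassingly parallel, so the data can be split into chunks and labelled on separate cores. The paper's algorithms are Java-based and built on transactional-memory design principles, neither of which is reproduced; the process pool, chunking and test data below are illustrative choices only.

        import numpy as np
        from concurrent.futures import ProcessPoolExecutor

        def assign_chunk(args):
            chunk, centroids = args
            # nearest-centroid labels for one chunk of rows
            d = ((chunk[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            return d.argmin(axis=1)

        def parallel_kmeans(X, k, n_iter=20, n_workers=4, seed=0):
            rng = np.random.default_rng(seed)
            centroids = X[rng.choice(len(X), k, replace=False)]
            chunks = np.array_split(X, n_workers)
            with ProcessPoolExecutor(max_workers=n_workers) as pool:
                for _ in range(n_iter):
                    # assignment step is embarrassingly parallel across data chunks
                    labels = np.concatenate(
                        list(pool.map(assign_chunk, [(c, centroids) for c in chunks]))
                    )
                    # update step (kept serial here for brevity)
                    for j in range(k):
                        if np.any(labels == j):
                            centroids[j] = X[labels == j].mean(axis=0)
            return labels, centroids

        if __name__ == "__main__":
            rng = np.random.default_rng(5)
            X = np.vstack([rng.normal(m, 0.3, size=(500, 2)) for m in (0.0, 3.0, 6.0)])
            labels, centroids = parallel_kmeans(X, k=3)
            print(np.round(centroids, 2))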

  3. Study of application technology of ultra-high speed computer to the elucidation of complex phenomena

    International Nuclear Information System (INIS)

    Sekiguchi, Tomotsugu

    1996-01-01

    As a first step in developing technology for applying ultra-high-speed computers to the elucidation of complex phenomena, the basic design of a numerical information library for a decentralized computer network is explained. Establishing the system makes it possible to construct an efficient application environment for ultra-high-speed computer systems that is scalable across different computing systems. We named the system Ninf (Network Information Library for High Performance Computing). The library's application technology is summarized as follows: application technology for the library in a distributed environment, numeric constants, retrieval of values, a library of special functions, a computing library, the Ninf library interface, the Ninf remote library, and registration. With this system, users are able to use programs that concentrate numerical analysis technology with high precision, reliability and speed. (S.Y.)

  4. Structured Parallel Programming Patterns for Efficient Computation

    CERN Document Server

    McCool, Michael; Robison, Arch

    2012-01-01

    Programming is now parallel programming. Much as structured programming revolutionized traditional serial programming decades ago, a new kind of structured programming, based on patterns, is relevant to parallel programming today. Parallel computing experts and industry insiders Michael McCool, Arch Robison, and James Reinders describe how to design and implement maintainable and efficient parallel algorithms using a pattern-based approach. They present both theory and practice, and give detailed concrete examples using multiple programming models. Examples are primarily given using two of th

  5. Energy Efficiency in Computing (2/2)

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    We will start the second day of our energy efficient computing series with a brief discussion of software and the impact it has on energy consumption. A second major point of this lecture will be the current state of research and a few future technologies, ranging from mainstream (e.g. the Internet of Things) to exotic. Lecturer's short bio: Andrzej Nowak has 10 years of experience in computing technologies, primarily from CERN openlab and Intel. At CERN, he managed a research lab collaborating with Intel and was part of the openlab Chief Technology Office. Andrzej also worked closely and initiated projects with the private sector (e.g. HP and Google), as well as international research institutes, such as EPFL. Currently, Andrzej acts as a consultant on technology and innovation with TIK Services (http://tik.services), and runs a peer-to-peer lending start-up.

  6. A High-Efficiency Wind Energy Harvester for Autonomous Embedded Systems.

    Science.gov (United States)

    Brunelli, Davide

    2016-03-04

    Energy harvesting is currently a hot research topic, mainly as a consequence of the increasing attractiveness of computing and sensing solutions based on small, low-power distributed embedded systems. Harvesting may enable systems to operate in a deploy-and-forget mode, particularly when power grid is absent and the use of rechargeable batteries is unattractive due to their limited lifetime and maintenance requirements. This paper focuses on wind flow as an energy source feasible to meet the energy needs of a small autonomous embedded system. In particular the contribution is on the electrical converter and system integration. We characterize the micro-wind turbine, we define a detailed model of its behaviour, and then we focused on a highly efficient circuit to convert wind energy into electrical energy. The optimized design features an overall volume smaller than 64 cm³. The core of the harvester is a high efficiency buck-boost converter which performs an optimal power point tracking. Experimental results show that the wind generator boosts efficiency over a wide range of operating conditions.

  7. A High-Efficiency Wind Energy Harvester for Autonomous Embedded Systems

    Science.gov (United States)

    Brunelli, Davide

    2016-01-01

    Energy harvesting is currently a hot research topic, mainly as a consequence of the increasing attractiveness of computing and sensing solutions based on small, low-power distributed embedded systems. Harvesting may enable systems to operate in a deploy-and-forget mode, particularly when power grid is absent and the use of rechargeable batteries is unattractive due to their limited lifetime and maintenance requirements. This paper focuses on wind flow as an energy source feasible to meet the energy needs of a small autonomous embedded system. In particular the contribution is on the electrical converter and system integration. We characterize the micro-wind turbine, we define a detailed model of its behaviour, and then we focused on a highly efficient circuit to convert wind energy into electrical energy. The optimized design features an overall volume smaller than 64 cm3. The core of the harvester is a high efficiency buck-boost converter which performs an optimal power point tracking. Experimental results show that the wind generator boosts efficiency over a wide range of operating conditions. PMID:26959018

  8. A High-Efficiency Wind Energy Harvester for Autonomous Embedded Systems

    Directory of Open Access Journals (Sweden)

    Davide Brunelli

    2016-03-01

    Full Text Available Energy harvesting is currently a hot research topic, mainly as a consequence of the increasing attractiveness of computing and sensing solutions based on small, low-power distributed embedded systems. Harvesting may enable systems to operate in a deploy-and-forget mode, particularly when power grid is absent and the use of rechargeable batteries is unattractive due to their limited lifetime and maintenance requirements. This paper focuses on wind flow as an energy source feasible to meet the energy needs of a small autonomous embedded system. In particular the contribution is on the electrical converter and system integration. We characterize the micro-wind turbine, we define a detailed model of its behaviour, and then we focused on a highly efficient circuit to convert wind energy into electrical energy. The optimized design features an overall volume smaller than 64 cm3. The core of the harvester is a high efficiency buck-boost converter which performs an optimal power point tracking. Experimental results show that the wind generator boosts efficiency over a wide range of operating conditions.

  9. Efficient computation of clipped Voronoi diagram for mesh generation

    KAUST Repository

    Yan, Dongming

    2013-04-01

    The Voronoi diagram is a fundamental geometric structure widely used in various fields, especially in computer graphics and geometry computing. For a set of points in a compact domain (i.e. a bounded and closed 2D region or a 3D volume), some Voronoi cells of their Voronoi diagram are infinite or partially outside of the domain, but in practice only the parts of the cells inside the domain are needed, as when computing the centroidal Voronoi tessellation. Such a Voronoi diagram confined to a compact domain is called a clipped Voronoi diagram. We present an efficient algorithm to compute the clipped Voronoi diagram for a set of sites with respect to a compact 2D region or a 3D volume. We also apply the proposed method to optimal mesh generation based on the centroidal Voronoi tessellation. Crown Copyright © 2011 Published by Elsevier Ltd. All rights reserved.
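
    A small sketch of the clipping step using off-the-shelf tools, assuming Shapely 1.8+ provides shapely.ops.voronoi_diagram: each raw Voronoi cell is simply intersected with the compact domain, which yields a clipped diagram for a 2D box but without the efficiency of the algorithm proposed above. The site count and unit-square domain are arbitrary.

        import numpy as np
        from shapely.geometry import MultiPoint, box
        from shapely.ops import voronoi_diagram

        def clipped_voronoi_2d(points, domain):
            """Clip each 2D Voronoi cell to a compact domain polygon."""
            cells = voronoi_diagram(MultiPoint(points))
            # Intersect every (possibly unbounded-looking) cell with the domain.
            return [cell.intersection(domain) for cell in cells.geoms]

        rng = np.random.default_rng(6)
        sites = rng.uniform(0.0, 1.0, size=(20, 2))
        domain = box(0.0, 0.0, 1.0, 1.0)                 # unit-square domain
        cells = clipped_voronoi_2d([tuple(p) for p in sites], domain)
        print("total clipped area:", round(sum(c.area for c in cells), 6))  # ~1.0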

  10. Efficient computation of clipped Voronoi diagram for mesh generation

    KAUST Repository

    Yan, Dongming; Wang, Wen Ping; Lé vy, Bruno L.; Liu, Yang

    2013-01-01

    The Voronoi diagram is a fundamental geometric structure widely used in various fields, especially in computer graphics and geometry computing. For a set of points in a compact domain (i.e. a bounded and closed 2D region or a 3D volume), some Voronoi cells of their Voronoi diagram are infinite or partially outside of the domain, but in practice only the parts of the cells inside the domain are needed, as when computing the centroidal Voronoi tessellation. Such a Voronoi diagram confined to a compact domain is called a clipped Voronoi diagram. We present an efficient algorithm to compute the clipped Voronoi diagram for a set of sites with respect to a compact 2D region or a 3D volume. We also apply the proposed method to optimal mesh generation based on the centroidal Voronoi tessellation. Crown Copyright © 2011 Published by Elsevier Ltd. All rights reserved.

  11. High energy physics and grid computing

    International Nuclear Information System (INIS)

    Yu Chuansong

    2004-01-01

    The status of the new generation computing environment of the high energy physics experiments is introduced briefly in this paper. The development of the high energy physics experiments and the new computing requirements by the experiments are presented. The blueprint of the new generation computing environment of the LHC experiments, the history of the Grid computing, the R and D status of the high energy physics grid computing technology, the network bandwidth needed by the high energy physics grid and its development are described. The grid computing research in Chinese high energy physics community is introduced at last. (authors)

  12. Hiding Electronic Patient Record (EPR) in medical images: A high capacity and computationally efficient technique for e-healthcare applications.

    Science.gov (United States)

    Loan, Nazir A; Parah, Shabir A; Sheikh, Javaid A; Akhoon, Jahangir A; Bhat, Ghulam M

    2017-09-01

    A high capacity and semi-reversible data hiding scheme based on the Pixel Repetition Method (PRM) and hybrid edge detection for scalable medical images has been proposed in this paper. PRM has been used to scale up the small-sized image (seed image) and hybrid edge detection ensures that no important edge information is missed. The scaled-up version of the seed image has been divided into 2×2 non-overlapping blocks. In each block there is one seed pixel whose status decides the number of bits to be embedded in the remaining three pixels of that block. The Electronic Patient Record (EPR)/data have been embedded by using Least Significant Bit and Intermediate Significant Bit Substitution (ISBS). RC4 encryption has been used to add an additional security layer for the embedded EPR/data. The proposed scheme has been tested for various medical and general images and compared with some state-of-the-art techniques in the field. The experimental results reveal that the proposed scheme, besides being semi-reversible and computationally efficient, is capable of handling a high payload and as such can be used effectively for electronic healthcare applications. Copyright © 2017. Published by Elsevier Inc.
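
    A minimal sketch of plain least-significant-bit embedding of an already-encrypted payload into a grayscale cover image, just to make the hiding step concrete. The scheme described above additionally relies on PRM up-scaling, hybrid edge detection, seed-pixel-dependent capacity, intermediate-significant-bit substitution and RC4, none of which are reproduced; the cover image and payload are synthetic.

        import numpy as np

        def embed_lsb(cover, payload_bytes):
            """Hide payload bits in the least significant bit of each pixel."""
            bits = np.unpackbits(np.frombuffer(payload_bytes, dtype=np.uint8))
            flat = cover.flatten()
            if bits.size > flat.size:
                raise ValueError("payload too large for this cover image")
            flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits
            return flat.reshape(cover.shape)

        def extract_lsb(stego, n_bytes):
            bits = stego.flatten()[: n_bytes * 8] & 1
            return np.packbits(bits).tobytes()

        rng = np.random.default_rng(7)
        cover = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
        payload = b"patient-id:0042 bp:120/80"        # assume already encrypted upstream
        stego = embed_lsb(cover, payload)
        assert extract_lsb(stego, len(payload)) == payload
        print("max pixel change:", int(np.abs(stego.astype(int) - cover.astype(int)).max()))  # at most 1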

  13. Energy Efficiency Evaluation and Benchmarking of AFRL’s Condor High Performance Computer

    Science.gov (United States)

    2011-08-01

    Only fragments of this record survive its report-form extraction. They describe benchmarking AFRL's Condor high performance computer with the High Performance LINPACK (HPL) benchmark while also measuring the energy consumed to achieve such performance, including measurements of PlayStation 3 nodes executing HPL: when idle, the two PS3s consume 188.49 W on average, while the average draw at peak HPL performance is cut off in the source. The remaining text is report-form metadata (conference paper, post-print; dates covered January–June 2011).

  14. The peak efficiency calibration of volume source using 152Eu point source in computer

    International Nuclear Information System (INIS)

    Shen Tingyun; Qian Jianfu; Nan Qinliang; Zhou Yanguo

    1997-01-01

    The author describes a method for the peak efficiency calibration of a volume source by means of a 152Eu point source for an HPGe γ spectrometer. The peak efficiency can be computed by Monte Carlo simulation after the detector parameters are input. The computed results agree with the experimental results within an error of ±3.8%, with one exception of about ±7.4%.
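
    A toy Monte Carlo sketch of the simplest piece of such a calibration, the geometric efficiency of an on-axis point source facing a circular detector window, checked against the analytic solid-angle fraction. A real full-energy-peak efficiency calculation additionally requires photon transport and a detector response model; the source-detector geometry below is invented.

        import numpy as np

        def mc_geometric_efficiency(distance, radius, n=1_000_000, seed=0):
            """Fraction of isotropically emitted rays from an on-axis point source
            that hit a disc of the given radius at the given distance (geometry only)."""
            rng = np.random.default_rng(seed)
            cos_theta = rng.uniform(-1.0, 1.0, n)          # isotropic emission directions
            forward = cos_theta > 0.0                      # heading towards the detector plane
            tan_theta = np.sqrt(1.0 - cos_theta[forward] ** 2) / cos_theta[forward]
            hits = distance * tan_theta <= radius          # lateral offset within the disc
            return hits.sum() / n

        d, r = 5.0, 3.0                                    # hypothetical geometry (cm)
        analytic = 0.5 * (1.0 - d / np.sqrt(d * d + r * r))
        print(round(mc_geometric_efficiency(d, r), 6), "vs analytic", round(analytic, 6))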

  15. High-throughput sample adaptive offset hardware architecture for high-efficiency video coding

    Science.gov (United States)

    Zhou, Wei; Yan, Chang; Zhang, Jingzhi; Zhou, Xin

    2018-03-01

    A high-throughput hardware architecture for a sample adaptive offset (SAO) filter in the High Efficiency Video Coding (HEVC) standard is presented. First, an implementation-friendly and simplified bitrate estimation method for the rate-distortion cost calculation is proposed to reduce the computational complexity in the mode decision of SAO. Then, a high-throughput VLSI architecture for SAO is presented based on the proposed bitrate estimation method. Furthermore, a multiparallel VLSI architecture for in-loop filters, which integrates both the deblocking filter and the SAO filter, is proposed. Six parallel strategies are applied in the proposed in-loop filter architecture to improve the system throughput and filtering speed. Experimental results show that the proposed in-loop filter architecture can achieve up to 48% higher throughput in comparison with prior work. The proposed architecture can reach a high operating clock frequency of 297 MHz with a TSMC 65-nm library and meets the real-time requirement of the in-loop filters for the 8K×4K video format at 132 fps.

  16. High energy physics and cloud computing

    International Nuclear Information System (INIS)

    Cheng Yaodong; Liu Baoxu; Sun Gongxing; Chen Gang

    2011-01-01

    High Energy Physics (HEP) has been a strong promoter of computing technology, for example WWW (World Wide Web) and the grid computing. In the new era of cloud computing, HEP has still a strong demand, and major international high energy physics laboratories have launched a number of projects to research on cloud computing technologies and applications. It describes the current developments in cloud computing and its applications in high energy physics. Some ongoing projects in the institutes of high energy physics, Chinese Academy of Sciences, including cloud storage, virtual computing clusters, and BESⅢ elastic cloud, are also described briefly in the paper. (authors)

  17. Computationally Efficient Prediction of Ionic Liquid Properties

    DEFF Research Database (Denmark)

    Chaban, V. V.; Prezhdo, O. V.

    2014-01-01

    Due to fundamental differences, room-temperature ionic liquids (RTIL) are significantly more viscous than conventional molecular liquids and require long simulation times. At the same time, RTILs remain in the liquid state over a much broader temperature range than the ordinary liquids. We exploit...... to ambient temperatures. We numerically prove the validity of the proposed concept for density and ionic diffusion of four different RTILs. This simple method enhances the computational efficiency of the existing simulation approaches as applied to RTILs by more than an order of magnitude....

  18. Moment-Preserving Computational Approach for High Energy Charged Particle Transport

    Science.gov (United States)

    2016-05-16

    Only fragments of this record survive extraction. They describe a transport problem posed with modified cross sections such that the resulting single-event Monte Carlo simulation is computationally efficient (minutes vs. days) for configurations that are characteristic of real-world applications; in other words, it is possible to simulate real physical phenomena of charged particle transport at practical cost. A garbled inequality in the source (equation 3) is intended to demonstrate that scattering is highly forward peaked, which shapes the picture of charged particle interactions.

  19. Analysis and modeling of social influence in high performance computing workloads

    KAUST Repository

    Zheng, Shuai

    2011-01-01

    Social influence among users (e.g., collaboration on a project) creates bursty behavior in the underlying high performance computing (HPC) workloads. Using representative HPC and cluster workload logs, this paper identifies, analyzes, and quantifies the level of social influence across HPC users. We show the existence of a social graph that is characterized by a pattern of dominant users and followers. This pattern also follows a power-law distribution, which is consistent with those observed in mainstream social networks. Given its potential impact on HPC workloads prediction and scheduling, we propose a fast-converging, computationally-efficient online learning algorithm for identifying social groups. Extensive evaluation shows that our online algorithm can (1) quickly identify the social relationships by using a small portion of incoming jobs and (2) can efficiently track group evolution over time. © 2011 Springer-Verlag.

  20. Progress of OLED devices with high efficiency at high luminance

    Science.gov (United States)

    Nguyen, Carmen; Ingram, Grayson; Lu, Zhenghong

    2014-03-01

    Organic light emitting diodes (OLEDs) have progressed significantly over the last two decades. For years, OLEDs have been promoted as the next generation technology for flat panel displays and solid-state lighting due to their potential for high energy efficiency and dynamic range of colors. Although high efficiency can readily be obtained at low brightness levels, a significant decline at high brightness is commonly observed. In this report, we will review various strategies for achieving highly efficient phosphorescent OLED devices at high luminance. Specifically, we will provide details regarding the performance and general working principles behind each strategy. We will conclude by looking at how some of these strategies can be combined to produce high efficiency white OLEDs at high brightness.

  1. High-power, high-efficiency FELs

    International Nuclear Information System (INIS)

    Sessler, A.M.

    1989-04-01

    High power, high efficiency FELs require tapering, as the particles lose energy, so as to maintain resonance between the electromagnetic wave and the particles. They also require focusing of the particles (usually done with curved pole faces) and focusing of the electromagnetic wave (i.e. optical guiding). In addition, one must avoid transverse beam instabilities (primarily resistive wall) and longitudinal instabilities (i.e. sidebands). 18 refs., 7 figs., 3 tabs

  2. Energy-Efficient Abundant-Data Computing: The N3XT 1,000X

    OpenAIRE

    Aly Mohamed M. Sabry; Gao Mingyu; Hills Gage; Lee Chi-Shuen; Pinter Greg; Shulaker Max M.; Wu Tony F.; Asheghi Mehdi; Bokor Jeff; Franchetti Franz; Goodson Kenneth E.; Kozyrakis Christos; Markov Igor; Olukotun Kunle; Pileggi Larry

    2015-01-01

    Next-generation information technologies will process unprecedented amounts of loosely structured data that overwhelm existing computing systems. N3XT improves the energy efficiency of abundant-data applications 1,000-fold by using new logic and memory technologies, 3D integration with fine-grained connectivity, and new architectures for computation immersed in memory.

  3. Improving computational efficiency of Monte Carlo simulations with variance reduction

    International Nuclear Information System (INIS)

    Turner, A.; Davis, A.

    2013-01-01

    CCFE perform Monte-Carlo transport simulations on large and complex tokamak models such as ITER. Such simulations are challenging since streaming and deep penetration effects are equally important. In order to make such simulations tractable, both variance reduction (VR) techniques and parallel computing are used. It has been found that the application of VR techniques in such models significantly reduces the efficiency of parallel computation due to 'long histories'. VR in MCNP can be accomplished using energy-dependent weight windows. The weight window represents an 'average behaviour' of particles, and large deviations in the arriving weight of a particle give rise to extreme amounts of splitting being performed and a long history. When running on parallel clusters, a long history can have a detrimental effect on the parallel efficiency - if one process is computing the long history, the other CPUs complete their batch of histories and wait idle. Furthermore some long histories have been found to be effectively intractable. To combat this effect, CCFE has developed an adaptation of MCNP which dynamically adjusts the WW where a large weight deviation is encountered. The method effectively 'de-optimises' the WW, reducing the VR performance but this is offset by a significant increase in parallel efficiency. Testing with a simple geometry has shown the method does not bias the result. This 'long history method' has enabled CCFE to significantly improve the performance of MCNP calculations for ITER on parallel clusters, and will be beneficial for any geometry combining streaming and deep penetration effects. (authors)
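
    A toy Python illustration of the weight-window mechanics behind the 'long history' effect: a particle arriving far above the upper bound is split into many copies, so one process can be stuck following them while the others sit idle, and capping or loosening the window (in the spirit of the de-optimisation described above) bounds that work. This is conceptual only; it is not MCNP code or the authors' adaptive scheme, and the window values are arbitrary.

        import random

        def apply_weight_window(weight, w_low, w_high, w_survive=None, max_split=50):
            """Return the list of particle weights after window splitting / roulette.

            A particle arriving far above the upper bound is split into many copies
            (the source of 'long histories'); capping the split count bounds the
            work spent on any single history.
            """
            if w_survive is None:
                w_survive = 0.5 * (w_low + w_high)
            if weight > w_high:
                n = min(int(weight / w_high) + 1, max_split)
                return [weight / n] * n                 # split into n lighter copies
            if weight < w_low:
                if random.random() < weight / w_survive:
                    return [w_survive]                  # survives roulette
                return []                               # killed
            return [weight]                             # inside the window: unchanged

        random.seed(0)
        for w in (0.01, 0.5, 250.0):
            print(w, "->", len(apply_weight_window(w, w_low=0.25, w_high=1.0)), "particles")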

  4. Contributing to the design of run-time systems dedicated to high performance computing

    International Nuclear Information System (INIS)

    Perache, M.

    2006-10-01

    In the field of intensive scientific computing, the quest for performance has to face the increasing complexity of parallel architectures. Nowadays, these machines exhibit a deep memory hierarchy which complicates the design of efficient parallel applications. This thesis proposes a programming environment allowing to design efficient parallel programs on top of clusters of multi-processors. It features a programming model centered around collective communications and synchronizations, and provides load balancing facilities. The programming interface, named MPC, provides high level paradigms which are optimized according to the underlying architecture. The environment is fully functional and used within the CEA/DAM (TERANOVA) computing center. The evaluations presented in this document confirm the relevance of our approach. (author)

  5. Enabling Efficient Climate Science Workflows in High Performance Computing Environments

    Science.gov (United States)

    Krishnan, H.; Byna, S.; Wehner, M. F.; Gu, J.; O'Brien, T. A.; Loring, B.; Stone, D. A.; Collins, W.; Prabhat, M.; Liu, Y.; Johnson, J. N.; Paciorek, C. J.

    2015-12-01

    A typical climate science workflow often involves a combination of acquisition of data, modeling, simulation, analysis, visualization, publishing, and storage of results. Each of these tasks provide a myriad of challenges when running on a high performance computing environment such as Hopper or Edison at NERSC. Hurdles such as data transfer and management, job scheduling, parallel analysis routines, and publication require a lot of forethought and planning to ensure that proper quality control mechanisms are in place. These steps require effectively utilizing a combination of well tested and newly developed functionality to move data, perform analysis, apply statistical routines, and finally, serve results and tools to the greater scientific community. As part of the CAlibrated and Systematic Characterization, Attribution and Detection of Extremes (CASCADE) project we highlight a stack of tools our team utilizes and has developed to ensure that large scale simulation and analysis work are commonplace and provide operations that assist in everything from generation/procurement of data (HTAR/Globus) to automating publication of results to portals like the Earth Systems Grid Federation (ESGF), all while executing everything in between in a scalable environment in a task parallel way (MPI). We highlight the use and benefit of these tools by showing several climate science analysis use cases they have been applied to.

  6. 8th International Workshop on Parallel Tools for High Performance Computing

    CERN Document Server

    Gracia, José; Knüpfer, Andreas; Resch, Michael; Nagel, Wolfgang

    2015-01-01

    Numerical simulation and modelling using High Performance Computing has evolved into an established technique in academic and industrial research. At the same time, the High Performance Computing infrastructure is becoming ever more complex. For instance, most of the current top systems around the world use thousands of nodes in which classical CPUs are combined with accelerator cards in order to enhance their compute power and energy efficiency. This complexity can only be mastered with adequate development and optimization tools. Key topics addressed by these tools include parallelization on heterogeneous systems, performance optimization for CPUs and accelerators, debugging of increasingly complex scientific applications, and optimization of energy usage in the spirit of green IT. This book represents the proceedings of the 8th International Parallel Tools Workshop, held October 1-2, 2014 in Stuttgart, Germany – which is a forum to discuss the latest advancements in the parallel tools.

  7. Out-of-Core Computations of High-Resolution Level Sets by Means of Code Transformation

    DEFF Research Database (Denmark)

    Christensen, Brian Bunch; Nielsen, Michael Bang; Museth, Ken

    2012-01-01

    We propose a storage efficient, fast and parallelizable out-of-core framework for streaming computations of high resolution level sets. The fundamental techniques are skewing and tiling transformations of streamed level set computations which allow for the combination of interface propagation, re...... computations are now CPU bound and consequently the overall performance is unaffected by disk latency and bandwidth limitations. We demonstrate this with several benchmark tests that show sustained out-of-core throughputs close to that of in-core level set simulations....

  8. Evaluating performance of high efficiency mist eliminators

    Energy Technology Data Exchange (ETDEWEB)

    Waggoner, Charles A.; Parsons, Michael S.; Giffin, Paxton K. [Mississippi State University, Institute for Clean Energy Technology, 205 Research Blvd, Starkville, MS (United States)

    2013-07-01

    material on each element at final dP. Plots of overall filtering efficiencies for DOP (spherical aerosol) and dry surrogate (aspherical aerosols) at specified dPs were computed for each filter. Filtering efficiencies were determined as a function of particle size. Curves were also generated showing the most penetrating particle size as a function of dP. A preliminary set of tests was conducted to evaluate spray location, duration, pressure, and wash volume for in-place cleaning the interior surface (reducing dP) of the HEME element. A variety of nozzle designs were evaluated and test results demonstrated the potential to overload the HEME (saturate filter medium) resulting in very high dPs and extensive drain times. At least one combination of spray nozzle design, spray location on the surface of the element, and spray time/pressure was successful in achieving extension of operational life. (authors)

  9. Efficient CUDA Polynomial Preconditioned Conjugate Gradient Solver for Finite Element Computation of Elasticity Problems

    Directory of Open Access Journals (Sweden)

    Jianfei Zhang

    2013-01-01

    Full Text Available The graphics processing unit (GPU) has achieved great success in scientific computations owing to its tremendous computational horsepower and very high memory bandwidth. This paper discusses an efficient way to implement a polynomial preconditioned conjugate gradient solver for the finite element computation of elasticity on NVIDIA GPUs using the compute unified device architecture (CUDA). The sliced block ELLPACK (SBELL) format is introduced to store the sparse matrices arising from finite element discretization of elasticity with fewer padding zeros than traditional ELLPACK-based formats. Polynomial preconditioning methods are investigated with respect to both convergence and running time. Based on overall performance, the least-squares (L-S) polynomial method is chosen as the preconditioner in the PCG solver for finite element equations derived from elasticity, for its best results on different example meshes. In the PCG solver, a mixed precision algorithm is used not only to reduce the overall computational and storage requirements and bandwidth but also to make full use of the capacity of the GPU devices. With the SBELL format and the mixed precision algorithm, the GPU-based L-S preconditioned CG achieves a speedup of about 7–9 over the CPU implementation.
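
    A plain NumPy sketch of a polynomial preconditioned conjugate gradient iteration, using a truncated Neumann-series (Jacobi-based) polynomial as the preconditioner and a 1D Laplacian as a stand-in for a finite element stiffness matrix. The paper's least-squares polynomial, SBELL storage, mixed precision and CUDA implementation are not reproduced here.

        import numpy as np

        def poly_precond(A, r, degree=3):
            """Apply a truncated Neumann-series approximation of A^{-1}:
            M^{-1} r = sum_{k=0..degree} (I - D^{-1} A)^k D^{-1} r, with D = diag(A)."""
            d_inv = 1.0 / np.diag(A)
            z = d_inv * r
            term = z.copy()
            for _ in range(degree):
                term = term - d_inv * (A @ term)
                z = z + term
            return z

        def pcg(A, b, degree=3, tol=1e-10, max_iter=500):
            x = np.zeros_like(b)
            r = b - A @ x
            z = poly_precond(A, r, degree)
            p = z.copy()
            rz = r @ z
            for i in range(max_iter):
                Ap = A @ p
                alpha = rz / (p @ Ap)
                x += alpha * p
                r -= alpha * Ap
                if np.linalg.norm(r) < tol * np.linalg.norm(b):
                    return x, i + 1
                z = poly_precond(A, r, degree)
                rz_new = r @ z
                p = z + (rz_new / rz) * p
                rz = rz_new
            return x, max_iter

        # Small SPD test system (1D Laplacian), standing in for a FE stiffness matrix.
        n = 200
        A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
        b = np.ones(n)
        x, iters = pcg(A, b)
        print("iterations:", iters, " residual:", np.linalg.norm(b - A @ x))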

  10. High Performance Computing in Science and Engineering '15 : Transactions of the High Performance Computing Center

    CERN Document Server

    Kröner, Dietmar; Resch, Michael

    2016-01-01

    This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS) in 2015. The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.

  11. High Performance Computing in Science and Engineering '17 : Transactions of the High Performance Computing Center

    CERN Document Server

    Kröner, Dietmar; Resch, Michael; HLRS 2017

    2018-01-01

    This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS) in 2017. The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance.The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.

  12. Statistically and Computationally Efficient Estimating Equations for Large Spatial Datasets

    KAUST Repository

    Sun, Ying; Stein, Michael L.

    2014-01-01

    For Gaussian process models, likelihood-based methods are often difficult to use with large irregularly spaced spatial datasets, because exact calculations of the likelihood for n observations require O(n^3) operations and O(n^2) memory. Various approximation methods have been developed to address the computational difficulties. In this paper, we propose new unbiased estimating equations based on score equation approximations that are both computationally and statistically efficient. We replace the inverse covariance matrix that appears in the score equations by a sparse matrix to approximate the quadratic forms, then set the resulting quadratic forms equal to their expected values to obtain unbiased estimating equations. The sparse matrix is constructed by a sparse inverse Cholesky approach to approximate the inverse covariance matrix. The statistical efficiency of the resulting unbiased estimating equations is evaluated both in theory and by numerical studies. Our methods are applied to nearly 90,000 satellite-based measurements of water vapor levels over a region in the Southeast Pacific Ocean.
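
    As a hedged sketch of the construction described above (notation assumed here: z is the mean-zero data vector, K(θ) its covariance, and A a sparse approximation to K^{-1} built from a sparse inverse Cholesky factor), the exact score equations and the resulting unbiased estimating equations can be written as:

    ```latex
    % Exact score equations, O(n^3) to evaluate:
    \frac{\partial \ell}{\partial \theta_i}
      = \frac{1}{2}\, z^{\top} K^{-1} \frac{\partial K}{\partial \theta_i} K^{-1} z
      - \frac{1}{2}\, \operatorname{tr}\!\left( K^{-1} \frac{\partial K}{\partial \theta_i} \right) = 0 .
    % Replacing K^{-1} by the sparse matrix A and subtracting the expected value of the
    % quadratic form keeps the equations unbiased, since E[z^T M z] = tr(M K):
    g_i(\theta)
      = z^{\top} A \,\frac{\partial K}{\partial \theta_i}\, A\, z
      - \operatorname{tr}\!\left( A \,\frac{\partial K}{\partial \theta_i}\, A\, K(\theta) \right) = 0 .
    ```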

  13. Statistically and Computationally Efficient Estimating Equations for Large Spatial Datasets

    KAUST Repository

    Sun, Ying

    2014-11-07

    For Gaussian process models, likelihood-based methods are often difficult to use with large irregularly spaced spatial datasets, because exact calculations of the likelihood for n observations require O(n^3) operations and O(n^2) memory. Various approximation methods have been developed to address the computational difficulties. In this paper, we propose new unbiased estimating equations based on score equation approximations that are both computationally and statistically efficient. We replace the inverse covariance matrix that appears in the score equations by a sparse matrix to approximate the quadratic forms, then set the resulting quadratic forms equal to their expected values to obtain unbiased estimating equations. The sparse matrix is constructed by a sparse inverse Cholesky approach to approximate the inverse covariance matrix. The statistical efficiency of the resulting unbiased estimating equations is evaluated both in theory and by numerical studies. Our methods are applied to nearly 90,000 satellite-based measurements of water vapor levels over a region in the Southeast Pacific Ocean.

  14. Energy-Efficient Office Buildings at High Latitudes

    Energy Technology Data Exchange (ETDEWEB)

    Lerum, V.

    1996-12-31

    This doctoral thesis describes a method for energy efficient office building design at high latitudes and cold climates. The method combines daylighting, passive solar heating, solar protection, and ventilative cooling. The thesis focuses on optimal design of an equatorial-facing fenestration system. A spreadsheet framework linking existing simplified methods is used. The daylight analysis uses location specific data on frequency distribution of diffuse daylight on vertical surfaces to estimate energy savings from optimal window and room configurations in combination with a daylight-responsive electric lighting system. The passive solar heating analysis is a generalization of a solar load ratio method adapted to cold climates by combining it with the Norwegian standard NS3031 for winter months when the solar savings fraction is negative. The emphasis is on very high computational efficiency to permit rapid and comprehensive examination of a large number of options early in design. The procedure is illustrated for a location in Trondheim, Norway, testing the relative significance of various design improvement options relative to a base case. The method is also tested for two other locations in Norway, at latitudes 58 and 70 degrees North. The band of latitudes between these limits covers cities in Alaska, Canada, Greenland, Iceland, Scandinavia, Finland, Russia, and Northern Japan. A comprehensive study of the "whole building approach" shows the impact of integrated daylighting and low-energy design strategies. In general, consumption of lighting electricity may be reduced by 50-80%, even at extremely high latitudes. The reduced internal heat from electric lights is replaced by passive solar heating. 113 refs., 85 figs., 25 tabs.

  15. High performance computing system in the framework of the Higgs boson studies

    CERN Document Server

    Belyaev, Nikita; The ATLAS collaboration; Velikhov, Vasily; Konoplich, Rostislav

    2017-01-01

    Higgs boson physics is one of the most important and promising fields of study in modern high energy physics. It is important to note that GRID computing resources are becoming strictly limited due to the increasing amount of statistics required for physics analyses and the unprecedented LHC performance. One possibility for addressing the shortfall of computing resources is the use of computer institutes' clusters, commercial computing resources and supercomputers. To perform precision measurements of the Higgs boson properties under these conditions, it is also essential to have effective instruments to simulate kinematic distributions of signal events. In this talk we give a brief description of the modern distribution reconstruction method called Morphing and perform a few efficiency tests to demonstrate its potential. These studies have been performed on the WLCG and the Kurchatov Institute's Data Processing Center, including its Tier-1 GRID site and supercomputer. We also analyze the CPU efficienc...

  16. High-End Scientific Computing

    Science.gov (United States)

    EPA uses high-end scientific computing, geospatial services and remote sensing/imagery analysis to support EPA's mission. The Center for Environmental Computing (CEC) assists the Agency's program offices and regions to meet staff needs in these areas.

  17. A synthetic visual plane algorithm for visibility computation in consideration of accuracy and efficiency

    Science.gov (United States)

    Yu, Jieqing; Wu, Lixin; Hu, Qingsong; Yan, Zhigang; Zhang, Shaoliang

    2017-12-01

    Visibility computation is of great interest to location optimization, environmental planning, ecology, and tourism. Many algorithms have been developed for visibility computation. In this paper, we propose a novel method of visibility computation, called the synthetic visual plane (SVP), to achieve better performance with respect to efficiency, accuracy, or both. The method uses a global horizon, which is a synthesis of the line-of-sight information of all nearer points, to determine the visibility of a point, which makes it an accurate visibility method. We use discretization of the horizon to gain good efficiency. After discretization, the accuracy and efficiency of SVP depend on the scale of discretization (i.e., zone width). The method is more accurate at smaller zone widths, but this requires a longer operating time. Users must strike a balance between accuracy and efficiency at their discretion. According to our experiments, SVP is less accurate but more efficient than R2 if the zone width is set to one grid. However, SVP becomes more accurate than R2 when the zone width is set to 1/24 grid, while it continues to perform as fast as or faster than R2. Although SVP is less efficient than the reference plane and depth map algorithms, it is superior to both in accuracy.
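
    As a hedged illustration of the horizon idea only (a one-dimensional profile, not the full SVP algorithm with its zone discretization), visibility along a single terrain profile can be computed by carrying the maximum elevation angle of all nearer cells:

    ```python
    # Illustration of the horizon idea behind SVP on a single terrain profile:
    # a cell is visible from the viewpoint if its elevation angle exceeds the
    # maximum angle ("horizon") of all nearer cells. This is a 1-D sketch, not
    # the full synthetic-visual-plane algorithm.
    import math

    def visible_along_profile(elevations, spacing=1.0, observer_height=1.7):
        """elevations[0] is the viewpoint cell; returns visibility flags for the rest."""
        eye = elevations[0] + observer_height
        horizon = -math.inf                 # highest elevation angle seen so far
        flags = []
        for i, z in enumerate(elevations[1:], start=1):
            angle = math.atan2(z - eye, i * spacing)
            flags.append(angle > horizon)   # visible iff it rises above the horizon
            horizon = max(horizon, angle)
        return flags

    print(visible_along_profile([100, 102, 101, 105, 103, 108]))
    # [True, False, True, False, True]
    ```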

  18. Energy efficient hybrid computing systems using spin devices

    Science.gov (United States)

    Sharad, Mrigank

    Emerging spin devices like magnetic tunnel junctions (MTJs), spin valves and domain wall magnets (DWM) have opened new avenues for spin-based logic design. This work explored potential computing applications which can exploit such devices for higher energy efficiency and performance. The proposed applications involve hybrid design schemes, where charge-based devices supplement the spin devices, to gain large benefits at the system level. As an example, lateral spin valves (LSV) involve switching of nanomagnets using spin-polarized current injection through a metallic channel such as Cu. Such spin-torque based devices possess several interesting properties that can be exploited for ultra-low power computation. The analog characteristics of spin currents facilitate non-Boolean computation like majority evaluation that can be used to model a neuron. The magneto-metallic neurons can operate at an ultra-low terminal voltage of ~20 mV, thereby resulting in small computation power. Moreover, since nanomagnets inherently act as memory elements, these devices can facilitate integration of logic and memory in interesting ways. The spin-based neurons can be integrated with CMOS and other emerging devices leading to different classes of neuromorphic/non-Von-Neumann architectures. The spin-based designs involve "mixed-mode" processing and hence can provide very compact and ultra-low energy solutions for complex computation blocks, both digital as well as analog. Such low-power, hybrid designs can be suitable for various data processing applications like cognitive computing, associative memory, and current-mode on-chip global interconnects. Simulation results for these applications based on a device-circuit co-simulation framework predict more than ~100x improvement in computation energy as compared to state-of-the-art CMOS design, for optimal spin-device parameters.

  19. Model Reduction of Computational Aerothermodynamics for Multi-Discipline Analysis in High Speed Flows

    Science.gov (United States)

    Crowell, Andrew Rippetoe

    This dissertation describes model reduction techniques for the computation of aerodynamic heat flux and pressure loads for multi-disciplinary analysis of hypersonic vehicles. NASA and the Department of Defense have expressed renewed interest in the development of responsive, reusable hypersonic cruise vehicles capable of sustained high-speed flight and access to space. However, an extensive set of technical challenges has obstructed the development of such vehicles. These technical challenges are partially due to both the inability to accurately test scaled vehicles in wind tunnels and the time-intensive nature of high-fidelity computational modeling, particularly of the fluid using Computational Fluid Dynamics (CFD). The aim of this dissertation is to develop efficient and accurate models for the aerodynamic heat flux and pressure loads to replace the need for computationally expensive, high-fidelity CFD during coupled analysis. Furthermore, aerodynamic heating and pressure loads are systematically evaluated for a number of different operating conditions, ranging from simple two-dimensional flow over flat surfaces to three-dimensional flows over deformed surfaces with shock-shock interaction and shock-boundary layer interaction. An additional focus of this dissertation is on the implementation and computation of results using the developed aerodynamic heating and pressure models in complex fluid-thermal-structural simulations. Model reduction is achieved using a two-pronged approach. One prong focuses on developing analytical corrections to isothermal, steady-state CFD flow solutions in order to capture flow effects associated with transient spatially-varying surface temperatures and surface pressures (e.g., surface deformation, surface vibration, shock impingements, etc.). The second prong is focused on minimizing the computational expense of computing the steady-state CFD solutions by developing an efficient surrogate CFD model. The developed two

  20. How to build a high-performance compute cluster for the Grid

    CERN Document Server

    Reinefeld, A

    2001-01-01

    The success of large-scale multi-national projects like the forthcoming analysis of the LHC particle collision data at CERN relies to a great extent on the ability to efficiently utilize computing and data-storage resources at geographically distributed sites. Currently, much effort is spent on the design of Grid management software (Datagrid, Globus, etc.), while the effective integration of computing nodes has been largely neglected up to now. This is the focus of our work. We present a framework for a high- performance cluster that can be used as a reliable computing node in the Grid. We outline the cluster architecture, the management of distributed data and the seamless integration of the cluster into the Grid environment. (11 refs).

  1. Adding computationally efficient realism to Monte Carlo turbulence simulation

    Science.gov (United States)

    Campbell, C. W.

    1985-01-01

    Frequently in aerospace vehicle flight simulation, random turbulence is generated using the assumption that the craft is small compared to the length scales of turbulence. The turbulence is presumed to vary only along the flight path of the vehicle but not across the vehicle span. The addition of the realism of three-dimensionality is a worthy goal, but any such attempt will not gain acceptance in the simulator community unless it is computationally efficient. A concept for adding three-dimensional realism with a minimum of computational complexity is presented. The concept involves the use of close rational approximations to irrational spectra and cross-spectra so that systems of stable, explicit difference equations can be used to generate the turbulence.
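
    As a hedged illustration of the idea (a rational approximation to the spectrum realized as a stable, explicit difference equation driven by white noise), the sketch below generates a signal with an exponential autocorrelation using a first-order recursion; the airspeed, length scale and intensity are illustrative values, not taken from the report:

    ```python
    # Minimal sketch of generating a turbulence-like random signal with a stable,
    # explicit difference equation, as suggested by rational spectrum approximations.
    # The first-order recursion reproduces an exponential autocorrelation
    # R(tau) = sigma^2 * exp(-V*tau/L); parameter values below are illustrative.
    import numpy as np

    def turbulence_series(n, dt, V=50.0, L=200.0, sigma=1.0, seed=0):
        rng = np.random.default_rng(seed)
        a = np.exp(-V * dt / L)               # pole of the rational spectrum
        b = sigma * np.sqrt(1.0 - a * a)      # keeps the output variance at sigma^2
        x = np.zeros(n)
        for k in range(1, n):
            x[k] = a * x[k - 1] + b * rng.standard_normal()
        return x

    u = turbulence_series(n=10_000, dt=0.01)
    print(u.std())   # close to sigma = 1.0
    ```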

  2. High-efficiency airfoil rudders applied to submarines

    Directory of Open Access Journals (Sweden)

    ZHOU Yimei

    2017-03-01

    Modern submarine design places ever higher requirements on control surfaces, which requires designers to constantly develop new types of rudder to improve control surface efficiency. Adopting a high-efficiency airfoil rudder is one of the most effective measures for improving the efficiency of control surfaces. In this paper, we put forward an optimization method for a high-efficiency airfoil rudder on the basis of a comparative analysis of the strengths and weaknesses of various airfoils, and a numerical calculation method is adopted to comparatively analyze the hydrodynamic characteristics and wake field of the high-efficiency airfoil rudder and a conventional NACA rudder. At the same time, a model load test was carried out in a towing tank, and the test results and the simulation calculations showed good consistency: the error between them was less than 10%. The experimental results show that the steerage of the high-efficiency airfoil rudder is increased by more than 40% compared with the conventional rudder, while the total resistance remains close: the difference is no more than 4%. The gain in lift efficiency from adopting a high-efficiency airfoil rudder therefore far outweighs the small change in the total resistance of the boat. The results show that the high-efficiency airfoil rudder has clear advantages for improving control efficiency, giving it good application prospects.

  3. SEDRX: A computer program for the simulation Si(Li) and Ge(Hp) x-ray detectors efficiency

    International Nuclear Information System (INIS)

    Benamar, M.A.; Benouali, A.; Tchantchane, A.; Azbouche, A.; Tobbeche, S. (Centre de Developpement des Techniques Nucleaires, Algiers; Labo. des Techniques Nucleaires)

    1992-12-01

    The difficulties encountered in measuring the efficiency of x-ray detectors have motivated the development of a computer program to simulate this parameter. The program computes the efficiency of detectors as a function of energy. The computation is based on fitted absorption coefficients for the photoelectric, coherent and incoherent interactions. These coefficients are taken from the McMaster library or may be determined by interpolation based on cubic splines.
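
    The exact model used by SEDRX is not given here, but a common simplified model for the intrinsic efficiency of such detectors combines transmission through the window and dead layer with absorption in the active volume. A hedged sketch follows; the attenuation-coefficient fits and layer thicknesses are placeholders, not SEDRX's actual data:

    ```python
    # Simplified model of the intrinsic efficiency of a Si(Li) or Ge(HP) x-ray detector:
    # transmission through the window and dead layer times absorption in the active
    # volume. The real attenuation coefficients mu(E) would come from the McMaster
    # tables or fitted splines; the lambdas below are crude placeholders.
    import math

    def intrinsic_efficiency(energy_keV, mu_window, mu_dead, mu_active,
                             t_window, t_dead, t_active):
        """mu functions return linear attenuation coefficients in cm^-1; thicknesses in cm."""
        transmission = math.exp(-mu_window(energy_keV) * t_window
                                - mu_dead(energy_keV) * t_dead)
        absorption = 1.0 - math.exp(-mu_active(energy_keV) * t_active)
        return transmission * absorption

    # Placeholder attenuation curves, decreasing with energy (illustrative only).
    mu_be = lambda e: 1.5e3 * e ** -3      # beryllium window
    mu_si = lambda e: 5.0e3 * e ** -3      # silicon dead layer and active volume

    print(intrinsic_efficiency(10.0, mu_be, mu_si, mu_si,
                               t_window=25e-4, t_dead=1e-5, t_active=0.3))
    ```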

  4. Numerical aspects for efficient welding computational mechanics

    Directory of Open Access Journals (Sweden)

    Aburuga Tarek Kh.S.

    2014-01-01

    The effect of residual stresses and strains is one of the most important parameters in structural integrity assessment. A finite element model is constructed in order to simulate the multi-pass mismatched submerged arc welding (SAW) used in the welded tensile test specimen. A sequentially coupled thermal-mechanical analysis is carried out with the ABAQUS software to calculate the residual stresses and distortion due to welding. In this work, three main issues were studied in order to reduce the computation time of welding simulation, which is the major problem in computational welding mechanics (CWM). The first issue is the dimensionality of the problem: both two- and three-dimensional models are constructed for the same analysis type, and shell elements for the two-dimensional simulation show good performance compared with brick elements. The conventional method of calculating residual stress uses an implicit scheme, because the welding and cooling times are relatively long. In this work, the author shows that an explicit scheme with the mass scaling technique can be used instead, reducing the analysis time very efficiently. With this technique it becomes possible to simulate relatively large three-dimensional structures.
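
    The benefit of mass scaling in an explicit scheme follows from the stability limit of explicit time integration: the stable time step is roughly the element length divided by the dilatational wave speed, so scaling the density by a factor f^2 enlarges the step by f. A hedged sketch with illustrative, steel-like numbers (not taken from the paper):

    ```python
    # Why mass scaling speeds up explicit welding simulations: the stable time step
    # is bounded by the element transit time dt = L_e / c, with wave speed
    # c = sqrt(E / rho). Scaling the density by f**2 enlarges dt by f.
    # Element size and material constants below are illustrative.
    import math

    def stable_time_step(element_size, youngs_modulus, density, mass_scale=1.0):
        c = math.sqrt(youngs_modulus / (density * mass_scale))   # dilatational wave speed
        return element_size / c

    E, rho, L_e = 210e9, 7850.0, 1e-3            # steel-like values, 1 mm element
    dt0 = stable_time_step(L_e, E, rho)
    dt_scaled = stable_time_step(L_e, E, rho, mass_scale=100.0)  # f**2 = 100 -> 10x larger dt
    print(f"{dt0:.2e} s  ->  {dt_scaled:.2e} s")
    ```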

  5. Topic 14+16: High-performance and scientific applications and extreme-scale computing (Introduction)

    KAUST Repository

    Downes, Turlough P.

    2013-01-01

    As our understanding of the world around us increases it becomes more challenging to make use of what we already know, and to increase our understanding still further. Computational modeling and simulation have become critical tools in addressing this challenge. The requirements of high-resolution, accurate modeling have outstripped the ability of desktop computers and even small clusters to provide the necessary compute power. Many applications in the scientific and engineering domains now need very large amounts of compute time, while other applications, particularly in the life sciences, frequently have large data I/O requirements. There is thus a growing need for a range of high performance applications which can utilize parallel compute systems effectively, which have efficient data handling strategies and which have the capacity to utilise current and future systems. The High Performance and Scientific Applications topic aims to highlight recent progress in the use of advanced computing and algorithms to address the varied, complex and increasing challenges of modern research throughout both the "hard" and "soft" sciences. This necessitates being able to use large numbers of compute nodes, many of which are equipped with accelerators, and to deal with difficult I/O requirements. © 2013 Springer-Verlag.

  6. Scalable domain decomposition solvers for stochastic PDEs in high performance computing

    International Nuclear Information System (INIS)

    Desai, Ajit; Pettit, Chris; Poirel, Dominique; Sarkar, Abhijit

    2017-01-01

    Stochastic spectral finite element models of practical engineering systems may involve solutions of linear systems or linearized systems for non-linear problems with billions of unknowns. For stochastic modeling, it is therefore essential to design robust, parallel and scalable algorithms that can efficiently utilize high-performance computing to tackle such large-scale systems. Domain decomposition based iterative solvers can handle such systems. Although these algorithms exhibit excellent scalability, significant algorithmic and implementation challenges remain in extending them to solve extreme-scale stochastic systems using emerging computing platforms. Intrusive polynomial chaos expansion based domain decomposition algorithms are extended here to concurrently handle high resolution in both spatial and stochastic domains using an in-house implementation. Sparse iterative solvers with efficient preconditioners are employed to solve the resulting global and subdomain level local systems through multi-level iterative solvers. We also use parallel sparse matrix–vector operations to reduce the floating-point operations and memory requirements. Numerical and parallel scalabilities of these algorithms are presented for the diffusion equation with a spatially varying diffusion coefficient modeled by a non-Gaussian stochastic process. Scalability of the solvers with respect to the number of random variables is also investigated.
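
    As a much-simplified, hedged illustration of the underlying domain decomposition idea only (not the intrusive polynomial-chaos solver described above), the following sketch applies a one-level overlapping additive Schwarz preconditioner inside CG to a deterministic 1-D diffusion matrix; the matrix, subdomain layout and parameters are illustrative:

    ```python
    # Toy one-level overlapping additive Schwarz preconditioner inside CG for a
    # 1-D diffusion matrix. Dense local inverses keep the sketch short; a real
    # solver would use sparse factorizations and distributed subdomains.
    import numpy as np

    def diffusion_matrix(n):
        return (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
                - np.diag(np.ones(n - 1), -1))

    def schwarz_preconditioner(A, n_sub=4, overlap=2):
        n = A.shape[0]
        bounds = np.linspace(0, n, n_sub + 1).astype(int)
        subsets = [np.arange(max(bounds[i] - overlap, 0), min(bounds[i + 1] + overlap, n))
                   for i in range(n_sub)]
        inverses = [np.linalg.inv(A[np.ix_(s, s)]) for s in subsets]
        def apply(r):
            z = np.zeros_like(r)
            for s, Ainv in zip(subsets, inverses):
                z[s] += Ainv @ r[s]          # local solve on each overlapping subdomain
            return z
        return apply

    def pcg(A, b, M_apply, tol=1e-10, max_iter=500):
        x = np.zeros_like(b); r = b.copy(); z = M_apply(r); p = z.copy(); rz = r @ z
        for it in range(max_iter):
            Ap = A @ p; alpha = rz / (p @ Ap)
            x += alpha * p; r -= alpha * Ap
            if np.linalg.norm(r) < tol * np.linalg.norm(b):
                return x, it + 1
            z = M_apply(r); rz_new = r @ z
            p = z + (rz_new / rz) * p; rz = rz_new
        return x, max_iter

    A = diffusion_matrix(400)
    b = np.ones(400)
    x, iters = pcg(A, b, schwarz_preconditioner(A))
    print(iters, np.linalg.norm(A @ x - b))
    ```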

  7. Efficient computation of electrograms and ECGs in human whole heart simulations using a reaction-eikonal model.

    Science.gov (United States)

    Neic, Aurel; Campos, Fernando O; Prassl, Anton J; Niederer, Steven A; Bishop, Martin J; Vigmond, Edward J; Plank, Gernot

    2017-10-01

    Anatomically accurate and biophysically detailed bidomain models of the human heart have proven a powerful tool for gaining quantitative insight into the links between electrical sources in the myocardium and the concomitant current flow in the surrounding medium as they represent their relationship mechanistically based on first principles. Such models are increasingly considered as a clinical research tool with the perspective of being used, ultimately, as a complementary diagnostic modality. An important prerequisite in many clinical modeling applications is the ability of models to faithfully replicate potential maps and electrograms recorded from a given patient. However, while the personalization of electrophysiology models based on the gold standard bidomain formulation is in principle feasible, the associated computational expenses are significant, rendering their use incompatible with clinical time frames. In this study we report on the development of a novel computationally efficient reaction-eikonal (R-E) model for modeling extracellular potential maps and electrograms. Using a biventricular human electrophysiology model, which incorporates a topologically realistic His-Purkinje system (HPS), we demonstrate by comparing against a high-resolution reaction-diffusion (R-D) bidomain model that the R-E model predicts extracellular potential fields, electrograms as well as ECGs at the body surface with high fidelity and offers vast computational savings greater than three orders of magnitude. Due to their efficiency R-E models are ideally suitable for forward simulations in clinical modeling studies which attempt to personalize electrophysiological model features.

  8. Efficient computation in adaptive artificial spiking neural networks

    NARCIS (Netherlands)

    D. Zambrano (Davide); R.B.P. Nusselder (Roeland); H.S. Scholte; S.M. Bohte (Sander)

    2017-01-01

    textabstractArtificial Neural Networks (ANNs) are bio-inspired models of neural computation that have proven highly effective. Still, ANNs lack a natural notion of time, and neural units in ANNs exchange analog values in a frame-based manner, a computationally and energetically inefficient form of

  9. [Characteristics of phosphorus uptake and use efficiency of rice with high yield and high phosphorus use efficiency].

    Science.gov (United States)

    Li, Li; Zhang, Xi-Zhou; Li, Tinx-Xuan; Yu, Hai-Ying; Ji, Lin; Chen, Guang-Deng

    2014-07-01

    A total of twenty-seven middle-maturing rice varieties, used as parent materials, were divided into four types based on P use efficiency for grain yield in a 2011 field experiment with normal phosphorus (P) application. The rice variety with high yield and high P efficiency was identified in a pot experiment with normal and low P applications, and the contribution rates of the various P efficiencies to yield were investigated in 2012. There were significant genotype differences in the yield and P efficiency of the test materials. GRLu17/AiTTP//Lu17_2 (QR20) was identified as a variety with high yield and high P efficiency, and its yields at the low and normal rates of P application were 1.96 and 1.92 times that of Yuxiang B, respectively. The contribution rate of P accumulation to yield was greater than that of P grain production efficiency and P harvest index across the field and pot experiments. The contribution rates of P accumulation and P grain production efficiency to yield were not significantly different under the normal P condition, whereas obvious differences were observed under the low P condition (66.5% and 26.6%). The minimal contribution to yield was from P harvest index (11.8%). Under the normal P condition, the contribution rates of P accumulation to yield and P harvest index were highest at the jointing-heading stage, at 93.4% and 85.7%, respectively. In addition, the contribution rate of P accumulation to grain production efficiency was 41.8%. Under the low P condition, the maximal contribution rates of P accumulation to yield and grain production efficiency were observed at the tillering-jointing stage, at 56.9% and 20.1%, respectively. Furthermore, the contribution rate of P accumulation to P harvest index was 16.0%. The yield, P accumulation, and P harvest index of QR20 increased significantly under the normal P condition, by 20.6%, 18.1% and 18.2% respectively, compared with the low P condition. The rank of the contribution rates of P

  10. High-Temperature High-Efficiency Solar Thermoelectric Generators

    Energy Technology Data Exchange (ETDEWEB)

    Baranowski, LL; Warren, EL; Toberer, ES

    2014-03-01

    Inspired by recent high-efficiency thermoelectric modules, we consider thermoelectrics for terrestrial applications in concentrated solar thermoelectric generators (STEGs). The STEG is modeled as two subsystems: a TEG, and a solar absorber that efficiently captures the concentrated sunlight and limits radiative losses from the system. The TEG subsystem is modeled using thermoelectric compatibility theory; this model does not constrain the material properties to be constant with temperature. Considering a three-stage TEG based on current record modules, this model suggests that 18% efficiency could be experimentally expected with a temperature gradient of 1000 °C to 100 °C. Achieving 15% overall STEG efficiency thus requires an absorber efficiency above 85%, and we consider two methods to achieve this: solar-selective absorbers and thermally insulating cavities. When the TEG and absorber subsystem models are combined, we expect that the STEG modeled here could achieve 15% efficiency with optical concentration between 250 and 300 suns.
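
    The 15% target quoted above follows directly from the product of the two subsystem efficiencies; a one-line check:

    ```latex
    \eta_{\mathrm{STEG}} \;=\; \eta_{\mathrm{abs}} \cdot \eta_{\mathrm{TEG}}
      \;\approx\; 0.85 \times 0.18 \;\approx\; 0.153 \;\approx\; 15\% .
    ```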

  11. Efficient modeling of interconnects and capacitive discontinuities in high-speed digital circuits. Thesis

    Science.gov (United States)

    Oh, K. S.; Schutt-Aine, J.

    1995-01-01

    Modeling of interconnects and associated discontinuities, driven by the recent advances in high-speed digital circuits, has gained considerable interest over the last decade, although the theoretical bases for analyzing these structures were well-established as early as the 1960s. Ongoing research at the present time is focused on devising methods which can be applied to more general geometries than the ones considered in earlier days and, at the same time, improving the computational efficiency and accuracy of these methods. In this thesis, numerically efficient methods to compute the transmission line parameters of a multiconductor system and the equivalent capacitances of various strip discontinuities are presented based on the quasi-static approximation. The presented techniques are applicable to conductors embedded in an arbitrary number of dielectric layers with two possible locations of ground planes at the top and bottom of the dielectric layers. The cross-sections of conductors can be arbitrary as long as they can be described with polygons. An integral equation approach in conjunction with the collocation method is used in the presented methods. A closed-form Green's function is derived based on weighted real images, thus avoiding nested infinite summations in the exact Green's function; therefore, this closed-form Green's function is numerically more efficient than the exact Green's function. All elements associated with the moment matrix are computed using the closed-form formulas. Various numerical examples are considered to verify the presented methods, and a comparison of the computed results with other published results showed good agreement.

  12. Computing in high energy physics

    International Nuclear Information System (INIS)

    Hertzberger, L.O.; Hoogland, W.

    1986-01-01

    This book deals with advanced computing applications in physics, and in particular in high energy physics environments. The main subjects covered are networking; vector and parallel processing; and embedded systems. Also examined are topics such as operating systems, future computer architectures and commercial computer products. The book presents solutions that are foreseen as coping, in the future, with computing problems in experimental and theoretical High Energy Physics. In the experimental environment the large amounts of data to be processed offer special problems on-line as well as off-line. For on-line data reduction, embedded special purpose computers, which are often used for trigger applications are applied. For off-line processing, parallel computers such as emulator farms and the cosmic cube may be employed. The analysis of these topics is therefore a main feature of this volume

  13. DVS-SOFTWARE: An Effective Tool for Applying Highly Parallelized Hardware To Computational Geophysics

    Science.gov (United States)

    Herrera, I.; Herrera, G. S.

    2015-12-01

    Most geophysical systems are macroscopic physical systems. The prediction of their behavior is carried out by means of computational models whose basic building blocks are partial differential equations (PDEs) [1]. Due to the enormous size of the discretized version of such PDEs it is necessary to apply highly parallelized supercomputers. For them, at present, the most efficient software is based on non-overlapping domain decomposition methods (DDM). However, a limiting feature of the present state-of-the-art techniques is the kind of discretizations used in them. Recently, I. Herrera and co-workers, using 'non-overlapping discretizations', have produced the DVS-Software which overcomes this limitation [2]. The DVS-Software can be applied to a great variety of geophysical problems and achieves very high parallel efficiencies (90% or so [3]). It is therefore very suitable for effectively applying the most advanced parallel supercomputers available at present. In a parallel talk at this AGU Fall Meeting, Graciela Herrera Z. will present how this software is being applied to advance MODFLOW. Key Words: Parallel Software for Geophysics, High Performance Computing, HPC, Parallel Computing, Domain Decomposition Methods (DDM). REFERENCES: [1] Herrera, Ismael and George F. Pinder, "Mathematical Modelling in Science and Engineering: An Axiomatic Approach", John Wiley, 243p., 2012. [2] Herrera, I., de la Cruz, L.M. and Rosas-Medina, A., "Non-Overlapping Discretization Methods for Partial Differential Equations", NUMER METH PART D E, 30: 1427-1454, 2014, DOI 10.1002/num.21852. (Open source) [3] Herrera, I., & Contreras, Iván, "An Innovative Tool for Effectively Applying Highly Parallelized Software To Problems of Elasticity", Geofísica Internacional, 2015 (In press)

  14. 9th International Workshop on Parallel Tools for High Performance Computing

    CERN Document Server

    Hilbrich, Tobias; Niethammer, Christoph; Gracia, José; Nagel, Wolfgang; Resch, Michael

    2016-01-01

    High Performance Computing (HPC) remains a driver that offers huge potentials and benefits for science and society. However, a profound understanding of the computational matters and specialized software is needed to arrive at effective and efficient simulations. Dedicated software tools are important parts of the HPC software landscape, and support application developers. Even though a tool is by definition not a part of an application, but rather a supplemental piece of software, it can make a fundamental difference during the development of an application. Such tools aid application developers in the context of debugging, performance analysis, and code optimization, and therefore make a major contribution to the development of robust and efficient parallel software. This book introduces a selection of the tools presented and discussed at the 9th International Parallel Tools Workshop held in Dresden, Germany, September 2-3, 2015, which offered an established forum for discussing the latest advances in paral...

  15. Analytical Expressions of the Efficiency of Standard and High Contact Ratio Involute Spur Gears

    Directory of Open Access Journals (Sweden)

    Miguel Pleguezuelos

    2013-01-01

    Simple, traditional methods for computation of the efficiency of spur gears are based on the hypotheses of a constant friction coefficient and uniform load sharing along the path of contact. However, none of them is accurate. The friction coefficient is variable along the path of contact, though average values can often be considered for preliminary calculations. Nevertheless, the nonuniform load sharing produced by the changing rigidity of the pair of teeth has a significant influence on the friction losses, due to the different relative sliding at any contact point. In previous works, the authors obtained a nonuniform model of load distribution based on the minimum elastic potential criterion, which was applied to compute the efficiency of standard gears. In this work, this model of load sharing is applied to study the efficiency of both standard and high contact ratio involute spur gears (with contact ratio between 1 and 2 and greater than 2, respectively). Approximate expressions for the friction power losses and for the efficiency are presented assuming the friction coefficient to be constant along the path of contact. A study of the influence of some transmission parameters (such as the gear ratio, pressure angle, etc.) on the efficiency is also presented.

  16. Series-Tuned High Efficiency RF-Power Amplifiers

    DEFF Research Database (Denmark)

    Vidkjær, Jens

    2008-01-01

    An approach to high-efficiency RF-power amplifier design is presented. It simultaneously addresses efficiency optimization and peak voltage limitations when transistors are pushed towards their power limits.

  17. A Heterogeneous High-Performance System for Computational and Computer Science

    Science.gov (United States)

    2016-11-15

    expand the research infrastructure at the institution but also to enhance the high-performance computing training provided to both undergraduate and... cloud computing, supercomputing, and the availability of cheap memory and storage led to enormous amounts of data to be sifted through in forensic... High-Performance Computing (HPC) tools that can be integrated with existing curricula and support our research to modernize and dramatically advance

  18. COMPUTATIONAL EFFICIENCY OF A MODIFIED SCATTERING KERNEL FOR FULL-COUPLED PHOTON-ELECTRON TRANSPORT PARALLEL COMPUTING WITH UNSTRUCTURED TETRAHEDRAL MESHES

    Directory of Open Access Journals (Sweden)

    JONG WOON KIM

    2014-04-01

    In this paper, we introduce a modified scattering kernel approach to avoid the unnecessarily repeated calculations involved in the scattering source calculation, and use it with parallel computing to effectively reduce the computation time. Its computational efficiency was tested for three-dimensional full-coupled photon-electron transport problems using our computer program, which solves the multi-group discrete ordinates transport equation by using the discontinuous finite element method with unstructured tetrahedral meshes for complicated geometrical problems. The numerical tests show that we can achieve speedups of 17∼42 times in the elapsed time per iteration using the modified scattering kernel, not only in the single-CPU calculation but also in parallel computing with several CPUs.

  19. Solving the Coupled System Improves Computational Efficiency of the Bidomain Equations

    KAUST Repository

    Southern, J.A.

    2009-10-01

    The bidomain equations are frequently used to model the propagation of cardiac action potentials across cardiac tissue. At the whole organ level, the size of the computational mesh required makes their solution a significant computational challenge. As the accuracy of the numerical solution cannot be compromised, efficiency of the solution technique is important to ensure that the results of the simulation can be obtained in a reasonable time while still encapsulating the complexities of the system. In an attempt to increase efficiency of the solver, the bidomain equations are often decoupled into one parabolic equation that is computationally very cheap to solve and an elliptic equation that is much more expensive to solve. In this study, the performance of this uncoupled solution method is compared with an alternative strategy in which the bidomain equations are solved as a coupled system. This seems counterintuitive as the alternative method requires the solution of a much larger linear system at each time step. However, in tests on two 3-D rabbit ventricle benchmarks, it is shown that the coupled method is up to 80% faster than the conventional uncoupled method-and that parallel performance is better for the larger coupled problem.

  20. Solving the Coupled System Improves Computational Efficiency of the Bidomain Equations

    KAUST Repository

    Southern, J.A.; Plank, G.; Vigmond, E.J.; Whiteley, J.P.

    2009-01-01

    The bidomain equations are frequently used to model the propagation of cardiac action potentials across cardiac tissue. At the whole organ level, the size of the computational mesh required makes their solution a significant computational challenge. As the accuracy of the numerical solution cannot be compromised, efficiency of the solution technique is important to ensure that the results of the simulation can be obtained in a reasonable time while still encapsulating the complexities of the system. In an attempt to increase efficiency of the solver, the bidomain equations are often decoupled into one parabolic equation that is computationally very cheap to solve and an elliptic equation that is much more expensive to solve. In this study, the performance of this uncoupled solution method is compared with an alternative strategy in which the bidomain equations are solved as a coupled system. This seems counterintuitive as the alternative method requires the solution of a much larger linear system at each time step. However, in tests on two 3-D rabbit ventricle benchmarks, it is shown that the coupled method is up to 80% faster than the conventional uncoupled method-and that parallel performance is better for the larger coupled problem.

  1. An efficient and accurate method for computation of energy release rates in beam structures with longitudinal cracks

    DEFF Research Database (Denmark)

    Blasques, José Pedro Albergaria Amaral; Bitsche, Robert

    2015-01-01

    This paper proposes a novel, efficient, and accurate framework for fracture analysis of beam structures with longitudinal cracks. The three-dimensional local stress field is determined using a high-fidelity beam model incorporating a finite element based cross section analysis tool. The Virtual...... Crack Closure Technique is used for computation of strain energy release rates. The devised framework was employed for analysis of cracks in beams with different cross section geometries. The results show that the accuracy of the proposed method is comparable to that of conventional three......-dimensional solid finite element models while using only a fraction of the computation time....

  2. Profiling high performance dense linear algebra algorithms on multicore architectures for power and energy efficiency

    KAUST Repository

    Ltaief, Hatem

    2011-08-31

    This paper presents the power profile of two high performance dense linear algebra libraries i.e., LAPACK and PLASMA. The former is based on block algorithms that use the fork-join paradigm to achieve parallel performance. The latter uses fine-grained task parallelism that recasts the computation to operate on submatrices called tiles. In this way tile algorithms are formed. We show results from the power profiling of the most common routines, which permits us to clearly identify the different phases of the computations. This allows us to isolate the bottlenecks in terms of energy efficiency. Our results show that PLASMA surpasses LAPACK not only in terms of performance but also in terms of energy efficiency. © 2011 Springer-Verlag.

  3. Computer-aided modeling framework for efficient model development, analysis and identification

    DEFF Research Database (Denmark)

    Heitzig, Martina; Sin, Gürkan; Sales Cruz, Mauricio

    2011-01-01

    Model-based computer aided product-process engineering has attained increased importance in a number of industries, including pharmaceuticals, petrochemicals, fine chemicals, polymers, biotechnology, food, energy, and water. This trend is set to continue due to the substantial benefits computer-aided...... methods introduce. The key prerequisite of computer-aided product-process engineering is however the availability of models of different types, forms, and application modes. The development of the models required for the systems under investigation tends to be a challenging and time-consuming task....... The methodology has been implemented into a computer-aided modeling framework, which combines expert skills, tools, and database connections that are required for the different steps of the model development work-flow with the goal to increase the efficiency of the modeling process. The framework has two main...

  4. High efficiency, long life terrestrial solar panel

    Science.gov (United States)

    Chao, T.; Khemthong, S.; Ling, R.; Olah, S.

    1977-01-01

    The design of a high efficiency, long life terrestrial module was completed. It utilized 256 rectangular, high efficiency solar cells to achieve high packing density and electrical output. Tooling for the fabrication of solar cells was in house and evaluation of the cell performance was begun. Based on the power output analysis, the goal of a 13% efficiency module was achievable.

  5. Use of Debye's series to determine the optimal edge-effect terms for computing the extinction efficiencies of spheroids.

    Science.gov (United States)

    Lin, Wushao; Bi, Lei; Liu, Dong; Zhang, Kejun

    2017-08-21

    The extinction efficiencies of atmospheric particles are essential to determining radiation attenuation and thus are fundamentally related to atmospheric radiative transfer. The extinction efficiencies can also be used to retrieve particle sizes or refractive indices through particle characterization techniques. This study first uses the Debye series to improve the accuracy of high-frequency extinction formulae for spheroids in the context of Complex angular momentum theory by determining an optimal number of edge-effect terms. We show that the optimal edge-effect terms can be accurately obtained by comparing the results from the approximate formula with their counterparts computed from the invariant imbedding Debye series and T-matrix methods. An invariant imbedding T-matrix method is employed for particles with strong absorption, in which case the extinction efficiency is equivalent to two plus the edge-effect efficiency. For weakly absorptive or non-absorptive particles, the T-matrix results contain the interference between the diffraction and higher-order transmitted rays. Therefore, the Debye series was used to compute the edge-effect efficiency by separating the interference from the transmission on the extinction efficiency. We found that the optimal number strongly depends on the refractive index and is relatively insensitive to the particle geometry and size parameter. By building a table of optimal numbers of edge-effect terms, we developed an efficient and accurate extinction simulator that has been fully tested for randomly oriented spheroids with various aspect ratios and a wide range of refractive indices.

  6. High energy physics computing in Japan

    International Nuclear Information System (INIS)

    Watase, Yoshiyuki

    1989-01-01

    A brief overview of the computing provision for high energy physics in Japan is presented. Most of the computing power for high energy physics is concentrated in KEK. Here there are two large scale systems: one providing a general computing service including vector processing and the other dedicated to TRISTAN experiments. Each university group has a smaller sized mainframe or VAX system to facilitate both their local computing needs and the remote use of the KEK computers through a network. The large computer system for the TRISTAN experiments is described. An overview of a prospective future large facility is also given. (orig.)

  7. High-performance computing in seismology

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1996-09-01

    The scientific, technical, and economic importance of the issues discussed here presents a clear agenda for future research in computational seismology. In this way these problems will drive advances in high-performance computing in the field of seismology. There is a broad community that will benefit from this work, including the petroleum industry, research geophysicists, engineers concerned with seismic hazard mitigation, and governments charged with enforcing a comprehensive test ban treaty. These advances may also lead to new applications for seismological research. The recent application of high-resolution seismic imaging of the shallow subsurface for the environmental remediation industry is an example of this activity. This report makes the following recommendations: (1) focused efforts to develop validated documented software for seismological computations should be supported, with special emphasis on scalable algorithms for parallel processors; (2) the education of seismologists in high-performance computing technologies and methodologies should be improved; (3) collaborations between seismologists and computational scientists and engineers should be increased; (4) the infrastructure for archiving, disseminating, and processing large volumes of seismological data should be improved.

  8. A highly efficient sharp-interface immersed boundary method with adaptive mesh refinement for bio-inspired flow simulations

    Science.gov (United States)

    Deng, Xiaolong; Dong, Haibo

    2017-11-01

    Developing a high-fidelity, high-efficiency numerical method for bio-inspired flow problems with flow-structure interaction is important for understanding related physics and developing many bio-inspired technologies. To simulate a fast-swimming big fish with multiple finlets or fish schooling, we need fine grids and/or a big computational domain, which are big challenges for 3-D simulations. In current work, based on the 3-D finite-difference sharp-interface immersed boundary method for incompressible flows (Mittal et al., JCP 2008), we developed an octree-like Adaptive Mesh Refinement (AMR) technique to enhance the computational ability and increase the computational efficiency. The AMR is coupled with a multigrid acceleration technique and a MPI +OpenMP hybrid parallelization. In this work, different AMR layers are treated separately and the synchronization is performed in the buffer regions and iterations are performed for the convergence of solution. Each big region is calculated by a MPI process which then uses multiple OpenMP threads for further acceleration, so that the communication cost is reduced. With these acceleration techniques, various canonical and bio-inspired flow problems with complex boundaries can be simulated accurately and efficiently. This work is supported by the MURI Grant Number N00014-14-1-0533 and NSF Grant CBET-1605434.

  9. Secure Computation, I/O-Efficient Algorithms and Distributed Signatures

    DEFF Research Database (Denmark)

    Damgård, Ivan Bjerre; Kölker, Jonas; Toft, Tomas

    2012-01-01

    values of the form (r, g^r) for random secret-shared r ∈ ℤ_q and g^r in a group of order q. This costs a constant number of exponentiations per player per value generated, even if less than n/3 players are malicious. This can be used for efficient distributed computation of Schnorr signatures. We further develop...... the technique so we can sign secret data in a distributed fashion at essentially the same cost....
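
    For context, a plain single-party Schnorr signature consumes exactly such an (r, g^r) pair; the toy sketch below (tiny, insecure parameters chosen only for illustration, not taken from the paper) shows where the pair enters. In the distributed protocol described above, r would be secret-shared among the players while g^r is public.

    ```python
    # Plain single-party Schnorr signatures over a toy subgroup, showing where a
    # pre-generated pair (r, g^r) is consumed. The parameters are far too small to
    # be secure and serve only to make the arithmetic visible.
    import hashlib
    import secrets

    p, q, g = 1019, 509, 4        # g = 2^2 generates the subgroup of prime order q

    def h(R, message):
        data = f"{R}|{message}".encode()
        return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

    def keygen():
        x = secrets.randbelow(q - 1) + 1     # private key
        return x, pow(g, x, p)               # (x, y = g^x)

    def sign(x, message):
        r = secrets.randbelow(q - 1) + 1     # distributed setting: r is secret-shared...
        R = pow(g, r, p)                     # ...while R = g^r is public
        c = h(R, message)
        s = (r + c * x) % q
        return R, s

    def verify(y, message, signature):
        R, s = signature
        c = h(R, message)
        return pow(g, s, p) == (R * pow(y, c, p)) % p

    x, y = keygen()
    sig = sign(x, "hello")
    print(verify(y, "hello", sig))           # True
    ```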

  10. Computational study of a High Pressure Turbine Nozzle/Blade Interaction

    Science.gov (United States)

    Kopriva, James; Laskowski, Gregory; Sheikhi, Reza

    2015-11-01

    A downstream high pressure turbine blade has been designed for this study to be coupled with the upstream uncooled nozzle of Arts and Rouvroit [1992]. The computational domain is first held to a pitch-line section that includes no centrifugal forces (linear sliding-mesh). The stage geometry is intended to study the fundamental nozzle/blade interaction in a computationally cost-efficient manner. A blade/nozzle count of 2:1 is chosen to maintain computational periodic boundary conditions for the coupled problem. Next the geometry is extended to a fully 3D domain with endwalls to understand the impact of secondary flow structures. A set of systematic computational studies is presented to understand the impact of turbulence on the nozzle and downstream blade boundary layer development, the resulting heat transfer, and downstream wake mixing in the absence of cooling. Doing so will provide a much better understanding of stage mixing losses and wall heat transfer which, in turn, can allow for improved engine performance. Computational studies are performed using the WALE (Wall-Adapting Local Eddy-viscosity), IDDES (Improved Delayed Detached Eddy Simulation), and SST (Shear Stress Transport) models in Fluent.

  11. High accuracy ion optics computing

    International Nuclear Information System (INIS)

    Amos, R.J.; Evans, G.A.; Smith, R.

    1986-01-01

    Computer simulation of focused ion beams, for surface analysis of materials by SIMS or for microfabrication by ion beam lithography, plays an important role in the design of low energy ion beam transport and optical systems. Many computer packages currently available are limited in their applications, being inaccurate or inappropriate for a number of practical purposes. This work describes an efficient and accurate computer programme which has been developed and tested for use on medium-sized machines. The programme is written in Algol 68 and models the behaviour of a beam of charged particles through an electrostatic system. A variable grid finite difference method is used with a unique data structure to calculate the electric potential in an axially symmetric region, for arbitrarily shaped boundaries. Emphasis has been placed upon finding economical methods of solving the resulting set of sparse linear equations in the calculation of the electric field, and several of these are described. Applications include individual ion lenses, extraction optics for ions in surface analytical instruments and the design of columns for ion beam lithography. Computational results have been compared with analytical calculations and with some data obtained from individual einzel lenses. (author)
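
    As a hedged illustration of the kind of potential calculation such a programme performs (not the Algol 68 code itself, which uses a variable grid and an economical sparse solver), a uniform-grid Gauss-Seidel relaxation of the axisymmetric Laplace equation might look as follows; the grid, electrode layout and potentials are assumptions:

    ```python
    # Uniform-grid Gauss-Seidel relaxation of the axisymmetric Laplace equation
    #   d2phi/dr2 + (1/r) dphi/dr + d2phi/dz2 = 0
    # on a cylindrical region, as used in ion-optics potential calculations.
    # SOR or a sparse direct solver would converge much faster; plain sweeps keep
    # the sketch short. Electrode layout and potentials are illustrative.
    import numpy as np

    nr, nz, h = 31, 61, 1.0            # grid: r = i*h (i = 0 on the axis), z = j*h
    phi = np.zeros((nr, nz))
    phi[-1, : nz // 2] = 0.0           # outer wall, left electrode at 0 V (assumed)
    phi[-1, nz // 2 :] = 1000.0        # outer wall, right electrode at 1000 V (assumed)
    phi[:, 0] = 0.0                    # left end cap
    phi[:, -1] = 1000.0                # right end cap

    for _ in range(3000):              # plain Gauss-Seidel sweeps
        for j in range(1, nz - 1):
            # on-axis node (r = 0): Laplacian reduces to 4*(phi[1]-phi[0])/h^2 + phi_zz
            phi[0, j] = (4.0 * phi[1, j] + phi[0, j + 1] + phi[0, j - 1]) / 6.0
            for i in range(1, nr - 1):
                phi[i, j] = ((1.0 + 0.5 / i) * phi[i + 1, j]
                             + (1.0 - 0.5 / i) * phi[i - 1, j]
                             + phi[i, j + 1] + phi[i, j - 1]) / 4.0

    print(phi[0, nz // 2])             # potential on the axis near the electrode gap
    ```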

  12. Implementing an Affordable High-Performance Computing for Teaching-Oriented Computer Science Curriculum

    Science.gov (United States)

    Abuzaghleh, Omar; Goldschmidt, Kathleen; Elleithy, Yasser; Lee, Jeongkyu

    2013-01-01

    With the advances in computing power, high-performance computing (HPC) platforms have had an impact on not only scientific research in advanced organizations but also computer science curriculum in the educational community. For example, multicore programming and parallel systems are highly desired courses in the computer science major. However,…

  13. Design Strategies for Ultra-high Efficiency Photovoltaics

    Science.gov (United States)

    Warmann, Emily Cathryn

    While concentrator photovoltaic cells have shown significant improvements in efficiency in the past ten years, once these cells are integrated into concentrating optics, connected to a power conditioning system and deployed in the field, the overall module efficiency drops to only 34 to 36%. This efficiency is impressive compared to conventional flat plate modules, but it is far short of the theoretical limits for solar energy conversion. A system capable of ultra-high efficiency of 50% or greater cannot be achieved by refinement and iteration of current design approaches. This thesis takes a systems approach to designing a photovoltaic system capable of 50% efficient performance using conventional diode-based solar cells. The effort began with an exploration of the limiting efficiency of spectrum splitting ensembles with 2 to 20 sub cells in different electrical configurations. Incorporating realistic non-ideal performance with the computationally simple detailed balance approach resulted in practical limits that are useful to identify specific cell performance requirements. This effort quantified the relative benefit of additional cells and concentration for system efficiency, which will help in designing practical optical systems. Efforts to improve the quality of the solar cells themselves focused on the development of tunable lattice constant epitaxial templates. Initially intended to enable lattice-matched multijunction solar cells, these templates would enable increased flexibility in band gap selection for spectrum splitting ensembles and enhanced radiative quality relative to metamorphic growth. The III-V material family is commonly used for multijunction solar cells both for its high radiative quality and for the ease of integrating multiple band gaps into one monolithic growth. The band gap flexibility is limited by the lattice constant of available growth templates. The virtual substrate consists of a thin III-V film with the desired

  14. 10th International Workshop on Parallel Tools for High Performance Computing

    CERN Document Server

    Gracia, José; Hilbrich, Tobias; Knüpfer, Andreas; Resch, Michael; Nagel, Wolfgang

    2017-01-01

    This book presents the proceedings of the 10th International Parallel Tools Workshop, held October 4-5, 2016 in Stuttgart, Germany – a forum to discuss the latest advances in parallel tools. High-performance computing plays an increasingly important role in numerical simulation and modelling in academic and industrial research. At the same time, using large-scale parallel systems efficiently is becoming more difficult. A number of tools addressing parallel program development and analysis have emerged from the high-performance computing community over the last decade, and what may have started as a collection of small helper scripts has now matured into production-grade frameworks. Powerful user interfaces and an extensive body of documentation allow easy usage by non-specialists.

  15. New highly efficient piezoceramic materials

    International Nuclear Information System (INIS)

    Dantsiger, A.Ya.; Razumovskaya, O.N.; Reznichenko, L.A.; Grineva, L.D.; Devlikanova, R.U.; Dudkina, S.I.; Gavrilyachenko, S.V.; Dergunova, N.V.

    1993-01-01

    New highly efficient piezoceramic materials with various combinations of parameters, including a high Curie point, have been developed for high-temperature transducers used in atomic power engineering. They can be used in systems for nondestructive testing of heated materials, in controllers for various industrial power plants and in other high-temperature equipment

  16. INSPIRED High School Computing Academies

    Science.gov (United States)

    Doerschuk, Peggy; Liu, Jiangjiang; Mann, Judith

    2011-01-01

    If we are to attract more women and minorities to computing we must engage students at an early age. As part of its mission to increase participation of women and underrepresented minorities in computing, the Increasing Student Participation in Research Development Program (INSPIRED) conducts computing academies for high school students. The…

  17. 7th International Workshop on Parallel Tools for High Performance Computing

    CERN Document Server

    Gracia, José; Nagel, Wolfgang; Resch, Michael

    2014-01-01

    Current advances in High Performance Computing (HPC) increasingly impact efficient software development workflows. Programmers for HPC applications need to consider trends such as increased core counts, multiple levels of parallelism, reduced memory per core, and I/O system challenges in order to derive well performing and highly scalable codes. At the same time, the increasing complexity adds further sources of program defects. While novel programming paradigms and advanced system libraries provide solutions for some of these challenges, appropriate supporting tools are indispensable. Such tools aid application developers in debugging, performance analysis, or code optimization and therefore make a major contribution to the development of robust and efficient parallel software. This book introduces a selection of the tools presented and discussed at the 7th International Parallel Tools Workshop, held in Dresden, Germany, September 3-4, 2013.  

  18. I/O-Efficient Computation of Water Flow Across a Terrain

    DEFF Research Database (Denmark)

    Arge, Lars Allan; Revsbæk, Morten; Zeh, Norbert

    2010-01-01

    We present an I/O-efficient algorithm that solves this problem using O(sort(X) log(X/M) + sort(N)) I/Os, where N is the number of terrain vertices, X is the number of pits of the terrain, sort(N) is the cost of sorting N data items, and M is the size of the computer's main memory. Our algorithm...

  19. Efficient Computation of Casimir Interactions between Arbitrary 3D Objects

    International Nuclear Information System (INIS)

    Reid, M. T. Homer; Rodriguez, Alejandro W.; White, Jacob; Johnson, Steven G.

    2009-01-01

    We introduce an efficient technique for computing Casimir energies and forces between objects of arbitrarily complex 3D geometries. In contrast to other recently developed methods, our technique easily handles nonspheroidal, nonaxisymmetric objects, and objects with sharp corners. Using our new technique, we obtain the first predictions of Casimir interactions in a number of experimentally relevant geometries, including crossed cylinders and tetrahedral nanoparticles.

  20. Automatic domain updating technique for improving computational efficiency of 2-D flood-inundation simulation

    Science.gov (United States)

    Tanaka, T.; Tachikawa, Y.; Ichikawa, Y.; Yorozu, K.

    2017-12-01

    Flooding is one of the most hazardous disasters and causes serious damage to people and property around the world. To prevent or mitigate flood damage through early warning systems and/or river management planning, numerical modelling of flood-inundation processes is essential. In the literature, flood-inundation models have been extensively developed and improved to simulate flood flows over complex topography at high resolution. With increasing demands on flood-inundation modelling, its computational burden is now one of the key issues. Improvements to the computational efficiency of the full shallow water equations have been made from various perspectives, such as approximations of the momentum equations, parallelization techniques, and coarsening approaches. To complement these techniques and further improve the computational efficiency of flood-inundation simulations, this study proposes an Automatic Domain Updating (ADU) method for 2-D flood-inundation simulation. The ADU method traces the wet-dry interface and automatically updates the simulation domain in response to the progress and recession of flood propagation. The updating algorithm is as follows: first, register the simulation cells potentially flooded at the initial stage (such as floodplains near river channels); then, whenever a registered cell is flooded, register its surrounding cells. The overhead of this bookkeeping is kept small by checking only cells at the wet-dry interface, and the overall computation time is reduced by skipping non-flooded areas; a minimal sketch of the idea is given below. The algorithm is easily applied to any type of 2-D flood-inundation model. The proposed ADU method is implemented in a 2-D local inertial equation model for the Yodo River basin, Japan. Case studies for two flood events show that the simulation finishes two to ten times faster while producing the same results as the simulation without the ADU method.
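
    A toy sketch of the ADU bookkeeping described above (not the authors' code): only registered cells are processed each step, and whenever a cell becomes wet, its surroundings are registered too. The spreading rule, grid size and initial flooded cell are placeholders standing in for the 2-D local inertial solver.

        import numpy as np

        ny, nx = 100, 100
        depth = np.zeros((ny, nx))
        depth[50, 50] = 1.0                      # hypothetical initially flooded cell
        active = {(50, 50)}                      # cells registered at the initial stage

        def neighbours(i, j):
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < ny and 0 <= nj < nx:
                    yield ni, nj

        for step in range(20):
            newly_wet = []
            for (i, j) in list(active):
                if depth[i, j] <= 0.0:
                    continue
                share = 0.125 * depth[i, j]      # toy spreading rule, not the inertial solver
                for ni, nj in neighbours(i, j):
                    if depth[ni, nj] == 0.0:
                        newly_wet.append((ni, nj))
                    depth[ni, nj] += share
                depth[i, j] *= 0.5
            for cell in newly_wet:               # ADU bookkeeping: register newly wetted
                active.add(cell)                 # cells and their surroundings
                active.update(neighbours(*cell))

        print("cells processed:", len(active), "of", ny * nx)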

  1. Computationally efficient implementation of combustion chemistry in parallel PDF calculations

    International Nuclear Information System (INIS)

    Lu Liuyan; Lantz, Steven R.; Ren Zhuyin; Pope, Stephen B.

    2009-01-01

    In parallel calculations of combustion processes with realistic chemistry, the serial in situ adaptive tabulation (ISAT) algorithm [S.B. Pope, Computationally efficient implementation of combustion chemistry using in situ adaptive tabulation, Combustion Theory and Modelling, 1 (1997) 41-63; L. Lu, S.B. Pope, An improved algorithm for in situ adaptive tabulation, Journal of Computational Physics 228 (2009) 361-386] substantially speeds up the chemistry calculations on each processor. To improve the parallel efficiency of large ensembles of such calculations in parallel computations, in this work, the ISAT algorithm is extended to the multi-processor environment, with the aim of minimizing the wall clock time required for the whole ensemble. Parallel ISAT strategies are developed by combining the existing serial ISAT algorithm with different distribution strategies, namely purely local processing (PLP), uniformly random distribution (URAN), and preferential distribution (PREF). The distribution strategies enable the queued load redistribution of chemistry calculations among processors using message passing. They are implemented in the software x2f_mpi, which is a Fortran 95 library for facilitating many parallel evaluations of a general vector function. The relative performance of the parallel ISAT strategies is investigated in different computational regimes via the PDF calculations of multiple partially stirred reactors burning methane/air mixtures. The results show that the performance of ISAT with a fixed distribution strategy strongly depends on certain computational regimes, based on how much memory is available and how much overlap exists between tabulated information on different processors. No one fixed strategy consistently achieves good performance in all the regimes. Therefore, an adaptive distribution strategy, which blends PLP, URAN and PREF, is devised and implemented. It yields consistently good performance in all regimes. In the adaptive parallel

  2. High-efficiency cavity-dumped micro-chip Yb:YAG laser

    Science.gov (United States)

    Nishio, M.; Maruko, A.; Inoue, M.; Takama, M.; Matsubara, S.; Okunishi, H.; Kato, K.; Kyomoto, K.; Yoshida, T.; Shimabayashi, K.; Morioka, M.; Inayoshi, S.; Yamagata, S.; Kawato, S.

    2014-09-01

    A high-efficiency cavity-dumped ytterbium-doped yttrium aluminum garnet (Yb:YAG) laser was developed. Although the high quantum efficiency of ytterbium-doped laser materials is well suited to high-efficiency laser oscillation, the efficiency is reduced by their quasi-three/four-level laser nature. High-gain operation under high-intensity pumping enables high-efficiency oscillation of such quasi-three/four-level lasers without extremely low temperature cooling. In our group, the highest-efficiency oscillations for continuous-wave and nanosecond-to-picosecond pulse lasers were achieved at room temperature using this high-gain operation, with pump intensities beyond 100 kW/cm2.

  3. An efficient algorithm to compute subsets of points in ℤ n

    OpenAIRE

    Pacheco Martínez, Ana María; Real Jurado, Pedro

    2012-01-01

    In this paper we show a more efficient algorithm than that in [8] to compute subsets of points non-congruent by isometries. This algorithm can be used to reconstruct the object from the digital image. Both algorithms are compared, highlighting the improvements obtained in terms of CPU time.

  4. Parallel computation of fluid-structural interactions using high resolution upwind schemes

    Science.gov (United States)

    Hu, Zongjun

    An efficient and accurate solver is developed to simulate the non-linear fluid-structural interactions in turbomachinery flutter flows. A new low-diffusion E-CUSP scheme, the Zha CUSP scheme, is developed to improve the efficiency and accuracy of the inviscid flux computation. The 3D unsteady Navier-Stokes equations with the Baldwin-Lomax turbulence model are solved using the finite volume method with the dual-time stepping scheme. The linearized equations are solved with Gauss-Seidel line iterations. The parallel computation is implemented using MPI protocol. The solver is validated with 2D cases for its turbulence modeling, parallel computation and unsteady calculation. The Zha CUSP scheme is validated with 2D cases, including a supersonic flat plate boundary layer, a transonic converging-diverging nozzle and a transonic inlet diffuser. The Zha CUSP2 scheme is tested with 3D cases, including a circular-to-rectangular nozzle, a subsonic compressor cascade and a transonic channel. The Zha CUSP schemes are proved to be accurate, robust and efficient in these tests. The steady and unsteady separation flows in a 3D stationary cascade under high incidence and three inlet Mach numbers are calculated to study the steady state separation flow patterns and their unsteady oscillation characteristics. The leading edge vortex shedding is the mechanism behind the unsteady characteristics of the high incidence separated flows. The separation flow characteristics are affected by the inlet Mach number. The blade aeroelasticity of a linear cascade with forced oscillating blades is studied using parallel computation. A simplified two-passage cascade with periodic boundary condition is first calculated under a medium frequency and a low incidence. The full scale cascade with 9 blades and two end walls is then studied more extensively under three oscillation frequencies and two incidence angles. The end wall influence and the blade stability are studied and compared under different

  5. Highly efficient high temperature electrolysis

    DEFF Research Database (Denmark)

    Hauch, Anne; Ebbesen, Sune; Jensen, Søren Højgaard

    2008-01-01

    High temperature electrolysis of water and steam may provide an efficient, cost-effective and environmentally friendly production of H2 using electricity produced from sustainable, non-fossil energy sources. To achieve cost-competitive electrolysis cells that are both high performing, i.e. minimum...... internal resistance of the cell, and long-term stable, it is critical to develop electrode materials that are optimal for steam electrolysis. In this article, electrolysis cells for electrolysis of water or steam at temperatures above 200 degrees C for production of H2 are reviewed. High temperature...... electrolysis is favourable from a thermodynamic point of view, because a part of the required energy can be supplied as thermal heat, and the activation barrier is lowered, increasing the H2 production rate. Only two types of cells operating at high temperature (above 200 degrees C) have been described......

  6. Quantum Computing and the Limits of the Efficiently Computable

    CERN Multimedia

    CERN. Geneva

    2015-01-01

    I'll discuss how computational complexity---the study of what can and can't be feasibly computed---has been interacting with physics in interesting and unexpected ways. I'll first give a crash course about computer science's P vs. NP problem, as well as about the capabilities and limits of quantum computers. I'll then touch on speculative models of computation that would go even beyond quantum computers, using (for example) hypothetical nonlinearities in the Schrodinger equation. Finally, I'll discuss BosonSampling ---a proposal for a simple form of quantum computing, which nevertheless seems intractable to simulate using a classical computer---as well as the role of computational complexity in the black hole information puzzle.

  7. Energy- and cost-efficient lattice-QCD computations using graphics processing units

    Energy Technology Data Exchange (ETDEWEB)

    Bach, Matthias

    2014-07-01

    Quarks and gluons are the building blocks of all hadronic matter, like protons and neutrons. Their interaction is described by Quantum Chromodynamics (QCD), a theory under test by large scale experiments like the Large Hadron Collider (LHC) at CERN and in the future at the Facility for Antiproton and Ion Research (FAIR) at GSI. However, perturbative methods can only be applied to QCD for high energies. Studies from first principles are possible via a discretization onto an Euclidean space-time grid. This discretization of QCD is called Lattice QCD (LQCD) and is the only ab-initio option outside of the high-energy regime. LQCD is extremely compute and memory intensive. In particular, it is by definition always bandwidth limited. Thus - despite the complexity of LQCD applications - it led to the development of several specialized compute platforms and influenced the development of others. However, in recent years General-Purpose computation on Graphics Processing Units (GPGPU) came up as a new means for parallel computing. Contrary to machines traditionally used for LQCD, graphics processing units (GPUs) are a mass-market product. This promises advantages in both the pace at which higher-performing hardware becomes available and its price. CL2QCD is an OpenCL based implementation of LQCD using Wilson fermions that was developed within this thesis. It operates on GPUs by all major vendors as well as on central processing units (CPUs). On the AMD Radeon HD 7970 it provides the fastest double-precision D kernel for a single GPU, achieving 120GFLOPS. D - the most compute intensive kernel in LQCD simulations - is commonly used to compare LQCD platforms. This performance is enabled by an in-depth analysis of optimization techniques for bandwidth-limited codes on GPUs. Further, analysis of the communication between GPU and CPU, as well as between multiple GPUs, enables high-performance Krylov space solvers and linear scaling to multiple GPUs within a single system. LQCD

  8. Energy- and cost-efficient lattice-QCD computations using graphics processing units

    International Nuclear Information System (INIS)

    Bach, Matthias

    2014-01-01

    Quarks and gluons are the building blocks of all hadronic matter, like protons and neutrons. Their interaction is described by Quantum Chromodynamics (QCD), a theory under test by large scale experiments like the Large Hadron Collider (LHC) at CERN and in the future at the Facility for Antiproton and Ion Research (FAIR) at GSI. However, perturbative methods can only be applied to QCD for high energies. Studies from first principles are possible via a discretization onto an Euclidean space-time grid. This discretization of QCD is called Lattice QCD (LQCD) and is the only ab-initio option outside of the high-energy regime. LQCD is extremely compute and memory intensive. In particular, it is by definition always bandwidth limited. Thus - despite the complexity of LQCD applications - it led to the development of several specialized compute platforms and influenced the development of others. However, in recent years General-Purpose computation on Graphics Processing Units (GPGPU) came up as a new means for parallel computing. Contrary to machines traditionally used for LQCD, graphics processing units (GPUs) are a mass-market product. This promises advantages in both the pace at which higher-performing hardware becomes available and its price. CL2QCD is an OpenCL based implementation of LQCD using Wilson fermions that was developed within this thesis. It operates on GPUs by all major vendors as well as on central processing units (CPUs). On the AMD Radeon HD 7970 it provides the fastest double-precision D kernel for a single GPU, achieving 120GFLOPS. D - the most compute intensive kernel in LQCD simulations - is commonly used to compare LQCD platforms. This performance is enabled by an in-depth analysis of optimization techniques for bandwidth-limited codes on GPUs. Further, analysis of the communication between GPU and CPU, as well as between multiple GPUs, enables high-performance Krylov space solvers and linear scaling to multiple GPUs within a single system. LQCD

  9. Efficient technique for computational design of thermoelectric materials

    Science.gov (United States)

    Núñez-Valdez, Maribel; Allahyari, Zahed; Fan, Tao; Oganov, Artem R.

    2018-01-01

    Efficient thermoelectric materials are highly desirable, and the quest for finding them has intensified as they could be promising alternatives to fossil energy sources. Here we present a general first-principles approach to predict, in multicomponent systems, efficient thermoelectric compounds. The method combines a robust evolutionary algorithm, a Pareto multiobjective optimization, density functional theory and a Boltzmann semi-classical calculation of thermoelectric efficiency. To test the performance and reliability of our overall framework, we use the well-known system Bi2Te3-Sb2Te3.
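
    A small illustrative sketch of the Pareto multiobjective step mentioned above, under the assumption that each candidate compound has been reduced to two objective values to be minimized; the random scores stand in for DFT/Boltzmann-derived thermoelectric metrics and are not the authors' data.

        import numpy as np

        rng = np.random.default_rng(0)
        # columns: two objectives to minimize, e.g. (negative power factor, lattice thermal conductivity)
        scores = rng.random((50, 2))             # placeholder objective values

        def pareto_front(points):
            # keep points not dominated by any other point (minimization in every column)
            keep = []
            for i, p in enumerate(points):
                dominators = np.all(points <= p, axis=1) & np.any(points < p, axis=1)
                if not dominators.any():
                    keep.append(i)
            return keep

        front = pareto_front(scores)
        print(len(front), "non-dominated candidates out of", len(scores))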

  10. Efficient implementation of multidimensional fast fourier transform on a distributed-memory parallel multi-node computer

    Science.gov (United States)

    Bhanot, Gyan V [Princeton, NJ; Chen, Dong [Croton-On-Hudson, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY

    2012-01-10

    The present invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via "all-to-all" distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates efficient utilization of the network thereby efficiently implementing the multidimensional FFT. The "all-to-all" re-distribution of array elements is further efficiently implemented in applications other than the multidimensional FFT on the distributed-memory parallel supercomputer.
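
    A single-process sketch of the transform/transpose pattern the patent describes, with a local matrix transpose standing in for the "all-to-all" network redistribution; numpy is used only to check that two passes of one-dimensional FFTs reproduce the full two-dimensional FFT.

        import numpy as np

        a = np.random.default_rng(1).random((8, 8))

        step1 = np.fft.fft(a, axis=1)              # 1-D FFTs along the locally held dimension
        redistributed = step1.T.copy()             # stands in for the all-to-all exchange
        step2 = np.fft.fft(redistributed, axis=1)  # 1-D FFTs along the redistributed dimension

        assert np.allclose(step2.T, np.fft.fft2(a))
        print("two passes of 1-D FFTs reproduce the 2-D FFT")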

  11. Development of a computer program to support an efficient non-regression test of a thermal-hydraulic system code

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Jun Yeob; Jeong, Jae Jun [School of Mechanical Engineering, Pusan National University, Busan (Korea, Republic of); Suh, Jae Seung [System Engineering and Technology Co., Daejeon (Korea, Republic of); Kim, Kyung Doo [Korea Atomic Energy Research Institute, Daejeon (Korea, Republic of)

    2014-10-15

    During the development process of a thermal-hydraulic system code, a non-regression test (NRT) must be performed repeatedly in order to prevent software regression. The NRT process, however, is time-consuming and labor-intensive. Thus, automation of this process is an ideal solution. In this study, we have developed a program to support an efficient NRT for the SPACE code and demonstrated its usability. This results in a high degree of efficiency for code development. The program was developed using the Visual Basic for Applications and designed so that it can be easily customized for the NRT of other computer codes.

  12. Towards a Highly Efficient Meshfree Simulation of Non-Newtonian Free Surface Ice Flow: Application to the Haut Glacier d'Arolla

    Science.gov (United States)

    Shcherbakov, V.; Ahlkrona, J.

    2016-12-01

    In this work we develop a highly efficient meshfree approach to ice sheet modeling. Traditionally, mesh-based methods such as finite element methods are employed to simulate glacier and ice sheet dynamics. These methods are mature and well developed. However, despite numerous advantages, they suffer from drawbacks such as the need to remesh the computational domain every time it changes shape, which significantly complicates implementations on moving domains, and a costly assembly procedure for nonlinear problems. We introduce a novel meshfree approach that avoids these issues. The approach is built upon a radial basis function (RBF) method that, thanks to its meshfree nature, allows for efficient handling of moving margins and the free ice surface. RBF methods are also accurate and easy to implement. Since the formulation is stated in strong form, it allows for a substantial reduction of the computational cost associated with linear system assembly inside the nonlinear solver. We implement a global RBF method that defines an approximation on the entire computational domain. This method exhibits high accuracy, but suffers from the disadvantage that the coefficient matrix is dense, which reduces computational efficiency. To overcome this issue we also implement a localized RBF method that rests upon a partition of unity approach to subdivide the domain into several smaller subdomains. The radial basis function partition of unity method (RBF-PUM) inherits the high approximation quality of the global RBF method while resulting in a sparse system of equations, which substantially increases the computational efficiency. To demonstrate the usefulness of the RBF methods we model the velocity field of ice flow in the Haut Glacier d'Arolla. We assume that the flow is governed by the nonlinear Blatter-Pattyn equations. We test the methods for different basal conditions and for a free moving
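
    A minimal sketch of the global RBF building block referred to above, assuming a Gaussian kernel and an arbitrary shape parameter: scattered nodes lead to a dense collocation matrix, which is the cost that RBF-PUM later avoids. This interpolates a known test function rather than solving the Blatter-Pattyn equations.

        import numpy as np

        rng = np.random.default_rng(2)
        nodes = rng.random((200, 2))                      # scattered nodes in the unit square
        f = np.sin(np.pi * nodes[:, 0]) * nodes[:, 1]     # known test data

        eps = 8.0                                         # shape parameter (assumed)
        def kernel(x, y):
            r2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
            return np.exp(-(eps ** 2) * r2)               # Gaussian RBF

        A = kernel(nodes, nodes)                          # dense matrix: the global-RBF drawback
        w = np.linalg.solve(A, f)

        test = rng.random((500, 2))
        err = kernel(test, nodes) @ w - np.sin(np.pi * test[:, 0]) * test[:, 1]
        print("max interpolation error:", np.abs(err).max())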

  13. High-reliability computing for the smarter planet

    International Nuclear Information System (INIS)

    Quinn, Heather M.; Graham, Paul; Manuzzato, Andrea; Dehon, Andre

    2010-01-01

    The geometric rate of improvement of transistor size and integrated circuit performance, known as Moore's Law, has been an engine of growth for our economy, enabling new products and services, creating new value and wealth, increasing safety, and removing menial tasks from our daily lives. Affordable, highly integrated components have enabled both life-saving technologies and rich entertainment applications. Anti-lock brakes, insulin monitors, and GPS-enabled emergency response systems save lives. Cell phones, internet appliances, virtual worlds, realistic video games, and mp3 players enrich our lives and connect us together. Over the past 40 years of silicon scaling, the increasing capabilities of inexpensive computation have transformed our society through automation and ubiquitous communications. In this paper, we will present the concept of the smarter planet, how reliability failures affect current systems, and methods that can be used to increase the reliable adoption of new automation in the future. We will illustrate these issues using a number of different electronic devices in a couple of different scenarios. Recently IBM has been presenting the idea of a 'smarter planet.' In smarter planet documents, IBM discusses increased computer automation of roadways, banking, healthcare, and infrastructure, as automation could create more efficient systems. A necessary component of the smarter planet concept is to ensure that these new systems have very high reliability. Even extremely rare reliability problems can easily escalate to problematic scenarios when implemented at very large scales. For life-critical systems, such as automobiles, infrastructure, medical implantables, and avionic systems, unmitigated failures could be dangerous. As more automation moves into these types of critical systems, reliability failures will need to be managed. As computer automation continues to increase in our society, the need for greater radiation reliability is necessary

  14. High-reliability computing for the smarter planet

    Energy Technology Data Exchange (ETDEWEB)

    Quinn, Heather M [Los Alamos National Laboratory; Graham, Paul [Los Alamos National Laboratory; Manuzzato, Andrea [UNIV OF PADOVA; Dehon, Andre [UNIV OF PENN; Carter, Nicholas [INTEL CORPORATION

    2010-01-01

    The geometric rate of improvement of transistor size and integrated circuit performance, known as Moore's Law, has been an engine of growth for our economy, enabling new products and services, creating new value and wealth, increasing safety, and removing menial tasks from our daily lives. Affordable, highly integrated components have enabled both life-saving technologies and rich entertainment applications. Anti-lock brakes, insulin monitors, and GPS-enabled emergency response systems save lives. Cell phones, internet appliances, virtual worlds, realistic video games, and mp3 players enrich our lives and connect us together. Over the past 40 years of silicon scaling, the increasing capabilities of inexpensive computation have transformed our society through automation and ubiquitous communications. In this paper, we will present the concept of the smarter planet, how reliability failures affect current systems, and methods that can be used to increase the reliable adoption of new automation in the future. We will illustrate these issues using a number of different electronic devices in a couple of different scenarios. Recently IBM has been presenting the idea of a 'smarter planet.' In smarter planet documents, IBM discusses increased computer automation of roadways, banking, healthcare, and infrastructure, as automation could create more efficient systems. A necessary component of the smarter planet concept is to ensure that these new systems have very high reliability. Even extremely rare reliability problems can easily escalate to problematic scenarios when implemented at very large scales. For life-critical systems, such as automobiles, infrastructure, medical implantables, and avionic systems, unmitigated failures could be dangerous. As more automation moves into these types of critical systems, reliability failures will need to be managed. As computer automation continues to increase in our society, the need for greater radiation reliability is

  15. Real-time computer treatment of THz passive device images with the high image quality

    Science.gov (United States)

    Trofimov, Vyacheslav A.; Trofimov, Vladislav V.

    2012-06-01

    We demonstrate real-time computer code that significantly improves the quality of images captured by a passive THz imaging system. The code is not restricted to a single passive THz device: it can be applied to other such devices and to active THz imaging systems as well. We applied the code to computer processing of images captured by four passive THz imaging devices manufactured by different companies; images produced by different devices usually require different spatial filters. The current version of the code processes more than one image per second for a THz image with more than 5000 pixels and 24-bit number representation. Processing a single THz image produces about 20 output images simultaneously, corresponding to the various spatial filters. The code allows the number of pixels in processed images to be increased without noticeable reduction of image quality, and its performance can be increased many times by using parallel image-processing algorithms. We developed original spatial filters that allow one to see objects smaller than 2 cm in images, produced by passive THz devices, of objects hidden under opaque clothes. For images with high noise, we developed an approach that suppresses the noise during computer processing and yields good image quality. To illustrate the efficiency of the approach, we demonstrate the detection of liquid explosives, ordinary explosives, a knife, a pistol, a metal plate, a CD, ceramics, chocolate and other objects hidden under opaque clothes. The results demonstrate the high efficiency of our approach for the detection of hidden objects and are very promising for security applications.
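
    A generic sketch of the kind of spatial-filter processing described above, applied to a synthetic noisy frame; the median/unsharp-mask combination and all numbers are placeholders, since the authors' filters are their own designs.

        import numpy as np
        from scipy import ndimage

        rng = np.random.default_rng(3)
        frame = rng.normal(0.0, 0.2, (120, 160))          # synthetic noisy background
        frame[50:70, 70:90] += 1.0                        # hypothetical concealed object

        denoised = ndimage.median_filter(frame, size=5)   # suppress detector noise
        blurred = ndimage.gaussian_filter(denoised, sigma=3)
        enhanced = denoised + 1.5 * (denoised - blurred)  # unsharp-mask edge enhancement

        print("object contrast before:", frame[60, 80] - frame.mean())
        print("object contrast after: ", enhanced[60, 80] - enhanced.mean())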

  16. Implementation of a computationally efficient least-squares algorithm for highly under-determined three-dimensional diffuse optical tomography problems.

    Science.gov (United States)

    Yalavarthy, Phaneendra K; Lynch, Daniel R; Pogue, Brian W; Dehghani, Hamid; Paulsen, Keith D

    2008-05-01

    Three-dimensional (3D) diffuse optical tomography is known to be a nonlinear, ill-posed and sometimes under-determined problem, where regularization is added to the minimization to allow convergence to a unique solution. In this work, a generalized least-squares (GLS) minimization method was implemented, which employs weight matrices for both data-model misfit and optical properties to include their variances and covariances, using a computationally efficient scheme. This allows inversion of a matrix that is of a dimension dictated by the number of measurements, instead of by the number of imaging parameters. This increases the computation speed up to four times per iteration in most of the under-determined 3D imaging problems. An analytic derivation, using the Sherman-Morrison-Woodbury identity, is shown for this efficient alternative form and it is proven to be equivalent, not only analytically, but also numerically. Equivalent alternative forms for other minimization methods, like Levenberg-Marquardt (LM) and Tikhonov, are also derived. Three-dimensional reconstruction results indicate that the poor recovery of quantitatively accurate values in 3D optical images can also be a characteristic of the reconstruction algorithm, along with the target size. Interestingly, usage of GLS reconstruction methods reduces error in the periphery of the image, as expected, and improves by 20% the ability to quantify local interior regions in terms of the recovered optical contrast, as compared to LM methods. Characterization of detector photo-multiplier tubes noise has enabled the use of the GLS method for reconstructing experimental data and showed a promise for better quantification of target in 3D optical imaging. Use of these new alternative forms becomes effective when the ratio of the number of imaging property parameters exceeds the number of measurements by a factor greater than 2.
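
    A numerical check of the dimension-reduction idea above in its simplest Levenberg-Marquardt-like form, assuming an under-determined Jacobian with far fewer measurements than voxels; the paper's GLS form additionally carries data and parameter covariance weights, which are omitted here.

        import numpy as np

        rng = np.random.default_rng(4)
        m, n, lam = 40, 2000, 1e-2                 # measurements, voxels, regularization (assumed)
        J = rng.normal(size=(m, n))                # Jacobian of an under-determined problem
        r = rng.normal(size=m)                     # data-model misfit

        big = np.linalg.solve(J.T @ J + lam * np.eye(n), J.T @ r)    # n-by-n system
        small = J.T @ np.linalg.solve(J @ J.T + lam * np.eye(m), r)  # equivalent m-by-m system

        print("max difference between the two forms:", np.abs(big - small).max())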

  17. Empirical Analysis of High Efficient Remote Cloud Data Center Backup Using HBase and Cassandra

    OpenAIRE

    Chang, Bao Rong; Tsai, Hsiu-Fen; Chen, Chia-Yen; Guo, Cin-Long

    2015-01-01

    HBase, a master-slave framework, and Cassandra, a peer-to-peer (P2P) framework, are the two most commonly used large-scale distributed NoSQL databases, especially applicable to the cloud computing with high flexibility and scalability and the ease of big data processing. Regarding storage structure, different structure adopts distinct backup strategy to reduce the risks of data loss. This paper aims to realize high efficient remote cloud data center backup using HBase and Cassandra, and in or...

  18. High-performance computing — an overview

    Science.gov (United States)

    Marksteiner, Peter

    1996-08-01

    An overview of high-performance computing (HPC) is given. Different types of computer architectures used in HPC are discussed: vector supercomputers, high-performance RISC processors, various parallel computers like symmetric multiprocessors, workstation clusters, massively parallel processors. Software tools and programming techniques used in HPC are reviewed: vectorizing compilers, optimization and vector tuning, optimization for RISC processors; parallel programming techniques like shared-memory parallelism, message passing and data parallelism; and numerical libraries.

  19. High Efficiency Power Converter for Low Voltage High Power Applications

    DEFF Research Database (Denmark)

    Nymand, Morten

    The topic of this thesis is the design of high efficiency power electronic dc-to-dc converters for high-power, low-input-voltage to high-output-voltage applications. These converters are increasingly required for emerging sustainable energy systems such as fuel cell, battery or photo voltaic based......, and remote power generation for light towers, camper vans, boats, beacons, and buoys etc. A review of current state-of-the-art is presented. The best performing converters achieve moderately high peak efficiencies at high input voltage and medium power level. However, system dimensioning and cost are often...

  20. High-Speed Photonic Reservoir Computing Using a Time-Delay-Based Architecture: Million Words per Second Classification

    Directory of Open Access Journals (Sweden)

    Laurent Larger

    2017-02-01

    Reservoir computing, originally referred to as an echo state network or a liquid state machine, is a brain-inspired paradigm for processing temporal information. It involves learning a “read-out” interpretation for nonlinear transients developed by high-dimensional dynamics when the latter is excited by the information signal to be processed. This novel computational paradigm is derived from recurrent neural network and machine learning techniques. It has recently been implemented in photonic hardware for a dynamical system, which opens the path to ultrafast brain-inspired computing. We report on a novel implementation involving an electro-optic phase-delay dynamics designed with off-the-shelf optoelectronic telecom devices, thus providing the targeted wide bandwidth. Computational efficiency is demonstrated experimentally with speech-recognition tasks. State-of-the-art speed performances reach one million words per second, with very low word error rate. In addition to record-speed processing, our investigations have revealed computing-efficiency improvements through yet-unexplored temporal-information-processing techniques, such as simultaneous multisample injection and pitched sampling at the read-out compared to information “write-in”.
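
    A toy software analogue (not the photonic hardware) of the reservoir-computing scheme: a fixed random recurrent network provides the nonlinear transients and only a linear read-out is trained, here by ridge regression on a simple delay-memory task standing in for speech recognition. Reservoir size, spectral radius and regularization are assumed values.

        import numpy as np

        rng = np.random.default_rng(5)
        N, T, delay = 200, 2000, 5                        # reservoir size, samples, memory task
        W = rng.normal(size=(N, N))
        W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))   # spectral radius below one
        w_in = rng.normal(size=N)

        u = rng.uniform(-1, 1, T)                         # input sequence
        x = np.zeros(N)
        states = np.zeros((T, N))
        for t in range(T):
            x = np.tanh(W @ x + w_in * u[t])              # fixed nonlinear reservoir
            states[t] = x

        X, y = states[delay:], u[:-delay]                 # target: input delayed by 5 steps
        w_out = np.linalg.solve(X.T @ X + 1e-6 * np.eye(N), X.T @ y)  # trained read-out only
        nmse = np.mean((X @ w_out - y) ** 2) / np.var(y)
        print("read-out NMSE:", nmse)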

  1. Implementing Molecular Dynamics for Hybrid High Performance Computers - 1. Short Range Forces

    International Nuclear Information System (INIS)

    Brown, W. Michael; Wang, Peng; Plimpton, Steven J.; Tharrington, Arnold N.

    2011-01-01

    The use of accelerators such as general-purpose graphics processing units (GPGPUs) have become popular in scientific computing applications due to their low cost, impressive floating-point capabilities, high memory bandwidth, and low electrical power requirements. Hybrid high performance computers, machines with more than one type of floating-point processor, are now becoming more prevalent due to these advantages. In this work, we discuss several important issues in porting a large molecular dynamics code for use on parallel hybrid machines - (1) choosing a hybrid parallel decomposition that works on central processing units (CPUs) with distributed memory and accelerator cores with shared memory, (2) minimizing the amount of code that must be ported for efficient acceleration, (3) utilizing the available processing power from both many-core CPUs and accelerators, and (4) choosing a programming model for acceleration. We present our solution to each of these issues for short-range force calculation in the molecular dynamics package LAMMPS. We describe algorithms for efficient short range force calculation on hybrid high performance machines. We describe a new approach for dynamic load balancing of work between CPU and accelerator cores. We describe the Geryon library that allows a single code to compile with both CUDA and OpenCL for use on a variety of accelerators. Finally, we present results on a parallel test cluster containing 32 Fermi GPGPUs and 180 CPU cores.

  2. A Cloud Computing-Enabled Spatio-Temporal Cyber-Physical Information Infrastructure for Efficient Soil Moisture Monitoring

    Directory of Open Access Journals (Sweden)

    Lianjie Zhou

    2016-06-01

    Comprehensive surface soil moisture (SM) monitoring is a vital task in precision agriculture applications. SM monitoring includes remote sensing imagery monitoring and in situ sensor-based observational monitoring. Cloud computing can increase computational efficiency enormously. A geographical web service was developed to assist in agronomic decision making, and this tool can be scaled to any location and crop. By integrating cloud computing and the web service-enabled information infrastructure, this study uses the cloud computing-enabled spatio-temporal cyber-physical infrastructure (CESCI) to provide an efficient solution for soil moisture monitoring in precision agriculture. On the server side of CESCI, diverse Open Geospatial Consortium web services work closely with each other. Hubei Province, located on the Jianghan Plain in central China, is selected as the remote sensing study area in the experiment. The Baoxie scientific experimental field in Wuhan City is selected as the in situ sensor study area. The results show that the proposed method enhances the efficiency of remote sensing imagery mapping and in situ soil moisture interpolation. In addition, the proposed method is compared to other existing precision agriculture infrastructures. In this comparison, the proposed infrastructure performs soil moisture mapping in Hubei Province in 1.4 min and near real-time in situ soil moisture interpolation in an efficient manner. Moreover, an enhanced performance monitoring method can help to reduce costs in precision agriculture monitoring, as well as increasing agricultural productivity and farmers’ net-income.

  3. STEMsalabim: A high-performance computing cluster friendly code for scanning transmission electron microscopy image simulations of thin specimens

    International Nuclear Information System (INIS)

    Oelerich, Jan Oliver; Duschek, Lennart; Belz, Jürgen; Beyer, Andreas; Baranovskii, Sergei D.; Volz, Kerstin

    2017-01-01

    Highlights: • We present STEMsalabim, a modern implementation of the multislice algorithm for simulation of STEM images. • Our package is highly parallelizable on high-performance computing clusters, combining shared and distributed memory architectures. • With STEMsalabim, computationally and memory expensive STEM image simulations can be carried out within reasonable time. - Abstract: We present a new multislice code for the computer simulation of scanning transmission electron microscope (STEM) images based on the frozen lattice approximation. Unlike existing software packages, the code is optimized to perform well on highly parallelized computing clusters, combining distributed and shared memory architectures. This enables efficient calculation of large lateral scanning areas of the specimen within the frozen lattice approximation and fine-grained sweeps of parameter space.

  4. STEMsalabim: A high-performance computing cluster friendly code for scanning transmission electron microscopy image simulations of thin specimens

    Energy Technology Data Exchange (ETDEWEB)

    Oelerich, Jan Oliver, E-mail: jan.oliver.oelerich@physik.uni-marburg.de; Duschek, Lennart; Belz, Jürgen; Beyer, Andreas; Baranovskii, Sergei D.; Volz, Kerstin

    2017-06-15

    Highlights: • We present STEMsalabim, a modern implementation of the multislice algorithm for simulation of STEM images. • Our package is highly parallelizable on high-performance computing clusters, combining shared and distributed memory architectures. • With STEMsalabim, computationally and memory expensive STEM image simulations can be carried out within reasonable time. - Abstract: We present a new multislice code for the computer simulation of scanning transmission electron microscope (STEM) images based on the frozen lattice approximation. Unlike existing software packages, the code is optimized to perform well on highly parallelized computing clusters, combining distributed and shared memory architectures. This enables efficient calculation of large lateral scanning areas of the specimen within the frozen lattice approximation and fine-grained sweeps of parameter space.

  5. Efficient quantum computing using coherent photon conversion.

    Science.gov (United States)

    Langford, N K; Ramelow, S; Prevedel, R; Munro, W J; Milburn, G J; Zeilinger, A

    2011-10-12

    Single photons are excellent quantum information carriers: they were used in the earliest demonstrations of entanglement and in the production of the highest-quality entanglement reported so far. However, current schemes for preparing, processing and measuring them are inefficient. For example, down-conversion provides heralded, but randomly timed, single photons, and linear optics gates are inherently probabilistic. Here we introduce a deterministic process--coherent photon conversion (CPC)--that provides a new way to generate and process complex, multiquanta states for photonic quantum information applications. The technique uses classically pumped nonlinearities to induce coherent oscillations between orthogonal states of multiple quantum excitations. One example of CPC, based on a pumped four-wave-mixing interaction, is shown to yield a single, versatile process that provides a full set of photonic quantum processing tools. This set satisfies the DiVincenzo criteria for a scalable quantum computing architecture, including deterministic multiqubit entanglement gates (based on a novel form of photon-photon interaction), high-quality heralded single- and multiphoton states free from higher-order imperfections, and robust, high-efficiency detection. It can also be used to produce heralded multiphoton entanglement, create optically switchable quantum circuits and implement an improved form of down-conversion with reduced higher-order effects. Such tools are valuable building blocks for many quantum-enabled technologies. Finally, using photonic crystal fibres we experimentally demonstrate quantum correlations arising from a four-colour nonlinear process suitable for CPC and use these measurements to study the feasibility of reaching the deterministic regime with current technology. Our scheme, which is based on interacting bosonic fields, is not restricted to optical systems but could also be implemented in optomechanical, electromechanical and superconducting

  6. Integrated computer network high-speed parallel interface

    International Nuclear Information System (INIS)

    Frank, R.B.

    1979-03-01

    As the number and variety of computers within Los Alamos Scientific Laboratory's Central Computer Facility grow, the need for a standard, high-speed intercomputer interface has become more apparent. This report details the development of a High-Speed Parallel Interface from conceptual through implementation stages to meet current and future needs for large-scale network computing within the Integrated Computer Network. 4 figures

  7. The path toward HEP High Performance Computing

    International Nuclear Information System (INIS)

    Apostolakis, John; Brun, René; Gheata, Andrei; Wenzel, Sandro; Carminati, Federico

    2014-01-01

    High Energy Physics (HEP) code has been known for making poor use of high performance computing architectures. Efforts to optimise HEP code on vector and RISC architectures have yielded limited results, and recent studies have shown that, on modern architectures, it achieves between 10% and 50% of peak performance. Although several successful attempts have been made to port selected codes to GPUs, no major HEP code suite has a 'High Performance' implementation. With the LHC undergoing a major upgrade and a number of challenging experiments on the drawing board, HEP can no longer neglect the less-than-optimal performance of its code and has to try to make the best use of the hardware. This activity is one of the foci of the SFT group at CERN, which hosts, among others, the ROOT and Geant4 projects. The activity of the experiments is shared and coordinated via a Concurrency Forum, where experience in optimising HEP code is presented and discussed. Another activity is the Geant-V project, centred on the development of a high-performance prototype for particle transport. Achieving a good concurrency level on the emerging parallel architectures without a complete redesign of the framework can only be done by parallelizing at event level, or with a much larger effort at track level. Apart from the shareable data structures, this typically implies a multiplication factor in memory consumption compared to the single-threaded version, together with sub-optimal handling of event-processing tails. Besides this, the low-level instruction pipelining of modern processors cannot be used efficiently to speed up the program. We have implemented a framework that allows scheduling vectors of particles to an arbitrary number of computing resources in a fine-grained parallel approach. The talk will review the current optimisation activities within the SFT group with a particular emphasis on the development perspectives towards a simulation framework able to profit

  8. Optimisation of the energy efficiency of bread-baking ovens using a combined experimental and computational approach

    International Nuclear Information System (INIS)

    Khatir, Zinedine; Paton, Joe; Thompson, Harvey; Kapur, Nik; Toropov, Vassili

    2013-01-01

    Highlights: ► A scientific framework for optimising oven operating conditions is presented. ► Experiments measuring local convective heat transfer coefficient are undertaken. ► An energy efficiency model is developed with experimentally calibrated CFD analysis. ► Designing ovens with optimum heat transfer coefficients reduces energy use. ► Results demonstrate a strong case to design and manufacture energy optimised ovens. - Abstract: Changing legislation and rising energy costs are bringing the need for efficient baking processes into much sharper focus. High-speed air impingement bread-baking ovens are complex systems using air flow to transfer heat to the product. In this paper, computational fluid dynamics (CFD) is combined with experimental analysis to develop a rigorous scientific framework for the rapid generation of forced convection oven designs. A design parameterisation of a three-dimensional generic oven model is carried out for a wide range of oven sizes and flow conditions to optimise desirable features such as temperature uniformity throughout the oven, energy efficiency and manufacturability. Coupled with the computational model, a series of experiments measuring the local convective heat transfer coefficient (h_c) are undertaken. The facility used for the heat transfer experiments is representative of a scaled-down production oven where the air temperature and velocity as well as important physical constraints such as nozzle dimensions and nozzle-to-surface distance can be varied. An efficient energy model is developed using a CFD analysis calibrated using experimentally determined inputs. Results from a range of oven designs are presented together with ensuing energy usage and savings
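
    A worked example of the measured quantity h_c, using Newton's law of cooling h_c = q'' / (T_jet - T_surface); all numbers are illustrative and are not the paper's data.

        heat_flux = 4500.0     # measured surface heat flux, W/m^2 (illustrative)
        t_jet = 210.0          # impinging air temperature, deg C (illustrative)
        t_surface = 150.0      # product surface temperature, deg C (illustrative)

        h_c = heat_flux / (t_jet - t_surface)
        print("h_c =", round(h_c, 1), "W/(m^2 K)")   # 75.0 for these example values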

  9. High-efficiency and low-loss gallium nitride dielectric metasurfaces for nanophotonics at visible wavelengths

    Science.gov (United States)

    Emani, Naresh Kumar; Khaidarov, Egor; Paniagua-Domínguez, Ramón; Fu, Yuan Hsing; Valuckas, Vytautas; Lu, Shunpeng; Zhang, Xueliang; Tan, Swee Tiam; Demir, Hilmi Volkan; Kuznetsov, Arseniy I.

    2017-11-01

    The dielectric nanophotonics research community is currently exploring transparent material platforms (e.g., TiO2, Si3N4, and GaP) to realize compact high efficiency optical devices at visible wavelengths. Efficient visible-light operation is key to integrating atomic quantum systems for future quantum computing. Gallium nitride (GaN), a III-V semiconductor which is highly transparent at visible wavelengths, is a promising material choice for active, nonlinear, and quantum nanophotonic applications. Here, we present the design and experimental realization of high efficiency beam deflecting and polarization beam splitting metasurfaces consisting of GaN nanostructures etched on the GaN epitaxial substrate itself. We demonstrate a polarization insensitive beam deflecting metasurface with 64% and 90% absolute and relative efficiencies. Further, a polarization beam splitter with an extinction ratio of 8.6/1 (6.2/1) and a transmission of 73% (67%) for p-polarization (s-polarization) is implemented to demonstrate the broad functionality that can be realized on this platform. The metasurfaces in our work exhibit a broadband response in the blue wavelength range of 430-470 nm. This nanophotonic platform of GaN shows the way to off- and on-chip nonlinear and quantum photonic devices working efficiently at blue emission wavelengths common to many atomic quantum emitters such as Ca+ and Sr+ ions.

  10. Graphics processor efficiency for realization of rapid tabular computations

    International Nuclear Information System (INIS)

    Dudnik, V.A.; Kudryavtsev, V.I.; Us, S.A.; Shestakov, M.V.

    2016-01-01

    The capabilities of graphics processing units (GPUs) and central processing units (CPUs) have been investigated for the realization of fast-calculation algorithms based on tabulated functions. The implementation of tabulated functions is exemplified on GPU- and CPU-based processors, and the operating efficiencies of the GPU and CPU employed for tabular calculations are compared under different conditions of use. Recommendations are formulated for using graphics and central processors to speed up scientific and engineering computations through the use of tabulated functions
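
    A simple CPU-side illustration of the tabulation idea being benchmarked above: a function is precomputed on a grid once and many queries are then answered by interpolation instead of direct evaluation. The function, grid resolution and query count are arbitrary choices; the paper's comparison is between GPU and CPU hardware for this pattern.

        import time
        import numpy as np

        def expensive(x):                       # stand-in for a costly special function
            return np.sin(x) * np.exp(-0.1 * x) + np.log1p(x * x)

        grid = np.linspace(0.0, 10.0, 4096)     # tabulation nodes (assumed resolution)
        table = expensive(grid)                 # built once, reused for every query

        queries = np.random.default_rng(6).uniform(0.0, 10.0, 2_000_000)

        t0 = time.perf_counter(); direct = expensive(queries); t1 = time.perf_counter()
        t2 = time.perf_counter(); tabbed = np.interp(queries, grid, table); t3 = time.perf_counter()

        print("direct evaluation:", round(t1 - t0, 3), "s   table lookup:", round(t3 - t2, 3), "s")
        print("max interpolation error:", np.abs(direct - tabbed).max())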

  11. Efficient quantum algorithm for computing n-time correlation functions.

    Science.gov (United States)

    Pedernales, J S; Di Candia, R; Egusquiza, I L; Casanova, J; Solano, E

    2014-07-11

    We propose a method for computing n-time correlation functions of arbitrary spinorial, fermionic, and bosonic operators, consisting of an efficient quantum algorithm that encodes these correlations in an initially added ancillary qubit for probe and control tasks. For spinorial and fermionic systems, the reconstruction of arbitrary n-time correlation functions requires the measurement of two ancilla observables, while for bosonic variables time derivatives of the same observables are needed. Finally, we provide examples applicable to different quantum platforms in the frame of the linear response theory.

  12. A-VCI: A flexible method to efficiently compute vibrational spectra

    Science.gov (United States)

    Odunlami, Marc; Le Bris, Vincent; Bégué, Didier; Baraille, Isabelle; Coulaud, Olivier

    2017-06-01

    The adaptive vibrational configuration interaction algorithm has been introduced as a new method to efficiently reduce the dimension of the set of basis functions used in a vibrational configuration interaction process. It is based on the construction of nested bases for the discretization of the Hamiltonian operator according to a theoretical criterion that ensures the convergence of the method. In the present work, the Hamiltonian is written as a sum of products of operators. The purpose of this paper is to study the properties and outline the performance details of the main steps of the algorithm. New parameters have been incorporated to increase flexibility, and their influence has been thoroughly investigated. The robustness and reliability of the method are demonstrated for the computation of the vibrational spectrum up to 3000 cm-1 of a widely studied 6-atom molecule (acetonitrile). Our results are compared to the most accurate up to date computation; we also give a new reference calculation for future work on this system. The algorithm has also been applied to a more challenging 7-atom molecule (ethylene oxide). The computed spectrum up to 3200 cm-1 is the most accurate computation that exists today on such systems.

  13. Toward efficient computation of the expected relative entropy for nonlinear experimental design

    International Nuclear Information System (INIS)

    Coles, Darrell; Prange, Michael

    2012-01-01

    The expected relative entropy between prior and posterior model-parameter distributions is a Bayesian objective function in experimental design theory that quantifies the expected gain in information of an experiment relative to a previous state of knowledge. The expected relative entropy is a preferred measure of experimental quality because it can handle nonlinear data-model relationships, an important fact due to the ubiquity of nonlinearity in science and engineering and its effects on post-inversion parameter uncertainty. This objective function does not necessarily yield experiments that mediate well-determined systems, but, being a Bayesian quality measure, it rigorously accounts for prior information which constrains model parameters that may be only weakly constrained by the optimized dataset. Historically, use of the expected relative entropy has been limited by the computing and storage requirements associated with high-dimensional numerical integration. Herein, a bifocal algorithm is developed that makes these computations more efficient. The algorithm is demonstrated on a medium-sized problem of sampling relaxation phenomena and on a large problem of source–receiver selection for a 2D vertical seismic profile. The method is memory intensive but workarounds are discussed. (paper)
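
    A sketch of a nested Monte Carlo estimate of the expected relative entropy (expected information gain) for a toy design problem, assuming a one-dimensional Gaussian measurement whose sensitivity is scaled by the design variable; this is not the paper's bifocal algorithm, only the objective it accelerates.

        import numpy as np

        rng = np.random.default_rng(7)
        sigma = 1.0                                   # measurement noise std (assumed)

        def log_lik(y, theta, d):
            return -0.5 * ((y - d * theta) / sigma) ** 2 - 0.5 * np.log(2 * np.pi * sigma ** 2)

        def expected_information_gain(d, n_outer=2000, n_inner=2000):
            theta = rng.normal(size=n_outer)          # draws from the N(0, 1) prior
            y = d * theta + sigma * rng.normal(size=n_outer)
            inner = rng.normal(size=n_inner)          # fresh prior draws for the evidence
            ll = log_lik(y[:, None], inner[None, :], d)
            log_evidence = np.logaddexp.reduce(ll, axis=1) - np.log(n_inner)
            return np.mean(log_lik(y, theta, d) - log_evidence)

        for d in (0.5, 1.0, 2.0):                     # larger |d| means a more sensitive experiment
            print("design d =", d, " estimated expected information gain =",
                  round(expected_information_gain(d), 3))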

  14. DOE research in utilization of high-performance computers

    International Nuclear Information System (INIS)

    Buzbee, B.L.; Worlton, W.J.; Michael, G.; Rodrigue, G.

    1980-12-01

    Department of Energy (DOE) and other Government research laboratories depend on high-performance computer systems to accomplish their programatic goals. As the most powerful computer systems become available, they are acquired by these laboratories so that advances can be made in their disciplines. These advances are often the result of added sophistication to numerical models whose execution is made possible by high-performance computer systems. However, high-performance computer systems have become increasingly complex; consequently, it has become increasingly difficult to realize their potential performance. The result is a need for research on issues related to the utilization of these systems. This report gives a brief description of high-performance computers, and then addresses the use of and future needs for high-performance computers within DOE, the growing complexity of applications within DOE, and areas of high-performance computer systems warranting research. 1 figure

  15. Parameters that affect parallel processing for computational electromagnetic simulation codes on high performance computing clusters

    Science.gov (United States)

    Moon, Hongsik

    changing computer hardware platforms in order to provide fast, accurate and efficient solutions to large, complex electromagnetic problems. The research in this dissertation proves that the performance of parallel code is intimately related to the configuration of the computer hardware and can be maximized for different hardware platforms. To benchmark and optimize the performance of parallel CEM software, a variety of large, complex projects are created and executed on a variety of computer platforms. The computer platforms used in this research are detailed in this dissertation. The projects run as benchmarks are also described in detail and results are presented. The parameters that affect parallel CEM software on High Performance Computing Clusters (HPCC) are investigated. This research demonstrates methods to maximize the performance of parallel CEM software code.

  16. Enabling the ATLAS Experiment at the LHC for High Performance Computing

    CERN Document Server

    AUTHOR|(CDS)2091107; Ereditato, Antonio

    In this thesis, I studied the feasibility of running computer data analysis programs from the Worldwide LHC Computing Grid, in particular large-scale simulations of the ATLAS experiment at the CERN LHC, on current general purpose High Performance Computing (HPC) systems. An approach for integrating HPC systems into the Grid is proposed, which has been implemented and tested on the "Todi" HPC machine at the Swiss National Supercomputing Centre (CSCS). Over the course of the test, more than 500,000 CPU-hours of processing time have been provided to ATLAS, which is roughly equivalent to the combined computing power of the two ATLAS clusters at the University of Bern. This showed that current HPC systems can be used to efficiently run large-scale simulations of the ATLAS detector and of the detected physics processes. As a first conclusion of my work, one can argue that, in perspective, running large-scale tasks on a few large machines might be more cost-effective than running on relatively small dedicated com...

  17. High Efficiency Room Air Conditioner

    Energy Technology Data Exchange (ETDEWEB)

    Bansal, Pradeep [Oak Ridge National Lab. (ORNL), Oak Ridge, TN (United States)

    2015-01-01

    This project was undertaken as a CRADA project between UT-Battelle and General Electric Company and was funded by the Department of Energy to design and develop a high efficiency room air conditioner. A number of novel elements were investigated to improve the energy efficiency of a state-of-the-art WAC with a base capacity of 10,000 BTU/h. One of the major modifications was made by downgrading its capacity from 10,000 BTU/hr to 8,000 BTU/hr, replacing the original compressor with a lower capacity (8,000 BTU/hr) but higher efficiency compressor having an EER of 9.7 as compared with 9.3 for the original compressor. However, all heat exchangers from the original unit were retained to provide higher EER. The other major modifications included: (i) the AC fan motor was replaced by a brushless high efficiency ECM motor along with its fan housing, (ii) the capillary tube was replaced with a needle valve to better control the refrigerant flow and refrigerant set points, and (iii) the unit was tested with a drop-in environmentally friendly binary mixture of R32 (90% molar concentration)/R125 (10% molar concentration). The WAC was tested in the environmental chambers at ORNL as per the design rating conditions of AHAM/ASHRAE (outdoor: 95°F and 40% RH; indoor: 80°F and 51.5% RH). All these modifications resulted in enhancing the EER of the WAC by up to 25%.

  18. An efficient computational method for global sensitivity analysis and its application to tree growth modelling

    International Nuclear Information System (INIS)

    Wu, Qiong-Li; Cournède, Paul-Henry; Mathieu, Amélie

    2012-01-01

    Global sensitivity analysis has a key role to play in the design and parameterisation of functional–structural plant growth models which combine the description of plant structural development (organogenesis and geometry) and functional growth (biomass accumulation and allocation). We are particularly interested in this study in Sobol's method, which decomposes the variance of the output of interest into terms due to individual parameters but also to interactions between parameters. Such information is crucial for systems with potentially high levels of non-linearity and interactions between processes, like plant growth. However, the computation of Sobol's indices relies on Monte Carlo sampling and re-sampling, whose costs can be very high, especially when model evaluation is also expensive, as for tree models. In this paper, we thus propose a new method to compute Sobol's indices, inspired by Homma–Saltelli, which makes slightly better use of model evaluations, and then derive, for this generic type of computational method, an estimator of the error of the sensitivity indices with respect to the sampling size. It allows detailed control of the balance between accuracy and computing time. Numerical tests on a simple non-linear model are convincing and the method is finally applied to a functional–structural model of tree growth, GreenLab, whose particularity is the strong level of interaction between plant functioning and organogenesis. - Highlights: ► We study global sensitivity analysis in the context of functional–structural plant modelling. ► A new estimator based on the Homma–Saltelli method is proposed to compute Sobol indices, based on a more balanced re-sampling strategy. ► The estimation accuracy of sensitivity indices for a class of Sobol estimators can be controlled by error analysis. ► The proposed algorithm is implemented efficiently to compute Sobol indices for a complex tree growth model.
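
    As context for the estimator discussed above, the sketch below implements the standard Saltelli-style Monte Carlo estimate of first-order Sobol indices on the Ishigami benchmark function; it is not the authors' improved Homma–Saltelli variant, and the sample size and test function are illustrative choices only.

```python
import numpy as np

def sobol_first_order(model, n_params, n_samples=4096, rng=None):
    """Standard Saltelli-style Monte Carlo estimate of first-order Sobol indices.

    model: vectorized callable mapping an (N, d) array of inputs in [0, 1]^d to N outputs.
    """
    rng = np.random.default_rng(rng)
    A = rng.random((n_samples, n_params))    # two independent sample matrices
    B = rng.random((n_samples, n_params))
    fA, fB = model(A), model(B)
    var = np.var(np.concatenate([fA, fB]), ddof=1)
    indices = np.empty(n_params)
    for i in range(n_params):
        ABi = A.copy()
        ABi[:, i] = B[:, i]                  # replace column i of A by column i of B
        fABi = model(ABi)
        # Saltelli-type estimator of V_i = Var(E[f | x_i]), normalized by the total variance
        indices[i] = np.mean(fB * (fABi - fA)) / var
    return indices

# Hypothetical test: the Ishigami function, a standard sensitivity-analysis benchmark
def ishigami(X):
    x = -np.pi + 2 * np.pi * X               # map [0, 1]^3 to [-pi, pi]^3
    return np.sin(x[:, 0]) + 7 * np.sin(x[:, 1]) ** 2 + 0.1 * x[:, 2] ** 4 * np.sin(x[:, 0])

print(sobol_first_order(ishigami, n_params=3, rng=42))   # roughly [0.31, 0.44, 0.00]
```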

  19. Computationally Efficient 2D DOA Estimation for L-Shaped Array with Unknown Mutual Coupling

    Directory of Open Access Journals (Sweden)

    Yang-Yang Dong

    2018-01-01

    Full Text Available Although the L-shaped array can provide good angle estimation performance and is easy to implement, its two-dimensional (2D) direction-of-arrival (DOA) performance degrades greatly in the presence of mutual coupling. To deal with the mutual coupling effect, a novel 2D DOA estimation method for the L-shaped array with low computational complexity is developed in this paper. First, we generalize the conventional mutual coupling model for the L-shaped array and compensate the mutual coupling blindly by sacrificing a few sensors as auxiliary elements. Then we apply the propagator method twice to mitigate the effect of strong source signal correlation. Finally, the estimates of azimuth and elevation angles are obtained simultaneously, without pair matching, via the complex eigenvalue technique. Compared with the existing methods, the proposed method is computationally efficient, requiring no spectrum search or polynomial rooting, and also achieves fine angle estimation performance for highly correlated source signals. Theoretical analysis and simulation results have demonstrated the effectiveness of the proposed method.

  20. High Performance Computing in Science and Engineering '16 : Transactions of the High Performance Computing Center, Stuttgart (HLRS) 2016

    CERN Document Server

    Kröner, Dietmar; Resch, Michael

    2016-01-01

    This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS) in 2016. The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.

  1. High-performance computing on the Intel Xeon Phi how to fully exploit MIC architectures

    CERN Document Server

    Wang, Endong; Shen, Bo; Zhang, Guangyong; Lu, Xiaowei; Wu, Qing; Wang, Yajuan

    2014-01-01

    The aim of this book is to explain to high-performance computing (HPC) developers how to utilize the Intel® Xeon Phi™ series products efficiently. To that end, it introduces some computing grammar, programming technology and optimization methods for using many-integrated-core (MIC) platforms and also offers tips and tricks for actual use, based on the authors' first-hand optimization experience. The material is organized in three sections. The first section, "Basics of MIC", introduces the fundamentals of MIC architecture and programming, including the specific Intel MIC programming environment

  2. An efficient hysteresis modeling methodology and its implementation in field computation applications

    Energy Technology Data Exchange (ETDEWEB)

    Adly, A.A., E-mail: adlyamr@gmail.com [Electrical Power and Machines Dept., Faculty of Engineering, Cairo University, Giza 12613 (Egypt); Abd-El-Hafiz, S.K. [Engineering Mathematics Department, Faculty of Engineering, Cairo University, Giza 12613 (Egypt)

    2017-07-15

    Highlights: • An approach to simulate hysteresis while taking shape anisotropy into consideration. • Utilizing the ensemble of triangular sub-region hysteresis models in field computation. • A novel tool capable of carrying out field computation while keeping track of hysteresis losses. • The approach may be extended to 3D tetrahedral sub-volumes. - Abstract: Field computation in media exhibiting hysteresis is crucial to a variety of applications such as magnetic recording processes and accurate determination of core losses in power devices. Recently, Hopfield neural networks (HNN) have been successfully configured to construct scalar and vector hysteresis models. This paper presents an efficient hysteresis modeling methodology and its implementation in field computation applications. The methodology is based on the application of the integral equation approach on discretized triangular magnetic sub-regions. Within every triangular sub-region, hysteresis properties are realized using a 3-node HNN. Details of the approach and sample computation results are given in the paper.

  3. Computing in high-energy physics

    International Nuclear Information System (INIS)

    Mount, Richard P.

    2016-01-01

    I present a very personalized journey through more than three decades of computing for experimental high-energy physics, pointing out the enduring lessons that I learned. This is followed by a vision of how the computing environment will evolve in the coming ten years and the technical challenges that this will bring. I then address the scale and cost of high-energy physics software and examine the many current and future challenges, particularly those of management, funding and software-lifecycle management. Lastly, I describe recent developments aimed at improving the overall coherence of high-energy physics software

  4. Computing in high-energy physics

    Science.gov (United States)

    Mount, Richard P.

    2016-04-01

    I present a very personalized journey through more than three decades of computing for experimental high-energy physics, pointing out the enduring lessons that I learned. This is followed by a vision of how the computing environment will evolve in the coming ten years and the technical challenges that this will bring. I then address the scale and cost of high-energy physics software and examine the many current and future challenges, particularly those of management, funding and software-lifecycle management. Finally, I describe recent developments aimed at improving the overall coherence of high-energy physics software.

  5. Using high performance interconnects in a distributed computing and mass storage environment

    International Nuclear Information System (INIS)

    Ernst, M.

    1994-01-01

    Detector Collaborations of the HERA Experiments typically involve more than 500 physicists from a few dozen institutes. These physicists require access to large amounts of data in a fully transparent manner. Important issues include Distributed Mass Storage Management Systems in a Distributed and Heterogeneous Computing Environment. At the very center of a distributed system, including tens of CPUs and network attached mass storage peripherals, are the communication links. Today scientists are witnessing an integration of computing and communication technology, with the 'network' becoming the computer. This contribution reports on a centrally operated computing facility for the HERA Experiments at DESY, including Symmetric Multiprocessor Machines (84 Processors), presently more than 400 GByte of magnetic disk and 40 TB of automated tape storage, tied together by a HIPPI 'network'. Focusing on the High Performance Interconnect technology, details will be provided about the HIPPI based 'Backplane' configured around a 20 Gigabit/s Multi Media Router and the performance and efficiency of the related computer interfaces

  6. Improving Computational Efficiency of Prediction in Model-Based Prognostics Using the Unscented Transform

    Science.gov (United States)

    Daigle, Matthew John; Goebel, Kai Frank

    2010-01-01

    Model-based prognostics captures system knowledge in the form of physics-based models of components, and how they fail, in order to obtain accurate predictions of end of life (EOL). EOL is predicted based on the estimated current state distribution of a component and expected profiles of future usage. In general, this requires simulations of the component using the underlying models. In this paper, we develop a simulation-based prediction methodology that achieves computational efficiency by performing only the minimal number of simulations needed in order to accurately approximate the mean and variance of the complete EOL distribution. This is performed through the use of the unscented transform, which predicts the means and covariances of a distribution passed through a nonlinear transformation. In this case, the EOL simulation acts as that nonlinear transformation. In this paper, we review the unscented transform, and describe how this concept is applied to efficient EOL prediction. As a case study, we develop a physics-based model of a solenoid valve, and perform simulation experiments to demonstrate improved computational efficiency without sacrificing prediction accuracy.
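
    A minimal sketch of the unscented transform itself is given below; the end-of-life simulation of the paper is replaced by a hypothetical nonlinear map, and the tuning parameters (alpha, beta, kappa) are common textbook choices rather than the authors' settings.

```python
import numpy as np

def unscented_transform(f, mean, cov, alpha=1.0, beta=2.0, kappa=0.0):
    """Propagate a Gaussian (mean, cov) through a nonlinear function f using 2n+1 sigma points."""
    n = mean.size
    lam = alpha**2 * (n + kappa) - n
    sqrt_cov = np.linalg.cholesky((n + lam) * cov)
    # Sigma points: the mean, plus/minus the scaled columns of the covariance square root
    sigma = np.vstack([mean, mean + sqrt_cov.T, mean - sqrt_cov.T])
    wm = np.full(2 * n + 1, 0.5 / (n + lam))
    wc = wm.copy()
    wm[0] = lam / (n + lam)
    wc[0] = lam / (n + lam) + (1 - alpha**2 + beta)
    y = np.array([f(s) for s in sigma])               # one "simulation" per sigma point
    y_mean = wm @ y
    y_cov = sum(w * np.outer(yi - y_mean, yi - y_mean) for w, yi in zip(wc, y))
    return y_mean, y_cov

# Hypothetical stand-in for an end-of-life simulation: EOL shrinks nonlinearly with the wear state
f = lambda x: np.array([100.0 / (1.0 + x[0] ** 2) + 5.0 * x[1]])
mean, cov = np.array([1.0, 0.2]), np.diag([0.04, 0.01])
print(unscented_transform(f, mean, cov))              # approximate EOL mean and variance
```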

  7. Computationally Efficient and Noise Robust DOA and Pitch Estimation

    DEFF Research Database (Denmark)

    Karimian-Azari, Sam; Jensen, Jesper Rindom; Christensen, Mads Græsbøll

    2016-01-01

    Many natural signals, such as voiced speech and some musical instruments, are approximately periodic over short intervals. These signals are often described in mathematics by the sum of sinusoids (harmonics) with frequencies that are proportional to the fundamental frequency, or pitch. In sensor...... a joint DOA and pitch estimator. In white Gaussian noise, we derive even more computationally efficient solutions which are designed using the narrowband power spectrum of the harmonics. Numerical results reveal the performance of the estimators in colored noise compared with the Cramér-Rao lower...

  8. High Performance Computing in Science and Engineering '02 : Transactions of the High Performance Computing Center

    CERN Document Server

    Jäger, Willi

    2003-01-01

    This book presents the state-of-the-art in modeling and simulation on supercomputers. Leading German research groups present their results achieved on high-end systems of the High Performance Computing Center Stuttgart (HLRS) for the year 2002. Reports cover all fields of supercomputing simulation ranging from computational fluid dynamics to computer science. Special emphasis is given to industrially relevant applications. Moreover, by presenting results for both vector systems and micro-processor based systems, the book allows comparison of the performance levels and usability of a variety of supercomputer architectures. It therefore becomes an indispensable guidebook for assessing the impact of the Japanese Earth Simulator project on supercomputing in the years to come.

  9. Complexity control algorithm based on adaptive mode selection for interframe coding in high efficiency video coding

    Science.gov (United States)

    Chen, Gang; Yang, Bing; Zhang, Xiaoyun; Gao, Zhiyong

    2017-07-01

    The latest high efficiency video coding (HEVC) standard significantly increases the encoding complexity for improving its coding efficiency. Due to the limited computational capability of handheld devices, complexity constrained video coding has drawn great attention in recent years. A complexity control algorithm based on adaptive mode selection is proposed for interframe coding in HEVC. Considering the direct proportionality between encoding time and computational complexity, the computational complexity is measured in terms of encoding time. First, complexity is mapped to a target in terms of prediction modes. Then, an adaptive mode selection algorithm is proposed for the mode decision process. Specifically, the optimal mode combination scheme that is chosen through offline statistics is developed at low complexity. If the complexity budget has not been used up, an adaptive mode sorting method is employed to further improve coding efficiency. The experimental results show that the proposed algorithm achieves a very large complexity control range (as low as 10%) for the HEVC encoder while maintaining good rate-distortion performance. For the low-delay P condition, compared with the direct resource allocation method and the state-of-the-art method, an average gain of 0.63 and 0.17 dB in BD-PSNR is observed for 18 sequences when the target complexity is around 40%.

  10. Technical Report: Toward a Scalable Algorithm to Compute High-Dimensional Integrals of Arbitrary Functions

    International Nuclear Information System (INIS)

    Snyder, Abigail C.; Jiao, Yu

    2010-01-01

    Neutron experiments at the Spallation Neutron Source (SNS) at Oak Ridge National Laboratory (ORNL) frequently generate large amounts of data (on the order of 10^6–10^12 data points). Hence, traditional data analysis tools run on a single CPU take too long to be practical and scientists are unable to efficiently analyze all data generated by experiments. Our goal is to develop a scalable algorithm to efficiently compute high-dimensional integrals of arbitrary functions. This algorithm can then be used to integrate the four-dimensional integrals that arise as part of modeling intensity from the experiments at the SNS. Here, three different one-dimensional numerical integration solvers from the GNU Scientific Library were modified and implemented to solve four-dimensional integrals. The results of these solvers on a final integrand provided by scientists at the SNS can be compared to the results of other methods, such as quasi-Monte Carlo methods, computing the same integral. A parallelized version of the most efficient method can allow scientists the opportunity to more effectively analyze all experimental data.
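
    The sketch below illustrates the two approaches compared in the report on a hypothetical smooth four-dimensional integrand (the actual SNS integrand is not reproduced): nested one-dimensional adaptive quadrature, analogous to composing 1D solvers, via scipy.integrate.nquad, alongside a quasi-Monte Carlo estimate from a scrambled Sobol sequence.

```python
import numpy as np
from scipy import integrate
from scipy.stats import qmc

# Hypothetical smooth 4D integrand standing in for the SNS intensity kernel
def f(x0, x1, x2, x3):
    return np.exp(-(x0**2 + x1**2 + x2**2 + x3**2)) * np.cos(x0 * x1 + x2 * x3)

# Nested 1D adaptive quadrature over [0, 1]^4 (four 1D solvers composed)
val_quad, err = integrate.nquad(f, [[0, 1]] * 4)

# Quasi-Monte Carlo estimate of the same integral using a scrambled Sobol sequence
points = qmc.Sobol(d=4, scramble=True, seed=0).random_base2(m=14)   # 2**14 points
val_qmc = np.mean(f(*points.T))

print(f"nested quadrature: {val_quad:.6f} (est. error {err:.1e})")
print(f"quasi-Monte Carlo: {val_qmc:.6f}")
```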

  11. 40 CFR 761.71 - High efficiency boilers.

    Science.gov (United States)

    2010-07-01

    ... 40 Protection of Environment 30 2010-07-01 false High efficiency boilers. 761.71... PROHIBITIONS Storage and Disposal § 761.71 High efficiency boilers. (a) To burn mineral oil dielectric fluid containing a PCB concentration of ≥50 ppm but ..., the boiler shall comply with the following...

  12. Evaluation of the efficiency of computer-aided spectra search systems based on information theory

    International Nuclear Information System (INIS)

    Schaarschmidt, K.

    1979-01-01

    Application of information theory allows objective evaluation of the efficiency of computer-aided spectra search systems. For this purpose, a significant number of search processes must be analyzed. The amount of information gained by computer application is considered as the difference between the entropy of the data bank and a conditional entropy depending on the proportion of unsuccessful search processes and ballast. The influence of the following factors can be estimated: volume, structure, and quality of the spectra collection stored, efficiency of the encoding instruction and the comparing algorithm, and subjective errors involved in the encoding of spectra. The relations derived are applied to two published storage and retrieval systems for infrared spectra. (Auth.)

  13. Unconventional, High-Efficiency Propulsors

    DEFF Research Database (Denmark)

    Andersen, Poul

    1996-01-01

    The development of ship propellers has generally been characterized by the search for propellers with as high efficiency as possible and at the same time low noise and vibration levels and little or no cavitation. This search has led to unconventional propulsors, like vane-wheel propulsors, contra-r...

  14. A computationally efficient 3D finite-volume scheme for violent liquid–gas sloshing

    CSIR Research Space (South Africa)

    Oxtoby, Oliver F

    2015-10-01

    Full Text Available We describe a semi-implicit volume-of-fluid free-surface-modelling methodology for flow problems involving violent free-surface motion. For efficient computation, a hybrid-unstructured edge-based vertex-centred finite volume discretisation...

  15. Phosphorene Co-catalyst Advancing Highly Efficient Visible-Light Photocatalytic Hydrogen Production.

    Science.gov (United States)

    Ran, Jingrun; Zhu, Bicheng; Qiao, Shi-Zhang

    2017-08-21

    Transitional metals are widely used as co-catalysts for boosting photocatalytic H2 production. However, metal-based co-catalysts suffer from high cost, limited abundance and detrimental environmental impact. To date, metal-free co-catalysts are rarely reported. Here, for the first time, we utilized density functional calculations to guide the application of phosphorene as a high-efficiency metal-free co-catalyst for CdS, Zn0.8Cd0.2S or ZnS. In particular, phosphorene-modified CdS shows a high apparent quantum yield of 34.7% at 420 nm. This outstanding activity arises from the strong electronic coupling between phosphorene and CdS, as well as the favorable band structure, high charge mobility and massive active sites of phosphorene, supported by computations and advanced characterizations, for example, synchrotron-based X-ray absorption near edge spectroscopy. This work brings new opportunities to prepare highly active, cheap and green photocatalysts. © 2017 Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.

  16. Trends in high-performance computing for engineering calculations.

    Science.gov (United States)

    Giles, M B; Reguly, I

    2014-08-13

    High-performance computing has evolved remarkably over the past 20 years, and that progress is likely to continue. However, in recent years, this progress has been achieved through greatly increased hardware complexity with the rise of multicore and manycore processors, and this is affecting the ability of application developers to achieve the full potential of these systems. This article outlines the key developments on the hardware side, both in the recent past and in the near future, with a focus on two key issues: energy efficiency and the cost of moving data. It then discusses the much slower evolution of system software, and the implications of all of this for application developers. © 2014 The Author(s) Published by the Royal Society. All rights reserved.

  17. Research & Implementation of AC - DC Converter with High Power Factor & High Efficiency

    Directory of Open Access Journals (Sweden)

    Hsiou-Hsian Nien

    2014-05-01

    Full Text Available In this paper, we design and develop a high power factor, high efficiency two-stage AC-DC power converter. The first stage is a boost active power factor correction circuit; the second stage is a near-constant-frequency LLC resonant converter. In addition to the traditional high-efficiency advantages of the LLC topology, the light-load conversion efficiency of this power converter is improved. It also possesses a high power factor and near-constant-frequency operating characteristics, which can significantly reduce electromagnetic interference. This paper first discusses the main structure and control scheme of the power factor correction circuit. Circuit analysis based on the equivalent model of the LLC resonant converter is then carried out to determine the important parameters of the converter circuit elements. A variable-frequency resonant tank is then designed whose resonant frequency changes automatically with the load, so as to achieve near-constant-frequency operation and high efficiency. Finally, an AC-DC power converter with an output of 190 W is designed and built to verify the characteristics and feasibility of this converter. The experimental results show that even at very light load (9.5 W) the efficiency is as high as 81%, with a peak efficiency of 88% (at 90 W) and a full-load efficiency of 87%. As the output power varies from 19 W to 190 W, the operating frequency changes by only 0.4 kHz (AC 110 V) and 0.3 kHz (AC 220 V).

  18. A High Performance COTS Based Computer Architecture

    Science.gov (United States)

    Patte, Mathieu; Grimoldi, Raoul; Trautner, Roland

    2014-08-01

    Using Commercial Off The Shelf (COTS) electronic components for space applications is a long-standing idea. Indeed, the difference in processing performance and energy efficiency between radiation-hardened components and COTS components is so important that COTS components are very attractive for use in mass- and power-constrained systems. However, using COTS components in space is not straightforward, as one must account for the effects of the space environment on the behavior of COTS components. In the frame of the ESA-funded activity called High Performance COTS Based Computer, Airbus Defense and Space and its subcontractor OHB CGS have developed and prototyped a versatile COTS-based architecture for high performance processing. The rest of the paper is organized as follows: in the first section we recapitulate the interests and constraints of using COTS components for space applications; then we briefly describe existing fault mitigation architectures and present our solution for fault mitigation, based on a component called the SmartIO; in the last part of the paper we describe the prototyping activities executed during the HiP CBC project.

  19. Efficiency of poly-generating high temperature fuel cells

    Energy Technology Data Exchange (ETDEWEB)

    Margalef, Pere; Brown, Tim; Brouwer, Jacob; Samuelsen, Scott [National Fuel Cell Research Center (NFCRC), University of California, Irvine, CA 92697-3550 (United States)

    2011-02-15

    High temperature fuel cells can be designed and operated to poly-generate electricity, heat, and useful chemicals (e.g., hydrogen) in a variety of configurations. The highly integrated and synergistic nature of poly-generating high temperature fuel cells, however, precludes a simple definition of efficiency for analysis and comparison of performance to traditional methods. There is a need to develop and define a methodology to calculate each of the co-product efficiencies that is useful for comparative analyses. Methodologies for calculating poly-generation efficiencies are defined and discussed. The methodologies are applied to analysis of a Hydrogen Energy Station (H{sub 2}ES) showing that high conversion efficiency can be achieved for poly-generation of electricity and hydrogen. (author)

  20. Computational Thinking and Practice - A Generic Approach to Computing in Danish High Schools

    DEFF Research Database (Denmark)

    Caspersen, Michael E.; Nowack, Palle

    2014-01-01

    Internationally, there is a growing awareness on the necessity of providing relevant computing education in schools, particularly high schools. We present a new and generic approach to Computing in Danish High Schools based on a conceptual framework derived from ideas related to computational thi...

  1. Computational Efficient Upscaling Methodology for Predicting Thermal Conductivity of Nuclear Waste forms

    International Nuclear Information System (INIS)

    Li, Dongsheng; Sun, Xin; Khaleel, Mohammad A.

    2011-01-01

    This study evaluated different upscaling methods to predict thermal conductivity in loaded nuclear waste form, a heterogeneous material system, and compared the efficiency and accuracy of these methods. Thermal conductivity of loaded nuclear waste form is an important property for the waste form Integrated Performance and Safety Code (IPSC). The effective thermal conductivity, obtained from microstructure information and the local thermal conductivity of the different components, is critical in predicting the life and performance of the waste form during storage. The heat generated during storage is directly related to thermal conductivity, which in turn determines the mechanical deformation behavior, corrosion resistance and aging performance. Several methods, including the Taylor model, Sachs model, self-consistent model, and statistical upscaling models, were developed and implemented. Due to the absence of experimental data, prediction results from the finite element method (FEM) were used as a reference to determine the accuracy of the different upscaling models. Micrographs from different loadings of nuclear waste were used in the prediction of thermal conductivity. Prediction results demonstrated that, in terms of efficiency, the boundary models (Taylor and Sachs) are better than the self-consistent model, the statistical upscaling method and FEM. Balancing computational resources and accuracy, statistical upscaling is a computationally efficient method for predicting the effective thermal conductivity of nuclear waste forms.
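
    As a minimal illustration of the bound-type models mentioned above, the sketch below computes the arithmetic (rule-of-mixtures) and harmonic volume averages of phase conductivities, which play the role of Taylor- and Sachs-type uniform-field estimates; the phase conductivities and volume fractions are hypothetical, not values from the study.

```python
import numpy as np

def effective_conductivity_bounds(k_phase, vol_frac):
    """Simple uniform-field bounds on the effective thermal conductivity of a composite.

    k_phase:  conductivities of the constituent phases (W/m-K)
    vol_frac: corresponding volume fractions (must sum to 1)
    """
    k = np.asarray(k_phase, dtype=float)
    v = np.asarray(vol_frac, dtype=float)
    upper = np.sum(v * k)             # arithmetic (rule-of-mixtures) average, Taylor-type bound
    lower = 1.0 / np.sum(v / k)       # harmonic average, Sachs-type bound
    return lower, upper

# Hypothetical waste form: glass matrix (~1 W/m-K) loaded with 30 vol% ceramic phase (~3 W/m-K)
lo, hi = effective_conductivity_bounds([1.0, 3.0], [0.7, 0.3])
print(f"effective conductivity bounded between {lo:.2f} and {hi:.2f} W/m-K")
```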

  2. Efficient implementation of a multidimensional fast fourier transform on a distributed-memory parallel multi-node computer

    Science.gov (United States)

    Bhanot, Gyan V [Princeton, NJ; Chen, Dong [Croton-On-Hudson, NY; Gara, Alan G [Mount Kisco, NY; Giampapa, Mark E [Irvington, NY; Heidelberger, Philip [Cortlandt Manor, NY; Steinmacher-Burow, Burkhard D [Mount Kisco, NY; Vranas, Pavlos M [Bedford Hills, NY

    2008-01-01

    The present invention is directed to a method, system and program storage device for efficiently implementing a multidimensional Fast Fourier Transform (FFT) of a multidimensional array comprising a plurality of elements initially distributed in a multi-node computer system comprising a plurality of nodes in communication over a network, comprising: distributing the plurality of elements of the array in a first dimension across the plurality of nodes of the computer system over the network to facilitate a first one-dimensional FFT; performing the first one-dimensional FFT on the elements of the array distributed at each node in the first dimension; re-distributing the one-dimensional FFT-transformed elements at each node in a second dimension via "all-to-all" distribution in random order across other nodes of the computer system over the network; and performing a second one-dimensional FFT on elements of the array re-distributed at each node in the second dimension, wherein the random order facilitates efficient utilization of the network thereby efficiently implementing the multidimensional FFT. The "all-to-all" re-distribution of array elements is further efficiently implemented in applications other than the multidimensional FFT on the distributed-memory parallel supercomputer.
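
    The decomposition exploited by the invention can be illustrated on a single node: a 2D FFT reduces to 1D FFTs along one dimension, a transpose (the local stand-in for the all-to-all redistribution across nodes), and 1D FFTs along the other dimension. The numpy sketch below shows only this algebraic structure, not the distributed implementation or the randomized ordering.

```python
import numpy as np

def fft2_by_rows(a):
    """Compute a 2-D FFT as two passes of 1-D FFTs separated by a transpose.

    On a distributed-memory machine the transpose becomes an all-to-all
    redistribution of the array between the two passes; here it is a local
    numpy transpose for illustration only.
    """
    step1 = np.fft.fft(a, axis=1)        # 1-D FFTs along the locally held rows
    step2 = np.fft.fft(step1.T, axis=1)  # "redistribute", then 1-D FFTs along the other dimension
    return step2.T                       # transpose back to the original layout

rng = np.random.default_rng(0)
a = rng.standard_normal((64, 128)) + 1j * rng.standard_normal((64, 128))
assert np.allclose(fft2_by_rows(a), np.fft.fft2(a))   # matches the direct 2-D FFT
```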

  3. Development of a Computational Steering Framework for High Performance Computing Environments on Blue Gene/P Systems

    KAUST Repository

    Danani, Bob K.

    2012-07-01

    Computational steering has revolutionized the traditional workflow in high performance computing (HPC) applications. The standard workflow that consists of preparation of an application’s input, running of a simulation, and visualization of simulation results in a post-processing step is now transformed into a real-time interactive workflow that significantly reduces development and testing time. Computational steering provides the capability to direct or re-direct the progress of a simulation application at run-time. It allows modification of application-defined control parameters at run-time using various user-steering applications. In this project, we propose a computational steering framework for HPC environments that provides an innovative solution and easy-to-use platform, which allows users to connect and interact with running application(s) in real-time. This framework uses RealityGrid as the underlying steering library and adds several enhancements to the library to enable steering support for Blue Gene systems. Included in the scope of this project is the development of a scalable and efficient steering relay server that supports many-to-many connectivity between multiple steered applications and multiple steering clients. Steered applications can range from intermediate simulation and physical modeling applications to complex computational fluid dynamics (CFD) applications or advanced visualization applications. The Blue Gene supercomputer presents special challenges for remote access because the compute nodes reside on private networks. This thesis presents an implemented solution and demonstrates it on representative applications. Thorough implementation details and application enablement steps are also presented in this thesis to encourage direct usage of this framework.

  4. An Effective Transform Unit Size Decision Method for High Efficiency Video Coding

    Directory of Open Access Journals (Sweden)

    Chou-Chen Wang

    2014-01-01

    Full Text Available High efficiency video coding (HEVC) is the latest video coding standard. HEVC can achieve higher compression performance than previous standards, such as MPEG-4, H.263, and H.264/AVC. However, HEVC requires enormous computational complexity in the encoding process due to its quadtree structure. In order to reduce the computational burden of the HEVC encoder, an early transform unit (TU) decision algorithm (ETDA) is adopted to prune the residual quadtree (RQT) at an early stage based on the number of nonzero DCT coefficients (called NNZ-ETDA) to accelerate the encoding process. However, the NNZ-ETDA cannot effectively reduce the computational load for sequences with active motion or rich texture. Therefore, in order to further improve the performance of NNZ-ETDA, we propose an adaptive RQT-depth decision for NNZ-ETDA (called ARD-NNZ-ETDA) by exploiting the characteristics of high temporal-spatial correlation that exist in natural video sequences. Simulation results show that the proposed method can achieve a time improvement ratio (TIR) of about 61.26%~81.48% when compared to the HEVC test model 8.1 (HM 8.1), with insignificant loss of image quality. Compared with the NNZ-ETDA, the proposed method can further achieve an average TIR of about 8.29%~17.92%.

  5. Development of a computationally efficient algorithm for attitude estimation of a remote sensing satellite

    Science.gov (United States)

    Labibian, Amir; Bahrami, Amir Hossein; Haghshenas, Javad

    2017-09-01

    This paper presents a computationally efficient algorithm for attitude estimation of a remote sensing satellite. In this study, gyro, magnetometer, sun sensor and star tracker measurements are used in an Extended Kalman Filter (EKF) structure for the purpose of Attitude Determination (AD). However, utilizing all of the measurement data simultaneously in the EKF structure increases the computational burden. Specifically, assuming n observation vectors, an inverse of a 3n×3n matrix is required for gain calculation. In order to solve this problem, an efficient version of the EKF, namely Murrell's version, is employed. This method utilizes the measurements separately at each sampling time for gain computation. Therefore, an inverse of a 3n×3n matrix is replaced by an inverse of a 3×3 matrix for each measurement vector. Moreover, gyro drifts over time can reduce the pointing accuracy. Therefore, a calibration algorithm is utilized for estimation of the main gyro parameters.
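
    A generic sketch of the sequential (Murrell-style) measurement update is shown below: each sensor's 3-vector measurement is processed on its own, so only a 3×3 innovation covariance is inverted per sensor instead of one 3n×3n matrix. The state dimension, measurement models and noise levels are hypothetical and not taken from the paper.

```python
import numpy as np

def sequential_measurement_update(x, P, measurements):
    """Process each 3-vector measurement separately (Murrell-style Kalman update).

    measurements: list of (z, H, R) with z a 3-vector, H a 3xn Jacobian, R a 3x3 noise covariance.
    """
    I = np.eye(x.size)
    for z, H, R in measurements:
        S = H @ P @ H.T + R                     # 3x3 innovation covariance
        K = P @ H.T @ np.linalg.inv(S)          # Kalman gain for this sensor alone
        x = x + K @ (z - H @ x)                 # state update
        P = (I - K @ H) @ P                     # covariance update
    return x, P

# Hypothetical 6-state example (small attitude errors plus gyro biases), two vector sensors
n = 6
x, P = np.zeros(n), 0.1 * np.eye(n)
H_mag = np.hstack([np.eye(3), np.zeros((3, 3))])   # magnetometer observes the attitude-error block
H_sun = np.hstack([np.eye(3), np.zeros((3, 3))])   # sun sensor observes the same block
meas = [(np.array([0.010, -0.020, 0.005]), H_mag, 1e-4 * np.eye(3)),
        (np.array([0.012, -0.018, 0.004]), H_sun, 4e-4 * np.eye(3))]
x, P = sequential_measurement_update(x, P, meas)
print(x)
```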

  6. Towards the development of run times leveraging virtualization for high performance computing

    International Nuclear Information System (INIS)

    Diakhate, F.

    2010-12-01

    In recent years, there has been a growing interest in using virtualization to improve the efficiency of data centers. This success is rooted in virtualization's excellent fault tolerance and isolation properties, in the overall flexibility it brings, and in its ability to exploit multi-core architectures efficiently. These characteristics also make virtualization an ideal candidate to tackle issues found in new compute cluster architectures. However, in spite of recent improvements in virtualization technology, overheads in the execution of parallel applications remain, which prevent its use in the field of high performance computing. In this thesis, we propose a virtual device dedicated to message passing between virtual machines, so as to improve the performance of parallel applications executed in a cluster of virtual machines. We also introduce a set of techniques facilitating the deployment of virtualized parallel applications. These functionalities have been implemented as part of a runtime system which allows to benefit from virtualization's properties in a way that is as transparent as possible to the user while minimizing performance overheads. (author)

  7. High Efficiency Computation of the Variances of Structural Evolutionary Random Responses

    Directory of Open Access Journals (Sweden)

    J.H. Lin

    2000-01-01

    Full Text Available For structures subjected to stationary or evolutionary white/colored random noise, the various response variances satisfy algebraic or differential Lyapunov equations. The solution of these Lyapunov equations has traditionally been very difficult. A precise integration method is proposed in the present paper, which solves such Lyapunov equations accurately and very efficiently.
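
    For a small example of the kind of equation being solved, the sketch below obtains the stationary response covariance of a white-noise-driven oscillator from the algebraic Lyapunov equation using SciPy's standard solver and checks it against the closed-form result; the paper's precise integration method, intended for much larger structural models, is not reproduced here.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Hypothetical 1-DOF oscillator driven by white noise: x_dot = A x + B w, with E[w(t) w(t+tau)] = W delta(tau)
omega, zeta, W = 2.0 * np.pi, 0.05, 1.0
A = np.array([[0.0, 1.0],
              [-omega**2, -2.0 * zeta * omega]])
B = np.array([[0.0], [1.0]])

# The stationary covariance P of the response satisfies  A P + P A' + B W B' = 0
P = solve_continuous_lyapunov(A, -B @ np.array([[W]]) @ B.T)

# Closed-form check for this oscillator: Var(x) = W / (4 zeta omega^3), Var(x_dot) = W / (4 zeta omega)
print(P[0, 0], W / (4 * zeta * omega**3))
print(P[1, 1], W / (4 * zeta * omega))
```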

  8. High efficiency, variable geometry, centrifugal cryogenic pump

    International Nuclear Information System (INIS)

    Forsha, M.D.; Nichols, K.E.; Beale, C.A.

    1994-01-01

    A centrifugal cryogenic pump has been developed which has a basic design that is rugged and reliable with variable speed and variable geometry features that achieve high pump efficiency over a wide range of head-flow conditions. The pump uses a sealless design and rolling element bearings to achieve high reliability and the ruggedness to withstand liquid-vapor slugging. The pump can meet a wide range of variable head, off-design flow requirements and maintain design point efficiency by adjusting the pump speed. The pump also has features that allow the impeller and diffuser blade heights to be adjusted. The adjustable height blades were intended to enhance the pump efficiency when it is operating at constant head, off-design flow rates. For small pumps, the adjustable height blades are not recommended. For larger pumps, they could provide off-design efficiency improvements. This pump was developed for supercritical helium service, but the design is well suited to any cryogenic application where high efficiency is required over a wide range of head-flow conditions

  9. Research Activity in Computational Physics utilizing High Performance Computing: Co-authorship Network Analysis

    Science.gov (United States)

    Ahn, Sul-Ah; Jung, Youngim

    2016-10-01

    The research activities of computational physicists utilizing high performance computing are analyzed by bibliometric approaches. This study aims at providing computational physicists utilizing high-performance computing, as well as policy planners, with useful bibliometric results for an assessment of research activities. In order to achieve this purpose, we carried out a co-authorship network analysis of journal articles to assess the research activities of researchers in high-performance computational physics as a case study. For this study, we used journal articles from Elsevier's Scopus database covering the time period 2004-2013. We ranked authors in the physics field utilizing high-performance computing by the number of papers published during the ten years from 2004. Finally, we drew the co-authorship network for the 45 top authors and their coauthors, and described some features of the co-authorship network in relation to the author rank. Suggestions for further studies are discussed.

  10. Computationally efficient design of optimal output feedback strategies for controllable passive damping devices

    International Nuclear Information System (INIS)

    Kamalzare, Mahmoud; Johnson, Erik A; Wojtkiewicz, Steven F

    2014-01-01

    Designing control strategies for smart structures, such as those with semiactive devices, is complicated by the nonlinear nature of the feedback control, secondary clipping control and other additional requirements such as device saturation. The usual design approach resorts to large-scale simulation parameter studies that are computationally expensive. The authors have previously developed an approach for state-feedback semiactive clipped-optimal control design, based on a nonlinear Volterra integral equation that provides for the computationally efficient simulation of such systems. This paper expands the applicability of the approach by demonstrating that it can also be adapted to accommodate more realistic cases when, instead of full state feedback, only a limited set of noisy response measurements is available to the controller. This extension requires incorporating a Kalman filter (KF) estimator, which is linear, into the nominal model of the uncontrolled system. The efficacy of the approach is demonstrated by a numerical study of a 100-degree-of-freedom frame model, excited by a filtered Gaussian random excitation, with noisy acceleration sensor measurements to determine the semiactive control commands. The results show that the proposed method can improve computational efficiency by more than two orders of magnitude relative to a conventional solver, while retaining a comparable level of accuracy. Further, the proposed approach is shown to be similarly efficient for an extensive Monte Carlo simulation to evaluate the effects of sensor noise levels and KF tuning on the accuracy of the response. (paper)

  11. High Efficiency EBCOT with Parallel Coding Architecture for JPEG2000

    Directory of Open Access Journals (Sweden)

    Chiang Jen-Shiun

    2006-01-01

    Full Text Available This work presents a parallel context-modeling coding architecture and a matching arithmetic coder (MQ-coder) for the embedded block coding (EBCOT) unit of the JPEG2000 encoder. Tier-1 of the EBCOT consumes most of the computation time in a JPEG2000 encoding system. The proposed parallel architecture can increase the throughput rate of the context modeling. To match the high throughput rate of the parallel context-modeling architecture, an efficient pipelined architecture for the context-based adaptive arithmetic encoder is proposed. This encoder for JPEG2000 can work at 180 MHz, encoding one symbol each cycle. Compared with previous context-modeling architectures, our parallel architecture can improve the throughput rate by up to 25%.

  12. Improved dissection efficiency in the human gross anatomy laboratory by the integration of computers and modern technology.

    Science.gov (United States)

    Reeves, Rustin E; Aschenbrenner, John E; Wordinger, Robert J; Roque, Rouel S; Sheedlo, Harold J

    2004-05-01

    The need to increase the efficiency of dissection in the gross anatomy laboratory has been the driving force behind the technologic changes we have recently implemented. With the introduction of an integrated systems-based medical curriculum and a reduction in laboratory teaching hours, anatomy faculty at the University of North Texas Health Science Center (UNTHSC) developed a computer-based dissection manual to adjust to these curricular changes and time constraints. At each cadaver workstation, Apple iMac computers were added and a new dissection manual, running in a browser-based format, was installed. Within the text of the manual, anatomical structures required for dissection were linked to digital images from prosected materials; in addition, for each body system, the dissection manual included images from cross sections, radiographs, CT scans, and histology. Although we have placed a high priority on computerization of the anatomy laboratory, we remain strong advocates of the importance of cadaver dissection. It is our belief that the utilization of computers for dissection is a natural evolution of technology and fosters creative teaching strategies adapted for anatomy laboratories in the 21st century. Our strategy has significantly enhanced the independence and proficiency of our students, the efficiency of their dissection time, and the quality of laboratory instruction by the faculty. Copyright 2004 Wiley-Liss, Inc.

  13. Hydrogen production methods efficiency coupled to an advanced high temperature accelerator driven system

    International Nuclear Information System (INIS)

    Rodríguez, Daniel González; Lira, Carlos Alberto Brayner de Oliveira

    2017-01-01

    The hydrogen economy is one of the most promising concepts for the energy future. In this scenario, oil is replaced by hydrogen as an energy carrier. This hydrogen, rather than oil, must be produced in volumes not provided by the currently employed methods. In this work two high temperature hydrogen production methods coupled to an advanced nuclear system are presented. A new design of a pebbled-bed accelerator nuclear driven system called TADSEA is chosen because of the advantages it has in matters of transmutation and safety. For the conceptual design of the high temperature electrolysis process a detailed computational fluid dynamics model was developed to analyze the solid oxide electrolytic cell that has a huge influence on the process efficiency. A detailed flowsheet of the high temperature electrolysis process coupled to TADSEA through a Brayton gas cycle was developed using chemical process simulation software: Aspen HYSYS®. The model with optimized operating conditions produces 0.1627 kg/s of hydrogen, resulting in an overall process efficiency of 34.51%, a value in the range of results reported by other authors. A conceptual design of the iodine-sulfur thermochemical water splitting cycle was also developed. The overall efficiency of the process was calculated performing an energy balance resulting in 22.56%. The values of efficiency, hydrogen production rate and energy consumption of the proposed models are in the values considered acceptable in the hydrogen economy concept, being also compatible with the TADSEA design parameters. (author)

  14. Hydrogen production methods efficiency coupled to an advanced high temperature accelerator driven system

    Energy Technology Data Exchange (ETDEWEB)

    Rodríguez, Daniel González; Lira, Carlos Alberto Brayner de Oliveira [Universidade Federal de Pernambuco (UFPE), Recife, PE (Brazil). Departamento de Energia Nuclear; Fernández, Carlos García, E-mail: danielgonro@gmail.com, E-mail: mmhamada@ipen.br [Instituto Superior de Tecnologías y Ciencias aplicadas (InSTEC), La Habana (Cuba)

    2017-07-01

    The hydrogen economy is one of the most promising concepts for the energy future. In this scenario, oil is replaced by hydrogen as an energy carrier. This hydrogen, rather than oil, must be produced in volumes not provided by the currently employed methods. In this work two high temperature hydrogen production methods coupled to an advanced nuclear system are presented. A new design of a pebbled-bed accelerator nuclear driven system called TADSEA is chosen because of the advantages it has in matters of transmutation and safety. For the conceptual design of the high temperature electrolysis process a detailed computational fluid dynamics model was developed to analyze the solid oxide electrolytic cell that has a huge influence on the process efficiency. A detailed flowsheet of the high temperature electrolysis process coupled to TADSEA through a Brayton gas cycle was developed using chemical process simulation software: Aspen HYSYS®. The model with optimized operating conditions produces 0.1627 kg/s of hydrogen, resulting in an overall process efficiency of 34.51%, a value in the range of results reported by other authors. A conceptual design of the iodine-sulfur thermochemical water splitting cycle was also developed. The overall efficiency of the process was calculated performing an energy balance resulting in 22.56%. The values of efficiency, hydrogen production rate and energy consumption of the proposed models are in the values considered acceptable in the hydrogen economy concept, being also compatible with the TADSEA design parameters. (author)

  15. Empirical Analysis of High Efficient Remote Cloud Data Center Backup Using HBase and Cassandra

    Directory of Open Access Journals (Sweden)

    Bao Rong Chang

    2015-01-01

    Full Text Available HBase, a master-slave framework, and Cassandra, a peer-to-peer (P2P) framework, are the two most commonly used large-scale distributed NoSQL databases, especially applicable to cloud computing with high flexibility and scalability and the ease of big data processing. Regarding storage structure, different structures adopt distinct backup strategies to reduce the risks of data loss. This paper aims to realize highly efficient remote cloud data center backup using HBase and Cassandra; in order to verify the backup efficiency, Thrift Java was applied in the cloud data center for a stress test that performs intensive data reads/writes and remote database backup over large amounts of data. Finally, in terms of an effectiveness-cost evaluation of the remote data center backup, a cost-performance ratio was evaluated for several benchmark databases and the proposed ones. As a result, the proposed HBase approach outperforms the other databases.
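
    The paper's stress-test harness is written in Java against the Thrift interface; purely as an illustration of the same Thrift-based access pattern, the Python sketch below writes and reads one HBase row through the happybase client. The host, port, table and column-family names are hypothetical, and the table is assumed to already exist.

```python
import happybase

# Connect to the HBase Thrift gateway of the (hypothetical) primary data center
connection = happybase.Connection('primary-dc-hbase', port=9090)
table = connection.table('sensor_log')

# Write one row and read it back through the Thrift interface
table.put(b'row-2015-0001', {b'cf:payload': b'backup stress-test record'})
print(table.row(b'row-2015-0001'))

# A remote-backup stress test would loop over many such puts/gets while the
# cluster replicates to the secondary data center, timing both paths.
connection.close()
```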

  16. Algorithmic design of a noise-resistant and efficient closed-loop deep brain stimulation system: A computational approach.

    Directory of Open Access Journals (Sweden)

    Sofia D Karamintziou

    Full Text Available Advances in the field of closed-loop neuromodulation call for analysis and modeling approaches capable of confronting challenges related to the complex neuronal response to stimulation and the presence of strong internal and measurement noise in neural recordings. Here we elaborate on the algorithmic aspects of a noise-resistant closed-loop subthalamic nucleus deep brain stimulation system for advanced Parkinson's disease and treatment-refractory obsessive-compulsive disorder, ensuring remarkable performance in terms of both efficiency and selectivity of stimulation, as well as in terms of computational speed. First, we propose an efficient method drawn from dynamical systems theory, for the reliable assessment of significant nonlinear coupling between beta and high-frequency subthalamic neuronal activity, as a biomarker for feedback control. Further, we present a model-based strategy through which optimal parameters of stimulation for minimum energy desynchronizing control of neuronal activity are being identified. The strategy integrates stochastic modeling and derivative-free optimization of neural dynamics based on quadratic modeling. On the basis of numerical simulations, we demonstrate the potential of the presented modeling approach to identify, at a relatively low computational cost, stimulation settings potentially associated with a significantly higher degree of efficiency and selectivity compared with stimulation settings determined post-operatively. Our data reinforce the hypothesis that model-based control strategies are crucial for the design of novel stimulation protocols at the backstage of clinical applications.

  17. Algorithmic design of a noise-resistant and efficient closed-loop deep brain stimulation system: A computational approach.

    Science.gov (United States)

    Karamintziou, Sofia D; Custódio, Ana Luísa; Piallat, Brigitte; Polosan, Mircea; Chabardès, Stéphan; Stathis, Pantelis G; Tagaris, George A; Sakas, Damianos E; Polychronaki, Georgia E; Tsirogiannis, George L; David, Olivier; Nikita, Konstantina S

    2017-01-01

    Advances in the field of closed-loop neuromodulation call for analysis and modeling approaches capable of confronting challenges related to the complex neuronal response to stimulation and the presence of strong internal and measurement noise in neural recordings. Here we elaborate on the algorithmic aspects of a noise-resistant closed-loop subthalamic nucleus deep brain stimulation system for advanced Parkinson's disease and treatment-refractory obsessive-compulsive disorder, ensuring remarkable performance in terms of both efficiency and selectivity of stimulation, as well as in terms of computational speed. First, we propose an efficient method drawn from dynamical systems theory, for the reliable assessment of significant nonlinear coupling between beta and high-frequency subthalamic neuronal activity, as a biomarker for feedback control. Further, we present a model-based strategy through which optimal parameters of stimulation for minimum energy desynchronizing control of neuronal activity are being identified. The strategy integrates stochastic modeling and derivative-free optimization of neural dynamics based on quadratic modeling. On the basis of numerical simulations, we demonstrate the potential of the presented modeling approach to identify, at a relatively low computational cost, stimulation settings potentially associated with a significantly higher degree of efficiency and selectivity compared with stimulation settings determined post-operatively. Our data reinforce the hypothesis that model-based control strategies are crucial for the design of novel stimulation protocols at the backstage of clinical applications.
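
    The authors assess beta/high-frequency coupling with a measure drawn from dynamical systems theory; the sketch below shows a simpler, commonly used alternative (the mean-vector-length modulation index between beta-band phase and high-frequency amplitude) on synthetic data, only to illustrate what computing such a feedback biomarker looks like. The frequency bands, sampling rate and signal model are illustrative assumptions, not the authors' settings.

```python
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def modulation_index(x, fs, phase_band=(13, 30), amp_band=(200, 400)):
    """Mean-vector-length coupling between low-frequency phase and high-frequency amplitude."""
    def bandpass(sig, lo, hi):
        b, a = butter(4, [lo / (fs / 2), hi / (fs / 2)], btype='band')
        return filtfilt(b, a, sig)
    phase = np.angle(hilbert(bandpass(x, *phase_band)))   # beta-band instantaneous phase
    amp = np.abs(hilbert(bandpass(x, *amp_band)))         # high-frequency amplitude envelope
    return np.abs(np.mean(amp * np.exp(1j * phase))) / np.mean(amp)

# Synthetic LFP-like signal: 300 Hz activity whose amplitude is locked to a 20 Hz rhythm, plus noise
fs = 2000
t = np.arange(0, 10, 1 / fs)
beta = np.sin(2 * np.pi * 20 * t)
hfo = (1 + 0.8 * beta) * np.sin(2 * np.pi * 300 * t)
signal = beta + 0.3 * hfo + 0.2 * np.random.default_rng(0).standard_normal(t.size)
print(f"modulation index: {modulation_index(signal, fs):.3f}")
```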

  18. GRID computing for experimental high energy physics

    International Nuclear Information System (INIS)

    Moloney, G.R.; Martin, L.; Seviour, E.; Taylor, G.N.; Moorhead, G.F.

    2002-01-01

    Full text: The Large Hadron Collider (LHC), to be completed at the CERN laboratory in 2006, will generate 11 petabytes of data per year. The processing of this large data stream requires a large, distributed computing infrastructure. A recent innovation in high performance distributed computing, the GRID, has been identified as an important tool in data analysis for the LHC. GRID computing has actual and potential application in many fields which require computationally intensive analysis of large, shared data sets. The Australian experimental High Energy Physics community has formed partnerships with the High Performance Computing community to establish a GRID node at the University of Melbourne. Through Australian membership of the ATLAS experiment at the LHC, Australian researchers have an opportunity to be involved in the European DataGRID project. This presentation will include an introduction to the GRID, and its application to experimental High Energy Physics. We will present the results of our studies, including participation in the first LHC data challenge

  19. Methods for Efficiently and Accurately Computing Quantum Mechanical Free Energies for Enzyme Catalysis.

    Science.gov (United States)

    Kearns, F L; Hudson, P S; Boresch, S; Woodcock, H L

    2016-01-01

    Enzyme activity is inherently linked to free energies of transition states, ligand binding, protonation/deprotonation, etc.; these free energies, and thus enzyme function, can be affected by residue mutations, allosterically induced conformational changes, and much more. Therefore, being able to predict free energies associated with enzymatic processes is critical to understanding and predicting their function. Free energy simulation (FES) has historically been a computational challenge as it requires both the accurate description of inter- and intramolecular interactions and adequate sampling of all relevant conformational degrees of freedom. The hybrid quantum mechanical molecular mechanical (QM/MM) framework is the current tool of choice when accurate computations of macromolecular systems are essential. Unfortunately, robust and efficient approaches that employ the high levels of computational theory needed to accurately describe many reactive processes (i.e., ab initio, DFT), while also including explicit solvation effects and accounting for extensive conformational sampling, are essentially nonexistent. In this chapter, we give a brief overview of two recently developed methods that mitigate several major challenges associated with QM/MM FES: the QM non-Boltzmann Bennett's acceptance ratio method and the QM nonequilibrium work method. We also describe the use of these methods to calculate free energies associated with relative properties and along reaction paths, using simple test cases with relevance to enzymes. © 2016 Elsevier Inc. All rights reserved.
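
    The QM-NBB and nonequilibrium-work methods themselves are beyond a short example, but both build on exponential free-energy averages; the sketch below implements the basic one-sided Zwanzig estimator on synthetic Gaussian energy differences, for which the exact answer is known, as a minimal illustration under those assumptions.

```python
import numpy as np

def zwanzig_free_energy(delta_u, temperature=300.0):
    """One-sided exponential (Zwanzig) estimate of the free-energy difference A1 - A0.

    delta_u: array of potential-energy differences U1 - U0 (kcal/mol) evaluated on
             configurations sampled from state 0.
    """
    kT = 0.0019872041 * temperature                    # Boltzmann constant in kcal/(mol K)
    w = -np.asarray(delta_u) / kT
    # log-sum-exp form avoids overflow for large energy gaps
    return -kT * (np.logaddexp.reduce(w) - np.log(len(w)))

# Placeholder data: energy differences drawn from a Gaussian, for which the exact
# answer is mean - var/(2 kT), a standard consistency check for FEP estimators
rng = np.random.default_rng(1)
dU = rng.normal(loc=2.0, scale=0.8, size=50_000)       # kcal/mol
kT = 0.0019872041 * 300.0
print(zwanzig_free_energy(dU), 2.0 - 0.8**2 / (2 * kT))
```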

  20. Does computer-aided surgical simulation improve efficiency in bimaxillary orthognathic surgery?

    Science.gov (United States)

    Schwartz, H C

    2014-05-01

    The purpose of this study was to compare the efficiency of bimaxillary orthognathic surgery using computer-aided surgical simulation (CASS) with that of cases planned using traditional methods. Total doctor time was used to measure efficiency. While costs vary widely in different localities and in different health schemes, time is a valuable and limited resource everywhere. For this reason, total doctor time is a more useful measure of efficiency than cost. Even though we currently use CASS primarily for planning more complex cases, this study showed an average saving of 60 min for each case. In the context of a department that performs 200 bimaxillary cases each year, this would represent a saving of 25 days of doctor time, if applied to every case. It is concluded that CASS offers great potential for improving efficiency when used in the planning of bimaxillary orthognathic surgery. It saves significant doctor time that can be applied to additional surgical work. Copyright © 2013 International Association of Oral and Maxillofacial Surgeons. Published by Elsevier Ltd. All rights reserved.

  1. High performance parallel computing of flows in complex geometries: II. Applications

    International Nuclear Information System (INIS)

    Gourdain, N; Gicquel, L; Staffelbach, G; Vermorel, O; Duchaine, F; Boussuge, J-F; Poinsot, T

    2009-01-01

    Present regulations in terms of pollutant emissions, noise and economic constraints require new approaches and designs in the fields of energy supply and transportation. It is now well established that the next breakthrough will come from a better understanding of unsteady flow effects and from considering the entire system rather than isolated components. However, these aspects are still not well taken into account by the numerical approaches or understood, whatever the design stage considered. The main challenge lies in the computational requirements imposed by such complex systems when they are simulated on supercomputers. This paper shows how these challenges can be addressed by using parallel computing platforms for distinct elements of a more complex system, as encountered in aeronautical applications. Based on numerical simulations performed with modern aerodynamic and reactive flow solvers, this work underlines the interest of high-performance computing for solving flows in complex industrial configurations such as aircraft, combustion chambers and turbomachines. Performance indicators related to parallel computing efficiency are presented, showing that establishing fair criteria is a difficult task for complex industrial applications. Examples of numerical simulations performed in industrial systems are also described, with particular attention to the computational time and the potential design improvements obtained with high-fidelity and multi-physics computing methods. These simulations use either unsteady Reynolds-averaged Navier-Stokes methods or large eddy simulation and deal with turbulent unsteady flows, such as coupled flow phenomena (thermo-acoustic instabilities, buffet, etc.). Some examples of the difficulties encountered with grid generation and data analysis are also presented for these complex industrial applications.

  2. Efficient computation of aerodynamic influence coefficients for aeroelastic analysis on a transputer network

    Science.gov (United States)

    Janetzke, David C.; Murthy, Durbha V.

    1991-01-01

    Aeroelastic analysis is multi-disciplinary and computationally expensive. Hence, it can greatly benefit from parallel processing. As part of an effort to develop an aeroelastic capability on a distributed memory transputer network, a parallel algorithm for the computation of aerodynamic influence coefficients is implemented on a network of 32 transputers. The aerodynamic influence coefficients are calculated using a 3-D unsteady aerodynamic model and a parallel discretization. Efficiencies up to 85 percent were demonstrated using 32 processors. The effects of subtask ordering, problem size, and network topology are presented. A comparison to results on a shared memory computer indicates that higher speedup is achieved on the distributed memory system.

  3. Ground-glass opacity: High-resolution computed tomography and 64-multi-slice computed tomography findings comparison

    International Nuclear Information System (INIS)

    Sergiacomi, Gianluigi; Ciccio, Carmelo; Boi, Luca; Velari, Luca; Crusco, Sonia; Orlacchio, Antonio; Simonetti, Giovanni

    2010-01-01

    Objective: Comparative evaluation of ground-glass opacity using the conventional high-resolution computed tomography technique and volumetric computed tomography with a 64-row multi-slice scanner, verifying the advantages of the volumetric acquisition and post-processing techniques allowed by the 64-row CT scanner. Methods: Thirty-four patients, in whom a ground-glass opacity pattern had been assessed by previous high-resolution computed tomography during a clinical-radiological follow-up for their lung disease, were studied by means of 64-row multi-slice computed tomography. A comparative evaluation of image quality was done for both CT modalities. Results: Good inter-observer agreement (k value 0.78-0.90) was reported in the detection of ground-glass opacity with the high-resolution computed tomography technique and the volumetric computed tomography acquisition, with a moderate increase of intra-observer agreement (k value 0.46) using volumetric computed tomography compared with high-resolution computed tomography. Conclusions: In our experience, volumetric computed tomography with a 64-row scanner shows good accuracy in the detection of ground-glass opacity, providing better spatial and temporal resolution and more advanced post-processing techniques than high-resolution computed tomography.

  4. High performance parallel computing of flows in complex geometries: I. Methods

    International Nuclear Information System (INIS)

    Gourdain, N; Gicquel, L; Montagnac, M; Vermorel, O; Staffelbach, G; Garcia, M; Boussuge, J-F; Gazaix, M; Poinsot, T

    2009-01-01

    Efficient numerical tools, coupled with high-performance computers, have become a key element of the design process in the fields of energy supply and transportation. However, flow phenomena that occur in complex systems such as gas turbines and aircraft are still not well understood, mainly because of the models that are needed. In fact, most computational fluid dynamics (CFD) predictions as found today in industry focus on a reduced or simplified version of the real system (such as a periodic sector) and are usually solved with a steady-state assumption. This paper shows how to overcome such barriers and how this new challenge can be addressed by developing flow solvers running on high-end computing platforms, using thousands of computing cores. Parallel strategies used by modern flow solvers are discussed with particular emphasis on mesh-partitioning, load balancing and communication. Two examples are used to illustrate these concepts: a multi-block structured code and an unstructured code. Parallel computing strategies used with both flow solvers are detailed and compared. This comparison indicates that mesh-partitioning and load balancing are more straightforward with unstructured grids than with multi-block structured meshes. However, the mesh-partitioning stage can be challenging for unstructured grids, mainly due to memory limitations of the newly developed massively parallel architectures. Finally, detailed investigations show that the impact of mesh-partitioning on the numerical CFD solutions, due to rounding errors and block splitting, may be of importance and should be accurately addressed before qualifying massively parallel CFD tools for routine industrial use.

  5. Some computational challenges of developing efficient parallel algorithms for data-dependent computations in thermal-hydraulics supercomputer applications

    International Nuclear Information System (INIS)

    Woodruff, S.B.

    1992-01-01

    The Transient Reactor Analysis Code (TRAC), which features a two-fluid treatment of thermal-hydraulics, is designed to model transients in water reactors and related facilities. One of the major computational costs associated with TRAC and similar codes is calculating constitutive coefficients. Although the formulations for these coefficients are local, the costs are flow-regime- or data-dependent; i.e., the computations needed for a given spatial node often vary widely as a function of time. Consequently, poor load balancing will degrade efficiency on either vector or data parallel architectures when the data are organized according to spatial location. Unfortunately, a general automatic solution to the load-balancing problem associated with data-dependent computations is not yet available for massively parallel architectures. This document discusses why developers should consider alternative approaches, such as a neural net representation, that do not exhibit load-balancing problems.
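
    As a minimal illustration of the load-balancing issue raised above (hypothetical per-node costs, not TRAC's actual coefficients), the sketch below applies a greedy longest-processing-time assignment of nodes to processors.

      # Minimal sketch: greedy "longest processing time first" assignment of spatial nodes
      # with data-dependent costs to processors (hypothetical costs, not TRAC data).
      import heapq
      import random

      def balance(costs, n_procs):
          """Assign per-node costs to processors, heaviest node first, lightest processor first."""
          heap = [(0.0, p, []) for p in range(n_procs)]       # (load, processor id, node list)
          heapq.heapify(heap)
          for node in sorted(range(len(costs)), key=lambda i: -costs[i]):
              load, p, nodes = heapq.heappop(heap)
              nodes.append(node)
              heapq.heappush(heap, (load + costs[node], p, nodes))
          return sorted(heap, key=lambda t: t[1])

      random.seed(0)
      costs = [random.expovariate(1.0) for _ in range(1000)]  # widely varying per-node work
      for load, p, nodes in balance(costs, 8):
          print(f"processor {p}: {len(nodes)} nodes, load {load:.1f}")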

  6. HIGH EFFICIENCY TURBINE

    OpenAIRE

    VARMA, VIJAYA KRUSHNA

    2012-01-01

    Varma designed ultra-modern and high-efficiency turbines which can use gas, steam or fuels as feed to produce electricity or mechanical work for a wide range of uses and applications in industries or at work sites. Varma turbine engines can be used in all types of vehicles. These turbines can also be used in aircraft, ships, battle tanks, dredgers, mining equipment, earth moving machines, etc. Salient features of Varma turbines: 1. Varma turbines are simple in design, easy to manufac...

  7. A Case Study on Neural Inspired Dynamic Memory Management Strategies for High Performance Computing.

    Energy Technology Data Exchange (ETDEWEB)

    Vineyard, Craig Michael [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Verzi, Stephen Joseph [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2017-09-01

    As high performance computing architectures pursue more computational power there is a need for increased memory capacity and bandwidth as well. A multi-level memory (MLM) architecture addresses this need by combining multiple memory types with different characteristics as varying levels of the same architecture. How to efficiently utilize this memory infrastructure is an unknown challenge, and in this research we sought to investigate whether neural inspired approaches can meaningfully help with memory management. In particular we explored neurogenesis inspired resource allocation, and were able to show a neural inspired mixed controller policy can beneficially impact how MLM architectures utilize memory.

  8. High-performance computing for airborne applications

    International Nuclear Information System (INIS)

    Quinn, Heather M.; Manuzatto, Andrea; Fairbanks, Tom; Dallmann, Nicholas; Desgeorges, Rose

    2010-01-01

    Recently, there have been attempts to move common satellite tasks to unmanned aerial vehicles (UAVs). UAVs are significantly cheaper to buy than satellites and easier to deploy on an as-needed basis. The more benign radiation environment also allows for an aggressive adoption of state-of-the-art commercial computational devices, which increases the amount of data that can be collected. There are a number of commercial computing devices currently available that are well-suited to high-performance computing. These devices range from specialized computational devices, such as field-programmable gate arrays (FPGAs) and digital signal processors (DSPs), to traditional computing platforms, such as microprocessors. Even though the radiation environment is relatively benign, these devices could be susceptible to single-event effects. In this paper, we will present radiation data for high-performance computing devices in an accelerated neutron environment. These devices include a multi-core digital signal processor, two field-programmable gate arrays, and a microprocessor. From these results, we found that all of these devices are suitable for many airplane environments without reliability problems.

  9. Gamma-ray spectrometer system with high efficiency and high resolution

    International Nuclear Information System (INIS)

    Moss, C.E.; Bernard, W.; Dowdy, E.J.; Garcia, C.; Lucas, M.C.; Pratt, J.C.

    1983-01-01

    Our gamma-ray spectrometer system, designed for field use, offers high efficiency and high resolution for safeguards applications. The system consists of three 40% high-purity germanium detectors and a LeCroy 3500 data acquisition system that calculates a composite spectrum for the three detectors. The LeCroy 3500 mainframe can be operated remotely from the detector array with control exercised through modems and the telephone system. System performance with a mixed source of 125Sb, 154Eu, and 155Eu confirms the expected efficiency of 120%, with the overall resolution showing little degradation over that of the worst detector.

  10. High voltage generator circuit with low power and high efficiency applied in EEPROM

    International Nuclear Information System (INIS)

    Liu Yan; Zhang Shilin; Zhao Yiqiang

    2012-01-01

    This paper presents a low power and high efficiency high voltage generator circuit embedded in electrically erasable programmable read-only memory (EEPROM). Power consumption is minimized by a capacitance divider circuit and a regulator circuit using a controlling clock switch technique. The high efficiency relies on a zero threshold voltage (Vth) MOSFET and a charge transfer switch (CTS) charge pump. The proposed high voltage generator circuit has been implemented in a 0.35 μm EEPROM CMOS process. Measured results show that the proposed high voltage generator circuit has a low power consumption of about 150.48 μW and a higher pumping efficiency (83.3%) than previously reported circuits. This high voltage generator circuit can also be widely used in low-power flash devices due to its high efficiency and low power dissipation. (semiconductor integrated circuits)

  11. Quantum Accelerators for High-performance Computing Systems

    Energy Technology Data Exchange (ETDEWEB)

    Humble, Travis S. [ORNL; Britt, Keith A. [ORNL; Mohiyaddin, Fahd A. [ORNL

    2017-11-01

    We define some of the programming and system-level challenges facing the application of quantum processing to high-performance computing. Alongside barriers to physical integration, prominent differences in the execution of quantum and conventional programs challenge the intersection of these computational models. Following a brief overview of the state of the art, we discuss recent advances in programming and execution models for hybrid quantum-classical computing. We discuss a novel quantum-accelerator framework that uses specialized kernels to offload select workloads while integrating with existing computing infrastructure. We elaborate on the role of the host operating system to manage these unique accelerator resources, the prospects for deploying quantum modules, and the requirements placed on the language hierarchy connecting these different system components. We draw on recent advances in the modeling and simulation of quantum computing systems with the development of architectures for hybrid high-performance computing systems and the realization of software stacks for controlling quantum devices. Finally, we present simulation results that describe the expected system-level behavior of high-performance computing systems composed from compute nodes with quantum processing units. We describe performance for these hybrid systems in terms of time-to-solution, accuracy, and energy consumption, and we use simple application examples to estimate the performance advantage of quantum acceleration.

  12. An innovative computationally efficient hydromechanical coupling approach for fault reactivation in geological subsurface utilization

    Science.gov (United States)

    Adams, M.; Kempka, T.; Chabab, E.; Ziegler, M.

    2018-02-01

    Estimating the efficiency and sustainability of geological subsurface utilization, i.e., Carbon Capture and Storage (CCS), requires an integrated risk assessment approach that considers the coupled processes that occur, among others the potential reactivation of existing faults. In this context, hydraulic and mechanical parameter uncertainties as well as different injection rates have to be considered and quantified to elaborate reliable environmental impact assessments. Consequently, the required sensitivity analyses consume significant computational time due to the high number of realizations that have to be carried out. Because of the high computational costs of two-way coupled simulations in large-scale 3D multiphase fluid flow systems, these are not applicable for the purpose of uncertainty and risk assessments. Hence, an innovative semi-analytical hydromechanical coupling approach for hydraulic fault reactivation is introduced. This approach determines the void ratio evolution in representative fault elements using one preliminary base simulation, considering one model geometry and one set of hydromechanical parameters. The void ratio development is then approximated and related to one reference pressure at the base of the fault. The parametrization of the resulting functions is then directly implemented into a multiphase fluid flow simulator to carry out the semi-analytical coupling for the simulation of hydromechanical processes. In this way, the iterative parameter exchange between the multiphase and mechanical simulators is omitted, since the update of porosity and permeability is controlled by one reference pore pressure at the fault base. The suggested procedure is capable of reducing the computational time required by coupled hydromechanical simulations of a multitude of injection rates by a factor of up to 15.
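
    A minimal sketch of the kind of semi-analytical update described above, with hypothetical numbers rather than the authors' parametrization: the void ratio is fitted against one reference pressure from a single base run, and the fit is then used to update fault porosity and permeability inside a flow simulation loop.

      # Minimal sketch (hypothetical data, not the authors' parametrization): fit the fault
      # void ratio against one reference pressure, then update porosity and permeability
      # from that fit during a multiphase flow simulation.
      import numpy as np

      # results of one preliminary two-way coupled base simulation (assumed values)
      p_ref_base = np.array([10.0, 12.0, 14.0, 16.0, 18.0])   # MPa at the fault base
      void_ratio = np.array([0.150, 0.153, 0.158, 0.165, 0.174])

      coeffs = np.polyfit(p_ref_base, void_ratio, 2)           # quadratic fit e(p_ref)

      def update_fault_properties(p_ref, k0=1e-15, e0=0.150):
          """Porosity and permeability for the current reference pressure at the fault base."""
          e = np.polyval(coeffs, p_ref)
          phi = e / (1.0 + e)                                  # porosity from void ratio
          k = k0 * (e / e0) ** 3 * (1.0 + e0) / (1.0 + e)      # Kozeny-Carman-type scaling
          return phi, k

      for p in (11.0, 15.0, 17.5):                             # pressures seen during the flow run
          phi, k = update_fault_properties(p)
          print(f"p_ref = {p:5.1f} MPa -> porosity {phi:.4f}, permeability {k:.2e} m^2")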

  13. A transport layer protocol for the future high speed grid computing: SCTP versus fast tcp multihoming

    International Nuclear Information System (INIS)

    Arshad, M.J.; Mian, M.S.

    2010-01-01

    TCP (Transmission Control Protocol) is designed for reliable data transfer on the global Internet today. One of its strong points is its use of a flow control algorithm that allows TCP to adjust its congestion window if network congestion occurs. A number of studies and investigations have confirmed that traditional TCP is not suitable for each and every type of application, for example, bulk data transfer over high speed long distance networks. TCP served well in the era of low-capacity and short-delay networks; however, for numerous reasons it cannot efficiently cope with today's growing technologies (such as wide area Grid computing and optical-fiber networks). This research work surveys the congestion control mechanisms of transport protocols and addresses the different issues involved in transferring huge data volumes over future high speed Grid computing and optical-fiber networks. This work also presents simulations to compare the performance of FAST TCP multihoming with SCTP (Stream Control Transmission Protocol) multihoming in high speed networks. These simulation results show that FAST TCP multihoming achieves bandwidth aggregation efficiently and outperforms SCTP multihoming under similar network conditions. The survey and simulation results presented in this work reveal that multihoming support in FAST TCP provides many benefits, such as redundancy, load-sharing and policy-based routing, which largely improve the overall performance of a network and can meet the increasing demands of future high-speed network infrastructures (such as Grid computing). (author)
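
    For readers unfamiliar with delay-based congestion control, the sketch below shows a FAST-TCP-style periodic window update (illustrative alpha and gamma values and a toy queueing model; not the simulation setup used in this work).

      # Rough sketch of a FAST-TCP-style delay-based congestion window update
      # (illustrative parameters and a toy queueing-delay model, not the paper's setup).
      def fast_tcp_window(w, base_rtt, rtt, alpha=200.0, gamma=0.5):
          """One periodic update: steer toward roughly alpha packets queued in the path."""
          target = (base_rtt / rtt) * w + alpha
          return min(2.0 * w, (1.0 - gamma) * w + gamma * target)

      w, base_rtt = 10.0, 0.050                   # packets, seconds
      for step in range(5):
          rtt = base_rtt + 0.0001 * w             # toy model: queueing delay grows with cwnd
          w = fast_tcp_window(w, base_rtt, rtt)
          print(f"step {step}: cwnd = {w:.1f} packets, rtt = {rtt * 1000:.2f} ms")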

  14. CFD application to advanced design for high efficiency spacer grid

    International Nuclear Information System (INIS)

    Ikeda, Kazuo

    2014-01-01

    Highlights: • A new LDV was developed to investigate the local velocity in a rod bundle and inside a spacer grid. • Design information useful for a high-efficiency spacer grid has been obtained. • A CFD methodology that predicts the flow field in a PWR fuel assembly has been developed. • The high-efficiency spacer grid was designed using the CFD methodology. - Abstract: Pressurized water reactor (PWR) fuels have been developed to meet the needs of the market. A spacer grid is a key component for improving the thermal hydraulic performance of a PWR fuel assembly. Mixing structures (vanes) of a spacer grid promote coolant mixing and enhance heat removal from fuel rods. A larger mixing vane would improve the mixing effect, which would increase the departure from nucleate boiling (DNB) benefit for fuel. However, the increased pressure loss at large mixing vanes would reduce the coolant flow in a mixed fuel core, which would reduce the DNB margin. The solution is to develop a spacer grid whose pressure loss is equal to or less than that of the current spacer grid and that has higher critical heat flux (CHF) performance. For this reason, the need for a design tool that predicts the pressure loss and CHF performance of spacer grids has increased. The author and co-workers have worked on the development of high-efficiency spacer grids using Computational Fluid Dynamics (CFD) for nearly 20 years. A new laser Doppler velocimetry (LDV), which is miniaturized with fiber optics embedded in a fuel cladding, was developed to investigate the local velocity profile in a rod bundle and inside a spacer grid. The rod-embedded fiber LDV (rod LDV) can be inserted in an arbitrary grid cell instead of a fuel rod, and has the advantage of not disturbing the flow field since it has the same shape as a fuel rod. The probe volume of the rod LDV is small enough to measure the spatial velocity profile in a rod gap and inside a spacer grid. According to benchmark experiments such as flow velocity

  15. CFD application to advanced design for high efficiency spacer grid

    Energy Technology Data Exchange (ETDEWEB)

    Ikeda, Kazuo, E-mail: kazuo3_ikeda@ndc.mhi.co.jp

    2014-11-15

    Highlights: • A new LDV was developed to investigate the local velocity in a rod bundle and inside a spacer grid. • Design information useful for a high-efficiency spacer grid has been obtained. • A CFD methodology that predicts the flow field in a PWR fuel assembly has been developed. • The high-efficiency spacer grid was designed using the CFD methodology. - Abstract: Pressurized water reactor (PWR) fuels have been developed to meet the needs of the market. A spacer grid is a key component for improving the thermal hydraulic performance of a PWR fuel assembly. Mixing structures (vanes) of a spacer grid promote coolant mixing and enhance heat removal from fuel rods. A larger mixing vane would improve the mixing effect, which would increase the departure from nucleate boiling (DNB) benefit for fuel. However, the increased pressure loss at large mixing vanes would reduce the coolant flow in a mixed fuel core, which would reduce the DNB margin. The solution is to develop a spacer grid whose pressure loss is equal to or less than that of the current spacer grid and that has higher critical heat flux (CHF) performance. For this reason, the need for a design tool that predicts the pressure loss and CHF performance of spacer grids has increased. The author and co-workers have worked on the development of high-efficiency spacer grids using Computational Fluid Dynamics (CFD) for nearly 20 years. A new laser Doppler velocimetry (LDV), which is miniaturized with fiber optics embedded in a fuel cladding, was developed to investigate the local velocity profile in a rod bundle and inside a spacer grid. The rod-embedded fiber LDV (rod LDV) can be inserted in an arbitrary grid cell instead of a fuel rod, and has the advantage of not disturbing the flow field since it has the same shape as a fuel rod. The probe volume of the rod LDV is small enough to measure the spatial velocity profile in a rod gap and inside a spacer grid. According to benchmark experiments such as flow velocity

  16. A highly efficient 3D level-set grain growth algorithm tailored for ccNUMA architecture

    Science.gov (United States)

    Mießen, C.; Velinov, N.; Gottstein, G.; Barrales-Mora, L. A.

    2017-12-01

    A highly efficient simulation model for 2D and 3D grain growth was developed based on the level-set method. The model introduces modern computational concepts to achieve excellent performance on parallel computer architectures. Strong scalability was measured on cache-coherent non-uniform memory access (ccNUMA) architectures. To achieve this, the proposed approach considers the application of local level-set functions at the grain level. Ideal and non-ideal grain growth was simulated in 3D with the objective of studying the evolution of statistically representative volume elements in polycrystals. In addition, microstructure evolution in an anisotropic magnetic material affected by an external magnetic field was simulated.

  17. High-Performance Computing Paradigm and Infrastructure

    CERN Document Server

    Yang, Laurence T

    2006-01-01

    With hyperthreading in Intel processors, hypertransport links in next-generation AMD processors, multi-core silicon in today's high-end microprocessors from IBM, and emerging grid computing, parallel and distributed computers have moved into the mainstream.

  18. Robust fault detection of linear systems using a computationally efficient set-membership method

    DEFF Research Database (Denmark)

    Tabatabaeipour, Mojtaba; Bak, Thomas

    2014-01-01

    In this paper, a computationally efficient set-membership method for robust fault detection of linear systems is proposed. The method computes an interval outer-approximation of the output of the system that is consistent with the model, the bounds on noise and disturbance, and the past measureme...... is trivially parallelizable. The method is demonstrated for fault detection of a hydraulic pitch actuator of a wind turbine. We show the effectiveness of the proposed method by comparing our results with two zonotope-based set-membership methods....
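
    A generic, scalar interval outer-approximation in the spirit of the method described above (toy system and bounds, not the authors' algorithm): propagate bounds that are consistent with the model and the noise bounds, and flag a fault when a measurement falls outside them.

      # Minimal sketch (toy scalar system, not the authors' algorithm): interval
      # outer-approximation with bounded disturbance and noise, used for fault detection.
      import random

      A, B = 0.9, 0.1
      W, V = 0.05, 0.02                 # disturbance and measurement noise bounds

      def predict(x_lo, x_hi, u):
          """Outer-approximate x+ = A*x + B*u + w with |w| <= W (assumes A > 0)."""
          return A * x_lo + B * u - W, A * x_hi + B * u + W

      random.seed(1)
      x_lo, x_hi, x_true = -0.1, 0.1, 0.0
      for k in range(20):
          u = 1.0
          x_true = A * x_true + B * u + random.uniform(-W, W)
          if k == 12:
              x_true += 0.5                       # injected additive fault
          x_lo, x_hi = predict(x_lo, x_hi, u)
          y = x_true + random.uniform(-V, V)      # noisy measurement
          if y < x_lo - V or y > x_hi + V:
              print(f"step {k}: fault detected, y = {y:.3f} outside [{x_lo:.3f}, {x_hi:.3f}]")
          else:
              # measurement update: keep only the states consistent with y
              x_lo, x_hi = max(x_lo, y - V), min(x_hi, y + V)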

  19. Development and characterization of high-efficiency, high-specific impulse xenon Hall thrusters

    Science.gov (United States)

    Hofer, Richard Robert

    This dissertation presents research aimed at extending the efficient operation of 1600 s specific impulse Hall thruster technology to the 2000--3000 s range. While recent studies of commercially developed Hall thrusters demonstrated greater than 4000 s specific impulse, maximum efficiency occurred at less than 3000 s. It was hypothesized that the efficiency maximum resulted as a consequence of modern magnetic field designs, optimized for 1600 s, which were unsuitable at high-specific impulse. Motivated by the industry efforts and mission studies, the aim of this research was to develop and characterize xenon Hall thrusters capable of both high-specific impulse and high-efficiency operation. The research was divided into development and characterization phases. During the development phase, the laboratory-model NASA-173M Hall thrusters were designed with plasma lens magnetic field topographies and their performance and plasma characteristics were evaluated. Experiments with the NASA-173M version 1 (v1) validated the plasma lens design by showing how changing the magnetic field topography at high-specific impulse improved efficiency. Experiments with the NASA-173M version 2 (v2) showed there was a minimum current density and optimum magnetic field topography at which efficiency monotonically increased with voltage. Between 300--1000 V, total specific impulse and total efficiency of the NASA-173Mv2 operating at 10 mg/s ranged from 1600--3400 s and 51--61%, respectively. Comparison of the thrusters showed that efficiency can be optimized for specific impulse by varying the plasma lens design. During the characterization phase, additional plasma properties of the NASA-173Mv2 were measured and a performance model was derived accounting for a multiply-charged, partially-ionized plasma. Results from the model based on experimental data showed how efficient operation at high-specific impulse was enabled through regulation of the electron current with the magnetic field. The

  20. Defect correction and multigrid for an efficient and accurate computation of airfoil flows

    NARCIS (Netherlands)

    Koren, B.

    1988-01-01

    Results are presented for an efficient solution method for second-order accurate discretizations of the 2D steady Euler equations. The solution method is based on iterative defect correction. Several schemes are considered for the computation of the second-order defect. In each defect correction
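
    A one-dimensional toy version of iterative defect correction (model problem u' = exp(x), not the Euler solver of the paper): a first-order operator is solved repeatedly while the defect is measured with a second-order discretization.

      # Minimal sketch (1D toy problem, not the paper's Euler solver): iterative defect
      # correction where an easy first-order operator A1 drives the residual of a
      # second-order accurate operator A2 to zero.
      import numpy as np

      n = 50
      h = 1.0 / n
      x = np.linspace(h, 1.0, n)             # grid h, 2h, ..., 1; boundary value u(0) = 0
      b = np.exp(x)                          # right-hand side of u'(x) = exp(x)

      # A1: first-order backward differences (cheap to solve, lower triangular)
      A1 = (np.eye(n) - np.eye(n, k=-1)) / h

      # A2: second-order BDF2-type differences (used only to evaluate the defect)
      A2 = np.zeros((n, n))
      A2[0, 0] = 1.0 / h                     # first-order start-up row
      for j in range(1, n):
          A2[j, j] = 1.5 / h
          A2[j, j - 1] = -2.0 / h
          if j >= 2:
              A2[j, j - 2] = 0.5 / h

      u = np.linalg.solve(A1, b)             # basic first-order accurate solution
      for it in range(8):
          defect = b - A2 @ u                # residual of the second-order discretization
          u += np.linalg.solve(A1, defect)   # correct using the simple operator
          print(f"iteration {it}: ||defect|| = {np.linalg.norm(defect):.2e}")

      print("max error vs exact exp(x) - 1:", np.max(np.abs(u - (np.exp(x) - 1.0))))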

  1. High-efficiency white OLEDs based on small molecules

    Science.gov (United States)

    Hatwar, Tukaram K.; Spindler, Jeffrey P.; Ricks, M. L.; Young, Ralph H.; Hamada, Yuuhiko; Saito, N.; Mameno, Kazunobu; Nishikawa, Ryuji; Takahashi, Hisakazu; Rajeswaran, G.

    2004-02-01

    Eastman Kodak Company and SANYO Electric Co., Ltd. recently demonstrated a 15" full-color, organic light-emitting diode display (OLED) using a high-efficiency white emitter combined with a color-filter array. Although useful for display applications, white emission from organic structures is also under consideration for other applications, such as solid-state lighting, where high efficiency and good color rendition are important. By incorporating adjacent blue and orange emitting layers in a multi-layer structure, highly efficient, stable white emission has been attained. With suitable host and dopant combinations, a luminance yield of 20 cd/A and efficiency of 8 lm/W have been achieved at a drive voltage of less than 8 volts and luminance level of 1000 cd/m2. The estimated external efficiency of this device is 6.3% and a high level of operational stability is observed. To our knowledge, this is the highest performance reported so far for white organic electroluminescent devices. We will review white OLED technology and discuss the fabrication and operating characteristics of these devices.

  2. Development of High-speed Visualization System of Hypocenter Data Using CUDA-based GPU computing

    Science.gov (United States)

    Kumagai, T.; Okubo, K.; Uchida, N.; Matsuzawa, T.; Kawada, N.; Takeuchi, N.

    2014-12-01

    After the Great East Japan Earthquake on March 11, 2011, intelligent visualization of seismic information has become important for understanding earthquake phenomena. At the same time, the quantity of seismic data has become enormous with the progress of high-accuracy observation networks; we need to treat many parameters (e.g., positional information, origin time, magnitude, etc.) to efficiently display the seismic information. Therefore, high-speed processing of data and image information is necessary to handle enormous amounts of seismic data. Recently, the GPU (Graphic Processing Unit) has been used as an acceleration tool for data processing and calculation in various study fields. This movement is called GPGPU (General Purpose computing on GPUs). In the last few years the performance of GPUs has kept improving rapidly. GPU computing gives us a high-performance computing environment at a lower cost than before. Moreover, the use of GPUs has an advantage for visualization of processed data, because the GPU is originally an architecture for graphics processing. In GPU computing, the processed data is always stored in the video memory. Therefore, we can directly write drawing information to the VRAM on the video card by combining CUDA and the graphics API. In this study, we employ CUDA and OpenGL and/or DirectX to realize a full-GPU implementation. This method makes it possible to write drawing information to the VRAM on the video card without PCIe bus data transfer, which enables high-speed processing of seismic data. The present study examines GPU computing-based high-speed visualization and its feasibility for a high-speed visualization system of hypocenter data.

  3. Approaches to achieve high grain yield and high resource use efficiency in rice

    Directory of Open Access Journals (Sweden)

    Jianchang YANG

    2015-06-01

    Full Text Available This article discusses approaches to simultaneously increase grain yield and resource use efficiency in rice. Breeding nitrogen-efficient cultivars without sacrificing rice yield potential, improving grain fill in later-flowering inferior spikelets and enhancing harvest index are three important approaches to achieving the dual goal of high grain yield and high resource use efficiency. Deeper root distribution and higher leaf photosynthetic N use efficiency at lower N rates could be used as selection criteria to develop N-efficient cultivars. Enhancing sink activity through increasing the sugar-spikelet ratio at heading time and enhancing the conversion efficiency from sucrose to starch through increasing the ratio of abscisic acid to ethylene in grains during grain fill could effectively improve grain fill in inferior spikelets. Several practices, such as post-anthesis controlled soil drying, an alternate wetting and moderate soil drying regime during the whole growing season, and non-flooded straw mulching cultivation, could substantially increase grain yield and water use efficiency, mainly via enhanced remobilization of stored carbon from vegetative tissues to grains and an improved harvest index. Further research is needed to understand the synergistic interaction between water and N on crop and soil and the mechanisms underlying high resource use efficiency in high-yielding rice.

  4. Grid computing in high energy physics

    CERN Document Server

    Avery, P

    2004-01-01

    Over the next two decades, major high energy physics (HEP) experiments, particularly at the Large Hadron Collider, will face unprecedented challenges to achieving their scientific potential. These challenges arise primarily from the rapidly increasing size and complexity of HEP datasets that will be collected and the enormous computational, storage and networking resources that will be deployed by global collaborations in order to process, distribute and analyze them. Coupling such vast information technology resources to globally distributed collaborations of several thousand physicists requires extremely capable computing infrastructures supporting several key areas: (1) computing (providing sufficient computational and storage resources for all processing, simulation and analysis tasks undertaken by the collaborations); (2) networking (deploying high speed networks to transport data quickly between institutions around the world); (3) software (supporting simple and transparent access to data and software r...

  5. Efficiency and Loading Evaluation of High Efficiency Mist Eliminators (HEME) - 12003

    Energy Technology Data Exchange (ETDEWEB)

    Giffin, Paxton K.; Parsons, Michael S.; Waggoner, Charles A. [Institute for Clean Energy Technology, Mississippi State University, 205 Research Blvd Starkville, MS 39759 (United States)

    2012-07-01

    High efficiency mist eliminators (HEME) are filters primarily used to remove moisture and/or liquid aerosols from an air stream. HEME elements are designed to reduce aerosol and particulate load on primary High Efficiency Particulate Air (HEPA) filters and to have a liquid particle removal efficiency of approximately 99.5% for aerosols down to sub-micron size particulates. The investigation presented here evaluates the loading capacity of the element in the absence of a water spray cleaning system. The theory is that without the cleaning system, the HEME element will suffer rapid buildup of solid aerosols, greatly reducing the particle loading capacity. Evaluation consists of challenging the element with a waste surrogate dry aerosol and di-octyl phthalate (DOP) at varying intervals of differential pressure to examine the filtering efficiency of three different element designs at three different media velocities. Also, the elements are challenged with a liquid waste surrogate using Laskin nozzles and large dispersion nozzles. These tests allow the loading capacity of the unit to be determined and the effectiveness of washing down the interior of the elements to be evaluated. (authors)

  6. Critical study of high efficiency deep grinding

    OpenAIRE

    Johnstone, lain

    2002-01-01

    In recent years, the aerospace industry in particular has embraced and actively pursued the development of stronger high performance materials, namely nickel based superalloys and hardwearing steels. This has resulted in a need for a more efficient method of machining, and this need was answered with the advent of High Efficiency Deep Grinding (HEDG). This relatively new process using Cubic Boron Nitride (CBN) electroplated grinding wheels has been investigated through experim...

  7. Efficiency Analysis of the Parallel Implementation of the SIMPLE Algorithm on Multiprocessor Computers

    Science.gov (United States)

    Lashkin, S. V.; Kozelkov, A. S.; Yalozo, A. V.; Gerasimov, V. Yu.; Zelensky, D. K.

    2017-12-01

    This paper describes the details of the parallel implementation of the SIMPLE algorithm for numerical solution of the Navier-Stokes system of equations on arbitrary unstructured grids. The iteration schemes for the serial and parallel versions of the SIMPLE algorithm are implemented. In the description of the parallel implementation, special attention is paid to computational data exchange among processors under the condition of the grid model decomposition using fictitious cells. We discuss the specific features for the storage of distributed matrices and implementation of vector-matrix operations in parallel mode. It is shown that the proposed way of matrix storage reduces the number of interprocessor exchanges. A series of numerical experiments illustrates the effect of the multigrid SLAE solver tuning on the general efficiency of the algorithm; the tuning involves the types of the cycles used (V, W, and F), the number of iterations of a smoothing operator, and the number of cells for coarsening. Two ways (direct and indirect) of efficiency evaluation for parallelization of the numerical algorithm are demonstrated. The paper presents the results of solving some internal and external flow problems with the evaluation of parallelization efficiency by two algorithms. It is shown that the proposed parallel implementation enables efficient computations for the problems on a thousand processors. Based on the results obtained, some general recommendations are made for the optimal tuning of the multigrid solver, as well as for selecting the optimal number of cells per processor.
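
    The fictitious-cell exchange mentioned above can be pictured with a serial stand-in for the MPI communication (toy 1D Poisson problem, not the SIMPLE solver itself): each subdomain refreshes its ghost cells from its neighbour before every relaxation sweep.

      # Minimal sketch (serial stand-in for MPI exchanges, toy 1D Poisson problem):
      # refresh fictitious (ghost) cells of two subdomains before each relaxation sweep.
      import numpy as np

      def exchange_ghosts(left, right):
          """Fictitious-cell update: each subdomain receives its neighbour's edge value."""
          left[-1] = right[1]       # left's right ghost <- right's first interior cell
          right[0] = left[-2]       # right's left ghost <- left's last interior cell

      def jacobi_sweep(u, rhs, h):
          """One Jacobi relaxation of u'' = rhs on interior cells; ghost cells stay fixed."""
          new = u.copy()
          new[1:-1] = 0.5 * (u[:-2] + u[2:] - h * h * rhs)
          return new

      n_local = 8
      h = 1.0 / (2 * n_local + 1)   # two subdomains of 8 interior cells covering [0, 1]
      left = np.zeros(n_local + 2)  # [ghost | 8 interior | ghost]; outer ghosts are u = 0 walls
      right = np.zeros(n_local + 2)
      rhs = -1.0                    # u'' = -1 with u(0) = u(1) = 0, exact u(x) = x(1 - x)/2

      for _ in range(1000):         # plenty of sweeps for this small toy problem
          exchange_ghosts(left, right)
          left = jacobi_sweep(left, rhs, h)
          right = jacobi_sweep(right, rhs, h)

      # cells on either side of the processor cut, both close to the exact value ~0.1246
      print(left[-2], right[1])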

  8. Combustion phasing for maximum efficiency for conventional and high efficiency engines

    International Nuclear Information System (INIS)

    Caton, Jerald A.

    2014-01-01

    Highlights: • Combustion phasing for max efficiency is a function of engine parameters. • Combustion phasing is most affected by heat transfer, compression ratio, burn duration. • Combustion phasing is less affected by speed, load, equivalence ratio and EGR. • Combustion phasing for a high efficiency engine was more advanced. • Exergy destruction during combustion as functions of combustion phasing is reported. - Abstract: The importance of the phasing of the combustion event for internal-combustion engines is well appreciated, but quantitative details are sparse. The objective of the current work was to examine the optimum combustion phasing (based on maximum bmep) as functions of engine design and operating variables. A thermodynamic, engine cycle simulation was used to complete this assessment. As metrics for the combustion phasing, both the crank angle for 50% fuel mass burned (CA50) and the crank angle for peak pressure (CApp) are reported as functions of the engine variables. In contrast to common statements in the literature, the optimum CA50 and CApp vary depending on the design and operating variables. Optimum, as used in this paper, refers to the combustion timing that provides the maximum bmep and brake thermal efficiency (MBT timing). For this work, the variables with the greatest influence on the optimum CA50 and CApp were the heat transfer level, the burn duration and the compression ratio. Other variables such as equivalence ratio, EGR level, engine speed and engine load had a much smaller impact on the optimum CA50 and CApp. For the conventional engine, for the conditions examined, the optimum CA50 varied between about 5 and 11°aTDC, and the optimum CApp varied between about 9 and 16°aTDC. For a high efficiency engine (high dilution, high compression ratio), the optimum CA50 was 2.5°aTDC, and the optimum CApp was 7.8°aTDC. These more advanced values for the optimum CA50 and CApp for the high efficiency engine were

  9. Efficient Sustainable Operation Mechanism of Distributed Desktop Integration Storage Based on Virtualization with Ubiquitous Computing

    Directory of Open Access Journals (Sweden)

    Hyun-Woo Kim

    2015-06-01

    Full Text Available Following the rapid growth of ubiquitous computing, many jobs that were previously manual have now been automated. This automation has increased the amount of time available for leisure, and diverse services are now being developed for this leisure time. In addition, with the development of small and portable devices like smartphones, diverse Internet services can be used regardless of time and place. Studies regarding diverse virtualization are currently in progress. These studies aim to determine ways to efficiently store and process the big data generated by the multitude of devices and services in use. One topic of such studies is desktop storage virtualization, which integrates distributed desktop resources and provides these resources to users by integrating distributed legacy desktops via virtualization. In the case of desktop storage virtualization, high availability is necessary and important for providing reliability to users. Studies regarding hierarchical structures and resource integration are currently in progress. These studies aim to create efficient data distribution and storage for distributed desktops based on resource integration environments. However, studies regarding efficient responses to server faults occurring in desktop-based resource integration environments have been insufficient. This paper proposes a mechanism for the sustainable operation of desktop storage (SODS) for high operational availability. It allows for the easy addition and removal of desktops in desktop-based integration environments. It also activates alternative servers when a fault occurs within a system.

  10. Gaussian Radial Basis Function for Efficient Computation of Forest Indirect Illumination

    Science.gov (United States)

    Abbas, Fayçal; Babahenini, Mohamed Chaouki

    2018-06-01

    Global illumination of natural scenes such as forests in real time is one of the most complex problems to solve, because of the multiple inter-reflections between the light and the materials of the objects composing the scene. The major problem that arises is the computation of visibility. In fact, visibility is computed over the whole set of leaves visible from the center of a given leaf; given the enormous number of leaves present in a tree, this computation, performed for each leaf of the tree, severely reduces performance. We describe a new approach that approximates visibility queries, which proceeds in two steps. The first step is to generate a point cloud representing the foliage. We assume that the point cloud is composed of two classes (visible, not visible) that are non-linearly separable. The second step is to perform a point cloud classification by applying the Gaussian radial basis function, which measures the similarity in terms of distance between each leaf and a landmark leaf. It allows approximating the visibility queries to extract the leaves that will be used to calculate the amount of indirect illumination exchanged between neighboring leaves. Our approach efficiently treats the light exchanges in a forest scene, allows fast computation and produces images of good visual quality, all while taking advantage of the immense computing power of the GPU.
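
    A minimal sketch of the Gaussian RBF scoring step (synthetic leaf positions and an arbitrary threshold, not the authors' renderer): leaves are scored against a landmark leaf and the most similar ones are kept as approximately visible for the indirect-illumination gather.

      # Minimal sketch (synthetic points, not the authors' renderer): Gaussian RBF similarity
      # between every leaf and a landmark leaf, thresholded to approximate visibility.
      import numpy as np

      rng = np.random.default_rng(0)
      leaves = rng.uniform(0.0, 10.0, size=(5000, 3))   # hypothetical leaf centers (meters)
      landmark = np.array([5.0, 5.0, 5.0])              # landmark leaf for this query

      def rbf_scores(points, center, sigma=1.5):
          """Gaussian RBF similarity exp(-||p - c||^2 / (2 sigma^2)) for every point."""
          d2 = np.sum((points - center) ** 2, axis=1)
          return np.exp(-d2 / (2.0 * sigma ** 2))

      scores = rbf_scores(leaves, landmark)
      visible = leaves[scores > 0.1]                    # threshold chosen only for illustration
      print(f"{len(visible)} of {len(leaves)} leaves kept for the indirect-light estimate")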

  11. A parallelization study of the general purpose Monte Carlo code MCNP4 on a distributed memory highly parallel computer

    International Nuclear Information System (INIS)

    Yamazaki, Takao; Fujisaki, Masahide; Okuda, Motoi; Takano, Makoto; Masukawa, Fumihiro; Naito, Yoshitaka

    1993-01-01

    The general purpose Monte Carlo code MCNP4 has been implemented on the Fujitsu AP1000 distributed memory highly parallel computer. Parallelization techniques developed and studied are reported. A shielding analysis function of the MCNP4 code is parallelized in this study. A technique to map histories to processors dynamically and to map the control process to a specific processor was applied. The efficiency of the parallelized code is up to 80% for a typical practical problem with 512 processors. These results demonstrate the advantages of a highly parallel computer over conventional computers in the field of shielding analysis by the Monte Carlo method. (orig.)

  12. Energy Efficient Control of High Speed IPMSM Drives - A Generalized PSO Approach

    Directory of Open Access Journals (Sweden)

    GECIC, M.

    2016-02-01

    Full Text Available In this paper, a generalized particle swarm optimization (GPSO) algorithm was applied to the problem of optimal control of high speed, low cost interior permanent magnet synchronous motor (IPMSM) drives. In order to minimize the total controllable electrical losses and to increase the efficiency, the optimum current vector references are calculated offline based on GPSO for a wide speed range and for different load conditions. The voltage and current limits of the drive system and the variation of the stator inductances are all included in the optimization method. The stored optimal current vector references are used during real-time control, and the proposed algorithm is compared with the conventional high speed control algorithm, which is mostly voltage-limit based. The computer simulations and experimental results on a 1 kW low cost high speed IPMSM drive are discussed in detail.
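
    The offline search can be illustrated with a plain particle swarm optimization on a toy loss (standard PSO rather than the generalized variant, with made-up motor parameters): find the d-q current pair that meets a torque target with the least copper loss under a current limit.

      # Minimal sketch (standard PSO on a toy IPMSM loss model with made-up parameters,
      # not the generalized PSO or drive model from the paper).
      import numpy as np

      rng = np.random.default_rng(42)
      R, p, lam, Ld, Lq = 0.5, 3, 0.1, 0.005, 0.012   # illustrative machine parameters
      T_ref, i_max = 4.0, 20.0                        # torque demand (Nm), current limit (A)

      def cost(idq):
          i_d, i_q = idq
          torque = 1.5 * p * (lam * i_q + (Ld - Lq) * i_d * i_q)
          copper = R * (i_d ** 2 + i_q ** 2)
          penalty = 1e3 * (torque - T_ref) ** 2 + 1e3 * max(0.0, np.hypot(i_d, i_q) - i_max) ** 2
          return copper + penalty

      n, w, c1, c2 = 30, 0.7, 1.5, 1.5
      xs = rng.uniform([-i_max, 0.0], [0.0, i_max], size=(n, 2))   # particles: (id <= 0, iq >= 0)
      vs = np.zeros_like(xs)
      pbest, pbest_f = xs.copy(), np.array([cost(x) for x in xs])
      gbest = pbest[np.argmin(pbest_f)].copy()

      for _ in range(200):
          r1, r2 = rng.random((n, 2)), rng.random((n, 2))
          vs = w * vs + c1 * r1 * (pbest - xs) + c2 * r2 * (gbest - xs)
          xs = xs + vs
          fs = np.array([cost(x) for x in xs])
          better = fs < pbest_f
          pbest[better], pbest_f[better] = xs[better], fs[better]
          gbest = pbest[np.argmin(pbest_f)].copy()

      print("optimal (id, iq):", gbest, "loss:", cost(gbest))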

  13. Grid Computing in High Energy Physics

    International Nuclear Information System (INIS)

    Avery, Paul

    2004-01-01

    Over the next two decades, major high energy physics (HEP) experiments, particularly at the Large Hadron Collider, will face unprecedented challenges to achieving their scientific potential. These challenges arise primarily from the rapidly increasing size and complexity of HEP datasets that will be collected and the enormous computational, storage and networking resources that will be deployed by global collaborations in order to process, distribute and analyze them. Coupling such vast information technology resources to globally distributed collaborations of several thousand physicists requires extremely capable computing infrastructures supporting several key areas: (1) computing (providing sufficient computational and storage resources for all processing, simulation and analysis tasks undertaken by the collaborations); (2) networking (deploying high speed networks to transport data quickly between institutions around the world); (3) software (supporting simple and transparent access to data and software resources, regardless of location); (4) collaboration (providing tools that allow members full and fair access to all collaboration resources and enable distributed teams to work effectively, irrespective of location); and (5) education, training and outreach (providing resources and mechanisms for training students and for communicating important information to the public). It is believed that computing infrastructures based on Data Grids and optical networks can meet these challenges and can offer data intensive enterprises in high energy physics and elsewhere a comprehensive, scalable framework for collaboration and resource sharing. A number of Data Grid projects have been underway since 1999. Interestingly, the most exciting and far ranging of these projects are led by collaborations of high energy physicists, computer scientists and scientists from other disciplines in support of experiments with massive, near-term data needs. I review progress in this

  14. Energy efficient distributed computing systems

    CERN Document Server

    Lee, Young-Choon

    2012-01-01

    The energy consumption issue in distributed computing systems raises various monetary, environmental and system performance concerns. Electricity consumption in the US doubled from 2000 to 2005.  From a financial and environmental standpoint, reducing the consumption of electricity is important, yet these reforms must not lead to performance degradation of the computing systems.  These contradicting constraints create a suite of complex problems that need to be resolved in order to lead to 'greener' distributed computing systems.  This book brings together a group of outsta

  15. Contemporary high performance computing from petascale toward exascale

    CERN Document Server

    Vetter, Jeffrey S

    2013-01-01

    Contemporary High Performance Computing: From Petascale toward Exascale focuses on the ecosystems surrounding the world's leading centers for high performance computing (HPC). It covers many of the important factors involved in each ecosystem: computer architectures, software, applications, facilities, and sponsors. The first part of the book examines significant trends in HPC systems, including computer architectures, applications, performance, and software. It discusses the growth from terascale to petascale computing and the influence of the TOP500 and Green500 lists. The second part of the

  16. Highly efficient red electrophosphorescent devices at high current densities

    International Nuclear Information System (INIS)

    Wu Youzhi; Zhu Wenqing; Zheng Xinyou; Sun, Runguang; Jiang Xueyin; Zhang Zhilin; Xu Shaohong

    2007-01-01

    Efficiency decrease at high current densities in red electrophosphorescent devices is drastically restrained compared with that in conventional electrophosphorescent devices by using bis(2-methyl-8-quinolinato)4-phenylphenolate aluminum (BAlq) as a hole and exciton blocker. The Ir complex bis(2-(2'-benzo[4,5-α]thienyl)pyridinato-N,C3') iridium (acetyl-acetonate) is used as an emitter; a maximum external quantum efficiency (QE) of 7.0% and a luminance of 10000 cd/m2 are obtained. The QE is still as high as 4.1% at the higher current density J = 100 mA/cm2. The CIE-1931 coordinates are 0.672, 0.321. A carrier trapping mechanism is revealed to dominate the process of electroluminescence.

  17. CHEP95: Computing in high energy physics. Abstracts

    International Nuclear Information System (INIS)

    1995-01-01

    These proceedings cover the technical papers on computation in High Energy Physics, including computer codes, computer devices, control systems, simulations and data acquisition systems. New approaches to computer architectures are also discussed.

  18. High Efficiency, Low Emission Refrigeration System

    Energy Technology Data Exchange (ETDEWEB)

    Fricke, Brian A [ORNL; Sharma, Vishaldeep [ORNL

    2016-08-01

    Supermarket refrigeration systems account for approximately 50% of supermarket energy use, placing this class of equipment among the highest energy consumers in the commercial building domain. In addition, the commonly used refrigeration system in supermarket applications is the multiplex direct expansion (DX) system, which is prone to refrigerant leaks due to its long lengths of refrigerant piping. This leakage reduces the efficiency of the system and increases the impact of the system on the environment. The high Global Warming Potential (GWP) of the hydrofluorocarbon (HFC) refrigerants commonly used in these systems, coupled with the large refrigerant charge and the high refrigerant leakage rates leads to significant direct emissions of greenhouse gases into the atmosphere. Methods for reducing refrigerant leakage and energy consumption are available, but underutilized. Further work needs to be done to reduce costs of advanced system designs to improve market utilization. In addition, refrigeration system retrofits that result in reduced energy consumption are needed since the majority of applications address retrofits rather than new stores. The retrofit market is also of most concern since it involves large-volume refrigerant systems with high leak rates. Finally, alternative refrigerants for new and retrofit applications are needed to reduce emissions and reduce the impact on the environment. The objective of this Collaborative Research and Development Agreement (CRADA) between the Oak Ridge National Laboratory and Hill Phoenix is to develop a supermarket refrigeration system that reduces greenhouse gas emissions and has 25 to 30 percent lower energy consumption than existing systems. The outcomes of this project will include the design of a low emission, high efficiency commercial refrigeration system suitable for use in current U.S. supermarkets. In addition, a prototype low emission, high efficiency supermarket refrigeration system will be produced for

  19. Efficiency improvement opportunities for personal computer monitors. Implications for market transformation programs

    Energy Technology Data Exchange (ETDEWEB)

    Park, Won Young; Phadke, Amol; Shah, Nihar [Environmental Energy Technologies Division, Lawrence Berkeley National Laboratory, Berkeley, CA (United States)

    2013-08-15

    Displays account for a significant portion of the electricity consumed in personal computer (PC) use, and global PC monitor shipments are expected to continue to increase. We assess the market trends in the energy efficiency of PC monitors that are likely to occur without any additional policy intervention and estimate that PC monitor efficiency will likely improve by over 40% by 2015, with a saving potential of 4.5 TWh per year in 2015 compared to today's technology. We discuss various energy-efficiency improvement options and evaluate the cost-effectiveness of three of them; at least one improves efficiency cost-effectively by at least 20% beyond the ongoing market trends. We assess the potential for further improving efficiency, taking into account the recent development of universal serial bus-powered liquid crystal display monitors, and find that the technology currently available and deployed in them has the potential to deeply and cost-effectively reduce energy consumption by as much as 50%. We provide insights for policies and programs that can be used to accelerate the adoption of efficient technologies to further capture the global energy saving potential from PC monitors, which we estimate to be 9.2 TWh per year in 2015.

  20. Development of an optical resonator with high-efficient output coupler for the JAERI far-infrared free-electron laser

    International Nuclear Information System (INIS)

    Nagai, Ryoji; Hajima, Ryoichi; Nishimori, Nobuyuki; Sawamura, Masaru; Kikuzawa, Nobuhiro; Shizuma, Toshiyuki; Minehara, Eisuke

    2001-01-01

    An optical resonator with a high-efficiency output coupler was developed for the JAERI far-infrared free-electron laser. The optical resonator has a symmetrical near-concentric geometry with an insertable scraper output coupler. As a result of the development of the optical resonator, the JAERI-FEL has successfully lased with an average power over 1 kW. The performance of the optical resonator with the output coupler was evaluated at an optical wavelength of 22 μm by using an optical mode calculation code. The output coupling and diffractive loss for the dominant eigenmode of the resonator were calculated using an iterative computation called the Fox-Li procedure. An efficiency factor of the optical resonator was introduced for the evaluation of the optical resonator performance. The efficiency factor was derived from the output coupling and diffractive loss of the optical resonator. It was found that the optical resonator with the insertable scraper coupler was the most suitable for a high-power, high-efficiency far-infrared free-electron laser. (author)

  1. High quality ceramic coatings sprayed by high efficiency hypersonic plasma spraying gun

    International Nuclear Information System (INIS)

    Zhu Sheng; Xu Binshi; Yao JiuKun

    2005-01-01

    This paper introduced the structure of the high efficiency hypersonic plasma spraying gun and the effects of the hypersonic plasma jet on the sprayed particles. The optimised spraying process parameters for several ceramic powders such as Al2O3, Cr2O3, ZrO2, Cr3C2 and Co-WC were listed. The properties and microstructure of the sprayed ceramic coatings were investigated. A nano Al2O3-TiO2 ceramic coating sprayed by using the high efficiency hypersonic plasma spraying was also studied. Compared with conventional air plasma spraying, high efficiency hypersonic plasma spraying greatly improves the quality of the ceramic coatings, and at low cost. (orig.)

  2. Simple Motor Control Concept Results in High Efficiency at High Velocities

    Science.gov (United States)

    Starin, Scott; Engel, Chris

    2013-09-01

    The need for high velocity motors in space applications for reaction wheels and detectors has stressed the limits of Brushless Permanent Magnet Motors (BPMM). Due to inherent hysteresis core losses, conventional BPMMs try to balance the need for torque versus hysteresis losses. Cog-less motors have significantly lower hysteresis losses but suffer from lower efficiencies. Additionally, the inherent low inductance of cog-less motors results in high ripple currents or high switching frequencies, which lowers overall efficiency and increases performance demands on the control electronics. However, using a somewhat forgotten but fully qualified technology of Isotropic Magnet Motors (IMM), extremely high velocities may be achieved at low power input using conventional drive electronics. This paper will discuss the trade study efforts and empirical test data on a 34,000 RPM IMM.

  3. Efficiency Improvement Opportunities for Personal Computer Monitors. Implications for Market Transformation Programs

    Energy Technology Data Exchange (ETDEWEB)

    Park, Won Young [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Phadke, Amol [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Shah, Nihar [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2012-06-29

    Displays account for a significant portion of electricity consumed in personal computer (PC) use, and global PC monitor shipments are expected to continue to increase. We assess the market trends in the energy efficiency of PC monitors that are likely to occur without any additional policy intervention and estimate that display efficiency will likely improve by over 40% by 2015 compared to today’s technology. We evaluate the cost effectiveness of a key technology which further improves efficiency beyond this level by at least 20% and find that its adoption is cost effective. We assess the potential for further improving efficiency taking into account the recent development of universal serial bus (USB) powered liquid crystal display (LCD) monitors and find that the current technology available and deployed in USB powered monitors has the potential to deeply reduce energy consumption by as much as 50%. We provide insights for policies and programs that can be used to accelerate the adoption of efficient technologies to capture global energy saving potential from PC monitors which we estimate to be 9.2 terawatt-hours [TWh] per year in 2015.

  4. High efficiency nebulization for helium inductively coupled plasma mass spectrometry

    International Nuclear Information System (INIS)

    Jorabchi, Kaveh; McCormick, Ryan; Levine, Jonathan A.; Liu Huiying; Nam, S.-H.; Montaser, Akbar

    2006-01-01

    A pneumatically-driven, high efficiency nebulizer is explored for helium inductively coupled plasma mass spectrometry. The aerosol characteristics and analyte transport efficiencies of the high efficiency nebulizer for nebulization with helium are measured and compared to the results obtained with argon. Analytical performance indices of helium inductively coupled plasma mass spectrometry are evaluated in terms of detection limits and precision. The helium inductively coupled plasma mass spectrometry detection limits obtained with the high efficiency nebulizer at 200 μL/min are higher than those achieved with the ultrasonic nebulizer consuming 2 mL/min of solution; however, precision is generally better with the high efficiency nebulizer (1-4% vs. 3-8% with the ultrasonic nebulizer). Detection limits with the high efficiency nebulizer at a 200 μL/min solution uptake rate approach those obtained using the ultrasonic nebulizer upon efficient desolvation with a heated spray chamber followed by a Peltier-cooled multipass condenser.

  5. Efficient Geo-Computational Algorithms for Constructing Space-Time Prisms in Road Networks

    Directory of Open Access Journals (Sweden)

    Hui-Ping Chen

    2016-11-01

    The space-time prism (STP) is a key concept in time geography for analyzing human activity-travel behavior under various space-time constraints. Most existing time-geographic studies use a straightforward algorithm to construct STPs in road networks by using two one-to-all shortest path searches. However, this straightforward algorithm can introduce considerable computational overhead, given that accessible links in an STP are generally a small portion of the whole network. To address this issue, an efficient geo-computational algorithm, called NTP-A*, is proposed. The proposed NTP-A* algorithm employs the A* and branch-and-bound techniques to discard inaccessible links during the two shortest path searches, and thereby improves the STP construction performance. Comprehensive computational experiments are carried out to demonstrate the computational advantage of the proposed algorithm. Several implementation techniques, including the label-correcting technique and the hybrid link-node labeling technique, are discussed and analyzed. Experimental results show that the proposed NTP-A* algorithm can significantly improve STP construction performance in large-scale road networks, by a factor of 100 compared with existing algorithms.
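
    As a point of reference for the straightforward construction that NTP-A* improves upon, the sketch below implements the two-search baseline in Python: one shortest-path tree from the origin anchor and one (on the reversed network) to the destination anchor, with a link accessible whenever traversing it still fits within the time budget. The toy network and travel times are invented for illustration, and the A* and branch-and-bound pruning proposed in the paper are not reproduced here.

    import heapq

    def dijkstra(graph, source):
        """Shortest travel times from source to all nodes; graph: node -> [(neighbour, time)]."""
        dist = {source: 0.0}
        heap = [(0.0, source)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float("inf")):
                continue
            for v, w in graph.get(u, []):
                if d + w < dist.get(v, float("inf")):
                    dist[v] = d + w
                    heapq.heappush(heap, (d + w, v))
        return dist

    def stp_accessible_links(graph, origin, destination, budget):
        """Links (u, v) usable on some origin -> destination path within the time budget."""
        reverse = {}
        for u, edges in graph.items():
            for v, w in edges:
                reverse.setdefault(v, []).append((u, w))
        t_from_o = dijkstra(graph, origin)         # forward one-to-all search
        t_to_d = dijkstra(reverse, destination)    # backward search on the reversed network
        inf = float("inf")
        return {(u, v) for u, edges in graph.items() for v, w in edges
                if t_from_o.get(u, inf) + w + t_to_d.get(v, inf) <= budget}

    # Toy network with travel times in minutes.
    toy = {"A": [("B", 5), ("C", 12)], "B": [("C", 4), ("D", 10)], "C": [("D", 3)], "D": []}
    print(stp_accessible_links(toy, "A", "D", budget=15))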

  6. IMPROVING TACONITE PROCESSING PLANT EFFICIENCY BY COMPUTER SIMULATION, Final Report

    Energy Technology Data Exchange (ETDEWEB)

    William M. Bond; Salih Ersayin

    2007-03-30

    This project involved industrial scale testing of a mineral processing simulator to improve the efficiency of a taconite processing plant, namely the Minorca mine. The Concentrator Modeling Center at the Coleraine Minerals Research Laboratory, University of Minnesota Duluth, enhanced the capabilities of available software, Usim Pac, by developing the mathematical models needed for accurate simulation of taconite plants. This project provided funding for this technology to prove itself in the industrial environment. As the first step, data representing existing plant conditions were collected by sampling and sample analysis. Data were then balanced and provided a basis for assessing the efficiency of individual devices and the plant, and also for performing simulations aimed at improving plant efficiency. Performance evaluation served as a guide in developing alternative process strategies for more efficient production. A large number of computer simulations were then performed to quantify the benefits and effects of implementing these alternative schemes. Modification of the makeup ball size was selected as the most feasible option for the target performance improvement. This was combined with replacement of the existing hydrocyclones with more efficient ones. After plant implementation of these modifications, plant sampling surveys were carried out to validate the findings of the simulation-based study. Plant data showed very good agreement with the simulated data, confirming the results of the simulation. After the implementation of the modifications in the plant, several upstream bottlenecks became visible. Despite these bottlenecks limiting full capacity, a concentrator energy improvement of 7% was obtained. Further improvements in energy efficiency are expected in the near future. The success of this project demonstrated the feasibility of a simulation-based approach. Currently, the Center provides simulation-based service to all the iron ore mining companies operating in northern

  7. Cross-scale Efficient Tensor Contractions for Coupled Cluster Computations Through Multiple Programming Model Backends

    Energy Technology Data Exchange (ETDEWEB)

    Ibrahim, Khaled Z. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division; Epifanovsky, Evgeny [Q-Chem, Inc., Pleasanton, CA (United States); Williams, Samuel W. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Computational Research Division; Krylov, Anna I. [Univ. of Southern California, Los Angeles, CA (United States). Dept. of Chemistry

    2016-07-26

    Coupled-cluster methods provide highly accurate models of molecular structure by explicit numerical calculation of tensors representing the correlation between electrons. These calculations are dominated by a sequence of tensor contractions, motivating the development of numerical libraries for such operations. While based on matrix-matrix multiplication, these libraries are specialized to exploit symmetries in the molecular structure and in electronic interactions, and thus reduce the size of the tensor representation and the complexity of contractions. The resulting algorithms are irregular and their parallelization has been previously achieved via the use of dynamic scheduling or specialized data decompositions. We introduce our efforts to extend the Libtensor framework to work in the distributed memory environment in a scalable and energy efficient manner. We achieve up to a 240× speedup compared with the best optimized shared memory implementation. We attain scalability to hundreds of thousands of compute cores on three distributed-memory architectures (Cray XC30 and XC40, BlueGene/Q), and on a heterogeneous GPU-CPU system (Cray XK7). As the bottlenecks shift from compute-bound DGEMMs to communication-bound collectives as the size of the molecular system scales, we adopt two radically different parallelization approaches for handling load imbalance. Nevertheless, we preserve a unified interface to both programming models to maintain the productivity of computational quantum chemists.
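
    As the abstract notes, such contractions ultimately map onto matrix-matrix multiplication: the tensor indices are grouped into row, column and summation super-indices so that a single GEMM call does the work. The NumPy snippet below shows that mapping generically, without the symmetry blocking or distributed scheduling that Libtensor adds; the index names and dimensions are arbitrary.

    import numpy as np

    # Contract T[i,j,a,b] with V[a,b,c,d] over the shared indices (a, b):
    #   R[i,j,c,d] = sum_{a,b} T[i,j,a,b] * V[a,b,c,d]
    i, j, a, b, c, d = 4, 4, 6, 6, 5, 5
    T = np.random.rand(i, j, a, b)
    V = np.random.rand(a, b, c, d)

    # Fold (i,j) and (c,d) into matrix rows/columns and (a,b) into the summed dimension,
    # so the whole contraction becomes one dense matrix-matrix multiply (a DGEMM call).
    R = (T.reshape(i * j, a * b) @ V.reshape(a * b, c * d)).reshape(i, j, c, d)

    # Cross-check against an explicit einsum evaluation of the same contraction.
    assert np.allclose(R, np.einsum("ijab,abcd->ijcd", T, V))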

  8. High Efficiency Power Converter for Low Voltage High Power Applications

    DEFF Research Database (Denmark)

    Nymand, Morten

    The topic of this thesis is the design of high efficiency power electronic dc-to-dc converters for high-power, low-input-voltage to high-output-voltage applications. These converters are increasingly required for emerging sustainable energy systems such as fuel cell, battery or photovoltaic based...

  9. High Current Planar Transformer for Very High Efficiency Isolated Boost DC-DC Converters

    DEFF Research Database (Denmark)

    Pittini, Riccardo; Zhang, Zhe; Andersen, Michael A. E.

    2014-01-01

    This paper presents a design and optimization of a high current planar transformer for very high efficiency dc-dc isolated boost converters. The analysis considers different winding arrangements, including very high copper thickness windings, and is focused on the winding ac-resistance and the transformer leakage inductance. Design and optimization procedures are validated based on an experimental prototype of a 6 kW dc-dc isolated full bridge boost converter developed on fully planar magnetics. The prototype is rated at 30-80 V, 0-80 A on the low voltage side and 700-800 V on the high voltage side, with a peak efficiency of 97.8% at 80 V, 3.5 kW. Results highlight that thick copper windings can provide good performance at low switching frequencies due to the high transformer filling factor. PCB windings can also provide very high efficiency if stacked in parallel, utilizing the transformer winding window...

  10. Compressor and Turbine Multidisciplinary Design for Highly Efficient Micro-gas Turbine

    Science.gov (United States)

    Barsi, Dario; Perrone, Andrea; Qu, Yonglei; Ratto, Luca; Ricci, Gianluca; Sergeev, Vitaliy; Zunino, Pietro

    2018-06-01

    Multidisciplinary design optimization (MDO) is widely employed to enhance the efficiency of turbomachinery components. The aim of this work is to describe a complete tool for the aero-mechanical design of a radial inflow turbine and a centrifugal compressor. The high rotational speed of such machines and the high exhaust gas temperature (only for the turbine) expose blades to very high stresses, and the aerodynamic design therefore has to be coupled with the mechanical design through an integrated procedure. The described approach employs a fully 3D Reynolds Averaged Navier-Stokes (RANS) solver for the aerodynamics and an open source Finite Element Analysis (FEA) solver for the mechanical integrity assessment. Due to the high computational cost of these two solvers, a meta model, such as an artificial neural network (ANN), is used to speed up the optimization design process. The interaction between the two codes, the mesh generation and the post-processing of the results are achieved via in-house developed scripting modules. The results obtained are presented and discussed in detail.
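
    The role the metamodel plays in such a loop can be shown in miniature: an inexpensive artificial neural network is trained on a limited number of expensive solver evaluations, and the optimizer then queries the surrogate instead of the solver. The sketch below uses scikit-learn and SciPy with a cheap analytic function standing in for the coupled RANS/FEA objective; it illustrates surrogate-assisted optimization in general, not the authors' in-house tool chain.

    import numpy as np
    from scipy.optimize import minimize
    from sklearn.neural_network import MLPRegressor

    def expensive_objective(x):
        """Analytic stand-in for a costly aero-mechanical evaluation."""
        return (x[0] - 0.3) ** 2 + (x[1] + 0.5) ** 2 + 0.1 * np.sin(5 * x[0])

    rng = np.random.default_rng(0)
    X_train = rng.uniform(-1, 1, size=(80, 2))      # design-of-experiments samples
    y_train = np.array([expensive_objective(x) for x in X_train])

    # Train the ANN metamodel on the sampled designs.
    surrogate = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000, random_state=0)
    surrogate.fit(X_train, y_train)

    # Optimize over the cheap surrogate instead of the expensive solver.
    result = minimize(lambda x: surrogate.predict(x.reshape(1, -1))[0],
                      x0=np.zeros(2), bounds=[(-1, 1), (-1, 1)])
    print("surrogate optimum:", result.x, "true objective there:", expensive_objective(result.x))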

  11. Computer model of copper resistivity will improve the efficiency of field-compression devices

    International Nuclear Information System (INIS)

    Burgess, T.J.

    1977-01-01

    By detonating a ring of high explosive around an existing magnetic field, we can, under certain conditions, compress the field and multiply its strength tremendously. In this way, we can duplicate for a fraction of a second the extreme pressures that normally exist only in the interior of stars and planets. Under such pressures, materials may exhibit behavior that will confirm or alter current notions about the fundamental structure of matter and the ongoing processes in planetary interiors. However, we cannot design an efficient field-compression device unless we can calculate the electrical resistivity of certain basic metal components, which interact with the field. To aid in the design effort, we have developed a computer code that calculates the resistivity of copper and other metals over the wide range of temperatures and pressures found in a field-compression device

  12. High Efficiency Colloidal Quantum Dot Phosphors

    Energy Technology Data Exchange (ETDEWEB)

    Kahen, Keith

    2013-12-31

    The project showed that non-Cd containing, InP-based nanocrystals (semiconductor materials with dimensions of ~6 nm) have high potential for enabling next-generation, nanocrystal-based, on-chip phosphors for solid state lighting. Typical nanocrystals fall short of the requirements for on-chip phosphors due to their loss of quantum efficiency under the operating conditions of LEDs, such as high temperature (up to 150 °C) and high optical flux (up to 200 W/cm²). The InP-based nanocrystals invented during this project maintain high quantum efficiency (>80%) in polymer-based films under these operating conditions for emission wavelengths ranging from ~530 to 620 nm. These nanocrystals also show other desirable attributes, such as a lack of blinking (a common problem with nanocrystals which limits their performance) and no increase in the emission spectral width from room temperature to 150 °C (emitters with narrower spectral widths enable higher efficiency LEDs). Prior to these nanocrystals, no nanocrystal system (regardless of nanocrystal type) showed this collection of properties; in fact, other nanocrystal systems are typically limited to showing only one desirable trait (such as high temperature stability) while being deficient in other properties (such as high flux stability). The project showed that one can reproducibly obtain these properties by generating a novel compositional structure inside the nanomaterials; in addition, the project formulated an initial theoretical framework linking the compositional structure to the list of high performance optical properties. Over the course of the project, the synthetic methodology for producing the novel composition was evolved to enable the synthesis of these nanomaterials at a cost approximately equal to that required for forming typical conventional nanocrystals. Given the above results, the last major remaining step prior to scale up of the nanomaterials is to limit the oxidation of these materials during the tens of

  13. SmartVeh: Secure and Efficient Message Access Control and Authentication for Vehicular Cloud Computing.

    Science.gov (United States)

    Huang, Qinlong; Yang, Yixian; Shi, Yuxiang

    2018-02-24

    With the growing number of vehicles and popularity of various services in vehicular cloud computing (VCC), message exchanging among vehicles under traffic conditions and in emergency situations is one of the most pressing demands, and has attracted significant attention. However, it is an important challenge to authenticate the legitimate sources of broadcast messages and achieve fine-grained message access control. In this work, we propose SmartVeh, a secure and efficient message access control and authentication scheme in VCC. A hierarchical, attribute-based encryption technique is utilized to achieve fine-grained and flexible message sharing, which ensures that vehicles whose persistent or dynamic attributes satisfy the access policies can access the broadcast message with equipped on-board units (OBUs). Message authentication is enforced by integrating an attribute-based signature, which achieves message authentication and maintains the anonymity of the vehicles. In order to reduce the computations of the OBUs in the vehicles, we outsource the heavy computations of encryption, decryption and signing to a cloud server and road-side units. The theoretical analysis and simulation results reveal that our secure and efficient scheme is suitable for VCC.

  14. A Practical Evaluation of a High-Security Energy-Efficient Gateway for IoT Fog Computing Applications.

    Science.gov (United States)

    Suárez-Albela, Manuel; Fernández-Caramés, Tiago M; Fraga-Lamas, Paula; Castedo, Luis

    2017-08-29

    Fog computing extends cloud computing to the edge of a network enabling new Internet of Things (IoT) applications and services, which may involve critical data that require privacy and security. In an IoT fog computing system, three elements can be distinguished: IoT nodes that collect data, the cloud, and interconnected IoT gateways that exchange messages with the IoT nodes and with the cloud. This article focuses on securing IoT gateways, which are assumed to be constrained in terms of computational resources, but that are able to offload some processing from the cloud and to reduce the latency in the responses to the IoT nodes. However, it is usually taken for granted that IoT gateways have direct access to the electrical grid, which is not always the case: in mission-critical applications like natural disaster relief or environmental monitoring, it is common to deploy IoT nodes and gateways in large areas where electricity comes from solar or wind energy that charge the batteries that power every device. In this article, how to secure IoT gateway communications while minimizing power consumption is analyzed. The throughput and power consumption of Rivest-Shamir-Adleman (RSA) and Elliptic Curve Cryptography (ECC) are considered, since they are really popular, but have not been thoroughly analyzed when applied to IoT scenarios. Moreover, the most widespread Transport Layer Security (TLS) cipher suites use RSA as the main public key-exchange algorithm, but the key sizes needed are not practical for most IoT devices and cannot be scaled to high security levels. In contrast, ECC represents a much lighter and scalable alternative. Thus, RSA and ECC are compared for equivalent security levels, and power consumption and data throughput are measured using a testbed of IoT gateways. The measurements obtained indicate that, in the specific fog computing scenario proposed, ECC is clearly a much better alternative than RSA, obtaining energy consumption reductions of up to
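
    A rough flavour of the RSA-versus-ECC comparison can be reproduced on an ordinary machine with the Python cryptography package: sign the same payload with an RSA-3072 key and with a NIST P-256 key (roughly comparable security levels) and compare the time per operation. This is only an illustrative micro-benchmark on general-purpose hardware, not the power-instrumented gateway testbed used in the article, and the payload and iteration count are arbitrary.

    import time
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import rsa, ec, padding

    message = b"sensor reading from an IoT node"   # hypothetical payload
    N = 50                                         # signatures per key type

    rsa_key = rsa.generate_private_key(public_exponent=65537, key_size=3072)
    ec_key = ec.generate_private_key(ec.SECP256R1())

    start = time.perf_counter()
    for _ in range(N):
        rsa_key.sign(message, padding.PKCS1v15(), hashes.SHA256())
    rsa_ms = 1e3 * (time.perf_counter() - start) / N

    start = time.perf_counter()
    for _ in range(N):
        ec_key.sign(message, ec.ECDSA(hashes.SHA256()))
    ec_ms = 1e3 * (time.perf_counter() - start) / N

    print(f"RSA-3072 sign: {rsa_ms:.2f} ms/op   P-256 ECDSA sign: {ec_ms:.2f} ms/op")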

  15. A Practical Evaluation of a High-Security Energy-Efficient Gateway for IoT Fog Computing Applications

    Science.gov (United States)

    Castedo, Luis

    2017-01-01

    Fog computing extends cloud computing to the edge of a network enabling new Internet of Things (IoT) applications and services, which may involve critical data that require privacy and security. In an IoT fog computing system, three elements can be distinguished: IoT nodes that collect data, the cloud, and interconnected IoT gateways that exchange messages with the IoT nodes and with the cloud. This article focuses on securing IoT gateways, which are assumed to be constrained in terms of computational resources, but that are able to offload some processing from the cloud and to reduce the latency in the responses to the IoT nodes. However, it is usually taken for granted that IoT gateways have direct access to the electrical grid, which is not always the case: in mission-critical applications like natural disaster relief or environmental monitoring, it is common to deploy IoT nodes and gateways in large areas where electricity comes from solar or wind energy that charge the batteries that power every device. In this article, how to secure IoT gateway communications while minimizing power consumption is analyzed. The throughput and power consumption of Rivest–Shamir–Adleman (RSA) and Elliptic Curve Cryptography (ECC) are considered, since they are really popular, but have not been thoroughly analyzed when applied to IoT scenarios. Moreover, the most widespread Transport Layer Security (TLS) cipher suites use RSA as the main public key-exchange algorithm, but the key sizes needed are not practical for most IoT devices and cannot be scaled to high security levels. In contrast, ECC represents a much lighter and scalable alternative. Thus, RSA and ECC are compared for equivalent security levels, and power consumption and data throughput are measured using a testbed of IoT gateways. The measurements obtained indicate that, in the specific fog computing scenario proposed, ECC is clearly a much better alternative than RSA, obtaining energy consumption reductions of up

  16. High Performance Computing in Science and Engineering '14

    CERN Document Server

    Kröner, Dietmar; Resch, Michael

    2015-01-01

    This book presents the state-of-the-art in supercomputer simulation. It includes the latest findings from leading researchers using systems from the High Performance Computing Center Stuttgart (HLRS). The reports cover all fields of computational science and engineering ranging from CFD to computational physics and from chemistry to computer science with a special emphasis on industrially relevant applications. Presenting findings of one of Europe’s leading systems, this volume covers a wide variety of applications that deliver a high level of sustained performance. The book covers the main methods in high-performance computing. Its outstanding results in achieving the best performance for production codes are of particular interest for both scientists and engineers. The book comes with a wealth of color illustrations and tables of results.

  17. Highly efficient photonic nanowire single-photon sources for quantum information applications

    DEFF Research Database (Denmark)

    Gregersen, Niels; Claudon, J.; Munsch, M.

    2013-01-01

    Within the emerging field of optical quantum information processing, the current challenge is to construct the basic building blocks for the quantum computing and communication systems. A key component is the single-photon source (SPS) capable of emitting single photons on demand. Ideally, the SPS must feature near-unity efficiency, where the efficiency is defined as the number of detected photons per trigger, the probability g(2)(τ=0) of multi-photon emission events should be 0, and the emitted photons are required to be indistinguishable. An optically or electrically triggered quantum light ... to a collection efficiency of only 1-2 %, and efficient light extraction thus poses a major challenge in SPS engineering. Initial efforts to improve the efficiency have exploited cavity quantum electrodynamics (cQED) to efficiently couple the emitted photons to the optical cavity mode. An alternative approach...

  18. A New Method of Histogram Computation for Efficient Implementation of the HOG Algorithm

    Directory of Open Access Journals (Sweden)

    Mariana-Eugenia Ilas

    2018-03-01

    In this paper we introduce a new histogram computation method to be used within the histogram of oriented gradients (HOG) algorithm. The new method replaces the arctangent with a slope computation, and the classical interpolation-based magnitude allocation with a simpler algorithm. The new method allows a more efficient implementation of HOG in general, and particularly in field-programmable gate arrays (FPGAs), by considerably reducing the area (thus increasing the level of parallelism), while maintaining classification accuracy very close to that of the original algorithm. Thus, the new method is attractive for many applications, including car detection and classification.
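
    The appeal of dropping the arctangent is easiest to see in code: the orientation bin of a gradient vector can be found with only multiplications and sign tests against precomputed bin-boundary directions. The NumPy sketch below shows one common arctan-free binning scheme for unsigned 0-180 degree HOG bins; it conveys the general idea but is not necessarily the exact slope-comparison and magnitude-allocation rule proposed in the paper.

    import numpy as np

    def hog_orientation_bins(gx, gy, n_bins=9):
        """Unsigned-orientation bin index in [0, n_bins) for each pixel, with no arctan.

        A gradient falls in bin k when its angle (in [0, 180) degrees) is at least the
        k-th bin boundary; once every gradient is mapped to gy >= 0, the test
        'angle >= boundary' reduces to the sign of gy*cos(boundary) - gx*sin(boundary).
        """
        flip = gy < 0                       # map into the upper half-plane: angle in [0, 180)
        gx = np.where(flip, -gx, gx).astype(float)
        gy = np.where(flip, -gy, gy).astype(float)
        bounds = np.deg2rad(np.arange(n_bins) * (180.0 / n_bins))
        ge = (gy[..., None] * np.cos(bounds) - gx[..., None] * np.sin(bounds)) >= 0
        return ge.sum(axis=-1) - 1          # zero-gradient pixels get zero magnitude anyway

    # A gradient at ~30 degrees lands in bin 1 (20-40 degrees) of a 9-bin histogram.
    print(hog_orientation_bins(np.array([np.sqrt(3.0)]), np.array([1.0])))   # -> [1]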

  19. Bringing together high energy physicist and computer scientist

    International Nuclear Information System (INIS)

    Bock, R.K.

    1989-01-01

    The Oxford Conference on Computing in High Energy Physics approached the physics and computing issues with the question, "Can computer science help?" always in mind. This summary is a personal recollection of what I considered to be the highlights of the conference: the parts which contributed to my own learning experience. It can be used as a general introduction to the following papers, or as a brief overview of the current state of computer science within high energy physics. (orig.)

  20. Highly efficient fully transparent inverted OLEDs

    Science.gov (United States)

    Meyer, J.; Winkler, T.; Hamwi, S.; Schmale, S.; Kröger, M.; Görrn, P.; Johannes, H.-H.; Riedl, T.; Lang, E.; Becker, D.; Dobbertin, T.; Kowalsky, W.

    2007-09-01

    One of the unique selling propositions of OLEDs is their potential to realize devices that are highly transparent over the visible spectrum. This is because organic semiconductors provide a large Stokes shift and low intrinsic absorption losses. Hence, new areas of application for displays and ambient lighting become accessible, for instance the integration of OLEDs into the windshield or the ceiling of automobiles. The main challenge in the realization of fully transparent devices is the deposition of the top electrode. ITO is commonly used as the transparent bottom anode in a conventional OLED. To obtain uniform light emission over the entire viewing angle and a low series resistance, a TCO such as ITO is desirable as the top contact as well. However, sputter deposition of ITO on top of organic layers causes damage induced by high energy particles and UV radiation. We have found an efficient process to protect the organic layers against the rf magnetron sputter deposition of ITO for an inverted OLED (IOLED). The inverted structure allows the integration of OLEDs with the more powerful n-channel transistors used in active matrix backplanes. Employing the green electrophosphorescent material Ir(ppy)3 led to IOLEDs with a current efficiency of 50 cd/A and a power efficiency of 24 lm/W at 100 cd/m². The average transmittance exceeds 80% in the visible region. The onset voltage for light emission is lower than 3 V. In addition, by vertical stacking we achieved a very high current efficiency of more than 70 cd/A for transparent IOLEDs.

  1. High efficiency lithium-thionyl chloride cell

    Science.gov (United States)

    Doddapaneni, N.

    1982-08-01

    The polarization characteristics and the specific cathode capacity of Teflon bonded carbon electrodes in the Li/SOCl2 system have been evaluated. Doping with electrocatalysts such as cobalt and iron phthalocyanine complexes improved both cell voltage and cell rate capability. High efficiency Li/SOCl2 cells were thus achieved with catalyzed cathodes. The electrochemical reduction of SOCl2 appears to be modified at the catalyzed cathode. For example, the reduction of SOCl2 at the FePc-catalyzed cathode involves 2.5 e-/mole of SOCl2. Furthermore, the reduction mechanism is simplified and unwanted chemical species are eliminated by the catalyst. Thus a potentially safer, high efficiency Li/SOCl2 cell can be anticipated.

  2. A flexible, extendable, modular and computationally efficient approach to scattering-integral-based seismic full waveform inversion

    Science.gov (United States)

    Schumacher, F.; Friederich, W.; Lamara, S.

    2016-02-01

    We present a new conceptual approach to scattering-integral-based seismic full waveform inversion (FWI) that allows a flexible, extendable, modular and both computationally and storage-efficient numerical implementation. To achieve maximum modularity and extendability, interactions between the three fundamental steps carried out sequentially in each iteration of the inversion procedure, namely, solving the forward problem, computing waveform sensitivity kernels and deriving a model update, are kept at an absolute minimum and are implemented by dedicated interfaces. To realize storage efficiency and maximum flexibility, the spatial discretization of the inverted earth model is allowed to be completely independent of the spatial discretization employed by the forward solver. For computational efficiency reasons, the inversion is done in the frequency domain. The benefits of our approach are as follows: (1) Each of the three stages of an iteration is realized by a stand-alone software program. In this way, we avoid the monolithic, inflexible and hard-to-modify codes that have often been written for solving inverse problems. (2) The solution of the forward problem, required for kernel computation, can be obtained by any wave propagation modelling code, giving users maximum flexibility in choosing the forward modelling method. Both time-domain and frequency-domain approaches can be used. (3) Forward solvers typically demand spatial discretizations that are significantly denser than actually desired for the inverted model. Exploiting this fact by pre-integrating the kernels allows a dramatic reduction of disk space and makes kernel storage feasible. No assumptions are made on the spatial discretization scheme employed by the forward solver. (4) In addition, working in the frequency domain effectively reduces the amount of data, the number of kernels to be computed and the number of equations to be solved. (5) Updating the model by solving a large equation system can be
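
    Benefit (3) above, pre-integrating the fine-grid sensitivity kernels onto the coarser inversion grid, amounts to applying one aggregation step to each kernel and storing only the result. The short NumPy sketch below illustrates the storage reduction with invented grid sizes, a random fine-to-coarse cell mapping and random kernels; it is a schematic of that one step only, not the authors' implementation.

    import numpy as np

    n_fine, n_coarse, n_data = 200_000, 5_000, 300   # illustrative sizes only
    rng = np.random.default_rng(1)

    # Hypothetical mapping: each fine forward-solver cell belongs to one coarse inversion
    # cell and carries a quadrature weight (here, the fine cell volume).
    cell_of = rng.integers(0, n_coarse, size=n_fine)
    volume = rng.uniform(0.5, 1.5, size=n_fine)

    def pre_integrate(kernel_fine):
        """Integrate a fine-grid kernel over each coarse inversion cell."""
        return np.bincount(cell_of, weights=kernel_fine * volume, minlength=n_coarse)

    # One pre-integrated kernel per datum: n_data x n_coarse values are stored
    # instead of n_data x n_fine, a 40-fold reduction for these sizes.
    kernels_coarse = np.vstack([pre_integrate(rng.standard_normal(n_fine))
                                for _ in range(n_data)])
    print(kernels_coarse.shape, "stored instead of", (n_data, n_fine))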

  3. Computationally efficient real-time interpolation algorithm for non-uniform sampled biosignals.

    Science.gov (United States)

    Guven, Onur; Eftekhar, Amir; Kindt, Wilko; Constandinou, Timothy G

    2016-06-01

    This Letter presents a novel, computationally efficient interpolation method that has been optimised for use in electrocardiogram baseline drift removal. In the authors' previous Letter, three isoelectric baseline points per heartbeat are detected and here utilised as interpolation points. As an extension of linear interpolation, their algorithm segments the interpolation interval and utilises different piecewise linear equations. Thus, the algorithm approximates the baseline curvature with piecewise linear segments and remains computationally efficient while interpolating non-uniform samples. The proposed algorithm is tested using sinusoids with different fundamental frequencies from 0.05 to 0.7 Hz and also validated with real baseline wander data acquired from the Massachusetts Institute of Technology and Boston's Beth Israel Hospital (MIT-BIH) Noise Stress Database. The synthetic data results show a root mean square (RMS) error of 0.9 μV (mean), 0.63 μV (median) and 0.6 μV (standard deviation) per heartbeat on a 1 mVp-p 0.1 Hz sinusoid. On real data, they obtain an RMS error of 10.9 μV (mean), 8.5 μV (median) and 9.0 μV (standard deviation) per heartbeat. Cubic spline interpolation and linear interpolation, on the other hand, show 10.7 μV, 11.6 μV (mean), 7.8 μV, 8.9 μV (median) and 9.8 μV, 9.3 μV (standard deviation) per heartbeat, respectively.
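
    The basic interpolate-and-subtract step is compact enough to show directly: a piecewise linear baseline is drawn through the detected isoelectric points and subtracted from the signal. The NumPy sketch below implements only the plain piecewise linear version on a synthetic sinusoidal example with arbitrary parameters; the Letter's refinement, which further segments each interval with different linear equations, is not reproduced.

    import numpy as np

    fs = 500.0                                   # sampling rate, Hz (assumed)
    t = np.arange(0, 10, 1 / fs)
    ecg = 0.1 * np.sin(2 * np.pi * 1.2 * t)      # crude stand-in for the cardiac component, mV
    drift = 0.5 * np.sin(2 * np.pi * 0.1 * t)    # 0.1 Hz baseline wander, mV
    signal = ecg + drift

    # Pretend the isoelectric-point detector returned these sample indices (chosen here
    # where the cardiac component crosses zero, so they carry essentially only drift).
    iso_idx = (np.arange(0, 10, 1 / 2.4) * fs).astype(int)

    baseline_est = np.interp(t, t[iso_idx], signal[iso_idx])   # piecewise linear baseline
    corrected = signal - baseline_est

    rms_uV = 1e3 * np.sqrt(np.mean((corrected - ecg) ** 2))
    print(f"residual RMS error after baseline removal: {rms_uV:.1f} uV")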

  4. Debugging a high performance computing program

    Science.gov (United States)

    Gooding, Thomas M.

    2013-08-20

    Methods, apparatus, and computer program products are disclosed for debugging a high performance computing program by gathering lists of addresses of calling instructions for a plurality of threads of execution of the program, assigning the threads to groups in dependence upon the addresses, and displaying the groups to identify defective threads.

  5. A C++11 implementation of arbitrary-rank tensors for high-performance computing

    Science.gov (United States)

    Aragón, Alejandro M.

    2014-11-01

    This article discusses an efficient implementation of tensors of arbitrary rank using some of the idioms introduced by the recently published C++ ISO Standard (C++11). With the aim of providing a basic building block for high-performance computing, a single Array class template is carefully crafted, from which vectors, matrices, and even higher-order tensors can be created. An expression template facility is also built around the array class template to provide convenient mathematical syntax. As a result, by using templates, an extra high-level layer is added to the C++ language when dealing with algebraic objects and their operations, without compromising performance. The implementation is tested running on both CPU and GPU.

  6. Neural chips, neural computers and application in high and superhigh energy physics experiments

    International Nuclear Information System (INIS)

    Nikityuk, N.M.; )

    2001-01-01

    The architectural peculiarities and characteristics of a series of neural chips and neural computers used in scientific instruments are considered. Trends in their development and use in high energy and superhigh energy physics experiments are described. Comparative data are given which characterize the efficient use of neural chips for useful event selection, classification of elementary particles, reconstruction of charged particle tracks and searches for the hypothesized Higgs particle. The characteristics of domestically produced neural chips and neural accelerator boards are considered [ru]

  7. Parallel Computing:. Some Activities in High Energy Physics

    Science.gov (United States)

    Willers, Ian

    This paper examines some activities in High Energy Physics that utilise parallel computing. The topic includes all computing from the proposed SIMD front end detectors, the farming applications, high-powered RISC processors and the large machines in the computer centers. We start by looking at the motivation behind using parallelism for general purpose computing. The developments around farming are then described from its simplest form to the more complex system in Fermilab. Finally, there is a list of some developments that are happening close to the experiments.

  8. Efficient Use of Preisach Hysteresis Model in Computer Aided Design

    Directory of Open Access Journals (Sweden)

    IONITA, V.

    2013-05-01

    The paper presents a practical, detailed analysis regarding the use of the classical Preisach hysteresis model, covering all the steps, from measuring the necessary data for the model identification to the implementation in a software code for Computer Aided Design (CAD) in Electrical Engineering. An efficient numerical method is proposed and the hysteresis modeling accuracy is tested on magnetic recording materials. The procedure includes the correction of the experimental data, which are used for the hysteresis model identification, taking into account the demagnetizing effect for the sample that is measured in an open-circuit device (a vibrating sample magnetometer).
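
    For readers who want to experiment with the classical model itself, the scalar Preisach construction is small enough to code directly: a triangular grid of relay hysterons with switching thresholds alpha >= beta is driven by the input history, and the output is a weighted sum of relay states. The Python sketch below assumes a uniform Preisach density and arbitrary thresholds; it shows the textbook model only, not the identification procedure, demagnetizing correction or CAD integration discussed in the paper.

    import numpy as np

    class ScalarPreisach:
        """Classical scalar Preisach hysteresis model on a discrete hysteron grid."""

        def __init__(self, n=60, h_sat=1.0):
            thresholds = np.linspace(-h_sat, h_sat, n)
            self.alpha, self.beta = np.meshgrid(thresholds, thresholds, indexing="ij")
            self.mask = self.alpha >= self.beta          # only hysterons with alpha >= beta
            self.weight = self.mask / self.mask.sum()    # uniform Preisach density (assumed)
            self.state = -np.ones_like(self.alpha)       # start from negative saturation

        def apply(self, h):
            """Update every relay for one input value h and return the model output."""
            self.state = np.where(h >= self.alpha, 1.0, self.state)   # switch up
            self.state = np.where(h <= self.beta, -1.0, self.state)   # switch down
            return float(np.sum(self.weight * self.state))

    model = ScalarPreisach()
    h_path = np.concatenate([np.linspace(-1, 1, 100), np.linspace(1, 0, 50)])
    m_path = [model.apply(h) for h in h_path]   # ascend to saturation, then return to h = 0
    print(f"remanence at h = 0 after positive saturation: {m_path[-1]:.3f}")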

  9. Agglomeration Economies and the High-Tech Computer

    OpenAIRE

    Wallace, Nancy E.; Walls, Donald

    2004-01-01

    This paper considers the effects of agglomeration on the production decisions of firms in the high-tech computer cluster. We build upon an alternative definition of the high-tech computer cluster developed by Bardhan et al. (2003) and we exploit a new data source, the National Establishment Time-Series (NETS) Database, to analyze the spatial distribution of firms in this industry. An essential contribution of this research is the recognition that high-tech firms are heterogeneous collections ...

  10. Measure Guideline. High Efficiency Natural Gas Furnaces

    Energy Technology Data Exchange (ETDEWEB)

    Brand, L. [Partnership for Advanced Residential Retrofit (PARR), Des Plaines, IL (United States); Rose, W. [Partnership for Advanced Residential Retrofit (PARR), Des Plaines, IL (United States)

    2012-10-01

    This measure guideline covers installation of high-efficiency gas furnaces, including: when to install a high-efficiency gas furnace as a retrofit measure; how to identify and address risks; and the steps to be used in the selection and installation process. The guideline is written for Building America practitioners and HVAC contractors and installers. It includes a compilation of information provided by manufacturers, researchers, and the Department of Energy as well as recent research results from the Partnership for Advanced Residential Retrofit (PARR) Building America team.

  11. Highly Efficient Estimators of Multivariate Location with High Breakdown Point

    NARCIS (Netherlands)

    Lopuhaa, H.P.

    1991-01-01

    We propose an affine equivariant estimator of multivariate location that combines a high breakdown point and a bounded influence function with high asymptotic efficiency. This proposal is basically a location $M$-estimator based on the observations obtained after scaling with an affine equivariant

  12. Energy efficiency of error correction on wireless systems

    NARCIS (Netherlands)

    Havinga, Paul J.M.

    1999-01-01

    Since high error rates are inevitable to the wireless environment, energy-efficient error-control is an important issue for mobile computing systems. We have studied the energy efficiency of two different error correction mechanisms and have measured the efficiency of an implementation in software.

  13. Software Systems for High-performance Quantum Computing

    Energy Technology Data Exchange (ETDEWEB)

    Humble, Travis S [ORNL; Britt, Keith A [ORNL

    2016-01-01

    Quantum computing promises new opportunities for solving hard computational problems, but harnessing this novelty requires breakthrough concepts in the design, operation, and application of computing systems. We define some of the challenges facing the development of quantum computing systems as well as software-based approaches that can be used to overcome these challenges. Following a brief overview of the state of the art, we present models for quantum programming and execution, the development of architectures for hybrid high-performance computing systems, and the realization of software stacks for quantum networking. This leads to a discussion of the role that conventional computing plays in the quantum paradigm and how some of the current challenges for exascale computing overlap with those facing quantum computing.

  14. Prediction and design of efficient exciplex emitters for high-efficiency, thermally activated delayed-fluorescence organic light-emitting diodes.

    Science.gov (United States)

    Liu, Xiao-Ke; Chen, Zhan; Zheng, Cai-Jun; Liu, Chuan-Lin; Lee, Chun-Sing; Li, Fan; Ou, Xue-Mei; Zhang, Xiao-Hong

    2015-04-08

    High-efficiency, thermally activated delayed-fluorescence organic light-emitting diodes based on exciplex emitters are demonstrated. The best device, based on a TAPC:DPTPCz emitter, shows a high external quantum efficiency of 15.4%. Strategies for predicting and designing efficient exciplex emitters are also provided. This approach allows the prediction and design of efficient exciplex emitters for achieving high-efficiency organic light-emitting diodes, for future use in displays and lighting applications. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  15. Lightweight Provenance Service for High-Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Dai, Dong; Chen, Yong; Carns, Philip; Jenkins, John; Ross, Robert

    2017-09-09

    Provenance describes detailed information about the history of a piece of data, containing the relationships among elements such as users, processes, jobs, and workflows that contribute to the existence of data. Provenance is key to supporting many data management functionalities that are increasingly important in operations such as identifying data sources, parameters, or assumptions behind a given result; auditing data usage; or understanding details about how inputs are transformed into outputs. Despite its importance, however, provenance support is largely underdeveloped in highly parallel architectures and systems. One major challenge is the demanding requirements of providing provenance service in situ. The need to remain lightweight and to be always on often conflicts with the need to be transparent and offer an accurate catalog of details regarding the applications and systems. To tackle this challenge, we introduce a lightweight provenance service, called LPS, for high-performance computing (HPC) systems. LPS leverages a kernel instrument mechanism to achieve transparency and introduces representative execution and flexible granularity to capture comprehensive provenance with controllable overhead. Extensive evaluations and use cases have confirmed its efficiency and usability. We believe that LPS can be integrated into current and future HPC systems to support a variety of data management needs.

  16. Simple Retrofit High-Efficiency Natural Gas Water Heater Field Test

    Energy Technology Data Exchange (ETDEWEB)

    Schoenbauer, Ben [NorthernSTAR, St. Paul, MN (United States)

    2017-03-01

    High-performance water heaters are typically more time consuming and costly to install in retrofit applications, making high performance water heaters difficult to justify economically. However, recent advancements in high performance water heaters have targeted the retrofit market, simplifying installations and reducing costs. Four high efficiency natural gas water heaters designed specifically for retrofit applications were installed in single-family homes along with detailed monitoring systems to characterize their savings potential, their installed efficiencies, and their ability to meet household demands. The water heaters tested for this project were designed to improve the cost-effectiveness and increase market penetration of high efficiency water heaters in the residential retrofit market. The retrofit high efficiency water heaters achieved their goal of reducing costs, maintaining savings potential and installed efficiency of other high efficiency water heaters, and meeting the necessary capacity in order to improve cost-effectiveness. However, the improvements were not sufficient to achieve simple paybacks of less than ten years for the incremental cost compared to a minimum efficiency heater. Significant changes would be necessary to reduce the simple payback to six years or less. Annual energy savings in the range of $200 would also reduce paybacks to less than six years. These energy savings would require either significantly higher fuel costs (greater than $1.50 per therm) or very high usage (around 120 gallons per day). For current incremental costs, the water heater efficiency would need to be similar to that of a heat pump water heater to deliver a six year payback.

  17. Simple Retrofit High-Efficiency Natural Gas Water Heater Field Test

    Energy Technology Data Exchange (ETDEWEB)

    Schoenbauer, Ben [NorthernSTAR, St. Paul, MN (United States)

    2017-03-28

    High performance water heaters are typically more time consuming and costly to install in retrofit applications, making high performance water heaters difficult to justify economically. However, recent advancements in high performance water heaters have targeted the retrofit market, simplifying installations and reducing costs. Four high efficiency natural gas water heaters designed specifically for retrofit applications were installed in single-family homes along with detailed monitoring systems to characterize their savings potential, their installed efficiencies, and their ability to meet household demands. The water heaters tested for this project were designed to improve the cost-effectiveness and increase market penetration of high efficiency water heaters in the residential retrofit market. The retrofit high efficiency water heaters achieved their goal of reducing costs, maintaining savings potential and installed efficiency of other high efficiency water heaters, and meeting the necessary capacity in order to improve cost-effectiveness. However, the improvements were not sufficient to achieve simple paybacks of less than ten years for the incremental cost compared to a minimum efficiency heater. Significant changes would be necessary to reduce the simple payback to six years or less. Annual energy savings in the range of $200 would also reduce paybacks to less than six years. These energy savings would require either significantly higher fuel costs (greater than $1.50 per therm) or very high usage (around 120 gallons per day). For current incremental costs, the water heater efficiency would need to be similar to that of a heat pump water heater to deliver a six year payback.

  18. Department of Energy research in utilization of high-performance computers

    International Nuclear Information System (INIS)

    Buzbee, B.L.; Worlton, W.J.; Michael, G.; Rodrigue, G.

    1980-08-01

    Department of Energy (DOE) and other Government research laboratories depend on high-performance computer systems to accomplish their programmatic goals. As the most powerful computer systems become available, they are acquired by these laboratories so that advances can be made in their disciplines. These advances are often the result of added sophistication to numerical models, the execution of which is made possible by high-performance computer systems. However, high-performance computer systems have become increasingly complex, and consequently it has become increasingly difficult to realize their potential performance. The result is a need for research on issues related to the utilization of these systems. This report gives a brief description of high-performance computers, and then addresses the use of and future needs for high-performance computers within DOE, the growing complexity of applications within DOE, and areas of high-performance computer systems warranting research. 1 figure

  19. Applying Machine Learning and High Performance Computing to Water Quality Assessment and Prediction

    Directory of Open Access Journals (Sweden)

    Ruijian Zhang

    2017-12-01

    Water quality assessment and prediction is an increasingly important issue. Traditional approaches either take a lot of time or can only perform assessments. In this research, by applying a machine learning algorithm to a long time series of water attribute data, we can generate a decision tree that predicts the next day's water quality in an easy and efficient way. The idea is to combine the traditional approaches and computer algorithms. Using machine learning algorithms, the assessment of water quality becomes far more efficient, and by generating the decision tree, the prediction is quite accurate. The drawback of the machine learning modeling is that the execution takes quite a long time, especially when we employ a more accurate but more time-consuming algorithm for clustering. Therefore, we applied a high performance computing (HPC) system to deal with this problem. Up to now, the pilot experiments have achieved very promising preliminary results. The visualized water quality assessment and prediction obtained from this project will be published on an interactive website so that the public and environmental managers can use the information for their decision making.
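
    The decision-tree idea described above can be prototyped in a few lines with scikit-learn: historical water-attribute records labelled with a quality class are used to fit a tree, which then predicts the class for a new day's attributes. The feature names, labelling rule and synthetic data below are invented for illustration and are not taken from the project.

    import numpy as np
    from sklearn.model_selection import train_test_split
    from sklearn.tree import DecisionTreeClassifier

    rng = np.random.default_rng(42)
    n = 1000
    # Hypothetical daily attributes: dissolved oxygen (mg/L), pH, turbidity (NTU), ammonia (mg/L).
    X = np.column_stack([rng.normal(7, 1.5, n), rng.normal(7.2, 0.5, n),
                         rng.gamma(2.0, 2.0, n), rng.gamma(1.5, 0.2, n)])
    # Hypothetical rule standing in for the historical quality assessments (1 = acceptable).
    y = ((X[:, 0] > 6) & (X[:, 2] < 6) & (X[:, 3] < 0.5)).astype(int)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
    tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_tr, y_tr)
    print("held-out accuracy:", tree.score(X_te, y_te))
    print("predicted class for tomorrow's attributes:", tree.predict([[6.8, 7.1, 3.2, 0.3]])[0])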

  20. High efficiency carbon nanotube thread antennas

    Science.gov (United States)

    Amram Bengio, E.; Senic, Damir; Taylor, Lauren W.; Tsentalovich, Dmitri E.; Chen, Peiyu; Holloway, Christopher L.; Babakhani, Aydin; Long, Christian J.; Novotny, David R.; Booth, James C.; Orloff, Nathan D.; Pasquali, Matteo

    2017-10-01

    Although previous research has explored the underlying theory of high-frequency behavior of carbon nanotubes (CNTs) and CNT bundles for antennas, there is a gap in the literature for direct experimental measurements of radiation efficiency. These measurements are crucial for any practical application of CNT materials in wireless communication. In this letter, we report a measurement technique to accurately characterize the radiation efficiency of λ/4 monopole antennas made from the CNT thread. We measure the highest absolute values of radiation efficiency for CNT antennas of any type, matching that of copper wire. To capture the weight savings, we propose a specific radiation efficiency metric and show that these CNT antennas exceed copper's performance by over an order of magnitude at 1 GHz and 2.4 GHz. We also report direct experimental observation that, contrary to metals, the radiation efficiency of the CNT thread improves significantly at higher frequencies. These results pave the way for practical applications of CNT thread antennas, particularly in the aerospace and wearable electronics industries where weight saving is a priority.

  1. Doping efficiency analysis of highly phosphorous doped epitaxial/amorphous silicon emitters grown by PECVD for high efficiency silicon solar cells

    Energy Technology Data Exchange (ETDEWEB)

    El-Gohary, H.G.; Sivoththaman, S. [Waterloo Univ., ON (Canada). Dept. of Electrical and Computer Engineering

    2008-08-15

    The efficient doping of hydrogenated amorphous and crystalline silicon thin films is a key factor in the fabrication of silicon solar cells. The most popular method for depositing these films is plasma enhanced chemical vapor deposition (PECVD) because it minimizes defect density and improves doping efficiency. This paper discussed the low-temperature preparation of phosphorous doped silicon emitters with different structures, ranging from epitaxial to amorphous films. Phosphine (PH3) was employed as the doping gas source with the same gas concentration for both epitaxial and amorphous silicon emitters. The paper presented an analysis of dopant activation by applying a very short rapid thermal annealing process (RTP). A spreading resistance profile (SRP) and SIMS analysis were used to measure the active dopant concentration and the total dopant concentration, respectively. The paper also provided the results of a structural analysis of both the bulk and the cross-section at the interface, using high-resolution transmission electron microscopy and Raman spectroscopy, for the epitaxial and amorphous films. It was concluded that unity doping efficiency could be achieved in epitaxial layers by applying an optimized temperature profile using a short-time rapid thermal processing technique. The high quality, one-step epitaxial layers led to layers with both high conductivity and high doping efficiency.

  2. Measure Guideline: High Efficiency Natural Gas Furnaces

    Energy Technology Data Exchange (ETDEWEB)

    Brand, L.; Rose, W.

    2012-10-01

    This Measure Guideline covers installation of high-efficiency gas furnaces. Topics covered include when to install a high-efficiency gas furnace as a retrofit measure, how to identify and address risks, and the steps to be used in the selection and installation process. The guideline is written for Building America practitioners and HVAC contractors and installers. It includes a compilation of information provided by manufacturers, researchers, and the Department of Energy as well as recent research results from the Partnership for Advanced Residential Retrofit (PARR) Building America team.

  3. Innovative-Simplified Nuclear Power Plant Efficiency Evaluation with High-Efficiency Steam Injector System

    International Nuclear Information System (INIS)

    Shoji, Goto; Shuichi, Ohmori; Michitsugu, Mori

    2006-01-01

    It is possible to establish a simplified system with reduced space and total equipment weight by using high-efficiency Steam Injectors (SI) instead of low-pressure feedwater heaters in a Nuclear Power Plant (NPP). The SI works as a heat exchanger through direct contact between feedwater from the condensers and steam extracted from the turbines, and it can attain a pressure higher than that of the supplied steam. Maintainability and reliability are also higher than those of feedwater heaters because the SI has no moving parts. This paper describes the analysis of the heat balance, plant efficiency and operation of this Innovative-Simplified NPP with high-efficiency SI. The plant efficiency and operation are compared between an 1100 MWe-class BWR system and the Innovative-Simplified BWR system with SI. The SI model is incorporated into the heat balance simulator as a simplified model. The results show that the plant efficiencies of the Innovative-Simplified BWR system are almost equal to those of the original BWR. The present research is one of the projects carried out by Tokyo Electric Power Company, Toshiba Corporation, and six universities in Japan, funded by the Institute of Applied Energy (IAE) of Japan as a national publicly funded research program. (authors)

  4. An Efficient UD-Based Algorithm for the Computation of Maximum Likelihood Sensitivity of Continuous-Discrete Systems

    DEFF Research Database (Denmark)

    Boiroux, Dimitri; Juhl, Rune; Madsen, Henrik

    2016-01-01

    This paper addresses maximum likelihood parameter estimation of continuous-time nonlinear systems with discrete-time measurements. We derive an efficient algorithm for the computation of the log-likelihood function and its gradient, which can be used in gradient-based optimization algorithms. This algorithm uses UD decomposition of symmetric matrices and the array algorithm for covariance update and gradient computation. We test our algorithm on the Lotka-Volterra equations. Compared to the maximum likelihood estimation based on finite difference gradient computation, we get a significant speedup...
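
    The UD decomposition referred to here is the standard factorization P = U D U^T of a symmetric positive-definite matrix into a unit upper-triangular U and a diagonal D, which allows covariance-type quantities to be propagated in square-root form. The Python sketch below is the textbook (Bierman-style) factorization, given for reference only; it is not the authors' implementation of the full sensitivity algorithm.

    import numpy as np

    def ud_factor(P):
        """Factor a symmetric positive-definite P as P = U @ diag(D) @ U.T,
        with U unit upper triangular (Bierman-style UD factorization)."""
        P = P.astype(float).copy()
        n = P.shape[0]
        U = np.eye(n)
        D = np.zeros(n)
        for j in range(n - 1, -1, -1):
            D[j] = P[j, j]
            for i in range(j):
                U[i, j] = P[i, j] / D[j]
                for k in range(i + 1):
                    P[i, k] -= U[i, j] * D[j] * U[k, j]
        return U, D

    # Quick numerical check on a random symmetric positive-definite matrix.
    A = np.random.rand(5, 5)
    P = A @ A.T + 5 * np.eye(5)
    U, D = ud_factor(P)
    assert np.allclose(U @ np.diag(D) @ U.T, P)
    print("UD factorization verified")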

  5. High-frequency combination coding-based steady-state visual evoked potential for brain computer interface

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, Feng; Zhang, Xin; Xie, Jun; Li, Yeping; Han, Chengcheng; Lili, Li; Wang, Jing [School of Mechanical Engineering, Xi’an Jiaotong University, Xi’an 710049 (China); Xu, Guang-Hua [School of Mechanical Engineering, Xi’an Jiaotong University, Xi’an 710049 (China); State Key Laboratory for Manufacturing Systems Engineering, Xi’an Jiaotong University, Xi’an 710054 (China)

    2015-03-10

    This study presents a new steady-state visual evoked potential (SSVEP) paradigm for brain computer interface (BCI) systems. The goal of this study is to increase the number of targets using fewer high stimulation frequencies, while diminishing subject fatigue and reducing the risk of photosensitive epileptic seizures. The new paradigm is High-Frequency Combination Coding-Based High-Frequency Steady-State Visual Evoked Potential (HFCC-SSVEP). Firstly, we studied the high-frequency (beyond 25 Hz) SSVEP response, with the stimulation paradigm presented on an LED. The SNR (Signal to Noise Ratio) of the high-frequency (beyond 40 Hz) response is very low and cannot be distinguished using traditional analysis methods. Secondly, we investigated the HFCC-SSVEP response (beyond 25 Hz) for 3 frequencies (25 Hz, 33.33 Hz, and 40 Hz); HFCC-SSVEP produces n^n coded targets from n high stimulation frequencies through frequency combination coding. Further, an improved Hilbert-Huang transform (IHHT)-based variable-frequency EEG feature extraction method and a local spectrum extreme target identification algorithm are adopted to extract the time-frequency features of the proposed HFCC-SSVEP response. Linear prediction and fixed sifting (iterating) 10 times are used to overcome the shortcomings of the end effect and the stopping criterion, and generalized zero-crossing (GZC) is used to compute the instantaneous frequency of the SSVEP response signals. The improved HHT-based feature extraction method for the proposed SSVEP paradigm increases recognition efficiency, so as to improve the ITR and to increase the stability of the BCI system. Moreover, SSVEPs evoked by high-frequency stimuli (beyond 25 Hz) minimally diminish subject fatigue and prevent safety hazards linked to photo-induced epileptic seizures, thus ensuring system efficiency and safety. This study tests three subjects in order to verify the feasibility of the proposed method.
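
    The combination-coding idea is straightforward to illustrate: each target is assigned a code word, i.e. a short sequence of stimulation frequencies; the dominant frequency is identified in each coding interval of the recorded EEG, and the recovered sequence indexes the target. The sketch below performs the per-interval detection with a plain FFT power comparison on synthetic data; the IHHT/GZC feature extraction and local-spectrum-extreme identification used in the study are more elaborate and are not reproduced, and all signal parameters are invented.

    import numpy as np
    from itertools import product

    fs, seg_len = 256, 2.0                    # sampling rate (Hz) and coding-interval length (s)
    stim_freqs = [25.0, 33.33, 40.0]          # the three high stimulation frequencies
    code_book = list(product(range(3), repeat=3))   # 3^3 = 27 targets, each a 3-slot code word

    def dominant_freq_index(segment):
        """Index of the candidate stimulation frequency with the largest FFT power."""
        spec = np.abs(np.fft.rfft(segment)) ** 2
        freqs = np.fft.rfftfreq(len(segment), 1 / fs)
        return int(np.argmax([spec[np.argmin(np.abs(freqs - f))] for f in stim_freqs]))

    # Synthesize a noisy SSVEP-like response for one target, one segment per code slot.
    target = 11
    t = np.arange(int(seg_len * fs)) / fs
    eeg = [np.sin(2 * np.pi * stim_freqs[slot] * t) + 0.8 * np.random.randn(t.size)
           for slot in code_book[target]]

    decoded = tuple(dominant_freq_index(seg) for seg in eeg)
    print("decoded target:", code_book.index(decoded), "(true target:", target, ")")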

  6. High-frequency combination coding-based steady-state visual evoked potential for brain computer interface

    International Nuclear Information System (INIS)

    Zhang, Feng; Zhang, Xin; Xie, Jun; Li, Yeping; Han, Chengcheng; Lili, Li; Wang, Jing; Xu, Guang-Hua

    2015-01-01

    This study presents a new steady-state visual evoked potential (SSVEP) paradigm for brain computer interface (BCI) systems. The goal of this study is to increase the number of targets using fewer high stimulation frequencies, while diminishing subject fatigue and reducing the risk of photosensitive epileptic seizures. The new paradigm is High-Frequency Combination Coding-Based High-Frequency Steady-State Visual Evoked Potential (HFCC-SSVEP). Firstly, we studied the high-frequency (beyond 25 Hz) SSVEP response, with the paradigm presented on an LED; the SNR (Signal to Noise Ratio) of the response beyond 40 Hz is very low and cannot be distinguished with traditional analysis methods. Secondly, we investigated the HFCC-SSVEP response (beyond 25 Hz) for 3 frequencies (25 Hz, 33.33 Hz and 40 Hz); HFCC-SSVEP produces n^n targets from n high stimulation frequencies through frequency combination coding. Further, an improved Hilbert-Huang transform (IHHT)-based variable-frequency EEG feature extraction method and a local spectrum extreme target identification algorithm are adopted to extract the time-frequency features of the proposed HFCC-SSVEP response. Linear prediction and fixed sifting (iterating 10 times) are used to overcome the shortcomings of the end effect and the stopping criterion, and generalized zero-crossing (GZC) is used to compute the instantaneous frequency of the SSVEP response signals. The improved HHT-based feature extraction method for the proposed SSVEP paradigm increases recognition efficiency, so as to improve the information transfer rate (ITR) and the stability of the BCI system. What is more, SSVEPs evoked by high-frequency stimuli (beyond 25 Hz) minimally fatigue the subject and prevent the safety hazards linked to photo-induced epileptic seizures, ensuring both system efficiency and safety. This study tests three subjects in order to verify the feasibility of the proposed method.

  7. Modeling the evolution of channel shape: Balancing computational efficiency with hydraulic fidelity

    Science.gov (United States)

    Wobus, C.W.; Kean, J.W.; Tucker, G.E.; Anderson, R. Scott

    2008-01-01

    The cross-sectional shape of a natural river channel controls the capacity of the system to carry water off a landscape, to convey sediment derived from hillslopes, and to erode its bed and banks. Numerical models that describe the response of a landscape to changes in climate or tectonics therefore require formulations that can accommodate evolution of channel cross-sectional geometry. However, fully two-dimensional (2-D) flow models are too computationally expensive to implement in large-scale landscape evolution models, while available simple empirical relationships between width and discharge do not adequately capture the dynamics of channel adjustment. We have developed a simplified 2-D numerical model of channel evolution in a cohesive, detachment-limited substrate subject to steady, unidirectional flow. Erosion is assumed to be proportional to boundary shear stress, which is calculated using an approximation of the flow field in which log-velocity profiles are assumed to apply along vectors that are perpendicular to the local channel bed. Model predictions of the velocity structure, peak boundary shear stress, and equilibrium channel shape compare well with predictions of a more sophisticated but more computationally demanding ray-isovel model. For example, the mean velocities computed by the two models are consistent to within ~3%, and the predicted peak shear stress is consistent to within ~7%. Furthermore, the shear stress distributions predicted by our model compare favorably with available laboratory measurements for prescribed channel shapes. A modification to our simplified code in which the flow includes a high-velocity core allows the model to be extended to estimate shear stress distributions in channels with large width-to-depth ratios. Our model is efficient enough to incorporate into large-scale landscape evolution codes and can be used to examine how channels adjust both cross-sectional shape and slope in response to tectonic and climatic

  8. COMPUTERS: Teraflops for Europe; EEC Working Group on High Performance Computing

    Energy Technology Data Exchange (ETDEWEB)

    Anon.

    1991-03-15

    In little more than a decade, simulation on high performance computers has become an essential tool for theoretical physics, capable of solving a vast range of crucial problems inaccessible to conventional analytic mathematics. In many ways, computer simulation has become the calculus for interacting many-body systems, a key to the study of transitions from isolated to collective behaviour.

  9. COMPUTERS: Teraflops for Europe; EEC Working Group on High Performance Computing

    International Nuclear Information System (INIS)

    Anon.

    1991-01-01

    In little more than a decade, simulation on high performance computers has become an essential tool for theoretical physics, capable of solving a vast range of crucial problems inaccessible to conventional analytic mathematics. In many ways, computer simulation has become the calculus for interacting many-body systems, a key to the study of transitions from isolated to collective behaviour

  10. Development of three-dimensional neoclassical transport simulation code with high performance Fortran on a vector-parallel computer

    International Nuclear Information System (INIS)

    Satake, Shinsuke; Okamoto, Masao; Nakajima, Noriyoshi; Takamaru, Hisanori

    2005-11-01

    A neoclassical transport simulation code (FORTEC-3D) applicable to three-dimensional configurations has been developed using High Performance Fortran (HPF). Adoption of computing techniques for parallelization and a hybrid simulation model to the δf Monte-Carlo method transport simulation, including non-local transport effects in three-dimensional configurations, makes it possible to simulate the dynamism of global, non-local transport phenomena with a self-consistent radial electric field within a reasonable computation time. In this paper, development of the transport code using HPF is reported. Optimization techniques in order to achieve both high vectorization and parallelization efficiency, adoption of a parallel random number generator, and also benchmark results, are shown. (author)

  11. High Efficiency Centrifugal Compressor for Rotorcraft Applications

    Science.gov (United States)

    Medic, Gorazd; Sharma, Om P.; Jongwook, Joo; Hardin, Larry W.; McCormick, Duane C.; Cousins, William T.; Lurie, Elizabeth A.; Shabbir, Aamir; Holley, Brian M.; Van Slooten, Paul R.

    2017-01-01

    A centrifugal compressor research effort conducted by United Technologies Research Center under NASA Research Announcement NNC08CB03C is documented. The objectives were to identify key technical barriers to advancing the aerodynamic performance of high-efficiency, high work factor, compact centrifugal compressor aft-stages for turboshaft engines; to acquire measurements needed to overcome the technical barriers and inform future designs; and to design, fabricate, and test a new research compressor in which to acquire the requisite flow field data. A new High-Efficiency Centrifugal Compressor stage -- splittered impeller, splittered diffuser, 90 degree bend, and exit guide vanes -- with aerodynamically aggressive performance and configuration (compactness) goals was designed, fabricated, and subsequently tested at the NASA Glenn Research Center.

  12. Quasi-kinoform type multilayer zone plate with high diffraction efficiency for high-energy X-rays

    International Nuclear Information System (INIS)

    Tamura, S; Yasumoto, M; Kamijo, N; Uesugi, K; Takeuchi, A; Terada, Y; Suzuki, Y

    2009-01-01

    A Fresnel zone plate (FZP) with high diffraction efficiency enables high-performance X-ray microscopy with reduced radiation damage to biological specimens. In order to attain high diffraction efficiency in the high-energy X-ray region, we have developed multilevel-type (6-step) multilayer FZPs with a diameter of 70 μm. The efficiencies of two FZPs were evaluated at the BL20XU beamline of SPring-8. For one FZP, a peak efficiency of 51% for the 1st-order diffraction has been obtained at 70 keV. Efficiencies higher than 40% have been achieved in the wide energy range of 70-90 keV. For the 2nd-order diffraction, a peak efficiency of 46% has been obtained at 37.5 keV.

  13. High-efficiency organic glass scintillators

    Science.gov (United States)

    Feng, Patrick L.; Carlson, Joseph S.

    2017-12-19

    A new family of neutron/gamma discriminating scintillators is disclosed that comprises stable organic glasses that may be melt-cast into transparent monoliths. These materials have been shown to provide light yields greater than solution-grown trans-stilbene crystals and efficient PSD capabilities when combined with 0.01 to 0.05% by weight of the total composition of a wavelength-shifting fluorophore. Photoluminescence measurements reveal fluorescence quantum yields that are 2 to 5 times greater than conventional plastic or liquid scintillator matrices, which accounts for the superior light yield of these glasses. The unique combination of high scintillation light-yields, efficient neutron/gamma PSD, and straightforward scale-up via melt-casting distinguishes the developed organic glasses from existing scintillators.

  14. An Efficient and Secure m-IPS Scheme of Mobile Devices for Human-Centric Computing

    Directory of Open Access Journals (Sweden)

    Young-Sik Jeong

    2014-01-01

    Full Text Available Recent rapid developments in wireless and mobile IT technologies have led to their application in many real-life areas, such as disasters, home networks, mobile social networks, medical services, industry, schools, and the military. Business/work environments have become wired/wireless, integrated with wireless networks. Although the increase in the use of mobile devices that can use wireless networks increases work efficiency and provides greater convenience, wireless access to networks represents a security threat. Currently, wireless intrusion prevention systems (IPSs) are used to prevent wireless security threats. However, these are not an ideal security measure for businesses that utilize mobile devices because they do not take account of temporal-spatial and role information factors. Therefore, in this paper, an efficient and secure mobile-IPS (m-IPS) is proposed for businesses utilizing mobile devices in mobile environments for human-centric computing. The m-IPS system incorporates temporal-spatial awareness in human-centric computing with various mobile devices and checks users' temporal-spatial information, profiles, and role information to provide precise access control. The application of the m-IPS can also be extended to the Internet of Things (IoT), one of the key technologies for fully supporting human-centric computing environments in truly ubiquitous settings with mobile devices.

  15. GPU-based high-performance computing for radiation therapy

    International Nuclear Information System (INIS)

    Jia, Xun; Jiang, Steve B; Ziegenhein, Peter

    2014-01-01

    Recent developments in radiation therapy demand high computational power to solve challenging problems in a timely fashion in a clinical environment. The graphics processing unit (GPU), as an emerging high-performance computing platform, has been introduced to radiotherapy. It is particularly attractive due to its high computational power, small size, and low cost for facility deployment and maintenance. Over the past few years, GPU-based high-performance computing in radiotherapy has experienced rapid developments. A tremendous amount of study has been conducted, in which large acceleration factors compared with the conventional CPU platform have been observed. In this paper, we will first give a brief introduction to the GPU hardware structure and programming model. We will then review the current applications of GPU in major imaging-related and therapy-related problems encountered in radiotherapy. A comparison of GPU with other platforms will also be presented. (topical review)

  16. BEAGLE: an application programming interface and high-performance computing library for statistical phylogenetics.

    Science.gov (United States)

    Ayres, Daniel L; Darling, Aaron; Zwickl, Derrick J; Beerli, Peter; Holder, Mark T; Lewis, Paul O; Huelsenbeck, John P; Ronquist, Fredrik; Swofford, David L; Cummings, Michael P; Rambaut, Andrew; Suchard, Marc A

    2012-01-01

    Phylogenetic inference is fundamental to our understanding of most aspects of the origin and evolution of life, and in recent years, there has been a concentration of interest in statistical approaches such as Bayesian inference and maximum likelihood estimation. Yet, for large data sets and realistic or interesting models of evolution, these approaches remain computationally demanding. High-throughput sequencing can yield data for thousands of taxa, but scaling to such problems using serial computing often necessitates the use of nonstatistical or approximate approaches. The recent emergence of graphics processing units (GPUs) provides an opportunity to leverage their excellent floating-point computational performance to accelerate statistical phylogenetic inference. A specialized library for phylogenetic calculation would allow existing software packages to make more effective use of available computer hardware, including GPUs. Adoption of a common library would also make it easier for other emerging computing architectures, such as field programmable gate arrays, to be used in the future. We present BEAGLE, an application programming interface (API) and library for high-performance statistical phylogenetic inference. The API provides a uniform interface for performing phylogenetic likelihood calculations on a variety of compute hardware platforms. The library includes a set of efficient implementations and can currently exploit hardware including GPUs using NVIDIA CUDA, central processing units (CPUs) with Streaming SIMD Extensions and related processor supplementary instruction sets, and multicore CPUs via OpenMP. To demonstrate the advantages of a common API, we have incorporated the library into several popular phylogenetic software packages. The BEAGLE library is free open source software licensed under the Lesser GPL and available from http://beagle-lib.googlecode.com. An example client program is available as public domain software.
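
    BEAGLE accelerates the phylogenetic likelihood calculation itself; the sketch below is not the BEAGLE API but a minimal single-site Felsenstein pruning computation under the Jukes-Cantor model on a toy two-leaf tree, to show the kind of arithmetic the library offloads to GPUs and SIMD units:

```python
import numpy as np

def jc_matrix(b):
    """Jukes-Cantor transition matrix for a branch of length b (expected subs/site)."""
    same = 0.25 + 0.75 * np.exp(-4.0 * b / 3.0)
    diff = 0.25 - 0.25 * np.exp(-4.0 * b / 3.0)
    return np.where(np.eye(4, dtype=bool), same, diff)

def partial(node, leaf_states, tree):
    """Conditional likelihoods at `node` via post-order (Felsenstein) pruning."""
    if node in leaf_states:                      # leaf: one-hot vector on the observed base
        v = np.zeros(4)
        v[leaf_states[node]] = 1.0
        return v
    (left, bl), (right, br) = tree[node]
    return (jc_matrix(bl) @ partial(left, leaf_states, tree)) * \
           (jc_matrix(br) @ partial(right, leaf_states, tree))

# Toy rooted tree: root -> leaf A (branch 0.1) and leaf B (branch 0.2); bases A=0, G=2.
tree = {"root": (("A", 0.1), ("B", 0.2))}
leaves = {"A": 0, "B": 2}
site_likelihood = 0.25 * partial("root", leaves, tree).sum()   # uniform root frequencies
print(site_likelihood)
```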

  17. The thermodynamic characteristics of high efficiency, internal-combustion engines

    International Nuclear Information System (INIS)

    Caton, Jerald A.

    2012-01-01

    Highlights: ► The thermodynamics of an automotive engine are determined using a cycle simulation. ► The net indicated thermal efficiency increased from 37.0% to 53.9%. ► High compression ratio, lean mixtures and high EGR were the important features. ► Efficiency increased due to lower heat losses, and increased work conversion. ► The nitric oxides were essentially zero due to the low combustion temperatures. - Abstract: Recent advancements have demonstrated new combustion modes for internal combustion engines that exhibit low nitric oxide emissions and high thermal efficiencies. These new combustion modes involve various combinations of stratification, lean mixtures, high levels of EGR, multiple injections, variable valve timings, two fuels, and other such features. Although the exact combination of these features that provides the best design is not yet clear, the results (low emissions with high efficiencies) are of major interest. The current work is directed at determining some of the fundamental thermodynamic reasons for the relatively high efficiencies and to quantify these factors. Both the first and second laws are used in this assessment. An automotive engine (5.7 l) which included some of the features mentioned above (e.g., high compression ratios, lean mixtures, and high EGR) was evaluated using a thermodynamic cycle simulation. These features were examined for a moderate load (bmep = 900 kPa), moderate speed (2000 rpm) condition. By the use of lean operation, high EGR levels, high compression ratio and other features, the net indicated thermal efficiency increased from 37.0% to 53.9%. These increases are explained in a step-by-step fashion. The major reasons for these improvements include the higher compression ratio and the dilute charge (lean mixture, high EGR). The dilute charge resulted in lower temperatures which in turn resulted in lower heat loss. In addition, the lower temperatures resulted in higher ratios of the specific heats which
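
    The efficiency levers highlighted above (a higher compression ratio, and a dilute charge that raises the ratio of specific heats) can be illustrated with the ideal air-standard Otto relation η = 1 − r^(1−γ). The numbers below are illustrative only, not outputs of the paper's cycle simulation:

```python
def otto_efficiency(r, gamma):
    """Air-standard Otto-cycle thermal efficiency: eta = 1 - r**(1 - gamma)."""
    return 1.0 - r ** (1.0 - gamma)

# Illustrative operating points (not the 5.7 L engine simulation from the record):
print(otto_efficiency(r=9.0,  gamma=1.30))   # ~0.48, conventional stoichiometric charge
print(otto_efficiency(r=13.0, gamma=1.35))   # ~0.59, higher r plus lean/EGR-diluted charge
```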

  18. Errors in measuring absorbed radiation and computing crop radiation use efficiency

    International Nuclear Information System (INIS)

    Gallo, K.P.; Daughtry, C.S.T.; Wiegand, C.L.

    1993-01-01

    Radiation use efficiency (RUE) is often a crucial component of crop growth models that relate dry matter production to energy received by the crop. RUE is a ratio that has units of g J⁻¹ if defined as phytomass per unit of energy received, and units of J J⁻¹ if defined as the energy content of phytomass per unit of energy received. Both the numerator and denominator in the computation of RUE can vary with experimental assumptions and methodologies. The objectives of this study were to examine the effect that different methods of measuring the numerator and denominator have on the RUE of corn (Zea mays L.) and to illustrate this variation with experimental data. Computational methods examined included (i) direct measurements of the fraction of photosynthetically active radiation absorbed (f_A), (ii) estimates of f_A derived from leaf area index (LAI), and (iii) estimates of f_A derived from spectral vegetation indices. Direct measurements of absorbed PAR from planting to physiological maturity of corn were consistently greater than the indirect estimates based on green LAI or the spectral vegetation indices. Consequently, the RUE calculated using directly measured absorbed PAR was lower than the RUE calculated using the indirect measures of absorbed PAR. For crops that contain senesced vegetation, green LAI and the spectral vegetation indices provide appropriate estimates of the fraction of PAR absorbed by a crop canopy and, thus, accurate estimates of crop radiation use efficiency
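
    A toy calculation of the ratio itself, showing why a larger measured fraction of absorbed PAR (f_A) yields a smaller computed RUE; the season totals below are invented for illustration, not data from the study:

```python
def radiation_use_efficiency(dry_matter_g_m2, incident_par_mj_m2, fapar):
    """RUE in g per MJ of absorbed PAR: dry matter / (fAPAR * incident PAR)."""
    return dry_matter_g_m2 / (fapar * incident_par_mj_m2)

dm, par = 1500.0, 900.0   # g m-2 of phytomass, MJ m-2 of incident PAR (illustrative)
print(radiation_use_efficiency(dm, par, fapar=0.60))  # directly measured f_A -> lower RUE
print(radiation_use_efficiency(dm, par, fapar=0.50))  # green-LAI-based f_A  -> higher RUE
```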

  19. A comparison of efficient methods for the computation of Born gluon amplitudes

    International Nuclear Information System (INIS)

    Dinsdale, Michael; Ternick, Marko; Weinzierl, Stefan

    2006-01-01

    We compare four different methods for the numerical computation of the pure gluonic amplitudes in the Born approximation. We are in particular interested in the efficiency of the various methods as the number n of the external particles increases. In addition we investigate the numerical accuracy in critical phase space regions. The methods considered are based on (i) Berends-Giele recurrence relations, (ii) scalar diagrams, (iii) MHV vertices and (iv) BCF recursion relations

  20. Performance Measurements in a High Throughput Computing Environment

    CERN Document Server

    AUTHOR|(CDS)2145966; Gribaudo, Marco

    The IT infrastructures of companies and research centres are implementing new technologies to satisfy the increasing need of computing resources for big data analysis. In this context, resource profiling plays a crucial role in identifying areas where the improvement of the utilisation efficiency is needed. In order to deal with the profiling and optimisation of computing resources, two complementary approaches can be adopted: the measurement-based approach and the model-based approach. The measurement-based approach gathers and analyses performance metrics executing benchmark applications on computing resources. Instead, the model-based approach implies the design and implementation of a model as an abstraction of the real system, selecting only those aspects relevant to the study. This Thesis originates from a project carried out by the author within the CERN IT department. CERN is an international scientific laboratory that conducts fundamental researches in the domain of elementary particle physics. The p...

  1. Experiments on high efficiency aerosol filtration

    International Nuclear Information System (INIS)

    Mazzini, M.; Cuccuru, A.; Kunz, P.

    1977-01-01

    Research on high efficiency aerosol filtration by the Nuclear Engineering Institute of Pisa University and by CAMEN in collaboration with CNEN is outlined. HEPA filter efficiency was studied as a function of the type and size of the test aerosol, and as a function of flowrate (±50% of the nominal value), air temperature (up to 70 °C), relative humidity (up to 100%), and durability in a corrosive atmosphere (up to 140 hours in NaCl mist). In the selected experimental conditions these influences were appreciable but are not sufficient to be significant in industrial HEPA filter applications. Planned future research is outlined: measurement of the efficiency of two HEPA filters in series using a fixed particle size; dependence of the efficiency on air temperatures up to 300-500 °C; performance when subject to smoke from burning organic materials (natural rubber, neoprene, miscellaneous plastics). Such studies are relevant to possible accidental fires in a plutonium laboratory

  2. Performance of a high efficiency high power UHF klystron

    International Nuclear Information System (INIS)

    Konrad, G.T.

    1977-03-01

    A 500 kW c-w klystron was designed for the PEP storage ring at SLAC. The tube operates at 353.2 MHz, 62 kV, a microperveance of 0.75, and a gain of approximately 50 dB. Stable operation is required for a VSWR as high as 2 : 1 at any phase angle. The design efficiency is 70%. To obtain this value of efficiency, a second harmonic cavity is used in order to produce a very tightly bunched beam in the output gap. At the present time it is planned to install 12 such klystrons in PEP. A tube with a reduced size collector was operated at 4% duty at 500 kW. An efficiency of 63% was observed. The same tube was operated up to 200 kW c-w for PEP accelerator cavity tests. A full-scale c-w tube reached 500 kW at 65 kV with an efficiency of 55%. In addition to power and phase measurements into a matched load, some data at various load mismatches are presented
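
    The quoted numbers are mutually consistent: with a microperveance of 0.75 at 62 kV, the space-charge-limited beam current and beam power imply roughly 70% conversion for 500 kW of RF output, matching the stated design efficiency. A quick check using the standard klystron relations (the calculation is an illustration, not text from the record):

```python
def beam_current_a(voltage_v, microperveance):
    """Space-charge-limited beam current I = K * V**1.5, with K given in microperveance."""
    return microperveance * 1e-6 * voltage_v ** 1.5

current = beam_current_a(62e3, 0.75)     # ~11.6 A
beam_power = 62e3 * current              # ~718 kW of DC beam power
print(500e3 / beam_power)                # ~0.70, i.e. the 70% design efficiency
```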

  3. Interactive Data Exploration for High-Performance Fluid Flow Computations through Porous Media

    KAUST Repository

    Perovic, Nevena

    2014-09-01

    © 2014 IEEE. The advent of huge data volumes in high-performance computing (HPC) applications such as fluid flow simulations usually hinders the interactive processing and exploration of simulation results. Such an interactive data exploration not only allows scientists to 'play' with their data but also to visualise huge (distributed) data sets in both an efficient and easy way. Therefore, we propose an HPC data exploration service based on a sliding window concept, which enables researchers to access remote data (available on a supercomputer or cluster) during simulation runtime without exceeding any bandwidth limitations between the HPC back-end and the user front-end.

  4. High Performance Computing in Science and Engineering '99 : Transactions of the High Performance Computing Center

    CERN Document Server

    Jäger, Willi

    2000-01-01

    The book contains reports about the most significant projects from science and engineering of the Federal High Performance Computing Center Stuttgart (HLRS). They were carefully selected in a peer-review process and are showcases of an innovative combination of state-of-the-art modeling, novel algorithms and the use of leading-edge parallel computer technology. The projects of HLRS are using supercomputer systems operated jointly by university and industry and therefore a special emphasis has been put on the industrial relevance of results and methods.

  5. Efficient 2-D DCT Computation from an Image Representation Point of View

    OpenAIRE

    Papakostas, G.A.; Koulouriotis, D.E.; Karakasis, E.G.

    2009-01-01

    A novel methodology that ensures the computation of 2-D DCT coefficients in gray-scale images as well as in binary ones, with high computation rates, was presented in the previous sections. Through a new image representation scheme, called ISR (Image Slice Representation) the 2-D DCT coefficients can be computed in significantly reduced time, with the same accuracy.
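
    For reference, the quantity being computed is the standard 2-D DCT-II; below is a minimal separable implementation (the textbook transform, not the ISR method of the record) against which fast schemes can be checked:

```python
import numpy as np

def dct2(image):
    """Orthonormal 2-D DCT-II computed separably as C_n @ X @ C_m.T."""
    def dct_matrix(k):
        idx = np.arange(k)
        c = np.cos(np.pi * (2 * idx[None, :] + 1) * idx[:, None] / (2 * k))
        c *= np.sqrt(2.0 / k)
        c[0, :] = np.sqrt(1.0 / k)       # DC row normalisation
        return c
    n, m = image.shape
    return dct_matrix(n) @ image @ dct_matrix(m).T

coeffs = dct2(np.random.rand(8, 8))      # 8x8 block, as in JPEG-style coding
print(coeffs.shape)
```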

  6. Novel Intermode Prediction Algorithm for High Efficiency Video Coding Encoder

    Directory of Open Access Journals (Sweden)

    Chan-seob Park

    2014-01-01

    Full Text Available The Joint Collaborative Team on Video Coding (JCT-VC) is developing the next-generation video coding standard, called High Efficiency Video Coding (HEVC). In HEVC, there are three units in the block structure: the coding unit (CU), the prediction unit (PU), and the transform unit (TU). The CU is the basic unit of region splitting, like the macroblock (MB). Each CU is split recursively into four equal-size blocks, starting from the tree block. In this paper, we propose a fast CU depth decision algorithm for HEVC to reduce its computational complexity. In the 2N×2N PU, the proposed method compares the rate-distortion (RD) cost and determines the depth using the compared information. Moreover, in order to speed up the encoding time, an efficient merge SKIP detection method is additionally developed based on the contextual mode information of neighboring CUs. Experimental results show that the proposed algorithm achieves an average time saving of 44.84% in the random access (RA) at Main profile configuration with the HEVC test model (HM) 10.0 reference software. Compared to the HM 10.0 encoder, a small BD-bitrate loss of 0.17% is also observed without significant loss of image quality.
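
    The decision logic described above can be pictured as an early-termination test around the recursive CU split; the sketch below is an illustrative rule with assumed helper callbacks, not the exact criterion of the paper:

```python
def decide_cu_depth(cu, depth, max_depth, rd_cost_2Nx2N, rd_cost_split, neighbors):
    """Illustrative early termination for CU quadtree splitting.

    rd_cost_2Nx2N / rd_cost_split are assumed callbacks returning RD costs;
    neighbors holds the already-coded neighboring CUs with their best modes.
    """
    cost_no_split = rd_cost_2Nx2N(cu, depth)

    # Contextual merge/SKIP check: if every coded neighbor chose SKIP, stop here.
    if neighbors and all(n.best_mode == "MERGE_SKIP" for n in neighbors):
        return depth, cost_no_split
    if depth == max_depth:
        return depth, cost_no_split

    cost_split = rd_cost_split(cu, depth + 1)     # RD cost of the four sub-CUs
    if cost_no_split <= cost_split:
        return depth, cost_no_split
    return depth + 1, cost_split

# Toy check with made-up costs: splitting is cheaper here, so the depth advances by one.
class ToyCU: best_mode = "INTER"
print(decide_cu_depth(ToyCU(), 0, 3,
                      rd_cost_2Nx2N=lambda cu, d: 100.0,
                      rd_cost_split=lambda cu, d: 80.0,
                      neighbors=[]))               # (1, 80.0)
```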

  7. Process development for high-efficiency silicon solar cells

    Energy Technology Data Exchange (ETDEWEB)

    Gee, J.M.; Basore, P.A.; Buck, M.E.; Ruby, D.S.; Schubert, W.K.; Silva, B.L.; Tingley, J.W.

    1991-12-31

    Fabrication of high-efficiency silicon solar cells in an industrial environment requires a different optimization than in a laboratory environment. Strategies are presented for process development of high-efficiency silicon solar cells, with a goal of simplifying technology transfer into an industrial setting. The strategies emphasize the use of statistical experimental design for process optimization, and the use of baseline processes and cells for process monitoring and quality control. 8 refs.

  8. Experimental quantum computing without entanglement.

    Science.gov (United States)

    Lanyon, B P; Barbieri, M; Almeida, M P; White, A G

    2008-11-14

    Deterministic quantum computation with one pure qubit (DQC1) is an efficient model of computation that uses highly mixed states. Unlike pure-state models, its power is not derived from the generation of a large amount of entanglement. Instead it has been proposed that other nonclassical correlations are responsible for the computational speedup, and that these can be captured by the quantum discord. In this Letter we implement DQC1 in an all-optical architecture, and experimentally observe the generated correlations. We find no entanglement, but large amounts of quantum discord-except in three cases where an efficient classical simulation is always possible. Our results show that even fully separable, highly mixed, states can contain intrinsically quantum mechanical correlations and that these could offer a valuable resource for quantum information technologies.

  9. Computation of High-Frequency Waves with Random Uncertainty

    KAUST Repository

    Malenova, Gabriela

    2016-01-06

    We consider the forward propagation of uncertainty in high-frequency waves, described by the second order wave equation with highly oscillatory initial data. The main sources of uncertainty are the wave speed and/or the initial phase and amplitude, described by a finite number of random variables with known joint probability distribution. We propose a stochastic spectral asymptotic method [1] for computing the statistics of uncertain output quantities of interest (QoIs), which are often linear or nonlinear functionals of the wave solution and its spatial/temporal derivatives. The numerical scheme combines two techniques: a high-frequency method based on Gaussian beams [2, 3] and a sparse stochastic collocation method [4]. The fast spectral convergence of the proposed method depends crucially on the presence of high stochastic regularity of the QoI independent of the wave frequency. In general, the high-frequency wave solutions to parametric hyperbolic equations are highly oscillatory and non-smooth in both physical and stochastic spaces. Consequently, the stochastic regularity of the QoI, which is a functional of the wave solution, may in principle be low and depend on frequency. In the present work, we provide theoretical arguments and numerical evidence that physically motivated QoIs based on local averages of |u_E|^2 are smooth, with derivatives in the stochastic space uniformly bounded in E, where u_E and E denote the highly oscillatory wave solution and the short wavelength, respectively. This observable-related regularity makes the proposed approach more efficient than current asymptotic approaches based on Monte Carlo sampling techniques.

  10. High-order computer-assisted estimates of topological entropy

    Science.gov (United States)

    Grote, Johannes

    The concept of Taylor Models is introduced, which offers highly accurate C^0 estimates for the enclosures of functional dependencies, combining high-order Taylor polynomial approximation of functions and rigorous estimates of the truncation error, performed using verified interval arithmetic. The focus of this work is on the application of Taylor Models in algorithms for strongly nonlinear dynamical systems. A method to obtain sharp rigorous enclosures of Poincare maps for certain types of flows and surfaces is developed and numerical examples are presented. Differential algebraic techniques allow the efficient and accurate computation of polynomial approximations for invariant curves of certain planar maps around hyperbolic fixed points. Subsequently we introduce a procedure to extend these polynomial curves to verified Taylor Model enclosures of local invariant manifolds with C^0 errors of size 10^-10 to 10^-14, and proceed to generate the global invariant manifold tangle up to comparable accuracy through iteration in Taylor Model arithmetic. Knowledge of the global manifold structure up to finite iterations of the local manifold pieces enables us to find all homoclinic and heteroclinic intersections in the generated manifold tangle. Combined with the mapping properties of the homoclinic points and their ordering we are able to construct a subshift of finite type as a topological factor of the original planar system to obtain rigorous lower bounds for its topological entropy. This construction is fully automatic and yields homoclinic tangles with several hundred homoclinic points. As an example rigorous lower bounds for the topological entropy of the Henon map are computed, which to the best knowledge of the authors yield the largest such estimates published so far.

  11. G-LoSA: An efficient computational tool for local structure-centric biological studies and drug design.

    Science.gov (United States)

    Lee, Hui Sun; Im, Wonpil

    2016-04-01

    Molecular recognition by protein mostly occurs in a local region on the protein surface. Thus, an efficient computational method for accurate characterization of protein local structural conservation is necessary to better understand biology and drug design. We present a novel local structure alignment tool, G-LoSA. G-LoSA aligns protein local structures in a sequence order independent way and provides a GA-score, a chemical feature-based and size-independent structure similarity score. Our benchmark validation shows the robust performance of G-LoSA to the local structures of diverse sizes and characteristics, demonstrating its universal applicability to local structure-centric comparative biology studies. In particular, G-LoSA is highly effective in detecting conserved local regions on the entire surface of a given protein. In addition, the applications of G-LoSA to identifying template ligands and predicting ligand and protein binding sites illustrate its strong potential for computer-aided drug design. We hope that G-LoSA can be a useful computational method for exploring interesting biological problems through large-scale comparison of protein local structures and facilitating drug discovery research and development. G-LoSA is freely available to academic users at http://im.compbio.ku.edu/GLoSA/. © 2016 The Protein Society.

  12. An Efficient Neural-Network-Based Microseismic Monitoring Platform for Hydraulic Fracture on an Edge Computing Architecture

    Directory of Open Access Journals (Sweden)

    Xiaopu Zhang

    2018-06-01

    Full Text Available Microseismic monitoring is one of the most critical technologies for hydraulic fracturing in oil and gas production. To detect events in an accurate and efficient way, there are two major challenges. One challenge is how to achieve high accuracy due to a poor signal-to-noise ratio (SNR). The other one is concerned with real-time data transmission. Taking these challenges into consideration, an edge-computing-based platform, namely Edge-to-Center LearnReduce, is presented in this work. The platform consists of a data center with many edge components. At the data center, a neural network model combined with convolutional neural network (CNN) and long short-term memory (LSTM) is designed and this model is trained by using previously obtained data. Once the model is fully trained, it is sent to edge components for events detection and data reduction. At each edge component, a probabilistic inference is added to the neural network model to improve its accuracy. Finally, the reduced data is delivered to the data center. Based on experiment results, a high detection accuracy (over 96%) with less transmitted data (about 90%) was achieved by using the proposed approach on a microseismic monitoring system. These results show that the platform can simultaneously improve the accuracy and efficiency of microseismic monitoring.

  13. An Efficient Neural-Network-Based Microseismic Monitoring Platform for Hydraulic Fracture on an Edge Computing Architecture.

    Science.gov (United States)

    Zhang, Xiaopu; Lin, Jun; Chen, Zubin; Sun, Feng; Zhu, Xi; Fang, Gengfa

    2018-06-05

    Microseismic monitoring is one of the most critical technologies for hydraulic fracturing in oil and gas production. To detect events in an accurate and efficient way, there are two major challenges. One challenge is how to achieve high accuracy due to a poor signal-to-noise ratio (SNR). The other one is concerned with real-time data transmission. Taking these challenges into consideration, an edge-computing-based platform, namely Edge-to-Center LearnReduce, is presented in this work. The platform consists of a data center with many edge components. At the data center, a neural network model combined with convolutional neural network (CNN) and long short-term memory (LSTM) is designed and this model is trained by using previously obtained data. Once the model is fully trained, it is sent to edge components for events detection and data reduction. At each edge component, a probabilistic inference is added to the neural network model to improve its accuracy. Finally, the reduced data is delivered to the data center. Based on experiment results, a high detection accuracy (over 96%) with less transmitted data (about 90%) was achieved by using the proposed approach on a microseismic monitoring system. These results show that the platform can simultaneously improve the accuracy and efficiency of microseismic monitoring.
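
    A minimal sketch of such a combined CNN+LSTM event detector for three-component waveform windows, written in PyTorch; the layer sizes, window length and framework choice are assumptions for illustration, not the architecture reported by the authors:

```python
import torch
import torch.nn as nn

class CnnLstmDetector(nn.Module):
    """1-D CNN feature extractor followed by an LSTM and a binary event classifier."""
    def __init__(self, n_channels=3, hidden=64):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv1d(n_channels, 16, kernel_size=7, padding=3), nn.ReLU(),
            nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool1d(4),
        )
        self.lstm = nn.LSTM(input_size=32, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)           # event / no-event logits

    def forward(self, x):                          # x: (batch, channels, samples)
        feats = self.cnn(x).transpose(1, 2)        # -> (batch, time, features)
        out, _ = self.lstm(feats)
        return self.head(out[:, -1, :])            # classify from the last time step

logits = CnnLstmDetector()(torch.randn(8, 3, 1024))   # 8 windows of 3-component traces
print(logits.shape)                                   # torch.Size([8, 2])
```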

  14. Asymptotic optimality and efficient computation of the leave-subject-out cross-validation

    KAUST Repository

    Xu, Ganggang

    2012-12-01

    Although the leave-subject-out cross-validation (CV) has been widely used in practice for tuning parameter selection for various nonparametric and semiparametric models of longitudinal data, its theoretical property is unknown and solving the associated optimization problem is computationally expensive, especially when there are multiple tuning parameters. In this paper, by focusing on the penalized spline method, we show that the leave-subject-out CV is optimal in the sense that it is asymptotically equivalent to the empirical squared error loss function minimization. An efficient Newton-type algorithm is developed to compute the penalty parameters that optimize the CV criterion. Simulated and real data are used to demonstrate the effectiveness of the leave-subject-out CV in selecting both the penalty parameters and the working correlation matrix. © 2012 Institute of Mathematical Statistics.
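
    The criterion itself is simple to state: hold out all observations of one subject at a time and average the squared prediction error. A generic sketch of that loop (the fit/predict callbacks and the toy ridge example stand in for the paper's penalized-spline estimator and Newton-type solver):

```python
import numpy as np

def leave_subject_out_cv(fit, predict, X, y, subject_ids, penalty):
    """Mean squared prediction error with each subject's rows held out in turn."""
    errors = []
    for s in np.unique(subject_ids):
        held = subject_ids == s
        model = fit(X[~held], y[~held], penalty)
        errors.append(np.mean((y[held] - predict(model, X[held])) ** 2))
    return np.mean(errors)

# Toy usage with a ridge-like fit, illustrative only:
fit = lambda X, y, lam: np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)
predict = lambda beta, X: X @ beta
rng = np.random.default_rng(1)
X = np.column_stack([np.ones(60), rng.normal(size=60)])
subjects = np.repeat(np.arange(10), 6)                 # 10 subjects, 6 observations each
y = X @ np.array([1.0, 2.0]) + rng.normal(0, 0.3, size=60)
best = min([0.01, 0.1, 1.0, 10.0],
           key=lambda lam: leave_subject_out_cv(fit, predict, X, y, subjects, lam))
print(best)
```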

  15. Computationally efficient method for optical simulation of solar cells and their applications

    Science.gov (United States)

    Semenikhin, I.; Zanuccoli, M.; Fiegna, C.; Vyurkov, V.; Sangiorgi, E.

    2013-01-01

    This paper presents two novel implementations of the Differential method to solve the Maxwell equations in nanostructured optoelectronic solid state devices. The first proposed implementation is based on an improved and computationally efficient T-matrix formulation that adopts multiple-precision arithmetic to tackle the numerical instability problem which arises due to evanescent modes. The second implementation adopts the iterative approach that allows to achieve low computational complexity O(N logN) or better. The proposed algorithms may work with structures with arbitrary spatial variation of the permittivity. The developed two-dimensional numerical simulator is applied to analyze the dependence of the absorption characteristics of a thin silicon slab on the morphology of the front interface and on the angle of incidence of the radiation with respect to the device surface.

  16. Asymptotic optimality and efficient computation of the leave-subject-out cross-validation

    KAUST Repository

    Xu, Ganggang; Huang, Jianhua Z.

    2012-01-01

    Although the leave-subject-out cross-validation (CV) has been widely used in practice for tuning parameter selection for various nonparametric and semiparametric models of longitudinal data, its theoretical property is unknown and solving the associated optimization problem is computationally expensive, especially when there are multiple tuning parameters. In this paper, by focusing on the penalized spline method, we show that the leave-subject-out CV is optimal in the sense that it is asymptotically equivalent to the empirical squared error loss function minimization. An efficient Newton-type algorithm is developed to compute the penalty parameters that optimize the CV criterion. Simulated and real data are used to demonstrate the effectiveness of the leave-subject-out CV in selecting both the penalty parameters and the working correlation matrix. © 2012 Institute of Mathematical Statistics.

  17. High efficiency confinement mode by electron cyclotron heating

    International Nuclear Information System (INIS)

    Funahashi, Akimasa

    1987-01-01

    In the medium size nuclear fusion experiment facility JFT-2M at the Japan Atomic Energy Research Institute, research on high-efficiency plasma confinement modes has been advanced, and in an experiment in June 1987 the formation of a high-efficiency confinement mode was successfully controlled by electron cyclotron heating for the first time in the world. This result has further advanced both the control of the formation of the high-efficiency confinement mode and the elucidation of its physical mechanism, and has promoted the research and development of electron cyclotron heating. In this paper, the recent results of the research on the high-efficiency confinement mode at JFT-2M are reported, and the role of JFT-2M and the experiments on improving core plasma performance are outlined. Plasma temperatures exceeding 100 million deg C have now been attained in large tokamaks, while medium size facilities are expected to put forward various measures for improving confinement performance and to clarify their scientific basis in support of the large facilities. JFT-2M started operation in April 1983 and has accumulated results steadily since then. (Kako, I.)

  18. Efficiency Evaluation of Handling of Geologic-Geophysical Information by Means of Computer Systems

    Science.gov (United States)

    Nuriyahmetova, S. M.; Demyanova, O. V.; Zabirova, L. M.; Gataullin, I. I.; Fathutdinova, O. A.; Kaptelinina, E. A.

    2018-05-01

    The development of oil and gas resources under difficult geological, geographical and economic conditions requires considerable financial expenditure; planned activities must therefore be carefully justified and rely on the most promising approaches and modern technologies from the point of view of cost efficiency. Ensuring high precision in regional and local forecasts and in the modeling of hydrocarbon reservoirs requires the analysis of huge arrays of distributed, spatially varying information. Solving this task calls for modern remote methods of investigating prospective oil-and-gas territories, the combined use of remote, non-destructive geologic-geophysical data and satellite Earth-sounding methods, and the most advanced technologies for their handling. In this article, the authors review the experience of Russian and foreign companies in handling geologic-geophysical information by means of computer systems. They conclude that multidimensional analysis of the geologic-geophysical information space and effective planning and monitoring of exploration work require the broad use of geoinformation technologies, one of the most promising directions for achieving high profitability in the oil and gas industry.

  19. High-Degree Neurons Feed Cortical Computations.

    Directory of Open Access Journals (Sweden)

    Nicholas M Timme

    2016-05-01

    Full Text Available Recent work has shown that functional connectivity among cortical neurons is highly varied, with a small percentage of neurons having many more connections than others. Also, recent theoretical developments now make it possible to quantify how neurons modify information from the connections they receive. Therefore, it is now possible to investigate how information modification, or computation, depends on the number of connections a neuron receives (in-degree) or sends out (out-degree). To do this, we recorded the simultaneous spiking activity of hundreds of neurons in cortico-hippocampal slice cultures using a high-density 512-electrode array. This preparation and recording method combination produced large numbers of neurons recorded at temporal and spatial resolutions that are not currently available in any in vivo recording system. We utilized transfer entropy (a well-established method for detecting linear and nonlinear interactions in time series) and the partial information decomposition (a powerful, recently developed tool for dissecting multivariate information processing into distinct parts) to quantify computation between neurons where information flows converged. We found that computations did not occur equally in all neurons throughout the networks. Surprisingly, neurons that computed large amounts of information tended to receive connections from high out-degree neurons. However, the in-degree of a neuron was not related to the amount of information it computed. To gain insight into these findings, we developed a simple feedforward network model. We found that a degree-modified Hebbian wiring rule best reproduced the pattern of computation and degree correlation results seen in the real data. Interestingly, this rule also maximized signal propagation in the presence of network-wide correlations, suggesting a mechanism by which cortex could deal with common random background input. These are the first results to show that the extent to
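
    Transfer entropy, the first of the two tools named above, measures how much the past of one series reduces uncertainty about the next value of another beyond that series' own past. A minimal plug-in estimator for binary spike trains with a single step of history (the history length and binary coding are simplifications; the authors' estimator on the 512-electrode data is more elaborate):

```python
import numpy as np
from collections import Counter

def transfer_entropy(x, y):
    """Plug-in transfer entropy TE(Y -> X) in bits, binary series, history length 1."""
    triples = list(zip(x[1:], x[:-1], y[:-1]))           # (x_next, x_past, y_past)
    n = len(triples)
    p_xyz = Counter(triples)
    p_xz = Counter((a, b) for a, b, _ in triples)
    p_z = Counter(b for _, b, _ in triples)
    p_yz = Counter((b, c) for _, b, c in triples)
    te = 0.0
    for (a, b, c), k in p_xyz.items():
        p_joint = k / n
        p_full = k / p_yz[(b, c)]                        # p(x_next | x_past, y_past)
        p_self = p_xz[(a, b)] / p_z[b]                   # p(x_next | x_past)
        te += p_joint * np.log2(p_full / p_self)
    return te

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 10000)
x = np.roll(y, 1)                   # x copies y with a one-step delay
print(transfer_entropy(x, y))       # close to 1 bit
```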

  20. On the Design of Energy-Efficient Location Tracking Mechanism in Location-Aware Computing

    Directory of Open Access Journals (Sweden)

    MoonBae Song

    2005-01-01

    Full Text Available The battery, in contrast to other hardware, is not governed by Moore's Law. In location-aware computing, power is a very limited resource. As a consequence, a number of promising techniques in various layers have recently been proposed to reduce the energy consumption. This paper considers the problem of minimizing the energy used to track the location of a mobile user over a wireless link in mobile computing. An energy-efficient location update protocol can be obtained by reducing the number of location update messages as far as possible and by switching off for as long as possible. This can be achieved through the concept of mobility-awareness that we propose. For this purpose, this paper proposes a novel mobility model, called the state-based mobility model (SMM), to provide a more generalized framework for both describing the mobility and updating the location information of complexly moving objects. We also introduce the state-based location update protocol (SLUP) based on this mobility model. An extensive experiment on various synthetic datasets shows that the proposed method improves the energy efficiency by 2 ∼ 3 times with an additional 10% imprecision cost.

  1. Fast Ss-Ilm a Computationally Efficient Algorithm to Discover Socially Important Locations

    Science.gov (United States)

    Dokuz, A. S.; Celik, M.

    2017-11-01

    Socially important locations are places that are frequently visited by social media users during their social media lifetime. Discovering socially important locations provides valuable information about user behaviour on social media networking sites. However, the discovery of socially important locations is challenging due to data volume and dimensionality, spatial and temporal calculations, location sparseness in social media datasets, and the inefficiency of current algorithms. Several studies in the literature have been conducted to discover important locations; however, the proposed approaches do not work in a computationally efficient manner. In this study, we propose the Fast SS-ILM algorithm, a modification of the SS-ILM algorithm, to mine socially important locations efficiently. Experimental results show that the proposed Fast SS-ILM algorithm decreases the execution time of the socially important location discovery process by up to 20%.

  2. FAST SS-ILM: A COMPUTATIONALLY EFFICIENT ALGORITHM TO DISCOVER SOCIALLY IMPORTANT LOCATIONS

    Directory of Open Access Journals (Sweden)

    A. S. Dokuz

    2017-11-01

    Full Text Available Socially important locations are places that are frequently visited by social media users during their social media lifetime. Discovering socially important locations provides valuable information about user behaviour on social media networking sites. However, the discovery of socially important locations is challenging due to data volume and dimensionality, spatial and temporal calculations, location sparseness in social media datasets, and the inefficiency of current algorithms. Several studies in the literature have been conducted to discover important locations; however, the proposed approaches do not work in a computationally efficient manner. In this study, we propose the Fast SS-ILM algorithm, a modification of the SS-ILM algorithm, to mine socially important locations efficiently. Experimental results show that the proposed Fast SS-ILM algorithm decreases the execution time of the socially important location discovery process by up to 20%.

  3. High-performance scientific computing in the cloud

    Science.gov (United States)

    Jorissen, Kevin; Vila, Fernando; Rehr, John

    2011-03-01

    Cloud computing has the potential to open up high-performance computational science to a much broader class of researchers, owing to its ability to provide on-demand, virtualized computational resources. However, before such approaches can become commonplace, user-friendly tools must be developed that hide the unfamiliar cloud environment and streamline the management of cloud resources for many scientific applications. We have recently shown that high-performance cloud computing is feasible for parallelized x-ray spectroscopy calculations. We now present benchmark results for a wider selection of scientific applications focusing on electronic structure and spectroscopic simulation software in condensed matter physics. These applications are driven by an improved portable interface that can manage virtual clusters and run various applications in the cloud. We also describe a next generation of cluster tools, aimed at improved performance and a more robust cluster deployment. Supported by NSF grant OCI-1048052.

  4. Quantum Accelerators for High-Performance Computing Systems

    OpenAIRE

    Britt, Keith A.; Mohiyaddin, Fahd A.; Humble, Travis S.

    2017-01-01

    We define some of the programming and system-level challenges facing the application of quantum processing to high-performance computing. Alongside barriers to physical integration, prominent differences in the execution of quantum and conventional programs challenge the intersection of these computational models. Following a brief overview of the state of the art, we discuss recent advances in programming and execution models for hybrid quantum-classical computing. We discuss a novel quantu...

  5. Computer-aided engineering in High Energy Physics

    International Nuclear Information System (INIS)

    Bachy, G.; Hauviller, C.; Messerli, R.; Mottier, M.

    1988-01-01

    Computing, a standard tool in the High Energy Physics community for a long time, is slowly being introduced at CERN in the mechanical engineering field. The first major application was structural analysis, followed by Computer-Aided Design (CAD). Development work is now progressing towards Computer-Aided Engineering built around a powerful database. This paper gives examples of the power of this approach applied to engineering for accelerators and detectors

  6. High-Order Dielectric Metasurfaces for High-Efficiency Polarization Beam Splitters and Optical Vortex Generators

    Science.gov (United States)

    Guo, Zhongyi; Zhu, Lie; Guo, Kai; Shen, Fei; Yin, Zhiping

    2017-08-01

    In this paper, a high-order dielectric metasurface based on silicon nanobrick array is proposed and investigated. By controlling the length and width of the nanobricks, the metasurfaces could supply two different incremental transmission phases for the X-linear-polarized (XLP) and Y-linear-polarized (YLP) light with extremely high efficiency over 88%. Based on the designed metasurface, two polarization beam splitters working in high-order diffraction modes have been designed successfully, which demonstrated a high transmitted efficiency. In addition, we have also designed two vortex-beam generators working in high-order diffraction modes to create vortex beams with the topological charges of 2 and 3. The employment of dielectric metasurfaces operating in high-order diffraction modes could pave the way for a variety of new ultra-efficient optical devices.
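
    The splitting and vortex generation both follow from the phase gradient that the nanobrick array imposes; at normal incidence the deflection obeys the grating relation sin θ = mλ/Λ, with Λ the period of the 2π phase ramp and m the diffraction order. The numbers below are assumed for illustration and are not taken from the record:

```python
import numpy as np

def deflection_angle_deg(wavelength_nm, phase_period_nm, order=1):
    """Anomalous-refraction angle at normal incidence: sin(theta) = m * lambda / period."""
    return np.degrees(np.arcsin(order * wavelength_nm / phase_period_nm))

# A 2*pi phase ramp repeating every 1600 nm deflects 1064 nm light by ~42 degrees
# in the first order; higher orders steer to correspondingly steeper angles.
print(deflection_angle_deg(1064.0, 1600.0, order=1))
```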

  7. An efficient computational method for a stochastic dynamic lot-sizing problem under service-level constraints

    NARCIS (Netherlands)

    Tarim, S.A.; Ozen, U.; Dogru, M.K.; Rossi, R.

    2011-01-01

    We provide an efficient computational approach to solve the mixed integer programming (MIP) model developed by Tarim and Kingsman [8] for solving a stochastic lot-sizing problem with service level constraints under the static–dynamic uncertainty strategy. The effectiveness of the proposed method

  8. All passive architecture for high efficiency cascaded Raman conversion

    Science.gov (United States)

    Balaswamy, V.; Arun, S.; Chayran, G.; Supradeepa, V. R.

    2018-02-01

    Cascaded Raman fiber lasers offer a convenient method to obtain scalable, high-power sources in wavelength regions inaccessible to rare-earth-doped fiber lasers. A previous limitation was the reduced efficiency of these lasers. Recently, new architectures have been proposed to enhance efficiency, but this came at the cost of increased complexity, requiring an additional low-power cascaded Raman laser. In this work, we overcome this with a new, all-passive architecture for high-efficiency cascaded Raman conversion. We demonstrate our architecture with a fifth-order cascaded Raman converter from 1117 nm to 1480 nm with an output power of ~64 W and an efficiency of 60%.
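
    Each Raman order in silica fiber down-shifts the light by roughly 13.2 THz, which is why five orders take 1117 nm to about 1480 nm. A quick check of that cascade (the nominal Stokes shift is an assumption about the fiber, not a figure quoted in the record):

```python
C = 299_792.458   # speed of light in nm * THz

def raman_cascade(pump_nm, orders, stokes_shift_thz=13.2):
    """Successive Stokes wavelengths of a cascaded Raman converter in silica fiber."""
    wavelengths = [pump_nm]
    for _ in range(orders):
        next_freq = C / wavelengths[-1] - stokes_shift_thz
        wavelengths.append(C / next_freq)
    return wavelengths

print([round(w) for w in raman_cascade(1117.0, 5)])
# -> roughly [1117, 1175, 1239, 1310, 1391, 1481] nm
```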

  9. Computer controlled high voltage system

    Energy Technology Data Exchange (ETDEWEB)

    Kunov, B; Georgiev, G; Dimitrov, L [and others]

    1996-12-31

    A multichannel computer controlled high-voltage power supply system is developed. The basic technical parameters of the system are: output voltage: 100-3000 V; output current: 0-3 mA; maximum number of channels in one crate: 78. 3 refs.

  10. High-Precision Computation and Mathematical Physics

    International Nuclear Information System (INIS)

    Bailey, David H.; Borwein, Jonathan M.

    2008-01-01

    At the present time, IEEE 64-bit floating-point arithmetic is sufficiently accurate for most scientific applications. However, for a rapidly growing body of important scientific computing applications, a higher level of numeric precision is required. Such calculations are facilitated by high-precision software packages that include high-level language translation modules to minimize the conversion effort. This paper presents a survey of recent applications of these techniques and provides some analysis of their numerical requirements. These applications include supernova simulations, climate modeling, planetary orbit calculations, Coulomb n-body atomic systems, scattering amplitudes of quarks, gluons and bosons, nonlinear oscillator theory, Ising theory, quantum field theory and experimental mathematics. We conclude that high-precision arithmetic facilities are now an indispensable component of a modern large-scale scientific computing environment.
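
    As a small illustration of computing beyond IEEE 64-bit precision, the mpmath package (chosen here as an example; the record does not name a specific package) resolves the famous near-integer exp(π√163), whose fractional defect of about 7.5e-13 sits right at the edge of what double precision can represent:

```python
from mpmath import mp, mpf, exp, pi, sqrt

mp.dps = 50                                # 50 significant decimal digits
x = exp(pi * sqrt(mpf(163)))               # Ramanujan's near-integer constant
print(x)                                   # 262537412640768743.999999999999250...
print(x - 262537412640768744)              # about -7.5e-13, the tiny defect
```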

  11. Computer architecture for efficient algorithmic executions in real-time systems: New technology for avionics systems and advanced space vehicles

    Science.gov (United States)

    Carroll, Chester C.; Youngblood, John N.; Saha, Aindam

    1987-01-01

    Improvements and advances in the development of computer architecture now provide innovative technology for the recasting of traditional sequential solutions into high-performance, low-cost, parallel system to increase system performance. Research conducted in development of specialized computer architecture for the algorithmic execution of an avionics system, guidance and control problem in real time is described. A comprehensive treatment of both the hardware and software structures of a customized computer which performs real-time computation of guidance commands with updated estimates of target motion and time-to-go is presented. An optimal, real-time allocation algorithm was developed which maps the algorithmic tasks onto the processing elements. This allocation is based on the critical path analysis. The final stage is the design and development of the hardware structures suitable for the efficient execution of the allocated task graph. The processing element is designed for rapid execution of the allocated tasks. Fault tolerance is a key feature of the overall architecture. Parallel numerical integration techniques, tasks definitions, and allocation algorithms are discussed. The parallel implementation is analytically verified and the experimental results are presented. The design of the data-driven computer architecture, customized for the execution of the particular algorithm, is discussed.
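
    The allocation step described above hinges on critical-path analysis of the task graph; a minimal sketch of that computation on a toy guidance-and-control task graph (the task names and execution times are invented for illustration):

```python
from functools import lru_cache

# Toy task graph: task -> (execution time, successor tasks). Illustrative only.
TASKS = {
    "read_sensors":   (2, ["estimate_state"]),
    "estimate_state": (5, ["guidance_law"]),
    "update_target":  (3, ["guidance_law"]),
    "guidance_law":   (4, ["issue_command"]),
    "issue_command":  (1, []),
}

@lru_cache(maxsize=None)
def longest_path_from(task):
    """Total execution time of the longest path starting at `task`."""
    time, successors = TASKS[task]
    return time + max((longest_path_from(s) for s in successors), default=0)

critical_length = max(longest_path_from(t) for t in TASKS)
print(critical_length)   # 12: read_sensors -> estimate_state -> guidance_law -> issue_command
```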

  12. High-efficiency photovoltaic cells

    Science.gov (United States)

    Yang, H.T.; Zehr, S.W.

    1982-06-21

    High efficiency solar converters comprised of a two cell, non-lattice matched, monolithic stacked semiconductor configuration using optimum pairs of cells having bandgaps in the range 1.6 to 1.7 eV and 0.95 to 1.1 eV, and a method of fabrication thereof, are disclosed. The high band gap subcells are fabricated using metal organic chemical vapor deposition (MOCVD), liquid phase epitaxy (LPE) or molecular beam epitaxy (MBE) to produce the required AlGaAs layers of optimized composition, thickness and doping to produce high performance, heteroface homojunction devices. The low bandgap subcells are similarly fabricated from AlGa(As)Sb compositions by LPE, MBE or MOCVD. These subcells are then coupled to form a monolithic structure by an appropriate bonding technique which also forms the required transparent intercell ohmic contact (IOC) between the two subcells. Improved ohmic contacts to the high bandgap semiconductor structure can be formed by vacuum evaporating two suitable metal or semiconductor materials which react during laser annealing to form a low bandgap semiconductor which provides a low contact resistance structure.

  13. High performance parallel computers for science

    International Nuclear Information System (INIS)

    Nash, T.; Areti, H.; Atac, R.; Biel, J.; Cook, A.; Deppe, J.; Edel, M.; Fischler, M.; Gaines, I.; Hance, R.

    1989-01-01

    This paper reports that Fermilab's Advanced Computer Program (ACP) has been developing cost-effective, yet practical, parallel computers for high energy physics since 1984. The ACP's latest developments are proceeding in two directions. A Second Generation ACP Multiprocessor System for experiments will include $3500 RISC processors, each with performance over 15 VAX MIPS. To support such high performance, the new system allows parallel I/O, parallel interprocess communication, and parallel host processes. The ACP Multi-Array Processor has been developed for theoretical physics. Each $4000 node is a FORTRAN- or C-programmable pipelined 20 Mflops (peak), 10 MByte single-board computer. These are plugged into a 16-port crossbar switch crate which handles both inter- and intra-crate communication. The crates are connected in a hypercube. Site-oriented applications like lattice gauge theory are supported by system software called CANOPY, which makes the hardware virtually transparent to users. A 256-node, 5 GFlop system is under construction.

  14. Ultra-high Efficiency DC-DC Converter using GaN Devices

    DEFF Research Database (Denmark)

    Ramachandran, Rakesh

    2016-01-01

    Silicon power devices have been improved continuously for many decades. However, the rate of improvement slowed as silicon power materials asymptotically approached their theoretical bounds. Compared to Si, wide-bandgap materials such as Silicon Carbide (SiC) and Gallium Nitride (GaN) are promising semiconductors for power devices due to their superior material properties, which can be utilized in power converters to make them more compact and highly efficient. This thesis, entitled “Ultra-high Efficiency DC-DC Converter using GaN devices”, focuses on achieving ultra-high conversion efficiency in an isolated dc-dc converter by the optimal utilization of GaN devices. Efficiency measurements from the hardware prototypes of both studied topologies are presented in the thesis. Finally, the bidirectional operation of an optimized isolated dc-dc converter is presented; the optimized converter achieved an ultra-high efficiency of 98.8% in both directions.

  15. Saving energy via high-efficiency fans.

    Science.gov (United States)

    Heine, Thomas

    2016-08-01

    Thomas Heine, sales and market manager for EC Upgrades, the retrofit arm of global provider of air movement solutions, ebm-papst A&NZ, discusses the retrofitting of high-efficiency fans to existing HVAC equipment to 'drastically reduce energy consumption'.

  16. Design and reliability, availability, maintainability, and safety analysis of a high availability quadruple vital computer system

    Institute of Scientific and Technical Information of China (English)

    Ping TAN; Wei-ting HE; Jia LIN; Hong-ming ZHAO; Jian CHU

    2011-01-01

    With the development of high-speed railways in China, more than 2000 high-speed trains will be put into use. Safety and efficiency of railway transportation are increasingly important. We have designed a high availability quadruple vital computer (HAQVC) system based on an analysis of the architecture of the traditional double 2-out-of-2 system and the 2-out-of-3 system. The HAQVC system is a system with high availability and safety, with prominent characteristics such as a brand-new internal architecture, high efficiency, a reliable data interaction mechanism, and an operation state change mechanism. The hardware of the vital CPU is based on ARM7 with a real-time embedded safe operating system (ES-OS). The Markov modeling method is designed to evaluate the reliability, availability, maintainability, and safety (RAMS) of the system. In this paper, we demonstrate that the HAQVC system is more reliable than the all voting triple modular redundancy (AVTMR) system and the double 2-out-of-2 system. Thus, the design can be used for a specific application system, such as an airplane or high-speed railway system.

  17. Bioblendstocks that Enable High Efficiency Engine Designs

    Energy Technology Data Exchange (ETDEWEB)

    McCormick, Robert L.; Fioroni, Gina M.; Ratcliff, Matthew A.; Zigler, Bradley T.; Farrell, John

    2016-11-03

    The past decade has seen a high level of innovation in production of biofuels from sugar, lipid, and lignocellulose feedstocks. As discussed in several talks at this workshop, ethanol blends in the E25 to E50 range could enable more highly efficient spark-ignited (SI) engines. This is because of their knock resistance properties that include not only high research octane number (RON), but also charge cooling from high heat of vaporization, and high flame speed. Emerging alcohol fuels such as isobutanol or mixed alcohols have desirable properties such as reduced gasoline blend vapor pressure, but also have lower RON than ethanol. These fuels may be able to achieve the same knock resistance benefits, but likely will require higher blend levels or higher RON hydrocarbon blendstocks. A group of very high RON (>150) oxygenates such as dimethyl furan, methyl anisole, and related compounds are also produced from biomass. While providing no increase in charge cooling, their very high octane numbers may provide adequate knock resistance for future highly efficient SI engines. Given this range of options for highly knock resistant fuels there appears to be a critical need for a fuel knock resistance metric that includes effects of octane number, heat of vaporization, and potentially flame speed. Emerging diesel fuels include highly branched long-chain alkanes from hydroprocessing of fats and oils, as well as sugar-derived terpenoids. These have relatively high cetane number (CN), which may have some benefits in designing more efficient CI engines. Fast pyrolysis of biomass can produce diesel boiling range streams that are high in aromatic, oxygen and acid contents. Hydroprocessing can be applied to remove oxygen and consequently reduce acidity, however there are strong economic incentives to leave up to 2 wt% oxygen in the product. This oxygen will primarily be present as low CN alkyl phenols and aryl ethers. While these have high heating value, their presence in diesel fuel

  18. After Installation: Ubiquitous Computing and High School Science in Three Experienced, High-Technology Schools

    Science.gov (United States)

    Drayton, Brian; Falk, Joni K.; Stroud, Rena; Hobbs, Kathryn; Hammerman, James

    2010-01-01

    There are few studies of the impact of ubiquitous computing on high school science, and the majority of studies of ubiquitous computing report only on the early stages of implementation. The present study presents data on 3 high schools with carefully elaborated ubiquitous computing systems that have gone through at least one "obsolescence cycle"…

  19. Stabilization void-fill encapsulation high-efficiency particulate filters

    International Nuclear Information System (INIS)

    Alexander, R.G.; Stewart, W.E.; Phillips, S.J.; Serkowski, M.M.; England, J.L.; Boynton, H.C.

    1994-05-01

    This report discusses high-efficiency particulate air (HEPA) filter systems that are contaminated with radionuclides, are part of the nuclear fuel processing systems operated by the US Department of Energy (DOE), and require replacement and safe, efficient disposal for plant safety. Two K-3 HEPA filters were removed from service, placed in burial boxes, buried, and safely and efficiently stabilized remotely, which reduced radiation exposure to personnel and the environment.

  20. Optical interconnection networks for high-performance computing systems

    International Nuclear Information System (INIS)

    Biberman, Aleksandr; Bergman, Keren

    2012-01-01

    Enabled by silicon photonic technology, optical interconnection networks have the potential to be a key disruptive technology in computing and communication industries. The enduring pursuit of performance gains in computing, combined with stringent power constraints, has fostered the ever-growing computational parallelism associated with chip multiprocessors, memory systems, high-performance computing systems and data centers. Sustaining these parallelism growths introduces unique challenges for on- and off-chip communications, shifting the focus toward novel and fundamentally different communication approaches. Chip-scale photonic interconnection networks, enabled by high-performance silicon photonic devices, offer unprecedented bandwidth scalability with reduced power consumption. We demonstrate that the silicon photonic platforms have already produced all the high-performance photonic devices required to realize these types of networks. Through extensive empirical characterization in much of our work, we demonstrate such feasibility of waveguides, modulators, switches and photodetectors. We also demonstrate systems that simultaneously combine many functionalities to achieve more complex building blocks. We propose novel silicon photonic devices, subsystems, network topologies and architectures to enable unprecedented performance of these photonic interconnection networks. Furthermore, the advantages of photonic interconnection networks extend far beyond the chip, offering advanced communication environments for memory systems, high-performance computing systems, and data centers. (review article)

  1. Efficient computation of k-Nearest Neighbour Graphs for large high-dimensional data sets on GPU clusters.

    Directory of Open Access Journals (Sweden)

    Ali Dashti

    Full Text Available This paper presents an implementation of the brute-force exact k-Nearest Neighbor Graph (k-NNG) construction for ultra-large, high-dimensional data clouds. The proposed method uses Graphics Processing Units (GPUs) and is scalable with multi-levels of parallelism (between nodes of a cluster, between different GPUs on a single node, and within a GPU). The method is applicable to homogeneous computing clusters with a varying number of nodes and GPUs per node. We achieve a 6-fold speedup in data processing as compared with an optimized method running on a cluster of CPUs and bring a hitherto impossible k-NNG generation for a dataset of twenty million images with 15k dimensionality into the realm of practical possibility.
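    For readers unfamiliar with the underlying computation, here is a minimal CPU-only NumPy sketch of brute-force k-NNG construction; the chunk size, synthetic data, and function name are illustrative assumptions, and the record's method distributes the same distance and selection work across GPUs and cluster nodes.

```python
# Brute-force k-nearest-neighbour graph: chunked pairwise distances plus a
# partial sort per query row. The GPU version parallelizes exactly these steps.
import numpy as np

def knn_graph(points, k, chunk=1024):
    """Return an (n, k) array of nearest-neighbour indices (self excluded)."""
    n = points.shape[0]
    sq = np.sum(points**2, axis=1)              # squared norms, reused per chunk
    neighbours = np.empty((n, k), dtype=np.int64)
    for start in range(0, n, chunk):
        stop = min(start + chunk, n)
        # squared Euclidean distances from this chunk of queries to all points
        d2 = sq[start:stop, None] + sq[None, :] - 2.0 * points[start:stop] @ points.T
        d2[np.arange(stop - start), np.arange(start, stop)] = np.inf   # exclude self
        idx = np.argpartition(d2, k, axis=1)[:, :k]                    # k smallest, unordered
        rows = np.arange(stop - start)[:, None]
        order = np.argsort(d2[rows, idx], axis=1)                      # order those k
        neighbours[start:stop] = idx[rows, order]
    return neighbours

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(2000, 64)).astype(np.float32)
    print(knn_graph(X, k=10).shape)             # (2000, 10)
```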

  2. High-efficient electron linacs

    International Nuclear Information System (INIS)

    Glavatskikh, K.V.; Zverev, B.V.; Kalyuzhnyj, V.E.; Morozov, V.L.; Nikolaev, S.V.; Plotnikov, S.N.; Sobenin, N.P.; Vovna, V.A.; Gryzlov, A.V.

    1993-01-01

    A comparative analysis of travelling-wave and standing-wave electron linear accelerators (ELAs) designed for 10 MeV energy and high efficiency is carried out. It is shown that, from the point of view of dimensions, a standing-wave or combined-type ELA is preferable. From the point of view of impedance characteristics, in any variant using a magnetron as the HF generator, special requirements must be imposed on the accelerating structure if no ferrite isolation is provided in the HF channel. 3 refs., 4 figs., 1 tab

  3. Green Computing

    Directory of Open Access Journals (Sweden)

    K. Shalini

    2013-01-01

    Full Text Available Green computing is all about using computers in a smarter and eco-friendly way. It is the environmentally responsible use of computers and related resources, which includes the implementation of energy-efficient central processing units, servers and peripherals, as well as reduced resource consumption and proper disposal of electronic waste. Computers certainly make up a large part of many people's lives and traditionally are extremely damaging to the environment. Manufacturers of computers and their parts have been espousing the green cause to help protect the environment from computers and electronic waste in any way. Research continues into key areas such as making the use of computers as energy-efficient as possible, and designing algorithms and systems for efficiency-related computer technologies.

  4. Implementation of the Principal Component Analysis onto High-Performance Computer Facilities for Hyperspectral Dimensionality Reduction: Results and Comparisons

    Directory of Open Access Journals (Sweden)

    Ernestina Martel

    2018-06-01

    Full Text Available Dimensionality reduction represents a critical preprocessing step in order to increase the efficiency and the performance of many hyperspectral imaging algorithms. However, dimensionality reduction algorithms, such as the Principal Component Analysis (PCA), suffer from their computationally demanding nature, which makes their implementation onto high-performance computer architectures advisable for applications under strict latency constraints. This work presents the implementation of the PCA algorithm onto two different high-performance devices, namely, an NVIDIA Graphics Processing Unit (GPU) and a Kalray manycore, uncovering a highly valuable set of tips and tricks in order to take full advantage of the inherent parallelism of these high-performance computing platforms, and hence reduce the time required to process a given hyperspectral image. Moreover, the results achieved with different hyperspectral images have been compared with those obtained with a field programmable gate array (FPGA)-based implementation of the PCA algorithm that has been recently published, providing, for the first time in the literature, a comprehensive analysis that highlights the pros and cons of each option.
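    As a reference point, a minimal NumPy sketch of the PCA steps being accelerated (band-mean removal, covariance, eigendecomposition, projection) is given below; the synthetic cube, component count, and function name are assumptions for illustration and not the authors' GPU or manycore implementation.

```python
# PCA-based dimensionality reduction of a hyperspectral cube (rows x cols x bands).
import numpy as np

def pca_reduce(cube, n_components):
    rows, cols, bands = cube.shape
    X = cube.reshape(-1, bands).astype(np.float64)   # pixels x bands
    X -= X.mean(axis=0)                              # band-wise mean removal
    cov = (X.T @ X) / (X.shape[0] - 1)               # bands x bands covariance
    eigvals, eigvecs = np.linalg.eigh(cov)           # ascending eigenvalues
    components = eigvecs[:, ::-1][:, :n_components]  # top principal directions
    return (X @ components).reshape(rows, cols, n_components)

if __name__ == "__main__":
    cube = np.random.rand(64, 64, 128)               # synthetic image with 128 bands
    print(pca_reduce(cube, n_components=8).shape)    # (64, 64, 8)
```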

  5. A simple, robust and efficient high-order accurate shock-capturing scheme for compressible flows: Towards minimalism

    Science.gov (United States)

    Ohwada, Taku; Shibata, Yuki; Kato, Takuma; Nakamura, Taichi

    2018-06-01

    Developed is a high-order accurate shock-capturing scheme for the compressible Euler/Navier-Stokes equations; the formal accuracy is 5th order in space and 4th order in time. The performance and efficiency of the scheme are validated in various numerical tests. The main ingredients of the scheme are nothing special; they are variants of the standard numerical flux, MUSCL, the usual Lagrange polynomial and the conventional Runge-Kutta method. The scheme can compute a boundary layer accurately with a reasonable resolution and capture a stationary contact discontinuity sharply without inner points. And yet it is endowed with high resistance against shock anomalies (carbuncle phenomenon, post-shock oscillations, etc.). A good balance between high robustness and low dissipation is achieved by blending three types of numerical fluxes according to the physical situation in an intuitively easy-to-understand way. The performance of the scheme is largely comparable to that of WENO5-Rusanov, while its computational cost is 30-40% less than that of the advanced scheme.
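    The blending idea can be made concrete with the toy sketch below, which mixes three candidate fluxes for a 1-D scalar law using a simple jump-based sensor; the sensor, weights, and choice of Burgers' flux are assumptions for illustration and are not the paper's actual formulas.

```python
# Blend three numerical fluxes: a dissipative one near shocks, low-dissipation
# ones in smooth regions, with weights set by a crude smoothness sensor.
import numpy as np

def flux(u):                       # model flux for Burgers' equation
    return 0.5 * u * u

def rusanov(uL, uR):               # robust, dissipative
    a = np.maximum(np.abs(uL), np.abs(uR))
    return 0.5 * (flux(uL) + flux(uR)) - 0.5 * a * (uR - uL)

def central(uL, uR):               # accurate in smooth regions
    return 0.5 * (flux(uL) + flux(uR))

def upwind(uL, uR):                # simple intermediate choice
    a = 0.5 * (uL + uR)
    return np.where(a >= 0.0, flux(uL), flux(uR))

def blended_flux(uL, uR):
    # jump-based sensor in [0, 1]: large jumps push the blend toward Rusanov
    s = np.abs(uR - uL) / (np.abs(uL) + np.abs(uR) + 1e-12)
    w_shock = np.clip(3.0 * s, 0.0, 1.0)
    w_smooth = 1.0 - w_shock
    return w_shock * rusanov(uL, uR) + w_smooth * (0.5 * central(uL, uR) + 0.5 * upwind(uL, uR))

if __name__ == "__main__":
    uL = np.array([1.0, 1.0, 0.2])
    uR = np.array([1.0, 0.2, 0.2])
    print(blended_flux(uL, uR))
```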

  6. Segmentation of low‐cost high efficiency oxide‐based thermoelectric materials

    DEFF Research Database (Denmark)

    Le, Thanh Hung; Van Nong, Ngo; Linderoth, Søren

    2015-01-01

    Thermoelectric (TE) oxide materials have attracted great interest in advanced renewable energy research owing to the fact that they consist of abundant elements, can be manufactured by low-cost processing, sustain high temperatures, are robust, and provide long lifetimes. However, the low conversion efficiency of TE oxides has been a major drawback limiting the broader application of these materials. In this work, theoretical calculations are used to predict how segmentation of oxide and semimetal materials, utilizing the benefits of both types of materials, can provide high-efficiency, high-temperature oxide-based segmented legs. The materials for segmentation are selected by their compatibility factors and their conversion efficiency versus material cost, i.e., the “efficiency ratio”. Numerical modelling results showed that conversion efficiency could reach values of more than 10% for unicouples using…

  7. High-efficiency CARM

    Energy Technology Data Exchange (ETDEWEB)

    Bratman, V.L.; Kol'chugin, B.D.; Samsonov, S.V.; Volkov, A.B. [Institute of Applied Physics, Nizhny Novgorod (Russian Federation)]

    1995-12-31

    The Cyclotron Autoresonance Maser (CARM) is a well-known variety of FEMs. Unlike the ubitron, in which electrons move in a periodical undulator field, in the CARM the particles move along helical trajectories in a uniform magnetic field. Since it is much simpler to generate strong homogeneous magnetic fields than periodical ones, for a relatively low electron energy (≤1-3 MeV) the period of the particles' trajectories in the CARM can be considerably smaller than in the undulator, in which, moreover, the field decreases rapidly in the transverse direction. In spite of this evident advantage, the number of papers on the CARM is an order of magnitude less than on the ubitron, which is apparently caused by the low (not more than 10%) CARM efficiency obtained in experiments. At the same time, ubitrons operating in two rather complicated regimes - trapping and adiabatic deceleration of particles, and combined undulator and reversed guiding fields - yielded efficiencies of 34% and 27%, respectively. The aim of this work is to demonstrate that high efficiency can be reached even for the simplest version of the CARM. In order to reduce sensitivity to the axial velocity spread of particles, a short interaction length, over which electrons underwent only 4-5 cyclotron oscillations, was used in this work. As in earlier experiments, a narrow anode outlet of a field-emission electron gun cut out the "most rectilinear" near-axis part of the electron beam. Additionally, the magnetic field of a small correcting coil compensated spurious electron oscillations pumped by the anode aperture. A kicker, in the form of a current-carrying frame sloping to the axis, provided a controlled value of rotary velocity with a small additional velocity spread. A simple cavity, consisting of a cylindrical waveguide section bounded by a cut-off waveguide on the cathode side and by a Bragg reflector on the collector side, was used as the CARM oscillator microwave system.

  8. New III-V cell design approaches for very high efficiency

    Energy Technology Data Exchange (ETDEWEB)

    Lundstrom, M.S.; Melloch, M.R.; Lush, G.B.; Patkar, M.P.; Young, M.P. (Purdue Univ., Lafayette, IN (United States))

    1993-04-01

    This report describes efforts to examine new solar cell design approaches for achieving very high conversion efficiencies. The program consists of two elements. The first centers on exploring new thin-film approaches specifically designed for III-V semiconductors. Substantial efficiency gains may be possible by employing light trapping techniques to confine the incident photons, as well as the photons emitted by radiative recombination. The thin-film approach is a promising route for achieving substantial performance improvements in the already high-efficiency, single-junction, III-V cell. The second element of the research involves exploring design approaches for achieving high conversion efficiencies without requiring extremely high-quality material. This work has applications to multiple-junction cells, for which the selection of a component cell often involves a compromise between optimum band gap and optimum material quality. It could also be a benefit in a manufacturing environment by making the cell's efficiency less dependent on material quality.

  9. Efficient continuous-wave nonlinear frequency conversion in high-Q gallium nitride photonic crystal cavities on silicon

    Directory of Open Access Journals (Sweden)

    Mohamed Sabry Mohamed

    2017-03-01

    Full Text Available We report on nonlinear frequency conversion from the telecom range via second harmonic generation (SHG) and third harmonic generation (THG) in suspended gallium nitride slab photonic crystal (PhC) cavities on silicon, under continuous-wave resonant excitation. Optimized two-dimensional PhC cavities with augmented far-field coupling have been characterized with quality factors as high as 4.4 × 10⁴, approaching the computed theoretical values. The strong enhancement in light confinement has enabled efficient SHG, achieving a normalized conversion efficiency of 2.4 × 10⁻³ W⁻¹, as well as simultaneous THG. SHG emission power of up to 0.74 nW has been detected without saturation. The results herein validate the suitability of gallium nitride for integrated nonlinear optical processing.

  10. Highly efficient and eco-friendly gold-catalyzed synthesis of homoallylic ketones

    KAUST Repository

    Gómez-Suárez, Adrián; Gasperini, Danila; Vummaleti, Sai V. C.; Poater, Albert; Cavallo, Luigi; Nolan, Steven P.

    2014-01-01

    We report a new catalytic protocol for the synthesis of γ,δ-unsaturated carbonyl units from simple starting materials, allylic alcohols and alkynes, via a hydroalkoxylation/Claisen rearrangement sequence. This new process is more efficient (higher TON and TOF) and more eco-friendly (increased mass efficiency) than the previous state-of-the-art technique. In addition, this method tolerates both terminal and internal alkynes. Moreover, computational studies have been carried out in order to shed light on how the Claisen rearrangement is initiated. © 2014 American Chemical Society.

  11. Highly efficient and eco-friendly gold-catalyzed synthesis of homoallylic ketones

    KAUST Repository

    Gómez-Suárez, Adrián

    2014-08-01

    We report a new catalytic protocol for the synthesis of γ,δ-unsaturated carbonyl units from simple starting materials, allylic alcohols and alkynes, via a hydroalkoxylation/Claisen rearrangement sequence. This new process is more efficient (higher TON and TOF) and more eco-friendly (increased mass efficiency) than the previous state-of-the-art technique. In addition, this method tolerates both terminal and internal alkynes. Moreover, computational studies have been carried out in order to shed light on how the Claisen rearrangement is initiated. © 2014 American Chemical Society.

  12. Charge transport in highly efficient iridium cored electrophosphorescent dendrimers

    Science.gov (United States)

    Markham, Jonathan P. J.; Samuel, Ifor D. W.; Lo, Shih-Chun; Burn, Paul L.; Weiter, Martin; Bässler, Heinz

    2004-01-01

    Electrophosphorescent dendrimers are promising materials for highly efficient light-emitting diodes. They consist of a phosphorescent core onto which dendritic groups are attached. Here, we present an investigation into the optical and electronic properties of highly efficient phosphorescent dendrimers. The effect of dendrimer structure on charge transport and optical properties is studied using temperature-dependent charge-generation-layer time-of-flight measurements and current voltage (I-V) analysis. A model is used to explain trends seen in the I-V characteristics. We demonstrate that fine tuning the mobility by chemical structure is possible in these dendrimers and show that this can lead to highly efficient bilayer dendrimer light-emitting diodes with neat emissive layers. Power efficiencies of 20 lm/W were measured for devices containing a second-generation (G2) Ir(ppy)3 dendrimer with a 1,3,5-tris(2-N-phenylbenzimidazolyl)benzene electron transport layer.

  13. An Automated Approach to Very High Order Aeroacoustic Computations in Complex Geometries

    Science.gov (United States)

    Dyson, Rodger W.; Goodrich, John W.

    2000-01-01

    Computational aeroacoustics requires efficient, high-resolution simulation tools. For smooth problems, this is best accomplished with very high order in space and time methods on small stencils. But the complexity of highly accurate numerical methods can inhibit their practical application, especially in irregular geometries. This complexity is reduced by using a special form of Hermite divided-difference spatial interpolation on Cartesian grids, and a Cauchy-Kowalewski recursion procedure for time advancement. In addition, a stencil constraint tree reduces the complexity of interpolating grid points that are located near wall boundaries. These procedures are used to automatically develop and implement very high order methods (>15) for solving the linearized Euler equations that can achieve less than one grid point per wavelength resolution away from boundaries by including spatial derivatives of the primitive variables at each grid point. The accuracy of stable surface treatments is currently limited to 11th order for grid-aligned boundaries and to 2nd order for irregular boundaries.

  14. ClustalXeed: a GUI-based grid computation version for high performance and terabyte size multiple sequence alignment

    Directory of Open Access Journals (Sweden)

    Kim Taeho

    2010-09-01

    Full Text Available Abstract Background There is an increasing demand to assemble and align large-scale biological sequence data sets. The commonly used multiple sequence alignment programs are still limited in their ability to handle very large amounts of sequences because the system lacks a scalable high-performance computing (HPC) environment with a greatly extended data storage capacity. Results We designed ClustalXeed, a software system for multiple sequence alignment with incremental improvements over previous versions of the ClustalX and ClustalW-MPI software. The primary advantage of ClustalXeed over other multiple sequence alignment software is its ability to align a large family of protein or nucleic acid sequences. To solve the conventional memory-dependency problem, ClustalXeed uses both physical random access memory (RAM) and a distributed file-allocation system for distance matrix construction and pair-align computation. The computational efficiency of the disk-storage system was markedly improved by implementing an efficient load-balancing algorithm, called the "idle node-seeking task algorithm" (INSTA). The new editing option and the graphical user interface (GUI) provide ready access to a parallel-computing environment for users who seek fast and easy alignment of large DNA and protein sequence sets. Conclusions ClustalXeed can now compute a large volume of biological sequence data sets, which were not tractable in any other parallel or single MSA program. The main developments include: (1) the ability to tackle larger sequence alignment problems than was possible with previous systems, through markedly improved storage-handling capabilities; (2) an efficient task load-balancing algorithm, INSTA, which improves overall processing times for multiple sequence alignment with input sequences of non-uniform length; and (3) support for both single-PC and distributed cluster systems.
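    To illustrate the load-balancing idea, the sketch below implements a generic pull-based ("idle worker seeks task") dispatcher over unevenly sized jobs; the worker count, job sizes, and longest-job-first ordering are assumptions for illustration, and the sketch is not the INSTA implementation itself.

```python
# Pull-based dispatch: each idle worker grabs the next pair-alignment job from
# a shared queue, so long jobs do not pile up on one worker.
import queue
import threading
import time

def worker(name, tasks, results):
    while True:
        try:
            pair, size = tasks.get_nowait()   # idle worker pulls the next job
        except queue.Empty:
            return
        time.sleep(size * 0.001)              # stand-in for a pairwise alignment
        results.append((name, pair, size))
        tasks.task_done()

if __name__ == "__main__":
    tasks, results = queue.Queue(), []
    # longer sequence pairs first, so stragglers do not dominate the tail
    jobs = sorted([(("seq%d" % i, "seq%d" % (i + 1)), (i * 37) % 100 + 1)
                   for i in range(200)], key=lambda j: -j[1])
    for job in jobs:
        tasks.put(job)
    threads = [threading.Thread(target=worker, args=("w%d" % t, tasks, results))
               for t in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(len(results), "pair alignments dispatched across 4 workers")
```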

  15. Design of High Efficiency Illumination for LED Lighting

    OpenAIRE

    Chang, Yong-Nong; Cheng, Hung-Liang; Kuo, Chih-Ming

    2013-01-01

    A high efficiency illumination for LED street lighting is proposed. For energy saving, this paper uses a Class-E resonant inverter as the main electric circuit to improve efficiency. In addition, single dimming control has the best efficiency, the simplest control scheme, and the lowest circuit cost among the various types of dimming techniques. Multiple serial-connected transformers are used to drive the LED strings as they can provide galvanic isolation and have the advantage of good current distribution against de...

  16. High-Precision Computation: Mathematical Physics and Dynamics

    International Nuclear Information System (INIS)

    Bailey, D.H.; Barrio, R.; Borwein, J.M.

    2010-01-01

    At the present time, IEEE 64-bit floating-point arithmetic is sufficiently accurate for most scientific applications. However, for a rapidly growing body of important scientific computing applications, a higher level of numeric precision is required. Such calculations are facilitated by high-precision software packages that include high-level language translation modules to minimize the conversion effort. This paper presents a survey of recent applications of these techniques and provides some analysis of their numerical requirements. These applications include supernova simulations, climate modeling, planetary orbit calculations, Coulomb n-body atomic systems, studies of the fine structure constant, scattering amplitudes of quarks, gluons and bosons, nonlinear oscillator theory, experimental mathematics, evaluation of orthogonal polynomials, numerical integration of ODEs, computation of periodic orbits, studies of the splitting of separatrices, detection of strange nonchaotic attractors, Ising theory, quantum field theory, and discrete dynamical systems. We conclude that high-precision arithmetic facilities are now an indispensable component of a modern large-scale scientific computing environment.

  17. High-Precision Computation: Mathematical Physics and Dynamics

    Energy Technology Data Exchange (ETDEWEB)

    Bailey, D. H.; Barrio, R.; Borwein, J. M.

    2010-04-01

    At the present time, IEEE 64-bit floating-point arithmetic is sufficiently accurate for most scientific applications. However, for a rapidly growing body of important scientific computing applications, a higher level of numeric precision is required. Such calculations are facilitated by high-precision software packages that include high-level language translation modules to minimize the conversion effort. This paper presents a survey of recent applications of these techniques and provides some analysis of their numerical requirements. These applications include supernova simulations, climate modeling, planetary orbit calculations, Coulomb n-body atomic systems, studies of the fine structure constant, scattering amplitudes of quarks, gluons and bosons, nonlinear oscillator theory, experimental mathematics, evaluation of orthogonal polynomials, numerical integration of ODEs, computation of periodic orbits, studies of the splitting of separatrices, detection of strange nonchaotic attractors, Ising theory, quantum field theory, and discrete dynamical systems. We conclude that high-precision arithmetic facilities are now an indispensable component of a modern large-scale scientific computing environment.

  18. High Performance Computing in Science and Engineering '98 : Transactions of the High Performance Computing Center

    CERN Document Server

    Jäger, Willi

    1999-01-01

    The book contains reports about the most significant projects from science and industry that are using the supercomputers of the Federal High Performance Computing Center Stuttgart (HLRS). These projects are from different scientific disciplines, with a focus on engineering, physics and chemistry. They were carefully selected in a peer-review process and are showcases for an innovative combination of state-of-the-art physical modeling, novel algorithms and the use of leading-edge parallel computer technology. As HLRS is in close cooperation with industrial companies, special emphasis has been put on the industrial relevance of results and methods.

  19. Efficient frequent pattern mining algorithm based on node sets in cloud computing environment

    Science.gov (United States)

    Billa, V. N. Vinay Kumar; Lakshmanna, K.; Rajesh, K.; Reddy, M. Praveen Kumar; Nagaraja, G.; Sudheer, K.

    2017-11-01

    The ultimate goal of Data Mining is to determine the hidden information that is useful for making decisions from the large databases collected by an organization. Data Mining involves many tasks that are to be performed during the process. Mining frequent itemsets is one of the most important tasks in the case of transactional databases. These transactional databases contain data at a very large scale, and mining them consumes physical memory and time in proportion to the size of the database. A frequent pattern mining algorithm is said to be efficient only if it consumes little memory and time to mine the frequent itemsets from the given large database. With these points in mind, in this thesis we propose a system which mines frequent itemsets in a way that is optimized in terms of memory and time, using cloud computing to parallelize the process and providing the application as a service. The complete framework uses a proven efficient algorithm called FIN, which works on Nodesets and a POC (pre-order coding) tree. To evaluate the performance of the system, we conduct experiments comparing the efficiency of the same algorithm applied in a standalone manner and in a cloud computing environment on a real data set of traffic accidents. The results show that the memory consumption and execution time of the proposed system are much lower than those of the standalone system.
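    For context, the sketch below mines frequent itemsets on a toy dataset with simple level-wise support counting; it shares only the input/output contract with FIN and does not reproduce the Nodeset/POC-tree machinery described in the record, and the dataset and threshold are made-up examples.

```python
# Level-wise (Apriori-style) frequent itemset mining on a toy transaction set.
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    transactions = [frozenset(t) for t in transactions]
    items = {i for t in transactions for i in t}
    frequent, k, candidates = {}, 1, [frozenset([i]) for i in sorted(items)]
    while candidates:
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # candidates of size k+1 are unions of frequent k-itemsets
        keys = list(level)
        candidates = list({a | b for a, b in combinations(keys, 2) if len(a | b) == k + 1})
        k += 1
    return frequent

if __name__ == "__main__":
    data = [["bread", "milk"], ["bread", "beer", "eggs"],
            ["milk", "beer", "cola"], ["bread", "milk", "beer"],
            ["bread", "milk", "cola"]]
    for itemset, support in sorted(frequent_itemsets(data, 2).items(), key=lambda x: -x[1]):
        print(sorted(itemset), support)
```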

  20. Efficient Stochastic Inversion Using Adjoint Models and Kernel-PCA

    Energy Technology Data Exchange (ETDEWEB)

    Thimmisetty, Charanraj A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States). Center for Applied Scientific Computing; Zhao, Wenju [Florida State Univ., Tallahassee, FL (United States). Dept. of Scientific Computing; Chen, Xiao [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States). Center for Applied Scientific Computing; Tong, Charles H. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States). Center for Applied Scientific Computing; White, Joshua A. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States). Atmospheric, Earth and Energy Division

    2017-10-18

    Performing stochastic inversion on a computationally expensive forward simulation model with a high-dimensional uncertain parameter space (e.g. a spatial random field) is computationally prohibitive even when gradient information can be computed efficiently. Moreover, the ‘nonlinear’ mapping from parameters to observables generally gives rise to non-Gaussian posteriors even with Gaussian priors, thus hampering the use of efficient inversion algorithms designed for models with Gaussian assumptions. In this paper, we propose a novel Bayesian stochastic inversion methodology, which is characterized by a tight coupling between the gradient-based Langevin Markov Chain Monte Carlo (LMCMC) method and a kernel principal component analysis (KPCA). This approach addresses the ‘curse-of-dimensionality’ via KPCA to identify a low-dimensional feature space within the high-dimensional and nonlinearly correlated parameter space. In addition, non-Gaussian posterior distributions are estimated via an efficient LMCMC method on the projected low-dimensional feature space. We will demonstrate this computational framework by integrating and adapting our recent data-driven statistics-on-manifolds constructions and reduction-through-projection techniques to a linear elasticity model.
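    A minimal sketch of the two ingredients follows, assuming a toy forward model and a random-walk Metropolis step in place of the gradient-based Langevin sampler; the kernel choice, dimensions, noise level, and likelihood are illustrative assumptions, not the authors' setup.

```python
# Kernel-PCA projection of a high-dimensional prior ensemble, then MCMC in the
# low-dimensional feature space with the forward model applied to pre-images.
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(1)

# prior ensemble of 1-D "random fields" (500 samples on 100 grid cells)
prior = np.cumsum(rng.normal(size=(500, 100)), axis=1)

kpca = KernelPCA(n_components=5, kernel="rbf", gamma=1e-3, fit_inverse_transform=True)
features = kpca.fit_transform(prior)                 # low-dimensional feature space

def forward(field):                                  # toy forward model: 3 observables
    return np.array([field.mean(), field[::10].sum(), field.max()])

obs = forward(prior[0]) + rng.normal(scale=0.1, size=3)

def log_post(z):                                     # Gaussian likelihood + unit prior in z
    field = kpca.inverse_transform(z[None, :])[0]
    misfit = forward(field) - obs
    return -0.5 * np.sum(misfit**2) / 0.1**2 - 0.5 * np.sum(z**2)

z = features[0].copy()                               # start the chain at one prior sample
lp, accepted = log_post(z), 0
for _ in range(2000):                                # random-walk Metropolis in feature space
    z_new = z + 0.05 * rng.normal(size=5)
    lp_new = log_post(z_new)
    if np.log(rng.uniform()) < lp_new - lp:
        z, lp, accepted = z_new, lp_new, accepted + 1
print("acceptance rate:", accepted / 2000)
```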

  1. High Quantum Efficiency OLED Lighting Systems

    Energy Technology Data Exchange (ETDEWEB)

    Shiang, Joseph [General Electric (GE) Global Research, Fairfield, CT (United States)

    2011-09-30

    The overall goal of the program was to apply improvements in light outcoupling technology to a practical large area plastic luminaire, and thus enable the product vision of an extremely thin form factor high efficiency large area light source. The target substrate was plastic and the baseline device was operating at 35 LPW at the start of the program. The target LPW of the program was a >2x improvement in the LPW efficacy and the overall amount of light to be delivered was relatively high 900 lumens. Despite the extremely difficult challenges associated with scaling up a wet solution process on plastic substrates, the program was able to make substantial progress. A small molecule wet solution process was successfully implemented on plastic substrates with almost no loss in efficiency in transitioning from the laboratory scale glass to large area plastic substrates. By transitioning to a small molecule based process, the LPW entitlement increased from 35 LPW to 60 LPW. A further 10% improvement in outcoupling efficiency was demonstrated via the use of a highly reflecting cathode, which reduced absorptive loss in the OLED device. The calculated potential improvement in some cases is even larger, ~30%, and thus there is considerable room for optimism in improving the net light coupling efficacy, provided absorptive loss mechanisms are eliminated. Further improvements are possible if scattering schemes such as the silver nanowire based hard coat structure are fully developed. The wet coating processes were successfully scaled to large area plastic substrate and resulted in the construction of a 900 lumens luminaire device.

  2. CRITICAL ISSUES IN HIGH END COMPUTING - FINAL REPORT

    Energy Technology Data Exchange (ETDEWEB)

    Corones, James [Krell Institute

    2013-09-23

    High-End computing (HEC) has been a driver for advances in science and engineering for the past four decades. Increasingly HEC has become a significant element in the national security, economic vitality, and competitiveness of the United States. Advances in HEC provide results that cut across traditional disciplinary and organizational boundaries. This program provides opportunities to share information about HEC systems and computational techniques across multiple disciplines and organizations through conferences and exhibitions of HEC advances held in Washington DC so that mission agency staff, scientists, and industry can come together with White House, Congressional and Legislative staff in an environment conducive to the sharing of technical information, accomplishments, goals, and plans. A common thread across this series of conferences is the understanding of computational science and applied mathematics techniques across a diverse set of application areas of interest to the Nation. The specific objectives of this program are: Program Objective 1. To provide opportunities to share information about advances in high-end computing systems and computational techniques between mission critical agencies, agency laboratories, academics, and industry. Program Objective 2. To gather pertinent data, address specific topics of wide interest to mission critical agencies. Program Objective 3. To promote a continuing discussion of critical issues in high-end computing. Program Objective 4. To provide a venue where a multidisciplinary scientific audience can discuss the difficulties applying computational science techniques to specific problems and can specify future research that, if successful, will eliminate these problems.

  3. Computational Study on the Effect of Shroud Shape on the Efficiency of the Gas Turbine Stage

    Science.gov (United States)

    Afanas'ev, I. V.; Granovskii, A. V.

    2018-03-01

    The last stages of powerful power gas turbines play an important role in the development of power and efficiency of the whole unit, as well as in the distribution of the flow parameters behind the last stage, which determines the efficient operation of the exhaust diffusers. Therefore, much attention is paid to improving the efficiency of the last stages of gas turbines as well as the distribution of flow parameters. Since the long blades of the last stages of multistage high-power gas turbines can fall into the resonance frequency range in the course of operation, which results in the destruction of the blades, damping wires or damping bolts are used to detune the blades from resonance frequencies. However, these damping elements cause additional energy losses, leading to a reduction in the efficiency of the stage. To minimize these losses, damping shrouds are used instead of wires and bolts at the periphery of the working blades. However, because of strength problems, designers have to use partial shrouds, which do not significantly reduce the losses in the tip clearance between the blade and the turbine housing, instead of the most efficient full shrouds. In this paper, a computational study is performed on the effect that the design of the shroud of the turbine working blade exerts on the flow structure in the vicinity of the shroud and on the efficiency of the stage as a whole. The analysis of the flow structure has shown that a significant part of the losses when shrouds are used is associated with the formation of vortex zones in the cavities on the turbine housing before the shrouds, between the ribs of the shrouds, and in the cavities at the outlet behind the shrouds. All the investigated variants of partial shrouding are inferior in efficiency to stages with shrouds that completely cover the tip section of the working blade. The stage with an unshrouded working blade was most efficient at the values of the relative tip clearance

  4. A hardware-oriented concurrent TZ search algorithm for High-Efficiency Video Coding

    Science.gov (United States)

    Doan, Nghia; Kim, Tae Sung; Rhee, Chae Eun; Lee, Hyuk-Jae

    2017-12-01

    High-Efficiency Video Coding (HEVC) is the latest video coding standard, in which the compression performance is double that of its predecessor, the H.264/AVC standard, while the video quality remains unchanged. In HEVC, the test zone (TZ) search algorithm is widely used for integer motion estimation because it effectively searches the good-quality motion vector with a relatively small amount of computation. However, the complex computation structure of the TZ search algorithm makes it difficult to implement it in the hardware. This paper proposes a new integer motion estimation algorithm which is designed for hardware execution by modifying the conventional TZ search to allow parallel motion estimations of all prediction unit (PU) partitions. The algorithm consists of the three phases of zonal, raster, and refinement searches. At the beginning of each phase, the algorithm obtains the search points required by the original TZ search for all PU partitions in a coding unit (CU). Then, all redundant search points are removed prior to the estimation of the motion costs, and the best search points are then selected for all PUs. Compared to the conventional TZ search algorithm, experimental results show that the proposed algorithm significantly decreases the Bjøntegaard Delta bitrate (BD-BR) by 0.84%, and it also reduces the computational complexity by 54.54%.
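    The redundancy-removal step can be illustrated with the toy sketch below, which gathers the search points requested by several PU partitions, fetches each unique point once, and then lets each PU keep its own best candidate; the cost function, search pattern, frame data, and function names are stand-ins, not the HEVC implementation.

```python
# Shared evaluation of motion-search points across PU partitions of one CU.
import numpy as np

FRAME = np.arange(64 * 64).reshape(64, 64) % 255     # toy reference frame

def diamond_points(center, stride):
    cx, cy = center
    return {(cx + dx * stride, cy + dy * stride)
            for dx, dy in [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]}

def load_block(point, size=8):                       # shared, per-unique-point work
    x = max(0, min(56, point[0] + 28))
    y = max(0, min(56, point[1] + 28))
    return FRAME[y:y + size, x:x + size]

def sad(block, pu_id):                               # stand-in for a per-PU SAD cost
    return int(np.abs(block - pu_id * 10).sum())

def concurrent_phase(pu_centers, stride):
    wanted = {pu: diamond_points(c, stride) for pu, c in pu_centers.items()}
    unique = set().union(*wanted.values())           # redundant points removed
    blocks = {p: load_block(p) for p in unique}      # fetched once, shared by all PUs
    best = {pu: min((sad(blocks[p], pu), p) for p in pts) for pu, pts in wanted.items()}
    requested = sum(len(p) for p in wanted.values())
    print(f"{requested} requested points -> {len(unique)} unique fetches")
    return best

if __name__ == "__main__":
    centers = {0: (0, 0), 1: (0, 0), 2: (4, -4)}     # three PU partitions of one CU
    print(concurrent_phase(centers, stride=4))
```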

  5. Sampling efficiency of modified 37-mm sampling cassettes using computational fluid dynamics.

    Science.gov (United States)

    Anthony, T Renée; Sleeth, Darrah; Volckens, John

    2016-01-01

    In the U.S., most industrial hygiene practitioners continue to rely on the closed-face cassette (CFC) to assess worker exposures to hazardous dusts, primarily because of its ease of use, cost, and familiarity. However, mass concentrations measured with this classic sampler underestimate exposures to larger particles throughout the inhalable particulate mass (IPM) size range (up to aerodynamic diameters of 100 μm). To investigate whether the current 37-mm inlet cap can be redesigned to better meet the IPM sampling criterion, computational fluid dynamics (CFD) models were developed, and particle sampling efficiencies associated with various modifications to the CFC inlet cap were determined. Simulations of fluid flow (standard k-epsilon turbulence model) and particle transport (laminar trajectories, 1-116 μm) were conducted using sampling flow rates of 10 L min(-1) in slow moving air (0.2 m s(-1)) in the facing-the-wind orientation. Combinations of seven inlet shapes and three inlet diameters were evaluated as candidates to replace the current 37-mm inlet cap. For a given inlet geometry, differences in sampler efficiency between inlet diameters averaged less than 1% for particles through 100 μm, but the largest opening was found to increase the efficiency for the 116 μm particles by 14% for the flat inlet cap. A substantial reduction in sampler efficiency was identified for sampler inlets with side walls extending beyond the dimension of the external lip of the current 37-mm CFC. The inlet cap based on the 37-mm CFC dimensions with an expanded 15-mm entry provided the best agreement with facing-the-wind human aspiration efficiency. The sampler efficiency was increased with a flat entry or with a thin central lip adjacent to the new enlarged entry. This work provides a substantial body of sampling efficiency estimates as a function of particle size and inlet geometry for personal aerosol samplers.
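    For readers outside aerosol science, the sketch below shows how sampling efficiency per particle size is typically tabulated from particle-tracking output (aspirated count over released count) and compared against a reference aspiration curve; all counts and reference values are made-up placeholders, not results from the study.

```python
# Tabulate size-resolved sampling efficiency and its bias against a reference curve.
import numpy as np

sizes_um  = np.array([1, 10, 25, 50, 75, 100, 116])          # aerodynamic diameters
released  = np.full(sizes_um.shape, 1000)                     # particles released per size
aspirated = np.array([998, 960, 870, 700, 520, 380, 290])     # reached the inlet (placeholder)

efficiency = aspirated / released
reference  = np.array([0.99, 0.95, 0.88, 0.72, 0.55, 0.40, 0.31])  # placeholder reference curve

for d, e, r in zip(sizes_um, efficiency, reference):
    print(f"{int(d):4d} um : sampled {100 * e:5.1f} %  vs reference {100 * r:5.1f} %")

bias = 100 * (efficiency - reference) / reference
print("max |bias| vs reference: %.1f %%" % np.max(np.abs(bias)))
```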

  6. Extraction of drainage networks from large terrain datasets using high throughput computing

    Science.gov (United States)

    Gong, Jianya; Xie, Jibo

    2009-02-01

    Advanced digital photogrammetry and remote sensing technology produces large terrain datasets (LTD). How to process and use these LTD has become a big challenge for GIS users. Extracting drainage networks, which are basic for hydrological applications, from LTD is one of the typical applications of digital terrain analysis (DTA) in geographical information applications. Existing serial drainage algorithms cannot deal with large data volumes in a timely fashion, and few GIS platforms can process LTD beyond the GB size. High throughput computing (HTC), a distributed parallel computing mode, is proposed to improve the efficiency of drainage networks extraction from LTD. Drainage network extraction using HTC involves two key issues: (1) how to decompose the large DEM datasets into independent computing units and (2) how to merge the separate outputs into a final result. A new decomposition method is presented in which the large datasets are partitioned into independent computing units using natural watershed boundaries instead of using regular 1-dimensional (strip-wise) and 2-dimensional (block-wise) decomposition. Because the distribution of drainage networks is strongly related to watershed boundaries, the new decomposition method is more effective and natural. The method to extract natural watershed boundaries was improved by using multi-scale DEMs instead of single-scale DEMs. A HTC environment is employed to test the proposed methods with real datasets.

  7. Efficient visualization of high-throughput targeted proteomics experiments: TAPIR.

    Science.gov (United States)

    Röst, Hannes L; Rosenberger, George; Aebersold, Ruedi; Malmström, Lars

    2015-07-15

    Targeted mass spectrometry comprises a set of powerful methods to obtain accurate and consistent protein quantification in complex samples. To fully exploit these techniques, a cross-platform and open-source software stack based on standardized data exchange formats is required. We present TAPIR, a fast and efficient Python visualization software for chromatograms and peaks identified in targeted proteomics experiments. The input formats are open, community-driven standardized data formats (mzML for raw data storage and TraML encoding the hierarchical relationships between transitions, peptides and proteins). TAPIR is scalable to proteome-wide targeted proteomics studies (as enabled by SWATH-MS), allowing researchers to visualize high-throughput datasets. The framework integrates well with existing automated analysis pipelines and can be extended beyond targeted proteomics to other types of analyses. TAPIR is available for all computing platforms under the 3-clause BSD license at https://github.com/msproteomicstools/msproteomicstools. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  8. Parallel computing for event reconstruction in high-energy physics

    International Nuclear Information System (INIS)

    Wolbers, S.

    1993-01-01

    Parallel computing has been recognized as a solution to large computing problems. In High Energy Physics offline event reconstruction of detector data is a very large computing problem that has been solved with parallel computing techniques. A review of the parallel programming package CPS (Cooperative Processes Software) developed and used at Fermilab for offline reconstruction of Terabytes of data requiring the delivery of hundreds of Vax-Years per experiment is given. The Fermilab UNIX farms, consisting of 180 Silicon Graphics workstations and 144 IBM RS6000 workstations, are used to provide the computing power for the experiments. Fermilab has had a long history of providing production parallel computing starting with the ACP (Advanced Computer Project) Farms in 1986. The Fermilab UNIX Farms have been in production for over 2 years with 24 hour/day service to experimental user groups. Additional tools for management, control and monitoring these large systems will be described. Possible future directions for parallel computing in High Energy Physics will be given

  9. High efficiency and broadband acoustic diodes

    Science.gov (United States)

    Fu, Congyi; Wang, Bohan; Zhao, Tianfei; Chen, C. Q.

    2018-01-01

    Energy transmission efficiency and working bandwidth are the two major factors limiting the application of current acoustic diodes (ADs). This letter presents a design of high efficiency and broadband acoustic diodes composed of a nonlinear frequency converter and a linear wave filter. The converter consists of two masses connected by a bilinear spring with asymmetric tension and compression stiffness. The wave filter is a linear mass-spring lattice (sonic crystal). Both numerical simulation and experiment show that the energy transmission efficiency of the acoustic diode can be improved by as much as two orders of magnitude, reaching about 61%. Moreover, the primary working band width of the AD is about two times of the cut-off frequency of the sonic crystal filter. The cut-off frequency dependent working band of the AD implies that the developed AD can be scaled up or down from macro-scale to micro- and nano-scale.

  10. High-precision efficiency calibration of a high-purity co-axial germanium detector

    Energy Technology Data Exchange (ETDEWEB)

    Blank, B., E-mail: blank@cenbg.in2p3.fr [Centre d' Etudes Nucléaires de Bordeaux Gradignan, UMR 5797, CNRS/IN2P3, Université de Bordeaux, Chemin du Solarium, BP 120, 33175 Gradignan Cedex (France); Souin, J.; Ascher, P.; Audirac, L.; Canchel, G.; Gerbaux, M.; Grévy, S.; Giovinazzo, J.; Guérin, H.; Nieto, T. Kurtukian; Matea, I. [Centre d' Etudes Nucléaires de Bordeaux Gradignan, UMR 5797, CNRS/IN2P3, Université de Bordeaux, Chemin du Solarium, BP 120, 33175 Gradignan Cedex (France); Bouzomita, H.; Delahaye, P.; Grinyer, G.F.; Thomas, J.C. [Grand Accélérateur National d' Ions Lourds, CEA/DSM, CNRS/IN2P3, Bvd Henri Becquerel, BP 55027, F-14076 CAEN Cedex 5 (France)

    2015-03-11

    A high-purity co-axial germanium detector has been calibrated in efficiency to a precision of about 0.15% over a wide energy range. High-precision scans of the detector crystal and γ-ray source measurements have been compared to Monte-Carlo simulations to adjust the dimensions of a detector model. For this purpose, standard calibration sources and short-lived online sources have been used. The resulting efficiency calibration reaches the precision needed e.g. for branching ratio measurements of super-allowed β decays for tests of the weak-interaction standard model.
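    As background, the sketch below fits measured full-energy-peak efficiencies with a common log-log polynomial parameterisation and reports residuals; the energies, efficiency values, and fit order are placeholder assumptions and do not reproduce the Monte-Carlo-based procedure of the record.

```python
# Fit a full-energy-peak efficiency curve eff(E) with a log-log polynomial.
import numpy as np

energy_keV = np.array([122.0, 344.0, 662.0, 779.0, 964.0, 1112.0, 1408.0])   # calibration lines
eff_meas   = np.array([4.1e-2, 2.1e-2, 1.3e-2, 1.15e-2, 9.8e-3, 8.9e-3, 7.5e-3])

coeffs = np.polyfit(np.log(energy_keV), np.log(eff_meas), deg=3)   # log-log polynomial
fit    = np.exp(np.polyval(coeffs, np.log(energy_keV)))

residual = 100.0 * (eff_meas - fit) / eff_meas
for E, e, r in zip(energy_keV, eff_meas, residual):
    print(f"{E:6.0f} keV  eff = {e:.3e}   residual = {r:+.2f} %")
print("rms residual: %.2f %%" % np.sqrt(np.mean(residual**2)))
```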

  11. NINJA: Java for High Performance Numerical Computing

    Directory of Open Access Journals (Sweden)

    José E. Moreira

    2002-01-01

    Full Text Available When Java was first introduced, there was a perception that its many benefits came at a significant performance cost. In the particularly performance-sensitive field of numerical computing, initial measurements indicated a hundred-fold performance disadvantage between Java and more established languages such as Fortran and C. Although much progress has been made, and Java now can be competitive with C/C++ in many important situations, significant performance challenges remain. Existing Java virtual machines are not yet capable of performing the advanced loop transformations and automatic parallelization that are now common in state-of-the-art Fortran compilers. Java also has difficulties in implementing complex arithmetic efficiently. These performance deficiencies can be attacked with a combination of class libraries (packages, in Java) that implement truly multidimensional arrays and complex numbers, and new compiler techniques that exploit the properties of these class libraries to enable other, more conventional, optimizations. Two compiler techniques, versioning and semantic expansion, can be leveraged to allow fully automatic optimization and parallelization of Java code. Our measurements with the NINJA prototype Java environment show that Java can be competitive in performance with highly optimized and tuned Fortran code.

  12. Efficient Machine Learning Approach for Optimizing Scientific Computing Applications on Emerging HPC Architectures

    Energy Technology Data Exchange (ETDEWEB)

    Arumugam, Kamesh [Old Dominion Univ., Norfolk, VA (United States)

    2017-05-01

    Efficient parallel implementation of scientific applications on multi-core CPUs with accelerators such as GPUs and Xeon Phis is challenging. This requires exploiting the data parallel architecture of the accelerator along with the vector pipelines of modern x86 CPU architectures, load balancing, and efficient memory transfer between different devices. It is relatively easy to meet these requirements for highly structured scientific applications. In contrast, a number of scientific and engineering applications are unstructured. Getting performance on accelerators for these applications is extremely challenging because many of these applications employ irregular algorithms which exhibit data-dependent control-flow and irregular memory accesses. Furthermore, these applications are often iterative with dependency between steps, thus making it hard to parallelize across steps. As a result, parallelism in these applications is often limited to a single step. Numerical simulation of charged particle beam dynamics is one such application where the distribution of work and memory access pattern at each time step is irregular. Applications with these properties tend to present significant branch and memory divergence, load imbalance between different processor cores, and poor compute and memory utilization. Prior research on parallelizing such irregular applications has been focused on optimizing the irregular, data-dependent memory accesses and control-flow during a single step of the application independent of the other steps, with the assumption that these patterns are completely unpredictable. We observed that the structure of computation leading to control-flow divergence and irregular memory accesses in one step is similar to that in the next step. It is possible to predict this structure in the current step by observing the computation structure of previous steps. In this dissertation, we present novel machine learning based optimization techniques to address

  13. PEAC: A Power-Efficient Adaptive Computing Technology for Enabling Swarm of Small Spacecraft and Deployable Mini-Payloads

    Data.gov (United States)

    National Aeronautics and Space Administration — This task is to develop and demonstrate a path-to-flight and power-adaptive avionics technology PEAC (Power Efficient Adaptive Computing). PEAC will enable emerging...

  14. High Performance Computing and Storage Requirements for Biological and Environmental Research Target 2017

    Energy Technology Data Exchange (ETDEWEB)

    Gerber, Richard [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC); Wasserman, Harvey [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). National Energy Research Scientific Computing Center (NERSC)

    2013-05-01

    The National Energy Research Scientific Computing Center (NERSC) is the primary computing center for the DOE Office of Science, serving approximately 4,500 users working on some 650 projects that involve nearly 600 codes in a wide variety of scientific disciplines. In addition to large-scale computing and storage resources, NERSC provides support and expertise that help scientists make efficient use of its systems. Beyond characterizing BER computing and storage needs, the latest review also revealed several key requirements.

  15. Efficiently outsourcing multiparty computation under multiple keys

    NARCIS (Netherlands)

    Peter, Andreas; Tews, Erik; Katzenbeisser, Stefan

    2013-01-01

    Secure multiparty computation enables a set of users to evaluate certain functionalities on their respective inputs while keeping these inputs encrypted throughout the computation. In many applications, however, outsourcing these computations to an untrusted server is desirable, so that the server

  16. Metasurface holograms reaching 80% efficiency.

    Science.gov (United States)

    Zheng, Guoxing; Mühlenbernd, Holger; Kenney, Mitchell; Li, Guixin; Zentgraf, Thomas; Zhang, Shuang

    2015-04-01

    Surfaces covered by ultrathin plasmonic structures--so-called metasurfaces--have recently been shown to be capable of completely controlling the phase of light, representing a new paradigm for the design of innovative optical elements such as ultrathin flat lenses, directional couplers for surface plasmon polaritons and wave plates for vortex beam generation. Among the various types of metasurfaces, geometric metasurfaces, which consist of an array of plasmonic nanorods with spatially varying orientations, have shown superior phase control due to the geometric nature of their phase profile. Metasurfaces have recently been used to make computer-generated holograms, but the hologram efficiency remained too low at visible wavelengths for practical purposes. Here, we report the design and realization of a geometric metasurface hologram reaching diffraction efficiencies of 80% at 825 nm and a broad bandwidth between 630 nm and 1,050 nm. The 16-level-phase computer-generated hologram demonstrated here combines the advantages of a geometric metasurface for the superior control of the phase profile and of reflectarrays for achieving high polarization conversion efficiency. Specifically, the design of the hologram integrates a ground metal plane with a geometric metasurface that enhances the conversion efficiency between the two circular polarization states, leading to high diffraction efficiency without complicating the fabrication process. Because of these advantages, our strategy could be viable for various practical holographic applications.
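    As a brief aside, the "geometric" phase control can be summarized by the standard geometric-phase (Pancharatnam-Berry) relation, given here as background rather than quoted from the paper: a nanorod rotated locally by an angle \theta(x,y) imparts to the cross-circularly-polarized output the phase

      \varphi(x,y) = 2\sigma\,\theta(x,y), \qquad \sigma = \pm 1,

    where \sigma is the handedness of the incident circular polarization. A 16-level phase profile spanning 2\pi therefore corresponds to nanorod orientations quantized in steps of \pi/16.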

  17. High-efficiency ballistic electrostatic generator using microdroplets

    Science.gov (United States)

    Xie, Yanbo; Bos, Diederik; de Vreede, Lennart J.; de Boer, Hans L.; van der Meulen, Mark-Jan; Versluis, Michel; Sprenkels, Ad J.; van den Berg, Albert; Eijkel, Jan C. T.

    2014-04-01

    The strong demand for renewable energy promotes research on novel methods and technologies for energy conversion. Microfluidic systems for energy conversion by streaming current are less known to the public, and the relatively low efficiencies previously obtained seemed to limit the further applications of such systems. Here we report a microdroplet-based electrostatic generator operating by an acceleration-deceleration cycle (‘ballistic’ conversion), and show that this principle enables both high efficiency and a compact, simple design. Water is accelerated by pumping it through a micropore to form a microjet that breaks up into fast-moving charged droplets. Droplet kinetic energy is converted to electrical energy when the charged droplets decelerate in the electrical field that forms between the membrane and the target. We demonstrate conversion efficiencies of up to 48%, a power density of 160 kW m-2 and both high- (20 kV) and low- (500 V) voltage operation. Besides offering striking new insights, the device potentially opens up new perspectives for low-cost and robust renewable energy conversion.
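    A back-of-the-envelope energy balance (our illustration, not the authors' model) makes the conversion principle concrete: a droplet of mass m and charge q launched at speed v and decelerated by climbing a potential difference \Delta V delivers electrical energy q\,\Delta V, so the per-droplet conversion efficiency is roughly

      \eta \approx \frac{q\,\Delta V}{\tfrac{1}{2} m v^{2}},

    with the upper bound reached when the droplets arrive at the target with nearly zero residual kinetic energy.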

  18. High-Efficient Low-Cost Photovoltaics: Recent Developments

    CERN Document Server

    Petrova-Koch, Vesselinka; Goetzberger, Adolf

    2009-01-01

    A bird's-eye view of the development and problems of recent photovoltaic cells and systems, and of the prospects for Si feedstock, is presented. Highly efficient, low-cost PV modules, making use of novel efficient solar cells (based on c-Si or III-V materials) and low-cost solar concentrators, are the focus of this book. Recent developments in organic photovoltaics, which is expected to overcome its difficulties and enter the market soon, are also included.

  19. [Computational fluid dynamics simulation of different impeller combinations in high viscosity fermentation and its application].

    Science.gov (United States)

    Dong, Shuhao; Zhu, Ping; Xu, Xiaoying; Li, Sha; Jiang, Yongxiang; Xu, Hong

    2015-07-01

    An agitator is one of the essential factors in achieving highly efficient fermentation of highly aerobic, viscous microorganisms, and the choice of impeller combination has an important influence on the fermentation process. Welan gum is a microbial exopolysaccharide produced by Alcaligenes sp. under highly aerobic and highly viscous conditions. Computational fluid dynamics (CFD) numerical simulation was used to analyze the distribution of velocity, shear rate and gas holdup in the welan fermentation reactor under six different impeller combinations. The best three impeller combinations were then applied to welan fermentation. Analysis of the fermentation performance showed that the MB-4-6 combination gave better dissolved oxygen levels and flow velocity, and the welan yield increased by 13%. Furthermore, the viscosity of the product also increased.

  20. Embedded High Performance Scalable Computing Systems

    National Research Council Canada - National Science Library

    Ngo, David

    2003-01-01

    The Embedded High Performance Scalable Computing Systems (EHPSCS) program is a cooperative agreement between Sanders, a Lockheed Martin Company, and DARPA that ran for three years, from Apr 1995 to Apr 1998...