WorldWideScience

Sample records for hardware accelerated scalable

  1. Hardware Accelerated Simulated Radiography

    International Nuclear Information System (INIS)

    Laney, D; Callahan, S; Max, N; Silva, C; Langer, S; Frank, R

    2005-01-01

    We present the application of hardware accelerated volume rendering algorithms to the simulation of radiographs as an aid to scientists designing experiments, validating simulation codes, and understanding experimental data. The techniques presented take advantage of 32 bit floating point texture capabilities to obtain validated solutions to the radiative transport equation for X-rays. An unsorted hexahedron projection algorithm is presented for curvilinear hexahedra that produces simulated radiographs in the absorption-only regime. A sorted tetrahedral projection algorithm is presented that simulates radiographs of emissive materials. We apply the tetrahedral projection algorithm to the simulation of experimental diagnostics for inertial confinement fusion experiments on a laser at the University of Rochester. We show that the hardware accelerated solution is faster than the current technique used by scientists

  2. Hardware Accelerated Sequence Alignment with Traceback

    Directory of Open Access Journals (Sweden)

    Scott Lloyd

    2009-01-01

    in a timely manner. Known methods to accelerate alignment on reconfigurable hardware only address sequence comparison, limit the sequence length, or exhibit memory and I/O bottlenecks. A space-efficient, global sequence alignment algorithm and architecture is presented that accelerates the forward scan and traceback in hardware without memory and I/O limitations. With 256 processing elements in FPGA technology, a performance gain over 300 times that of a desktop computer is demonstrated on sequence lengths of 16000. For greater performance, the architecture is scalable to more processing elements.

  3. Hardware-Accelerated Simulated Radiography

    International Nuclear Information System (INIS)

    Laney, D; Callahan, S; Max, N; Silva, C; Langer, S.; Frank, R

    2005-01-01

    We present the application of hardware accelerated volume rendering algorithms to the simulation of radiographs as an aid to scientists designing experiments, validating simulation codes, and understanding experimental data. The techniques presented take advantage of 32-bit floating point texture capabilities to obtain solutions to the radiative transport equation for X-rays. The hardware accelerated solutions are accurate enough to enable scientists to explore the experimental design space with greater efficiency than the methods currently in use. An unsorted hexahedron projection algorithm is presented for curvilinear hexahedral meshes that produces simulated radiographs in the absorption-only regime. A sorted tetrahedral projection algorithm is presented that simulates radiographs of emissive materials. We apply the tetrahedral projection algorithm to the simulation of experimental diagnostics for inertial confinement fusion experiments on a laser at the University of Rochester

  4. A Scalable Approach for Hardware Semiformal Verification

    OpenAIRE

    Grimm, Tomas; Lettnin, Djones; Hübner, Michael

    2018-01-01

    The current verification flow of complex systems uses different engines synergistically: virtual prototyping, formal verification, simulation, emulation and FPGA prototyping. However, none is able to verify a complete architecture. Furthermore, hybrid approaches aiming at complete verification use techniques that lower the overall complexity by increasing the abstraction level. This work focuses on the verification of complex systems at the RT level to handle the hardware peculiarities. Our r...

  5. Scalable fast multipole accelerated vortex methods

    KAUST Repository

    Hu, Qi

    2014-05-01

    The fast multipole method (FMM) is often used to accelerate the calculation of particle interactions in particle-based methods to simulate incompressible flows. To evaluate the most time-consuming kernels - the Biot-Savart equation and stretching term of the vorticity equation, we mathematically reformulated it so that only two Laplace scalar potentials are used instead of six. This automatically ensuring divergence-free far-field computation. Based on this formulation, we developed a new FMM-based vortex method on heterogeneous architectures, which distributed the work between multicore CPUs and GPUs to best utilize the hardware resources and achieve excellent scalability. The algorithm uses new data structures which can dynamically manage inter-node communication and load balance efficiently, with only a small parallel construction overhead. This algorithm can scale to large-sized clusters showing both strong and weak scalability. Careful error and timing trade-off analysis are also performed for the cutoff functions induced by the vortex particle method. Our implementation can perform one time step of the velocity+stretching calculation for one billion particles on 32 nodes in 55.9 seconds, which yields 49.12 Tflop/s.

  6. Implementation of Hardware Accelerators on Zynq

    DEFF Research Database (Denmark)

    Toft, Jakob Kenn

    of the ARM Cortex-9 processor featured on the Zynq SoC, with regard to execution time, power dissipation and energy consumption. The implementation of the hardware accelerators were successful. Use of the Monte Carlo processor resulted in a significant increase in performance. The Telco hardware accelerator......In the recent years it has become obvious that the performance of general purpose processors are having trouble meeting the requirements of high performance computing applications of today. This is partly due to the relatively high power consumption, compared to the performance, of general purpose...... processors, which has made hardware accelerators an essential part of several datacentres and the worlds fastest super-computers. In this work, two different hardware accelerators were implemented on a Xilinx Zynq SoC platform mounted on the ZedBoard platform. The two accelerators are based on two different...

  7. Design of hardware accelerators for demanding applications.

    NARCIS (Netherlands)

    Jozwiak, L.; Jan, Y.

    2010-01-01

    This paper focuses on mastering the architecture development of hardware accelerators. It presents the results of our analysis of the main issues that have to be addressed when designing accelerators for modern demanding applications, when using as an example the accelerator design for LDPC decoding

  8. Hardware Acceleration of Adaptive Neural Algorithms.

    Energy Technology Data Exchange (ETDEWEB)

    James, Conrad D. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2017-11-01

    As tradit ional numerical computing has faced challenges, researchers have turned towards alternative computing approaches to reduce power - per - computation metrics and improve algorithm performance. Here, we describe an approach towards non - conventional computing that strengthens the connection between machine learning and neuroscience concepts. The Hardware Acceleration of Adaptive Neural Algorithms (HAANA) project ha s develop ed neural machine learning algorithms and hardware for applications in image processing and cybersecurity. While machine learning methods are effective at extracting relevant features from many types of data, the effectiveness of these algorithms degrades when subjected to real - world conditions. Our team has generated novel neural - inspired approa ches to improve the resiliency and adaptability of machine learning algorithms. In addition, we have also designed and fabricated hardware architectures and microelectronic devices specifically tuned towards the training and inference operations of neural - inspired algorithms. Finally, our multi - scale simulation framework allows us to assess the impact of microelectronic device properties on algorithm performance.

  9. Open Hardware for CERN's accelerator control systems

    International Nuclear Information System (INIS)

    Bij, E van der; Serrano, J; Wlostowski, T; Cattin, M; Gousiou, E; Sanchez, P Alvarez; Boccardi, A; Voumard, N; Penacoba, G

    2012-01-01

    The accelerator control systems at CERN will be upgraded and many electronics modules such as analog and digital I/O, level converters and repeaters, serial links and timing modules are being redesigned. The new developments are based on the FPGA Mezzanine Card, PCI Express and VME64x standards while the Wishbone specification is used as a system on a chip bus. To attract partners, the projects are developed in an 'Open' fashion. Within this Open Hardware project new ways of working with industry are being evaluated and it has been proven that industry can be involved at all stages, from design to production and support.

  10. Hardware Accelerators Targeting a Novel Group Based Packet Classification Algorithm

    Directory of Open Access Journals (Sweden)

    O. Ahmed

    2013-01-01

    Full Text Available Packet classification is a ubiquitous and key building block for many critical network devices. However, it remains as one of the main bottlenecks faced when designing fast network devices. In this paper, we propose a novel Group Based Search packet classification Algorithm (GBSA that is scalable, fast, and efficient. GBSA consumes an average of 0.4 megabytes of memory for a 10 k rule set. The worst-case classification time per packet is 2 microseconds, and the preprocessing speed is 3 M rules/second based on an Xeon processor operating at 3.4 GHz. When compared with other state-of-the-art classification techniques, the results showed that GBSA outperforms the competition with respect to speed, memory usage, and processing time. Moreover, GBSA is amenable to implementation in hardware. Three different hardware implementations are also presented in this paper including an Application Specific Instruction Set Processor (ASIP implementation and two pure Register-Transfer Level (RTL implementations based on Impulse-C and Handel-C flows, respectively. Speedups achieved with these hardware accelerators ranged from 9x to 18x compared with a pure software implementation running on an Xeon processor.

  11. A Software and Hardware IPTV Architecture for Scalable DVB Distribution

    Directory of Open Access Journals (Sweden)

    Georg Acher

    2009-01-01

    Full Text Available Many standards and even more proprietary technologies deal with IP-based television (IPTV. But none of them can transparently map popular public broadcast services such as DVB or ATSC to IPTV with acceptable effort. In this paper we explain why we believe that such a mapping using a light weight framework is an important step towards all-IP multimedia. We then present the NetCeiver architecture: it is based on well-known standards such as IPv6, and it allows zero configuration. The use of multicast streaming makes NetCeiver highly scalable. We also describe a low cost FPGA implementation of the proposed NetCeiver architecture, which can concurrently stream services from up to six full transponders.

  12. Open Hardware For CERN's Accelerator Control Systems

    CERN Document Server

    van der Bij, E; Ayass, M; Boccardi, A; Cattin, M; Gil Soriano, C; Gousiou, E; Iglesias Gonsálvez, S; Penacoba Fernandez, G; Serrano, J; Voumard, N; Wlostowski, T

    2011-01-01

    The accelerator control systems at CERN will be renovated and many electronics modules will be redesigned as the modules they will replace cannot be bought anymore or use obsolete components. The modules used in the control systems are diverse: analog and digital I/O, level converters and repeaters, serial links and timing modules. Overall around 120 modules are supported that are used in systems such as beam instrumentation, cryogenics and power converters. Only a small percentage of the currently used modules are commercially available, while most of them had been specifically designed at CERN. The new developments are based on VITA and PCI-SIG standards such as FMC (FPGA Mezzanine Card), PCI Express and VME64x using transition modules. As system-on-chip interconnect, the public domain Wishbone specification is used. For the renovation, it is considered imperative to have for each board access to the full hardware design and its firmware so that problems could quickly be resolved by CERN engineers or its ...

  13. A Framework for Hardware-Accelerated Services Using Partially Reconfigurable SoCs

    Directory of Open Access Journals (Sweden)

    MACHIDON, O. M.

    2016-05-01

    Full Text Available The current trend towards ?Everything as a Service? fosters a new approach on reconfigurable hardware resources. This innovative, service-oriented approach has the potential of bringing a series of benefits for both reconfigurable and distributed computing fields by favoring a hardware-based acceleration of web services and increasing service performance. This paper proposes a framework for accelerating web services by offloading the compute-intensive tasks to reconfigurable System-on-Chip (SoC devices, as integrated IP (Intellectual Property cores. The framework provides a scalable, dynamic management of the tasks and hardware processing cores, based on dynamic partial reconfiguration of the SoC. We have enhanced security of the entire system by making use of the built-in detection features of the hardware device and also by implementing active counter-measures that protect the sensitive data.

  14. Scalable fast multipole accelerated vortex methods

    KAUST Repository

    Hu, Qi; Gumerov, Nail A.; Yokota, Rio; Barba, Lorena A.; Duraiswami, Ramani

    2014-01-01

    -node communication and load balance efficiently, with only a small parallel construction overhead. This algorithm can scale to large-sized clusters showing both strong and weak scalability. Careful error and timing trade-off analysis are also performed for the cutoff

  15. Programming time-multiplexed reconfigurable hardware using a scalable neuromorphic compiler.

    Science.gov (United States)

    Minkovich, Kirill; Srinivasa, Narayan; Cruz-Albrecht, Jose M; Cho, Youngkwan; Nogin, Aleksey

    2012-06-01

    Scalability and connectivity are two key challenges in designing neuromorphic hardware that can match biological levels. In this paper, we describe a neuromorphic system architecture design that addresses an approach to meet these challenges using traditional complementary metal-oxide-semiconductor (CMOS) hardware. A key requirement in realizing such neural architectures in hardware is the ability to automatically configure the hardware to emulate any neural architecture or model. The focus for this paper is to describe the details of such a programmable front-end. This programmable front-end is composed of a neuromorphic compiler and a digital memory, and is designed based on the concept of synaptic time-multiplexing (STM). The neuromorphic compiler automatically translates any given neural architecture to hardware switch states and these states are stored in digital memory to enable desired neural architectures. STM enables our proposed architecture to address scalability and connectivity using traditional CMOS hardware. We describe the details of the proposed design and the programmable front-end, and provide examples to illustrate its capabilities. We also provide perspectives for future extensions and potential applications.

  16. Evaluating the scalability of HEP software and multi-core hardware

    CERN Document Server

    Jarp, S; Leduc, J; Nowak, A

    2011-01-01

    As researchers have reached the practical limits of processor performance improvements by frequency scaling, it is clear that the future of computing lies in the effective utilization of parallel and multi-core architectures. Since this significant change in computing is well underway, it is vital for HEP programmers to understand the scalability of their software on modern hardware and the opportunities for potential improvements. This work aims to quantify the benefit of new mainstream architectures to the HEP community through practical benchmarking on recent hardware solutions, including the usage of parallelized HEP applications.

  17. Accelerator Technology: Injection and Extraction Related Hardware: Kickers and Septa

    CERN Document Server

    Barnes, M J; Mertens, V

    2013-01-01

    This document is part of Subvolume C 'Accelerators and Colliders' of Volume 21 'Elementary Particles' of Landolt-Börnstein - Group I 'Elementary Particles, Nuclei and Atoms'. It contains the the Section '8.7 Injection and Extraction Related Hardware: Kickers and Septa' of the Chapter '8 Accelerator Technology' with the content: 8.7 Injection and Extraction Related Hardware: Kickers and Septa 8.7.1 Fast Pulsed Systems (Kickers) 8.7.2 Electrostatic and Magnetic Septa

  18. Acceleration of Meshfree Radial Point Interpolation Method on Graphics Hardware

    International Nuclear Information System (INIS)

    Nakata, Susumu

    2008-01-01

    This article describes a parallel computational technique to accelerate radial point interpolation method (RPIM)-based meshfree method using graphics hardware. RPIM is one of the meshfree partial differential equation solvers that do not require the mesh structure of the analysis targets. In this paper, a technique for accelerating RPIM using graphics hardware is presented. In the method, the computation process is divided into small processes suitable for processing on the parallel architecture of the graphics hardware in a single instruction multiple data manner.

  19. Hardware availability calculations and results of the IFMIF accelerator facility

    International Nuclear Information System (INIS)

    Bargalló, Enric; Arroyo, Jose Manuel; Abal, Javier; Beauvais, Pierre-Yves; Gobin, Raphael; Orsini, Fabienne; Weber, Moisés; Podadera, Ivan; Grespan, Francesco; Fagotti, Enrico; De Blas, Alfredo; Dies, Javier; Tapia, Carlos; Mollá, Joaquín; Ibarra, Ángel

    2014-01-01

    Highlights: • IFMIF accelerator facility hardware availability analyses methodology is described. • Results of the individual hardware availability analyses are shown for the reference design. • Accelerator design improvements are proposed for each system. • Availability results are evaluated and compared with the requirements. - Abstract: Hardware availability calculations have been done individually for each system of the deuteron accelerators of the International Fusion Materials Irradiation Facility (IFMIF). The principal goal of these analyses is to estimate the availability of the systems, compare it with the challenging IFMIF requirements and find new paths to improve availability performances. Major unavailability contributors are highlighted and possible design changes are proposed in order to achieve the hardware availability requirements established for each system. In this paper, such possible improvements are implemented in fault tree models and the availability results are evaluated. The parallel activity on the design and construction of the linear IFMIF prototype accelerator (LIPAc) provides detailed design information for the RAMI (reliability, availability, maintainability and inspectability) analyses and allows finding out the improvements that the final accelerator could have. Because of the R and D behavior of the LIPAc, RAMI improvements could be the major differences between the prototype and the IFMIF accelerator design

  20. Hardware availability calculations and results of the IFMIF accelerator facility

    Energy Technology Data Exchange (ETDEWEB)

    Bargalló, Enric, E-mail: enric.bargallo-font@upc.edu [Fusion Energy Engineering Laboratory (FEEL), Technical University of Catalonia (UPC), Barcelona (Spain); Arroyo, Jose Manuel [Laboratorio Nacional de Fusión por Confinamiento Magnético – CIEMAT, Madrid (Spain); Abal, Javier [Fusion Energy Engineering Laboratory (FEEL), Technical University of Catalonia (UPC), Barcelona (Spain); Beauvais, Pierre-Yves; Gobin, Raphael; Orsini, Fabienne [Commissariat à l’Energie Atomique, Saclay (France); Weber, Moisés; Podadera, Ivan [Laboratorio Nacional de Fusión por Confinamiento Magnético – CIEMAT, Madrid (Spain); Grespan, Francesco; Fagotti, Enrico [Istituto Nazionale di Fisica Nucleare, Legnaro (Italy); De Blas, Alfredo; Dies, Javier; Tapia, Carlos [Fusion Energy Engineering Laboratory (FEEL), Technical University of Catalonia (UPC), Barcelona (Spain); Mollá, Joaquín; Ibarra, Ángel [Laboratorio Nacional de Fusión por Confinamiento Magnético – CIEMAT, Madrid (Spain)

    2014-10-15

    Highlights: • IFMIF accelerator facility hardware availability analyses methodology is described. • Results of the individual hardware availability analyses are shown for the reference design. • Accelerator design improvements are proposed for each system. • Availability results are evaluated and compared with the requirements. - Abstract: Hardware availability calculations have been done individually for each system of the deuteron accelerators of the International Fusion Materials Irradiation Facility (IFMIF). The principal goal of these analyses is to estimate the availability of the systems, compare it with the challenging IFMIF requirements and find new paths to improve availability performances. Major unavailability contributors are highlighted and possible design changes are proposed in order to achieve the hardware availability requirements established for each system. In this paper, such possible improvements are implemented in fault tree models and the availability results are evaluated. The parallel activity on the design and construction of the linear IFMIF prototype accelerator (LIPAc) provides detailed design information for the RAMI (reliability, availability, maintainability and inspectability) analyses and allows finding out the improvements that the final accelerator could have. Because of the R and D behavior of the LIPAc, RAMI improvements could be the major differences between the prototype and the IFMIF accelerator design.

  1. A Hardware Framework for on-Chip FPGA Acceleration

    DEFF Research Database (Denmark)

    Lomuscio, Andrea; Cardarilli, Gian Carlo; Nannarelli, Alberto

    2016-01-01

    In this work, we present a new framework to dynamically load hardware accelerators on reconfigurable platforms (FPGAs). Provided a library of application-specific processors, we load on-the-fly the specific processor in the FPGA, and we transfer the execution from the CPU to the FPGA-based accele......In this work, we present a new framework to dynamically load hardware accelerators on reconfigurable platforms (FPGAs). Provided a library of application-specific processors, we load on-the-fly the specific processor in the FPGA, and we transfer the execution from the CPU to the FPGA......-based accelerator. Results show that significant speed-up can be obtained by the proposed acceleration framework on system-on-chips where reconfigurable fabric is placed next to the CPUs. The speed-up is due to both the intrinsic acceleration in the application-specific processors, and to the increased parallelism....

  2. Hardware Implementation of Lossless Adaptive and Scalable Hyperspectral Data Compression for Space

    Science.gov (United States)

    Aranki, Nazeeh; Keymeulen, Didier; Bakhshi, Alireza; Klimesh, Matthew

    2009-01-01

    On-board lossless hyperspectral data compression reduces data volume in order to meet NASA and DoD limited downlink capabilities. The technique also improves signature extraction, object recognition and feature classification capabilities by providing exact reconstructed data on constrained downlink resources. At JPL a novel, adaptive and predictive technique for lossless compression of hyperspectral data was recently developed. This technique uses an adaptive filtering method and achieves a combination of low complexity and compression effectiveness that far exceeds state-of-the-art techniques currently in use. The JPL-developed 'Fast Lossless' algorithm requires no training data or other specific information about the nature of the spectral bands for a fixed instrument dynamic range. It is of low computational complexity and thus well-suited for implementation in hardware. A modified form of the algorithm that is better suited for data from pushbroom instruments is generally appropriate for flight implementation. A scalable field programmable gate array (FPGA) hardware implementation was developed. The FPGA implementation achieves a throughput performance of 58 Msamples/sec, which can be increased to over 100 Msamples/sec in a parallel implementation that uses twice the hardware resources This paper describes the hardware implementation of the 'Modified Fast Lossless' compression algorithm on an FPGA. The FPGA implementation targets the current state-of-the-art FPGAs (Xilinx Virtex IV and V families) and compresses one sample every clock cycle to provide a fast and practical real-time solution for space applications.

  3. 3D IBFV : Hardware-Accelerated 3D Flow Visualization

    NARCIS (Netherlands)

    Telea, Alexandru; Wijk, Jarke J. van

    2003-01-01

    We present a hardware-accelerated method for visualizing 3D flow fields. The method is based on insertion, advection, and decay of dye. To this aim, we extend the texture-based IBFV technique for 2D flow visualization in two main directions. First, we decompose the 3D flow visualization problem in a

  4. 3D IBFV : hardware-accelerated 3D flow visualization

    NARCIS (Netherlands)

    Telea, A.C.; Wijk, van J.J.

    2003-01-01

    We present a hardware-accelerated method for visualizing 3D flow fields. The method is based on insertion, advection, and decay of dye. To this aim, we extend the texture-based IBFV technique presented by van Wijk (2001) for 2D flow visualization in two main directions. First, we decompose the 3D

  5. The NIDS Cluster: Scalable, Stateful Network Intrusion Detection on Commodity Hardware

    Energy Technology Data Exchange (ETDEWEB)

    Tierney, Brian L; Vallentin, Matthias; Sommer, Robin; Lee, Jason; Leres, Craig; Paxson, Vern; Tierney, Brian

    2007-09-19

    In this work we present a NIDS cluster as a scalable solution for realizing high-performance, stateful network intrusion detection on commodity hardware. The design addresses three challenges: (i) distributing traffic evenly across an extensible set of analysis nodes in a fashion that minimizes the communication required for coordination, (ii) adapting the NIDS's operation to support coordinating its low-level analysis rather than just aggregating alerts; and (iii) validating that the cluster produces sound results. Prototypes of our NIDS cluster now operate at the Lawrence Berkeley National Laboratory and the University of California at Berkeley. In both environments the clusters greatly enhance the power of the network security monitoring.

  6. Automatic Optimization of Hardware Accelerators for Image Processing

    OpenAIRE

    Reiche, Oliver; Häublein, Konrad; Reichenbach, Marc; Hannig, Frank; Teich, Jürgen; Fey, Dietmar

    2015-01-01

    In the domain of image processing, often real-time constraints are required. In particular, in safety-critical applications, such as X-ray computed tomography in medical imaging or advanced driver assistance systems in the automotive domain, timing is of utmost importance. A common approach to maintain real-time capabilities of compute-intensive applications is to offload those computations to dedicated accelerator hardware, such as Field Programmable Gate Arrays (FPGAs). Programming such arc...

  7. Hardware Design Considerations for Edge-Accelerated Stereo Correspondence Algorithms

    Directory of Open Access Journals (Sweden)

    Christos Ttofis

    2012-01-01

    Full Text Available Stereo correspondence is a popular algorithm for the extraction of depth information from a pair of rectified 2D images. Hence, it has been used in many computer vision applications that require knowledge about depth. However, stereo correspondence is a computationally intensive algorithm and requires high-end hardware resources in order to achieve real-time processing speed in embedded computer vision systems. This paper presents an overview of the use of edge information as a means to accelerate hardware implementations of stereo correspondence algorithms. The presented approach restricts the stereo correspondence algorithm only to the edges of the input images rather than to all image points, thus resulting in a considerable reduction of the search space. The paper highlights the benefits of the edge-directed approach by applying it to two stereo correspondence algorithms: an SAD-based fixed-support algorithm and a more complex adaptive support weight algorithm. Furthermore, we present design considerations about the implementation of these algorithms on reconfigurable hardware and also discuss issues related to the memory structures needed, the amount of parallelism that can be exploited, the organization of the processing blocks, and so forth. The two architectures (fixed-support based versus adaptive-support weight based are compared in terms of processing speed, disparity map accuracy, and hardware overheads, when both are implemented on a Virtex-5 FPGA platform.

  8. GASPRNG: GPU accelerated scalable parallel random number generator library

    Science.gov (United States)

    Gao, Shuang; Peterson, Gregory D.

    2013-04-01

    Graphics processors represent a promising technology for accelerating computational science applications. Many computational science applications require fast and scalable random number generation with good statistical properties, so they use the Scalable Parallel Random Number Generators library (SPRNG). We present the GPU Accelerated SPRNG library (GASPRNG) to accelerate SPRNG in GPU-based high performance computing systems. GASPRNG includes code for a host CPU and CUDA code for execution on NVIDIA graphics processing units (GPUs) along with a programming interface to support various usage models for pseudorandom numbers and computational science applications executing on the CPU, GPU, or both. This paper describes the implementation approach used to produce high performance and also describes how to use the programming interface. The programming interface allows a user to be able to use GASPRNG the same way as SPRNG on traditional serial or parallel computers as well as to develop tightly coupled programs executing primarily on the GPU. We also describe how to install GASPRNG and use it. To help illustrate linking with GASPRNG, various demonstration codes are included for the different usage models. GASPRNG on a single GPU shows up to 280x speedup over SPRNG on a single CPU core and is able to scale for larger systems in the same manner as SPRNG. Because GASPRNG generates identical streams of pseudorandom numbers as SPRNG, users can be confident about the quality of GASPRNG for scalable computational science applications. Catalogue identifier: AEOI_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEOI_v1_0.html Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland Licensing provisions: UTK license. No. of lines in distributed program, including test data, etc.: 167900 No. of bytes in distributed program, including test data, etc.: 1422058 Distribution format: tar.gz Programming language: C and CUDA. Computer: Any PC or

  9. Final Scientific/Technical Report for "Enabling Exascale Hardware and Software Design through Scalable System Virtualization"

    Energy Technology Data Exchange (ETDEWEB)

    Dinda, Peter August [Northwestern Univ., Evanston, IL (United States)

    2015-03-17

    This report describes the activities, findings, and products of the Northwestern University component of the "Enabling Exascale Hardware and Software Design through Scalable System Virtualization" project. The purpose of this project has been to extend the state of the art of systems software for high-end computing (HEC) platforms, and to use systems software to better enable the evaluation of potential future HEC platforms, for example exascale platforms. Such platforms, and their systems software, have the goal of providing scientific computation at new scales, thus enabling new research in the physical sciences and engineering. Over time, the innovations in systems software for such platforms also become applicable to more widely used computing clusters, data centers, and clouds. This was a five-institution project, centered on the Palacios virtual machine monitor (VMM) systems software, a project begun at Northwestern, and originally developed in a previous collaboration between Northwestern University and the University of New Mexico. In this project, Northwestern (including via our subcontract to the University of Pittsburgh) contributed to the continued development of Palacios, along with other team members. We took the leadership role in (1) continued extension of support for emerging Intel and AMD hardware, (2) integration and performance enhancement of overlay networking, (3) connectivity with architectural simulation, (4) binary translation, and (5) support for modern Non-Uniform Memory Access (NUMA) hosts and guests. We also took a supporting role in support for specialized hardware for I/O virtualization, profiling, configurability, and integration with configuration tools. The efforts we led (1-5) were largely successful and executed as expected, with code and papers resulting from them. The project demonstrated the feasibility of a virtualization layer for HEC computing, similar to such layers for cloud or datacenter computing. For effort (3

  10. Transform coding for hardware-accelerated volume rendering.

    Science.gov (United States)

    Fout, Nathaniel; Ma, Kwan-Liu

    2007-01-01

    Hardware-accelerated volume rendering using the GPU is now the standard approach for real-time volume rendering, although limited graphics memory can present a problem when rendering large volume data sets. Volumetric compression in which the decompression is coupled to rendering has been shown to be an effective solution to this problem; however, most existing techniques were developed in the context of software volume rendering, and all but the simplest approaches are prohibitive in a real-time hardware-accelerated volume rendering context. In this paper we present a novel block-based transform coding scheme designed specifically with real-time volume rendering in mind, such that the decompression is fast without sacrificing compression quality. This is made possible by consolidating the inverse transform with dequantization in such a way as to allow most of the reprojection to be precomputed. Furthermore, we take advantage of the freedom afforded by off-line compression in order to optimize the encoding as much as possible while hiding this complexity from the decoder. In this context we develop a new block classification scheme which allows us to preserve perceptually important features in the compression. The result of this work is an asymmetric transform coding scheme that allows very large volumes to be compressed and then decompressed in real-time while rendering on the GPU.

  11. Hardware accelerator design for tracking in smart camera

    Science.gov (United States)

    Singh, Sanjay; Dunga, Srinivasa Murali; Saini, Ravi; Mandal, A. S.; Shekhar, Chandra; Vohra, Anil

    2011-10-01

    Smart Cameras are important components in video analysis. For video analysis, smart cameras needs to detect interesting moving objects, track such objects from frame to frame, and perform analysis of object track in real time. Therefore, the use of real-time tracking is prominent in smart cameras. The software implementation of tracking algorithm on a general purpose processor (like PowerPC) could achieve low frame rate far from real-time requirements. This paper presents the SIMD approach based hardware accelerator designed for real-time tracking of objects in a scene. The system is designed and simulated using VHDL and implemented on Xilinx XUP Virtex-IIPro FPGA. Resulted frame rate is 30 frames per second for 250x200 resolution video in gray scale.

  12. Hardware accelerator design for change detection in smart camera

    Science.gov (United States)

    Singh, Sanjay; Dunga, Srinivasa Murali; Saini, Ravi; Mandal, A. S.; Shekhar, Chandra; Chaudhury, Santanu; Vohra, Anil

    2011-10-01

    Smart Cameras are important components in Human Computer Interaction. In any remote surveillance scenario, smart cameras have to take intelligent decisions to select frames of significant changes to minimize communication and processing overhead. Among many of the algorithms for change detection, one based on clustering based scheme was proposed for smart camera systems. However, such an algorithm could achieve low frame rate far from real-time requirements on a general purpose processors (like PowerPC) available on FPGAs. This paper proposes the hardware accelerator capable of detecting real time changes in a scene, which uses clustering based change detection scheme. The system is designed and simulated using VHDL and implemented on Xilinx XUP Virtex-IIPro FPGA board. Resulted frame rate is 30 frames per second for QVGA resolution in gray scale.

  13. Optimizing memory-bound SYMV kernel on GPU hardware accelerators

    KAUST Repository

    Abdelfattah, Ahmad

    2013-01-01

    Hardware accelerators are becoming ubiquitous high performance scientific computing. They are capable of delivering an unprecedented level of concurrent execution contexts. High-level programming language extensions (e.g., CUDA), profiling tools (e.g., PAPI-CUDA, CUDA Profiler) are paramount to improve productivity, while effectively exploiting the underlying hardware. We present an optimized numerical kernel for computing the symmetric matrix-vector product on nVidia Fermi GPUs. Due to its inherent memory-bound nature, this kernel is very critical in the tridiagonalization of a symmetric dense matrix, which is a preprocessing step to calculate the eigenpairs. Using a novel design to address the irregular memory accesses by hiding latency and increasing bandwidth, our preliminary asymptotic results show 3.5x and 2.5x fold speedups over the similar CUBLAS 4.0 kernel, and 7-8% and 30% fold improvement over the Matrix Algebra on GPU and Multicore Architectures (MAGMA) library in single and double precision arithmetics, respectively. © 2013 Springer-Verlag.

  14. A High Performance QDWH-SVD Solver using Hardware Accelerators

    KAUST Repository

    Sukkari, Dalal E.; Ltaief, Hatem; Keyes, David E.

    2015-01-01

    few digits of accuracy, compared to the full double precision floating point arithmetic. We further leverage the single GPU QDWH-SVD implementation by introducing the first multi-GPU SVD solver to study the scalability of the QDWH-SVD framework.

  15. Energy Efficient FPGA based Hardware Accelerators for Financial Applications

    DEFF Research Database (Denmark)

    Kenn Toft, Jakob; Nannarelli, Alberto

    2014-01-01

    Field Programmable Gate Arrays (FPGAs) based accelerators are very suitable to implement application-specific processors using uncommon operations or number systems. In this work, we design FPGA-based accelerators for two financial computations with different characteristics and we compare...... the accelerator performance and energy consumption to a software execution of the application. The experimental results show that significant speed-up and energy savings, can be obtained for large data sets by using the accelerator at expenses of a longer development time....

  16. FPGA Acceleration by Dynamically-Loaded Hardware Libraries

    DEFF Research Database (Denmark)

    Lomuscio, Andrea; Nannarelli, Alberto; Re, Marco

    -the-y the speciffic processor in the FPGA, and we transfer the execution from the CPU to the FPGA-based accelerator. Results show that significant speed-up and energy efficiency can be obtained by HLL acceleration on system-on-chips where reconfigurable fabric is placed next to the CPUs....

  17. Final Report: Enabling Exascale Hardware and Software Design through Scalable System Virtualization

    Energy Technology Data Exchange (ETDEWEB)

    Bridges, Patrick G.

    2015-02-01

    In this grant, we enhanced the Palacios virtual machine monitor to increase its scalability and suitability for addressing exascale system software design issues. This included a wide range of research on core Palacios features, large-scale system emulation, fault injection, perfomrance monitoring, and VMM extensibility. This research resulted in large number of high-impact publications in well-known venues, the support of a number of students, and the graduation of two Ph.D. students and one M.S. student. In addition, our enhanced version of the Palacios virtual machine monitor has been adopted as a core element of the Hobbes operating system under active DOE-funded research and development.

  18. Evaluation of accelerated iterative x-ray CT image reconstruction using floating point graphics hardware

    International Nuclear Information System (INIS)

    Kole, J S; Beekman, F J

    2006-01-01

    Statistical reconstruction methods offer possibilities to improve image quality as compared with analytical methods, but current reconstruction times prohibit routine application in clinical and micro-CT. In particular, for cone-beam x-ray CT, the use of graphics hardware has been proposed to accelerate the forward and back-projection operations, in order to reduce reconstruction times. In the past, wide application of this texture hardware mapping approach was hampered owing to limited intrinsic accuracy. Recently, however, floating point precision has become available in the latest generation commodity graphics cards. In this paper, we utilize this feature to construct a graphics hardware accelerated version of the ordered subset convex reconstruction algorithm. The aims of this paper are (i) to study the impact of using graphics hardware acceleration for statistical reconstruction on the reconstructed image accuracy and (ii) to measure the speed increase one can obtain by using graphics hardware acceleration. We compare the unaccelerated algorithm with the graphics hardware accelerated version, and for the latter we consider two different interpolation techniques. A simulation study of a micro-CT scanner with a mathematical phantom shows that at almost preserved reconstructed image accuracy, speed-ups of a factor 40 to 222 can be achieved, compared with the unaccelerated algorithm, and depending on the phantom and detector sizes. Reconstruction from physical phantom data reconfirms the usability of the accelerated algorithm for practical cases

  19. A Framework for Dynamically-Loaded Hardware Library (HLL) in FPGA Acceleration

    DEFF Research Database (Denmark)

    Cardarilli, Gian Carlo; Di Carlo, Leonardo; Nannarelli, Alberto

    2016-01-01

    Hardware acceleration is often used to address the need for speed and computing power in embedded systems. FPGAs always represented a good solution for HW acceleration and, recently, new SoC platforms extended the flexibility of the FPGAs by combining on a single chip both high-performance CPUs...... and FPGA fabric. The aim of this work is the implementation of hardware accelerators for these new SoCs. The innovative feature of these accelerators is the on-the-fly reconfiguration of the hardware to dynamically adapt the accelerator’s functionalities to the current CPU workload. The realization...... of the accelerators preliminarily requires also the profiling of both the SW (ARM CPU + NEON Units) and HW (FPGA) performance, an evaluation of the partial reconfiguration times and the development of an applicationspecific IP-cores library. This paper focuses on the profiling aspect of both the SW and HW...

  20. Acquisition of reliable vacuum hardware for large accelerator systems

    International Nuclear Information System (INIS)

    Welch, K.M.

    1996-01-01

    Credible and effective communications prove to be the major challenge in the acquisition of reliable vacuum hardware. Technical competence is necessary but not sufficient. We must effectively communicate with management, sponsoring agencies, project organizations, service groups, staff and with vendors. Most of Deming's 14 quality assurance tenets relate to creating an enlightened environment of good communications. All projects progress along six distinct, closely coupled, dynamic phases; all six phases are in a state of perpetual change. These phases and their elements are discussed, with emphasis given to the acquisition phase and its related vocabulary. (author)

  1. Hardware-accelerated autostereogram rendering for interactive 3D visualization

    Science.gov (United States)

    Petz, Christoph; Goldluecke, Bastian; Magnor, Marcus

    2003-05-01

    Single Image Random Dot Stereograms (SIRDS) are an attractive way of depicting three-dimensional objects using conventional display technology. Once trained in decoupling the eyes' convergence and focusing, autostereograms of this kind are able to convey the three-dimensional impression of a scene. We present in this work an algorithm that generates SIRDS at interactive frame rates on a conventional PC. The presented system allows rotating a 3D geometry model and observing the object from arbitrary positions in real-time. Subjective tests show that the perception of a moving or rotating 3D scene presents no problem: The gaze remains focused onto the object. In contrast to conventional SIRDS algorithms, we render multiple pixels in a single step using a texture-based approach, exploiting the parallel-processing architecture of modern graphics hardware. A vertex program determines the parallax for each vertex of the geometry model, and the graphics hardware's texture unit is used to render the dot pattern. No data has to be transferred between main memory and the graphics card for generating the autostereograms, leaving CPU capacity available for other tasks. Frame rates of 25 fps are attained at a resolution of 1024x512 pixels on a standard PC using a consumer-grade nVidia GeForce4 graphics card, demonstrating the real-time capability of the system.

  2. Interfacing Hardware Accelerators to a Time-Division Multiplexing Network-on-Chip

    DEFF Research Database (Denmark)

    Pezzarossa, Luca; Sørensen, Rasmus Bo; Schoeberl, Martin

    2015-01-01

    This paper addresses the integration of stateless hardware accelerators into time-predictable multi-core platforms based on time-division multiplexing networks-on-chip. Stateless hardware accelerators, like floating-point units, are typically attached as co-processors to individual processors in ...... implementation. The design evaluation is carried out using the open source T-CREST multi-core platform implemented on an Altera Cyclone IV FPGA. The size of the proposed design, including a floating-point accelerator, is about two-thirds of a processor....

  3. FPGA Implementation of Decimal Processors for Hardware Acceleration

    DEFF Research Database (Denmark)

    Borup, Nicolas; Dindorp, Jonas; Nannarelli, Alberto

    2011-01-01

    Applications in non-conventional number systems can benefit from accelerators implemented on reconfigurable platforms, such as Field Programmable Gate-Arrays (FPGAs). In this paper, we show that applications requiring decimal operations, such as the ones necessary in accounting or financial trans...... execution on the CPU of the hosting computer....

  4. Accelerating epistasis analysis in human genetics with consumer graphics hardware

    Directory of Open Access Journals (Sweden)

    Cancare Fabio

    2009-07-01

    Full Text Available Abstract Background Human geneticists are now capable of measuring more than one million DNA sequence variations from across the human genome. The new challenge is to develop computationally feasible methods capable of analyzing these data for associations with common human disease, particularly in the context of epistasis. Epistasis describes the situation where multiple genes interact in a complex non-linear manner to determine an individual's disease risk and is thought to be ubiquitous for common diseases. Multifactor Dimensionality Reduction (MDR is an algorithm capable of detecting epistasis. An exhaustive analysis with MDR is often computationally expensive, particularly for high order interactions. This challenge has previously been met with parallel computation and expensive hardware. The option we examine here exploits commodity hardware designed for computer graphics. In modern computers Graphics Processing Units (GPUs have more memory bandwidth and computational capability than Central Processing Units (CPUs and are well suited to this problem. Advances in the video game industry have led to an economy of scale creating a situation where these powerful components are readily available at very low cost. Here we implement and evaluate the performance of the MDR algorithm on GPUs. Of primary interest are the time required for an epistasis analysis and the price to performance ratio of available solutions. Findings We found that using MDR on GPUs consistently increased performance per machine over both a feature rich Java software package and a C++ cluster implementation. The performance of a GPU workstation running a GPU implementation reduces computation time by a factor of 160 compared to an 8-core workstation running the Java implementation on CPUs. This GPU workstation performs similarly to 150 cores running an optimized C++ implementation on a Beowulf cluster. Furthermore this GPU system provides extremely cost effective

  5. Accelerating epistasis analysis in human genetics with consumer graphics hardware.

    Science.gov (United States)

    Sinnott-Armstrong, Nicholas A; Greene, Casey S; Cancare, Fabio; Moore, Jason H

    2009-07-24

    Human geneticists are now capable of measuring more than one million DNA sequence variations from across the human genome. The new challenge is to develop computationally feasible methods capable of analyzing these data for associations with common human disease, particularly in the context of epistasis. Epistasis describes the situation where multiple genes interact in a complex non-linear manner to determine an individual's disease risk and is thought to be ubiquitous for common diseases. Multifactor Dimensionality Reduction (MDR) is an algorithm capable of detecting epistasis. An exhaustive analysis with MDR is often computationally expensive, particularly for high order interactions. This challenge has previously been met with parallel computation and expensive hardware. The option we examine here exploits commodity hardware designed for computer graphics. In modern computers Graphics Processing Units (GPUs) have more memory bandwidth and computational capability than Central Processing Units (CPUs) and are well suited to this problem. Advances in the video game industry have led to an economy of scale creating a situation where these powerful components are readily available at very low cost. Here we implement and evaluate the performance of the MDR algorithm on GPUs. Of primary interest are the time required for an epistasis analysis and the price to performance ratio of available solutions. We found that using MDR on GPUs consistently increased performance per machine over both a feature rich Java software package and a C++ cluster implementation. The performance of a GPU workstation running a GPU implementation reduces computation time by a factor of 160 compared to an 8-core workstation running the Java implementation on CPUs. This GPU workstation performs similarly to 150 cores running an optimized C++ implementation on a Beowulf cluster. Furthermore this GPU system provides extremely cost effective performance while leaving the CPU available for other

  6. Acquisition of reliable vacuum hardware for large accelerator systems

    International Nuclear Information System (INIS)

    Welch, K.M.

    1995-01-01

    Credible and effective communications prove to be the major challenge in the acquisition of reliable vacuum hardware. Technical competence is necessary but not sufficient. The authors must effectively communicate with management, sponsoring agencies, project organizations, service groups, staff and with vendors. Most of Deming's 14 quality assurance tenants relate to creating an enlightened environment of good communications. All projects progress along six distinct, closely coupled, dynamic phases. All six phases are in a state of perpetual change. These phases and their elements are discussed, with emphasis given to the acquisition phase and its related vocabulary. Large projects require great clarity and rigor as poor communications can be costly. For rigor to be cost effective, it can't be pedantic. Clarity thrives best in a low-risk, team environment

  7. A High Performance QDWH-SVD Solver using Hardware Accelerators

    KAUST Repository

    Sukkari, Dalal E.

    2015-04-08

    This paper describes a new high performance implementation of the QR-based Dynamically Weighted Halley Singular Value Decomposition (QDWH-SVD) solver on multicore architecture enhanced with multiple GPUs. The standard QDWH-SVD algorithm was introduced by Nakatsukasa and Higham (SIAM SISC, 2013) and combines three successive computational stages: (1) the polar decomposition calculation of the original matrix using the QDWH algorithm, (2) the symmetric eigendecomposition of the resulting polar factor to obtain the singular values and the right singular vectors and (3) the matrix-matrix multiplication to get the associated left singular vectors. A comprehensive test suite highlights the numerical robustness of the QDWH-SVD solver. Although it performs up to two times more flops when computing all singular vectors compared to the standard SVD solver algorithm, our new high performance implementation on single GPU results in up to 3.8x improvements for asymptotic matrix sizes, compared to the equivalent routines from existing state-of-the-art open-source and commercial libraries. However, when only singular values are needed, QDWH-SVD is penalized by performing up to 14 times more flops. The singular value only implementation of QDWH-SVD on single GPU can still run up to 18% faster than the best existing equivalent routines. Integrating mixed precision techniques in the solver can additionally provide up to 40% improvement at the price of losing few digits of accuracy, compared to the full double precision floating point arithmetic. We further leverage the single GPU QDWH-SVD implementation by introducing the first multi-GPU SVD solver to study the scalability of the QDWH-SVD framework.

  8. Hardware dependencies of GPU-accelerated beamformer performances for microwave breast cancer detection

    Directory of Open Access Journals (Sweden)

    Salomon Christoph J.

    2016-09-01

    Full Text Available UWB microwave imaging has proven to be a promising technique for early-stage breast cancer detection. The extensive image reconstruction time can be accelerated by parallelizing the execution of the underlying beamforming algorithms. However, the efficiency of the parallelization will most likely depend on the grade of parallelism of the imaging algorithm and of the utilized hardware. This paper investigates the dependencies of two different beamforming algorithms on multiple hardware specification of several graphics boards. The parallel implementation is realized by using NVIDIA’s CUDA. Three conclusions are drawn about the behavior of the parallel implementation and how to efficiently use the accessible hardware.

  9. Accelerating String Set Matching in FPGA Hardware for Bioinformatics Research

    Directory of Open Access Journals (Sweden)

    Burgess Shane C

    2008-04-01

    Full Text Available Abstract Background This paper describes techniques for accelerating the performance of the string set matching problem with particular emphasis on applications in computational proteomics. The process of matching peptide sequences against a genome translated in six reading frames is part of a proteogenomic mapping pipeline that is used as a case-study. The Aho-Corasick algorithm is adapted for execution in field programmable gate array (FPGA devices in a manner that optimizes space and performance. In this approach, the traditional Aho-Corasick finite state machine (FSM is split into smaller FSMs, operating in parallel, each of which matches up to 20 peptides in the input translated genome. Each of the smaller FSMs is further divided into five simpler FSMs such that each simple FSM operates on a single bit position in the input (five bits are sufficient for representing all amino acids and special symbols in protein sequences. Results This bit-split organization of the Aho-Corasick implementation enables efficient utilization of the limited random access memory (RAM resources available in typical FPGAs. The use of on-chip RAM as opposed to FPGA logic resources for FSM implementation also enables rapid reconfiguration of the FPGA without the place and routing delays associated with complex digital designs. Conclusion Experimental results show storage efficiencies of over 80% for several data sets. Furthermore, the FPGA implementation executing at 100 MHz is nearly 20 times faster than an implementation of the traditional Aho-Corasick algorithm executing on a 2.67 GHz workstation.

  10. A compact linear accelerator based on a scalable microelectromechanical-system RF-structure

    Science.gov (United States)

    Persaud, A.; Ji, Q.; Feinberg, E.; Seidl, P. A.; Waldron, W. L.; Schenkel, T.; Lal, A.; Vinayakumar, K. B.; Ardanuc, S.; Hammer, D. A.

    2017-06-01

    A new approach for a compact radio-frequency (RF) accelerator structure is presented. The new accelerator architecture is based on the Multiple Electrostatic Quadrupole Array Linear Accelerator (MEQALAC) structure that was first developed in the 1980s. The MEQALAC utilized RF resonators producing the accelerating fields and providing for higher beam currents through parallel beamlets focused using arrays of electrostatic quadrupoles (ESQs). While the early work obtained ESQs with lateral dimensions on the order of a few centimeters, using a printed circuit board (PCB), we reduce the characteristic dimension to the millimeter regime, while massively scaling up the potential number of parallel beamlets. Using Microelectromechanical systems scalable fabrication approaches, we are working on further reducing the characteristic dimension to the sub-millimeter regime. The technology is based on RF-acceleration components and ESQs implemented in the PCB or silicon wafers where each beamlet passes through beam apertures in the wafer. The complete accelerator is then assembled by stacking these wafers. This approach has the potential for fast and inexpensive batch fabrication of the components and flexibility in system design for application specific beam energies and currents. For prototyping the accelerator architecture, the components have been fabricated using the PCB. In this paper, we present proof of concept results of the principal components using the PCB: RF acceleration and ESQ focusing. Ongoing developments on implementing components in silicon and scaling of the accelerator technology to high currents and beam energies are discussed.

  11. A compact linear accelerator based on a scalable microelectromechanical-system RF-structure.

    Science.gov (United States)

    Persaud, A; Ji, Q; Feinberg, E; Seidl, P A; Waldron, W L; Schenkel, T; Lal, A; Vinayakumar, K B; Ardanuc, S; Hammer, D A

    2017-06-01

    A new approach for a compact radio-frequency (RF) accelerator structure is presented. The new accelerator architecture is based on the Multiple Electrostatic Quadrupole Array Linear Accelerator (MEQALAC) structure that was first developed in the 1980s. The MEQALAC utilized RF resonators producing the accelerating fields and providing for higher beam currents through parallel beamlets focused using arrays of electrostatic quadrupoles (ESQs). While the early work obtained ESQs with lateral dimensions on the order of a few centimeters, using a printed circuit board (PCB), we reduce the characteristic dimension to the millimeter regime, while massively scaling up the potential number of parallel beamlets. Using Microelectromechanical systems scalable fabrication approaches, we are working on further reducing the characteristic dimension to the sub-millimeter regime. The technology is based on RF-acceleration components and ESQs implemented in the PCB or silicon wafers where each beamlet passes through beam apertures in the wafer. The complete accelerator is then assembled by stacking these wafers. This approach has the potential for fast and inexpensive batch fabrication of the components and flexibility in system design for application specific beam energies and currents. For prototyping the accelerator architecture, the components have been fabricated using the PCB. In this paper, we present proof of concept results of the principal components using the PCB: RF acceleration and ESQ focusing. Ongoing developments on implementing components in silicon and scaling of the accelerator technology to high currents and beam energies are discussed.

  12. FPGA Hardware Acceleration of a Phylogenetic Tree Reconstruction with Maximum Parsimony Algorithm

    OpenAIRE

    BLOCK, Henry; MARUYAMA, Tsutomu

    2017-01-01

    In this paper, we present an FPGA hardware implementation for a phylogenetic tree reconstruction with a maximum parsimony algorithm. We base our approach on a particular stochastic local search algorithm that uses the Progressive Neighborhood and the Indirect Calculation of Tree Lengths method. This method is widely used for the acceleration of the phylogenetic tree reconstruction algorithm in software. In our implementation, we define a tree structure and accelerate the search by parallel an...

  13. Scalable devices

    KAUST Repository

    Krü ger, Jens J.; Hadwiger, Markus

    2014-01-01

    In computer science in general and in particular the field of high performance computing and supercomputing the term scalable plays an important role. It indicates that a piece of hardware, a concept, an algorithm, or an entire system scales

  14. Hardware Acceleration on Cloud Services: The use of Restricted Boltzmann Machines on Handwritten Digits Recognition

    Directory of Open Access Journals (Sweden)

    Eleni Bougioukou

    2018-02-01

    Full Text Available Cloud computing allows users and enterprises to process their data in high performance servers, thus reducing the need for advanced hardware at the client side. Although local processing is viable in many cases, collecting data from multiple clients and processing them in a server gives the best possible performance in terms of processing rate. In this work, the implementation of a high performance cloud computing engine for recognizing handwritten digits is presented. The engine exploits the benefits of cloud and uses a powerful hardware accelerator in order to classify the images received concurrently from multiple clients. The accelerator implements a number of neural networks, operating in parallel, resulting to a processing rate of more than 10 MImages/sec.

  15. A Hardware Accelerator for Fault Simulation Utilizing a Reconfigurable Array Architecture

    Directory of Open Access Journals (Sweden)

    Sungho Kang

    1996-01-01

    Full Text Available In order to reduce cost and to achieve high speed a new hardware accelerator for fault simulation has been designed. The architecture of the new accelerator is based on a reconfigurabl mesh type processing element (PE array. Circuit elements at the same topological level are simulated concurrently, as in a pipelined process. A new parallel simulation algorithm expands all of the gates to two input gates in order to limit the number of faults to two at each gate, so that the faults can be distributed uniformly throughout the PE array. The PE array reconfiguration operation provides a simulation speed advantage by maximizing the use of each PE cell.

  16. Spectral-element Seismic Wave Propagation on CUDA/OpenCL Hardware Accelerators

    Science.gov (United States)

    Peter, D. B.; Videau, B.; Pouget, K.; Komatitsch, D.

    2015-12-01

    Seismic wave propagation codes are essential tools to investigate a variety of wave phenomena in the Earth. Furthermore, they can now be used for seismic full-waveform inversions in regional- and global-scale adjoint tomography. Although these seismic wave propagation solvers are crucial ingredients to improve the resolution of tomographic images to answer important questions about the nature of Earth's internal processes and subsurface structure, their practical application is often limited due to high computational costs. They thus need high-performance computing (HPC) facilities to improving the current state of knowledge. At present, numerous large HPC systems embed many-core architectures such as graphics processing units (GPUs) to enhance numerical performance. Such hardware accelerators can be programmed using either the CUDA programming environment or the OpenCL language standard. CUDA software development targets NVIDIA graphic cards while OpenCL was adopted by additional hardware accelerators, like e.g. AMD graphic cards, ARM-based processors as well as Intel Xeon Phi coprocessors. For seismic wave propagation simulations using the open-source spectral-element code package SPECFEM3D_GLOBE, we incorporated an automatic source-to-source code generation tool (BOAST) which allows us to use meta-programming of all computational kernels for forward and adjoint runs. Using our BOAST kernels, we generate optimized source code for both CUDA and OpenCL languages within the source code package. Thus, seismic wave simulations are able now to fully utilize CUDA and OpenCL hardware accelerators. We show benchmarks of forward seismic wave propagation simulations using SPECFEM3D_GLOBE on CUDA/OpenCL GPUs, validating results and comparing performances for different simulations and hardware usages.

  17. Forward and adjoint spectral-element simulations of seismic wave propagation using hardware accelerators

    Science.gov (United States)

    Peter, Daniel; Videau, Brice; Pouget, Kevin; Komatitsch, Dimitri

    2015-04-01

    Improving the resolution of tomographic images is crucial to answer important questions on the nature of Earth's subsurface structure and internal processes. Seismic tomography is the most prominent approach where seismic signals from ground-motion records are used to infer physical properties of internal structures such as compressional- and shear-wave speeds, anisotropy and attenuation. Recent advances in regional- and global-scale seismic inversions move towards full-waveform inversions which require accurate simulations of seismic wave propagation in complex 3D media, providing access to the full 3D seismic wavefields. However, these numerical simulations are computationally very expensive and need high-performance computing (HPC) facilities for further improving the current state of knowledge. During recent years, many-core architectures such as graphics processing units (GPUs) have been added to available large HPC systems. Such GPU-accelerated computing together with advances in multi-core central processing units (CPUs) can greatly accelerate scientific applications. There are mainly two possible choices of language support for GPU cards, the CUDA programming environment and OpenCL language standard. CUDA software development targets NVIDIA graphic cards while OpenCL was adopted mainly by AMD graphic cards. In order to employ such hardware accelerators for seismic wave propagation simulations, we incorporated a code generation tool BOAST into an existing spectral-element code package SPECFEM3D_GLOBE. This allows us to use meta-programming of computational kernels and generate optimized source code for both CUDA and OpenCL languages, running simulations on either CUDA or OpenCL hardware accelerators. We show here applications of forward and adjoint seismic wave propagation on CUDA/OpenCL GPUs, validating results and comparing performances for different simulations and hardware usages.

  18. FPGA hardware acceleration for high performance neutron transport computation based on agent methodology - 318

    International Nuclear Information System (INIS)

    Shanjie, Xiao; Tatjana, Jevremovic

    2010-01-01

    The accurate, detailed and 3D neutron transport analysis for Gen-IV reactors is still time-consuming regardless of advanced computational hardware available in developed countries. This paper introduces a new concept in addressing the computational time while persevering the detailed and accurate modeling; a specifically designed FPGA co-processor accelerates robust AGENT methodology for complex reactor geometries. For the first time this approach is applied to accelerate the neutronics analysis. The AGENT methodology solves neutron transport equation using the method of characteristics. The AGENT methodology performance was carefully analyzed before the hardware design based on the FPGA co-processor was adopted. The most time-consuming kernel part is then transplanted into the FPGA co-processor. The FPGA co-processor is designed with data flow-driven non von-Neumann architecture and has much higher efficiency than the conventional computer architecture. Details of the FPGA co-processor design are introduced and the design is benchmarked using two different examples. The advanced chip architecture helps the FPGA co-processor obtaining more than 20 times speed up with its working frequency much lower than the CPU frequency. (authors)

  19. Flexible hardware design for RSA and Elliptic Curve Cryptosystems

    NARCIS (Netherlands)

    Batina, L.; Bruin - Muurling, G.; Örs, S.B.; Okamoto, T.

    2004-01-01

    This paper presents a scalable hardware implementation of both commonly used public key cryptosystems, RSA and Elliptic Curve Cryptosystem (ECC) on the same platform. The introduced hardware accelerator features a design which can be varied from very small (less than 20 Kgates) targeting wireless

  20. Instrumentation Of The CERN Accelerator Logging Service: Ensuring Performance, Scalability, Maintenance And Diagnostics

    CERN Document Server

    Roderick, C; Dinis Teixeira, D

    2011-01-01

    The CERN accelerator Logging Service currently holds more than 90 terabytes of data online, and processes approximately 450 gigabytes per day, via hundreds of data loading processes and data extraction requests. This service is mission-critical for day-to-day operations, especially with respect to the tracking of live data from the LHC beam and equipment. In order to effectively manage any service, the service provider’s goals should include knowing how the underlying systems are being used, in terms of: “Who is doing what, from where, using which applications and methods, and how long each action takes”. Armed with such information, it is then possible to: analyse and tune system performance over time; plan for scalability ahead of time; assess the impact of maintenance operations and infrastructure upgrades; diagnose past, on-going, or re-occurring problems. The Logging Service is based on Oracle DBMS and Application Servers, and Java technology, and is comprised of several layered and multi-tiered s...

  1. Scalable devices

    KAUST Repository

    Krüger, Jens J.

    2014-01-01

    In computer science in general and in particular the field of high performance computing and supercomputing the term scalable plays an important role. It indicates that a piece of hardware, a concept, an algorithm, or an entire system scales with the size of the problem, i.e., it can not only be used in a very specific setting but it\\'s applicable for a wide range of problems. From small scenarios to possibly very large settings. In this spirit, there exist a number of fixed areas of research on scalability. There are works on scalable algorithms, scalable architectures but what are scalable devices? In the context of this chapter, we are interested in a whole range of display devices, ranging from small scale hardware such as tablet computers, pads, smart-phones etc. up to large tiled display walls. What interests us mostly is not so much the hardware setup but mostly the visualization algorithms behind these display systems that scale from your average smart phone up to the largest gigapixel display walls.

  2. A Hardware-Accelerated Quantum Monte Carlo framework (HAQMC) for N-body systems

    Science.gov (United States)

    Gothandaraman, Akila; Peterson, Gregory D.; Warren, G. Lee; Hinde, Robert J.; Harrison, Robert J.

    2009-12-01

    Interest in the study of structural and energetic properties of highly quantum clusters, such as inert gas clusters has motivated the development of a hardware-accelerated framework for Quantum Monte Carlo simulations. In the Quantum Monte Carlo method, the properties of a system of atoms, such as the ground-state energies, are averaged over a number of iterations. Our framework is aimed at accelerating the computations in each iteration of the QMC application by offloading the calculation of properties, namely energy and trial wave function, onto reconfigurable hardware. This gives a user the capability to run simulations for a large number of iterations, thereby reducing the statistical uncertainty in the properties, and for larger clusters. This framework is designed to run on the Cray XD1 high performance reconfigurable computing platform, which exploits the coarse-grained parallelism of the processor along with the fine-grained parallelism of the reconfigurable computing devices available in the form of field-programmable gate arrays. In this paper, we illustrate the functioning of the framework, which can be used to calculate the energies for a model cluster of helium atoms. In addition, we present the capabilities of the framework that allow the user to vary the chemical identities of the simulated atoms. Program summaryProgram title: Hardware Accelerated Quantum Monte Carlo (HAQMC) Catalogue identifier: AEEP_v1_0 Program summary URL:http://cpc.cs.qub.ac.uk/summaries/AEEP_v1_0.html Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html No. of lines in distributed program, including test data, etc.: 691 537 No. of bytes in distributed program, including test data, etc.: 5 031 226 Distribution format: tar.gz Programming language: C/C++ for the QMC application, VHDL and Xilinx 8.1 ISE/EDK tools for FPGA design and development Computer: Cray XD

  3. A hardware acceleration based on high-level synthesis approach for glucose-insulin analysis

    Science.gov (United States)

    Daud, Nur Atikah Mohd; Mahmud, Farhanahani; Jabbar, Muhamad Hairol

    2017-01-01

    In this paper, the research is focusing on Type 1 Diabetes Mellitus (T1DM). Since this disease requires a full attention on the blood glucose concentration with the help of insulin injection, it is important to have a tool that able to predict that level when consume a certain amount of carbohydrate during meal time. Therefore, to make it realizable, a Hovorka model which is aiming towards T1DM is chosen in this research. A high-level language is chosen that is C++ to construct the mathematical model of the Hovorka model. Later, this constructed code is converted into intellectual property (IP) which is also known as a hardware accelerator by using of high-level synthesis (HLS) approach which able to improve in terms of design and performance for glucose-insulin analysis tool later as will be explained further in this paper. This is the first step in this research before implementing the design into system-on-chip (SoC) to achieve a high-performance system for the glucose-insulin analysis tool.

  4. A Scalable Parallel PWTD-Accelerated SIE Solver for Analyzing Transient Scattering from Electrically Large Objects

    KAUST Repository

    Liu, Yang

    2015-12-17

    A scalable parallel plane-wave time-domain (PWTD) algorithm for efficient and accurate analysis of transient scattering from electrically large objects is presented. The algorithm produces scalable communication patterns on very large numbers of processors by leveraging two mechanisms: (i) a hierarchical parallelization strategy to evenly distribute the computation and memory loads at all levels of the PWTD tree among processors, and (ii) a novel asynchronous communication scheme to reduce the cost and memory requirement of the communications between the processors. The efficiency and accuracy of the algorithm are demonstrated through its applications to the analysis of transient scattering from a perfect electrically conducting (PEC) sphere with a diameter of 70 wavelengths and a PEC square plate with a dimension of 160 wavelengths. Furthermore, the proposed algorithm is used to analyze transient fields scattered from realistic airplane and helicopter models under high frequency excitation.

  5. Accelerating the Non-equispaced Fast Fourier Transform on Commodity Graphics Hardware

    DEFF Research Database (Denmark)

    Sørensen, Thomas Sangild; Schaeffter, Tobias; Noe, Karsten Østergaard

    2008-01-01

    We present a fast parallel algorithm to compute the Non-equispaced fast Fourier transform on commodity graphics hardware (the GPU). We focus particularly on a novel implementation of the convolution step in the transform, which was previously its most time consuming part. We describe the performa......We present a fast parallel algorithm to compute the Non-equispaced fast Fourier transform on commodity graphics hardware (the GPU). We focus particularly on a novel implementation of the convolution step in the transform, which was previously its most time consuming part. We describe...

  6. Accelerated Degradation for Hardware in the Loop Simulation of Fuel Cell-Gas Turbine Hybrid System

    DEFF Research Database (Denmark)

    Abreu-Sepulveda, Maria A.; Harun, Nor Farida; Hackett, Gregory

    2015-01-01

    The U.S. Department of Energy (DOE)-National Energy Technology Laboratory (NETL) in Morgantown, WV has developed the hybrid performance (HyPer) project in which a solid oxide fuel cell (SOFC) one-dimensional (1D), real-time operating model is coupled to a gas turbine hardware system by utilizing...

  7. Hardware for Accelerating N-Modular Redundant Systems for High-Reliability Computing

    Science.gov (United States)

    Dobbs, Carl, Sr.

    2012-01-01

    A hardware unit has been designed that reduces the cost, in terms of performance and power consumption, for implementing N-modular redundancy (NMR) in a multiprocessor device. The innovation monitors transactions to memory, and calculates a form of sumcheck on-the-fly, thereby relieving the processors of calculating the sumcheck in software

  8. Hardware Acceleration of SQL-Queries Processing in MDM-Systems Based on MISDSolution

    OpenAIRE

    V. E. Podol'skii; A. V. Samochadin; S. S. Koloskov

    2015-01-01

    In this article we examine the possibility of hardware support for functions of mobile device management platform (MDM-platform) using a Multiple Instructions and Single Data stream computer system, developed within the framework of the project in Bauman Moscow State Technical University. At the universities the MDM-platform is used to provide various mobile services for the faculty, students and administration to facilitate the learning process: a mobile schedule, document sharing, text mess...

  9. Hardware Acceleration of SQL-Queries Processing in MDM-Systems Based on MISDSolution

    Directory of Open Access Journals (Sweden)

    V. E. Podol'skii

    2015-01-01

    Full Text Available In this article we examine the possibility of hardware support for functions of mobile device management platform (MDM-platform using a Multiple Instructions and Single Data stream computer system, developed within the framework of the project in Bauman Moscow State Technical University. At the universities the MDM-platform is used to provide various mobile services for the faculty, students and administration to facilitate the learning process: a mobile schedule, document sharing, text messages, and other interactive activities. Most of these services are provided by the extensive use of data stored in MDM-platform databases. When accessing the databases SQL- queries are commonly used. These queries comprise operators of SQL-language that are based on mathematical sets theory. Hardware support for operations on sets is implemented in Multiple Instructions and Single Data stream computer system (MISD System. This allows performance improvement of algorithms and operations on sets. Thus, the hardware support for the processing of SQL-queries in MISD system allows us to benefit from the implementation of SQL-queries in the MISD paradigm.The scientific novelty of the work lies in the fact that it is the first time a set of algorithms for basic SQL statements has been presented in a format supported by MISD system. In addition, for the first time operators INNER JOIN, LEFT JOIN and LEFT OUTER JOIN have been implemented for MISD system and tested for it (testing was done for FPGA Xilinx Virtex-II Pro XC2VP30 implementation of MISD system. The practical significance of the work lies in the fact that the results of the study will be used in the project "Development of the Russian analogue of the system software for centralized management of personal devices and platforms in enterprise networks" of the St. Petersburg Polytechnic University (with the financial support of the state represented by the Ministry of Education and Science of the Russian

  10. Hardware-accelerated Point Generation and Rendering of Point-based Impostors

    DEFF Research Database (Denmark)

    Bærentzen, Jakob Andreas

    2005-01-01

    This paper presents a novel scheme for generating points from triangle models. The method is fast and lends itself well to implementation using graphics hardware. The triangle to point conversion is done by rendering the models, and the rendering may be performed procedurally or by a black box API....... I describe the technique in detail and discuss how the generated point sets can easily be used as impostors for the original triangle models used to create the points. Since the points reside solely in GPU memory, these impostors are fairly efficient. Source code is available online....

  11. Design of Power Efficient FPGA based Hardware Accelerators for Financial Applications

    DEFF Research Database (Denmark)

    Hegner, Jonas Stenbæk; Sindholt, Joakim; Nannarelli, Alberto

    2012-01-01

    Using Field Programmable Gate Arrays (FPGAs) to accelerate financial derivative calculations is becoming very common. In this work, we implement an FPGA-based specific processor for European option pricing using Monte Carlo simulations, and we compare its performance and power dissipation...

  12. Design and implementation of embedded hardware accelerator for diagnosing HDL-CODE in assertion-based verification environment

    Directory of Open Access Journals (Sweden)

    C. U. Ngene

    2013-08-01

    Full Text Available The use of assertions for monitoring the designer’s intention in hardware description language (HDL model is gaining popularity as it helps the designer to observe internal errors at the output ports of the device under verification. During verification assertions are synthesised and the generated data are represented in a tabular forms. The amount of data generated can be enormous depending on the size of the code and the number of modules that constitute the code. Furthermore, to manually inspect these data and diagnose the module with functional violation is a time consuming process which negatively affects the overall product development time. To locate the module with functional violation within acceptable diagnostic time, the data processing and analysis procedure must be accelerated. In this paper a multi-array processor (hardware accelerator was designed and implemented in Virtex6 field programmable gate array (FPGA and it can be integrated into verification environment. The design was captured in very high speed integrated circuit HDL (VHDL. The design was synthesised with Xilinx design suite ISE 13.1 and simulated with Xilinx ISIM. The multi-array processor (MAP executes three logical operations (AND, OR, XOR and a one’s compaction operation on array of data in parallel. An improvement in processing and analysis time was recorded as compared to the manual procedure after the multi-array processor was integrated into the verification environment. It was also found that the multi-array processor which was developed as an Intellectual Property (IP core can also be used in applications where output responses and golden model that are represented in the form of matrices can be compared for searching, recognition and decision-making.

  13. Fast parallel tandem mass spectral library searching using GPU hardware acceleration.

    Science.gov (United States)

    Baumgardner, Lydia Ashleigh; Shanmugam, Avinash Kumar; Lam, Henry; Eng, Jimmy K; Martin, Daniel B

    2011-06-03

    Mass spectrometry-based proteomics is a maturing discipline of biologic research that is experiencing substantial growth. Instrumentation has steadily improved over time with the advent of faster and more sensitive instruments collecting ever larger data files. Consequently, the computational process of matching a peptide fragmentation pattern to its sequence, traditionally accomplished by sequence database searching and more recently also by spectral library searching, has become a bottleneck in many mass spectrometry experiments. In both of these methods, the main rate-limiting step is the comparison of an acquired spectrum with all potential matches from a spectral library or sequence database. This is a highly parallelizable process because the core computational element can be represented as a simple but arithmetically intense multiplication of two vectors. In this paper, we present a proof of concept project taking advantage of the massively parallel computing available on graphics processing units (GPUs) to distribute and accelerate the process of spectral assignment using spectral library searching. This program, which we have named FastPaSS (for Fast Parallelized Spectral Searching), is implemented in CUDA (Compute Unified Device Architecture) from NVIDIA, which allows direct access to the processors in an NVIDIA GPU. Our efforts demonstrate the feasibility of GPU computing for spectral assignment, through implementation of the validated spectral searching algorithm SpectraST in the CUDA environment.

  14. Floating-point-based hardware accelerator of a beam phase-magnitude detector and filter for a beam phase control system in a heavy-ion synchrotron application

    International Nuclear Information System (INIS)

    Samman, F.A.; Pongyupinpanich Surapong; Spies, C.; Glesner, M.

    2012-01-01

    A hardware implementation of an adaptive phase and magnitude detector and filter of a beam-phase control system in a heavy ion synchrotron application is presented in this paper. The main components of the hardware are adaptive LMS (Least-Mean-Square) filters and phase and magnitude detectors. The phase detectors are implemented by using a CORDIC (Coordinate Rotation Digital Computer) algorithm based on 32-bit binary floating-point arithmetic data formats. The floating-point-based hardware is designed to improve the precision of the past hardware implementation that were based on fixed-point arithmetics. The hardware of the detector and the adaptive LMS filter have been implemented on a programmable logic device (FPGA) for hardware acceleration purpose. The ideal Matlab/Simulink model of the hardware and the VHDL model of the adaptive LMS filter and the phase and magnitude detector are compared. The comparison result shows that the output signal of the floating-point based adaptive FIR filter as well as the phase and magnitude detector agree with the expected output signal of the ideal Matlab/Simulink model. (authors)

  15. GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping.

    Science.gov (United States)

    Alser, Mohammed; Hassan, Hasan; Xin, Hongyi; Ergin, Oguz; Mutlu, Onur; Alkan, Can

    2017-11-01

    High throughput DNA sequencing (HTS) technologies generate an excessive number of small DNA segments -called short reads- that cause significant computational burden. To analyze the entire genome, each of the billions of short reads must be mapped to a reference genome based on the similarity between a read and 'candidate' locations in that reference genome. The similarity measurement, called alignment, formulated as an approximate string matching problem, is the computational bottleneck because: (i) it is implemented using quadratic-time dynamic programming algorithms and (ii) the majority of candidate locations in the reference genome do not align with a given read due to high dissimilarity. Calculating the alignment of such incorrect candidate locations consumes an overwhelming majority of a modern read mapper's execution time. Therefore, it is crucial to develop a fast and effective filter that can detect incorrect candidate locations and eliminate them before invoking computationally costly alignment algorithms. We propose GateKeeper, a new hardware accelerator that functions as a pre-alignment step that quickly filters out most incorrect candidate locations. GateKeeper is the first design to accelerate pre-alignment using Field-Programmable Gate Arrays (FPGAs), which can perform pre-alignment much faster than software. When implemented on a single FPGA chip, GateKeeper maintains high accuracy (on average >96%) while providing, on average, 90-fold and 130-fold speedup over the state-of-the-art software pre-alignment techniques, Adjacency Filter and Shifted Hamming Distance (SHD), respectively. The addition of GateKeeper as a pre-alignment step can reduce the verification time of the mrFAST mapper by a factor of 10. https://github.com/BilkentCompGen/GateKeeper. mohammedalser@bilkent.edu.tr or onur.mutlu@inf.ethz.ch or calkan@cs.bilkent.edu.tr. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press

  16. Establishing a novel modeling tool: a python-based interface for a neuromorphic hardware system

    Directory of Open Access Journals (Sweden)

    Daniel Brüderle

    2009-06-01

    Full Text Available Neuromorphic hardware systems provide new possibilities for the neuroscience modeling community. Due to the intrinsic parallelism of the micro-electronic emulation of neural computation, such models are highly scalable without a loss of speed. However, the communities of software simulator users and neuromorphic engineering in neuroscience are rather disjoint. We present a software concept that provides the possibility to establish such hardware devices as valuable modeling tools. It is based on the integration of the hardware interface into a simulator-independent language which allows for unified experiment descriptions that can be run on various simulation platforms without modification, implying experiment portability and a huge simplification of the quantitative comparison of hardware and simulator results. We introduce an accelerated neuromorphic hardware device and describe the implementation of the proposed concept for this system. An example setup and results acquired by utilizing both the hardware system and a software simulator are demonstrated.

  17. Establishing a novel modeling tool: a python-based interface for a neuromorphic hardware system.

    Science.gov (United States)

    Brüderle, Daniel; Müller, Eric; Davison, Andrew; Muller, Eilif; Schemmel, Johannes; Meier, Karlheinz

    2009-01-01

    Neuromorphic hardware systems provide new possibilities for the neuroscience modeling community. Due to the intrinsic parallelism of the micro-electronic emulation of neural computation, such models are highly scalable without a loss of speed. However, the communities of software simulator users and neuromorphic engineering in neuroscience are rather disjoint. We present a software concept that provides the possibility to establish such hardware devices as valuable modeling tools. It is based on the integration of the hardware interface into a simulator-independent language which allows for unified experiment descriptions that can be run on various simulation platforms without modification, implying experiment portability and a huge simplification of the quantitative comparison of hardware and simulator results. We introduce an accelerated neuromorphic hardware device and describe the implementation of the proposed concept for this system. An example setup and results acquired by utilizing both the hardware system and a software simulator are demonstrated.

  18. Scalable Resolution Display Walls

    KAUST Repository

    Leigh, Jason; Johnson, Andrew; Renambot, Luc; Peterka, Tom; Jeong, Byungil; Sandin, Daniel J.; Talandis, Jonas; Jagodic, Ratko; Nam, Sungwon; Hur, Hyejung; Sun, Yiwen

    2013-01-01

    This article will describe the progress since 2000 on research and development in 2-D and 3-D scalable resolution display walls that are built from tiling individual lower resolution flat panel displays. The article will describe approaches and trends in display hardware construction, middleware architecture, and user-interaction design. The article will also highlight examples of use cases and the benefits the technology has brought to their respective disciplines. © 1963-2012 IEEE.

  19. The VMTG Hardware Description

    CERN Document Server

    Puccio, B

    1998-01-01

    The document describes the hardware features of the CERN Master Timing Generator. This board is the common platform for the transmission of General Timing Machine required by the CERN accelerators. In addition, the paper shows the various jumper options to customise the card which is compliant to the VMEbus standard.

  20. An Introduction to Parallelism, Concurrency and Acceleration (1/2)

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    Concurrency and parallelism are firm elements of any modern computing infrastructure, made even more prominent by the emergence of accelerators. These lectures offer an introduction to these important concepts. We will begin with a brief refresher of recent hardware offerings to modern-day programmers. We will then open the main discussion with an overview of the laws and practical aspects of scalability. Key parallelism data structures, patterns and algorithms will be shown. The main threats to scalability and mitigation strategies will be discussed in the context of real-life optimization problems.

  1. Hardware malware

    CERN Document Server

    Krieg, Christian

    2013-01-01

    In our digital world, integrated circuits are present in nearly every moment of our daily life. Even when using the coffee machine in the morning, or driving our car to work, we interact with integrated circuits. The increasing spread of information technology in virtually all areas of life in the industrialized world offers a broad range of attack vectors. So far, mainly software-based attacks have been considered and investigated, while hardware-based attacks have attracted comparatively little interest. The design and production process of integrated circuits is mostly decentralized due to

  2. How to create successful Open Hardware projects — About White Rabbits and open fields

    International Nuclear Information System (INIS)

    Bij, E van der; Arruat, M; Cattin, M; Daniluk, G; Cobas, J D Gonzalez; Gousiou, E; Lewis, J; Lipinski, M M; Serrano, J; Stana, T; Voumard, N; Wlostowski, T

    2013-01-01

    CERN's accelerator control group has embraced ''Open Hardware'' (OH) to facilitate peer review, avoid vendor lock-in and make support tasks scalable. A web-based tool for easing collaborative work was set up and the CERN OH Licence was created. New ADC, TDC, fine delay and carrier cards based on VITA and PCI-SIG standards were designed and drivers for Linux were written. Often industry was paid for developments, while quality and documentation was controlled by CERN. An innovative timing network was also developed with the OH paradigm. Industry now sells and supports these designs that find their way into new fields

  3. How to create successful Open Hardware projects - About White Rabbits and open fields

    CERN Document Server

    van der Bij, E; Lewis, J; Stana, T; Wlostowski, T; Gousiou, E; Serrano, J; Arruat, M; Lipinski, M M; Daniluk, G; Voumard, N; Cattin, M

    2013-01-01

    CERN's accelerator control group has embraced "Open Hardware" (OH) to facilitate peer review, avoid vendor lock-in and make support tasks scalable. A web-based tool for easing collaborative work was set up and the CERN OH Licence was created. New ADC, TDC, fine delay and carrier cards based on VITA and PCI-SIG standards were designed and drivers for Linux were written. Often industry was paid for developments, while quality and documentation was controlled by CERN. An innovative timing network was also developed with the OH paradigm. Industry now sells and supports these designs that find their way into new fields.

  4. Scalable and massively parallel Monte Carlo photon transport simulations for heterogeneous computing platforms.

    Science.gov (United States)

    Yu, Leiming; Nina-Paravecino, Fanny; Kaeli, David; Fang, Qianqian

    2018-01-01

    We present a highly scalable Monte Carlo (MC) three-dimensional photon transport simulation platform designed for heterogeneous computing systems. Through the development of a massively parallel MC algorithm using the Open Computing Language framework, this research extends our existing graphics processing unit (GPU)-accelerated MC technique to a highly scalable vendor-independent heterogeneous computing environment, achieving significantly improved performance and software portability. A number of parallel computing techniques are investigated to achieve portable performance over a wide range of computing hardware. Furthermore, multiple thread-level and device-level load-balancing strategies are developed to obtain efficient simulations using multiple central processing units and GPUs. (2018) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE).

  5. CODA: A scalable, distributed data acquisition system

    International Nuclear Information System (INIS)

    Watson, W.A. III; Chen, J.; Heyes, G.; Jastrzembski, E.; Quarrie, D.

    1994-01-01

    A new data acquisition system has been designed for physics experiments scheduled to run at CEBAF starting in the summer of 1994. This system runs on Unix workstations connected via ethernet, FDDI, or other network hardware to multiple intelligent front end crates -- VME, CAMAC or FASTBUS. CAMAC crates may either contain intelligent processors, or may be interfaced to VME. The system is modular and scalable, from a single front end crate and one workstation linked by ethernet, to as may as 32 clusters of front end crates ultimately connected via a high speed network to a set of analysis workstations. The system includes an extensible, device independent slow controls package with drivers for CAMAC, VME, and high voltage crates, as well as a link to CEBAF accelerator controls. All distributed processes are managed by standard remote procedure calls propagating change-of-state requests, or reading and writing program variables. Custom components may be easily integrated. The system is portable to any front end processor running the VxWorks real-time kernel, and to most workstations supplying a few standard facilities such as rsh and X-windows, and Motif and socket libraries. Sample implementations exist for 2 Unix workstation families connected via ethernet or FDDI to VME (with interfaces to FASTBUS or CAMAC), and via ethernet to FASTBUS or CAMAC

  6. Accelerators

    CERN Multimedia

    CERN. Geneva

    2001-01-01

    The talk summarizes the principles of particle acceleration and addresses problems related to storage rings like LEP and LHC. Special emphasis will be given to orbit stability, long term stability of the particle motion, collective effects and synchrotron radiation.

  7. Scalable fast multipole methods for vortex element methods

    KAUST Repository

    Hu, Qi

    2012-11-01

    We use a particle-based method to simulate incompressible flows, where the Fast Multipole Method (FMM) is used to accelerate the calculation of particle interactions. The most time-consuming kernelsâ\\'the Biot-Savart equation and stretching term of the vorticity equationâ\\'are mathematically reformulated so that only two Laplace scalar potentials are used instead of six, while automatically ensuring divergence-free far-field computation. Based on this formulation, and on our previous work for a scalar heterogeneous FMM algorithm, we develop a new FMM-based vortex method capable of simulating general flows including turbulence on heterogeneous architectures, which distributes the work between multi-core CPUs and GPUs to best utilize the hardware resources and achieve excellent scalability. The algorithm also uses new data structures which can dynamically manage inter-node communication and load balance efficiently but with only a small parallel construction overhead. This algorithm can scale to large-sized clusters showing both strong and weak scalability. Careful error and timing trade-off analysis are also performed for the cutoff functions induced by the vortex particle method. Our implementation can perform one time step of the velocity+stretching for one billion particles on 32 nodes in 55.9 seconds, which yields 49.12 Tflop/s. © 2012 IEEE.

  8. Scalability of the LEU-Modified Cintichem Process: 3-MeV Van de Graaff and 35-MeV Electron Linear Accelerator Studies

    Energy Technology Data Exchange (ETDEWEB)

    Rotsch, David A. [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division; Brossard, Tom [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division; Roussin, Ethan [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division; Quigley, Kevin [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division; Chemerisov, Sergey [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division; Gromov, Roman [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division; Jonah, Charles [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division; Hafenrichter, Lohman [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division; Tkac, Peter [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division; Krebs, John [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division; Vandegrift, George F. [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division

    2016-10-31

    Molybdenum-99, the mother of Tc-99m, can be produced from fission of U-235 in nuclear reactors and purified from fission products by the Cintichem process, later modified for low-enriched uranium (LEU) targets. The key step in this process is the precipitation of Mo with α-benzoin oxime (ABO). The stability of this complex to radiation has been examined. Molybdenum-ABO was irradiated with 3 MeV electrons produced by a Van de Graaff generator and 35 MeV electrons produced by a 50 MeV/25 kW electron linear accelerator. Dose equivalents of 1.7–31.2 kCi of Mo-99 were administered to freshly prepared Mo-ABO. Irradiated samples of Mo-ABO were processed according to the LEU Modified-Cintichem process. The Van de Graaff data indicated good radiation stability of the Mo-ABO complex up to ~15 kCi dose equivalents of Mo-99 and nearly complete destruction at doses >24 kCi Mo-99. The linear accelerator data indicate that even at 6.2 kCi of Mo-99 equivalence of dose, the sample lost ~20% of Mo-99. The 20% loss of Mo-99 at this low dose may be attributed to thermal decomposition of the product from the heat deposited in the sample during irradiation.

  9. Using the FLUKA Monte Carlo Code to Simulate the Interactions of Ionizing Radiation with Matter to Assist and Aid Our Understanding of Ground Based Accelerator Testing, Space Hardware Design, and Secondary Space Radiation Environments

    Science.gov (United States)

    Reddell, Brandon

    2015-01-01

    Designing hardware to operate in the space radiation environment is a very difficult and costly activity. Ground based particle accelerators can be used to test for exposure to the radiation environment, one species at a time, however, the actual space environment cannot be duplicated because of the range of energies and isotropic nature of space radiation. The FLUKA Monte Carlo code is an integrated physics package based at CERN that has been under development for the last 40+ years and includes the most up-to-date fundamental physics theory and particle physics data. This work presents an overview of FLUKA and how it has been used in conjunction with ground based radiation testing for NASA and improve our understanding of secondary particle environments resulting from the interaction of space radiation with matter.

  10. Run-Time Scalable Hardware for Reconfigurable Systems

    OpenAIRE

    Otero Marnotes, Andres

    2014-01-01

    La optimización de parámetros tales como el consumo de potencia, la cantidad de recursos lógicos empleados o la ocupación de memoria ha sido siempre una de las preocupaciones principales a la hora de diseñar sistemas embebidos. Esto es debido a que se trata de sistemas dotados de una cantidad de recursos limitados, y que han sido tradicionalmente empleados para un propósito específico, que permanece invariable a lo largo de toda la vida útil del sistema. Sin embargo, el uso de sistemas embebi...

  11. Scalable Frequent Subgraph Mining

    KAUST Repository

    Abdelhamid, Ehab

    2017-06-19

    A graph is a data structure that contains a set of nodes and a set of edges connecting these nodes. Nodes represent objects while edges model relationships among these objects. Graphs are used in various domains due to their ability to model complex relations among several objects. Given an input graph, the Frequent Subgraph Mining (FSM) task finds all subgraphs with frequencies exceeding a given threshold. FSM is crucial for graph analysis, and it is an essential building block in a variety of applications, such as graph clustering and indexing. FSM is computationally expensive, and its existing solutions are extremely slow. Consequently, these solutions are incapable of mining modern large graphs. This slowness is caused by the underlying approaches of these solutions which require finding and storing an excessive amount of subgraph matches. This dissertation proposes a scalable solution for FSM that avoids the limitations of previous work. This solution is composed of four components. The first component is a single-threaded technique which, for each candidate subgraph, needs to find only a minimal number of matches. The second component is a scalable parallel FSM technique that utilizes a novel two-phase approach. The first phase quickly builds an approximate search space, which is then used by the second phase to optimize and balance the workload of the FSM task. The third component focuses on accelerating frequency evaluation, which is a critical step in FSM. To do so, a machine learning model is employed to predict the type of each graph node, and accordingly, an optimized method is selected to evaluate that node. The fourth component focuses on mining dynamic graphs, such as social networks. To this end, an incremental index is maintained during the dynamic updates. Only this index is processed and updated for the majority of graph updates. Consequently, search space is significantly pruned and efficiency is improved. The empirical evaluation shows that the

  12. Hardware Support for Dynamic Languages

    DEFF Research Database (Denmark)

    Schleuniger, Pascal; Karlsson, Sven; Probst, Christian W.

    2011-01-01

    In recent years, dynamic programming languages have enjoyed increasing popularity. For example, JavaScript has become one of the most popular programming languages on the web. As the complexity of web applications is growing, compute-intensive workloads are increasingly handed off to the client...... side. While a lot of effort is put in increasing the performance of web browsers, we aim for multicore systems with dedicated cores to effectively support dynamic languages. We have designed Tinuso, a highly flexible core for experimentation that is optimized for high performance when implemented...... on FPGA. We composed a scalable multicore configuration where we study how hardware support for software speculation can be used to increase the performance of dynamic languages....

  13. Enhancing Scalability of Sparse Direct Methods

    International Nuclear Information System (INIS)

    Li, Xiaoye S.; Demmel, James; Grigori, Laura; Gu, Ming; Xia, Jianlin; Jardin, Steve; Sovinec, Carl; Lee, Lie-Quan

    2007-01-01

    TOPS is providing high-performance, scalable sparse direct solvers, which have had significant impacts on the SciDAC applications, including fusion simulation (CEMM), accelerator modeling (COMPASS), as well as many other mission-critical applications in DOE and elsewhere. Our recent developments have been focusing on new techniques to overcome scalability bottleneck of direct methods, in both time and memory. These include parallelizing symbolic analysis phase and developing linear-complexity sparse factorization methods. The new techniques will make sparse direct methods more widely usable in large 3D simulations on highly-parallel petascale computers

  14. Introduction to Hardware Security

    Directory of Open Access Journals (Sweden)

    Yier Jin

    2015-10-01

    Full Text Available Hardware security has become a hot topic recently with more and more researchers from related research domains joining this area. However, the understanding of hardware security is often mixed with cybersecurity and cryptography, especially cryptographic hardware. For the same reason, the research scope of hardware security has never been clearly defined. To help researchers who have recently joined in this area better understand the challenges and tasks within the hardware security domain and to help both academia and industry investigate countermeasures and solutions to solve hardware security problems, we will introduce the key concepts of hardware security as well as its relations to related research topics in this survey paper. Emerging hardware security topics will also be clearly depicted through which the future trend will be elaborated, making this survey paper a good reference for the continuing research efforts in this area.

  15. Memory Based Machine Intelligence Techniques in VLSI hardware

    OpenAIRE

    James, Alex Pappachen

    2012-01-01

    We briefly introduce the memory based approaches to emulate machine intelligence in VLSI hardware, describing the challenges and advantages. Implementation of artificial intelligence techniques in VLSI hardware is a practical and difficult problem. Deep architectures, hierarchical temporal memories and memory networks are some of the contemporary approaches in this area of research. The techniques attempt to emulate low level intelligence tasks and aim at providing scalable solutions to high ...

  16. Scalable Parallel Distributed Coprocessor System for Graph Searching Problems with Massive Data

    Directory of Open Access Journals (Sweden)

    Wanrong Huang

    2017-01-01

    Full Text Available The Internet applications, such as network searching, electronic commerce, and modern medical applications, produce and process massive data. Considerable data parallelism exists in computation processes of data-intensive applications. A traversal algorithm, breadth-first search (BFS, is fundamental in many graph processing applications and metrics when a graph grows in scale. A variety of scientific programming methods have been proposed for accelerating and parallelizing BFS because of the poor temporal and spatial locality caused by inherent irregular memory access patterns. However, new parallel hardware could provide better improvement for scientific methods. To address small-world graph problems, we propose a scalable and novel field-programmable gate array-based heterogeneous multicore system for scientific programming. The core is multithread for streaming processing. And the communication network InfiniBand is adopted for scalability. We design a binary search algorithm to address mapping to unify all processor addresses. Within the limits permitted by the Graph500 test bench after 1D parallel hybrid BFS algorithm testing, our 8-core and 8-thread-per-core system achieved superior performance and efficiency compared with the prior work under the same degree of parallelism. Our system is efficient not as a special acceleration unit but as a processor platform that deals with graph searching applications.

  17. Hardware Acceleration of Sparse Cognitive Algorithms

    Science.gov (United States)

    2016-05-01

    3 Figure 2: Kernel Run Times on an NVidia GTX750 Using the NVidia Profiler .......................... 5 Figure 3: 2D Implementation...Demonstrated Results Notes: (1) Server class GPU, NVidia Tesla K20c. (2) Consumer grade GPU, GeForce GT 640. Factor GPU Baseline SIMD with PiM ASIC...times on an NVidia GTX750 using the NVidia profiler. Figure 2: Kernel Run Times on an NVidia GTX750 Using the NVidia Profiler In this example

  18. Another way of doing RSA cryptography in hardware

    NARCIS (Netherlands)

    Batina, L.; Bruin - Muurling, G.; Honary, B.

    2001-01-01

    In this paper we describe an efficient and secure hardware implementation of the RSA cryptosystem. Modular exponentiation is based on Montgomery’s method without any modular reduction achieving the optimal bound. The presented systolic array architecture is scalable in severalparameters which makes

  19. Combining hardware and simulation for datacenter scaling studies

    DEFF Research Database (Denmark)

    Ruepp, Sarah Renée; Pilimon, Artur; Thrane, Jakob

    2017-01-01

    and simulation to illustrate the scalability and performance of datacenter networks. We simulate a Datacenter network and interconnect it with real world traffic generation hardware. Analysis of the introduced packet conversion and virtual queueing delays shows that the conversion efficiency is at the order...

  20. An Introduction to Parallelism, Concurrency and Acceleration (1/2)

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    Concurrency and parallelism are firm elements of any modern computing infrastructure, made even more prominent by the emergence of accelerators. These lectures offer an introduction to these important concepts. We will begin with a brief refresher of recent hardware offerings to modern-day programmers. We will then open the main discussion with an overview of the laws and practical aspects of scalability. Key parallelism data structures, patterns and algorithms will be shown. The main threats to scalability and mitigation strategies will be discussed in the context of real-life optimization problems. Lecturer's short bio: Andrzej Nowak has 10 years of experience in computing technologies, primarily from CERN openlab and Intel. At CERN, he managed a research lab collaborating with Intel and was part of the openlab Chief Technology Office. Andrzej also worked closely and initiated projects with the private sector (e.g. HP and Google), as well as international research institutes, such as EPFL. Current...

  1. Scalable coherent interface

    International Nuclear Information System (INIS)

    Alnaes, K.; Kristiansen, E.H.; Gustavson, D.B.; James, D.V.

    1990-01-01

    The Scalable Coherent Interface (IEEE P1596) is establishing an interface standard for very high performance multiprocessors, supporting a cache-coherent-memory model scalable to systems with up to 64K nodes. This Scalable Coherent Interface (SCI) will supply a peak bandwidth per node of 1 GigaByte/second. The SCI standard should facilitate assembly of processor, memory, I/O and bus bridge cards from multiple vendors into massively parallel systems with throughput far above what is possible today. The SCI standard encompasses two levels of interface, a physical level and a logical level. The physical level specifies electrical, mechanical and thermal characteristics of connectors and cards that meet the standard. The logical level describes the address space, data transfer protocols, cache coherence mechanisms, synchronization primitives and error recovery. In this paper we address logical level issues such as packet formats, packet transmission, transaction handshake, flow control, and cache coherence. 11 refs., 10 figs

  2. Open Hardware Business Models

    Directory of Open Access Journals (Sweden)

    Edy Ferreira

    2008-04-01

    Full Text Available In the September issue of the Open Source Business Resource, Patrick McNamara, president of the Open Hardware Foundation, gave a comprehensive introduction to the concept of open hardware, including some insights about the potential benefits for both companies and users. In this article, we present the topic from a different perspective, providing a classification of market offers from companies that are making money with open hardware.

  3. Open Hardware Business Models

    OpenAIRE

    Edy Ferreira

    2008-01-01

    In the September issue of the Open Source Business Resource, Patrick McNamara, president of the Open Hardware Foundation, gave a comprehensive introduction to the concept of open hardware, including some insights about the potential benefits for both companies and users. In this article, we present the topic from a different perspective, providing a classification of market offers from companies that are making money with open hardware.

  4. PKI Scalability Issues

    OpenAIRE

    Slagell, Adam J; Bonilla, Rafael

    2004-01-01

    This report surveys different PKI technologies such as PKIX and SPKI and the issues of PKI that affect scalability. Much focus is spent on certificate revocation methodologies and status verification systems such as CRLs, Delta-CRLs, CRS, Certificate Revocation Trees, Windowed Certificate Revocation, OCSP, SCVP and DVCS.

  5. Hardware and software status of QCDOC

    International Nuclear Information System (INIS)

    Boyle, P.A.; Chen, D.; Christ, N.H.; Clark, M.; Cohen, S.D.; Cristian, C.; Dong, Z.; Gara, A.; Joo, B.; Jung, C.; Kim, C.; Levkova, L.; Liao, X.; Liu, G.; Mawhinney, R.D.; Ohta, S.; Petrov, K.; Wettig, T.; Yamaguchi, A.

    2004-01-01

    QCDOC is a massively parallel supercomputer whose processing nodes are based on an application-specific integrated circuit (ASIC). This ASIC was custom-designed so that crucial lattice QCD kernels achieve an overall sustained performance of 50% on machines with several 10,000 nodes. This strong scalability, together with low power consumption and a price/performance ratio of $1 per sustained MFlops, enable QCDOC to attack the most demanding lattice QCD problems. The first ASICs became available in June of 2003, and the testing performed so far has shown all systems functioning according to specification. We review the hardware and software status of QCDOC and present performance figures obtained in real hardware as well as in simulation

  6. DISP: Optimizations towards Scalable MPI Startup

    Energy Technology Data Exchange (ETDEWEB)

    Fu, Huansong [Florida State University, Tallahassee; Pophale, Swaroop S [ORNL; Gorentla Venkata, Manjunath [ORNL; Yu, Weikuan [Florida State University, Tallahassee

    2016-01-01

    Despite the popularity of MPI for high performance computing, the startup of MPI programs faces a scalability challenge as both the execution time and memory consumption increase drastically at scale. We have examined this problem using the collective modules of Cheetah and Tuned in Open MPI as representative implementations. Previous improvements for collectives have focused on algorithmic advances and hardware off-load. In this paper, we examine the startup cost of the collective module within a communicator and explore various techniques to improve its efficiency and scalability. Accordingly, we have developed a new scalable startup scheme with three internal techniques, namely Delayed Initialization, Module Sharing and Prediction-based Topology Setup (DISP). Our DISP scheme greatly benefits the collective initialization of the Cheetah module. At the same time, it helps boost the performance of non-collective initialization in the Tuned module. We evaluate the performance of our implementation on Titan supercomputer at ORNL with up to 4096 processes. The results show that our delayed initialization can speed up the startup of Tuned and Cheetah by an average of 32.0% and 29.2%, respectively, our module sharing can reduce the memory consumption of Tuned and Cheetah by up to 24.1% and 83.5%, respectively, and our prediction-based topology setup can speed up the startup of Cheetah by up to 80%.

  7. Open Hardware at CERN

    CERN Multimedia

    CERN Knowledge Transfer Group

    2015-01-01

    CERN is actively making its knowledge and technology available for the benefit of society and does so through a variety of different mechanisms. Open hardware has in recent years established itself as a very effective way for CERN to make electronics designs and in particular printed circuit board layouts, accessible to anyone, while also facilitating collaboration and design re-use. It is creating an impact on many levels, from companies producing and selling products based on hardware designed at CERN, to new projects being released under the CERN Open Hardware Licence. Today the open hardware community includes large research institutes, universities, individual enthusiasts and companies. Many of the companies are actively involved in the entire process from design to production, delivering services and consultancy and even making their own products available under open licences.

  8. Hardware description languages

    Science.gov (United States)

    Tucker, Jerry H.

    1994-01-01

    Hardware description languages are special purpose programming languages. They are primarily used to specify the behavior of digital systems and are rapidly replacing traditional digital system design techniques. This is because they allow the designer to concentrate on how the system should operate rather than on implementation details. Hardware description languages allow a digital system to be described with a wide range of abstraction, and they support top down design techniques. A key feature of any hardware description language environment is its ability to simulate the modeled system. The two most important hardware description languages are Verilog and VHDL. Verilog has been the dominant language for the design of application specific integrated circuits (ASIC's). However, VHDL is rapidly gaining in popularity.

  9. Hardware protection through obfuscation

    CERN Document Server

    Bhunia, Swarup; Tehranipoor, Mark

    2017-01-01

    This book introduces readers to various threats faced during design and fabrication by today’s integrated circuits (ICs) and systems. The authors discuss key issues, including illegal manufacturing of ICs or “IC Overproduction,” insertion of malicious circuits, referred as “Hardware Trojans”, which cause in-field chip/system malfunction, and reverse engineering and piracy of hardware intellectual property (IP). The authors provide a timely discussion of these threats, along with techniques for IC protection based on hardware obfuscation, which makes reverse-engineering an IC design infeasible for adversaries and untrusted parties with any reasonable amount of resources. This exhaustive study includes a review of the hardware obfuscation methods developed at each level of abstraction (RTL, gate, and layout) for conventional IC manufacturing, new forms of obfuscation for emerging integration strategies (split manufacturing, 2.5D ICs, and 3D ICs), and on-chip infrastructure needed for secure exchange o...

  10. ZEUS hardware control system

    Science.gov (United States)

    Loveless, R.; Erhard, P.; Ficenec, J.; Gather, K.; Heath, G.; Iacovacci, M.; Kehres, J.; Mobayyen, M.; Notz, D.; Orr, R.; Orr, R.; Sephton, A.; Stroili, R.; Tokushuku, K.; Vogel, W.; Whitmore, J.; Wiggers, L.

    1989-12-01

    The ZEUS collaboration is building a system to monitor, control and document the hardware of the ZEUS detector. This system is based on a network of VAX computers and microprocessors connected via ethernet. The database for the hardware values will be ADAMO tables; the ethernet connection will be DECNET, TCP/IP, or RPC. Most of the documentation will also be kept in ADAMO tables for easy access by users.

  11. ZEUS hardware control system

    International Nuclear Information System (INIS)

    Loveless, R.; Erhard, P.; Ficenec, J.; Gather, K.; Heath, G.; Iacovacci, M.; Kehres, J.; Mobayyen, M.; Notz, D.; Orr, R.; Sephton, A.; Stroili, R.; Tokushuku, K.; Vogel, W.; Whitmore, J.; Wiggers, L.

    1989-01-01

    The ZEUS collaboration is building a system to monitor, control and document the hardware of the ZEUS detector. This system is based on a network of VAX computers and microprocessors connected via ethernet. The database for the hardware values will be ADAMO tables; the ethernet connection will be DECNET, TCP/IP, or RPC. Most of the documentation will also be kept in ADAMO tables for easy access by users. (orig.)

  12. Scalable optical quantum computer

    Energy Technology Data Exchange (ETDEWEB)

    Manykin, E A; Mel' nichenko, E V [Institute for Superconductivity and Solid-State Physics, Russian Research Centre ' Kurchatov Institute' , Moscow (Russian Federation)

    2014-12-31

    A way of designing a scalable optical quantum computer based on the photon echo effect is proposed. Individual rare earth ions Pr{sup 3+}, regularly located in the lattice of the orthosilicate (Y{sub 2}SiO{sub 5}) crystal, are suggested to be used as optical qubits. Operations with qubits are performed using coherent and incoherent laser pulses. The operation protocol includes both the method of measurement-based quantum computations and the technique of optical computations. Modern hybrid photon echo protocols, which provide a sufficient quantum efficiency when reading recorded states, are considered as most promising for quantum computations and communications. (quantum computer)

  13. Scalable optical quantum computer

    International Nuclear Information System (INIS)

    Manykin, E A; Mel'nichenko, E V

    2014-01-01

    A way of designing a scalable optical quantum computer based on the photon echo effect is proposed. Individual rare earth ions Pr 3+ , regularly located in the lattice of the orthosilicate (Y 2 SiO 5 ) crystal, are suggested to be used as optical qubits. Operations with qubits are performed using coherent and incoherent laser pulses. The operation protocol includes both the method of measurement-based quantum computations and the technique of optical computations. Modern hybrid photon echo protocols, which provide a sufficient quantum efficiency when reading recorded states, are considered as most promising for quantum computations and communications. (quantum computer)

  14. Travel Software using GPU Hardware

    CERN Document Server

    Szalwinski, Chris M; Dimov, Veliko Atanasov; CERN. Geneva. ATS Department

    2015-01-01

    Travel is the main multi-particle tracking code being used at CERN for the beam dynamics calculations through hadron and ion linear accelerators. It uses two routines for the calculation of space charge forces, namely, rings of charges and point-to-point. This report presents the studies to improve the performance of Travel using GPU hardware. The studies showed that the performance of Travel with the point-to-point simulations of space-charge effects can be speeded up at least 72 times using current GPU hardware. Simple recompilation of the source code using an Intel compiler can improve performance at least 4 times without GPU support. The limited memory of the GPU is the bottleneck. Two algorithms were investigated on this point: repeated computation and tiling. The repeating computation algorithm is simpler and is the currently recommended solution. The tiling algorithm was more complicated and degraded performance. Both build and test instructions for the parallelized version of the software are inclu...

  15. Hardware Objects for Java

    DEFF Research Database (Denmark)

    Schoeberl, Martin; Thalinger, Christian; Korsholm, Stephan

    2008-01-01

    Java, as a safe and platform independent language, avoids access to low-level I/O devices or direct memory access. In standard Java, low-level I/O it not a concern; it is handled by the operating system. However, in the embedded domain resources are scarce and a Java virtual machine (JVM) without...... an underlying middleware is an attractive architecture. When running the JVM on bare metal, we need access to I/O devices from Java; therefore we investigate a safe and efficient mechanism to represent I/O devices as first class Java objects, where device registers are represented by object fields. Access...... to those registers is safe as Java’s type system regulates it. The access is also fast as it is directly performed by the bytecodes getfield and putfield. Hardware objects thus provide an object-oriented abstraction of low-level hardware devices. As a proof of concept, we have implemented hardware objects...

  16. Computer hardware fault administration

    Science.gov (United States)

    Archer, Charles J.; Megerian, Mark G.; Ratterman, Joseph D.; Smith, Brian E.

    2010-09-14

    Computer hardware fault administration carried out in a parallel computer, where the parallel computer includes a plurality of compute nodes. The compute nodes are coupled for data communications by at least two independent data communications networks, where each data communications network includes data communications links connected to the compute nodes. Typical embodiments carry out hardware fault administration by identifying a location of a defective link in the first data communications network of the parallel computer and routing communications data around the defective link through the second data communications network of the parallel computer.

  17. Dynamically-Loaded Hardware Libraries (HLL) Technology for Audio Applications

    DEFF Research Database (Denmark)

    Esposito, A.; Lomuscio, A.; Nunzio, L. Di

    2016-01-01

    In this work, we apply hardware acceleration to embedded systems running audio applications. We present a new framework, Dynamically-Loaded Hardware Libraries or HLL, to dynamically load hardware libraries on reconfigurable platforms (FPGAs). Provided a library of application-specific processors......, we load on-the-fly the specific processor in the FPGA, and we transfer the execution from the CPU to the FPGA-based accelerator. The proposed architecture provides excellent flexibility with respect to the different audio applications implemented, high quality audio, and an energy efficient solution....

  18. The LASS hardware processor

    International Nuclear Information System (INIS)

    Kunz, P.F.

    1976-01-01

    The problems of data analysis with hardware processors are reviewed and a description is given of a programmable processor. This processor, the 168/E, has been designed for use in the LASS multi-processor system; it has an execution speed comparable to the IBM 370/168 and uses the subset of IBM 370 instructions appropriate to the LASS analysis task. (Auth.)

  19. CERN Neutrino Platform Hardware

    CERN Document Server

    Nelson, Kevin

    2017-01-01

    My summer research was broadly in CERN's neutrino platform hardware efforts. This project had two main components: detector assembly and data analysis work for ICARUS. Specifically, I worked on assembly for the ProtoDUNE project and monitored the safety of ICARUS as it was transported to Fermilab by analyzing the accelerometer data from its move.

  20. RRFC hardware operation manual

    International Nuclear Information System (INIS)

    Abhold, M.E.; Hsue, S.T.; Menlove, H.O.; Walton, G.

    1996-05-01

    The Research Reactor Fuel Counter (RRFC) system was developed to assay the 235 U content in spent Material Test Reactor (MTR) type fuel elements underwater in a spent fuel pool. RRFC assays the 235 U content using active neutron coincidence counting and also incorporates an ion chamber for gross gamma-ray measurements. This manual describes RRFC hardware, including detectors, electronics, and performance characteristics

  1. Scalable photoreactor for hydrogen production

    KAUST Repository

    Takanabe, Kazuhiro; Shinagawa, Tatsuya

    2017-01-01

    Provided herein are scalable photoreactors that can include a membrane-free water- splitting electrolyzer and systems that can include a plurality of membrane-free water- splitting electrolyzers. Also provided herein are methods of using the scalable photoreactors provided herein.

  2. Scalable photoreactor for hydrogen production

    KAUST Repository

    Takanabe, Kazuhiro

    2017-04-06

    Provided herein are scalable photoreactors that can include a membrane-free water- splitting electrolyzer and systems that can include a plurality of membrane-free water- splitting electrolyzers. Also provided herein are methods of using the scalable photoreactors provided herein.

  3. Computational scalability of large size image dissemination

    Science.gov (United States)

    Kooper, Rob; Bajcsy, Peter

    2011-01-01

    We have investigated the computational scalability of image pyramid building needed for dissemination of very large image data. The sources of large images include high resolution microscopes and telescopes, remote sensing and airborne imaging, and high resolution scanners. The term 'large' is understood from a user perspective which means either larger than a display size or larger than a memory/disk to hold the image data. The application drivers for our work are digitization projects such as the Lincoln Papers project (each image scan is about 100-150MB or about 5000x8000 pixels with the total number to be around 200,000) and the UIUC library scanning project for historical maps from 17th and 18th century (smaller number but larger images). The goal of our work is understand computational scalability of the web-based dissemination using image pyramids for these large image scans, as well as the preservation aspects of the data. We report our computational benchmarks for (a) building image pyramids to be disseminated using the Microsoft Seadragon library, (b) a computation execution approach using hyper-threading to generate image pyramids and to utilize the underlying hardware, and (c) an image pyramid preservation approach using various hard drive configurations of Redundant Array of Independent Disks (RAID) drives for input/output operations. The benchmarks are obtained with a map (334.61 MB, JPEG format, 17591x15014 pixels). The discussion combines the speed and preservation objectives.

  4. Hardware Algorithms For Tile-Based Real-Time Rendering

    NARCIS (Netherlands)

    Crisu, D.

    2012-01-01

    In this dissertation, we present the GRAphics AcceLerator (GRAAL) framework for developing embedded tile-based rasterization hardware for mobile devices, meant to accelerate real-time 3-D graphics (OpenGL compliant) applications. The goal of the framework is a low-cost, low-power, high-performance

  5. Sterilization of space hardware.

    Science.gov (United States)

    Pflug, I. J.

    1971-01-01

    Discussion of various techniques of sterilization of space flight hardware using either destructive heating or the action of chemicals. Factors considered in the dry-heat destruction of microorganisms include the effects of microbial water content, temperature, the physicochemical properties of the microorganism and adjacent support, and nature of the surrounding gas atmosphere. Dry-heat destruction rates of microorganisms on the surface, between mated surface areas, or buried in the solid material of space vehicle hardware are reviewed, along with alternative dry-heat sterilization cycles, thermodynamic considerations, and considerations of final sterilization-process design. Discussed sterilization chemicals include ethylene oxide, formaldehyde, methyl bromide, dimethyl sulfoxide, peracetic acid, and beta-propiolactone.

  6. Hardware characteristic and application

    International Nuclear Information System (INIS)

    Gu, Dong Hyeon

    1990-03-01

    The contents of this book are system board on memory, performance, system timer system click and specification, coprocessor such as programing interface and hardware interface, power supply on input and output, protection for DC output, Power Good signal, explanation on 84 keyboard and 101/102 keyboard,BIOS system, 80286 instruction set and 80287 coprocessor, characters, keystrokes and colors, communication and compatibility of IBM personal computer on application direction, multitasking and code for distinction of system.

  7. Scalable Nanomanufacturing—A Review

    Directory of Open Access Journals (Sweden)

    Khershed Cooper

    2017-01-01

    Full Text Available This article describes the field of scalable nanomanufacturing, its importance and need, its research activities and achievements. The National Science Foundation is taking a leading role in fostering basic research in scalable nanomanufacturing (SNM. From this effort several novel nanomanufacturing approaches have been proposed, studied and demonstrated, including scalable nanopatterning. This paper will discuss SNM research areas in materials, processes and applications, scale-up methods with project examples, and manufacturing challenges that need to be addressed to move nanotechnology discoveries closer to the marketplace.

  8. Scalable Nonlinear Compact Schemes

    Energy Technology Data Exchange (ETDEWEB)

    Ghosh, Debojyoti [Argonne National Lab. (ANL), Argonne, IL (United States); Constantinescu, Emil M. [Univ. of Chicago, IL (United States); Brown, Jed [Univ. of Colorado, Boulder, CO (United States)

    2014-04-01

    In this work, we focus on compact schemes resulting in tridiagonal systems of equations, specifically the fifth-order CRWENO scheme. We propose a scalable implementation of the nonlinear compact schemes by implementing a parallel tridiagonal solver based on the partitioning/substructuring approach. We use an iterative solver for the reduced system of equations; however, we solve this system to machine zero accuracy to ensure that no parallelization errors are introduced. It is possible to achieve machine-zero convergence with few iterations because of the diagonal dominance of the system. The number of iterations is specified a priori instead of a norm-based exit criterion, and collective communications are avoided. The overall algorithm thus involves only point-to-point communication between neighboring processors. Our implementation of the tridiagonal solver differs from and avoids the drawbacks of past efforts in the following ways: it introduces no parallelization-related approximations (multiprocessor solutions are exactly identical to uniprocessor ones), it involves minimal communication, the mathematical complexity is similar to that of the Thomas algorithm on a single processor, and it does not require any communication and computation scheduling.

  9. COMPUTER HARDWARE MARKING

    CERN Multimedia

    Groupe de protection des biens

    2000-01-01

    As part of the campaign to protect CERN property and for insurance reasons, all computer hardware belonging to the Organization must be marked with the words 'PROPRIETE CERN'.IT Division has recently introduced a new marking system that is both economical and easy to use. From now on all desktop hardware (PCs, Macintoshes, printers) issued by IT Division with a value equal to or exceeding 500 CHF will be marked using this new system.For equipment that is already installed but not yet marked, including UNIX workstations and X terminals, IT Division's Desktop Support Service offers the following services free of charge:Equipment-marking wherever the Service is called out to perform other work (please submit all work requests to the IT Helpdesk on 78888 or helpdesk@cern.ch; for unavoidable operational reasons, the Desktop Support Service will only respond to marking requests when these coincide with requests for other work such as repairs, system upgrades, etc.);Training of personnel designated by Division Leade...

  10. Foundations of hardware IP protection

    CERN Document Server

    Torres, Lionel

    2017-01-01

    This book provides a comprehensive and up-to-date guide to the design of security-hardened, hardware intellectual property (IP). Readers will learn how IP can be threatened, as well as protected, by using means such as hardware obfuscation/camouflaging, watermarking, fingerprinting (PUF), functional locking, remote activation, hidden transmission of data, hardware Trojan detection, protection against hardware Trojan, use of secure element, ultra-lightweight cryptography, and digital rights management. This book serves as a single-source reference to design space exploration of hardware security and IP protection. · Provides readers with a comprehensive overview of hardware intellectual property (IP) security, describing threat models and presenting means of protection, from integrated circuit layout to digital rights management of IP; · Enables readers to transpose techniques fundamental to digital rights management (DRM) to the realm of hardware IP security; · Introduce designers to the concept of salutar...

  11. Open hardware for open science

    CERN Multimedia

    CERN Bulletin

    2011-01-01

    Inspired by the open source software movement, the Open Hardware Repository was created to enable hardware developers to share the results of their R&D activities. The recently published CERN Open Hardware Licence offers the legal framework to support this knowledge and technology exchange.   Two years ago, a group of electronics designers led by Javier Serrano, a CERN engineer, working in experimental physics laboratories created the Open Hardware Repository (OHR). This project was initiated in order to facilitate the exchange of hardware designs across the community in line with the ideals of “open science”. The main objectives include avoiding duplication of effort by sharing results across different teams that might be working on the same need. “For hardware developers, the advantages of open hardware are numerous. For example, it is a great learning tool for technologies some developers would not otherwise master, and it avoids unnecessary work if someone ha...

  12. A scalable healthcare information system based on a service-oriented architecture.

    Science.gov (United States)

    Yang, Tzu-Hsiang; Sun, Yeali S; Lai, Feipei

    2011-06-01

    Many existing healthcare information systems are composed of a number of heterogeneous systems and face the important issue of system scalability. This paper first describes the comprehensive healthcare information systems used in National Taiwan University Hospital (NTUH) and then presents a service-oriented architecture (SOA)-based healthcare information system (HIS) based on the service standard HL7. The proposed architecture focuses on system scalability, in terms of both hardware and software. Moreover, we describe how scalability is implemented in rightsizing, service groups, databases, and hardware scalability. Although SOA-based systems sometimes display poor performance, through a performance evaluation of our HIS based on SOA, the average response time for outpatient, inpatient, and emergency HL7Central systems are 0.035, 0.04, and 0.036 s, respectively. The outpatient, inpatient, and emergency WebUI average response times are 0.79, 1.25, and 0.82 s. The scalability of the rightsizing project and our evaluation results show that the SOA HIS we propose provides evidence that SOA can provide system scalability and sustainability in a highly demanding healthcare information system.

  13. A versatile scalable PET processing system

    International Nuclear Information System (INIS)

    Dong, H.; Weisenberger, A.; McKisson, J.; Wenze, Xi; Cuevas, C.; Wilson, J.; Zukerman, L.

    2011-01-01

    Positron Emission Tomography (PET) historically has major clinical and preclinical applications in cancerous oncology, neurology, and cardiovascular diseases. Recently, in a new direction, an application specific PET system is being developed at Thomas Jefferson National Accelerator Facility (Jefferson Lab) in collaboration with Duke University, University of Maryland at Baltimore (UMAB), and West Virginia University (WVU) targeted for plant eco-physiology research. The new plant imaging PET system is versatile and scalable such that it could adapt to several plant imaging needs - imaging many important plant organs including leaves, roots, and stems. The mechanical arrangement of the detectors is designed to accommodate the unpredictable and random distribution in space of the plant organs without requiring the plant be disturbed. Prototyping such a system requires a new data acquisition system (DAQ) and data processing system which are adaptable to the requirements of these unique and versatile detectors.

  14. Hardware Support for Embedded Java

    DEFF Research Database (Denmark)

    Schoeberl, Martin

    2012-01-01

    The general Java runtime environment is resource hungry and unfriendly for real-time systems. To reduce the resource consumption of Java in embedded systems, direct hardware support of the language is a valuable option. Furthermore, an implementation of the Java virtual machine in hardware enables...... worst-case execution time analysis of Java programs. This chapter gives an overview of current approaches to hardware support for embedded and real-time Java....

  15. HARDWARE TROJAN IDENTIFICATION AND DETECTION

    OpenAIRE

    Samer Moein; Fayez Gebali; T. Aaron Gulliver; Abdulrahman Alkandari

    2017-01-01

    ABSTRACT The majority of techniques developed to detect hardware trojans are based on specific attributes. Further, the ad hoc approaches employed to design methods for trojan detection are largely ineffective. Hardware trojans have a number of attributes which can be used to systematically develop detection techniques. Based on this concept, a detailed examination of current trojan detection techniques and the characteristics of existing hardware trojans is presented. This is used to dev...

  16. Scalable algorithms for contact problems

    CERN Document Server

    Dostál, Zdeněk; Sadowská, Marie; Vondrák, Vít

    2016-01-01

    This book presents a comprehensive and self-contained treatment of the authors’ newly developed scalable algorithms for the solutions of multibody contact problems of linear elasticity. The brand new feature of these algorithms is theoretically supported numerical scalability and parallel scalability demonstrated on problems discretized by billions of degrees of freedom. The theory supports solving multibody frictionless contact problems, contact problems with possibly orthotropic Tresca’s friction, and transient contact problems. It covers BEM discretization, jumping coefficients, floating bodies, mortar non-penetration conditions, etc. The exposition is divided into four parts, the first of which reviews appropriate facets of linear algebra, optimization, and analysis. The most important algorithms and optimality results are presented in the third part of the volume. The presentation is complete, including continuous formulation, discretization, decomposition, optimality results, and numerical experimen...

  17. Hardware assisted hypervisor introspection.

    Science.gov (United States)

    Shi, Jiangyong; Yang, Yuexiang; Tang, Chuan

    2016-01-01

    In this paper, we introduce hypervisor introspection, an out-of-box way to monitor the execution of hypervisors. Similar to virtual machine introspection which has been proposed to protect virtual machines in an out-of-box way over the past decade, hypervisor introspection can be used to protect hypervisors which are the basis of cloud security. Virtual machine introspection tools are usually deployed either in hypervisor or in privileged virtual machines, which might also be compromised. By utilizing hardware support including nested virtualization, EPT protection and #BP, we are able to monitor all hypercalls belongs to the virtual machines of one hypervisor, include that of privileged virtual machine and even when the hypervisor is compromised. What's more, hypercall injection method is used to simulate hypercall-based attacks and evaluate the performance of our method. Experiment results show that our method can effectively detect hypercall-based attacks with some performance cost. Lastly, we discuss our furture approaches of reducing the performance cost and preventing the compromised hypervisor from detecting the existence of our introspector, in addition with some new scenarios to apply our hypervisor introspection system.

  18. LHCb: Hardware Data Injector

    CERN Multimedia

    Delord, V; Neufeld, N

    2009-01-01

    The LHCb High Level Trigger and Data Acquisition system selects about 2 kHz of events out of the 1 MHz of events, which have been selected previously by the first-level hardware trigger. The selected events are consolidated into files and then sent to permanent storage for subsequent analysis on the Grid. The goal of the upgrade of the LHCb readout is to lift the limitation to 1 MHz. This means speeding up the DAQ to 40 MHz. Such a DAQ system will certainly employ 10 Gigabit or technologies and might also need new networking protocols: a customized TCP or proprietary solutions. A test module is being presented, which integrates in the existing LHCb infrastructure. It is a 10-Gigabit traffic generator, flexible enough to generate LHCb's raw data packets using dummy data or simulated data. These data are seen as real data coming from sub-detectors by the DAQ. The implementation is based on an FPGA using 10 Gigabit Ethernet interface. This module is integrated in the experiment control system. The architecture, ...

  19. Scalability Modeling for Optimal Provisioning of Data Centers in Telenor: A better balance between under- and over-provisioning

    OpenAIRE

    Rygg, Knut Helge

    2012-01-01

    The scalability of an information system describes the relationship between system ca-pacity and system size. This report studies the scalability of Microsoft Lync Server 2010 in order to provide guidelines for provisioning hardware resources. Optimal pro-visioning is required to reduce both deployment and operational costs, while keeping an acceptable service quality.All Lync servers in the test setup are virtualizedusingVMware ESXi 5.0 and the system runs on a Cisco Unified Computing System...

  20. Scalable cloud without dedicated storage

    Science.gov (United States)

    Batkovich, D. V.; Kompaniets, M. V.; Zarochentsev, A. K.

    2015-05-01

    We present a prototype of a scalable computing cloud. It is intended to be deployed on the basis of a cluster without the separate dedicated storage. The dedicated storage is replaced by the distributed software storage. In addition, all cluster nodes are used both as computing nodes and as storage nodes. This solution increases utilization of the cluster resources as well as improves fault tolerance and performance of the distributed storage. Another advantage of this solution is high scalability with a relatively low initial and maintenance cost. The solution is built on the basis of the open source components like OpenStack, CEPH, etc.

  1. Hardware for soft computing and soft computing for hardware

    CERN Document Server

    Nedjah, Nadia

    2014-01-01

    Single and Multi-Objective Evolutionary Computation (MOEA),  Genetic Algorithms (GAs), Artificial Neural Networks (ANNs), Fuzzy Controllers (FCs), Particle Swarm Optimization (PSO) and Ant colony Optimization (ACO) are becoming omnipresent in almost every intelligent system design. Unfortunately, the application of the majority of these techniques is complex and so requires a huge computational effort to yield useful and practical results. Therefore, dedicated hardware for evolutionary, neural and fuzzy computation is a key issue for designers. With the spread of reconfigurable hardware such as FPGAs, digital as well as analog hardware implementations of such computation become cost-effective. The idea behind this book is to offer a variety of hardware designs for soft computing techniques that can be embedded in any final product. Also, to introduce the successful application of soft computing technique to solve many hard problem encountered during the design of embedded hardware designs. Reconfigurable em...

  2. FY1995 evolvable hardware chip; 1995 nendo shinkasuru hardware chip

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-03-01

    This project aims at the development of 'Evolvable Hardware' (EHW) which can adapt its hardware structure to the environment to attain better hardware performance, under the control of genetic algorithms. EHW is a key technology to explore the new application area requiring real-time performance and on-line adaptation. 1. Development of EHW-LSI for function level hardware evolution, which includes 15 DSPs in one chip. 2. Application of the EHW to the practical industrial applications such as data compression, ATM control, digital mobile communication. 3. Two patents : (1) the architecture and the processing method for programmable EHW-LSI. (2) The method of data compression for loss-less data, using EHW. 4. The first international conference for evolvable hardware was held by authors: Intl. Conf. on Evolvable Systems (ICES96). It was determined at ICES96 that ICES will be held every two years between Japan and Europe. So the new society has been established by us. (NEDO)

  3. Reconfigurable Hardware for Compressing Hyperspectral Image Data

    Science.gov (United States)

    Aranki, Nazeeh; Namkung, Jeffrey; Villapando, Carlos; Kiely, Aaron; Klimesh, Matthew; Xie, Hua

    2010-01-01

    High-speed, low-power, reconfigurable electronic hardware has been developed to implement ICER-3D, an algorithm for compressing hyperspectral-image data. The algorithm and parts thereof have been the topics of several NASA Tech Briefs articles, including Context Modeler for Wavelet Compression of Hyperspectral Images (NPO-43239) and ICER-3D Hyperspectral Image Compression Software (NPO-43238), which appear elsewhere in this issue of NASA Tech Briefs. As described in more detail in those articles, the algorithm includes three main subalgorithms: one for computing wavelet transforms, one for context modeling, and one for entropy encoding. For the purpose of designing the hardware, these subalgorithms are treated as modules to be implemented efficiently in field-programmable gate arrays (FPGAs). The design takes advantage of industry- standard, commercially available FPGAs. The implementation targets the Xilinx Virtex II pro architecture, which has embedded PowerPC processor cores with flexible on-chip bus architecture. It incorporates an efficient parallel and pipelined architecture to compress the three-dimensional image data. The design provides for internal buffering to minimize intensive input/output operations while making efficient use of offchip memory. The design is scalable in that the subalgorithms are implemented as independent hardware modules that can be combined in parallel to increase throughput. The on-chip processor manages the overall operation of the compression system, including execution of the top-level control functions as well as scheduling, initiating, and monitoring processes. The design prototype has been demonstrated to be capable of compressing hyperspectral data at a rate of 4.5 megasamples per second at a conservative clock frequency of 50 MHz, with a potential for substantially greater throughput at a higher clock frequency. The power consumption of the prototype is less than 6.5 W. The reconfigurability (by means of reprogramming) of

  4. Implementation of the Timepix ASIC in the Scalable Readout System

    Energy Technology Data Exchange (ETDEWEB)

    Lupberger, M., E-mail: lupberger@physik.uni-bonn.de; Desch, K.; Kaminski, J.

    2016-09-11

    We report on the development of electronics hardware, FPGA firmware and software to provide a flexible multi-chip readout of the Timepix ASIC within the framework of the Scalable Readout System (SRS). The system features FPGA-based zero-suppression and the possibility to read out up to 4×8 chips with a single Front End Concentrator (FEC). By operating several FECs in parallel, in principle an arbitrary number of chips can be read out, exploiting the scaling features of SRS. Specifically, we tested the system with a setup consisting of 160 Timepix ASICs, operated as GridPix devices in a large TPC field cage in a 1 T magnetic field at a DESY test beam facility providing an electron beam of up to 6 GeV. We discuss the design choices, the dedicated hardware components, the FPGA firmware as well as the performance of the system in the test beam.

  5. Trainable hardware for dynamical computing using error backpropagation through physical media.

    Science.gov (United States)

    Hermans, Michiel; Burm, Michaël; Van Vaerenbergh, Thomas; Dambre, Joni; Bienstman, Peter

    2015-03-24

    Neural networks are currently implemented on digital Von Neumann machines, which do not fully leverage their intrinsic parallelism. We demonstrate how to use a novel class of reconfigurable dynamical systems for analogue information processing, mitigating this problem. Our generic hardware platform for dynamic, analogue computing consists of a reciprocal linear dynamical system with nonlinear feedback. Thanks to reciprocity, a ubiquitous property of many physical phenomena like the propagation of light and sound, the error backpropagation-a crucial step for tuning such systems towards a specific task-can happen in hardware. This can potentially speed up the optimization process significantly, offering important benefits for the scalability of neuro-inspired hardware. In this paper, we show, using one experimentally validated and one conceptual example, that such systems may provide a straightforward mechanism for constructing highly scalable, fully dynamical analogue computers.

  6. Compact FPGA hardware architecture for public key encryption in embedded devices.

    Science.gov (United States)

    Rodríguez-Flores, Luis; Morales-Sandoval, Miguel; Cumplido, René; Feregrino-Uribe, Claudia; Algredo-Badillo, Ignacio

    2018-01-01

    Security is a crucial requirement in the envisioned applications of the Internet of Things (IoT), where most of the underlying computing platforms are embedded systems with reduced computing capabilities and energy constraints. In this paper we present the design and evaluation of a scalable low-area FPGA hardware architecture that serves as a building block to accelerate the costly operations of exponentiation and multiplication in [Formula: see text], commonly required in security protocols relying on public key encryption, such as in key agreement, authentication and digital signature. The proposed design can process operands of different size using the same datapath, which exhibits a significant reduction in area without loss of efficiency if compared to representative state of the art designs. For example, our design uses 96% less standard logic than a similar design optimized for performance, and 46% less resources than other design optimized for area. Even using fewer area resources, our design still performs better than its embedded software counterparts (190x and 697x).

  7. Secure coupling of hardware components

    NARCIS (Netherlands)

    Hoepman, J.H.; Joosten, H.J.M.; Knobbe, J.W.

    2011-01-01

    A method and a system for securing communication between at least a first and a second hardware components of a mobile device is described. The method includes establishing a first shared secret between the first and the second hardware components during an initialization of the mobile device and,

  8. Scalable shared-memory multiprocessing

    CERN Document Server

    Lenoski, Daniel E

    1995-01-01

    Dr. Lenoski and Dr. Weber have experience with leading-edge research and practical issues involved in implementing large-scale parallel systems. They were key contributors to the architecture and design of the DASH multiprocessor. Currently, they are involved with commercializing scalable shared-memory technology.

  9. Hardware support for CSP on a Java chip multiprocessor

    DEFF Research Database (Denmark)

    Gruian, Flavius; Schoeberl, Martin

    2013-01-01

    Due to memory bandwidth limitations, chip multiprocessors (CMPs) adopting the convenient shared memory model for their main memory architecture scale poorly. On-chip core-to-core communication is a solution to this problem, that can lead to further performance increase for a number of multithreaded...... applications. Programmatically, the Communicating Sequential Processes (CSPs) paradigm provides a sound computational model for such an architecture with message based communication. In this paper we explore hardware support for CSP in the context of an embedded Java CMP. The hardware support for CSP are on......-chip communication channels, implemented by a ring-based network-on-chip (NoC), to reduce the memory bandwidth pressure on the shared memory.The presented solution is scalable and also specific for our limited resources and real-time predictability requirements. CMP architectures of three to eight processors were...

  10. Binary Associative Memories as a Benchmark for Spiking Neuromorphic Hardware

    Directory of Open Access Journals (Sweden)

    Andreas Stöckel

    2017-08-01

    Full Text Available Large-scale neuromorphic hardware platforms, specialized computer systems for energy efficient simulation of spiking neural networks, are being developed around the world, for example as part of the European Human Brain Project (HBP. Due to conceptual differences, a universal performance analysis of these systems in terms of runtime, accuracy and energy efficiency is non-trivial, yet indispensable for further hard- and software development. In this paper we describe a scalable benchmark based on a spiking neural network implementation of the binary neural associative memory. We treat neuromorphic hardware and software simulators as black-boxes and execute exactly the same network description across all devices. Experiments on the HBP platforms under varying configurations of the associative memory show that the presented method allows to test the quality of the neuron model implementation, and to explain significant deviations from the expected reference output.

  11. The TOTEM DAQ based on the Scalable Readout System (SRS)

    Science.gov (United States)

    Quinto, Michele; Cafagna, Francesco S.; Fiergolski, Adrian; Radicioni, Emilio

    2018-02-01

    The TOTEM (TOTal cross section, Elastic scattering and diffraction dissociation Measurement at the LHC) experiment at LHC, has been designed to measure the total proton-proton cross-section and study the elastic and diffractive scattering at the LHC energies. In order to cope with the increased machine luminosity and the higher statistic required by the extension of the TOTEM physics program, approved for the LHC's Run Two phase, the previous VME based data acquisition system has been replaced with a new one based on the Scalable Readout System. The system features an aggregated data throughput of 2GB / s towards the online storage system. This makes it possible to sustain a maximum trigger rate of ˜ 24kHz, to be compared with the 1KHz rate of the previous system. The trigger rate is further improved by implementing zero-suppression and second-level hardware algorithms in the Scalable Readout System. The new system fulfils the requirements for an increased efficiency, providing higher bandwidth, and increasing the purity of the data recorded. Moreover full compatibility has been guaranteed with the legacy front-end hardware, as well as with the DAQ interface of the CMS experiment and with the LHC's Timing, Trigger and Control distribution system. In this contribution we describe in detail the architecture of full system and its performance measured during the commissioning phase at the LHC Interaction Point.

  12. NDAS Hardware Translation Layer Development

    Science.gov (United States)

    Nazaretian, Ryan N.; Holladay, Wendy T.

    2011-01-01

    The NASA Data Acquisition System (NDAS) project is aimed to replace all DAS software for NASA s Rocket Testing Facilities. There must be a software-hardware translation layer so the software can properly talk to the hardware. Since the hardware from each test stand varies, drivers for each stand have to be made. These drivers will act more like plugins for the software. If the software is being used in E3, then the software should point to the E3 driver package. If the software is being used at B2, then the software should point to the B2 driver package. The driver packages should also be filled with hardware drivers that are universal to the DAS system. For example, since A1, A2, and B2 all use the Preston 8300AU signal conditioners, then the driver for those three stands should be the same and updated collectively.

  13. Hardware for dynamic quantum computing.

    Science.gov (United States)

    Ryan, Colm A; Johnson, Blake R; Ristè, Diego; Donovan, Brian; Ohki, Thomas A

    2017-10-01

    We describe the hardware, gateware, and software developed at Raytheon BBN Technologies for dynamic quantum information processing experiments on superconducting qubits. In dynamic experiments, real-time qubit state information is fed back or fed forward within a fraction of the qubits' coherence time to dynamically change the implemented sequence. The hardware presented here covers both control and readout of superconducting qubits. For readout, we created a custom signal processing gateware and software stack on commercial hardware to convert pulses in a heterodyne receiver into qubit state assignments with minimal latency, alongside data taking capability. For control, we developed custom hardware with gateware and software for pulse sequencing and steering information distribution that is capable of arbitrary control flow in a fraction of superconducting qubit coherence times. Both readout and control platforms make extensive use of field programmable gate arrays to enable tailored qubit control systems in a reconfigurable fabric suitable for iterative development.

  14. Accelerating cardiac bidomain simulations using graphics processing units.

    Science.gov (United States)

    Neic, A; Liebmann, M; Hoetzl, E; Mitchell, L; Vigmond, E J; Haase, G; Plank, G

    2012-08-01

    Anatomically realistic and biophysically detailed multiscale computer models of the heart are playing an increasingly important role in advancing our understanding of integrated cardiac function in health and disease. Such detailed simulations, however, are computationally vastly demanding, which is a limiting factor for a wider adoption of in-silico modeling. While current trends in high-performance computing (HPC) hardware promise to alleviate this problem, exploiting the potential of such architectures remains challenging since strongly scalable algorithms are necessitated to reduce execution times. Alternatively, acceleration technologies such as graphics processing units (GPUs) are being considered. While the potential of GPUs has been demonstrated in various applications, benefits in the context of bidomain simulations where large sparse linear systems have to be solved in parallel with advanced numerical techniques are less clear. In this study, the feasibility of multi-GPU bidomain simulations is demonstrated by running strong scalability benchmarks using a state-of-the-art model of rabbit ventricles. The model is spatially discretized using the finite element methods (FEM) on fully unstructured grids. The GPU code is directly derived from a large pre-existing code, the Cardiac Arrhythmia Research Package (CARP), with very minor perturbation of the code base. Overall, bidomain simulations were sped up by a factor of 11.8 to 16.3 in benchmarks running on 6-20 GPUs compared to the same number of CPU cores. To match the fastest GPU simulation which engaged 20 GPUs, 476 CPU cores were required on a national supercomputing facility.

  15. Space Situational Awareness Data Processing Scalability Utilizing Google Cloud Services

    Science.gov (United States)

    Greenly, D.; Duncan, M.; Wysack, J.; Flores, F.

    Space Situational Awareness (SSA) is a fundamental and critical component of current space operations. The term SSA encompasses the awareness, understanding and predictability of all objects in space. As the population of orbital space objects and debris increases, the number of collision avoidance maneuvers grows and prompts the need for accurate and timely process measures. The SSA mission continually evolves to near real-time assessment and analysis demanding the need for higher processing capabilities. By conventional methods, meeting these demands requires the integration of new hardware to keep pace with the growing complexity of maneuver planning algorithms. SpaceNav has implemented a highly scalable architecture that will track satellites and debris by utilizing powerful virtual machines on the Google Cloud Platform. SpaceNav algorithms for processing CDMs outpace conventional means. A robust processing environment for tracking data, collision avoidance maneuvers and various other aspects of SSA can be created and deleted on demand. Migrating SpaceNav tools and algorithms into the Google Cloud Platform will be discussed and the trials and tribulations involved. Information will be shared on how and why certain cloud products were used as well as integration techniques that were implemented. Key items to be presented are: 1.Scientific algorithms and SpaceNav tools integrated into a scalable architecture a) Maneuver Planning b) Parallel Processing c) Monte Carlo Simulations d) Optimization Algorithms e) SW Application Development/Integration into the Google Cloud Platform 2. Compute Engine Processing a) Application Engine Automated Processing b) Performance testing and Performance Scalability c) Cloud MySQL databases and Database Scalability d) Cloud Data Storage e) Redundancy and Availability

  16. FY1995 evolvable hardware chip; 1995 nendo shinkasuru hardware chip

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-03-01

    This project aims at the development of 'Evolvable Hardware' (EHW) which can adapt its hardware structure to the environment to attain better hardware performance, under the control of genetic algorithms. EHW is a key technology to explore the new application area requiring real-time performance and on-line adaptation. 1. Development of EHW-LSI for function level hardware evolution, which includes 15 DSPs in one chip. 2. Application of the EHW to the practical industrial applications such as data compression, ATM control, digital mobile communication. 3. Two patents : (1) the architecture and the processing method for programmable EHW-LSI. (2) The method of data compression for loss-less data, using EHW. 4. The first international conference for evolvable hardware was held by authors: Intl. Conf. on Evolvable Systems (ICES96). It was determined at ICES96 that ICES will be held every two years between Japan and Europe. So the new society has been established by us. (NEDO)

  17. Hardware Middleware for Person Tracking on Embedded Distributed Smart Cameras

    Directory of Open Access Journals (Sweden)

    Ali Akbar Zarezadeh

    2012-01-01

    Full Text Available Tracking individuals is a prominent application in such domains like surveillance or smart environments. This paper provides a development of a multiple camera setup with jointed view that observes moving persons in a site. It focuses on a geometry-based approach to establish correspondence among different views. The expensive computational parts of the tracker are hardware accelerated via a novel system-on-chip (SoC design. In conjunction with this vision application, a hardware object request broker (ORB middleware is presented as the underlying communication system. The hardware ORB provides a hardware/software architecture to achieve real-time intercommunication among multiple smart cameras. Via a probing mechanism, a performance analysis is performed to measure network latencies, that is, time traversing the TCP/IP stack, in both software and hardware ORB approaches on the same smart camera platform. The empirical results show that using the proposed hardware ORB as client and server in separate smart camera nodes will considerably reduce the network latency up to 100 times compared to the software ORB.

  18. Palacios and Kitten : high performance operating systems for scalable virtualized and native supercomputing.

    Energy Technology Data Exchange (ETDEWEB)

    Widener, Patrick (University of New Mexico); Jaconette, Steven (Northwestern University); Bridges, Patrick G. (University of New Mexico); Xia, Lei (Northwestern University); Dinda, Peter (Northwestern University); Cui, Zheng.; Lange, John (Northwestern University); Hudson, Trammell B.; Levenhagen, Michael J.; Pedretti, Kevin Thomas Tauke; Brightwell, Ronald Brian

    2009-09-01

    Palacios and Kitten are new open source tools that enable applications, whether ported or not, to achieve scalable high performance on large machines. They provide a thin layer over the hardware to support both full-featured virtualized environments and native code bases. Kitten is an OS under development at Sandia that implements a lightweight kernel architecture to provide predictable behavior and increased flexibility on large machines, while also providing Linux binary compatibility. Palacios is a VMM that is under development at Northwestern University and the University of New Mexico. Palacios, which can be embedded into Kitten and other OSes, supports existing, unmodified applications and operating systems by using virtualization that leverages hardware technologies. We describe the design and implementation of both Kitten and Palacios. Our benchmarks show that they provide near native, scalable performance. Palacios and Kitten provide an incremental path to using supercomputer resources that is not performance-compromised.

  19. Advanced technologies for scalable ATLAS conditions database access on the grid

    CERN Document Server

    Basset, R; Dimitrov, G; Girone, M; Hawkings, R; Nevski, P; Valassi, A; Vaniachine, A; Viegas, F; Walker, R; Wong, A

    2010-01-01

    During massive data reprocessing operations an ATLAS Conditions Database application must support concurrent access from numerous ATLAS data processing jobs running on the Grid. By simulating realistic work-flow, ATLAS database scalability tests provided feedback for Conditions Db software optimization and allowed precise determination of required distributed database resources. In distributed data processing one must take into account the chaotic nature of Grid computing characterized by peak loads, which can be much higher than average access rates. To validate database performance at peak loads, we tested database scalability at very high concurrent jobs rates. This has been achieved through coordinated database stress tests performed in series of ATLAS reprocessing exercises at the Tier-1 sites. The goal of database stress tests is to detect scalability limits of the hardware deployed at the Tier-1 sites, so that the server overload conditions can be safely avoided in a production environment. Our analysi...

  20. Scalable Techniques for Formal Verification

    CERN Document Server

    Ray, Sandip

    2010-01-01

    This book presents state-of-the-art approaches to formal verification techniques to seamlessly integrate different formal verification methods within a single logical foundation. It should benefit researchers and practitioners looking to get a broad overview of the spectrum of formal verification techniques, as well as approaches to combining such techniques within a single framework. Coverage includes a range of case studies showing how such combination is fruitful in developing a scalable verification methodology for industrial designs. This book outlines both theoretical and practical issue

  1. Developing Scalable Information Security Systems

    Directory of Open Access Journals (Sweden)

    Valery Konstantinovich Ablekov

    2013-06-01

    Full Text Available Existing physical security systems has wide range of lacks, including: high cost, a large number of vulnerabilities, problems of modification and support system. This paper covers an actual problem of developing systems without this list of drawbacks. The paper presents the architecture of the information security system, which operates through the network protocol TCP/IP, including the ability to connect different types of devices and integration with existing security systems. The main advantage is a significant increase in system reliability, scalability, both vertically and horizontally, with minimal cost of both financial and time resources.

  2. Apple-CORE: Microgrids of SVP cores: flexible, general-purpose, fine-grained hardware concurrency management

    NARCIS (Netherlands)

    Poss, R.; Lankamp, M.; Yang, Q.; Fu, J.; van Tol, M.W.; Jesshope, C.; Nair, S.

    2012-01-01

    To harness the potential of CMPs for scalable, energy-efficient performance in general-purpose computers, the Apple-CORE project has co-designed a general machine model and concurrency control interface with dedicated hardware support for concurrency control across multiple cores. Its SVP interface

  3. Raspberry Pi hardware projects 1

    CERN Document Server

    Robinson, Andrew

    2013-01-01

    Learn how to take full advantage of all of Raspberry Pi's amazing features and functions-and have a blast doing it! Congratulations on becoming a proud owner of a Raspberry Pi, the credit-card-sized computer! If you're ready to dive in and start finding out what this amazing little gizmo is really capable of, this ebook is for you. Taken from the forthcoming Raspberry Pi Projects, Raspberry Pi Hardware Projects 1 contains three cool hardware projects that let you have fun with the Raspberry Pi while developing your Raspberry Pi skills. The authors - PiFace inventor, Andrew Robinson and Rasp

  4. Hardware controls for the STAR experiment at RHIC

    International Nuclear Information System (INIS)

    Reichhold, D.; Bieser, F.; Bordua, M.; Cherney, M.; Chrin, J.; Dunlop, J.C.; Ferguson, M.I.; Ghazikhanian, V.; Gross, J.; Harper, G.; Howe, M.; Jacobson, S.; Klein, S.R.; Kravtsov, P.; Lewis, S.; Lin, J.; Lionberger, C.; LoCurto, G.; McParland, C.; McShane, T.; Meier, J.; Sakrejda, I.; Sandler, Z.; Schambach, J.; Shi, Y.; Willson, R.; Yamamoto, E.; Zhang, W.

    2003-01-01

    The STAR detector sits in a high radiation area when operating normally; therefore it was necessary to develop a robust system to remotely control all hardware. The STAR hardware controls system monitors and controls approximately 14,000 parameters in the STAR detector. Voltages, currents, temperatures, and other parameters are monitored. Effort has been minimized by the adoption of experiment-wide standards and the use of pre-packaged software tools. The system is based on the Experimental Physics and Industrial Control System (EPICS) . VME processors communicate with subsystem-based sensors over a variety of field busses, with High-level Data Link Control (HDLC) being the most prevalent. Other features of the system include interfaces to accelerator and magnet control systems, a web-based archiver, and C++-based communication between STAR online, run control and hardware controls and their associated databases. The system has been designed for easy expansion as new detector elements are installed in STAR

  5. Hardware standardization for embedded systems

    International Nuclear Information System (INIS)

    Sharma, M.K.; Kalra, Mohit; Patil, M.B.; Mohanty, Ashutos; Ganesh, G.; Biswas, B.B.

    2010-01-01

    Reactor Control Division (RCnD) has been one of the main designers of safety and safety related systems for power reactors. These systems have been built using in-house developed hardware. Since the present set of hardware was designed long ago, a need was felt to design a new family of hardware boards. A Working Group on Electronics Hardware Standardization (WG-EHS) was formed with an objective to develop a family of boards, which is general purpose enough to meet the requirements of the system designers/end users. RCnD undertook the responsibility of design, fabrication and testing of boards for embedded systems. VME and a proprietary I/O bus were selected as the two system buses. The boards have been designed based on present day technology and components. The intelligence of these boards has been implemented on FPGA/CPLD using VHDL. This paper outlines the various boards that have been developed with a brief description. (author)

  6. Commodity hardware and software summary

    International Nuclear Information System (INIS)

    Wolbers, S.

    1997-04-01

    A review is given of the talks and papers presented in the Commodity Hardware and Software Session at the CHEP97 conference. An examination of the trends leading to the consideration of PC's for HEP is given, and a status of the work that is being done at various HEP labs and Universities is given

  7. Parallelized Local Volatility Estimation Using GP-GPU Hardware Acceleration

    KAUST Repository

    Douglas, Craig C.; Lee, Hyoseop; Sheen, Dongwoo

    2010-01-01

    We introduce an inverse problem for the local volatility model in option pricing. We solve the problem using the Levenberg-Marquardt algorithm and use the notion of the Fréchet derivative when calculating the Jacobian matrix. We analyze

  8. RTOS acceleration in an MPSoC with reconfigurable hardware

    NARCIS (Netherlands)

    Zaykov, P.G.; Kuzmanov, G.; Molnos, A.M.; Goossens, K.G.W.

    2016-01-01

    In this paper, we address the problem of improving the performance of real-time embedded Multiprocessor System-on-Chip (MPSoC). Such MPSoCs often execute applications composed of multiple tasks. The tasks on each processor are scheduled by a Real-Time Operating System (RTOS) instance. To improve

  9. Basket Option Pricing Using GP-GPU Hardware Acceleration

    KAUST Repository

    Douglas, Craig C.; Lee, Hyoseop

    2010-01-01

    We introduce a basket option pricing problem arisen in financial mathematics. We discretized the problem based on the alternating direction implicit (ADI) method and parallel cyclic reduction is applied to solve the set of tridiagonal matrices

  10. Graphics hardware accelerated panorama builder for mobile phones

    Science.gov (United States)

    Bordallo López, Miguel; Hannuksela, Jari; Silvén, Olli; Vehviläinen, Markku

    2009-02-01

    Modern mobile communication devices frequently contain built-in cameras allowing users to capture highresolution still images, but at the same time the imaging applications are facing both usability and throughput bottlenecks. The difficulties in taking ad hoc pictures of printed paper documents with multi-megapixel cellular phone cameras on a common business use case, illustrate these problems for anyone. The result can be examined only after several seconds, and is often blurry, so a new picture is needed, although the view-finder image had looked good. The process can be a frustrating one with waits and the user not being able to predict the quality beforehand. The problems can be traced to the processor speed and camera resolution mismatch, and application interactivity demands. In this context we analyze building mosaic images of printed documents from frames selected from VGA resolution (640x480 pixel) video. High interactivity is achieved by providing real-time feedback on the quality, while simultaneously guiding the user actions. The graphics processing unit of the mobile device can be used to speed up the reconstruction computations. To demonstrate the viability of the concept, we present an interactive document scanning application implemented on a Nokia N95 mobile phone.

  11. Basket Option Pricing Using GP-GPU Hardware Acceleration

    KAUST Repository

    Douglas, Craig C.

    2010-08-01

    We introduce a basket option pricing problem arisen in financial mathematics. We discretized the problem based on the alternating direction implicit (ADI) method and parallel cyclic reduction is applied to solve the set of tridiagonal matrices generated by the ADI method. To reduce the computational time of the problem, a general purpose graphics processing units (GP-GPU) environment is considered. Numerical results confirm the convergence and efficiency of the proposed method. © 2010 IEEE.

  12. Parallelized Local Volatility Estimation Using GP-GPU Hardware Acceleration

    KAUST Repository

    Douglas, Craig C.

    2010-01-01

    We introduce an inverse problem for the local volatility model in option pricing. We solve the problem using the Levenberg-Marquardt algorithm and use the notion of the Fréchet derivative when calculating the Jacobian matrix. We analyze the existence of the Fréchet derivative and its numerical computation. To reduce the computational time of the inverse problem, a GP-GPU environment is considered for parallel computation. Numerical results confirm the validity and efficiency of the proposed method. ©2010 IEEE.

  13. Accelerating wavelet lifting on graphics hardware using CUDA

    NARCIS (Netherlands)

    Laan, van der W.J.; Roerdink, J.B.T.M.; Jalba, A.C.

    2011-01-01

    The Discrete Wavelet Transform (DWT) has a wide range of applications from signal processing to video and image compression. We show that this transform, by means of the lifting scheme, can be performed in a memory and computation-efficient way on modern, programmable GPUs, which can be regarded as

  14. Accelerating Wavelet Lifting on Graphics Hardware Using CUDA

    NARCIS (Netherlands)

    Laan, Wladimir J. van der; Jalba, Andrei C.; Roerdink, Jos B.T.M.

    The Discrete Wavelet Transform (DWT) has a wide range of applications from signal processing to video and image compression. We show that this transform, by means of the lifting scheme, can be performed in a memory and computation-efficient way on modern, programmable GPUs, which can be regarded as

  15. Optimizing memory-bound SYMV kernel on GPU hardware accelerators

    KAUST Repository

    Abdelfattah, Ahmad; Dongarra, Jack; Keyes, David E.; Ltaief, Hatem

    2013-01-01

    and increasing bandwidth, our preliminary asymptotic results show 3.5x and 2.5x fold speedups over the similar CUBLAS 4.0 kernel, and 7-8% and 30% fold improvement over the Matrix Algebra on GPU and Multicore Architectures (MAGMA) library in single and double

  16. An FPGA-Based Quench Detection and Protection System for Superconducting Accelerator Magnets

    CERN Document Server

    Carcagno, Ruben H; Lamm, Michael J; Makulski, Andrzej; Nehring, Roger; Orris, Darryl; Pishchalnikov, Yu M; Tartaglia, M

    2005-01-01

    A new quench detection and protection system for superconducting accelerator magnets was developed at the Fermilab's Magnet Test Facility (MTF). This system is based on a Field-Programmable Gate Array (FPGA) module, and it is made of mostly commerically available, integrated hardware and software components. It provides most of the functionality of our existing VME-based quench detection and protection system, but in addition the new system is easily scalable to protect multiple magnets powered independently and has a more powerful user interface and analysis tools. First applications of the new system will be for testing corrector coil packages. In this paper we describe the new system and present results of testing LHC Interaction Region Quadrupole (IRQ) correctors.

  17. An FPGA-based quench detection and protection system for superconducting accelerator magnets

    International Nuclear Information System (INIS)

    Carcagno, R.H.; Feher, S.; Lamm, M.; Makulski, A.; Nehring, R.; Orris, D.F.; Pischalnikov, Y.; Tartaglia, M.; Fermilab

    2005-01-01

    A new quench detection and protection system for superconducting accelerator magnets was developed for the Fermilab's Magnet Test Facility (MTF). This system is based on a Field-Programmable Gate Array (FPGA) module, and it is made of mostly commercially available, integrated hardware and software components. It provides all the functions of our existing VME-based quench detection and protection system, but in addition the new system is easily scalable to protect multiple magnets powered independently and a more powerful user interface and analysis tools. The new system has been used successfully for testing LHC Interaction Region Quadrupoles correctors and High Field Magnet HFDM04. In this paper we describe the system and present results

  18. An FPGA-based quench detection and protection system for superconducting accelerator magnets

    Energy Technology Data Exchange (ETDEWEB)

    Carcagno, R.H.; Feher, S.; Lamm, M.; Makulski, A.; Nehring, R.; Orris, D.F.; Pischalnikov, Y.; Tartaglia, M.; /Fermilab

    2005-05-01

    A new quench detection and protection system for superconducting accelerator magnets was developed for the Fermilab's Magnet Test Facility (MTF). This system is based on a Field-Programmable Gate Array (FPGA) module, and it is made of mostly commercially available, integrated hardware and software components. It provides all the functions of our existing VME-based quench detection and protection system, but in addition the new system is easily scalable to protect multiple magnets powered independently and a more powerful user interface and analysis tools. The new system has been used successfully for testing LHC Interaction Region Quadrupoles correctors and High Field Magnet HFDM04. In this paper we describe the system and present results.

  19. A Scalable Policy and SNMP Based Network Management Framework

    Institute of Scientific and Technical Information of China (English)

    LIU Su-ping; DING Yong-sheng

    2009-01-01

    Traditional SNMP-based network management can not deal with the task of managing large-scaled distributed network,while policy-based management is one of the effective solutions in network and distributed systems management. However,cross-vendor hardware compatibility is one of the limitations in policy-based management. Devices existing in current network mostly support SNMP rather than Common Open Policy Service (COPS) protocol. By analyzing traditional network management and policy-based network management, a scalable network management framework is proposed. It is combined with Internet Engineering Task Force (IETF) framework for policybased management and SNMP-based network management. By interpreting and translating policy decision to SNMP message,policy can be executed in traditional SNMP-based device.

  20. No-hardware-signature cybersecurity-crypto-module: a resilient cyber defense agent

    Science.gov (United States)

    Zaghloul, A. R. M.; Zaghloul, Y. A.

    2014-06-01

    We present an optical cybersecurity-crypto-module as a resilient cyber defense agent. It has no hardware signature since it is bitstream reconfigurable, where single hardware architecture functions as any selected device of all possible ones of the same number of inputs. For a two-input digital device, a 4-digit bitstream of 0s and 1s determines which device, of a total of 16 devices, the hardware performs as. Accordingly, the hardware itself is not physically reconfigured, but its performance is. Such a defense agent allows the attack to take place, rendering it harmless. On the other hand, if the system is already infected with malware sending out information, the defense agent allows the information to go out, rendering it meaningless. The hardware architecture is immune to side attacks since such an attack would reveal information on the attack itself and not on the hardware. This cyber defense agent can be used to secure a point-to-point, point-to-multipoint, a whole network, and/or a single entity in the cyberspace. Therefore, ensuring trust between cyber resources. It can provide secure communication in an insecure network. We provide the hardware design and explain how it works. Scalability of the design is briefly discussed. (Protected by United States Patents No.: US 8,004,734; US 8,325,404; and other National Patents worldwide.)

  1. Fast DRR splat rendering using common consumer graphics hardware

    International Nuclear Information System (INIS)

    Spoerk, Jakob; Bergmann, Helmar; Wanschitz, Felix; Dong, Shuo; Birkfellner, Wolfgang

    2007-01-01

    Digitally rendered radiographs (DRR) are a vital part of various medical image processing applications such as 2D/3D registration for patient pose determination in image-guided radiotherapy procedures. This paper presents a technique to accelerate DRR creation by using conventional graphics hardware for the rendering process. DRR computation itself is done by an efficient volume rendering method named wobbled splatting. For programming the graphics hardware, NVIDIAs C for Graphics (Cg) is used. The description of an algorithm used for rendering DRRs on the graphics hardware is presented, together with a benchmark comparing this technique to a CPU-based wobbled splatting program. Results show a reduction of rendering time by about 70%-90% depending on the amount of data. For instance, rendering a volume of 2x10 6 voxels is feasible at an update rate of 38 Hz compared to 6 Hz on a common Intel-based PC using the graphics processing unit (GPU) of a conventional graphics adapter. In addition, wobbled splatting using graphics hardware for DRR computation provides higher resolution DRRs with comparable image quality due to special processing characteristics of the GPU. We conclude that DRR generation on common graphics hardware using the freely available Cg environment is a major step toward 2D/3D registration in clinical routine

  2. Scalable Performance Measurement and Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gamblin, Todd [Univ. of North Carolina, Chapel Hill, NC (United States)

    2009-01-01

    Concurrency levels in large-scale, distributed-memory supercomputers are rising exponentially. Modern machines may contain 100,000 or more microprocessor cores, and the largest of these, IBM's Blue Gene/L, contains over 200,000 cores. Future systems are expected to support millions of concurrent tasks. In this dissertation, we focus on efficient techniques for measuring and analyzing the performance of applications running on very large parallel machines. Tuning the performance of large-scale applications can be a subtle and time-consuming task because application developers must measure and interpret data from many independent processes. While the volume of the raw data scales linearly with the number of tasks in the running system, the number of tasks is growing exponentially, and data for even small systems quickly becomes unmanageable. Transporting performance data from so many processes over a network can perturb application performance and make measurements inaccurate, and storing such data would require a prohibitive amount of space. Moreover, even if it were stored, analyzing the data would be extremely time-consuming. In this dissertation, we present novel methods for reducing performance data volume. The first draws on multi-scale wavelet techniques from signal processing to compress systemwide, time-varying load-balance data. The second uses statistical sampling to select a small subset of running processes to generate low-volume traces. A third approach combines sampling and wavelet compression to stratify performance data adaptively at run-time and to reduce further the cost of sampled tracing. We have integrated these approaches into Libra, a toolset for scalable load-balance analysis. We present Libra and show how it can be used to analyze data from large scientific applications scalably.

  3. A Prediction Packetizing Scheme for Reducing Channel Traffic in Transaction-Level Hardware/Software Co-Emulation

    OpenAIRE

    Lee , Jae-Gon; Chung , Moo-Kyoung; Ahn , Ki-Yong; Lee , Sang-Heon; Kyung , Chong-Min

    2005-01-01

    Submitted on behalf of EDAA (http://www.edaa.com/); International audience; This paper presents a scheme for efficient channel usage between simulator and accelerator where the accelerator models some RTL sub-blocks in the accelerator-based hardware/software co-simulation while the simulator runs transaction-level model of the remaining part of the whole chip being verified. With conventional simulation accelerator, evaluations of simulator and accelerator alternate at every valid simulation ...

  4. BIOLOGICALLY INSPIRED HARDWARE CELL ARCHITECTURE

    DEFF Research Database (Denmark)

    2010-01-01

    Disclosed is a system comprising: - a reconfigurable hardware platform; - a plurality of hardware units defined as cells adapted to be programmed to provide self-organization and self-maintenance of the system by means of implementing a program expressed in a programming language defined as DNA...... language, where each cell is adapted to communicate with one or more other cells in the system, and where the system further comprises a converter program adapted to convert keywords from the DNA language to a binary DNA code; where the self-organisation comprises that the DNA code is transmitted to one...... or more of the cells, and each of the one or more cells is adapted to determine its function in the system; where if a fault occurs in a first cell and the first cell ceases to perform its function, self-maintenance is performed by that the system transmits information to the cells that the first cell has...

  5. The principles of computer hardware

    CERN Document Server

    Clements, Alan

    2000-01-01

    Principles of Computer Hardware, now in its third edition, provides a first course in computer architecture or computer organization for undergraduates. The book covers the core topics of such a course, including Boolean algebra and logic design; number bases and binary arithmetic; the CPU; assembly language; memory systems; and input/output methods and devices. It then goes on to cover the related topics of computer peripherals such as printers; the hardware aspects of the operating system; and data communications, and hence provides a broader overview of the subject. Its readable, tutorial-based approach makes it an accessible introduction to the subject. The book has extensive in-depth coverage of two microprocessors, one of which (the 68000) is widely used in education. All chapters in the new edition have been updated. Major updates include: powerful software simulations of digital systems to accompany the chapters on digital design; a tutorial-based introduction to assembly language, including many exam...

  6. Performance-scalable volumetric data classification for online industrial inspection

    Science.gov (United States)

    Abraham, Aby J.; Sadki, Mustapha; Lea, R. M.

    2002-03-01

    Non-intrusive inspection and non-destructive testing of manufactured objects with complex internal structures typically requires the enhancement, analysis and visualization of high-resolution volumetric data. Given the increasing availability of fast 3D scanning technology (e.g. cone-beam CT), enabling on-line detection and accurate discrimination of components or sub-structures, the inherent complexity of classification algorithms inevitably leads to throughput bottlenecks. Indeed, whereas typical inspection throughput requirements range from 1 to 1000 volumes per hour, depending on density and resolution, current computational capability is one to two orders-of-magnitude less. Accordingly, speeding up classification algorithms requires both reduction of algorithm complexity and acceleration of computer performance. A shape-based classification algorithm, offering algorithm complexity reduction, by using ellipses as generic descriptors of solids-of-revolution, and supporting performance-scalability, by exploiting the inherent parallelism of volumetric data, is presented. A two-stage variant of the classical Hough transform is used for ellipse detection and correlation of the detected ellipses facilitates position-, scale- and orientation-invariant component classification. Performance-scalability is achieved cost-effectively by accelerating a PC host with one or more COTS (Commercial-Off-The-Shelf) PCI multiprocessor cards. Experimental results are reported to demonstrate the feasibility and cost-effectiveness of the data-parallel classification algorithm for on-line industrial inspection applications.

  7. Hunting for hardware changes in data centres

    International Nuclear Information System (INIS)

    Coelho dos Santos, M; Steers, I; Szebenyi, I; Xafi, A; Barring, O; Bonfillou, E

    2012-01-01

    With many servers and server parts the environment of warehouse sized data centres is increasingly complex. Server life-cycle management and hardware failures are responsible for frequent changes that need to be managed. To manage these changes better a project codenamed “hardware hound” focusing on hardware failure trending and hardware inventory has been started at CERN. By creating and using a hardware oriented data set - the inventory - with detailed information on servers and their parts as well as tracking changes to this inventory, the project aims at, for example, being able to discover trends in hardware failure rates.

  8. Requirements for Scalable Access Control and Security Management Architectures

    National Research Council Canada - National Science Library

    Keromytis, Angelos D; Smith, Jonathan M

    2005-01-01

    Maximizing local autonomy has led to a scalable Internet. Scalability and the capacity for distributed control have unfortunately not extended well to resource access control policies and mechanisms...

  9. Adaptive format conversion for scalable video coding

    Science.gov (United States)

    Wan, Wade K.; Lim, Jae S.

    2001-12-01

    The enhancement layer in many scalable coding algorithms is composed of residual coding information. There is another type of information that can be transmitted instead of (or in addition to) residual coding. Since the encoder has access to the original sequence, it can utilize adaptive format conversion (AFC) to generate the enhancement layer and transmit the different format conversion methods as enhancement data. This paper investigates the use of adaptive format conversion information as enhancement data in scalable video coding. Experimental results are shown for a wide range of base layer qualities and enhancement bitrates to determine when AFC can improve video scalability. Since the parameters needed for AFC are small compared to residual coding, AFC can provide video scalability at low enhancement layer bitrates that are not possible with residual coding. In addition, AFC can also be used in addition to residual coding to improve video scalability at higher enhancement layer bitrates. Adaptive format conversion has not been studied in detail, but many scalable applications may benefit from it. An example of an application that AFC is well-suited for is the migration path for digital television where AFC can provide immediate video scalability as well as assist future migrations.

  10. Modern control techniques for accelerators

    International Nuclear Information System (INIS)

    Goodwin, R.W.; Shea, M.F.

    1984-01-01

    Beginning in the mid to late sixties, most new accelerators were designed to include computer based control systems. Although each installation differed in detail, the technology of the sixties and early to mid seventies dictated an architecture that was essentially the same for the control systems of that era. A mini-computer was connected to the hardware and to a console. Two developments have changed the architecture of modern systems: the microprocessor and local area networks. This paper discusses these two developments and demonstrates their impact on control system design and implementation by way of describing a possible architecture for any size of accelerator. Both hardware and software aspects are included

  11. ALADDIN - enhancing applicability and scalability

    International Nuclear Information System (INIS)

    Roverso, Davide

    2001-02-01

    The ALADDIN project aims at the study and development of flexible, accurate, and reliable techniques and principles for computerised event classification and fault diagnosis for complex machinery and industrial processes. The main focus of the project is on advanced numerical techniques, such as wavelets, and empirical modelling with neural networks. This document reports on recent important advancements, which significantly widen the practical applicability of the developed principles, both in terms of flexibility of use, and in terms of scalability to large problem domains. In particular, two novel techniques are here described. The first, which we call Wavelet On- Line Pre-processing (WOLP), is aimed at extracting, on-line, relevant dynamic features from the process data streams. This technique allows a system a greater flexibility in detecting and processing transients at a range of different time scales. The second technique, which we call Autonomous Recursive Task Decomposition (ARTD), is aimed at tackling the problem of constructing a classifier able to discriminate among a large number of different event/fault classes, which is often the case when the application domain is a complex industrial process. ARTD also allows for incremental application development (i.e. the incremental addition of new classes to an existing classifier, without the need of retraining the entire system), and for simplified application maintenance. The description of these novel techniques is complemented by reports of quantitative experiments that show in practice the extent of these improvements. (Author)

  12. Fast and scalable inequality joins

    KAUST Repository

    Khayyat, Zuhair

    2016-09-07

    Inequality joins, which is to join relations with inequality conditions, are used in various applications. Optimizing joins has been the subject of intensive research ranging from efficient join algorithms such as sort-merge join, to the use of efficient indices such as (Formula presented.)-tree, (Formula presented.)-tree and Bitmap. However, inequality joins have received little attention and queries containing such joins are notably very slow. In this paper, we introduce fast inequality join algorithms based on sorted arrays and space-efficient bit-arrays. We further introduce a simple method to estimate the selectivity of inequality joins which is then used to optimize multiple predicate queries and multi-way joins. Moreover, we study an incremental inequality join algorithm to handle scenarios where data keeps changing. We have implemented a centralized version of these algorithms on top of PostgreSQL, a distributed version on top of Spark SQL, and an existing data cleaning system, Nadeef. By comparing our algorithms against well-known optimization techniques for inequality joins, we show our solution is more scalable and several orders of magnitude faster. © 2016 Springer-Verlag Berlin Heidelberg

  13. VALU, AVX and GPU acceleration techniques for parallel FDTD methods

    CERN Document Server

    Yu, Wenhua

    2013-01-01

    This book introduces a general hardware acceleration technique that can significantly speed up FDTD simulations and their applications to engineering problems without requiring any additional hardware devices. This acceleration of complex problems can be efficient in saving both time and money and once learned these new techniques can be used repeatedly.

  14. Qualification of software and hardware

    International Nuclear Information System (INIS)

    Gossner, S.; Schueller, H.; Gloee, G.

    1987-01-01

    The qualification of on-line process control equipment is subdivided into three areas: 1) materials and structural elements; 2) on-line process-control components and devices; 3) electrical systems (reactor protection and confinement system). Microprocessor-aided process-control equipment are difficult to verify for failure-free function owing to the complexity of the functional structures of the hardware and to the variety of the software feasible for microprocessors. Hence, qualification will make great demands on the inspecting expert. (DG) [de

  15. Interfacing to accelerator instrumentation

    International Nuclear Information System (INIS)

    Shea, T.J.

    1995-01-01

    As the sensory system for an accelerator, the beam instrumentation provides a tremendous amount of diagnostic information. Access to this information can vary from periodic spot checks by operators to high bandwidth data acquisition during studies. In this paper, example applications will illustrate the requirements on interfaces between the control system and the instrumentation hardware. A survey of the major accelerator facilities will identify the most popular interface standards. The impact of developments such as isochronous protocols and embedded digital signal processing will also be discussed

  16. Door Hardware and Installations; Carpentry: 901894.

    Science.gov (United States)

    Dade County Public Schools, Miami, FL.

    The curriculum guide outlines a course designed to provide instruction in the selection, preparation, and installation of hardware for door assemblies. The course is divided into five blocks of instruction (introduction to doors and hardware, door hardware, exterior doors and jambs, interior doors and jambs, and a quinmester post-test) totaling…

  17. Compact accelerator for medical therapy

    Science.gov (United States)

    Caporaso, George J.; Chen, Yu-Jiuan; Hawkins, Steven A.; Sampayan, Stephen E.; Paul, Arthur C.

    2010-05-04

    A compact accelerator system having an integrated particle generator-linear accelerator with a compact, small-scale construction capable of producing an energetic (.about.70-250 MeV) proton beam or other nuclei and transporting the beam direction to a medical therapy patient without the need for bending magnets or other hardware often required for remote beam transport. The integrated particle generator-accelerator is actuable as a unitary body on a support structure to enable scanning of a particle beam by direction actuation of the particle generator-accelerator.

  18. Embedded High Performance Scalable Computing Systems

    National Research Council Canada - National Science Library

    Ngo, David

    2003-01-01

    The Embedded High Performance Scalable Computing Systems (EHPSCS) program is a cooperative agreement between Sanders, A Lockheed Martin Company and DARPA that ran for three years, from Apr 1995 - Apr 1998...

  19. Tomographic image reconstruction and rendering with texture-mapping hardware

    International Nuclear Information System (INIS)

    Azevedo, S.G.; Cabral, B.K.; Foran, J.

    1994-07-01

    The image reconstruction problem, also known as the inverse Radon transform, for x-ray computed tomography (CT) is found in numerous applications in medicine and industry. The most common algorithm used in these cases is filtered backprojection (FBP), which, while a simple procedure, is time-consuming for large images on any type of computational engine. Specially-designed, dedicated parallel processors are commonly used in medical CT scanners, whose results are then passed to graphics workstation for rendering and analysis. However, a fast direct FBP algorithm can be implemented on modern texture-mapping hardware in current high-end workstation platforms. This is done by casting the FBP algorithm as an image warping operation with summing. Texture-mapping hardware, such as that on the Silicon Graphics Reality Engine (TM), shows around 600 times speedup of backprojection over a CPU-based implementation (a 100 Mhz R4400 in this case). This technique has the further advantages of flexibility and rapid programming. In addition, the same hardware can be used for both image reconstruction and for volumetric rendering. The techniques can also be used to accelerate iterative reconstruction algorithms. The hardware architecture also allows more complex operations than straight-ray backprojection if they are required, including fan-beam, cone-beam, and curved ray paths, with little or no speed penalties

  20. Resource-aware complexity scalability for mobile MPEG encoding

    NARCIS (Netherlands)

    Mietens, S.O.; With, de P.H.N.; Hentschel, C.; Panchanatan, S.; Vasudev, B.

    2004-01-01

    Complexity scalability attempts to scale the required resources of an algorithm with the chose quality settings, in order to broaden the application range. In this paper, we present complexity-scalable MPEG encoding of which the core processing modules are modified for scalability. Scalability is

  1. Fast and Reliable Mouse Picking Using Graphics Hardware

    Directory of Open Access Journals (Sweden)

    Hanli Zhao

    2009-01-01

    Full Text Available Mouse picking is the most commonly used intuitive operation to interact with 3D scenes in a variety of 3D graphics applications. High performance for such operation is necessary in order to provide users with fast responses. This paper proposes a fast and reliable mouse picking algorithm using graphics hardware for 3D triangular scenes. Our approach uses a multi-layer rendering algorithm to perform the picking operation in linear time complexity. The objectspace based ray-triangle intersection test is implemented in a highly parallelized geometry shader. After applying the hardware-supported occlusion queries, only a small number of objects (or sub-objects are rendered in subsequent layers, which accelerates the picking efficiency. Experimental results demonstrate the high performance of our novel approach. Due to its simplicity, our algorithm can be easily integrated into existing real-time rendering systems.

  2. Chromium Renderserver: Scalable and Open Source Remote RenderingInfrastructure

    Energy Technology Data Exchange (ETDEWEB)

    Paul, Brian; Ahern, Sean; Bethel, E. Wes; Brugger, Eric; Cook,Rich; Daniel, Jamison; Lewis, Ken; Owen, Jens; Southard, Dale

    2007-12-01

    Chromium Renderserver (CRRS) is software infrastructure thatprovides the ability for one or more users to run and view image outputfrom unmodified, interactive OpenGL and X11 applications on a remote,parallel computational platform equipped with graphics hardwareaccelerators via industry-standard Layer 7 network protocolsand clientviewers. The new contributions of this work include a solution to theproblem of synchronizing X11 and OpenGL command streams, remote deliveryof parallel hardware-accelerated rendering, and a performance analysis ofseveral different optimizations that are generally applicable to avariety of rendering architectures. CRRSis fully operational, Open Sourcesoftware.

  3. Superconductivity and future accelerators

    International Nuclear Information System (INIS)

    Danby, G.T.; Jackson, J.W.

    1963-01-01

    For 50 years particle accelerators employing accelerating cavities and deflecting magnets have been developed at a prodigious rate. New accelerator concepts and hardware ensembles have yielded great improvements in performance and GeV/$. The great idea for collective acceleration resulting from intense auxiliary charged-particle beams or laser light may or may not be just around the corner. In its absence, superconductivity (SC) applied both to rf cavities and to magnets opened up the potential for very large accelerators without excessive energy consumption and with other economies, even with the cw operation desirable for colliding beams. HEP has aggressively pioneered this new technology: the Fermilab single ring 1 TeV accelerator - 2 TeV collider is near the testing stage. Brookhaven National Laboratory's high luminosity pp 2 ring 800 GeV CBA collider is well into construction. Other types of superconducting projects are in the planning stage with much background R and D accomplished. The next generation of hadron colliders under discussion involves perhaps a 20 TeV ring (or rings) with 40 TeV CM energy. This is a very large machine: even if the highest practical field B approx. 10T is used, the radius is 10x that of the Fermilab accelerator. An extreme effort to get maximum GeV/$ may be crucial even for serious consideration of funding

  4. Generic algorithms for high performance scalable geocomputing

    Science.gov (United States)

    de Jong, Kor; Schmitz, Oliver; Karssenberg, Derek

    2016-04-01

    During the last decade, the characteristics of computing hardware have changed a lot. For example, instead of a single general purpose CPU core, personal computers nowadays contain multiple cores per CPU and often general purpose accelerators, like GPUs. Additionally, compute nodes are often grouped together to form clusters or a supercomputer, providing enormous amounts of compute power. For existing earth simulation models to be able to use modern hardware platforms, their compute intensive parts must be rewritten. This can be a major undertaking and may involve many technical challenges. Compute tasks must be distributed over CPU cores, offloaded to hardware accelerators, or distributed to different compute nodes. And ideally, all of this should be done in such a way that the compute task scales well with the hardware resources. This presents two challenges: 1) how to make good use of all the compute resources and 2) how to make these compute resources available for developers of simulation models, who may not (want to) have the required technical background for distributing compute tasks. The first challenge requires the use of specialized technology (e.g.: threads, OpenMP, MPI, OpenCL, CUDA). The second challenge requires the abstraction of the logic handling the distribution of compute tasks from the model-specific logic, hiding the technical details from the model developer. To assist the model developer, we are developing a C++ software library (called Fern) containing algorithms that can use all CPU cores available in a single compute node (distributing tasks over multiple compute nodes will be done at a later stage). The algorithms are grid-based (finite difference) and include local and spatial operations such as convolution filters. The algorithms handle distribution of the compute tasks to CPU cores internally. In the resulting model the low-level details of how this is done is separated from the model-specific logic representing the modeled system

  5. Laplacian embedded regression for scalable manifold regularization.

    Science.gov (United States)

    Chen, Lin; Tsang, Ivor W; Xu, Dong

    2012-06-01

    Semi-supervised learning (SSL), as a powerful tool to learn from a limited number of labeled data and a large number of unlabeled data, has been attracting increasing attention in the machine learning community. In particular, the manifold regularization framework has laid solid theoretical foundations for a large family of SSL algorithms, such as Laplacian support vector machine (LapSVM) and Laplacian regularized least squares (LapRLS). However, most of these algorithms are limited to small scale problems due to the high computational cost of the matrix inversion operation involved in the optimization problem. In this paper, we propose a novel framework called Laplacian embedded regression by introducing an intermediate decision variable into the manifold regularization framework. By using ∈-insensitive loss, we obtain the Laplacian embedded support vector regression (LapESVR) algorithm, which inherits the sparse solution from SVR. Also, we derive Laplacian embedded RLS (LapERLS) corresponding to RLS under the proposed framework. Both LapESVR and LapERLS possess a simpler form of a transformed kernel, which is the summation of the original kernel and a graph kernel that captures the manifold structure. The benefits of the transformed kernel are two-fold: (1) we can deal with the original kernel matrix and the graph Laplacian matrix in the graph kernel separately and (2) if the graph Laplacian matrix is sparse, we only need to perform the inverse operation for a sparse matrix, which is much more efficient when compared with that for a dense one. Inspired by kernel principal component analysis, we further propose to project the introduced decision variable into a subspace spanned by a few eigenvectors of the graph Laplacian matrix in order to better reflect the data manifold, as well as accelerate the calculation of the graph kernel, allowing our methods to efficiently and effectively cope with large scale SSL problems. Extensive experiments on both toy and real

  6. Argonne Wakefield Accelerator update '92

    International Nuclear Information System (INIS)

    Rosing, M.; Balka, L.; Chojnacki, E.; Gai, W.; Ho, C.; Konecny, R.; Power, J.; Schoessow, P.; Simpson, J.

    1992-01-01

    The construction of the Argonne Wakefield Accelerator (AWA) is under way. The majority of the hardware is about to be delivered or is installed. Radiation safety systems are in the review process, and the laser system is operational. Bunch production should begin in December 1992. 4 refs., 5 figs

  7. Scalable Multi-Platform Distribution of Spatial 3d Contents

    Science.gov (United States)

    Klimke, J.; Hagedorn, B.; Döllner, J.

    2013-09-01

    Virtual 3D city models provide powerful user interfaces for communication of 2D and 3D geoinformation. Providing high quality visualization of massive 3D geoinformation in a scalable, fast, and cost efficient manner is still a challenging task. Especially for mobile and web-based system environments, software and hardware configurations of target systems differ significantly. This makes it hard to provide fast, visually appealing renderings of 3D data throughout a variety of platforms and devices. Current mobile or web-based solutions for 3D visualization usually require raw 3D scene data such as triangle meshes together with textures delivered from server to client, what makes them strongly limited in terms of size and complexity of the models they can handle. In this paper, we introduce a new approach for provisioning of massive, virtual 3D city models on different platforms namely web browsers, smartphones or tablets, by means of an interactive map assembled from artificial oblique image tiles. The key concept is to synthesize such images of a virtual 3D city model by a 3D rendering service in a preprocessing step. This service encapsulates model handling and 3D rendering techniques for high quality visualization of massive 3D models. By generating image tiles using this service, the 3D rendering process is shifted from the client side, which provides major advantages: (a) The complexity of the 3D city model data is decoupled from data transfer complexity (b) the implementation of client applications is simplified significantly as 3D rendering is encapsulated on server side (c) 3D city models can be easily deployed for and used by a large number of concurrent users, leading to a high degree of scalability of the overall approach. All core 3D rendering techniques are performed on a dedicated 3D rendering server, and thin-client applications can be compactly implemented for various devices and platforms.

  8. Can Accelerators Accelerate Learning?

    International Nuclear Information System (INIS)

    Santos, A. C. F.; Fonseca, P.; Coelho, L. F. S.

    2009-01-01

    The 'Young Talented' education program developed by the Brazilian State Funding Agency (FAPERJ)[1] makes it possible for high-schools students from public high schools to perform activities in scientific laboratories. In the Atomic and Molecular Physics Laboratory at Federal University of Rio de Janeiro (UFRJ), the students are confronted with modern research tools like the 1.7 MV ion accelerator. Being a user-friendly machine, the accelerator is easily manageable by the students, who can perform simple hands-on activities, stimulating interest in physics, and getting the students close to modern laboratory techniques.

  9. Can Accelerators Accelerate Learning?

    Science.gov (United States)

    Santos, A. C. F.; Fonseca, P.; Coelho, L. F. S.

    2009-03-01

    The 'Young Talented' education program developed by the Brazilian State Funding Agency (FAPERJ) [1] makes it possible for high-schools students from public high schools to perform activities in scientific laboratories. In the Atomic and Molecular Physics Laboratory at Federal University of Rio de Janeiro (UFRJ), the students are confronted with modern research tools like the 1.7 MV ion accelerator. Being a user-friendly machine, the accelerator is easily manageable by the students, who can perform simple hands-on activities, stimulating interest in physics, and getting the students close to modern laboratory techniques.

  10. Constructing Hardware in a Scale Embedded Language

    Energy Technology Data Exchange (ETDEWEB)

    2014-08-21

    Chisel is a new open-source hardware construction language developed at UC Berkeley that supports advanced hardware design using highly parameterized generators and layered domain-specific hardware languages. Chisel is embedded in the Scala programming language, which raises the level of hardware design abstraction by providing concepts including object orientation, functional programming, parameterized types, and type inference. From the same source, Chisel can generate a high-speed C++-based cycle-accurate software simulator, or low-level Verilog designed to pass on to standard ASIC or FPGA tools for synthesis and place and route.

  11. Open-source hardware for medical devices.

    Science.gov (United States)

    Niezen, Gerrit; Eslambolchilar, Parisa; Thimbleby, Harold

    2016-04-01

    Open-source hardware is hardware whose design is made publicly available so anyone can study, modify, distribute, make and sell the design or the hardware based on that design. Some open-source hardware projects can potentially be used as active medical devices. The open-source approach offers a unique combination of advantages, including reducing costs and faster innovation. This article compares 10 of open-source healthcare projects in terms of how easy it is to obtain the required components and build the device.

  12. Hardware Resource Allocation for Hardware/Software Partitioning in the LYCOS System

    DEFF Research Database (Denmark)

    Grode, Jesper Nicolai Riis; Knudsen, Peter Voigt; Madsen, Jan

    1998-01-01

    as a designer's/design tool's aid to generate good hardware allocations for use in hardware/software partitioning. The algorithm has been implemented in a tool under the LYCOS system. The results show that the allocations produced by the algorithm come close to the best allocations obtained by exhaustive search.......This paper presents a novel hardware resource allocation technique for hardware/software partitioning. It allocates hardware resources to the hardware data-path using information such as data-dependencies between operations in the application, and profiling information. The algorithm is useful...

  13. The Concept of Business Model Scalability

    DEFF Research Database (Denmark)

    Lund, Morten; Nielsen, Christian

    2018-01-01

    -term pro table business. However, the main message of this article is that while providing a good value proposition may help the rm ‘get by’, the really successful businesses of today are those able to reach the sweet-spot of business model scalability. Design/Methodology/Approach: The article is based...... on a ve-year longitudinal action research project of over 90 companies that participated in the International Center for Innovation project aimed at building 10 global network-based business models. Findings: This article introduces and discusses the term scalability from a company-level perspective......Purpose: The purpose of the article is to de ne what scalable business models are. Central to the contemporary understanding of business models is the value proposition towards the customer and the hypotheses generated about delivering value to the customer which become a good foundation for a long...

  14. Declarative and Scalable Selection for Map Visualizations

    DEFF Research Database (Denmark)

    Kefaloukos, Pimin Konstantin Balic

    and is itself a source and cause of prolific data creation. This calls for scalable map processing techniques that can handle the data volume and which play well with the predominant data models on the Web. (4) Maps are now consumed around the clock by a global audience. While historical maps were singleuser......-defined constraints as well as custom objectives. The purpose of the language is to derive a target multi-scale database from a source database according to holistic specifications. (b) The Glossy SQL compiler allows Glossy SQL to be scalably executed in a spatial analytics system, such as a spatial relational......, there are indications that the method is scalable for databases that contain millions of records, especially if the target language of the compiler is substituted by a cluster-ready variant of SQL. While several realistic use cases for maps have been implemented in CVL, additional non-geographic data visualization uses...

  15. Scalable Density-Based Subspace Clustering

    DEFF Research Database (Denmark)

    Müller, Emmanuel; Assent, Ira; Günnemann, Stephan

    2011-01-01

    For knowledge discovery in high dimensional databases, subspace clustering detects clusters in arbitrary subspace projections. Scalability is a crucial issue, as the number of possible projections is exponential in the number of dimensions. We propose a scalable density-based subspace clustering...... method that steers mining to few selected subspace clusters. Our novel steering technique reduces subspace processing by identifying and clustering promising subspaces and their combinations directly. Thereby, it narrows down the search space while maintaining accuracy. Thorough experiments on real...... and synthetic databases show that steering is efficient and scalable, with high quality results. For future work, our steering paradigm for density-based subspace clustering opens research potential for speeding up other subspace clustering approaches as well....

  16. The ELSA control system hardware

    International Nuclear Information System (INIS)

    Nietzel, Ch.; Schillo, M.; Welt, H.J.; Wermelskirchen, C.

    1984-01-01

    ELSA is an Electron Stretcher and Acclerator ring fed by the Bonn 2.5 GeV Electron Synchrotron and has been designed to provide electron and bremsstrahlung beams with high duty cycle. In stretcher mode operation electron pulses from the synchrotron are injected into ELSA with a maximum rate of 50 Hz. The electrons are then ejected from ELSA at a constant rate within 20 msec or more. The duty cycle will be of the order of 95%. When used as a post accelerator to yield up to 3.5 GeV electrons ELSA is fed with 1.75 GeV electrons from the synchrotron. Times for ramping up and down are both fixed to 150 msec. With a maximum length of the high energy flat top of 500 msec and a 20 msec injection plateau a duty cycle of up to 60% will be achieved. ELSA is planned to operate in the stretcher mode at the end of 1985 and as a post accelerator about one year later

  17. Software performance and scalability a quantitative approach

    CERN Document Server

    Liu, Henry H

    2009-01-01

    Praise from the Reviewers:"The practicality of the subject in a real-world situation distinguishes this book from othersavailable on the market."—Professor Behrouz Far, University of Calgary"This book could replace the computer organization texts now in use that every CS and CpEstudent must take. . . . It is much needed, well written, and thoughtful."—Professor Larry Bernstein, Stevens Institute of TechnologyA distinctive, educational text onsoftware performance and scalabilityThis is the first book to take a quantitative approach to the subject of software performance and scalability

  18. From Digital Disruption to Business Model Scalability

    DEFF Research Database (Denmark)

    Nielsen, Christian; Lund, Morten; Thomsen, Peter Poulsen

    2017-01-01

    This article discusses the terms disruption, digital disruption, business models and business model scalability. It illustrates how managers should be using these terms for the benefit of their business by developing business models capable of achieving exponentially increasing returns to scale...... will seldom lead to business model scalability capable of competing with digital disruption(s)....... as a response to digital disruption. A series of case studies illustrate that besides frequent existing messages in the business literature relating to the importance of creating agile businesses, both in growing and declining economies, as well as hard to copy value propositions or value propositions that take...

  19. Integration of an intelligent systems behavior simulator and a scalable soldier-machine interface

    Science.gov (United States)

    Johnson, Tony; Manteuffel, Chris; Brewster, Benjamin; Tierney, Terry

    2007-04-01

    As the Army's Future Combat Systems (FCS) introduce emerging technologies and new force structures to the battlefield, soldiers will increasingly face new challenges in workload management. The next generation warfighter will be responsible for effectively managing robotic assets in addition to performing other missions. Studies of future battlefield operational scenarios involving the use of automation, including the specification of existing and proposed technologies, will provide significant insight into potential problem areas regarding soldier workload. The US Army Tank Automotive Research, Development, and Engineering Center (TARDEC) is currently executing an Army technology objective program to analyze and evaluate the effect of automated technologies and their associated control devices with respect to soldier workload. The Human-Robotic Interface (HRI) Intelligent Systems Behavior Simulator (ISBS) is a human performance measurement simulation system that allows modelers to develop constructive simulations of military scenarios with various deployments of interface technologies in order to evaluate operator effectiveness. One such interface is TARDEC's Scalable Soldier-Machine Interface (SMI). The scalable SMI provides a configurable machine interface application that is capable of adapting to several hardware platforms by recognizing the physical space limitations of the display device. This paper describes the integration of the ISBS and Scalable SMI applications, which will ultimately benefit both systems. The ISBS will be able to use the Scalable SMI to visualize the behaviors of virtual soldiers performing HRI tasks, such as route planning, and the scalable SMI will benefit from stimuli provided by the ISBS simulation environment. The paper describes the background of each system and details of the system integration approach.

  20. Hardware-software face detection system based on multi-block local binary patterns

    Science.gov (United States)

    Acasandrei, Laurentiu; Barriga, Angel

    2015-03-01

    Face detection is an important aspect for biometrics, video surveillance and human computer interaction. Due to the complexity of the detection algorithms any face detection system requires a huge amount of computational and memory resources. In this communication an accelerated implementation of MB LBP face detection algorithm targeting low frequency, low memory and low power embedded system is presented. The resulted implementation is time deterministic and uses a customizable AMBA IP hardware accelerator. The IP implements the kernel operations of the MB-LBP algorithm and can be used as universal accelerator for MB LBP based applications. The IP employs 8 parallel MB-LBP feature evaluators cores, uses a deterministic bandwidth, has a low area profile and the power consumption is ~95 mW on a Virtex5 XC5VLX50T. The resulted implementation acceleration gain is between 5 to 8 times, while the hardware MB-LBP feature evaluation gain is between 69 and 139 times.

  1. Scalable Multicasting over Next-Generation Internet Design, Analysis and Applications

    CERN Document Server

    Tian, Xiaohua

    2013-01-01

    Next-generation Internet providers face high expectations, as contemporary users worldwide expect high-quality multimedia functionality in a landscape of ever-expanding network applications. This volume explores the critical research issue of turning today’s greatly enhanced hardware capacity to good use in designing a scalable multicast  protocol for supporting large-scale multimedia services. Linking new hardware to improved performance in the Internet’s next incarnation is a research hot-spot in the computer communications field.   The methodical presentation deals with the key questions in turn: from the mechanics of multicast protocols to current state-of-the-art designs, and from methods of theoretical analysis of these protocols to applying them in the ns2 network simulator, known for being hard to extend. The authors’ years of research in the field inform this thorough treatment, which covers details such as applying AOM (application-oriented multicast) protocol to IPTV provision and resolving...

  2. Content-Aware Scalability-Type Selection for Rate Adaptation of Scalable Video

    Directory of Open Access Journals (Sweden)

    Tekalp A Murat

    2007-01-01

    Full Text Available Scalable video coders provide different scaling options, such as temporal, spatial, and SNR scalabilities, where rate reduction by discarding enhancement layers of different scalability-type results in different kinds and/or levels of visual distortion depend on the content and bitrate. This dependency between scalability type, video content, and bitrate is not well investigated in the literature. To this effect, we first propose an objective function that quantifies flatness, blockiness, blurriness, and temporal jerkiness artifacts caused by rate reduction by spatial size, frame rate, and quantization parameter scaling. Next, the weights of this objective function are determined for different content (shot types and different bitrates using a training procedure with subjective evaluation. Finally, a method is proposed for choosing the best scaling type for each temporal segment that results in minimum visual distortion according to this objective function given the content type of temporal segments. Two subjective tests have been performed to validate the proposed procedure for content-aware selection of the best scalability type on soccer videos. Soccer videos scaled from 600 kbps to 100 kbps by the proposed content-aware selection of scalability type have been found visually superior to those that are scaled using a single scalability option over the whole sequence.

  3. Plasma accelerators

    International Nuclear Information System (INIS)

    Bingham, R.; Angelis, U. de; Johnston, T.W.

    1991-01-01

    Recently attention has focused on charged particle acceleration in a plasma by a fast, large amplitude, longitudinal electron plasma wave. The plasma beat wave and plasma wakefield accelerators are two efficient ways of producing ultra-high accelerating gradients. Starting with the plasma beat wave accelerator (PBWA) and laser wakefield accelerator (LWFA) schemes and the plasma wakefield accelerator (PWFA) steady progress has been made in theory, simulations and experiments. Computations are presented for the study of LWFA. (author)

  4. Linear Accelerators

    International Nuclear Information System (INIS)

    Vretenar, M

    2014-01-01

    The main features of radio-frequency linear accelerators are introduced, reviewing the different types of accelerating structures and presenting the main characteristics aspects of linac beam dynamics

  5. Computer hardware description languages - A tutorial

    Science.gov (United States)

    Shiva, S. G.

    1979-01-01

    The paper introduces hardware description languages (HDL) as useful tools for hardware design and documentation. The capabilities and limitations of HDLs are discussed along with the guidelines needed in selecting an appropriate HDL. The directions for future work are provided and attention is given to the implementation of HDLs in microcomputers.

  6. Implementation of the Lattice Boltzmann Method on Heterogeneous Hardware and Platforms using OpenCL

    Directory of Open Access Journals (Sweden)

    TEKIC, P. M.

    2012-02-01

    Full Text Available The Lattice Boltzmann method (LBM has become an alternative method for computational fluid dynamics with a wide range of applications. Besides its numerical stability and accuracy, one of the major advantages of LBM is its relatively easy parallelization and, hence, it is especially well fitted to many-core hardware as graphics processing units (GPU. The majority of work concerning LBM implementation on GPU's has used the CUDA programming model, supported exclusively by NVIDIA. Recently, the open standard for parallel programming of heterogeneous systems (OpenCL has been introduced. OpenCL standard matures and is supported on processors from most vendors. In this paper, we make use of the OpenCL framework for the lattice Boltzmann method simulation, using hardware accelerators - AMD ATI Radeon GPU, AMD Dual-Core CPU and NVIDIA GeForce GPU's. Application has been developed using a combination of Java and OpenCL programming languages. Java bindings for OpenCL have been utilized. This approach offers the benefits of hardware and operating system independence, as well as speeding up of lattice Boltzmann algorithm. It has been showed that the developed lattice Boltzmann source code can be executed without modification on all of the used hardware accelerators. Performance results have been presented and compared for the hardware accelerators that have been utilized.

  7. An evaluation of Skylab habitability hardware

    Science.gov (United States)

    Stokes, J.

    1974-01-01

    For effective mission performance, participants in space missions lasting 30-60 days or longer must be provided with hardware to accommodate their personal needs. Such habitability hardware was provided on Skylab. Equipment defined as habitability hardware was that equipment composing the food system, water system, sleep system, waste management system, personal hygiene system, trash management system, and entertainment equipment. Equipment not specifically defined as habitability hardware but which served that function were the Wardroom window, the exercise equipment, and the intercom system, which was occasionally used for private communications. All Skylab habitability hardware generally functioned as intended for the three missions, and most items could be considered as adequate concepts for future flights of similar duration. Specific components were criticized for their shortcomings.

  8. Comparative Modal Analysis of Sieve Hardware Designs

    Science.gov (United States)

    Thompson, Nathaniel

    2012-01-01

    The CMTB Thwacker hardware operates as a testbed analogue for the Flight Thwacker and Sieve components of CHIMRA, a device on the Curiosity Rover. The sieve separates particles with a diameter smaller than 150 microns for delivery to onboard science instruments. The sieving behavior of the testbed hardware should be similar to the Flight hardware for the results to be meaningful. The elastodynamic behavior of both sieves was studied analytically using the Rayleigh Ritz method in conjunction with classical plate theory. Finite element models were used to determine the mode shapes of both designs, and comparisons between the natural frequencies and mode shapes were made. The analysis predicts that the performance of the CMTB Thwacker will closely resemble the performance of the Flight Thwacker within the expected steady state operating regime. Excitations of the testbed hardware that will mimic the flight hardware were recommended, as were those that will improve the efficiency of the sieving process.

  9. Using scalable vector graphics to evolve art

    NARCIS (Netherlands)

    den Heijer, E.; Eiben, A. E.

    2016-01-01

    In this paper, we describe our investigations of the use of scalable vector graphics as a genotype representation in evolutionary art. We describe the technical aspects of using SVG in evolutionary art, and explain our custom, SVG specific operators initialisation, mutation and crossover. We perform

  10. Scalable Domain Decomposed Monte Carlo Particle Transport

    Energy Technology Data Exchange (ETDEWEB)

    O' Brien, Matthew Joseph [Univ. of California, Davis, CA (United States)

    2013-12-05

    In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation.

  11. Scalable Open Source Smart Grid Simulator (SGSim)

    DEFF Research Database (Denmark)

    Ebeid, Emad Samuel Malki; Jacobsen, Rune Hylsberg; Stefanni, Francesco

    2017-01-01

    . This paper presents an open source smart grid simulator (SGSim). The simulator is based on open source SystemC Network Simulation Library (SCNSL) and aims to model scalable smart grid applications. SGSim has been tested under different smart grid scenarios that contain hundreds of thousands of households...

  12. Cooperative Scalable Moving Continuous Query Processing

    DEFF Research Database (Denmark)

    Li, Xiaohui; Karras, Panagiotis; Jensen, Christian S.

    2012-01-01

    of the global view and handle the majority of the workload. Meanwhile, moving clients, having basic memory and computation resources, handle small portions of the workload. This model is further enhanced by dynamic region allocation and grid size adjustment mechanisms that reduce the communication...... and computation cost for both servers and clients. An experimental study demonstrates that our approaches offer better scalability than competitors...

  13. Scalable optical switches for computing applications

    NARCIS (Netherlands)

    White, I.H.; Aw, E.T.; Williams, K.A.; Wang, Haibo; Wonfor, A.; Penty, R.V.

    2009-01-01

    A scalable photonic interconnection network architecture is proposed whereby a Clos network is populated with broadcast-and-select stages. This enables the efficient exploitation of an emerging class of photonic integrated switch fabric. A low distortion space switch technology based on recently

  14. Toward a Scalable and Sustainable Intervention for Complementary Food Safety.

    Science.gov (United States)

    Rahman, Musarrat J; Nizame, Fosiul A; Nuruzzaman, Mohammad; Akand, Farhana; Islam, Mohammad Aminul; Parvez, Sarker Masud; Stewart, Christine P; Unicomb, Leanne; Luby, Stephen P; Winch, Peter J

    2016-06-01

    Contaminated complementary foods are associated with diarrhea and malnutrition among children aged 6 to 24 months. However, existing complementary food safety intervention models are likely not scalable and sustainable. To understand current behaviors, motivations for these behaviors, and the potential barriers to behavior change and to identify one or two simple actions that can address one or few food contamination pathways and have potential to be sustainably delivered to a larger population. Data were collected from 2 rural sites in Bangladesh through semistructured observations (12), video observations (12), in-depth interviews (18), and focus group discussions (3). Although mothers report preparing dedicated foods for children, observations show that these are not separate from family foods. Children are regularly fed store-bought foods that are perceived to be bad for children. Mothers explained that long storage durations, summer temperatures, flies, animals, uncovered food, and unclean utensils are threats to food safety. Covering foods, storing foods on elevated surfaces, and reheating foods before consumption are methods believed to keep food safe. Locally made cabinet-like hardware is perceived to be acceptable solution to address reported food safety threats. Conventional approaches that include teaching food safety and highlighting benefits such as reduced contamination may be a disincentive for rural mothers who need solutions for their physical environment. We propose extending existing beneficial behaviors by addressing local preferences of taste and convenience. © The Author(s) 2016.

  15. Highly scalable parallel processing of extracellular recordings of Multielectrode Arrays.

    Science.gov (United States)

    Gehring, Tiago V; Vasilaki, Eleni; Giugliano, Michele

    2015-01-01

    Technological advances of Multielectrode Arrays (MEAs) used for multisite, parallel electrophysiological recordings, lead to an ever increasing amount of raw data being generated. Arrays with hundreds up to a few thousands of electrodes are slowly seeing widespread use and the expectation is that more sophisticated arrays will become available in the near future. In order to process the large data volumes resulting from MEA recordings there is a pressing need for new software tools able to process many data channels in parallel. Here we present a new tool for processing MEA data recordings that makes use of new programming paradigms and recent technology developments to unleash the power of modern highly parallel hardware, such as multi-core CPUs with vector instruction sets or GPGPUs. Our tool builds on and complements existing MEA data analysis packages. It shows high scalability and can be used to speed up some performance critical pre-processing steps such as data filtering and spike detection, helping to make the analysis of larger data sets tractable.

  16. A Scalable proxy cache for Grid Data Access

    International Nuclear Information System (INIS)

    Cristian Cirstea, Traian; Just Keijser, Jan; Arthur Koeroo, Oscar; Starink, Ronald; Alan Templon, Jeffrey

    2012-01-01

    We describe a prototype grid proxy cache system developed at Nikhef, motivated by a desire to construct the first building block of a future https-based Content Delivery Network for grid infrastructures. Two goals drove the project: firstly to provide a “native view” of the grid for desktop-type users, and secondly to improve performance for physics-analysis type use cases, where multiple passes are made over the same set of data (residing on the grid). We further constrained the design by requiring that the system should be made of standard components wherever possible. The prototype that emerged from this exercise is a horizontally-scalable, cooperating system of web server / cache nodes, fronted by a customized webDAV server. The webDAV server is custom only in the sense that it supports http redirects (providing horizontal scaling) and that the authentication module has, as back end, a proxy delegation chain that can be used by the cache nodes to retrieve files from the grid. The prototype was deployed at Nikhef and tested at a scale of several terabytes of data and approximately one hundred fast cores of computing. Both small and large files were tested, in a number of scenarios, and with various numbers of cache nodes, in order to understand the scaling properties of the system. For properly-dimensioned cache-node hardware, the system showed speedup of several integer factors for the analysis-type use cases. These results and others are presented and discussed.

  17. PCI hardware support in LIA-2 control system

    International Nuclear Information System (INIS)

    Bolkhovityanov, D.; Cheblakov, P.

    2012-01-01

    The control system of the LIA-2 accelerator is built on cPCI crates with *86- compatible processor boards running Linux. Slow electronics is connected via CAN-bus, while fast electronics (4 MHz and 200 MHz fast ADCs and 200 MHz timers) are implemented as cPCI/PMC modules. Several ways to drive PCI control electronics in Linux were examined. Finally a user-space drivers approach was chosen. These drivers communicate with hardware via a small kernel module, which provides access to PCI BARs and to interrupt handling. This module was named USPCI (User-Space PCI access). This approach dramatically simplifies creation of drivers, as opposed to kernel drivers, and provides high reliability (because only a tiny and thoroughly-debugged piece of code runs in kernel). LIA-2 accelerator was successfully commissioned, and the solution chosen has proven adequate and very easy to use. Besides, USPCI turned out to be a handy tool for examination and debugging of PCI devices direct from command-line. In this paper available approaches to work with PCI control hardware in Linux are considered, and USPCI architecture is described. (authors)

  18. A Message-Passing Hardware/Software Cosimulation Environment for Reconfigurable Computing Systems

    Directory of Open Access Journals (Sweden)

    Manuel Saldaña

    2009-01-01

    Full Text Available High-performance reconfigurable computers (HPRCs provide a mix of standard processors and FPGAs to collectively accelerate applications. This introduces new design challenges, such as the need for portable programming models across HPRCs and system-level verification tools. To address the need for cosimulating a complete heterogeneous application using both software and hardware in an HPRC, we have created a tool called the Message-passing Simulation Framework (MSF. We have used it to simulate and develop an interface enabling an MPI-based approach to exchange data between X86 processors and hardware engines inside FPGAs. The MSF can also be used as an application development tool that enables multiple FPGAs in simulation to exchange messages amongst themselves and with X86 processors. As an example, we simulate a LINPACK benchmark hardware core using an Intel-FSB-Xilinx-FPGA platform to quickly prototype the hardware, to test the communications. and to verify the benchmark results.

  19. Accelerator Service

    International Nuclear Information System (INIS)

    Champelovier, Y.; Ferrari, M.; Gardon, A.; Hadinger, G.; Martin, J.; Plantier, A.

    1998-01-01

    Since the cessation of the operation of hydrogen cluster accelerator in July 1996, four electrostatic accelerators were in operation and used by the peri-nuclear teams working in multidisciplinary collaborations. These are the 4 MV Van de Graaff accelerator, 2,5 MV Van de Graaff accelerator, 400 kV ion implanter as well as the 120 kV isotope separator

  20. Scalable System Design for Covert MIMO Communications

    Science.gov (United States)

    2014-06-01

    Vehicles US United States VHDL VHSIC Hardware Description Language VLSI Very Large Scale Integration WARP Wireless open-Access Research Platform WLAN ...communications, satellite radio and Wireless Local Area Network ( WLAN ) OFDM has been utilized for its multi-path resistance. OFDM relies on the...develop hardware specific to the application provides faster computation times, making FPGA development a very powerful tool. 2.5.1 MIMO Receiver Latency

  1. Modern computer networks and distributed intelligence in accelerator controls

    International Nuclear Information System (INIS)

    Briegel, C.

    1991-01-01

    Appropriate hardware and software network protocols are surveyed for accelerator control environments. Accelerator controls network topologies are discussed with respect to the following criteria: vertical versus horizontal and distributed versus centralized. Decision-making considerations are provided for accelerator network architecture specification. Current trends and implementations at Fermilab are discussed

  2. Transmission delays in hardware clock synchronization

    Science.gov (United States)

    Shin, Kang G.; Ramanathan, P.

    1988-01-01

    Various methods, both with software and hardware, have been proposed to synchronize a set of physical clocks in a system. Software methods are very flexible and economical but suffer an excessive time overhead, whereas hardware methods require no time overhead but are unable to handle transmission delays in clock signals. The effects of nonzero transmission delays in synchronization have been studied extensively in the communication area in the absence of malicious or Byzantine faults. The authors show that it is easy to incorporate the ideas from the communication area into the existing hardware clock synchronization algorithms to take into account the presence of both malicious faults and nonzero transmission delays.

  3. FPGA-accelerated simulation of computer systems

    CERN Document Server

    Angepat, Hari; Chung, Eric S; Hoe, James C; Chung, Eric S

    2014-01-01

    To date, the most common form of simulators of computer systems are software-based running on standard computers. One promising approach to improve simulation performance is to apply hardware, specifically reconfigurable hardware in the form of field programmable gate arrays (FPGAs). This manuscript describes various approaches of using FPGAs to accelerate software-implemented simulation of computer systems and selected simulators that incorporate those techniques. More precisely, we describe a simulation architecture taxonomy that incorporates a simulation architecture specifically designed f

  4. Exploiting multi-scale parallelism for large scale numerical modelling of laser wakefield accelerators

    International Nuclear Information System (INIS)

    Fonseca, R A; Vieira, J; Silva, L O; Fiuza, F; Davidson, A; Tsung, F S; Mori, W B

    2013-01-01

    A new generation of laser wakefield accelerators (LWFA), supported by the extreme accelerating fields generated in the interaction of PW-Class lasers and underdense targets, promises the production of high quality electron beams in short distances for multiple applications. Achieving this goal will rely heavily on numerical modelling to further understand the underlying physics and identify optimal regimes, but large scale modelling of these scenarios is computationally heavy and requires the efficient use of state-of-the-art petascale supercomputing systems. We discuss the main difficulties involved in running these simulations and the new developments implemented in the OSIRIS framework to address these issues, ranging from multi-dimensional dynamic load balancing and hybrid distributed/shared memory parallelism to the vectorization of the PIC algorithm. We present the results of the OASCR Joule Metric program on the issue of large scale modelling of LWFA, demonstrating speedups of over 1 order of magnitude on the same hardware. Finally, scalability to over ∼10 6 cores and sustained performance over ∼2 P Flops is demonstrated, opening the way for large scale modelling of LWFA scenarios. (paper)

  5. Hardware-in-the-Loop Testing

    Data.gov (United States)

    Federal Laboratory Consortium — RTC has a suite of Hardware-in-the Loop facilities that include three operational facilities that provide performance assessment and production acceptance testing of...

  6. Hardware device binding and mutual authentication

    Science.gov (United States)

    Hamlet, Jason R; Pierson, Lyndon G

    2014-03-04

    Detection and deterrence of device tampering and subversion by substitution may be achieved by including a cryptographic unit within a computing device for binding multiple hardware devices and mutually authenticating the devices. The cryptographic unit includes a physically unclonable function ("PUF") circuit disposed in or on the hardware device, which generates a binding PUF value. The cryptographic unit uses the binding PUF value during an enrollment phase and subsequent authentication phases. During a subsequent authentication phase, the cryptographic unit uses the binding PUF values of the multiple hardware devices to generate a challenge to send to the other device, and to verify a challenge received from the other device to mutually authenticate the hardware devices.

  7. Research of Virtual Accelerator Control System

    Institute of Scientific and Technical Information of China (English)

    DongJinmei; YuanYoujin; ZhengJianhua

    2003-01-01

    A Virtual Accelerator is a computer process which simulates behavior of beam in an accelerator and responds to the accelerator control program under development in a same way as an actual accelerator. To realize Virtual Accelerator, control system should provide the same program interface to top layer Application Control Program, it can make 'Real Accelerator' and 'Virtual Accelerator'use the same GUI, so control system should have a layer to hide hardware details, Application Control Program access control devices through logical name but not through coded hardware address. Without this layer, it is difficult to develop application program which can access both 'Virtual' and 'Real' Accelerators using same program interfaces. For this reason, we can create CSR Runtime Database which allows application program to access hardware devices and data on a simulation process in a unified way. A device 'is represented as a collection of records in CSR Runtime Database. A control program on host computer can access devices in the system only through names of record fields, called channel.

  8. Cooperative communications hardware, channel and PHY

    CERN Document Server

    Dohler, Mischa

    2010-01-01

    Facilitating Cooperation for Wireless Systems Cooperative Communications: Hardware, Channel & PHY focuses on issues pertaining to the PHY layer of wireless communication networks, offering a rigorous taxonomy of this dispersed field, along with a range of application scenarios for cooperative and distributed schemes, demonstrating how these techniques can be employed. The authors discuss hardware, complexity and power consumption issues, which are vital for understanding what can be realized at the PHY layer, showing how wireless channel models differ from more traditional

  9. Designing Secure Systems on Reconfigurable Hardware

    OpenAIRE

    Huffmire, Ted; Brotherton, Brett; Callegari, Nick; Valamehr, Jonathan; White, Jeff; Kastner, Ryan; Sherwood, Ted

    2008-01-01

    The extremely high cost of custom ASIC fabrication makes FPGAs an attractive alternative for deployment of custom hardware. Embedded systems based on reconfigurable hardware integrate many functions onto a single device. Since embedded designers often have no choice but to use soft IP cores obtained from third parties, the cores operate at different trust levels, resulting in mixed trust designs. The goal of this project is to evaluate recently proposed security primitives for reconfigurab...

  10. IDD Archival Hardware Architecture and Workflow

    Energy Technology Data Exchange (ETDEWEB)

    Mendonsa, D; Nekoogar, F; Martz, H

    2008-10-09

    This document describes the functionality of every component in the DHS/IDD archival and storage hardware system shown in Fig. 1. The document describes steps by step process of image data being received at LLNL then being processed and made available to authorized personnel and collaborators. Throughout this document references will be made to one of two figures, Fig. 1 describing the elements of the architecture and the Fig. 2 describing the workflow and how the project utilizes the available hardware.

  11. Hardware Genetic Algorithm Optimization by Critical Path Analysis using a Custom VLSI Architecture

    Directory of Open Access Journals (Sweden)

    Farouk Smith

    2015-07-01

    Full Text Available This paper propose a Virtual-Field Programmable Gate Array (V-FPGA architecture that allows direct access to its configuration bits to facilitate hardware evolution, thereby allowing any combinational or sequential digital circuit to be realized. By using the V-FPGA, this paper investigates two possible ways of making evolutionary hardware systems more scalable: by optimizing the system’s genetic algorithm (GA; and by decomposing the solution circuit into smaller, evolvable sub-circuits. GA optimization is done by: omitting a canonical GA’s crossover operator (i.e. by using a 1+λ algorithm; applying evolution constraints; and optimizing the fitness function. A noteworthy contribution this research has made is the in-depth analysis of the phenotypes’ CPs. Through analyzing the CPs, it has been shown that a great amount of insight can be gained into a phenotype’s fitness. We found that as the number of columns in the Cartesian Genetic Programming array increases, so the likelihood of an external output being placed in the column decreases. Furthermore, the number of used LEs per column also substantially decreases per added column. Finally, we demonstrated the evolution of a state-decomposed control circuit. It was shown that the evolution of each state’s sub-circuit was possible, and suggest that modular evolution can be a successful tool when dealing with scalability.

  12. Scalable Algorithms for Adaptive Statistical Designs

    Directory of Open Access Journals (Sweden)

    Robert Oehmke

    2000-01-01

    Full Text Available We present a scalable, high-performance solution to multidimensional recurrences that arise in adaptive statistical designs. Adaptive designs are an important class of learning algorithms for a stochastic environment, and we focus on the problem of optimally assigning patients to treatments in clinical trials. While adaptive designs have significant ethical and cost advantages, they are rarely utilized because of the complexity of optimizing and analyzing them. Computational challenges include massive memory requirements, few calculations per memory access, and multiply-nested loops with dynamic indices. We analyze the effects of various parallelization options, and while standard approaches do not work well, with effort an efficient, highly scalable program can be developed. This allows us to solve problems thousands of times more complex than those solved previously, which helps make adaptive designs practical. Further, our work applies to many other problems involving neighbor recurrences, such as generalized string matching.

  13. Scalable Packet Classification with Hash Tables

    Science.gov (United States)

    Wang, Pi-Chung

    In the last decade, the technique of packet classification has been widely deployed in various network devices, including routers, firewalls and network intrusion detection systems. In this work, we improve the performance of packet classification by using multiple hash tables. The existing hash-based algorithms have superior scalability with respect to the required space; however, their search performance may not be comparable to other algorithms. To improve the search performance, we propose a tuple reordering algorithm to minimize the number of accessed hash tables with the aid of bitmaps. We also use pre-computation to ensure the accuracy of our search procedure. Performance evaluation based on both real and synthetic filter databases shows that our scheme is effective and scalable and the pre-computation cost is moderate.

  14. Scalable fabrication of perovskite solar cells

    Energy Technology Data Exchange (ETDEWEB)

    Li, Zhen; Klein, Talysa R.; Kim, Dong Hoe; Yang, Mengjin; Berry, Joseph J.; van Hest, Maikel F. A. M.; Zhu, Kai

    2018-03-27

    Perovskite materials use earth-abundant elements, have low formation energies for deposition and are compatible with roll-to-roll and other high-volume manufacturing techniques. These features make perovskite solar cells (PSCs) suitable for terawatt-scale energy production with low production costs and low capital expenditure. Demonstrations of performance comparable to that of other thin-film photovoltaics (PVs) and improvements in laboratory-scale cell stability have recently made scale up of this PV technology an intense area of research focus. Here, we review recent progress and challenges in scaling up PSCs and related efforts to enable the terawatt-scale manufacturing and deployment of this PV technology. We discuss common device and module architectures, scalable deposition methods and progress in the scalable deposition of perovskite and charge-transport layers. We also provide an overview of device and module stability, module-level characterization techniques and techno-economic analyses of perovskite PV modules.

  15. Scalable Atomistic Simulation Algorithms for Materials Research

    Directory of Open Access Journals (Sweden)

    Aiichiro Nakano

    2002-01-01

    Full Text Available A suite of scalable atomistic simulation programs has been developed for materials research based on space-time multiresolution algorithms. Design and analysis of parallel algorithms are presented for molecular dynamics (MD simulations and quantum-mechanical (QM calculations based on the density functional theory. Performance tests have been carried out on 1,088-processor Cray T3E and 1,280-processor IBM SP3 computers. The linear-scaling algorithms have enabled 6.44-billion-atom MD and 111,000-atom QM calculations on 1,024 SP3 processors with parallel efficiency well over 90%. production-quality programs also feature wavelet-based computational-space decomposition for adaptive load balancing, spacefilling-curve-based adaptive data compression with user-defined error bound for scalable I/O, and octree-based fast visibility culling for immersive and interactive visualization of massive simulation data.

  16. Software for Managing Inventory of Flight Hardware

    Science.gov (United States)

    Salisbury, John; Savage, Scott; Thomas, Shirman

    2003-01-01

    The Flight Hardware Support Request System (FHSRS) is a computer program that relieves engineers at Marshall Space Flight Center (MSFC) of most of the non-engineering administrative burden of managing an inventory of flight hardware. The FHSRS can also be adapted to perform similar functions for other organizations. The FHSRS affords a combination of capabilities, including those formerly provided by three separate programs in purchasing, inventorying, and inspecting hardware. The FHSRS provides a Web-based interface with a server computer that supports a relational database of inventory; electronic routing of requests and approvals; and electronic documentation from initial request through implementation of quality criteria, acquisition, receipt, inspection, storage, and final issue of flight materials and components. The database lists both hardware acquired for current projects and residual hardware from previous projects. The increased visibility of residual flight components provided by the FHSRS has dramatically improved the re-utilization of materials in lieu of new procurements, resulting in a cost savings of over $1.7 million. The FHSRS includes subprograms for manipulating the data in the database, informing of the status of a request or an item of hardware, and searching the database on any physical or other technical characteristic of a component or material. The software structure forces normalization of the data to facilitate inquiries and searches for which users have entered mixed or inconsistent values.

  17. Scalable manufacturing processes with soft materials

    OpenAIRE

    White, Edward; Case, Jennifer; Kramer, Rebecca

    2014-01-01

    The emerging field of soft robotics will benefit greatly from new scalable manufacturing techniques for responsive materials. Currently, most of soft robotic examples are fabricated one-at-a-time, using techniques borrowed from lithography and 3D printing to fabricate molds. This limits both the maximum and minimum size of robots that can be fabricated, and hinders batch production, which is critical to gain wider acceptance for soft robotic systems. We have identified electrical structures, ...

  18. Architecture Knowledge for Evaluating Scalable Databases

    Science.gov (United States)

    2015-01-16

    Architecture Knowledge for Evaluating Scalable Databases 5a. CONTRACT NUMBER 5b. GRANT NUMBER 5c. PROGRAM ELEMENT NUMBER 6. AUTHOR(S) Nurgaliev... Scala , Erlang, Javascript Cursor-based queries Supported, Not Supported JOIN queries Supported, Not Supported Complex data types Lists, maps, sets...is therefore needed, using technology such as machine learning to extract content from product documentation. The terminology used in the database

  19. Randomized Algorithms for Scalable Machine Learning

    OpenAIRE

    Kleiner, Ariel Jacob

    2012-01-01

    Many existing procedures in machine learning and statistics are computationally intractable in the setting of large-scale data. As a result, the advent of rapidly increasing dataset sizes, which should be a boon yielding improved statistical performance, instead severely blunts the usefulness of a variety of existing inferential methods. In this work, we use randomness to ameliorate this lack of scalability by reducing complex, computationally difficult inferential problems to larger sets o...

  20. Bitcoin-NG: A Scalable Blockchain Protocol

    OpenAIRE

    Eyal, Ittay; Gencer, Adem Efe; Sirer, Emin Gun; van Renesse, Robbert

    2015-01-01

    Cryptocurrencies, based on and led by Bitcoin, have shown promise as infrastructure for pseudonymous online payments, cheap remittance, trustless digital asset exchange, and smart contracts. However, Bitcoin-derived blockchain protocols have inherent scalability limits that trade-off between throughput and latency and withhold the realization of this potential. This paper presents Bitcoin-NG, a new blockchain protocol designed to scale. Based on Bitcoin's blockchain protocol, Bitcoin-NG is By...

  1. Scuba: scalable kernel-based gene prioritization.

    Science.gov (United States)

    Zampieri, Guido; Tran, Dinh Van; Donini, Michele; Navarin, Nicolò; Aiolli, Fabio; Sperduti, Alessandro; Valle, Giorgio

    2018-01-25

    The uncovering of genes linked to human diseases is a pressing challenge in molecular biology and precision medicine. This task is often hindered by the large number of candidate genes and by the heterogeneity of the available information. Computational methods for the prioritization of candidate genes can help to cope with these problems. In particular, kernel-based methods are a powerful resource for the integration of heterogeneous biological knowledge, however, their practical implementation is often precluded by their limited scalability. We propose Scuba, a scalable kernel-based method for gene prioritization. It implements a novel multiple kernel learning approach, based on a semi-supervised perspective and on the optimization of the margin distribution. Scuba is optimized to cope with strongly unbalanced settings where known disease genes are few and large scale predictions are required. Importantly, it is able to efficiently deal both with a large amount of candidate genes and with an arbitrary number of data sources. As a direct consequence of scalability, Scuba integrates also a new efficient strategy to select optimal kernel parameters for each data source. We performed cross-validation experiments and simulated a realistic usage setting, showing that Scuba outperforms a wide range of state-of-the-art methods. Scuba achieves state-of-the-art performance and has enhanced scalability compared to existing kernel-based approaches for genomic data. This method can be useful to prioritize candidate genes, particularly when their number is large or when input data is highly heterogeneous. The code is freely available at https://github.com/gzampieri/Scuba .

  2. A scalable distributed RRT for motion planning

    KAUST Repository

    Jacobs, Sam Ade

    2013-05-01

    Rapidly-exploring Random Tree (RRT), like other sampling-based motion planning methods, has been very successful in solving motion planning problems. Even so, sampling-based planners cannot solve all problems of interest efficiently, so attention is increasingly turning to parallelizing them. However, one challenge in parallelizing RRT is the global computation and communication overhead of nearest neighbor search, a key operation in RRTs. This is a critical issue as it limits the scalability of previous algorithms. We present two parallel algorithms to address this problem. The first algorithm extends existing work by introducing a parameter that adjusts how much local computation is done before a global update. The second algorithm radially subdivides the configuration space into regions, constructs a portion of the tree in each region in parallel, and connects the subtrees,i removing cycles if they exist. By subdividing the space, we increase computation locality enabling a scalable result. We show that our approaches are scalable. We present results demonstrating almost linear scaling to hundreds of processors on a Linux cluster and a Cray XE6 machine. © 2013 IEEE.

  3. A scalable distributed RRT for motion planning

    KAUST Repository

    Jacobs, Sam Ade; Stradford, Nicholas; Rodriguez, Cesar; Thomas, Shawna; Amato, Nancy M.

    2013-01-01

    Rapidly-exploring Random Tree (RRT), like other sampling-based motion planning methods, has been very successful in solving motion planning problems. Even so, sampling-based planners cannot solve all problems of interest efficiently, so attention is increasingly turning to parallelizing them. However, one challenge in parallelizing RRT is the global computation and communication overhead of nearest neighbor search, a key operation in RRTs. This is a critical issue as it limits the scalability of previous algorithms. We present two parallel algorithms to address this problem. The first algorithm extends existing work by introducing a parameter that adjusts how much local computation is done before a global update. The second algorithm radially subdivides the configuration space into regions, constructs a portion of the tree in each region in parallel, and connects the subtrees,i removing cycles if they exist. By subdividing the space, we increase computation locality enabling a scalable result. We show that our approaches are scalable. We present results demonstrating almost linear scaling to hundreds of processors on a Linux cluster and a Cray XE6 machine. © 2013 IEEE.

  4. Scalable robotic biofabrication of tissue spheroids

    International Nuclear Information System (INIS)

    Mehesz, A Nagy; Hajdu, Z; Visconti, R P; Markwald, R R; Mironov, V; Brown, J; Beaver, W; Da Silva, J V L

    2011-01-01

    Development of methods for scalable biofabrication of uniformly sized tissue spheroids is essential for tissue spheroid-based bioprinting of large size tissue and organ constructs. The most recent scalable technique for tissue spheroid fabrication employs a micromolded recessed template prepared in a non-adhesive hydrogel, wherein the cells loaded into the template self-assemble into tissue spheroids due to gravitational force. In this study, we present an improved version of this technique. A new mold was designed to enable generation of 61 microrecessions in each well of a 96-well plate. The microrecessions were seeded with cells using an EpMotion 5070 automated pipetting machine. After 48 h of incubation, tissue spheroids formed at the bottom of each microrecession. To assess the quality of constructs generated using this technology, 600 tissue spheroids made by this method were compared with 600 spheroids generated by the conventional hanging drop method. These analyses showed that tissue spheroids fabricated by the micromolded method are more uniform in diameter. Thus, use of micromolded recessions in a non-adhesive hydrogel, combined with automated cell seeding, is a reliable method for scalable robotic fabrication of uniform-sized tissue spheroids.

  5. Scalable robotic biofabrication of tissue spheroids

    Energy Technology Data Exchange (ETDEWEB)

    Mehesz, A Nagy; Hajdu, Z; Visconti, R P; Markwald, R R; Mironov, V [Advanced Tissue Biofabrication Center, Department of Regenerative Medicine and Cell Biology, Medical University of South Carolina, Charleston, SC (United States); Brown, J [Department of Mechanical Engineering, Clemson University, Clemson, SC (United States); Beaver, W [York Technical College, Rock Hill, SC (United States); Da Silva, J V L, E-mail: mironovv@musc.edu [Renato Archer Information Technology Center-CTI, Campinas (Brazil)

    2011-06-15

    Development of methods for scalable biofabrication of uniformly sized tissue spheroids is essential for tissue spheroid-based bioprinting of large size tissue and organ constructs. The most recent scalable technique for tissue spheroid fabrication employs a micromolded recessed template prepared in a non-adhesive hydrogel, wherein the cells loaded into the template self-assemble into tissue spheroids due to gravitational force. In this study, we present an improved version of this technique. A new mold was designed to enable generation of 61 microrecessions in each well of a 96-well plate. The microrecessions were seeded with cells using an EpMotion 5070 automated pipetting machine. After 48 h of incubation, tissue spheroids formed at the bottom of each microrecession. To assess the quality of constructs generated using this technology, 600 tissue spheroids made by this method were compared with 600 spheroids generated by the conventional hanging drop method. These analyses showed that tissue spheroids fabricated by the micromolded method are more uniform in diameter. Thus, use of micromolded recessions in a non-adhesive hydrogel, combined with automated cell seeding, is a reliable method for scalable robotic fabrication of uniform-sized tissue spheroids.

  6. Future accelerators (?)

    Energy Technology Data Exchange (ETDEWEB)

    John Womersley

    2003-08-21

    I describe the future accelerator facilities that are currently foreseen for electroweak scale physics, neutrino physics, and nuclear structure. I will explore the physics justification for these machines, and suggest how the case for future accelerators can be made.

  7. Acceleration of polarized protons in the IHEP accelerator complex

    International Nuclear Information System (INIS)

    Anferov, V.A.; Ado, Yu.M.; Shoumkin, D.

    1995-01-01

    The paper considers possibility to accelerate polarized beam in the IHEP accelerator complex (including first stage of the UNK). The scheme of preserving beam polarization is described for all acceleration stages up to 400 GeV beam energy. Polarization and intensity of the polarized proton beam are estimated. The suggested scheme includes using two Siberian snakes in opposite straight sections of the UNK-1, where each snake consists of five dipole magnets. In the U-70 it is suggested to use one helical Siberian snake, which is turned on adiabatically at 10 GeV, and four pulsed quadrupoles. To incorporate the snake into the accelerator lattice it is proposed to make modification of one superperiod. This would make a 13 m long straight section. Spin depolarization in the Booster is avoided by decreasing the extraction energy to 0.9 GeV. Then no additional hardware is required in the Booster

  8. Numeric Analysis for Relationship-Aware Scalable Streaming Scheme

    Directory of Open Access Journals (Sweden)

    Heung Ki Lee

    2014-01-01

    Full Text Available Frequent packet loss of media data is a critical problem that degrades the quality of streaming services over mobile networks. Packet loss invalidates frames containing lost packets and other related frames at the same time. Indirect loss caused by losing packets decreases the quality of streaming. A scalable streaming service can decrease the amount of dropped multimedia resulting from a single packet loss. Content providers typically divide one large media stream into several layers through a scalable streaming service and then provide each scalable layer to the user depending on the mobile network. Also, a scalable streaming service makes it possible to decode partial multimedia data depending on the relationship between frames and layers. Therefore, a scalable streaming service provides a way to decrease the wasted multimedia data when one packet is lost. However, the hierarchical structure between frames and layers of scalable streams determines the service quality of the scalable streaming service. Even if whole packets of layers are transmitted successfully, they cannot be decoded as a result of the absence of reference frames and layers. Therefore, the complicated relationship between frames and layers in a scalable stream increases the volume of abandoned layers. For providing a high-quality scalable streaming service, we choose a proper relationship between scalable layers as well as the amount of transmitted multimedia data depending on the network situation. We prove that a simple scalable scheme outperforms a complicated scheme in an error-prone network. We suggest an adaptive set-top box (AdaptiveSTB to lower the dependency between scalable layers in a scalable stream. Also, we provide a numerical model to obtain the indirect loss of multimedia data and apply it to various multimedia streams. Our AdaptiveSTB enhances the quality of a scalable streaming service by removing indirect loss.

  9. Decentralized control of a scalable photovoltaic (PV)-battery hybrid power system

    International Nuclear Information System (INIS)

    Kim, Myungchin; Bae, Sungwoo

    2017-01-01

    Highlights: • This paper introduces the design and control of a PV-battery hybrid power system. • Reliable and scalable operation of hybrid power systems is achieved. • System and power control are performed without a centralized controller. • Reliability and scalability characteristics are studied in a quantitative manner. • The system control performance is verified using realistic solar irradiation data. - Abstract: This paper presents the design and control of a sustainable standalone photovoltaic (PV)-battery hybrid power system (HPS). The research aims to develop an approach that contributes to increased level of reliability and scalability for an HPS. To achieve such objectives, a PV-battery HPS with a passively connected battery was studied. A quantitative hardware reliability analysis was performed to assess the effect of energy storage configuration to the overall system reliability. Instead of requiring the feedback control information of load power through a centralized supervisory controller, the power flow in the proposed HPS is managed by a decentralized control approach that takes advantage of the system architecture. Reliable system operation of an HPS is achieved through the proposed control approach by not requiring a separate supervisory controller. Furthermore, performance degradation of energy storage can be prevented by selecting the controller gains such that the charge rate does not exceed operational requirements. The performance of the proposed system architecture with the control strategy was verified by simulation results using realistic irradiance data and a battery model in which its temperature effect was considered. With an objective to support scalable operation, details on how the proposed design could be applied were also studied so that the HPS could satisfy potential system growth requirements. Such scalability was verified by simulating various cases that involve connection and disconnection of sources and loads. The

  10. Scalable and balanced dynamic hybrid data assimilation

    Science.gov (United States)

    Kauranne, Tuomo; Amour, Idrissa; Gunia, Martin; Kallio, Kari; Lepistö, Ahti; Koponen, Sampsa

    2017-04-01

    Scalability of complex weather forecasting suites is dependent on the technical tools available for implementing highly parallel computational kernels, but to an equally large extent also on the dependence patterns between various components of the suite, such as observation processing, data assimilation and the forecast model. Scalability is a particular challenge for 4D variational assimilation methods that necessarily couple the forecast model into the assimilation process and subject this combination to an inherently serial quasi-Newton minimization process. Ensemble based assimilation methods are naturally more parallel, but large models force ensemble sizes to be small and that results in poor assimilation accuracy, somewhat akin to shooting with a shotgun in a million-dimensional space. The Variational Ensemble Kalman Filter (VEnKF) is an ensemble method that can attain the accuracy of 4D variational data assimilation with a small ensemble size. It achieves this by processing a Gaussian approximation of the current error covariance distribution, instead of a set of ensemble members, analogously to the Extended Kalman Filter EKF. Ensemble members are re-sampled every time a new set of observations is processed from a new approximation of that Gaussian distribution which makes VEnKF a dynamic assimilation method. After this a smoothing step is applied that turns VEnKF into a dynamic Variational Ensemble Kalman Smoother VEnKS. In this smoothing step, the same process is iterated with frequent re-sampling of the ensemble but now using past iterations as surrogate observations until the end result is a smooth and balanced model trajectory. In principle, VEnKF could suffer from similar scalability issues as 4D-Var. However, this can be avoided by isolating the forecast model completely from the minimization process by implementing the latter as a wrapper code whose only link to the model is calling for many parallel and totally independent model runs, all of them

  11. A Scalable Data Access Layer to Manage Structured Heterogeneous Biomedical Data.

    Directory of Open Access Journals (Sweden)

    Giovanni Delussu

    Full Text Available This work presents a scalable data access layer, called PyEHR, designed to support the implementation of data management systems for secondary use of structured heterogeneous biomedical and clinical data. PyEHR adopts the openEHR's formalisms to guarantee the decoupling of data descriptions from implementation details and exploits structure indexing to accelerate searches. Data persistence is guaranteed by a driver layer with a common driver interface. Interfaces for two NoSQL Database Management Systems are already implemented: MongoDB and Elasticsearch. We evaluated the scalability of PyEHR experimentally through two types of tests, called "Constant Load" and "Constant Number of Records", with queries of increasing complexity on synthetic datasets of ten million records each, containing very complex openEHR archetype structures, distributed on up to ten computing nodes.

  12. A Scalable Data Access Layer to Manage Structured Heterogeneous Biomedical Data

    Science.gov (United States)

    Lianas, Luca; Frexia, Francesca; Zanetti, Gianluigi

    2016-01-01

    This work presents a scalable data access layer, called PyEHR, designed to support the implementation of data management systems for secondary use of structured heterogeneous biomedical and clinical data. PyEHR adopts the openEHR’s formalisms to guarantee the decoupling of data descriptions from implementation details and exploits structure indexing to accelerate searches. Data persistence is guaranteed by a driver layer with a common driver interface. Interfaces for two NoSQL Database Management Systems are already implemented: MongoDB and Elasticsearch. We evaluated the scalability of PyEHR experimentally through two types of tests, called “Constant Load” and “Constant Number of Records”, with queries of increasing complexity on synthetic datasets of ten million records each, containing very complex openEHR archetype structures, distributed on up to ten computing nodes. PMID:27936191

  13. A Scalable Data Access Layer to Manage Structured Heterogeneous Biomedical Data.

    Science.gov (United States)

    Delussu, Giovanni; Lianas, Luca; Frexia, Francesca; Zanetti, Gianluigi

    2016-01-01

    This work presents a scalable data access layer, called PyEHR, designed to support the implementation of data management systems for secondary use of structured heterogeneous biomedical and clinical data. PyEHR adopts the openEHR's formalisms to guarantee the decoupling of data descriptions from implementation details and exploits structure indexing to accelerate searches. Data persistence is guaranteed by a driver layer with a common driver interface. Interfaces for two NoSQL Database Management Systems are already implemented: MongoDB and Elasticsearch. We evaluated the scalability of PyEHR experimentally through two types of tests, called "Constant Load" and "Constant Number of Records", with queries of increasing complexity on synthetic datasets of ten million records each, containing very complex openEHR archetype structures, distributed on up to ten computing nodes.

  14. Modern control techniques for accelerators

    International Nuclear Information System (INIS)

    Goodwin, R.W.; Shea, M.F.

    1984-05-01

    Beginning in the mid to late sixties, most new accelerators were designed to include computer based control systems. Although each installation differed in detail, the technology of the sixties and early to mid seventies dictated an architecture that was essentially the same for the control systems of that era. A mini-computer was connected to the hardware and to a console. Two developments have changed the architecture of modern systems: (a) the microprocessor and (b) local area networks. This paper discusses these two developments and demonstrates their impact on control system design and implementation by way of describing a possible architecture for any size of accelerator. Both hardware and software aspects are included

  15. VEG-01: Veggie Hardware Verification Testing

    Science.gov (United States)

    Massa, Gioia; Newsham, Gary; Hummerick, Mary; Morrow, Robert; Wheeler, Raymond

    2013-01-01

    The Veggie plant/vegetable production system is scheduled to fly on ISS at the end of2013. Since much of the technology associated with Veggie has not been previously tested in microgravity, a hardware validation flight was initiated. This test will allow data to be collected about Veggie hardware functionality on ISS, allow crew interactions to be vetted for future improvements, validate the ability of the hardware to grow and sustain plants, and collect data that will be helpful to future Veggie investigators as they develop their payloads. Additionally, food safety data on the lettuce plants grown will be collected to help support the development of a pathway for the crew to safely consume produce grown on orbit. Significant background research has been performed on the Veggie plant growth system, with early tests focusing on the development of the rooting pillow concept, and the selection of fertilizer, rooting medium and plant species. More recent testing has been conducted to integrate the pillow concept into the Veggie hardware and to ensure that adequate water is provided throughout the growth cycle. Seed sanitation protocols have been established for flight, and hardware sanitation between experiments has been studied. Methods for shipping and storage of rooting pillows and the development of crew procedures and crew training videos for plant activities on-orbit have been established. Science verification testing was conducted and lettuce plants were successfully grown in prototype Veggie hardware, microbial samples were taken, plant were harvested, frozen, stored and later analyzed for microbial growth, nutrients, and A TP levels. An additional verification test, prior to the final payload verification testing, is desired to demonstrate similar growth in the flight hardware and also to test a second set of pillows containing zinnia seeds. Issues with root mat water supply are being resolved, with final testing and flight scheduled for later in 2013.

  16. From Open Source Software to Open Source Hardware

    OpenAIRE

    Viseur , Robert

    2012-01-01

    Part 2: Lightning Talks; International audience; The open source software principles progressively give rise to new initiatives for culture (free culture), data (open data) or hardware (open hardware). The open hardware is experiencing a significant growth but the business models and legal aspects are not well known. This paper is dedicated to the economics of open hardware. We define the open hardware concept and determine intellectual property tools we can apply to open hardware, with a str...

  17. Principle of accelerator mass spectrometry

    International Nuclear Information System (INIS)

    Matsuzaki, Hiroyuki

    2007-01-01

    The principle of accelerator mass spectrometry (AMS) is described mainly on technical aspects: hardware construction of AMS, measurement of isotope ratio, sensitivity of measurement (measuring limit), measuring accuracy, and application of data. The content may be summarized as follows: rare isotope (often long-lived radioactive isotope) can be detected by various use of the ion energy obtained by the acceleration of ions, a measurable isotope ratio is one of rare isotope to abundant isotopes, and a measured value of isotope ratio is uncertainty to true one. Such a fact must be kept in mind on the use of AMS data to application research. (M.H.)

  18. Hardware-in-the-loop vehicle system including dynamic fuel cell model

    Energy Technology Data Exchange (ETDEWEB)

    Lemes, Z.; Lenhart, T.; Braun, M.; Maencher, H. [MAGNUM Automatisierungstechnik GmbH, Darmstadt (Germany)

    2005-07-01

    In order to reduce costs and accelerate the development of fuel cells and systems the usage of hardware-in-the-loop (HIL) testing and dynamic modelling opens new possibilities. The dynamic model of a proton exchange membrane fuel cell (PEMFC) together with a vehicle model is used to carry out a comprehensive system investigation, which allows designing and optimising the behaviour of the components and the entire fuel cell system. The set-up of a HIL system enables real time interaction between the selected hardware and the model. (orig.)

  19. Comparison Of Hybrid Sorting Algorithms Implemented On Different Parallel Hardware Platforms

    Directory of Open Access Journals (Sweden)

    Dominik Zurek

    2013-01-01

    Full Text Available Sorting is a common problem in computer science. There are lot of well-known sorting algorithms created for sequential execution on a single processor. Recently, hardware platforms enable to create wide parallel algorithms. We have standard processors consist of multiple cores and hardware accelerators like GPU. The graphic cards with their parallel architecture give new possibility to speed up many algorithms. In this paper we describe results of implementation of a few different sorting algorithms on GPU cards and multicore processors. Then hybrid algorithm will be presented which consists of parts executed on both platforms, standard CPU and GPU.

  20. Flight Hardware Virtualization for On-Board Science Data Processing

    Data.gov (United States)

    National Aeronautics and Space Administration — Utilize Hardware Virtualization technology to benefit on-board science data processing by investigating new real time embedded Hardware Virtualization solutions and...

  1. Non-fuel bearing hardware melting technology

    International Nuclear Information System (INIS)

    Newman, D.F.

    1993-01-01

    Battelle has developed a portable hardware melter concept that would allow spent fuel rod consolidation operations at commercial nuclear power plants to provide significantly more storage space for other spent fuel assemblies in existing pool racks at lower cost. Using low pressure compaction, the non-fuel bearing hardware (NFBH) left over from the removal of spent fuel rods from the stainless steel end fittings and the Zircaloy guide tubes and grid spacers still occupies 1/3 to 2/5 of the volume of the consolidated fuel rod assemblies. Melting the non-fuel bearing hardware reduces its volume by a factor 4 from that achievable with low-pressure compaction. This paper describes: (1) the configuration and design features of Battelle's hardware melter system that permit its portability, (2) the system's throughput capacity, (3) the bases for capital and operating estimates, and (4) the status of NFBH melter demonstration to reduce technical risks for implementation of the concept. Since all NFBH handling and processing operations would be conducted at the reactor site, costs for shipping radioactive hardware to and from a stationary processing facility for volume reduction are avoided. Initial licensing, testing, and installation in the field would follow the successful pattern achieved with rod consolidation technology

  2. Reconfigurable ATCA hardware for plasma control and data acquisition

    Energy Technology Data Exchange (ETDEWEB)

    Carvalho, B.B., E-mail: bernardo@ipfn.ist.utl.p [Associacao EURATOM/IST Instituto de Plasmas e Fusao Nuclear, Instituto Superior Tecnico, Av. Rovisco Pais, 1049-001 Lisboa (Portugal); Batista, A.J.N.; Correia, M.; Neto, A.; Fernandes, H.; Goncalves, B.; Sousa, J. [Associacao EURATOM/IST Instituto de Plasmas e Fusao Nuclear, Instituto Superior Tecnico, Av. Rovisco Pais, 1049-001 Lisboa (Portugal)

    2010-07-15

    The IST/EURATOM Association is developing a new generation of control and data acquisition hardware for fusion experiments based on the ATCA architecture. This emerging open standard offers a significantly higher data throughput over a reliable High Availability (HA) mechanical and electrical platform. One of this ATCA boards has 32 galvanically isolated ADC channels (18 bit) each mounted on a swappable plug-in card, 8 DAC channels (16 bit), 8 digital I/O channels and embeds a high performance XILINX Virtex 4 family field programmable gate array (FPGA). The specific modular and configurable hardware design enables adaptable utilization of the board in dissimilar applications. The first configuration, specially developed for tokamak plasma Vertical Stabilization, consists of a Multiple-Input-Multiple-Output (MIMO) controller that is capable of feedback loops faster than 1 ms using a multitude of input signals fed from different boards communicating through the Aurora{sup TM} point-to-point protocol. Massive parallel algorithms can be implemented on the FPGA either with programmed digital logic, using a HDL hardware description language, or within its internal silicon PowerPC{sup TM} running a full fledged real-time operating system. The second board configuration is dedicated for transient recording of the entire 32 channels at 2 MSamples/s to the on-board 512 MB DDR2 memory. Signal data retrieval is accelerated by a DMA-driven PCI Express{sup TM} x1 Interface to the ATCA system controller, providing an overall throughput in excess of 100 MB/s. This paper illustrates these developments and discusses possible configurations for foreseen applications.

  3. Architecture and development of the CDF hardware event builder

    International Nuclear Information System (INIS)

    Shaw, T.M.; Booth, A.W.; Bowden, M.

    1989-01-01

    A hardware Event Builder (EVB) has been developed for use at the Collider Detector experiment at Fermi National Accelerator (CDF). the Event builder presently consists of five FASTBUS modules and has the task of reading out the front end scanners, reformatting the data into YBOS bank structure, and transmitting the data to a Level 3 (L3) trigger system which is composed of multiple VME processing nodes. The Event Builder receives its instructions from a VAX based Buffer Manager (BFM) program via a Unibus Processor Interface (UPI). The Buffer Manager instructs the Event Builder to read out one of the four CDF front end buffers. The Event Builder then informs the Buffer Manager when the event has been formatted and then is instructed to push it up to the L3 trigger system. Once in the L3 system, a decision is made as to whether to write the event to tape

  4. A hardware overview of the RHIC LLRF platform

    International Nuclear Information System (INIS)

    Hayes, T.; Smith, K.S.

    2011-01-01

    The RHIC Low Level RF (LLRF) platform is a flexible, modular system designed around a carrier board with six XMC daughter sites. The carrier board features a Xilinx FPGA with an embedded, hard core Power PC that is remotely reconfigurable. It serves as a front end computer (FEC) that interfaces with the RHIC control system. The carrier provides high speed serial data paths to each daughter site and between daughter sites as well as four generic external fiber optic links. It also distributes low noise clocks and serial data links to all daughter sites and monitors temperature, voltage and current. To date, two XMC cards have been designed: a four channel high speed ADC and a four channel high speed DAC. The new LLRF hardware was used to replace the old RHIC LLRF system for the 2009 run. For the 2010 run, the RHIC RF system operation was dramatically changed with the introduction of accelerating both beams in a new, common cavity instead of each ring having independent cavities. The flexibility of the new system was beneficial in allowing the low level system to be adapted to support this new configuration. This hardware was also used in 2009 to provide LLRF for the newly commissioned Electron Beam Ion Source.

  5. A Hardware Abstraction Layer in Java

    DEFF Research Database (Denmark)

    Schoeberl, Martin; Korsholm, Stephan; Kalibera, Tomas

    2011-01-01

    Embedded systems use specialized hardware devices to interact with their environment, and since they have to be dependable, it is attractive to use a modern, type-safe programming language like Java to develop programs for them. Standard Java, as a platform-independent language, delegates access...... to devices, direct memory access, and interrupt handling to some underlying operating system or kernel, but in the embedded systems domain resources are scarce and a Java Virtual Machine (JVM) without an underlying middleware is an attractive architecture. The contribution of this article is a proposal...... for Java packages with hardware objects and interrupt handlers that interface to such a JVM. We provide implementations of the proposal directly in hardware, as extensions of standard interpreters, and finally with an operating system middleware. The latter solution is mainly seen as a migration path...

  6. MFTF supervisory control and diagnostics system hardware

    International Nuclear Information System (INIS)

    Butner, D.N.

    1979-01-01

    The Supervisory Control and Diagnostics System (SCDS) for the Mirror Fusion Test Facility (MFTF) is a multiprocessor minicomputer system designed so that for most single-point failures, the hardware may be quickly reconfigured to provide continued operation of the experiment. The system is made up of nine Perkin-Elmer computers - a mixture of 8/32's and 7/32's. Each computer has ports on a shared memory system consisting of two independent shared memory modules. Each processor can signal other processors through hardware external to the shared memory. The system communicates with the Local Control and Instrumentation System, which consists of approximately 65 microprocessors. Each of the six system processors has facilities for communicating with a group of microprocessors; the groups consist of from four to 24 microprocessors. There are hardware switches so that if an SCDS processor communicating with a group of microprocessors fails, another SCDS processor takes over the communication

  7. A Modular Framework for Modeling Hardware Elements in Distributed Engine Control Systems

    Science.gov (United States)

    Zinnecker, Alicia M.; Culley, Dennis E.; Aretskin-Hariton, Eliot D.

    2015-01-01

    Progress toward the implementation of distributed engine control in an aerospace application may be accelerated through the development of a hardware-in-the-loop (HIL) system for testing new control architectures and hardware outside of a physical test cell environment. One component required in an HIL simulation system is a high-fidelity model of the control platform: sensors, actuators, and the control law. The control system developed for the Commercial Modular Aero-Propulsion System Simulation 40k (C-MAPSS40k) provides a verifiable baseline for development of a model for simulating a distributed control architecture. This distributed controller model will contain enhanced hardware models, capturing the dynamics of the transducer and the effects of data processing, and a model of the controller network. A multilevel framework is presented that establishes three sets of interfaces in the control platform: communication with the engine (through sensors and actuators), communication between hardware and controller (over a network), and the physical connections within individual pieces of hardware. This introduces modularity at each level of the model, encouraging collaboration in the development and testing of various control schemes or hardware designs. At the hardware level, this modularity is leveraged through the creation of a SimulinkR library containing blocks for constructing smart transducer models complying with the IEEE 1451 specification. These hardware models were incorporated in a distributed version of the baseline C-MAPSS40k controller and simulations were run to compare the performance of the two models. The overall tracking ability differed only due to quantization effects in the feedback measurements in the distributed controller. Additionally, it was also found that the added complexity of the smart transducer models did not prevent real-time operation of the distributed controller model, a requirement of an HIL system.

  8. Quantum neuromorphic hardware for quantum artificial intelligence

    Science.gov (United States)

    Prati, Enrico

    2017-08-01

    The development of machine learning methods based on deep learning boosted the field of artificial intelligence towards unprecedented achievements and application in several fields. Such prominent results were made in parallel with the first successful demonstrations of fault tolerant hardware for quantum information processing. To which extent deep learning can take advantage of the existence of a hardware based on qubits behaving as a universal quantum computer is an open question under investigation. Here I review the convergence between the two fields towards implementation of advanced quantum algorithms, including quantum deep learning.

  9. Human Centered Hardware Modeling and Collaboration

    Science.gov (United States)

    Stambolian Damon; Lawrence, Brad; Stelges, Katrine; Henderson, Gena

    2013-01-01

    In order to collaborate engineering designs among NASA Centers and customers, to in clude hardware and human activities from multiple remote locations, live human-centered modeling and collaboration across several sites has been successfully facilitated by Kennedy Space Center. The focus of this paper includes innovative a pproaches to engineering design analyses and training, along with research being conducted to apply new technologies for tracking, immersing, and evaluating humans as well as rocket, vehic le, component, or faci lity hardware utilizing high resolution cameras, motion tracking, ergonomic analysis, biomedical monitoring, wor k instruction integration, head-mounted displays, and other innovative human-system integration modeling, simulation, and collaboration applications.

  10. Programming Scala Scalability = Functional Programming + Objects

    CERN Document Server

    Wampler, Dean

    2009-01-01

    Learn how to be more productive with Scala, a new multi-paradigm language for the Java Virtual Machine (JVM) that integrates features of both object-oriented and functional programming. With this book, you'll discover why Scala is ideal for highly scalable, component-based applications that support concurrency and distribution. Programming Scala clearly explains the advantages of Scala as a JVM language. You'll learn how to leverage the wealth of Java class libraries to meet the practical needs of enterprise and Internet projects more easily. Packed with code examples, this book provides us

  11. Tip-Based Nanofabrication for Scalable Manufacturing

    Directory of Open Access Journals (Sweden)

    Huan Hu

    2017-03-01

    Full Text Available Tip-based nanofabrication (TBN is a family of emerging nanofabrication techniques that use a nanometer scale tip to fabricate nanostructures. In this review, we first introduce the history of the TBN and the technology development. We then briefly review various TBN techniques that use different physical or chemical mechanisms to fabricate features and discuss some of the state-of-the-art techniques. Subsequently, we focus on those TBN methods that have demonstrated potential to scale up the manufacturing throughput. Finally, we discuss several research directions that are essential for making TBN a scalable nano-manufacturing technology.

  12. Tip-Based Nanofabrication for Scalable Manufacturing

    International Nuclear Information System (INIS)

    Hu, Huan; Somnath, Suhas

    2017-01-01

    Tip-based nanofabrication (TBN) is a family of emerging nanofabrication techniques that use a nanometer scale tip to fabricate nanostructures. Here in this review, we first introduce the history of the TBN and the technology development. We then briefly review various TBN techniques that use different physical or chemical mechanisms to fabricate features and discuss some of the state-of-the-art techniques. Subsequently, we focus on those TBN methods that have demonstrated potential to scale up the manufacturing throughput. Finally, we discuss several research directions that are essential for making TBN a scalable nano-manufacturing technology.

  13. Towards a Scalable, Biomimetic, Antibacterial Coating

    Science.gov (United States)

    Dickson, Mary Nora

    Corneal afflictions are the second leading cause of blindness worldwide. When a corneal transplant is unavailable or contraindicated, an artificial cornea device is the only chance to save sight. Bacterial or fungal biofilm build up on artificial cornea devices can lead to serious complications including the need for systemic antibiotic treatment and even explantation. As a result, much emphasis has been placed on anti-adhesion chemical coatings and antibiotic leeching coatings. These methods are not long-lasting, and microorganisms can eventually circumvent these measures. Thus, I have developed a surface topographical antimicrobial coating. Various surface structures including rough surfaces, superhydrophobic surfaces, and the natural surfaces of insects' wings and sharks' skin are promising anti-biofilm candidates, however none meet the criteria necessary for implementation on the surface of an artificial cornea device. In this thesis I: 1) developed scalable fabrication protocols for a library of biomimetic nanostructure polymer surfaces 2) assessed the potential these for poly(methyl methacrylate) nanopillars to kill or prevent formation of biofilm by E. coli bacteria and species of Pseudomonas and Staphylococcus bacteria and improved upon a proposed mechanism for the rupture of Gram-negative bacterial cell walls 3) developed a scalable, commercially viable method for producing antibacterial nanopillars on a curved, PMMA artificial cornea device and 4) developed scalable fabrication protocols for implantation of antibacterial nanopatterned surfaces on the surfaces of thermoplastic polyurethane materials, commonly used in catheter tubings. This project constitutes a first step towards fabrication of the first entirely PMMA artificial cornea device. The major finding of this work is that by precisely controlling the topography of a polymer surface at the nano-scale, we can kill adherent bacteria and prevent biofilm formation of certain pathogenic bacteria

  14. Scalable Optical-Fiber Communication Networks

    Science.gov (United States)

    Chow, Edward T.; Peterson, John C.

    1993-01-01

    Scalable arbitrary fiber extension network (SAFEnet) is conceptual fiber-optic communication network passing digital signals among variety of computers and input/output devices at rates from 200 Mb/s to more than 100 Gb/s. Intended for use with very-high-speed computers and other data-processing and communication systems in which message-passing delays must be kept short. Inherent flexibility makes it possible to match performance of network to computers by optimizing configuration of interconnections. In addition, interconnections made redundant to provide tolerance to faults.

  15. Scalable Tensor Factorizations with Missing Data

    DEFF Research Database (Denmark)

    Acar, Evrim; Dunlavy, Daniel M.; Kolda, Tamara G.

    2010-01-01

    of missing data, many important data sets will be discarded or improperly analyzed. Therefore, we need a robust and scalable approach for factorizing multi-way arrays (i.e., tensors) in the presence of missing data. We focus on one of the most well-known tensor factorizations, CANDECOMP/PARAFAC (CP...... is shown to successfully factor tensors with noise and up to 70% missing data. Moreover, our approach is significantly faster than the leading alternative and scales to larger problems. To show the real-world usefulness of CP-WOPT, we illustrate its applicability on a novel EEG (electroencephalogram...

  16. Scalable and Anonymous Group Communication with MTor

    Directory of Open Access Journals (Sweden)

    Lin Dong

    2016-04-01

    Full Text Available This paper presents MTor, a low-latency anonymous group communication system. We construct MTor as an extension to Tor, allowing the construction of multi-source multicast trees on top of the existing Tor infrastructure. MTor does not depend on an external service to broker the group communication, and avoids central points of failure and trust. MTor’s substantial bandwidth savings and graceful scalability enable new classes of anonymous applications that are currently too bandwidth-intensive to be viable through traditional unicast Tor communication-e.g., group file transfer, collaborative editing, streaming video, and real-time audio conferencing.

  17. Grassmann Averages for Scalable Robust PCA

    DEFF Research Database (Denmark)

    Hauberg, Søren; Feragen, Aasa; Black, Michael J.

    2014-01-01

    As the collection of large datasets becomes increasingly automated, the occurrence of outliers will increase—“big data” implies “big outliers”. While principal component analysis (PCA) is often used to reduce the size of data, and scalable solutions exist, it is well-known that outliers can...... to vectors (subspaces) or elements of vectors; we focus on the latter and use a trimmed average. The resulting Trimmed Grassmann Average (TGA) is particularly appropriate for computer vision because it is robust to pixel outliers. The algorithm has low computational complexity and minimal memory requirements...

  18. Hardware Implementation of a Bilateral Subtraction Filter

    Science.gov (United States)

    Huertas, Andres; Watson, Robert; Villalpando, Carlos; Goldberg, Steven

    2009-01-01

    A bilateral subtraction filter has been implemented as a hardware module in the form of a field-programmable gate array (FPGA). In general, a bilateral subtraction filter is a key subsystem of a high-quality stereoscopic machine vision system that utilizes images that are large and/or dense. Bilateral subtraction filters have been implemented in software on general-purpose computers, but the processing speeds attainable in this way even on computers containing the fastest processors are insufficient for real-time applications. The present FPGA bilateral subtraction filter is intended to accelerate processing to real-time speed and to be a prototype of a link in a stereoscopic-machine- vision processing chain, now under development, that would process large and/or dense images in real time and would be implemented in an FPGA. In terms that are necessarily oversimplified for the sake of brevity, a bilateral subtraction filter is a smoothing, edge-preserving filter for suppressing low-frequency noise. The filter operation amounts to replacing the value for each pixel with a weighted average of the values of that pixel and the neighboring pixels in a predefined neighborhood or window (e.g., a 9 9 window). The filter weights depend partly on pixel values and partly on the window size. The present FPGA implementation of a bilateral subtraction filter utilizes a 9 9 window. This implementation was designed to take advantage of the ability to do many of the component computations in parallel pipelines to enable processing of image data at the rate at which they are generated. The filter can be considered to be divided into the following parts (see figure): a) An image pixel pipeline with a 9 9- pixel window generator, b) An array of processing elements; c) An adder tree; d) A smoothing-and-delaying unit; and e) A subtraction unit. After each 9 9 window is created, the affected pixel data are fed to the processing elements. Each processing element is fed the pixel value for

  19. 500 kV mercury accelerator

    International Nuclear Information System (INIS)

    Brodowski, J.; Maschke, A.W.; Mobley, R.M.; Keane, J.T.; Meier, E.

    1979-01-01

    The objective of building a low-cost pre-accelerator for low energy heavy ion particle accelerator was realized by using standard, readily available material and hardware. Some savings were obtained in the construction of the dome by avoiding welding, expensive metal spinnings and unnecessary corona rings. Larger monetary economies were realized by unique approach to building the high voltage column utilizing a glass tube

  20. Electrostatic accelerators

    OpenAIRE

    Hinterberger, F

    2006-01-01

    The principle of electrostatic accelerators is presented. We consider Cockcroft– Walton, Van de Graaff and Tandem Van de Graaff accelerators. We resume high voltage generators such as cascade generators, Van de Graaff band generators, Pelletron generators, Laddertron generators and Dynamitron generators. The speci c features of accelerating tubes, ion optics and methods of voltage stabilization are described. We discuss the characteristic beam properties and the variety of possible beams. We ...

  1. Electrostatic accelerators

    CERN Document Server

    Hinterberger, F

    2006-01-01

    The principle of electrostatic accelerators is presented. We consider Cockcroft– Walton, Van de Graaff and Tandem Van de Graaff accelerators. We resume high voltage generators such as cascade generators, Van de Graaff band generators, Pelletron generators, Laddertron generators and Dynamitron generators. The speci c features of accelerating tubes, ion optics and methods of voltage stabilization are described. We discuss the characteristic beam properties and the variety of possible beams. We sketch possible applications and the progress in the development of electrostatic accelerators.

  2. Accelerator development

    International Nuclear Information System (INIS)

    Anon.

    1975-01-01

    Because the use of accelerated heavy ions would provide many opportunities for new and important studies in nuclear physics and nuclear chemistry, as well as other disciplines, both the Chemistry and Physics Divisions are supporting the development of a heavy-ion accelerator. The design of greatest current interest includes a tandem accelerator with a terminal voltage of approximately 25 MV injecting into a linear accelerator with rf superconducting resonators. This combined accelerator facility would be capable of accelerating ions of masses ranging over the entire periodic table to an energy corresponding to approximately 10 MeV/nucleon. This approach, as compared to other concepts, has the advantages of lower construction costs, lower operating power, 100 percent duty factor, and high beam quality (good energy resolution, good timing resolution, small beam size, and small beam divergence). The included sections describe the concept of the proposed heavy-ion accelerator, and the development program aiming at: (1) investigation of the individual questions concerning the superconducting accelerating resonators; (2) construction and testing of prototype accelerator systems; and (3) search for economical solutions to engineering problems. (U.S.)

  3. Scalability Optimization of Seamless Positioning Service

    Directory of Open Access Journals (Sweden)

    Juraj Machaj

    2016-01-01

    Full Text Available Recently positioning services are getting more attention not only within research community but also from service providers. From the service providers point of view positioning service that will be able to work seamlessly in all environments, for example, indoor, dense urban, and rural, has a huge potential to open new markets. However, such system does not only need to provide accurate position estimates but have to be scalable and resistant to fake positioning requests. In the previous works we have proposed a modular system, which is able to provide seamless positioning in various environments. The system automatically selects optimal positioning module based on available radio signals. The system currently consists of three positioning modules—GPS, GSM based positioning, and Wi-Fi based positioning. In this paper we will propose algorithm which will reduce time needed for position estimation and thus allow higher scalability of the modular system and thus allow providing positioning services to higher amount of users. Such improvement is extremely important, for real world application where large number of users will require position estimates, since positioning error is affected by response time of the positioning server.

  4. Highly scalable Ab initio genomic motif identification

    KAUST Repository

    Marchand, Benoit; Bajic, Vladimir B.; Kaushik, Dinesh

    2011-01-01

    We present results of scaling an ab initio motif family identification system, Dragon Motif Finder (DMF), to 65,536 processor cores of IBM Blue Gene/P. DMF seeks groups of mutually similar polynucleotide patterns within a set of genomic sequences and builds various motif families from them. Such information is of relevance to many problems in life sciences. Prior attempts to scale such ab initio motif-finding algorithms achieved limited success. We solve the scalability issues using a combination of mixed-mode MPI-OpenMP parallel programming, master-slave work assignment, multi-level workload distribution, multi-level MPI collectives, and serial optimizations. While the scalability of our algorithm was excellent (94% parallel efficiency on 65,536 cores relative to 256 cores on a modest-size problem), the final speedup with respect to the original serial code exceeded 250,000 when serial optimizations are included. This enabled us to carry out many large-scale ab initio motiffinding simulations in a few hours while the original serial code would have needed decades of execution time. Copyright 2011 ACM.

  5. Algorithmic psychometrics and the scalable subject.

    Science.gov (United States)

    Stark, Luke

    2018-04-01

    Recent public controversies, ranging from the 2014 Facebook 'emotional contagion' study to psychographic data profiling by Cambridge Analytica in the 2016 American presidential election, Brexit referendum and elsewhere, signal watershed moments in which the intersecting trajectories of psychology and computer science have become matters of public concern. The entangled history of these two fields grounds the application of applied psychological techniques to digital technologies, and an investment in applying calculability to human subjectivity. Today, a quantifiable psychological subject position has been translated, via 'big data' sets and algorithmic analysis, into a model subject amenable to classification through digital media platforms. I term this position the 'scalable subject', arguing it has been shaped and made legible by algorithmic psychometrics - a broad set of affordances in digital platforms shaped by psychology and the behavioral sciences. In describing the contours of this 'scalable subject', this paper highlights the urgent need for renewed attention from STS scholars on the psy sciences, and on a computational politics attentive to psychology, emotional expression, and sociality via digital media.

  6. Scalable Simulation of Electromagnetic Hybrid Codes

    International Nuclear Information System (INIS)

    Perumalla, Kalyan S.; Fujimoto, Richard; Karimabadi, Dr. Homa

    2006-01-01

    New discrete-event formulations of physics simulation models are emerging that can outperform models based on traditional time-stepped techniques. Detailed simulation of the Earth's magnetosphere, for example, requires execution of sub-models that are at widely differing timescales. In contrast to time-stepped simulation which requires tightly coupled updates to entire system state at regular time intervals, the new discrete event simulation (DES) approaches help evolve the states of sub-models on relatively independent timescales. However, parallel execution of DES-based models raises challenges with respect to their scalability and performance. One of the key challenges is to improve the computation granularity to offset synchronization and communication overheads within and across processors. Our previous work was limited in scalability and runtime performance due to the parallelization challenges. Here we report on optimizations we performed on DES-based plasma simulation models to improve parallel performance. The net result is the capability to simulate hybrid particle-in-cell (PIC) models with over 2 billion ion particles using 512 processors on supercomputing platforms

  7. Towards Scalable Graph Computation on Mobile Devices.

    Science.gov (United States)

    Chen, Yiqi; Lin, Zhiyuan; Pienta, Robert; Kahng, Minsuk; Chau, Duen Horng

    2014-10-01

    Mobile devices have become increasingly central to our everyday activities, due to their portability, multi-touch capabilities, and ever-improving computational power. Such attractive features have spurred research interest in leveraging mobile devices for computation. We explore a novel approach that aims to use a single mobile device to perform scalable graph computation on large graphs that do not fit in the device's limited main memory, opening up the possibility of performing on-device analysis of large datasets, without relying on the cloud. Based on the familiar memory mapping capability provided by today's mobile operating systems, our approach to scale up computation is powerful and intentionally kept simple to maximize its applicability across the iOS and Android platforms. Our experiments demonstrate that an iPad mini can perform fast computation on large real graphs with as many as 272 million edges (Google+ social graph), at a speed that is only a few times slower than a 13″ Macbook Pro. Through creating a real world iOS app with this technique, we demonstrate the strong potential application for scalable graph computation on a single mobile device using our approach.

  8. Towards Scalable Graph Computation on Mobile Devices

    Science.gov (United States)

    Chen, Yiqi; Lin, Zhiyuan; Pienta, Robert; Kahng, Minsuk; Chau, Duen Horng

    2015-01-01

    Mobile devices have become increasingly central to our everyday activities, due to their portability, multi-touch capabilities, and ever-improving computational power. Such attractive features have spurred research interest in leveraging mobile devices for computation. We explore a novel approach that aims to use a single mobile device to perform scalable graph computation on large graphs that do not fit in the device's limited main memory, opening up the possibility of performing on-device analysis of large datasets, without relying on the cloud. Based on the familiar memory mapping capability provided by today's mobile operating systems, our approach to scale up computation is powerful and intentionally kept simple to maximize its applicability across the iOS and Android platforms. Our experiments demonstrate that an iPad mini can perform fast computation on large real graphs with as many as 272 million edges (Google+ social graph), at a speed that is only a few times slower than a 13″ Macbook Pro. Through creating a real world iOS app with this technique, we demonstrate the strong potential application for scalable graph computation on a single mobile device using our approach. PMID:25859564

  9. Big data integration: scalability and sustainability

    KAUST Repository

    Zhang, Zhang

    2016-01-26

    Integration of various types of omics data is critically indispensable for addressing most important and complex biological questions. In the era of big data, however, data integration becomes increasingly tedious, time-consuming and expensive, posing a significant obstacle to fully exploit the wealth of big biological data. Here we propose a scalable and sustainable architecture that integrates big omics data through community-contributed modules. Community modules are contributed and maintained by different committed groups and each module corresponds to a specific data type, deals with data collection, processing and visualization, and delivers data on-demand via web services. Based on this community-based architecture, we build Information Commons for Rice (IC4R; http://ic4r.org), a rice knowledgebase that integrates a variety of rice omics data from multiple community modules, including genome-wide expression profiles derived entirely from RNA-Seq data, resequencing-based genomic variations obtained from re-sequencing data of thousands of rice varieties, plant homologous genes covering multiple diverse plant species, post-translational modifications, rice-related literatures, and community annotations. Taken together, such architecture achieves integration of different types of data from multiple community-contributed modules and accordingly features scalable, sustainable and collaborative integration of big data as well as low costs for database update and maintenance, thus helpful for building IC4R into a comprehensive knowledgebase covering all aspects of rice data and beneficial for both basic and translational researches.

  10. Scalable High-Performance Parallel Design for Network Intrusion Detection Systems on Many-Core Processors

    OpenAIRE

    Jiang, Hayang; Xie, Gaogang; Salamatian, Kavé; Mathy, Laurent

    2013-01-01

    Network Intrusion Detection Systems (NIDSes) face significant challenges coming from the relentless network link speed growth and increasing complexity of threats. Both hardware accelerated and parallel software-based NIDS solutions, based on commodity multi-core and GPU processors, have been proposed to overcome these challenges. Network Intrusion Detection Systems (NIDSes) face significant challenges coming from the relentless network link speed growth and increasing complexity of threats. ...

  11. Enabling Open Hardware through FOSS tools

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    Software developers often take open file formats and tools for granted. When you publish code on github, you do not ask yourself if somebody will be able to open it and modify it. We need the same freedom in the open hardware world, to make it truly accessible for everyone.

  12. Hardware and layout aspects affecting maintainability

    International Nuclear Information System (INIS)

    Jayaraman, V.N.; Surendar, Ch.

    1977-01-01

    It has been found from maintenance experience at the Rajasthan Atomic Power Station that proper hardware and instrumentation layout can reduce maintenance and down-time on the related equipment. The problems faced in this connection and how they were solved is narrated. (M.G.B.)

  13. CAMAC high energy physics electronics hardware

    International Nuclear Information System (INIS)

    Kolpakov, I.F.

    1977-01-01

    CAMAC hardware for high energy physics large spectrometers and control systems is reviewed as is the development of CAMAC modules at the High Energy Laboratory, JINR (Dubna). The total number of crates used at the Laboratory is 179. The number of CAMAC modules of 120 different types exceeds 1700. The principles of organization and the structure of developed CAMAC systems are described. (author)

  14. Building Correlators with Many-Core Hardware

    NARCIS (Netherlands)

    van Nieuwpoort, R.V.

    2010-01-01

    Radio telescopes typically consist of multiple receivers whose signals are cross-correlated to filter out noise. A recent trend is to correlate in software instead of custom-built hardware, taking advantage of the flexibility that software solutions offer. Examples include e-VLBI and LOFAR. However,

  15. Computer hardware for radiologists: Part I

    Directory of Open Access Journals (Sweden)

    Indrajit I

    2010-01-01

    Full Text Available Computers are an integral part of modern radiology practice. They are used in different radiology modalities to acquire, process, and postprocess imaging data. They have had a dramatic influence on contemporary radiology practice. Their impact has extended further with the emergence of Digital Imaging and Communications in Medicine (DICOM, Picture Archiving and Communication System (PACS, Radiology information system (RIS technology, and Teleradiology. A basic overview of computer hardware relevant to radiology practice is presented here. The key hardware components in a computer are the motherboard, central processor unit (CPU, the chipset, the random access memory (RAM, the memory modules, bus, storage drives, and ports. The personnel computer (PC has a rectangular case that contains important components called hardware, many of which are integrated circuits (ICs. The fiberglass motherboard is the main printed circuit board and has a variety of important hardware mounted on it, which are connected by electrical pathways called "buses". The CPU is the largest IC on the motherboard and contains millions of transistors. Its principal function is to execute "programs". A Pentium® 4 CPU has transistors that execute a billion instructions per second. The chipset is completely different from the CPU in design and function; it controls data and interaction of buses between the motherboard and the CPU. Memory (RAM is fundamentally semiconductor chips storing data and instructions for access by a CPU. RAM is classified by storage capacity, access speed, data rate, and configuration.

  16. Computer hardware for radiologists: Part I

    International Nuclear Information System (INIS)

    Indrajit, IK; Alam, A

    2010-01-01

    Computers are an integral part of modern radiology practice. They are used in different radiology modalities to acquire, process, and postprocess imaging data. They have had a dramatic influence on contemporary radiology practice. Their impact has extended further with the emergence of Digital Imaging and Communications in Medicine (DICOM), Picture Archiving and Communication System (PACS), Radiology information system (RIS) technology, and Teleradiology. A basic overview of computer hardware relevant to radiology practice is presented here. The key hardware components in a computer are the motherboard, central processor unit (CPU), the chipset, the random access memory (RAM), the memory modules, bus, storage drives, and ports. The personnel computer (PC) has a rectangular case that contains important components called hardware, many of which are integrated circuits (ICs). The fiberglass motherboard is the main printed circuit board and has a variety of important hardware mounted on it, which are connected by electrical pathways called “buses”. The CPU is the largest IC on the motherboard and contains millions of transistors. Its principal function is to execute “programs”. A Pentium ® 4 CPU has transistors that execute a billion instructions per second. The chipset is completely different from the CPU in design and function; it controls data and interaction of buses between the motherboard and the CPU. Memory (RAM) is fundamentally semiconductor chips storing data and instructions for access by a CPU. RAM is classified by storage capacity, access speed, data rate, and configuration

  17. Environmental Control System Software & Hardware Development

    Science.gov (United States)

    Vargas, Daniel Eduardo

    2017-01-01

    ECS hardware: (1) Provides controlled purge to SLS Rocket and Orion spacecraft. (2) Provide mission-focused engineering products and services. ECS software: (1) NASA requires Compact Unique Identifiers (CUIs); fixed-length identifier used to identify information items. (2) CUI structure; composed of nine semantic fields that aid the user in recognizing its purpose.

  18. Digital Hardware Design Teaching: An Alternative Approach

    Science.gov (United States)

    Benkrid, Khaled; Clayton, Thomas

    2012-01-01

    This article presents the design and implementation of a complete review of undergraduate digital hardware design teaching in the School of Engineering at the University of Edinburgh. Four guiding principles have been used in this exercise: learning-outcome driven teaching, deep learning, affordability, and flexibility. This has identified…

  19. The fast Amsterdam multiprocessor (FAMP) system hardware

    International Nuclear Information System (INIS)

    Hertzberger, L.O.; Kieft, G.; Kisielewski, B.; Wiggers, L.W.; Engster, C.; Koningsveld, L. van

    1981-01-01

    The architecture of a multiprocessor system is described that will be used for on-line filter and second stage trigger applications. The system is based on the MC 68000 microprocessor from Motorola. Emphasis is paid to hardware aspects, in particular the modularity, processor communication and interfacing, whereas the system software and the applications will be described in separate articles. (orig.)

  20. Remote hardware-reconfigurable robotic camera

    Science.gov (United States)

    Arias-Estrada, Miguel; Torres-Huitzil, Cesar; Maya-Rueda, Selene E.

    2001-10-01

    In this work, a camera with integrated image processing capabilities is discussed. The camera is based on an imager coupled to an FPGA device (Field Programmable Gate Array) which contains an architecture for real-time computer vision low-level processing. The architecture can be reprogrammed remotely for application specific purposes. The system is intended for rapid modification and adaptation for inspection and recognition applications, with the flexibility of hardware and software reprogrammability. FPGA reconfiguration allows the same ease of upgrade in hardware as a software upgrade process. The camera is composed of a digital imager coupled to an FPGA device, two memory banks, and a microcontroller. The microcontroller is used for communication tasks and FPGA programming. The system implements a software architecture to handle multiple FPGA architectures in the device, and the possibility to download a software/hardware object from the host computer into its internal context memory. System advantages are: small size, low power consumption, and a library of hardware/software functionalities that can be exchanged during run time. The system has been validated with an edge detection and a motion processing architecture, which will be presented in the paper. Applications targeted are in robotics, mobile robotics, and vision based quality control.

  1. A simple route to scalable fabrication of perfectly ordered ZnO nanorod arrays

    International Nuclear Information System (INIS)

    Liu, D F; Xiang, Y J; Liao, Q; Zhang, J P; Wu, X C; Zhang, Z X; Liu, L F; Ma, W J; Shen, J; Zhou, W Y; Xie, S S

    2007-01-01

    ZnO nanorod arrays with perfect order and uniformity were prepared using a simple, low-cost, commonly available and scalable nanosphere lithography for patterning gold catalyst particles and a successive bottom-up growth technique in a tube furnace chemical vapor deposition system. Each rod in the arrays had perfect surface facets, sharp edges and uniform size. For all of the rods, their sides were oriented the same. This bottom-up assembly method may accelerate the use of ZnO nanorods in real device applications

  2. Accelerated Adaptive MGS Phase Retrieval

    Science.gov (United States)

    Lam, Raymond K.; Ohara, Catherine M.; Green, Joseph J.; Bikkannavar, Siddarayappa A.; Basinger, Scott A.; Redding, David C.; Shi, Fang

    2011-01-01

    The Modified Gerchberg-Saxton (MGS) algorithm is an image-based wavefront-sensing method that can turn any science instrument focal plane into a wavefront sensor. MGS characterizes optical systems by estimating the wavefront errors in the exit pupil using only intensity images of a star or other point source of light. This innovative implementation of MGS significantly accelerates the MGS phase retrieval algorithm by using stream-processing hardware on conventional graphics cards. Stream processing is a relatively new, yet powerful, paradigm to allow parallel processing of certain applications that apply single instructions to multiple data (SIMD). These stream processors are designed specifically to support large-scale parallel computing on a single graphics chip. Computationally intensive algorithms, such as the Fast Fourier Transform (FFT), are particularly well suited for this computing environment. This high-speed version of MGS exploits commercially available hardware to accomplish the same objective in a fraction of the original time. The exploit involves performing matrix calculations in nVidia graphic cards. The graphical processor unit (GPU) is hardware that is specialized for computationally intensive, highly parallel computation. From the software perspective, a parallel programming model is used, called CUDA, to transparently scale multicore parallelism in hardware. This technology gives computationally intensive applications access to the processing power of the nVidia GPUs through a C/C++ programming interface. The AAMGS (Accelerated Adaptive MGS) software takes advantage of these advanced technologies, to accelerate the optical phase error characterization. With a single PC that contains four nVidia GTX-280 graphic cards, the new implementation can process four images simultaneously to produce a JWST (James Webb Space Telescope) wavefront measurement 60 times faster than the previous code.

  3. Accelerating artificial intelligence with reconfigurable computing

    Science.gov (United States)

    Cieszewski, Radoslaw

    Reconfigurable computing is emerging as an important area of research in computer architectures and software systems. Many algorithms can be greatly accelerated by placing the computationally intense portions of an algorithm into reconfigurable hardware. Reconfigurable computing combines many benefits of both software and ASIC implementations. Like software, the mapped circuit is flexible, and can be changed over the lifetime of the system. Similar to an ASIC, reconfigurable systems provide a method to map circuits into hardware. Reconfigurable systems therefore have the potential to achieve far greater performance than software as a result of bypassing the fetch-decode-execute operations of traditional processors, and possibly exploiting a greater level of parallelism. Such a field, where there is many different algorithms which can be accelerated, is an artificial intelligence. This paper presents example hardware implementations of Artificial Neural Networks, Genetic Algorithms and Expert Systems.

  4. Evolution of control systems for accelerators

    International Nuclear Information System (INIS)

    Crowley-Milling, M.C.

    1983-01-01

    The author reviews the development of control systems for accelerators. After an historical survey and a general introduction the hardware and software of such systems is described. As example the control system of the CERN SP5 is considered. Finally an outlook is given to future developments with special regards to the LEP storage ring. (HSI)

  5. The BNL Accelerator Test Facility control system

    International Nuclear Information System (INIS)

    Malone, R.; Bottke, I.; Fernow, R.; Ben-Zvi, I.

    1993-01-01

    Described is the VAX/CAMAC-based control system for Brookhaven National Laboratory's Accelerator Test Facility, a laser/linac research complex. Details of hardware and software configurations are presented along with experiences of using Vsystem, a commercial control system package

  6. Development of a scalable suspension culture for cardiac differentiation from human pluripotent stem cells

    Directory of Open Access Journals (Sweden)

    Vincent C. Chen

    2015-09-01

    Full Text Available To meet the need of a large quantity of hPSC-derived cardiomyocytes (CM for pre-clinical and clinical studies, a robust and scalable differentiation system for CM production is essential. With a human pluripotent stem cells (hPSC aggregate suspension culture system we established previously, we developed a matrix-free, scalable, and GMP-compliant process for directing hPSC differentiation to CM in suspension culture by modulating Wnt pathways with small molecules. By optimizing critical process parameters including: cell aggregate size, small molecule concentrations, induction timing, and agitation rate, we were able to consistently differentiate hPSCs to >90% CM purity with an average yield of 1.5 to 2 × 109 CM/L at scales up to 1 L spinner flasks. CM generated from the suspension culture displayed typical genetic, morphological, and electrophysiological cardiac cell characteristics. This suspension culture system allows seamless transition from hPSC expansion to CM differentiation in a continuous suspension culture. It not only provides a cost and labor effective scalable process for large scale CM production, but also provides a bioreactor prototype for automation of cell manufacturing, which will accelerate the advance of hPSC research towards therapeutic applications.

  7. RECIRCULATING ACCELERATION

    International Nuclear Information System (INIS)

    BERG, J.S.; GARREN, A.A.; JOHNSTONE, C.

    2000-01-01

    This paper compares various types of recirculating accelerators, outlining the advantages and disadvantages of various approaches. The accelerators are characterized according to the types of arcs they use: whether there is a single arc for the entire recirculator or there are multiple arcs, and whether the arc(s) are isochronous or non-isochronous

  8. LIBO accelerates

    CERN Multimedia

    2002-01-01

    The prototype module of LIBO, a linear accelerator project designed for cancer therapy, has passed its first proton-beam acceleration test. In parallel a new version - LIBO-30 - is being developed, which promises to open up even more interesting avenues.

  9. Accelerating Inspire

    CERN Document Server

    AUTHOR|(CDS)2266999

    2017-01-01

    CERN has been involved in the dissemination of scientific results since its early days and has continuously updated the distribution channels. Currently, Inspire hosts catalogues of articles, authors, institutions, conferences, jobs, experiments, journals and more. Successful orientation among this amount of data requires comprehensive linking between the content. Inspire has lacked a system for linking experiments and articles together based on which accelerator they were conducted at. The purpose of this project has been to create such a system. Records for 156 accelerators were created and all 2913 experiments on Inspire were given corresponding MARC tags. Records of 18404 accelerator physics related bibliographic entries were also tagged with corresponding accelerator tags. Finally, as a part of the endeavour to broaden CERN's presence on Wikipedia, existing Wikipedia articles of accelerators were updated with short descriptions and links to Inspire. In total, 86 Wikipedia articles were updated. This repo...

  10. Low cost, scalable proteomics data analysis using Amazon's cloud computing services and open source search algorithms.

    Science.gov (United States)

    Halligan, Brian D; Geiger, Joey F; Vallejos, Andrew K; Greene, Andrew S; Twigger, Simon N

    2009-06-01

    One of the major difficulties for many laboratories setting up proteomics programs has been obtaining and maintaining the computational infrastructure required for the analysis of the large flow of proteomics data. We describe a system that combines distributed cloud computing and open source software to allow laboratories to set up scalable virtual proteomics analysis clusters without the investment in computational hardware or software licensing fees. Additionally, the pricing structure of distributed computing providers, such as Amazon Web Services, allows laboratories or even individuals to have large-scale computational resources at their disposal at a very low cost per run. We provide detailed step-by-step instructions on how to implement the virtual proteomics analysis clusters as well as a list of current available preconfigured Amazon machine images containing the OMSSA and X!Tandem search algorithms and sequence databases on the Medical College of Wisconsin Proteomics Center Web site ( http://proteomics.mcw.edu/vipdac ).

  11. Internet-based hardware/software co-design framework for embedded 3D graphics applications

    Directory of Open Access Journals (Sweden)

    Wong Weng-Fai

    2011-01-01

    Full Text Available Abstract Advances in technology are making it possible to run three-dimensional (3D graphics applications on embedded and handheld devices. In this article, we propose a hardware/software co-design environment for 3D graphics application development that includes the 3D graphics software, OpenGL ES application programming interface (API, device driver, and 3D graphics hardware simulators. We developed a 3D graphics system-on-a-chip (SoC accelerator using transaction-level modeling (TLM. This gives software designers early access to the hardware even before it is ready. On the other hand, hardware designers also stand to gain from the more complex test benches made available in the software for verification. A unique aspect of our framework is that it allows hardware and software designers from geographically dispersed areas to cooperate and work on the same framework. Designs can be entered and executed from anywhere in the world without full access to the entire framework, which may include proprietary components. This results in controlled and secure transparency and reproducibility, granting leveled access to users of various roles.

  12. The control computer for the Chalk River electron test accelerator

    International Nuclear Information System (INIS)

    McMichael, G.E.; Fraser, J.S.; McKeown, J.

    1978-02-01

    A versatile control and data acquisition system has been developed for a modest-sized linear accelerator using mainly process I/O hardware and software. This report describes the evolution of the present system since 1972, the modifications needed to satisfy the changing requirements of the various accelerator physics experiments and the limitations of such a system in process control. (author)

  13. Architecture design of reconfigurable accelerators for demanding apllications.

    NARCIS (Netherlands)

    Jozwiak, L.; Jan, Y.

    2010-01-01

    This paper focuses on mastering the architecture development of reconfigurable hardware accelerators for highly demanding applications. It presents the results of our analysis of the main issues that have to be addressed when designing accelerators for demanding applications, when using as an

  14. Automatic generation of application specific FPGA multicore accelerators

    DEFF Research Database (Denmark)

    Hindborg, Andreas Erik; Schleuniger, Pascal; Jensen, Nicklas Bo

    2014-01-01

    High performance computing systems make increasing use of hardware accelerators to improve performance and power properties. For large high-performance FPGAs to be successfully integrated in such computing systems, methods to raise the abstraction level of FPGA programming are required...... to identify optimal performance energy trade-offs points for a multicore based FPGA accelerator....

  15. Harnessing the crowd to accelerate molecular medicine research.

    Science.gov (United States)

    Smith, Robert J; Merchant, Raina M

    2015-07-01

    Crowdsourcing presents a novel approach to solving complex problems within molecular medicine. By leveraging the expertise of fellow scientists across the globe, broadcasting to and engaging the public for idea generation, harnessing a scalable workforce for quick data management, and fundraising for research endeavors, crowdsourcing creates novel opportunities for accelerating scientific progress. Copyright © 2015 Elsevier Ltd. All rights reserved.

  16. Scalable conditional induction variables (CIV) analysis

    DEFF Research Database (Denmark)

    Oancea, Cosmin Eugen; Rauchwerger, Lawrence

    2015-01-01

    parallelizing compiler and evaluated its impact on five Fortran benchmarks. We have found that that there are many important loops using CIV subscripts and that our analysis can lead to their scalable parallelization. This in turn has led to the parallelization of the benchmark programs they appear in.......Subscripts using induction variables that cannot be expressed as a formula in terms of the enclosing-loop indices appear in the low-level implementation of common programming abstractions such as filter, or stack operations and pose significant challenges to automatic parallelization. Because...... the complexity of such induction variables is often due to their conditional evaluation across the iteration space of loops we name them Conditional Induction Variables (CIV). This paper presents a flow-sensitive technique that summarizes both such CIV-based and affine subscripts to program level, using the same...

  17. Scalable Faceted Ranking in Tagging Systems

    Science.gov (United States)

    Orlicki, José I.; Alvarez-Hamelin, J. Ignacio; Fierens, Pablo I.

    Nowadays, web collaborative tagging systems which allow users to upload, comment on and recommend contents, are growing. Such systems can be represented as graphs where nodes correspond to users and tagged-links to recommendations. In this paper we analyze the problem of computing a ranking of users with respect to a facet described as a set of tags. A straightforward solution is to compute a PageRank-like algorithm on a facet-related graph, but it is not feasible for online computation. We propose an alternative: (i) a ranking for each tag is computed offline on the basis of tag-related subgraphs; (ii) a faceted order is generated online by merging rankings corresponding to all the tags in the facet. Based on the graph analysis of YouTube and Flickr, we show that step (i) is scalable. We also present efficient algorithms for step (ii), which are evaluated by comparing their results with two gold standards.

  18. A graph algebra for scalable visual analytics.

    Science.gov (United States)

    Shaverdian, Anna A; Zhou, Hao; Michailidis, George; Jagadish, Hosagrahar V

    2012-01-01

    Visual analytics (VA), which combines analytical techniques with advanced visualization features, is fast becoming a standard tool for extracting information from graph data. Researchers have developed many tools for this purpose, suggesting a need for formal methods to guide these tools' creation. Increased data demands on computing requires redesigning VA tools to consider performance and reliability in the context of analysis of exascale datasets. Furthermore, visual analysts need a way to document their analyses for reuse and results justification. A VA graph framework encapsulated in a graph algebra helps address these needs. Its atomic operators include selection and aggregation. The framework employs a visual operator and supports dynamic attributes of data to enable scalable visual exploration of data.

  19. Parallel scalability of Hartree-Fock calculations

    Science.gov (United States)

    Chow, Edmond; Liu, Xing; Smelyanskiy, Mikhail; Hammond, Jeff R.

    2015-03-01

    Quantum chemistry is increasingly performed using large cluster computers consisting of multiple interconnected nodes. For a fixed molecular problem, the efficiency of a calculation usually decreases as more nodes are used, due to the cost of communication between the nodes. This paper empirically investigates the parallel scalability of Hartree-Fock calculations. The construction of the Fock matrix and the density matrix calculation are analyzed separately. For the former, we use a parallelization of Fock matrix construction based on a static partitioning of work followed by a work stealing phase. For the latter, we use density matrix purification from the linear scaling methods literature, but without using sparsity. When using large numbers of nodes for moderately sized problems, density matrix computations are network-bandwidth bound, making purification methods potentially faster than eigendecomposition methods.

  20. iSIGHT-FD scalability test report.

    Energy Technology Data Exchange (ETDEWEB)

    Clay, Robert L.; Shneider, Max S.

    2008-07-01

    The engineering analysis community at Sandia National Laboratories uses a number of internal and commercial software codes and tools, including mesh generators, preprocessors, mesh manipulators, simulation codes, post-processors, and visualization packages. We define an analysis workflow as the execution of an ordered, logical sequence of these tools. Various forms of analysis (and in particular, methodologies that use multiple function evaluations or samples) involve executing parameterized variations of these workflows. As part of the DART project, we are evaluating various commercial workflow management systems, including iSIGHT-FD from Engineous. This report documents the results of a scalability test that was driven by DAKOTA and conducted on a parallel computer (Thunderbird). The purpose of this experiment was to examine the suitability and performance of iSIGHT-FD for large-scale, parameterized analysis workflows. As the results indicate, we found iSIGHT-FD to be suitable for this type of application.

  1. Scalable group level probabilistic sparse factor analysis

    DEFF Research Database (Denmark)

    Hinrich, Jesper Løve; Nielsen, Søren Føns Vind; Riis, Nicolai Andre Brogaard

    2017-01-01

    Many data-driven approaches exist to extract neural representations of functional magnetic resonance imaging (fMRI) data, but most of them lack a proper probabilistic formulation. We propose a scalable group level probabilistic sparse factor analysis (psFA) allowing spatially sparse maps, component...... pruning using automatic relevance determination (ARD) and subject specific heteroscedastic spatial noise modeling. For task-based and resting state fMRI, we show that the sparsity constraint gives rise to components similar to those obtained by group independent component analysis. The noise modeling...... shows that noise is reduced in areas typically associated with activation by the experimental design. The psFA model identifies sparse components and the probabilistic setting provides a natural way to handle parameter uncertainties. The variational Bayesian framework easily extends to more complex...

  2. Scalable on-chip quantum state tomography

    Science.gov (United States)

    Titchener, James G.; Gräfe, Markus; Heilmann, René; Solntsev, Alexander S.; Szameit, Alexander; Sukhorukov, Andrey A.

    2018-03-01

    Quantum information systems are on a path to vastly exceed the complexity of any classical device. The number of entangled qubits in quantum devices is rapidly increasing, and the information required to fully describe these systems scales exponentially with qubit number. This scaling is the key benefit of quantum systems, however it also presents a severe challenge. To characterize such systems typically requires an exponentially long sequence of different measurements, becoming highly resource demanding for large numbers of qubits. Here we propose and demonstrate a novel and scalable method for characterizing quantum systems based on expanding a multi-photon state to larger dimensionality. We establish that the complexity of this new measurement technique only scales linearly with the number of qubits, while providing a tomographically complete set of data without a need for reconfigurability. We experimentally demonstrate an integrated photonic chip capable of measuring two- and three-photon quantum states with statistical reconstruction fidelity of 99.71%.

  3. The Concept of Business Model Scalability

    DEFF Research Database (Denmark)

    Nielsen, Christian; Lund, Morten

    2015-01-01

    The power of business models lies in their ability to visualize and clarify how firms’ may configure their value creation processes. Among the key aspects of business model thinking are a focus on what the customer values, how this value is best delivered to the customer and how strategic partners...... are leveraged in this value creation, delivery and realization exercise. Central to the mainstream understanding of business models is the value proposition towards the customer and the hypothesis generated is that if the firm delivers to the customer what he/she requires, then there is a good foundation...... for a long-term profitable business. However, the message conveyed in this article is that while providing a good value proposition may help the firm ‘get by’, the really successful businesses of today are those able to reach the sweet-spot of business model scalability. This article introduces and discusses...

  4. The scalable coherent interface, IEEE P1596

    International Nuclear Information System (INIS)

    Gustavson, D.B.

    1990-01-01

    IEEE P1596, the scalable coherent interface (formerly known as SuperBus) is based on experience gained while developing Fastbus (ANSI/IEEE 960--1986, IEC 935), Futurebus (IEEE P896.x) and other modern 32-bit buses. SCI goals include a minimum bandwidth of 1 GByte/sec per processor in multiprocessor systems with thousands of processors; efficient support of a coherent distributed-cache image of distributed shared memory; support for repeaters which interface to existing or future buses; and support for inexpensive small rings as well as for general switched interconnections like Banyan, Omega, or crossbar networks. This paper presents a summary of current directions, reports the status of the work in progress, and suggests some applications in data acquisition and physics

  5. BASSET: Scalable Gateway Finder in Large Graphs

    Energy Technology Data Exchange (ETDEWEB)

    Tong, H; Papadimitriou, S; Faloutsos, C; Yu, P S; Eliassi-Rad, T

    2010-11-03

    Given a social network, who is the best person to introduce you to, say, Chris Ferguson, the poker champion? Or, given a network of people and skills, who is the best person to help you learn about, say, wavelets? The goal is to find a small group of 'gateways': persons who are close enough to us, as well as close enough to the target (person, or skill) or, in other words, are crucial in connecting us to the target. The main contributions are the following: (a) we show how to formulate this problem precisely; (b) we show that it is sub-modular and thus it can be solved near-optimally; (c) we give fast, scalable algorithms to find such gateways. Experiments on real data sets validate the effectiveness and efficiency of the proposed methods, achieving up to 6,000,000x speedup.

  6. Scalable quantum search using trapped ions

    International Nuclear Information System (INIS)

    Ivanov, S. S.; Ivanov, P. A.; Linington, I. E.; Vitanov, N. V.

    2010-01-01

    We propose a scalable implementation of Grover's quantum search algorithm in a trapped-ion quantum information processor. The system is initialized in an entangled Dicke state by using adiabatic techniques. The inversion-about-average and oracle operators take the form of single off-resonant laser pulses. This is made possible by utilizing the physical symmetries of the trapped-ion linear crystal. The physical realization of the algorithm represents a dramatic simplification: each logical iteration (oracle and inversion about average) requires only two physical interaction steps, in contrast to the large number of concatenated gates required by previous approaches. This not only facilitates the implementation but also increases the overall fidelity of the algorithm.

  7. Scalable graphene aptasensors for drug quantification

    Science.gov (United States)

    Vishnubhotla, Ramya; Ping, Jinglei; Gao, Zhaoli; Lee, Abigail; Saouaf, Olivia; Vrudhula, Amey; Johnson, A. T. Charlie

    2017-11-01

    Simpler and more rapid approaches for therapeutic drug-level monitoring are highly desirable to enable use at the point-of-care. We have developed an all-electronic approach for detection of the HIV drug tenofovir based on scalable fabrication of arrays of graphene field-effect transistors (GFETs) functionalized with a commercially available DNA aptamer. The shift in the Dirac voltage of the GFETs varied systematically with the concentration of tenofovir in deionized water, with a detection limit less than 1 ng/mL. Tests against a set of negative controls confirmed the specificity of the sensor response. This approach offers the potential for further development into a rapid and convenient point-of-care tool with clinically relevant performance.

  8. A Cross-Platform Infrastructure for Scalable Runtime Application Performance Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Jack Dongarra; Shirley Moore; Bart Miller, Jeffrey Hollingsworth; Tracy Rafferty

    2005-03-15

    The purpose of this project was to build an extensible cross-platform infrastructure to facilitate the development of accurate and portable performance analysis tools for current and future high performance computing (HPC) architectures. Major accomplishments include tools and techniques for multidimensional performance analysis, as well as improved support for dynamic performance monitoring of multithreaded and multiprocess applications. Previous performance tool development has been limited by the burden of having to re-write a platform-dependent low-level substrate for each architecture/operating system pair in order to obtain the necessary performance data from the system. Manual interpretation of performance data is not scalable for large-scale long-running applications. The infrastructure developed by this project provides a foundation for building portable and scalable performance analysis tools, with the end goal being to provide application developers with the information they need to analyze, understand, and tune the performance of terascale applications on HPC architectures. The backend portion of the infrastructure provides runtime instrumentation capability and access to hardware performance counters, with thread-safety for shared memory environments and a communication substrate to support instrumentation of multiprocess and distributed programs. Front end interfaces provides tool developers with a well-defined, platform-independent set of calls for requesting performance data. End-user tools have been developed that demonstrate runtime data collection, on-line and off-line analysis of performance data, and multidimensional performance analysis. The infrastructure is based on two underlying performance instrumentation technologies. These technologies are the PAPI cross-platform library interface to hardware performance counters and the cross-platform Dyninst library interface for runtime modification of executable images. The Paradyn and KOJAK

  9. Scalable Light Module for Low-Cost, High-Efficiency Light- Emitting Diode Luminaires

    Energy Technology Data Exchange (ETDEWEB)

    Tarsa, Eric [Cree, Inc., Goleta, CA (United States)

    2015-08-31

    During this two-year program Cree developed a scalable, modular optical architecture for low-cost, high-efficacy light emitting diode (LED) luminaires. Stated simply, the goal of this architecture was to efficiently and cost-effectively convey light from LEDs (point sources) to broad luminaire surfaces (area sources). By simultaneously developing warm-white LED components and low-cost, scalable optical elements, a high system optical efficiency resulted. To meet program goals, Cree evaluated novel approaches to improve LED component efficacy at high color quality while not sacrificing LED optical efficiency relative to conventional packages. Meanwhile, efficiently coupling light from LEDs into modular optical elements, followed by optimally distributing and extracting this light, were challenges that were addressed via novel optical design coupled with frequent experimental evaluations. Minimizing luminaire bill of materials and assembly costs were two guiding principles for all design work, in the effort to achieve luminaires with significantly lower normalized cost ($/klm) than existing LED fixtures. Chief project accomplishments included the achievement of >150 lm/W warm-white LEDs having primary optics compatible with low-cost modular optical elements. In addition, a prototype Light Module optical efficiency of over 90% was measured, demonstrating the potential of this scalable architecture for ultra-high-efficacy LED luminaires. Since the project ended, Cree has continued to evaluate optical element fabrication and assembly methods in an effort to rapidly transfer this scalable, cost-effective technology to Cree production development groups. The Light Module concept is likely to make a strong contribution to the development of new cost-effective, high-efficacy luminaries, thereby accelerating widespread adoption of energy-saving SSL in the U.S.

  10. Scalable Transactions for Web Applications in the Cloud

    NARCIS (Netherlands)

    Zhou, W.; Pierre, G.E.O.; Chi, C.-H.

    2009-01-01

    Cloud Computing platforms provide scalability and high availability properties for web applications but they sacrifice data consistency at the same time. However, many applications cannot afford any data inconsistency. We present a scalable transaction manager for NoSQL cloud database services to

  11. New Complexity Scalable MPEG Encoding Techniques for Mobile Applications

    Directory of Open Access Journals (Sweden)

    Stephan Mietens

    2004-03-01

    Full Text Available Complexity scalability offers the advantage of one-time design of video applications for a large product family, including mobile devices, without the need of redesigning the applications on the algorithmic level to meet the requirements of the different products. In this paper, we present complexity scalable MPEG encoding having core modules with modifications for scalability. The interdependencies of the scalable modules and the system performance are evaluated. Experimental results show scalability giving a smooth change in complexity and corresponding video quality. Scalability is basically achieved by varying the number of computed DCT coefficients and the number of evaluated motion vectors but other modules are designed such they scale with the previous parameters. In the experiments using the “Stefan” sequence, the elapsed execution time of the scalable encoder, reflecting the computational complexity, can be gradually reduced to roughly 50% of its original execution time. The video quality scales between 20 dB and 48 dB PSNR with unity quantizer setting, and between 21.5 dB and 38.5 dB PSNR for different sequences targeting 1500 kbps. The implemented encoder and the scalability techniques can be successfully applied in mobile systems based on MPEG video compression.

  12. Building scalable apps with Redis and Node.js

    CERN Document Server

    Johanan, Joshua

    2014-01-01

    If the phrase scalability sounds alien to you, then this is an ideal book for you. You will not need much Node.js experience as each framework is demonstrated in a way that requires no previous knowledge of the framework. You will be building scalable Node.js applications in no time! Knowledge of JavaScript is required.

  13. Fourier transform based scalable image quality measure.

    Science.gov (United States)

    Narwaria, Manish; Lin, Weisi; McLoughlin, Ian; Emmanuel, Sabu; Chia, Liang-Tien

    2012-08-01

    We present a new image quality assessment (IQA) algorithm based on the phase and magnitude of the 2D (twodimensional) Discrete Fourier Transform (DFT). The basic idea is to compare the phase and magnitude of the reference and distorted images to compute the quality score. However, it is well known that the Human Visual Systems (HVSs) sensitivity to different frequency components is not the same. We accommodate this fact via a simple yet effective strategy of nonuniform binning of the frequency components. This process also leads to reduced space representation of the image thereby enabling the reduced-reference (RR) prospects of the proposed scheme. We employ linear regression to integrate the effects of the changes in phase and magnitude. In this way, the required weights are determined via proper training and hence more convincing and effective. Lastly, using the fact that phase usually conveys more information than magnitude, we use only the phase for RR quality assessment. This provides the crucial advantage of further reduction in the required amount of reference image information. The proposed method is therefore further scalable for RR scenarios. We report extensive experimental results using a total of 9 publicly available databases: 7 image (with a total of 3832 distorted images with diverse distortions) and 2 video databases (totally 228 distorted videos). These show that the proposed method is overall better than several of the existing fullreference (FR) algorithms and two RR algorithms. Additionally, there is a graceful degradation in prediction performance as the amount of reference image information is reduced thereby confirming its scalability prospects. To enable comparisons and future study, a Matlab implementation of the proposed algorithm is available at http://www.ntu.edu.sg/home/wslin/reduced_phase.rar.

  14. Improving diabetes medication adherence: successful, scalable interventions

    Directory of Open Access Journals (Sweden)

    Zullig LL

    2015-01-01

    Full Text Available Leah L Zullig,1,2 Walid F Gellad,3,4 Jivan Moaddeb,2,5 Matthew J Crowley,1,2 William Shrank,6 Bradi B Granger,7 Christopher B Granger,8 Troy Trygstad,9 Larry Z Liu,10 Hayden B Bosworth1,2,7,11 1Center for Health Services Research in Primary Care, Durham Veterans Affairs Medical Center, Durham, NC, USA; 2Department of Medicine, Duke University, Durham, NC, USA; 3Center for Health Equity Research and Promotion, Pittsburgh Veterans Affairs Medical Center, Pittsburgh, PA, USA; 4Division of General Internal Medicine, University of Pittsburgh, Pittsburgh, PA, USA; 5Institute for Genome Sciences and Policy, Duke University, Durham, NC, USA; 6CVS Caremark Corporation; 7School of Nursing, Duke University, Durham, NC, USA; 8Department of Medicine, Division of Cardiology, Duke University School of Medicine, Durham, NC, USA; 9North Carolina Community Care Networks, Raleigh, NC, USA; 10Pfizer, Inc., and Weill Medical College of Cornell University, New York, NY, USA; 11Department of Psychiatry and Behavioral Sciences, Duke University School of Medicine, Durham, NC, USA Abstract: Effective medications are a cornerstone of prevention and disease treatment, yet only about half of patients take their medications as prescribed, resulting in a common and costly public health challenge for the US healthcare system. Since poor medication adherence is a complex problem with many contributing causes, there is no one universal solution. This paper describes interventions that were not only effective in improving medication adherence among patients with diabetes, but were also potentially scalable (ie, easy to implement to a large population. We identify key characteristics that make these interventions effective and scalable. This information is intended to inform healthcare systems seeking proven, low resource, cost-effective solutions to improve medication adherence. Keywords: medication adherence, diabetes mellitus, chronic disease, dissemination research

  15. Advanced technologies for scalable ATLAS conditions database access on the grid

    International Nuclear Information System (INIS)

    Basset, R; Canali, L; Girone, M; Hawkings, R; Valassi, A; Viegas, F; Dimitrov, G; Nevski, P; Vaniachine, A; Walker, R; Wong, A

    2010-01-01

    During massive data reprocessing operations an ATLAS Conditions Database application must support concurrent access from numerous ATLAS data processing jobs running on the Grid. By simulating realistic work-flow, ATLAS database scalability tests provided feedback for Conditions Db software optimization and allowed precise determination of required distributed database resources. In distributed data processing one must take into account the chaotic nature of Grid computing characterized by peak loads, which can be much higher than average access rates. To validate database performance at peak loads, we tested database scalability at very high concurrent jobs rates. This has been achieved through coordinated database stress tests performed in series of ATLAS reprocessing exercises at the Tier-1 sites. The goal of database stress tests is to detect scalability limits of the hardware deployed at the Tier-1 sites, so that the server overload conditions can be safely avoided in a production environment. Our analysis of server performance under stress tests indicates that Conditions Db data access is limited by the disk I/O throughput. An unacceptable side-effect of the disk I/O saturation is a degradation of the WLCG 3D Services that update Conditions Db data at all ten ATLAS Tier-1 sites using the technology of Oracle Streams. To avoid such bottlenecks we prototyped and tested a novel approach for database peak load avoidance in Grid computing. Our approach is based upon the proven idea of pilot job submission on the Grid: instead of the actual query, an ATLAS utility library sends to the database server a pilot query first.

  16. Construction of a smart medication dispenser with high degree of scalability and remote manageability.

    Science.gov (United States)

    Pak, JuGeon; Park, KeeHyun

    2012-01-01

    We propose a smart medication dispenser having a high degree of scalability and remote manageability. We construct the dispenser to have extensible hardware architecture for achieving scalability, and we install an agent program in it for achieving remote manageability. The dispenser operates as follows: when the real-time clock reaches the predetermined medication time and the user presses the dispense button at that time, the predetermined medication is dispensed from the medication dispensing tray (MDT). In the proposed dispenser, the medication for each patient is stored in an MDT. One smart medication dispenser contains mainly one MDT; however, the dispenser can be extended to include more MDTs in order to support multiple users using one dispenser. For remote management, the proposed dispenser transmits the medication status and the system configurations to the monitoring server. In the case of a specific event such as a shortage of medication, memory overload, software error, or non-adherence, the event is transmitted immediately. All these operations are performed automatically without the intervention of patients, through the agent program installed in the dispenser. Results of implementation and verification show that the proposed dispenser operates normally and performs the management operations from the medication monitoring server suitably.

  17. Construction of a Smart Medication Dispenser with High Degree of Scalability and Remote Manageability

    Directory of Open Access Journals (Sweden)

    JuGeon Pak

    2012-01-01

    Full Text Available We propose a smart medication dispenser having a high degree of scalability and remote manageability. We construct the dispenser to have extensible hardware architecture for achieving scalability, and we install an agent program in it for achieving remote manageability. The dispenser operates as follows: when the real-time clock reaches the predetermined medication time and the user presses the dispense button at that time, the predetermined medication is dispensed from the medication dispensing tray (MDT. In the proposed dispenser, the medication for each patient is stored in an MDT. One smart medication dispenser contains mainly one MDT; however, the dispenser can be extended to include more MDTs in order to support multiple users using one dispenser. For remote management, the proposed dispenser transmits the medication status and the system configurations to the monitoring server. In the case of a specific event such as a shortage of medication, memory overload, software error, or non-adherence, the event is transmitted immediately. All these operations are performed automatically without the intervention of patients, through the agent program installed in the dispenser. Results of implementation and verification show that the proposed dispenser operates normally and performs the management operations from the medication monitoring server suitably.

  18. Novel flat datacenter network architecture based on scalable and flow-controlled optical switch system.

    Science.gov (United States)

    Miao, Wang; Luo, Jun; Di Lucente, Stefano; Dorren, Harm; Calabretta, Nicola

    2014-02-10

    We propose and demonstrate an optical flat datacenter network based on scalable optical switch system with optical flow control. Modular structure with distributed control results in port-count independent optical switch reconfiguration time. RF tone in-band labeling technique allowing parallel processing of the label bits ensures the low latency operation regardless of the switch port-count. Hardware flow control is conducted at optical level by re-using the label wavelength without occupying extra bandwidth, space, and network resources which further improves the performance of latency within a simple structure. Dynamic switching including multicasting operation is validated for a 4 x 4 system. Error free operation of 40 Gb/s data packets has been achieved with only 1 dB penalty. The system could handle an input load up to 0.5 providing a packet loss lower that 10(-5) and an average latency less that 500 ns when a buffer size of 16 packets is employed. Investigation on scalability also indicates that the proposed system could potentially scale up to large port count with limited power penalty.

  19. Accelerating Climate Simulations Through Hybrid Computing

    Science.gov (United States)

    Zhou, Shujia; Sinno, Scott; Cruz, Carlos; Purcell, Mark

    2009-01-01

    Unconventional multi-core processors (e.g., IBM Cell B/E and NYIDIDA GPU) have emerged as accelerators in climate simulation. However, climate models typically run on parallel computers with conventional processors (e.g., Intel and AMD) using MPI. Connecting accelerators to this architecture efficiently and easily becomes a critical issue. When using MPI for connection, we identified two challenges: (1) identical MPI implementation is required in both systems, and; (2) existing MPI code must be modified to accommodate the accelerators. In response, we have extended and deployed IBM Dynamic Application Virtualization (DAV) in a hybrid computing prototype system (one blade with two Intel quad-core processors, two IBM QS22 Cell blades, connected with Infiniband), allowing for seamlessly offloading compute-intensive functions to remote, heterogeneous accelerators in a scalable, load-balanced manner. Currently, a climate solar radiation model running with multiple MPI processes has been offloaded to multiple Cell blades with approx.10% network overhead.

  20. Fuel cell hardware-in-loop

    Energy Technology Data Exchange (ETDEWEB)

    Moore, R.M.; Randolf, G.; Virji, M. [University of Hawaii, Hawaii Natural Energy Institute (United States); Hauer, K.H. [Xcellvision (Germany)

    2006-11-08

    Hardware-in-loop (HiL) methodology is well established in the automotive industry. One typical application is the development and validation of control algorithms for drive systems by simulating the vehicle plus the vehicle environment in combination with specific control hardware as the HiL component. This paper introduces the use of a fuel cell HiL methodology for fuel cell and fuel cell system design and evaluation-where the fuel cell (or stack) is the unique HiL component that requires evaluation and development within the context of a fuel cell system designed for a specific application (e.g., a fuel cell vehicle) in a typical use pattern (e.g., a standard drive cycle). Initial experimental results are presented for the example of a fuel cell within a fuel cell vehicle simulation under a dynamic drive cycle. (author)

  1. Advanced hardware design for error correcting codes

    CERN Document Server

    Coussy, Philippe

    2015-01-01

    This book provides thorough coverage of error correcting techniques. It includes essential basic concepts and the latest advances on key topics in design, implementation, and optimization of hardware/software systems for error correction. The book’s chapters are written by internationally recognized experts in this field. Topics include evolution of error correction techniques, industrial user needs, architectures, and design approaches for the most advanced error correcting codes (Polar Codes, Non-Binary LDPC, Product Codes, etc). This book provides access to recent results, and is suitable for graduate students and researchers of mathematics, computer science, and engineering. • Examines how to optimize the architecture of hardware design for error correcting codes; • Presents error correction codes from theory to optimized architecture for the current and the next generation standards; • Provides coverage of industrial user needs advanced error correcting techniques.

  2. FMIT accelerator

    International Nuclear Information System (INIS)

    Armstrong, D.D.

    1983-01-01

    A 35-MeV 100-mA cw linear accelerator is being designed by Los Alamos for use in the Fusion Materials Irradiation Test (FMIT) Facility. Essential to this program is the design, construction, and evaluation of performance of the accelerator's injector, low-energy beam transport, and radio-frequency quadrupole sections before they are shipped to the facility site. The installation and testing of some of these sections have begun as well as the testing of the rf, noninterceptive beam diagnostics, computer control, dc power, and vacuum systems. An overview of the accelerator systems and the performance to date is given

  3. Electron accelerator

    International Nuclear Information System (INIS)

    Abramyan.

    1981-01-01

    The USSR produces an electron accelerator family of a simple design powered straight from the mains. The specifications are given of accelerators ELITA-400, ELITA-3, ELT-2, TEUS-3 and RIUS-5 with maximum electron energies of 0.3 to 5 MeV, a mean power of 10 to 70 kW operating in both the pulsed and the continuous (TEUS-3) modes. Pulsed accelerators ELITA-400 and ELITA-3 and RIUS-5 in which TESLA resonance transformers are used are characterized by their compact size. (Ha)

  4. Scalable and Media Aware Adaptive Video Streaming over Wireless Networks

    Directory of Open Access Journals (Sweden)

    Béatrice Pesquet-Popescu

    2008-07-01

    Full Text Available This paper proposes an advanced video streaming system based on scalable video coding in order to optimize resource utilization in wireless networks with retransmission mechanisms at radio protocol level. The key component of this system is a packet scheduling algorithm which operates on the different substreams of a main scalable video stream and which is implemented in a so-called media aware network element. The concerned type of transport channel is a dedicated channel subject to parameters (bitrate, loss rate variations on the long run. Moreover, we propose a combined scalability approach in which common temporal and SNR scalability features can be used jointly with a partitioning of the image into regions of interest. Simulation results show that our approach provides substantial quality gain compared to classical packet transmission methods and they demonstrate how ROI coding combined with SNR scalability allows to improve again the visual quality.

  5. Hardware Design of a Smart Meter

    OpenAIRE

    Ganiyu A. Ajenikoko; Anthony A. Olaomi

    2014-01-01

    Smart meters are electronic measurement devices used by utilities to communicate information for billing customers and operating their electric systems. This paper presents the hardware design of a smart meter. Sensing and circuit protection circuits are included in the design of the smart meter in which resistors are naturally a fundamental part of the electronic design. Smart meters provides a route for energy savings, real-time pricing, automated data collection and elimina...

  6. Optimization Strategies for Hardware-Based Cofactorization

    Science.gov (United States)

    Loebenberger, Daniel; Putzka, Jens

    We use the specific structure of the inputs to the cofactorization step in the general number field sieve (GNFS) in order to optimize the runtime for the cofactorization step on a hardware cluster. An optimal distribution of bitlength-specific ECM modules is proposed and compared to existing ones. With our optimizations we obtain a speedup between 17% and 33% of the cofactorization step of the GNFS when compared to the runtime of an unoptimized cluster.

  7. Particle Transport Simulation on Heterogeneous Hardware

    CERN Multimedia

    CERN. Geneva

    2014-01-01

    CPUs and GPGPUs. About the speaker Vladimir Koylazov is CTO and founder of Chaos Software and one of the original developers of the V-Ray raytracing software. Passionate about 3D graphics and programming, Vlado is the driving force behind Chaos Group's software solutions. He participated in the implementation of algorithms for accurate light simulations and support for different hardware platforms, including CPU and GPGPU, as well as distributed calculat...

  8. High exposure rate hardware ALARA plan

    International Nuclear Information System (INIS)

    Nellesen, A.L.

    1996-10-01

    This as low as reasonably achievable review provides a description of the engineering and administrative controls used to manage personnel exposure and to control contamination levels and airborne radioactivity concentrations. HERH waste is hardware found in the N-Fuel Storage Basin, which has a contact dose rate greater than 1 R/hr and used filters. This waste will be collected in the fuel baskets at various locations in the basins

  9. From experiment to design -- Fault characterization and detection in parallel computer systems using computational accelerators

    Science.gov (United States)

    Yim, Keun Soo

    This dissertation summarizes experimental validation and co-design studies conducted to optimize the fault detection capabilities and overheads in hybrid computer systems (e.g., using CPUs and Graphics Processing Units, or GPUs), and consequently to improve the scalability of parallel computer systems using computational accelerators. The experimental validation studies were conducted to help us understand the failure characteristics of CPU-GPU hybrid computer systems under various types of hardware faults. The main characterization targets were faults that are difficult to detect and/or recover from, e.g., faults that cause long latency failures (Ch. 3), faults in dynamically allocated resources (Ch. 4), faults in GPUs (Ch. 5), faults in MPI programs (Ch. 6), and microarchitecture-level faults with specific timing features (Ch. 7). The co-design studies were based on the characterization results. One of the co-designed systems has a set of source-to-source translators that customize and strategically place error detectors in the source code of target GPU programs (Ch. 5). Another co-designed system uses an extension card to learn the normal behavioral and semantic execution patterns of message-passing processes executing on CPUs, and to detect abnormal behaviors of those parallel processes (Ch. 6). The third co-designed system is a co-processor that has a set of new instructions in order to support software-implemented fault detection techniques (Ch. 7). The work described in this dissertation gains more importance because heterogeneous processors have become an essential component of state-of-the-art supercomputers. GPUs were used in three of the five fastest supercomputers that were operating in 2011. Our work included comprehensive fault characterization studies in CPU-GPU hybrid computers. In CPUs, we monitored the target systems for a long period of time after injecting faults (a temporally comprehensive experiment), and injected faults into various types of

  10. Trends in computer hardware and software.

    Science.gov (United States)

    Frankenfeld, F M

    1993-04-01

    Previously identified and current trends in the development of computer systems and in the use of computers for health care applications are reviewed. Trends identified in a 1982 article were increasing miniaturization and archival ability, increasing software costs, increasing software independence, user empowerment through new software technologies, shorter computer-system life cycles, and more rapid development and support of pharmaceutical services. Most of these trends continue today. Current trends in hardware and software include the increasing use of reduced instruction-set computing, migration to the UNIX operating system, the development of large software libraries, microprocessor-based smart terminals that allow remote validation of data, speech synthesis and recognition, application generators, fourth-generation languages, computer-aided software engineering, object-oriented technologies, and artificial intelligence. Current trends specific to pharmacy and hospitals are the withdrawal of vendors of hospital information systems from the pharmacy market, improved linkage of information systems within hospitals, and increased regulation by government. The computer industry and its products continue to undergo dynamic change. Software development continues to lag behind hardware, and its high cost is offsetting the savings provided by hardware.

  11. Software error masking effect on hardware faults

    International Nuclear Information System (INIS)

    Choi, Jong Gyun; Seong, Poong Hyun

    1999-01-01

    Based on the Very High Speed Integrated Circuit (VHSIC) Hardware Description Language (VHDL), in this work, a simulation model for fault injection is developed to estimate the dependability of the digital system in operational phase. We investigated the software masking effect on hardware faults through the single bit-flip and stuck-at-x fault injection into the internal registers of the processor and memory cells. The fault location reaches all registers and memory cells. Fault distribution over locations is randomly chosen based on a uniform probability distribution. Using this model, we have predicted the reliability and masking effect of an application software in a digital system-Interposing Logic System (ILS) in a nuclear power plant. We have considered four the software operational profiles. From the results it was found that the software masking effect on hardware faults should be properly considered for predicting the system dependability accurately in operation phase. It is because the masking effect was formed to have different values according to the operational profile

  12. A Hardware Lab Anywhere At Any Time

    Directory of Open Access Journals (Sweden)

    Tobias Schubert

    2004-12-01

    Full Text Available Scientific technical courses are an important component in any student's education. These courses are usually characterised by the fact that the students execute experiments in special laboratories. This leads to extremely high costs and a reduction in the maximum number of possible participants. From this traditional point of view, it doesn't seem possible to realise the concepts of a Virtual University in the context of sophisticated technical courses since the students must be "on the spot". In this paper we introduce the so-called Mobile Hardware Lab which makes student participation possible at any time and from any place. This lab nevertheless transfers a feeling of being present in a laboratory. This is accomplished with a special Learning Management System in combination with hardware components which correspond to a fully equipped laboratory workstation that are lent out to the students for the duration of the lab. The experiments are performed and solved at home, then handed in electronically. Judging and marking are also both performed electronically. Since 2003 the Mobile Hardware Lab is now offered in a completely web based form.

  13. Instrument hardware and software upgrades at IPNS

    International Nuclear Information System (INIS)

    Worlton, Thomas; Hammonds, John; Mikkelson, D.; Mikkelson, Ruth; Porter, Rodney; Tao, Julian; Chatterjee, Alok

    2006-01-01

    IPNS is in the process of upgrading their time-of-flight neutron scattering instruments with improved hardware and software. The hardware upgrades include replacing old VAX Qbus and Multibus-based data acquisition systems with new systems based on VXI and VME. Hardware upgrades also include expanded detector banks and new detector electronics. Old VAX Fortran-based data acquisition and analysis software is being replaced with new software as part of the ISAW project. ISAW is written in Java for ease of development and portability, and is now used routinely for data visualization, reduction, and analysis on all upgraded instruments. ISAW provides the ability to process and visualize the data from thousands of detector pixels, each having thousands of time channels. These operations can be done interactively through a familiar graphical user interface or automatically through simple scripts. Scripts and operators provided by end users are automatically included in the ISAW menu structure, along with those distributed with ISAW, when the application is started

  14. Optimized hardware framework of MLP with random hidden layers for classification applications

    Science.gov (United States)

    Zyarah, Abdullah M.; Ramesh, Abhishek; Merkel, Cory; Kudithipudi, Dhireesha

    2016-05-01

    Multilayer Perceptron Networks with random hidden layers are very efficient at automatic feature extraction and offer significant performance improvements in the training process. They essentially employ large collection of fixed, random features, and are expedient for form-factor constrained embedded platforms. In this work, a reconfigurable and scalable architecture is proposed for the MLPs with random hidden layers with a customized building block based on CORDIC algorithm. The proposed architecture also exploits fixed point operations for area efficiency. The design is validated for classification on two different datasets. An accuracy of ~ 90% for MNIST dataset and 75% for gender classification on LFW dataset was observed. The hardware has 299 speed-up over the corresponding software realization.

  15. Horizontal Accelerator

    Data.gov (United States)

    Federal Laboratory Consortium — The Horizontal Accelerator (HA) Facility is a versatile research tool available for use on projects requiring simulation of the crash environment. The HA Facility is...

  16. Acceleration theorems

    International Nuclear Information System (INIS)

    Palmer, R.

    1994-06-01

    Electromagnetic fields can be separated into near and far components. Near fields are extensions of static fields. They do not radiate, and they fall off more rapidly from a source than far fields. Near fields can accelerate particles, but the ratio of acceleration to source fields at a distance R, is always less than R/λ or 1, whichever is smaller. Far fields can be represented as sums of plane parallel, transversely polarized waves that travel at the velocity of light. A single such wave in a vacuum cannot give continuous acceleration, and it is shown that no sums of such waves can give net first order acceleration. This theorem is proven in three different ways; each method showing a different aspect of the situation

  17. LINEAR ACCELERATOR

    Science.gov (United States)

    Christofilos, N.C.; Polk, I.J.

    1959-02-17

    Improvements in linear particle accelerators are described. A drift tube system for a linear ion accelerator reduces gap capacity between adjacent drift tube ends. This is accomplished by reducing the ratio of the diameter of the drift tube to the diameter of the resonant cavity. Concentration of magnetic field intensity at the longitudinal midpoint of the external sunface of each drift tube is reduced by increasing the external drift tube diameter at the longitudinal center region.

  18. Fast volume reconstruction in positron emission tomography: Implementation of four algorithms on a high-performance scalable parallel platform

    International Nuclear Information System (INIS)

    Egger, M.L.; Scheurer, A.H.; Joseph, C.

    1996-01-01

    The issue of long reconstruction times in PET has been addressed from several points of view, resulting in an affordable dedicated system capable of handling routine 3D reconstruction in a few minutes per frame: on the hardware side using fast processors and a parallel architecture, and on the software side, using efficient implementations of computationally less intensive algorithms. Execution times obtained for the PRT-1 data set on a parallel system of five hybrid nodes, each combining an Alpha processor for computation and a transputer for communication, are the following (256 sinograms of 96 views by 128 radial samples): Ramp algorithm 56 s, Favor 81 s and reprojection algorithm of Kinahan and Rogers 187 s. The implementation of fast rebinning algorithms has shown our hardware platform to become communications-limited; they execute faster on a conventional single-processor Alpha workstation: single-slice rebinning 7 s, Fourier rebinning 22 s, 2D filtered backprojection 5 s. The scalability of the system has been demonstrated, and a saturation effect at network sizes above ten nodes has become visible; new T9000-based products lifting most of the constraints on network topology and link throughput are expected to result in improved parallel efficiency and scalability properties

  19. Center for Programming Models for Scalable Parallel Computing - Towards Enhancing OpenMP for Manycore and Heterogeneous Nodes

    Energy Technology Data Exchange (ETDEWEB)

    Barbara Chapman

    2012-02-01

    OpenMP was not well recognized at the beginning of the project, around year 2003, because of its limited use in DoE production applications and the inmature hardware support for an efficient implementation. Yet in the recent years, it has been graduately adopted both in HPC applications, mostly in the form of MPI+OpenMP hybrid code, and in mid-scale desktop applications for scientific and experimental studies. We have observed this trend and worked deligiently to improve our OpenMP compiler and runtimes, as well as to work with the OpenMP standard organization to make sure OpenMP are evolved in the direction close to DoE missions. In the Center for Programming Models for Scalable Parallel Computing project, the HPCTools team at the University of Houston (UH), directed by Dr. Barbara Chapman, has been working with project partners, external collaborators and hardware vendors to increase the scalability and applicability of OpenMP for multi-core (and future manycore) platforms and for distributed memory systems by exploring different programming models, language extensions, compiler optimizations, as well as runtime library support.

  20. Hardware implementation of a GFSR pseudo-random number generator

    Science.gov (United States)

    Aiello, G. R.; Budinich, M.; Milotti, E.

    1989-12-01

    We describe the hardware implementation of a pseudo-random number generator of the "Generalized Feedback Shift Register" (GFSR) type. After brief theoretical considerations we describe two versions of the hardware, the tests done and the performances achieved.

  1. Building Scalable Knowledge Graphs for Earth Science

    Science.gov (United States)

    Ramachandran, Rahul; Maskey, Manil; Gatlin, Patrick; Zhang, Jia; Duan, Xiaoyi; Miller, J. J.; Bugbee, Kaylin; Christopher, Sundar; Freitag, Brian

    2017-01-01

    Knowledge Graphs link key entities in a specific domain with other entities via relationships. From these relationships, researchers can query knowledge graphs for probabilistic recommendations to infer new knowledge. Scientific papers are an untapped resource which knowledge graphs could leverage to accelerate research discovery. Goal: Develop an end-to-end (semi) automated methodology for constructing Knowledge Graphs for Earth Science.

  2. Broadband accelerator control network

    International Nuclear Information System (INIS)

    Skelly, J.; Clifford, T.; Frankel, R.

    1983-01-01

    A broadband data communications network has been implemented at BNL for control of the Alternating Gradient Synchrotron (AG) proton accelerator, using commercial CATV hardware, dual coaxial cables as the communications medium, and spanning 2.0 km. A 4 MHz bandwidth Digital Control channel using CSMA-CA protocol is provided for digital data transmission, with 8 access nodes available over the length of the RELWAY. Each node consists of an rf modem and a microprocessor-based store-and-forward message handler which interfaces the RELWAY to a branch line implemented in GPIB. A gateway to the RELWAY control channel for the (preexisting) AGS Computerized Accelerator Operating system has been constructed using an LSI-11/23 microprocessor as a device in a GPIB branch line. A multilayer communications protocol has been defined for the Digital Control Channel, based on the ISO Open Systems Interconnect layered model, and a RELWAY Device Language defined as the required universal language for device control on this channel

  3. Open Source Hardware for DIY Environmental Sensing

    Science.gov (United States)

    Aufdenkampe, A. K.; Hicks, S. D.; Damiano, S. G.; Montgomery, D. S.

    2014-12-01

    The Arduino open source electronics platform has been very popular within the DIY (Do It Yourself) community for several years, and it is now providing environmental science researchers with an inexpensive alternative to commercial data logging and transmission hardware. Here we present the designs for our latest series of custom Arduino-based dataloggers, which include wireless communication options like self-meshing radio networks and cellular phone modules. The main Arduino board uses a custom interface board to connect to various research-grade sensors to take readings of turbidity, dissolved oxygen, water depth and conductivity, soil moisture, solar radiation, and other parameters. Sensors with SDI-12 communications can be directly interfaced to the logger using our open Arduino-SDI-12 software library (https://github.com/StroudCenter/Arduino-SDI-12). Different deployment options are shown, like rugged enclosures to house the loggers and rigs for mounting the sensors in both fresh water and marine environments. After the data has been collected and transmitted by the logger, the data is received by a mySQL-PHP stack running on a web server that can be accessed from anywhere in the world. Once there, the data can be visualized on web pages or served though REST requests and Water One Flow (WOF) services. Since one of the main benefits of using open source hardware is the easy collaboration between users, we are introducing a new web platform for discussion and sharing of ideas and plans for hardware and software designs used with DIY environmental sensors and data loggers.

  4. Computer hardware for radiologists: Part 2

    Directory of Open Access Journals (Sweden)

    Indrajit I

    2010-01-01

    Full Text Available Computers are an integral part of modern radiology equipment. In the first half of this two-part article, we dwelt upon some fundamental concepts regarding computer hardware, covering components like motherboard, central processing unit (CPU, chipset, random access memory (RAM, and memory modules. In this article, we describe the remaining computer hardware components that are of relevance to radiology. "Storage drive" is a term describing a "memory" hardware used to store data for later retrieval. Commonly used storage drives are hard drives, floppy drives, optical drives, flash drives, and network drives. The capacity of a hard drive is dependent on many factors, including the number of disk sides, number of tracks per side, number of sectors on each track, and the amount of data that can be stored in each sector. "Drive interfaces" connect hard drives and optical drives to a computer. The connections of such drives require both a power cable and a data cable. The four most popular "input/output devices" used commonly with computers are the printer, monitor, mouse, and keyboard. The "bus" is a built-in electronic signal pathway in the motherboard to permit efficient and uninterrupted data transfer. A motherboard can have several buses, including the system bus, the PCI express bus, the PCI bus, the AGP bus, and the (outdated ISA bus. "Ports" are the location at which external devices are connected to a computer motherboard. All commonly used peripheral devices, such as printers, scanners, and portable drives, need ports. A working knowledge of computers is necessary for the radiologist if the workflow is to realize its full potential and, besides, this knowledge will prepare the radiologist for the coming innovations in the ′ever increasing′ digital future.

  5. Computer hardware for radiologists: Part 2

    International Nuclear Information System (INIS)

    Indrajit, IK; Alam, A

    2010-01-01

    Computers are an integral part of modern radiology equipment. In the first half of this two-part article, we dwelt upon some fundamental concepts regarding computer hardware, covering components like motherboard, central processing unit (CPU), chipset, random access memory (RAM), and memory modules. In this article, we describe the remaining computer hardware components that are of relevance to radiology. “Storage drive” is a term describing a “memory” hardware used to store data for later retrieval. Commonly used storage drives are hard drives, floppy drives, optical drives, flash drives, and network drives. The capacity of a hard drive is dependent on many factors, including the number of disk sides, number of tracks per side, number of sectors on each track, and the amount of data that can be stored in each sector. “Drive interfaces” connect hard drives and optical drives to a computer. The connections of such drives require both a power cable and a data cable. The four most popular “input/output devices” used commonly with computers are the printer, monitor, mouse, and keyboard. The “bus” is a built-in electronic signal pathway in the motherboard to permit efficient and uninterrupted data transfer. A motherboard can have several buses, including the system bus, the PCI express bus, the PCI bus, the AGP bus, and the (outdated) ISA bus. “Ports” are the location at which external devices are connected to a computer motherboard. All commonly used peripheral devices, such as printers, scanners, and portable drives, need ports. A working knowledge of computers is necessary for the radiologist if the workflow is to realize its full potential and, besides, this knowledge will prepare the radiologist for the coming innovations in the ‘ever increasing’ digital future

  6. The Impact of Flight Hardware Scavenging on Space Logistics

    Science.gov (United States)

    Oeftering, Richard C.

    2011-01-01

    For a given fixed launch vehicle capacity the logistics payload delivered to the moon may be only roughly 20 percent of the payload delivered to the International Space Station (ISS). This is compounded by the much lower flight frequency to the moon and thus low availability of spares for maintenance. This implies that lunar hardware is much more scarce and more costly per kilogram than ISS and thus there is much more incentive to preserve hardware. The Constellation Lunar Surface System (LSS) program is considering ways of utilizing hardware scavenged from vehicles including the Altair lunar lander. In general, the hardware will have only had a matter of hours of operation yet there may be years of operational life remaining. By scavenging this hardware the program, in effect, is treating vehicle hardware as part of the payload. Flight hardware may provide logistics spares for system maintenance and reduce the overall logistics footprint. This hardware has a wide array of potential applications including expanding the power infrastructure, and exploiting in-situ resources. Scavenging can also be seen as a way of recovering the value of, literally, billions of dollars worth of hardware that would normally be discarded. Scavenging flight hardware adds operational complexity and steps must be taken to augment the crew s capability with robotics, capabilities embedded in flight hardware itself, and external processes. New embedded technologies are needed to make hardware more serviceable and scavengable. Process technologies are needed to extract hardware, evaluate hardware, reconfigure or repair hardware, and reintegrate it into new applications. This paper also illustrates how scavenging can be used to drive down the cost of the overall program by exploiting the intrinsic value of otherwise discarded flight hardware.

  7. Oracle database performance and scalability a quantitative approach

    CERN Document Server

    Liu, Henry H

    2011-01-01

    A data-driven, fact-based, quantitative text on Oracle performance and scalability With database concepts and theories clearly explained in Oracle's context, readers quickly learn how to fully leverage Oracle's performance and scalability capabilities at every stage of designing and developing an Oracle-based enterprise application. The book is based on the author's more than ten years of experience working with Oracle, and is filled with dependable, tested, and proven performance optimization techniques. Oracle Database Performance and Scalability is divided into four parts that enable reader

  8. Scalable-to-lossless transform domain distributed video coding

    DEFF Research Database (Denmark)

    Huang, Xin; Ukhanova, Ann; Veselov, Anton

    2010-01-01

    Distributed video coding (DVC) is a novel approach providing new features as low complexity encoding by mainly exploiting the source statistics at the decoder based on the availability of decoder side information. In this paper, scalable-tolossless DVC is presented based on extending a lossy Tran...... codec provides frame by frame encoding. Comparing the lossless coding efficiency, the proposed scalable-to-lossless TDWZ video codec can save up to 5%-13% bits compared to JPEG LS and H.264 Intra frame lossless coding and do so as a scalable-to-lossless coding....

  9. Design issues for numerical libraries on scalable multicore architectures

    International Nuclear Information System (INIS)

    Heroux, M A

    2008-01-01

    Future generations of scalable computers will rely on multicore nodes for a significant portion of overall system performance. At present, most applications and libraries cannot exploit multiple cores beyond running addition MPI processes per node. In this paper we discuss important multicore architecture issues, programming models, algorithms requirements and software design related to effective use of scalable multicore computers. In particular, we focus on important issues for library research and development, making recommendations for how to effectively develop libraries for future scalable computer systems

  10. Management of cladding hulls and fuel hardware

    International Nuclear Information System (INIS)

    1985-01-01

    The reprocessing of spent fuel from power reactors based on chop-leach technology produces a solid waste product of cladding hulls and other metallic residues. This report describes the current situation in the management of fuel cladding hulls and hardware. Information is presented on the material composition of such waste together with the heating effects due to neutron-induced activation products and fuel contamination. As no country has established a final disposal route and the corresponding repository, this report also discusses possible disposal routes and various disposal options under consideration at present

  11. Hardware for computing the integral image

    OpenAIRE

    Fernández-Berni, J.; Rodríguez-Vázquez, Ángel; Río, Rocío del; Carmona-Galán, R.

    2015-01-01

    La presente invención, según se expresa en el enunciado de esta memoria descriptiva, consiste en hardware de señal mixta para cómputo de la imagen integral en el plano focal mediante una agrupación de celdas básicas de sensado-procesamiento cuya interconexión puede ser reconfigurada mediante circuitería periférica que hace posible una implementación muy eficiente de una tarea de procesamiento muy útil en visión artificial como es el cálculo de la imagen integral en escenarios tales como monit...

  12. Development of Hardware Dual Modality Tomography System

    Directory of Open Access Journals (Sweden)

    R. M. Zain

    2009-06-01

    Full Text Available The paper describes the hardware development and performance of the Dual Modality Tomography (DMT system. DMT consists of optical and capacitance sensors. The optical sensors consist of 16 LEDs and 16 photodiodes. The Electrical Capacitance Tomography (ECT electrode design use eight electrode plates as the detecting sensor. The digital timing and the control unit have been developing in order to control the light projection of optical emitters, switching the capacitance electrodes and to synchronize the operation of data acquisition. As a result, the developed system is able to provide a maximum 529 set data per second received from the signal conditioning circuit to the computer.

  13. Fast Gridding on Commodity Graphics Hardware

    DEFF Research Database (Denmark)

    Sørensen, Thomas Sangild; Schaeffter, Tobias; Noe, Karsten Østergaard

    2007-01-01

    is the far most time consuming of the three steps (Table 1). Modern graphics cards (GPUs) can be utilised as a fast parallel processor provided that algorithms are reformulated in a parallel solution. The purpose of this work is to test the hypothesis, that a non-cartesian reconstruction can be efficiently...... implemented on graphics hardware giving a significant speedup compared to CPU based alternatives. We present a novel GPU implementation of the convolution step that overcomes the problems of memory bandwidth that has limited the speed of previous GPU gridding algorithms [2]....

  14. List search hardware for interpretive software

    CERN Document Server

    Altaber, Jacques; Mears, B; Rausch, R

    1979-01-01

    Interpreted languages, e.g. BASIC, are simple to learn, easy to use, quick to modify and in general 'user-friendly'. However, a critically time consuming process during interpretation is that of list searching. A special microprogrammed device for fast list searching has therefore been developed at the SPS Division of CERN. It uses bit- sliced hardware. Fast algorithms perform search, insert and delete of a six-character name and its value in a list of up to 1000 pairs. The prototype shows retrieval times of the order of 10-30 microseconds. (11 refs).

  15. Hardware trigger processor for the MDT system

    CERN Document Server

    AUTHOR|(SzGeCERN)757787; The ATLAS collaboration; Hazen, Eric; Butler, John; Black, Kevin; Gastler, Daniel Edward; Ntekas, Konstantinos; Taffard, Anyes; Martinez Outschoorn, Verena; Ishino, Masaya; Okumura, Yasuyuki

    2017-01-01

    We are developing a low-latency hardware trigger processor for the Monitored Drift Tube system in the Muon spectrometer. The processor will fit candidate Muon tracks in the drift tubes in real time, improving significantly the momentum resolution provided by the dedicated trigger chambers. We present a novel pure-FPGA implementation of a Legendre transform segment finder, an associative-memory alternative implementation, an ARM (Zynq) processor-based track fitter, and compact ATCA carrier board architecture. The ATCA architecture is designed to allow a modular, staged approach to deployment of the system and exploration of alternative technologies.

  16. Fast and Scalable Computation of the Forward and Inverse Discrete Periodic Radon Transform.

    Science.gov (United States)

    Carranza, Cesar; Llamocca, Daniel; Pattichis, Marios

    2016-01-01

    The discrete periodic radon transform (DPRT) has extensively been used in applications that involve image reconstructions from projections. Beyond classic applications, the DPRT can also be used to compute fast convolutions that avoids the use of floating-point arithmetic associated with the use of the fast Fourier transform. Unfortunately, the use of the DPRT has been limited by the need to compute a large number of additions and the need for a large number of memory accesses. This paper introduces a fast and scalable approach for computing the forward and inverse DPRT that is based on the use of: a parallel array of fixed-point adder trees; circular shift registers to remove the need for accessing external memory components when selecting the input data for the adder trees; an image block-based approach to DPRT computation that can fit the proposed architecture to available resources; and fast transpositions that are computed in one or a few clock cycles that do not depend on the size of the input image. As a result, for an N × N image (N prime), the proposed approach can compute up to N(2) additions per clock cycle. Compared with the previous approaches, the scalable approach provides the fastest known implementations for different amounts of computational resources. For example, for a 251×251 image, for approximately 25% fewer flip-flops than required for a systolic implementation, we have that the scalable DPRT is computed 36 times faster. For the fastest case, we introduce optimized just 2N + ⌈log(2) N⌉ + 1 and 2N + 3 ⌈log(2) N⌉ + B + 2 cycles, architectures that can compute the DPRT and its inverse in respectively, where B is the number of bits used to represent each input pixel. On the other hand, the scalable DPRT approach requires more 1-b additions than for the systolic implementation and provides a tradeoff between speed and additional 1-b additions. All of the proposed DPRT architectures were implemented in VHSIC Hardware Description Language

  17. Static Scheduling of Periodic Hardware Tasks with Precedence and Deadline Constraints on Reconfigurable Hardware Devices

    Directory of Open Access Journals (Sweden)

    Ikbel Belaid

    2011-01-01

    Full Text Available Task graph scheduling for reconfigurable hardware devices can be defined as finding a schedule for a set of periodic tasks with precedence, dependence, and deadline constraints as well as their optimal allocations on the available heterogeneous hardware resources. This paper proposes a new methodology comprising three main stages. Using these three main stages, dynamic partial reconfiguration and mixed integer programming, pipelined scheduling and efficient placement are achieved and enable parallel computing of the task graph on the reconfigurable devices by optimizing placement/scheduling quality. Experiments on an application of heterogeneous hardware tasks demonstrate an improvement of resource utilization of 12.45% of the available reconfigurable resources corresponding to a resource gain of 17.3% compared to a static design. The configuration overhead is reduced to 2% of the total running time. Due to pipelined scheduling, the task graph spanning is minimized by 4% compared to sequential execution of the graph.

  18. Scalable inference for stochastic block models

    KAUST Repository

    Peng, Chengbin

    2017-12-08

    Community detection in graphs is widely used in social and biological networks, and the stochastic block model is a powerful probabilistic tool for describing graphs with community structures. However, in the era of "big data," traditional inference algorithms for such a model are increasingly limited due to their high time complexity and poor scalability. In this paper, we propose a multi-stage maximum likelihood approach to recover the latent parameters of the stochastic block model, in time linear with respect to the number of edges. We also propose a parallel algorithm based on message passing. Our algorithm can overlap communication and computation, providing speedup without compromising accuracy as the number of processors grows. For example, to process a real-world graph with about 1.3 million nodes and 10 million edges, our algorithm requires about 6 seconds on 64 cores of a contemporary commodity Linux cluster. Experiments demonstrate that the algorithm can produce high quality results on both benchmark and real-world graphs. An example of finding more meaningful communities is illustrated consequently in comparison with a popular modularity maximization algorithm.

  19. Building Scalable Knowledge Graphs for Earth Science

    Science.gov (United States)

    Ramachandran, R.; Maskey, M.; Gatlin, P. N.; Zhang, J.; Duan, X.; Bugbee, K.; Christopher, S. A.; Miller, J. J.

    2017-12-01

    Estimates indicate that the world's information will grow by 800% in the next five years. In any given field, a single researcher or a team of researchers cannot keep up with this rate of knowledge expansion without the help of cognitive systems. Cognitive computing, defined as the use of information technology to augment human cognition, can help tackle large systemic problems. Knowledge graphs, one of the foundational components of cognitive systems, link key entities in a specific domain with other entities via relationships. Researchers could mine these graphs to make probabilistic recommendations and to infer new knowledge. At this point, however, there is a dearth of tools to generate scalable Knowledge graphs using existing corpus of scientific literature for Earth science research. Our project is currently developing an end-to-end automated methodology for incrementally constructing Knowledge graphs for Earth Science. Semantic Entity Recognition (SER) is one of the key steps in this methodology. SER for Earth Science uses external resources (including metadata catalogs and controlled vocabulary) as references to guide entity extraction and recognition (i.e., labeling) from unstructured text, in order to build a large training set to seed the subsequent auto-learning component in our algorithm. Results from several SER experiments will be presented as well as lessons learned.

  20. Ancestors protocol for scalable key management

    Directory of Open Access Journals (Sweden)

    Dieter Gollmann

    2010-06-01

    Full Text Available Group key management is an important functional building block for secure multicast architecture. Thereby, it has been extensively studied in the literature. The main proposed protocol is Adaptive Clustering for Scalable Group Key Management (ASGK. According to ASGK protocol, the multicast group is divided into clusters, where each cluster consists of areas of members. Each cluster uses its own Traffic Encryption Key (TEK. These clusters are updated periodically depending on the dynamism of the members during the secure session. The modified protocol has been proposed based on ASGK with some modifications to balance the number of affected members and the encryption/decryption overhead with any number of the areas when a member joins or leaves the group. This modified protocol is called Ancestors protocol. According to Ancestors protocol, every area receives the dynamism of the members from its parents. The main objective of the modified protocol is to reduce the number of affected members during the leaving and joining members, then 1 affects n overhead would be reduced. A comparative study has been done between ASGK protocol and the modified protocol. According to the comparative results, it found that the modified protocol is always outperforming the ASGK protocol.

  1. Scalable Combinatorial Tools for Health Disparities Research

    Directory of Open Access Journals (Sweden)

    Michael A. Langston

    2014-10-01

    Full Text Available Despite staggering investments made in unraveling the human genome, current estimates suggest that as much as 90% of the variance in cancer and chronic diseases can be attributed to factors outside an individual’s genetic endowment, particularly to environmental exposures experienced across his or her life course. New analytical approaches are clearly required as investigators turn to complicated systems theory and ecological, place-based and life-history perspectives in order to understand more clearly the relationships between social determinants, environmental exposures and health disparities. While traditional data analysis techniques remain foundational to health disparities research, they are easily overwhelmed by the ever-increasing size and heterogeneity of available data needed to illuminate latent gene x environment interactions. This has prompted the adaptation and application of scalable combinatorial methods, many from genome science research, to the study of population health. Most of these powerful tools are algorithmically sophisticated, highly automated and mathematically abstract. Their utility motivates the main theme of this paper, which is to describe real applications of innovative transdisciplinary models and analyses in an effort to help move the research community closer toward identifying the causal mechanisms and associated environmental contexts underlying health disparities. The public health exposome is used as a contemporary focus for addressing the complex nature of this subject.

  2. Percolator: Scalable Pattern Discovery in Dynamic Graphs

    Energy Technology Data Exchange (ETDEWEB)

    Choudhury, Sutanay; Purohit, Sumit; Lin, Peng; Wu, Yinghui; Holder, Lawrence B.; Agarwal, Khushbu

    2018-02-06

    We demonstrate Percolator, a distributed system for graph pattern discovery in dynamic graphs. In contrast to conventional mining systems, Percolator advocates efficient pattern mining schemes that (1) support pattern detection with keywords; (2) integrate incremental and parallel pattern mining; and (3) support analytical queries such as trend analysis. The core idea of Percolator is to dynamically decide and verify a small fraction of patterns and their in- stances that must be inspected in response to buffered updates in dynamic graphs, with a total mining cost independent of graph size. We demonstrate a) the feasibility of incremental pattern mining by walking through each component of Percolator, b) the efficiency and scalability of Percolator over the sheer size of real-world dynamic graphs, and c) how the user-friendly GUI of Percolator inter- acts with users to support keyword-based queries that detect, browse and inspect trending patterns. We also demonstrate two user cases of Percolator, in social media trend analysis and academic collaboration analysis, respectively.

  3. Scalable Notch Antenna System for Multiport Applications

    Directory of Open Access Journals (Sweden)

    Abdurrahim Toktas

    2016-01-01

    Full Text Available A novel and compact scalable antenna system is designed for multiport applications. The basic design is built on a square patch with an electrical size of 0.82λ0×0.82λ0 (at 2.4 GHz on a dielectric substrate. The design consists of four symmetrical and orthogonal triangular notches with circular feeding slots at the corners of the common patch. The 4-port antenna can be simply rearranged to 8-port and 12-port systems. The operating band of the system can be tuned by scaling (S the size of the system while fixing the thickness of the substrate. The antenna system with S: 1/1 in size of 103.5×103.5 mm2 operates at the frequency band of 2.3–3.0 GHz. By scaling the antenna with S: 1/2.3, a system of 45×45 mm2 is achieved, and thus the operating band is tuned to 4.7–6.1 GHz with the same scattering characteristic. A parametric study is also conducted to investigate the effects of changing the notch dimensions. The performance of the antenna is verified in terms of the antenna characteristics as well as diversity and multiplexing parameters. The antenna system can be tuned by scaling so that it is applicable to the multiport WLAN, WIMAX, and LTE devices with port upgradability.

  4. Scalable conditional induction variables (CIV) analysis

    KAUST Repository

    Oancea, Cosmin E.

    2015-02-01

    Subscripts using induction variables that cannot be expressed as a formula in terms of the enclosing-loop indices appear in the low-level implementation of common programming abstractions such as Alter, or stack operations and pose significant challenges to automatic parallelization. Because the complexity of such induction variables is often due to their conditional evaluation across the iteration space of loops we name them Conditional Induction Variables (CIV). This paper presents a flow-sensitive technique that summarizes both such CIV-based and affine subscripts to program level, using the same representation. Our technique requires no modifications of our dependence tests, which is agnostic to the original shape of the subscripts, and is more powerful than previously reported dependence tests that rely on the pairwise disambiguation of read-write references. We have implemented the CIV analysis in our parallelizing compiler and evaluated its impact on five Fortran benchmarks. We have found that that there are many important loops using CIV subscripts and that our analysis can lead to their scalable parallelization. This in turn has led to the parallelization of the benchmark programs they appear in.

  5. A Programmable, Scalable-Throughput Interleaver

    Directory of Open Access Journals (Sweden)

    E. J. C. Rijshouwer

    2010-01-01

    Full Text Available The interleaver stages of digital communication standards show a surprisingly large variation in throughput, state sizes, and permutation functions. Furthermore, data rates for 4G standards such as LTE-Advanced will exceed typical baseband clock frequencies of handheld devices. Multistream operation for Software Defined Radio and iterative decoding algorithms will call for ever higher interleave data rates. Our interleave machine is built around 8 single-port SRAM banks and can be programmed to generate up to 8 addresses every clock cycle. The scalable architecture combines SIMD and VLIW concepts with an efficient resolution of bank conflicts. A wide range of cellular, connectivity, and broadcast interleavers have been mapped on this machine, with throughputs up to more than 0.5 Gsymbol/second. Although it was designed for channel interleaving, the application domain of the interleaver extends also to Turbo interleaving. The presented configuration of the architecture is designed as a part of a programmable outer receiver on a prototype board. It offers (near universal programmability to enable the implementation of new interleavers. The interleaver measures 2.09 mm2 in 65 nm CMOS (including memories and proves functional on silicon.

  6. A Programmable, Scalable-Throughput Interleaver

    Directory of Open Access Journals (Sweden)

    Rijshouwer EJC

    2010-01-01

    Full Text Available The interleaver stages of digital communication standards show a surprisingly large variation in throughput, state sizes, and permutation functions. Furthermore, data rates for 4G standards such as LTE-Advanced will exceed typical baseband clock frequencies of handheld devices. Multistream operation for Software Defined Radio and iterative decoding algorithms will call for ever higher interleave data rates. Our interleave machine is built around 8 single-port SRAM banks and can be programmed to generate up to 8 addresses every clock cycle. The scalable architecture combines SIMD and VLIW concepts with an efficient resolution of bank conflicts. A wide range of cellular, connectivity, and broadcast interleavers have been mapped on this machine, with throughputs up to more than 0.5 Gsymbol/second. Although it was designed for channel interleaving, the application domain of the interleaver extends also to Turbo interleaving. The presented configuration of the architecture is designed as a part of a programmable outer receiver on a prototype board. It offers (near universal programmability to enable the implementation of new interleavers. The interleaver measures 2.09 m in 65 nm CMOS (including memories and proves functional on silicon.

  7. Is Hardware Removal Recommended after Ankle Fracture Repair?

    Directory of Open Access Journals (Sweden)

    Hong-Geun Jung

    2016-01-01

    Full Text Available The indications and clinical necessity for routine hardware removal after treating ankle or distal tibia fracture with open reduction and internal fixation are disputed even when hardware-related pain is insignificant. Thus, we determined the clinical effects of routine hardware removal irrespective of the degree of hardware-related pain, especially in the perspective of patients’ daily activities. This study was conducted on 80 consecutive cases (78 patients treated by surgery and hardware removal after bony union. There were 56 ankle and 24 distal tibia fractures. The hardware-related pain, ankle joint stiffness, discomfort on ambulation, and patient satisfaction were evaluated before and at least 6 months after hardware removal. Pain score before hardware removal was 3.4 (range 0 to 6 and decreased to 1.3 (range 0 to 6 after removal. 58 (72.5% patients experienced improved ankle stiffness and 65 (81.3% less discomfort while walking on uneven ground and 63 (80.8% patients were satisfied with hardware removal. These results suggest that routine hardware removal after ankle or distal tibia fracture could ameliorate hardware-related pain and improves daily activities and patient satisfaction even when the hardware-related pain is minimal.

  8. Scalable Strategies for Computing with Massive Data

    Directory of Open Access Journals (Sweden)

    Michael Kane

    2013-11-01

    Full Text Available This paper presents two complementary statistical computing frameworks that address challenges in parallel processing and the analysis of massive data. First, the foreach package allows users of the R programming environment to define parallel loops that may be run sequentially on a single machine, in parallel on a symmetric multiprocessing (SMP machine, or in cluster environments without platform-specific code. Second, the bigmemory package implements memory- and file-mapped data structures that provide (a access to arbitrarily large data while retaining a look and feel that is familiar to R users and (b data structures that are shared across processor cores in order to support efficient parallel computing techniques. Although these packages may be used independently, this paper shows how they can be used in combination to address challenges that have effectively been beyond the reach of researchers who lack specialized software development skills or expensive hardware.

  9. Engineering scalable fault-tolerant quantum computation

    Science.gov (United States)

    Kimchi-Schwartz, Mollie; Danna, Rosenberg; Kim, David; Yoder, Jonilyn; Kjaergaard, Morten; Das, Rabindra; Grover, Jeff; Gustavsson, Simon; Oliver, William

    Recent demonstrations of quantum protocols comprising on the order of 5-10 superconducting qubits are foundational to the future development of quantum information processors. A next critical step in the development of resilient quantum processors will be the integration of coherent quantum circuits with a hardware platform that is amenable to extending the system size to hundreds of qubits and beyond. In this talk, we will discuss progress toward integrating coherent superconducting qubits with signal routing via the third dimension. This research was funded in part by the Office of the Director of National Intelligence (ODNI), Intelligence Advanced Research Projects Activity (IARPA) and by the Assistant Secretary of Defense for Research & Engineering under Air Force Contract No. FA8721-05-C-0002. The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of ODNI, IARPA, or the US Government.

  10. Accelerator microanalysis

    International Nuclear Information System (INIS)

    Tuniz, C.

    1997-01-01

    Particle accelerators have been developed more than sixty years ago to investigate nuclear and atomic phenomena. A major shift toward applications of accelerators in the study of materials structure and composition in inter-disciplinary projects has been witnessed in the last two decades. The Australian Nuclear Science and Technology Organisation (ANSTO) has developed advanced research programs based on the use of particle and photon beams. Atmospheric pollution problems are investigated at the 3 MV Van de Graff accelerator using ion beam analysis techniques to detect toxic elements in aerosol particles. High temperature superconductor and semiconductor materials are characterised using the recoil of iodine and other heavy ions produced at ANTARES, the 10-MV Tandem accelerator. A heavy-ion microprobe is presently being developed at ANTARES to map elemental concentrations of specific elements with micro-size resolution. An Accelerator mass Spectrometry (AMS) system has been developed at ANSTO for the ultra-sensitive detection of Carbon-14, Iodine-129 and other long-lived radioisotopes. This AMS spectrometer is a key instrument for climate change studies and international safeguards. ANSTO is also managing the Australian Synchrotron Research program based on facilities developed at the Photon Factory (Japan) and at the Advanced Photon Source (USA). Advanced projects in biology, materials chemistry, structural condensed matter and other disciplines are being promoted by a consortium involving Australian universities and research institutions. This paper will review recent advances in the use of particle accelerators, with a particular emphasis on applications developed at ANSTO and related to problems of international concern, such as global environmental change, public health and nuclear proliferation

  11. ISS Logistics Hardware Disposition and Metrics Validation

    Science.gov (United States)

    Rogers, Toneka R.

    2010-01-01

    I was assigned to the Logistics Division of the International Space Station (ISS)/Spacecraft Processing Directorate. The Division consists of eight NASA engineers and specialists that oversee the logistics portion of the Checkout, Assembly, and Payload Processing Services (CAPPS) contract. Boeing, their sub-contractors and the Boeing Prime contract out of Johnson Space Center, provide the Integrated Logistics Support for the ISS activities at Kennedy Space Center. Essentially they ensure that spares are available to support flight hardware processing and the associated ground support equipment (GSE). Boeing maintains a Depot for electrical, mechanical and structural modifications and/or repair capability as required. My assigned task was to learn project management techniques utilized by NASA and its' contractors to provide an efficient and effective logistics support infrastructure to the ISS program. Within the Space Station Processing Facility (SSPF) I was exposed to Logistics support components, such as, the NASA Spacecraft Services Depot (NSSD) capabilities, Mission Processing tools, techniques and Warehouse support issues, required for integrating Space Station elements at the Kennedy Space Center. I also supported the identification of near-term ISS Hardware and Ground Support Equipment (GSE) candidates for excessing/disposition prior to October 2010; and the validation of several Logistics Metrics used by the contractor to measure logistics support effectiveness.

  12. CASIS Fact Sheet: Hardware and Facilities

    Science.gov (United States)

    Solomon, Michael R.; Romero, Vergel

    2016-01-01

    Vencore is a proven information solutions, engineering, and analytics company that helps our customers solve their most complex challenges. For more than 40 years, we have designed, developed and delivered mission-critical solutions as our customers' trusted partner. The Engineering Services Contract, or ESC, provides engineering and design services to the NASA organizations engaged in development of new technologies at the Kennedy Space Center. Vencore is the ESC prime contractor, with teammates that include Stinger Ghaffarian Technologies, Sierra Lobo, Nelson Engineering, EASi, and Craig Technologies. The Vencore team designs and develops systems and equipment to be used for the processing of space launch vehicles, spacecraft, and payloads. We perform flight systems engineering for spaceflight hardware and software; develop technologies that serve NASA's mission requirements and operations needs for the future. Our Flight Payload Support (FPS) team at Kennedy Space Center (KSC) provides engineering, development, and certification services as well as payload integration and management services to NASA and commercial customers. Our main objective is to assist principal investigators (PIs) integrate their science experiments into payload hardware for research aboard the International Space Station (ISS), commercial spacecraft, suborbital vehicles, parabolic flight aircrafts, and ground-based studies. Vencore's FPS team is AS9100 certified and a recognized implementation partner for the Center for Advancement of Science in Space (CASIS

  13. ARM assembly language with hardware experiments

    CERN Document Server

    Elahi, Ata

    2015-01-01

    This book provides a hands-on approach to learning ARM assembly language with the use of a TI microcontroller. The book starts with an introduction to computer architecture and then discusses number systems and digital logic. The text covers ARM Assembly Language, ARM Cortex Architecture and its components, and Hardware Experiments using TILM3S1968. Written for those interested in learning embedded programming using an ARM Microcontroller. ·         Introduces number systems and signal transmission methods   ·         Reviews logic gates, registers, multiplexers, decoders and memory   ·         Provides an overview and examples of ARM instruction set   ·         Uses using Keil development tools for writing and debugging ARM assembly language Programs   ·         Hardware experiments using a Mbed NXP LPC1768 microcontroller; including General Purpose Input/Output (GPIO) configuration, real time clock configuration, binary input to 7-segment display, creating ...

  14. Introduction to Hardware Security and Trust

    CERN Document Server

    Wang, Cliff

    2012-01-01

    The emergence of a globalized, horizontal semiconductor business model raises a set of concerns involving the security and trust of the information systems on which modern society is increasingly reliant for mission-critical functionality. Hardware-oriented security and trust issues span a broad range including threats related to the malicious insertion of Trojan circuits designed, e.g.,to act as a ‘kill switch’ to disable a chip, to integrated circuit (IC) piracy,and to attacks designed to extract encryption keys and IP from a chip. This book provides the foundations for understanding hardware security and trust, which have become major concerns for national security over the past decade.  Coverage includes security and trust issues in all types of electronic devices and systems such as ASICs, COTS, FPGAs, microprocessors/DSPs, and embedded systems.  This serves as an invaluable reference to the state-of-the-art research that is of critical significance to the security of,and trust in, modern society�...

  15. Fast image processing on parallel hardware

    International Nuclear Information System (INIS)

    Bittner, U.

    1988-01-01

    Current digital imaging modalities in the medical field incorporate parallel hardware which is heavily used in the stage of image formation like the CT/MR image reconstruction or in the DSA real time subtraction. In order to image post-processing as efficient as image acquisition, new software approaches have to be found which take full advantage of the parallel hardware architecture. This paper describes the implementation of two-dimensional median filter which can serve as an example for the development of such an algorithm. The algorithm is analyzed by viewing it as a complete parallel sort of the k pixel values in the chosen window which leads to a generalization to rank order operators and other closely related filters reported in literature. A section about the theoretical base of the algorithm gives hints for how to characterize operations suitable for implementations on pipeline processors and the way to find the appropriate algorithms. Finally some results that computation time and usefulness of medial filtering in radiographic imaging are given

  16. A low power biomedical signal processor ASIC based on hardware software codesign.

    Science.gov (United States)

    Nie, Z D; Wang, L; Chen, W G; Zhang, T; Zhang, Y T

    2009-01-01

    A low power biomedical digital signal processor ASIC based on hardware and software codesign methodology was presented in this paper. The codesign methodology was used to achieve higher system performance and design flexibility. The hardware implementation included a low power 32bit RISC CPU ARM7TDMI, a low power AHB-compatible bus, and a scalable digital co-processor that was optimized for low power Fast Fourier Transform (FFT) calculations. The co-processor could be scaled for 8-point, 16-point and 32-point FFTs, taking approximate 50, 100 and 150 clock circles, respectively. The complete design was intensively simulated using ARM DSM model and was emulated by ARM Versatile platform, before conducted to silicon. The multi-million-gate ASIC was fabricated using SMIC 0.18 microm mixed-signal CMOS 1P6M technology. The die area measures 5,000 microm x 2,350 microm. The power consumption was approximately 3.6 mW at 1.8 V power supply and 1 MHz clock rate. The power consumption for FFT calculations was less than 1.5 % comparing with the conventional embedded software-based solution.

  17. A Testbed for Highly-Scalable Mission Critical Information Systems

    National Research Council Canada - National Science Library

    Birman, Kenneth P

    2005-01-01

    ... systems in a networked environment. Headed by Professor Ken Birman, the project is exploring a novel fusion of classical protocols for reliable multicast communication with a new style of peer-to-peer protocol called scalable "gossip...

  18. Scalable Partitioning Algorithms for FPGAs With Heterogeneous Resources

    National Research Council Canada - National Science Library

    Selvakkumaran, Navaratnasothie; Ranjan, Abhishek; Raje, Salil; Karypis, George

    2004-01-01

    As FPGA densities increase, partitioning-based FPGA placement approaches are becoming increasingly important as they can be used to provide high-quality and computationally scalable placement solutions...

  19. Hardware Commissioning of the LHC Quality Assurance, follow-up and storing of the test results

    CERN Document Server

    Barbero, E

    2005-01-01

    During the commissioning of the LHC technical systems [1] (the so-called Hardware Commissioning) a large number of test sequences and procedures will be applied to the different systems and components of the accelerator. All the information related to the coordination of the Hardware Commissioning will be structured and managed towards the final objective of integrating all the data produced in the Manufacturing and Test Folders (MTF) [2] at both equipment level (i.e. individual system tests) and commissioning level (i.e.Hardware Commissioning). The MTF for Hardware Commissioning will be mainly used to archive the results of the tests (i.e. status, parameters and waveforms) which will be used later as reference during the operation with beam. Also it is an indispensable tool for monitoring the progress of the different tests and ensuring the proper follow-up of the procedures described in the engineering specifications; in this way, the Quality Assurance process will be completed. This paper describes the spe...

  20. Accelerator operations

    International Nuclear Information System (INIS)

    Anon.

    1980-01-01

    This section is concerned with the operation of both the tandem-linac system and the Dynamitron, two accelerators that are used for entirely different research. Developmental activities associated with the tandem and the Dynamitron are also treated here, but developmental activities associated with the superconducting linac are covered separately because this work is a program of technology development in its own right

  1. CNSTN Accelerator

    International Nuclear Information System (INIS)

    Habbassi, Afifa; Trabelsi, Adel

    2010-01-01

    This project give a big idea about the measurement of the linear accelerator in the CNSTN. During this work we control dose distribution for different product. For this characterisation we have to make an installation qualification ,operational qualification,performance qualification and of course for every step we have to control temperature and the dose ,even the distribution of the last one.

  2. Accelerators course

    CERN Multimedia

    CERN. Geneva HR-RFA; Métral, E

    2006-01-01

    1a) Introduction and motivation 1b) History and accelerator types 2) Transverse beam dynamics 3a) Longitudinal beam dynamics 3b) Figure of merit of a synchrotron/collider 3c) Beam control 4) Main limiting factors 5) Technical challenges

  3. Accelerator operations

    International Nuclear Information System (INIS)

    Anon.

    1979-01-01

    Operations of the SuperHILAC, the Bevatron/Bevalac, and the 184-inch Synchrocyclotron during the period from October 1977 to September 1978 are discussed. These include ion source development, accelerator facilities, the Heavy Ion Spectrometer System, and Bevelac biomedical operations

  4. ORELA data acquisition system hardware. Volume 1: introduction

    International Nuclear Information System (INIS)

    Reynolds, J.W.

    1977-01-01

    The Oak Ridge Electron Linear Accelerator Facility (ORELA) has been specifically designed as a facility for neutron cross-section measurements by the time-of-flight technique. ORELA was designed so that a number of cross-section experiments can be performed simultaneously. This goal of simultaneous operation of several experiments, a maximum of six to date, has been achieved by using the multiple flight paths radiating from the target room, the multiple flight stations on each flight path, the laboratory facilities surrounding the central data area, and a shared data acquisition computer system. The flight stations contain the fast electronics for initial processing of the nuclear detector signals on a time scale of nanoseconds. The laboratories, and in some cases the flight stations, contain the equipment to digitize the nanosecond detector signals on a time scale of a few microseconds. At this point, the data passes into the ORELA Data Acquisition portion of the ORELA Data Handling System. An introduction to the ORELA Data Acquisition System is given, and the component parts of the system are briefly reviewed. Each specifically designed piece of hardware is briefly described with a simplified block diagram. Modifications to standard peripheral devices are reviewed. A list of drawings and programming notes are also included

  5. Accelerator update

    International Nuclear Information System (INIS)

    Anon.

    1995-01-01

    When the Accelerator Conference, combined International High Energy and US Particle versions, held in Dallas in May, was initially scheduled, progress nearby for the US Superconducting Supercollider was high on the preliminary agenda. With the SSC voted down by Congress in October 1993, this was no longer the case. However the content of the meeting, in terms of both its deep implications for ambitious new projects and the breadth of its scope, showed that the worldwide particle accelerator field is far from being moribund. A traditional feature of such accelerator conferences is the multiplicity of parallel sessions. No one person can attend all sessions, so that delegates can follow completely different paths and emerge with totally different impressions. Despite this overload, and despite the SSC cancellation, the general picture is one of encouraging progress over a wide range of major new projects throughout the world. At the same time, spinoff from, and applications of, accelerators and accelerator technology are becoming increasingly important. Centrestage is now CERN's LHC proton-proton collider, where a test string of superconducting magnets is operating over long periods at the nominal LHC field of 8.36 tesla or more. The assignment of the underground areas in the existing 27- kilometre LEP tunnel is now quasidefinitive (see page 3). For CERN's existing big machine, the LEP electron-positron collider, ongoing work concentrates on boosting performance using improved optics and bunch trains. But the main objective is the LEP2 scheme using superconducting accelerating cavities to boost the beam energy (see page 6). After some initial teething problems, production and operation of these cavities appears to have been mastered, at least under test conditions. A highlight at CERN last year was the first run with lead ions (December 1994, page 15). Handling these heavy particles with systems originally designed for protons calls for ingenuity. The SPS

  6. Accelerator update

    Energy Technology Data Exchange (ETDEWEB)

    Anon.

    1995-09-15

    When the Accelerator Conference, combined International High Energy and US Particle versions, held in Dallas in May, was initially scheduled, progress nearby for the US Superconducting Supercollider was high on the preliminary agenda. With the SSC voted down by Congress in October 1993, this was no longer the case. However the content of the meeting, in terms of both its deep implications for ambitious new projects and the breadth of its scope, showed that the worldwide particle accelerator field is far from being moribund. A traditional feature of such accelerator conferences is the multiplicity of parallel sessions. No one person can attend all sessions, so that delegates can follow completely different paths and emerge with totally different impressions. Despite this overload, and despite the SSC cancellation, the general picture is one of encouraging progress over a wide range of major new projects throughout the world. At the same time, spinoff from, and applications of, accelerators and accelerator technology are becoming increasingly important. Centrestage is now CERN's LHC proton-proton collider, where a test string of superconducting magnets is operating over long periods at the nominal LHC field of 8.36 tesla or more. The assignment of the underground areas in the existing 27- kilometre LEP tunnel is now quasidefinitive (see page 3). For CERN's existing big machine, the LEP electron-positron collider, ongoing work concentrates on boosting performance using improved optics and bunch trains. But the main objective is the LEP2 scheme using superconducting accelerating cavities to boost the beam energy (see page 6). After some initial teething problems, production and operation of these cavities appears to have been mastered, at least under test conditions. A highlight at CERN last year was the first run with lead ions (December 1994, page 15). Handling these heavy particles with systems originally designed for protons calls for ingenuity. The SPS has managed

  7. SOL: A Library for Scalable Online Learning Algorithms

    OpenAIRE

    Wu, Yue; Hoi, Steven C. H.; Liu, Chenghao; Lu, Jing; Sahoo, Doyen; Yu, Nenghai

    2016-01-01

    SOL is an open-source library for scalable online learning algorithms, and is particularly suitable for learning with high-dimensional data. The library provides a family of regular and sparse online learning algorithms for large-scale binary and multi-class classification tasks with high efficiency, scalability, portability, and extensibility. SOL was implemented in C++, and provided with a collection of easy-to-use command-line tools, python wrappers and library calls for users and develope...

  8. Modular Universal Scalable Ion-trap Quantum Computer

    Science.gov (United States)

    2016-06-02

    SECURITY CLASSIFICATION OF: The main goal of the original MUSIQC proposal was to construct and demonstrate a modular and universally- expandable ion...Distribution Unlimited UU UU UU UU 02-06-2016 1-Aug-2010 31-Jan-2016 Final Report: Modular Universal Scalable Ion-trap Quantum Computer The views...P.O. Box 12211 Research Triangle Park, NC 27709-2211 Ion trap quantum computation, scalable modular architectures REPORT DOCUMENTATION PAGE 11

  9. Architectures and Applications for Scalable Quantum Information Systems

    Science.gov (United States)

    2007-01-01

    Gershenfeld and I. Chuang. Quantum computing with molecules. Scientific American, June 1998. [16] A. Globus, D. Bailey, J. Han, R. Jaffe, C. Levit , R...AFRL-IF-RS-TR-2007-12 Final Technical Report January 2007 ARCHITECTURES AND APPLICATIONS FOR SCALABLE QUANTUM INFORMATION SYSTEMS...NUMBER 5b. GRANT NUMBER FA8750-01-2-0521 4. TITLE AND SUBTITLE ARCHITECTURES AND APPLICATIONS FOR SCALABLE QUANTUM INFORMATION SYSTEMS 5c

  10. On the scalability of LISP and advanced overlaid services

    OpenAIRE

    Coras, Florin

    2015-01-01

    In just four decades the Internet has gone from a lab experiment to a worldwide, business critical infrastructure that caters to the communication needs of almost a half of the Earth's population. With these figures on its side, arguing against the Internet's scalability would seem rather unwise. However, the Internet's organic growth is far from finished and, as billions of new devices are expected to be joined in the not so distant future, scalability, or lack thereof, is commonly believed ...

  11. Scalable, full-colour and controllable chromotropic plasmonic printing

    OpenAIRE

    Xue, Jiancai; Zhou, Zhang-Kai; Wei, Zhiqiang; Su, Rongbin; Lai, Juan; Li, Juntao; Li, Chao; Zhang, Tengwei; Wang, Xue-Hua

    2015-01-01

    Plasmonic colour printing has drawn wide attention as a promising candidate for the next-generation colour-printing technology. However, an efficient approach to realize full colour and scalable fabrication is still lacking, which prevents plasmonic colour printing from practical applications. Here we present a scalable and full-colour plasmonic printing approach by combining conjugate twin-phase modulation with a plasmonic broadband absorber. More importantly, our approach also demonstrates ...

  12. Responsive, Flexible and Scalable Broader Impacts (Invited)

    Science.gov (United States)

    Decharon, A.; Companion, C.; Steinman, M.

    2010-12-01

    In many educator professional development workshops, scientists present content in a slideshow-type format and field questions afterwards. Drawbacks of this approach include: inability to begin the lecture with content that is responsive to audience needs; lack of flexible access to specific material within the linear presentation; and “Q&A” sessions are not easily scalable to broader audiences. Often this type of traditional interaction provides little direct benefit to the scientists. The Centers for Ocean Sciences Education Excellence - Ocean Systems (COSEE-OS) applies the technique of concept mapping with demonstrated effectiveness in helping scientists and educators “get on the same page” (deCharon et al., 2009). A key aspect is scientist professional development geared towards improving face-to-face and online communication with non-scientists. COSEE-OS promotes scientist-educator collaboration, tests the application of scientist-educator maps in new contexts through webinars, and is piloting the expansion of maps as long-lived resources for the broader community. Collaboration - COSEE-OS has developed and tested a workshop model bringing scientists and educators together in a peer-oriented process, often clarifying common misconceptions. Scientist-educator teams develop online concept maps that are hyperlinked to “assets” (i.e., images, videos, news) and are responsive to the needs of non-scientist audiences. In workshop evaluations, 91% of educators said that the process of concept mapping helped them think through science topics and 89% said that concept mapping helped build a bridge of communication with scientists (n=53). Application - After developing a concept map, with COSEE-OS staff assistance, scientists are invited to give webinar presentations that include live “Q&A” sessions. The webinars extend the reach of scientist-created concept maps to new contexts, both geographically and topically (e.g., oil spill), with a relatively small

  13. Development of a modular and scalable sensor system for the gathering of position and orientation of moved objects

    International Nuclear Information System (INIS)

    Klingbeil, L.

    2006-02-01

    A modular and scalable sensor system for the estimation of position and orientation of moving objects has been developed and characterized. A sensor unit, which is mounted to the moving object, consists of acceleration -, angular rate - and magnetic field sensors for every spatial axis. Customized Kalman filter algorithms provide a robust and low latency reconstruction of the sensor's orientation. Additionally an ultrasound transducer network is used to measure the distance of a sensor unit with respect to several reference points in the room. This allows reconstruction of the absolute position using trilateration methods. The system is scalable with respect to the number of sensor units and the covered tracking volume. It is suitable for various applications for example the analysis of body movements or head tracking in augmented or virtual reality environments. (orig.)

  14. Microscopic Characterization of Scalable Coherent Rydberg Superatoms

    Directory of Open Access Journals (Sweden)

    Johannes Zeiher

    2015-08-01

    Full Text Available Strong interactions can amplify quantum effects such that they become important on macroscopic scales. Controlling these coherently on a single-particle level is essential for the tailored preparation of strongly correlated quantum systems and opens up new prospects for quantum technologies. Rydberg atoms offer such strong interactions, which lead to extreme nonlinearities in laser-coupled atomic ensembles. As a result, multiple excitation of a micrometer-sized cloud can be blocked while the light-matter coupling becomes collectively enhanced. The resulting two-level system, often called a “superatom,” is a valuable resource for quantum information, providing a collective qubit. Here, we report on the preparation of 2 orders of magnitude scalable superatoms utilizing the large interaction strength provided by Rydberg atoms combined with precise control of an ensemble of ultracold atoms in an optical lattice. The latter is achieved with sub-shot-noise precision by local manipulation of a two-dimensional Mott insulator. We microscopically confirm the superatom picture by in situ detection of the Rydberg excitations and observe the characteristic square-root scaling of the optical coupling with the number of atoms. Enabled by the full control over the atomic sample, including the motional degrees of freedom, we infer the overlap of the produced many-body state with a W state from the observed Rabi oscillations and deduce the presence of entanglement. Finally, we investigate the breakdown of the superatom picture when two Rydberg excitations are present in the system, which leads to dephasing and a loss of coherence.

  15. Myria: Scalable Analytics as a Service

    Science.gov (United States)

    Howe, B.; Halperin, D.; Whitaker, A.

    2014-12-01

    At the UW eScience Institute, we're working to empower non-experts, especially in the sciences, to write and use data-parallel algorithms. To this end, we are building Myria, a web-based platform for scalable analytics and data-parallel programming. Myria's internal model of computation is the relational algebra extended with iteration, such that every program is inherently data-parallel, just as every query in a database is inherently data-parallel. But unlike databases, iteration is a first class concept, allowing us to express machine learning tasks, graph traversal tasks, and more. Programs can be expressed in a number of languages and can be executed on a number of execution environments, but we emphasize a particular language called MyriaL that supports both imperative and declarative styles and a particular execution engine called MyriaX that uses an in-memory column-oriented representation and asynchronous iteration. We deliver Myria over the web as a service, providing an editor, performance analysis tools, and catalog browsing features in a single environment. We find that this web-based "delivery vector" is critical in reaching non-experts: they are insulated from irrelevant effort technical work associated with installation, configuration, and resource management. The MyriaX backend, one of several execution runtimes we support, is a main-memory, column-oriented, RDBMS-on-the-worker system that supports cyclic data flows as a first-class citizen and has been shown to outperform competitive systems on 100-machine cluster sizes. I will describe the Myria system, give a demo, and present some new results in large-scale oceanographic microbiology.

  16. CUDAMPF: a multi-tiered parallel framework for accelerating protein sequence search in HMMER on CUDA-enabled GPU.

    Science.gov (United States)

    Jiang, Hanyu; Ganesan, Narayan

    2016-02-27

    HMMER software suite is widely used for analysis of homologous protein and nucleotide sequences with high sensitivity. The latest version of hmmsearch in HMMER 3.x, utilizes heuristic-pipeline which consists of MSV/SSV (Multiple/Single ungapped Segment Viterbi) stage, P7Viterbi stage and the Forward scoring stage to accelerate homology detection. Since the latest version is highly optimized for performance on modern multi-core CPUs with SSE capabilities, only a few acceleration attempts report speedup. However, the most compute intensive tasks within the pipeline (viz., MSV/SSV and P7Viterbi stages) still stand to benefit from the computational capabilities of massively parallel processors. A Multi-Tiered Parallel Framework (CUDAMPF) implemented on CUDA-enabled GPUs presented here, offers a finer-grained parallelism for MSV/SSV and Viterbi algorithms. We couple SIMT (Single Instruction Multiple Threads) mechanism with SIMD (Single Instructions Multiple Data) video instructions with warp-synchronism to achieve high-throughput processing and eliminate thread idling. We also propose a hardware-aware optimal allocation scheme of scarce resources like on-chip memory and caches in order to boost performance and scalability of CUDAMPF. In addition, runtime compilation via NVRTC available with CUDA 7.0 is incorporated into the presented framework that not only helps unroll innermost loop to yield upto 2 to 3-fold speedup than static compilation but also enables dynamic loading and switching of kernels depending on the query model size, in order to achieve optimal performance. CUDAMPF is designed as a hardware-aware parallel framework for accelerating computational hotspots within the hmmsearch pipeline as well as other sequence alignment applications. It achieves significant speedup by exploiting hierarchical parallelism on single GPU and takes full advantage of limited resources based on their own performance features. In addition to exceeding performance of other

  17. The ATOMKI Accelerator Center

    International Nuclear Information System (INIS)

    Biri, S.; Kormany, Z.; Berzi, I.; Hunyadi, M.

    2009-01-01

    used for systematic maintenance was 328 hours. The breakdown periods amounted to only 2 hours last year. So the cyclotron was available for users during 1679 hours and the users required 1242 hours. The effectively used beam-on-target time is summarized in Table 2. The accelerated particles and the most frequent or typical energies were: proton (44%, 14.5 MeV), deuteron ( 29%, 10 MeV), alpha (16%, 12.5-15 MeV), He-3 ( 11%, 24 Mev). The renewal project of the nuclear magnetic resonance (NMR) based field stabilization system for the energy analyzing magnet has been completed. The original analogue signal processing units were replaced by a newly developed digital system. It first digitizes the low-frequency amplitude modulated signal of the marginal oscillator and then searches for the NMR-resonance. If the resonance is found then the phase of the resonance signal is measured compared to the field modulating sine wave. Based on this measured phase value the control voltage of the analyzing magnet power supply is adjusted automatically. The system hardware was built applying a C8051F041 mixed signal microcontroller, containing all the peripherals required for the measurement and control of the different components. Signal processing and power supply control is implemented by the microcontroller program and the operator can control the system through a GUI program running on a Windows PC. The new system has been in real operation since March 2009. It provides easy handling, user friendly and reliable operation with measured magnetic field stability values on the 2-3 x 10 -5 , level. The project of developing new power supplies for the low-current magnet coils of the cyclotron and the beam transport system has been continued. The prototype circuit of the new power supplies was designed, built and tested thoroughly. It contains modern high voltage tolerant IGBT components in the final stage. They are cooled by forced air, so we can get rid of the water cooling circuits used

  18. Exploring Hardware Support For Scaling Irregular Applications on Multi-node Multi-core Architectures

    Energy Technology Data Exchange (ETDEWEB)

    Secchi, Simone; Ceriani, Marco; Tumeo, Antonino; Villa, Oreste; Palermo, Gianluca; Raffo, Luigi

    2013-06-05

    With the recent emergence of large-scale knowledge dis- covery, data mining and social network analysis, irregular applications have gained renewed interest. Classic cache-based high-performance architectures do not provide optimal performances with such kind of workloads, mainly due to the very low spatial and temporal locality of the irregular control and memory access patterns. In this paper, we present a multi-node, multi-core, fine-grained multi-threaded shared-memory system architecture specifically designed for the execution of large-scale irregular applications, and built on top of three pillars, that we believe are fundamental to support these workloads. First, we offer transparent hardware support for Partitioned Global Address Space (PGAS) to provide a large globally-shared address space with no software library overhead. Second, we employ multi-threaded multi-core processing nodes to achieve the necessary latency tolerance required by accessing global memory, which potentially resides in a remote node. Finally, we devise hardware support for inter-thread synchronization on the whole global address space. We first model the performances by using an analytical model that takes into account the main architecture and application characteristics. We describe the hardware design of the proposed cus- tom architectural building blocks that provide support for the above- mentioned three pillars. Finally, we present a limited-scale evaluation of the system on a multi-board FPGA prototype with typical irregular kernels and benchmarks. The experimental evaluation demonstrates the architecture performance scalability for different configurations of the whole system.

  19. Handbook of hardware/software codesign

    CERN Document Server

    Teich, Jürgen

    2017-01-01

    This handbook presents fundamental knowledge on the hardware/software (HW/SW) codesign methodology. Contributing expert authors look at key techniques in the design flow as well as selected codesign tools and design environments, building on basic knowledge to consider the latest techniques. The book enables readers to gain real benefits from the HW/SW codesign methodology through explanations and case studies which demonstrate its usefulness. Readers are invited to follow the progress of design techniques through this work, which assists readers in following current research directions and learning about state-of-the-art techniques. Students and researchers will appreciate the wide spectrum of subjects that belong to the design methodology from this handbook. .

  20. Battery Management System Hardware Concepts: An Overview

    Directory of Open Access Journals (Sweden)

    Markus Lelie

    2018-03-01

    Full Text Available This paper focuses on the hardware aspects of battery management systems (BMS for electric vehicle and stationary applications. The purpose is giving an overview on existing concepts in state-of-the-art systems and enabling the reader to estimate what has to be considered when designing a BMS for a given application. After a short analysis of general requirements, several possible topologies for battery packs and their consequences for the BMS’ complexity are examined. Four battery packs that were taken from commercially available electric vehicles are shown as examples. Later, implementation aspects regarding measurement of needed physical variables (voltage, current, temperature, etc. are discussed, as well as balancing issues and strategies. Finally, safety considerations and reliability aspects are investigated.

  1. EPICS: Allen-Bradley hardware reference manual

    International Nuclear Information System (INIS)

    Nawrocki, G.

    1993-01-01

    This manual covers the following hardware: Allen-Bradley 6008 -- SV VMEbus I/O scanner; Allen-Bradley universal I/O chassis 1771-A1B, -A2B, -A3B, and -A4B; Allen-Bradley power supply module 1771-P4S; Allen-Bradley 1771-ASB remote I/O adapter module; Allen-Bradley 1771-IFE analog input module; Allen-Bradley 1771-OFE analog output module; Allen-Bradley 1771-IG(D) TTL input module; Allen-Bradley 1771-OG(d) TTL output; Allen-Bradley 1771-IQ DC selectable input module; Allen-Bradley 1771-OW contact output module; Allen-Bradley 1771-IBD DC (10--30V) input module; Allen-Bradley 1771-OBD DC (10--60V) output module; Allen-Bradley 1771-IXE thermocouple/millivolt input module; and the Allen-Bradley 2705 RediPANEL push button module

  2. Locating hardware faults in a parallel computer

    Science.gov (United States)

    Archer, Charles J.; Megerian, Mark G.; Ratterman, Joseph D.; Smith, Brian E.

    2010-04-13

    Locating hardware faults in a parallel computer, including defining within a tree network of the parallel computer two or more sets of non-overlapping test levels of compute nodes of the network that together include all the data communications links of the network, each non-overlapping test level comprising two or more adjacent tiers of the tree; defining test cells within each non-overlapping test level, each test cell comprising a subtree of the tree including a subtree root compute node and all descendant compute nodes of the subtree root compute node within a non-overlapping test level; performing, separately on each set of non-overlapping test levels, an uplink test on all test cells in a set of non-overlapping test levels; and performing, separately from the uplink tests and separately on each set of non-overlapping test levels, a downlink test on all test cells in a set of non-overlapping test levels.

  3. Theorem Proving in Intel Hardware Design

    Science.gov (United States)

    O'Leary, John

    2009-01-01

    For the past decade, a framework combining model checking (symbolic trajectory evaluation) and higher-order logic theorem proving has been in production use at Intel. Our tools and methodology have been used to formally verify execution cluster functionality (including floating-point operations) for a number of Intel products, including the Pentium(Registered TradeMark)4 and Core(TradeMark)i7 processors. Hardware verification in 2009 is much more challenging than it was in 1999 - today s CPU chip designs contain many processor cores and significant firmware content. This talk will attempt to distill the lessons learned over the past ten years, discuss how they apply to today s problems, outline some future directions.

  4. Hardware implementation of stochastic spiking neural networks.

    Science.gov (United States)

    Rosselló, Josep L; Canals, Vincent; Morro, Antoni; Oliver, Antoni

    2012-08-01

    Spiking Neural Networks, the last generation of Artificial Neural Networks, are characterized by its bio-inspired nature and by a higher computational capacity with respect to other neural models. In real biological neurons, stochastic processes represent an important mechanism of neural behavior and are responsible of its special arithmetic capabilities. In this work we present a simple hardware implementation of spiking neurons that considers this probabilistic nature. The advantage of the proposed implementation is that it is fully digital and therefore can be massively implemented in Field Programmable Gate Arrays. The high computational capabilities of the proposed model are demonstrated by the study of both feed-forward and recurrent networks that are able to implement high-speed signal filtering and to solve complex systems of linear equations.

  5. Communication Estimation for Hardware/Software Codesign

    DEFF Research Database (Denmark)

    Knudsen, Peter Voigt; Madsen, Jan

    1998-01-01

    This paper presents a general high level estimation model of communication throughput for the implementation of a given communication protocol. The model, which is part of a larger model that includes component price, software driver object code size and hardware driver area, is intended...... to be general enough to be able to capture the characteristics of a wide range of communication protocols and yet to be sufficiently detailed as to allow the designer or design tool to efficiently explore tradeoffs between throughput, bus widths, burst/non-burst transfers and data packing strategies. Thus...... it provides a basis for decision making with respect to communication protocols/components and communication driver design in the initial design space exploration phase of a co-synthesis process where a large number of possibilities must be examined and where fast estimators are therefore necessary. The fill...

  6. The double Chooz hardware trigger system

    Energy Technology Data Exchange (ETDEWEB)

    Cucoanes, Andi; Beissel, Franz; Reinhold, Bernd; Roth, Stefan; Stahl, Achim; Wiebusch, Christopher [RWTH Aachen (Germany)

    2008-07-01

    The double Chooz neutrino experiment aims to improve the present knowledge on {theta}{sub 13} mixing angle using two similar detectors placed at {proportional_to}280 m and respectively 1 km from the Chooz power plant reactor cores. The detectors measure the disappearance of reactor antineutrinos. The hardware trigger has to be very efficient for antineutrinos as well as for various types of background events. The triggering condition is based on discriminated PMT sum signals and the multiplicity of groups of PMTs. The talk gives an outlook to the double Chooz experiment and explains the requirements of the trigger system. The resulting concept and its performance is shown as well as first results from a prototype system.

  7. Accelerating Value Creation with Accelerators

    DEFF Research Database (Denmark)

    Jonsson, Eythor Ivar

    2015-01-01

    and developing the best business ideas and support the due diligence process. Even universities are noticing that the learning experience of the action learning approach is an effective way to develop capabilities and change cultures. Accelerators related to what has historically been associated...

  8. Hardware/Software Co-Design of a Traffic Sign Recognition System Using Zynq FPGAs

    Directory of Open Access Journals (Sweden)

    Yan Han

    2015-12-01

    Full Text Available Traffic sign recognition (TSR, taken as an important component of an intelligent vehicle system, has been an emerging research topic in recent years. In this paper, a traffic sign detection system based on color segmentation, speeded-up robust features (SURF detection and the k-nearest neighbor classifier is introduced. The proposed system benefits from the SURF detection algorithm, which achieves invariance to rotated, skewed and occluded signs. In addition to the accuracy and robustness issues, a TSR system should target a real-time implementation on an embedded system. Therefore, a hardware/software co-design architecture for a Zynq-7000 FPGA is presented as a major objective of this work. The sign detection operations are accelerated by programmable hardware logic that searches the potential candidates for sign classification. Sign recognition and classification uses a feature extraction and matching algorithm, which is implemented as a software component that runs on the embedded ARM CPU.

  9. Data Intensive Architecture for Scalable Cyber Analytics

    Energy Technology Data Exchange (ETDEWEB)

    Olsen, Bryan K.; Johnson, John R.; Critchlow, Terence J.

    2011-12-19

    Cyber analysts are tasked with the identification and mitigation of network exploits and threats. These compromises are difficult to identify due to the characteristics of cyber communication, the volume of traffic, and the duration of possible attack. In this paper, we describe a prototype implementation designed to provide cyber analysts an environment where they can interactively explore a month’s worth of cyber security data. This prototype utilized On-Line Analytical Processing (OLAP) techniques to present a data cube to the analysts. The cube provides a summary of the data, allowing trends to be easily identified as well as the ability to easily pull up the original records comprising an event of interest. The cube was built using SQL Server Analysis Services (SSAS), with the interface to the cube provided by Tableau. This software infrastructure was supported by a novel hardware architecture comprising a Netezza TwinFin® for the underlying data warehouse and a cube server with a FusionIO drive hosting the data cube. We evaluated this environment on a month’s worth of artificial, but realistic, data using multiple queries provided by our cyber analysts. As our results indicate, OLAP technology has progressed to the point where it is in a unique position to provide novel insights to cyber analysts, as long as it is supported by an appropriate data intensive architecture.

  10. Laser acceleration

    Science.gov (United States)

    Tajima, T.; Nakajima, K.; Mourou, G.

    2017-02-01

    The fundamental idea of Laser Wakefield Acceleration (LWFA) is reviewed. An ultrafast intense laser pulse drives coherent wakefield with a relativistic amplitude robustly supported by the plasma. While the large amplitude of wakefields involves collective resonant oscillations of the eigenmode of the entire plasma electrons, the wake phase velocity ˜ c and ultrafastness of the laser pulse introduce the wake stability and rigidity. A large number of worldwide experiments show a rapid progress of this concept realization toward both the high-energy accelerator prospect and broad applications. The strong interest in this has been spurring and stimulating novel laser technologies, including the Chirped Pulse Amplification, the Thin Film Compression, the Coherent Amplification Network, and the Relativistic Mirror Compression. These in turn have created a conglomerate of novel science and technology with LWFA to form a new genre of high field science with many parameters of merit in this field increasing exponentially lately. This science has triggered a number of worldwide research centers and initiatives. Associated physics of ion acceleration, X-ray generation, and astrophysical processes of ultrahigh energy cosmic rays are reviewed. Applications such as X-ray free electron laser, cancer therapy, and radioisotope production etc. are considered. A new avenue of LWFA using nanomaterials is also emerging.

  11. Laser acceleration

    International Nuclear Information System (INIS)

    Tajima, T.; Nakajima, K.; Mourou, G.

    2017-01-01

    The fundamental idea of LaserWakefield Acceleration (LWFA) is reviewed. An ultrafast intense laser pulse drives coherent wakefield with a relativistic amplitude robustly supported by the plasma. While the large amplitude of wake fields involves collective resonant oscillations of the eigenmode of the entire plasma electrons, the wake phase velocity ∼ c and ultra fastness of the laser pulse introduce the wake stability and rigidity. A large number of worldwide experiments show a rapid progress of this concept realization toward both the high-energy accelerator prospect and broad applications. The strong interest in this has been spurring and stimulating novel laser technologies, including the Chirped Pulse Amplification, the Thin Film Compression, the Coherent Amplification Network, and the Relativistic Mirror Compression. These in turn have created a conglomerate of novel science and technology with LWFA to form a new genre of high field science with many parameters of merit in this field increasing exponentially lately. This science has triggered a number of worldwide research centers and initiatives. Associated physics of ion acceleration, X-ray generation, and astrophysical processes of ultrahigh energy cosmic rays are reviewed. Applications such as X-ray free electron laser, cancer therapy, and radioisotope production etc. are considered. A new avenue of LWFA using nano materials is also emerging.

  12. Accelerating networks

    International Nuclear Information System (INIS)

    Smith, David M D; Onnela, Jukka-Pekka; Johnson, Neil F

    2007-01-01

    Evolving out-of-equilibrium networks have been under intense scrutiny recently. In many real-world settings the number of links added per new node is not constant but depends on the time at which the node is introduced in the system. This simple idea gives rise to the concept of accelerating networks, for which we review an existing definition and-after finding it somewhat constrictive-offer a new definition. The new definition provided here views network acceleration as a time dependent property of a given system as opposed to being a property of the specific algorithm applied to grow the network. The definition also covers both unweighted and weighted networks. As time-stamped network data becomes increasingly available, the proposed measures may be easily applied to such empirical datasets. As a simple case study we apply the concepts to study the evolution of three different instances of Wikipedia, namely, those in English, German, and Japanese, and find that the networks undergo different acceleration regimes in their evolution

  13. Optimizing Nanoelectrode Arrays for Scalable Intracellular Electrophysiology.

    Science.gov (United States)

    Abbott, Jeffrey; Ye, Tianyang; Ham, Donhee; Park, Hongkun

    2018-03-20

    , clarifying how the nanoelectrode attains intracellular access. This understanding will be translated into a circuit model for the nanobio interface, which we will then use to lay out the strategies for improving the interface. The intracellular interface of the nanoelectrode is currently inferior to that of the patch clamp electrode; reaching this benchmark will be an exciting challenge that involves optimization of electrode geometries, materials, chemical modifications, electroporation protocols, and recording/stimulation electronics, as we describe in the Account. Another important theme of this Account, beyond the optimization of the individual nanoelectrode-cell interface, is the scalability of the nanoscale electrodes. We will discuss this theme using a recent development from our groups as an example, where an array of ca. 1000 nanoelectrode pixels fabricated on a CMOS integrated circuit chip performs parallel intracellular recording from a few hundreds of cardiomyocytes, which marks a new milestone in electrophysiology.

  14. The Fermilab Accelerator control system

    Science.gov (United States)

    Bogert, Dixon

    1986-06-01

    With the advent of the Tevatron, considerable upgrades have been made to the controls of all the Fermilab Accelerators. The current system is based on making as large an amount of data as possible available to many operators or end-users. Specifically there are about 100 000 separate readings, settings, and status and control registers in the various machines, all of which can be accessed by seventeen consoles, some in the Main Control Room and others distributed throughout the complex. A "Host" computer network of approximately eighteen PDP-11/34's, seven PDP-11/44's, and three VAX-11/785's supports a distributed data acquisition system including Lockheed MAC-16's left from the original Main Ring and Booster instrumentation and upwards of 1000 Z80, Z8002, and M68000 microprocessors in dozens of configurations. Interaction of the various parts of the system is via a central data base stored on the disk of one of the VAXes. The primary computer-hardware communication is via CAMAC for the new Tevatron and Antiproton Source; certain subsystems, among them vacuum, refrigeration, and quench protection, reside in the distributed microprocessors and communicate via GAS, an in-house protocol. An important hardware feature is an accurate clock system making a large number of encoded "events" in the accelerator supercycle available for both hardware modules and computers. System software features include the ability to save the current state of the machine or any subsystem and later restore it or compare it with the state at another time, a general logging facility to keep track of specific variables over long periods of time, detection of "exception conditions" and the posting of alarms, and a central filesharing capability in which files on VAX disks are available for access by any of the "Host" processors.

  15. The Fermilab accelerator control system

    International Nuclear Information System (INIS)

    Bogert, D.

    1986-01-01

    With the advent of the Tevatron, considerable upgrades have been made to the controls of all the Fermilab Accelerators. The current system is based on making as large an amount of data as possible available to many operators or end-users. Specifically there are about 100000 separate readings, settings, and status and control registers in the various machines, all of which can be accessed by seventeen consoles, some in the Main Control Room and others distributed throughout the complex. A ''Host'' computer network of approximately eighteen PDP-11/34's, seven PDP-11/44's, and three VAX-11/785's supports a distributed data acquisition system including Lockheed MAC-16's left from the original Main Ring and Booster instrumentation and upwards of 1000 Z80, Z8002, and M68000 microprocessors in dozens of configurations. Interaction of the various parts of the system is via a central data base stored on the disk of one of the VAXes. The primary computer-hardware communication is via CAMAC for the new Tevatron and Antiproton Source; certain subsystems, among them vacuum, refrigeration and quench protection, reside in the distributed microprocessors and communicate via GAS, an in-house protocol. An important hardware feature is an accurate clock system making a large number of encoded ''events'' in the accelerator supercycle available for both hardware modules and computers. System software features include the ability to save the current state of the machine or any subsystem and later restore it or compare it with the state at another time, a general logging facility to keep track of specific variables over long periods of time, detection of 'exception conditions' and the posting of alarms, and a central filesharing capability in which files on VAX disks are available for access by any of the ''Host'' processors. (orig.)

  16. A portable accelerator control toolkit

    Energy Technology Data Exchange (ETDEWEB)

    Watson, W.A. III

    1997-06-01

    In recent years, the expense of creating good control software has led to a number of collaborative efforts among laboratories to share this cost. The EPICS collaboration is a particularly successful example of this trend. More recently another collaborative effort has addressed the need for sophisticated high level software, including model driven accelerator controls. This work builds upon the CDEV (Common DEVice) software framework, which provides a generic abstraction of a control system, and maps that abstraction onto a number of site-specific control systems including EPICS, the SLAC control system, CERN/PS and others. In principle, it is now possible to create portable accelerator control applications which have no knowledge of the underlying and site-specific control system. Applications based on CDEV now provide a growing suite of tools for accelerator operations, including general purpose displays, an on-line accelerator model, beamline steering, machine status displays incorporating both hardware and model information (such as beam positions overlaid with beta functions) and more. A survey of CDEV compatible portable applications will be presented, as well as plans for future development.

  17. A portable accelerator control toolkit

    International Nuclear Information System (INIS)

    Watson, W.A. III.

    1997-01-01

    In recent years, the expense of creating good control software has led to a number of collaborative efforts among laboratories to share this cost. The EPICS collaboration is a particularly successful example of this trend. More recently another collaborative effort has addressed the need for sophisticated high level software, including model driven accelerator controls. This work builds upon the CDEV (Common DEVice) software framework, which provides a generic abstraction of a control system, and maps that abstraction onto a number of site-specific control systems including EPICS, the SLAC control system, CERN/PS and others. In principle, it is now possible to create portable accelerator control applications which have no knowledge of the underlying and site-specific control system. Applications based on CDEV now provide a growing suite of tools for accelerator operations, including general purpose displays, an on-line accelerator model, beamline steering, machine status displays incorporating both hardware and model information (such as beam positions overlaid with beta functions) and more. A survey of CDEV compatible portable applications will be presented, as well as plans for future development

  18. Advanced concepts for acceleration

    International Nuclear Information System (INIS)

    Keefe, D.

    1986-07-01

    Selected examples of advanced accelerator concepts are reviewed. Such plasma accelerators as plasma beat wave accelerator, plasma wake field accelerator, and plasma grating accelerator are discussed particularly as examples of concepts for accelerating relativistic electrons or positrons. Also covered are the pulsed electron-beam, pulsed laser accelerator, inverse Cherenkov accelerator, inverse free-electron laser, switched radial-line accelerators, and two-beam accelerator. Advanced concepts for ion acceleration discussed include the electron ring accelerator, excitation of waves on intense electron beams, and two-wave combinations

  19. Microprocessor controller for phasing the accelerator

    International Nuclear Information System (INIS)

    Howry, S.K.; Wilmunder, A.R.

    1977-03-01

    A microprocessor controller is being developed to perform automatic phasing of the SLAC accelerator. It will replace the existing relay/analog boxes which are ten years old. The new system is all solid state except for the stepping motors that drive the phase shifters. A description is given of the components of the system, the control algorithm, microprocessor hardware and software design and development, and interaction with SLAC's computer control system

  20. Hardware descriptions of the I and C systems for NPP

    International Nuclear Information System (INIS)

    Lee, Cheol Kwon; Oh, In Suk; Park, Joo Hyun; Kim, Dong Hoon; Han, Jae Bok; Shin, Jae Whal; Kim, Young Bak

    2003-09-01

    The hardware specifications for I and C Systems of SNPP(Standard Nuclear Power Plant) are reviewed in order to acquire the hardware requirement and specification of KNICS (Korea Nuclear Instrumentation and Control System). In the study, we investigated hardware requirements, hardware configuration, hardware specifications, man-machine hardware requirements, interface requirements with the other system, and data communication requirements that are applicable to SNP. We reviewed those things of control systems, protection systems, monitoring systems, information systems, and process instrumentation systems. Through the study, we described the requirements and specifications of digital systems focusing on a microprocessor and a communication interface, and repeated it for analog systems focusing on the manufacturing companies. It is expected that the experience acquired from this research will provide vital input for the development of the KNICS

  1. KOLAM: a cross-platform architecture for scalable visualization and tracking in wide-area imagery

    Science.gov (United States)

    Fraser, Joshua; Haridas, Anoop; Seetharaman, Guna; Rao, Raghuveer M.; Palaniappan, Kannappan

    2013-05-01

    KOLAM is an open, cross-platform, interoperable, scalable and extensible framework supporting a novel multi- scale spatiotemporal dual-cache data structure for big data visualization and visual analytics. This paper focuses on the use of KOLAM for target tracking in high-resolution, high throughput wide format video also known as wide-area motion imagery (WAMI). It was originally developed for the interactive visualization of extremely large geospatial imagery of high spatial and spectral resolution. KOLAM is platform, operating system and (graphics) hardware independent, and supports embedded datasets scalable from hundreds of gigabytes to feasibly petabytes in size on clusters, workstations, desktops and mobile computers. In addition to rapid roam, zoom and hyper- jump spatial operations, a large number of simultaneously viewable embedded pyramid layers (also referred to as multiscale or sparse imagery), interactive colormap and histogram enhancement, spherical projection and terrain maps are supported. The KOLAM software architecture was extended to support airborne wide-area motion imagery by organizing spatiotemporal tiles in very large format video frames using a temporal cache of tiled pyramid cached data structures. The current version supports WAMI animation, fast intelligent inspection, trajectory visualization and target tracking (digital tagging); the latter by interfacing with external automatic tracking software. One of the critical needs for working with WAMI is a supervised tracking and visualization tool that allows analysts to digitally tag multiple targets, quickly review and correct tracking results and apply geospatial visual analytic tools on the generated trajectories. One-click manual tracking combined with multiple automated tracking algorithms are available to assist the analyst and increase human effectiveness.

  2. Upgrade of the TOTEM DAQ using the Scalable Readout System (SRS)

    International Nuclear Information System (INIS)

    Quinto, M; Cafagna, F; Fiergolski, A; Radicioni, E

    2013-01-01

    The main goals of the TOTEM Experiment at the LHC are the measurements of the elastic and total p-p cross sections and the studies of the diffractive dissociation processes. At LHC, collisions are produced at a rate of 40 MHz, imposing strong requirements for the Data Acquisition Systems (DAQ) in terms of trigger rate and data throughput. The TOTEM DAQ adopts a modular approach that, in standalone mode, is based on VME bus system. The VME based Front End Driver (FED) modules, host mezzanines that receive data through optical fibres directly from the detectors. After data checks and formatting are applied in the mezzanine, data is retransmitted to the VME interface and to another mezzanine card plugged in the FED module. The VME bus maximum bandwidth limits the maximum first level trigger (L1A) to 1 kHz rate. In order to get rid of the VME bottleneck and improve scalability and the overall capabilities of the DAQ, a new system was designed and constructed based on the Scalable Readout System (SRS), developed in the framework of the RD51 Collaboration. The project aims to increase the efficiency of the actual readout system providing higher bandwidth, and increasing data filtering, implementing a second-level trigger event selection based on hardware pattern recognition algorithms. This goal is to be achieved preserving the maximum back compatibility with the LHC Timing, Trigger and Control (TTC) system as well as with the CMS DAQ. The obtained results and the perspectives of the project are reported. In particular, we describe the system architecture and the new Opto-FEC adapter card developed to connect the SRS with the FED mezzanine modules. A first test bench was built and validated during the last TOTEM data taking period (February 2013). Readout of a set of 3 TOTEM Roman Pot silicon detectors was carried out to verify performance in the real LHC environment. In addition, the test allowed a check of data consistency and quality

  3. Accelerators and the Accelerator Community

    Energy Technology Data Exchange (ETDEWEB)

    Malamud, Ernest; Sessler, Andrew

    2008-06-01

    In this paper, standing back--looking from afar--and adopting a historical perspective, the field of accelerator science is examined. How it grew, what are the forces that made it what it is, where it is now, and what it is likely to be in the future are the subjects explored. Clearly, a great deal of personal opinion is invoked in this process.

  4. Accelerators and the Accelerator Community

    International Nuclear Information System (INIS)

    Malamud, Ernest; Sessler, Andrew

    2008-01-01

    In this paper, standing back--looking from afar--and adopting a historical perspective, the field of accelerator science is examined. How it grew, what are the forces that made it what it is, where it is now, and what it is likely to be in the future are the subjects explored. Clearly, a great deal of personal opinion is invoked in this process

  5. Expert System analysis of non-fuel assembly hardware and spent fuel disassembly hardware: Its generation and recommended disposal

    International Nuclear Information System (INIS)

    Williamson, D.A.

    1991-01-01

    Almost all of the effort being expended on radioactive waste disposal in the United States is being focused on the disposal of spent Nuclear Fuel, with little consideration for other areas that will have to be disposed of in the same facilities. one area of radioactive waste that has not been addressed adequately because it is considered a secondary part of the waste issue is the disposal of the various Non-Fuel Bearing Components of the reactor core. These hardware components fall somewhat arbitrarily into two categories: Non-Fuel Assembly (NFA) hardware and Spent Fuel Disassembly (SFD) hardware. This work provides a detailed examination of the generation and disposal of NFA hardware and SFD hardware by the nuclear utilities of the United States as it relates to the Civilian Radioactive Waste Management Program. All available sources of data on NFA and SFD hardware are analyzed with particular emphasis given to the Characteristics Data Base developed by Oak Ridge National Laboratory and the characterization work performed by Pacific Northwest Laboratories and Rochester Gas ampersand Electric. An Expert System developed as a portion of this work is used to assist in the prediction of quantities of NFA hardware and SFD hardware that will be generated by the United States' utilities. Finally, the hardware waste management practices of the United Kingdom, France, Germany, Sweden, and Japan are studied for possible application to the disposal of domestic hardware wastes. As a result of this work, a general classification scheme for NFA and SFD hardware was developed. Only NFA and SFD hardware constructed of zircaloy and experiencing a burnup of less than 70,000 MWD/MTIHM and PWR control rods constructed of stainless steel are considered Low-Level Waste. All other hardware is classified as Greater-ThanClass-C waste

  6. Why Open Source Hardware matters and why you should care

    OpenAIRE

    Gürkaynak, Frank K.

    2017-01-01

    Open source hardware is currently where open source software was about 30 years ago. The idea is well received by enthusiasts, there is interest and the open source hardware has gained visible momentum recently, with several well-known universities including UC Berkeley, Cambridge and ETH Zürich actively working on large projects involving open source hardware, attracting the attention of companies big and small. But it is still not quite there yet. In this talk, based on my experience on the...

  7. Support for NUMA hardware in HelenOS

    OpenAIRE

    Horký, Vojtěch

    2011-01-01

    The goal of this master thesis is to extend HelenOS operating system with the support for ccNUMA hardware. The text of the thesis contains a brief introduction to ccNUMA hardware, an overview of NUMA features and relevant features of HelenOS (memory management, scheduling, etc.). The thesis analyses various design decisions of the implementation of NUMA support -- introducing the hardware topology into the kernel data structures, propagating this information to user space, thread affinity to ...

  8. Development of a simple, low cost, indirect ion beam fluence measurement system for ion implanters, accelerators

    Science.gov (United States)

    Suresh, K.; Balaji, S.; Saravanan, K.; Navas, J.; David, C.; Panigrahi, B. K.

    2018-02-01

    We developed a simple, low cost user-friendly automated indirect ion beam fluence measurement system for ion irradiation and analysis experiments requiring indirect beam fluence measurements unperturbed by sample conditions like low temperature, high temperature, sample biasing as well as in regular ion implantation experiments in the ion implanters and electrostatic accelerators with continuous beam. The system, which uses simple, low cost, off-the-shelf components/systems and two distinct layers of in-house built softwarenot only eliminates the need for costly data acquisition systems but also overcomes difficulties in using properietry software. The hardware of the system is centered around a personal computer, a PIC16F887 based embedded system, a Faraday cup drive cum monitor circuit, a pair of Faraday Cups and a beam current integrator and the in-house developed software include C based microcontroller firmware and LABVIEW based virtual instrument automation software. The automatic fluence measurement involves two important phases, a current sampling phase lasting over 20-30 seconds during which the ion beam current is continuously measured by intercepting the ion beam and the averaged beam current value is computed. A subsequent charge computation phase lasting 700-900 seconds is executed making the ion beam to irradiate the samples and the incremental fluence received by the sampleis estimated usingthe latest averaged beam current value from the ion beam current sampling phase. The cycle of current sampling-charge computation is repeated till the required fluence is reached. Besides simplicity and cost-effectiveness, other important advantages of the developed system include easy reconfiguration of the system to suit customisation of experiments, scalability, easy debug and maintenance of the hardware/software, ability to work as a standalone system. The system was tested with different set of samples and ion fluences and the results were verified using

  9. Development of a distributed control system for the JAERI tandem accelerator facility

    International Nuclear Information System (INIS)

    Hanashima, Susumu

    2005-01-01

    In the JAERI tandem accelerator facility, we are building accelerator complex aiming generation and acceleration of radio nuclear beam. Several accelerators, ion sources and a charge breeder are installed in the facility. We are developing a distributed control system enabling smooth operation of the facility. We report basic concepts of the control system in this article. We also describe about a control hardware using plastic optical fiber, which is developed for the control system. (author)

  10. HISTRAP [Heavy Ion Storage Ring for Atomic Physics] prototype hardware studies

    International Nuclear Information System (INIS)

    Olsen, D.K.; Atkins, W.H.; Dowling, D.T.; Johnson, J.W.; Lord, R.S.; McConnell, J.W.; Milner, W.T.; Mosko, S.W.; Tatum, B.A.

    1989-01-01

    HISTRAP, Heavy Ion Storage Ring for Atomic Physics, is a proposed 2.67-Tm synchrotron/cooler/storage ring optimized for advanced atomic physics research which will be injected with ions from either the HHIRF 25-MV tandem accelerator or a dedicated ECR source and RFQ linac. Over the last two years, hardware prototypes have been developed for difficult and long lead-time components. A vacuum test stand, the rf cavity, and a prototype dipole magnet have been designed, constructed, and tested. 7 refs., 8 figs., 2 tabs

  11. Ring accelerators

    International Nuclear Information System (INIS)

    Gisler, G.; Faehl, R.

    1983-01-01

    We present two-dimensional simulations in (r-z) and r-theta) cylinderical geometries of imploding-liner-driven accelerators of rings of charged particles. We address issues of azimuthal and longitudinal stability of the rings. We discuss self-trapping designs in which beam injection and extraction is aided by means of external cusp fields. Our simulations are done with the 2-1/2-D particle-in-cell plasma simulation code CLINER, which combines collisionless, electromagnetic PIC capabilities with a quasi-MHD finite element package

  12. accelerating cavity

    CERN Multimedia

    On the inside of the cavity there is a layer of niobium. Operating at 4.2 degrees above absolute zero, the niobium is superconducting and carries an accelerating field of 6 million volts per metre with negligible losses. Each cavity has a surface of 6 m2. The niobium layer is only 1.2 microns thick, ten times thinner than a hair. Such a large area had never been coated to such a high accuracy. A speck of dust could ruin the performance of the whole cavity so the work had to be done in an extremely clean environment.

  13. Progress Report 2008: A Scalable and Extensible Earth System Model for Climate Change Science

    Energy Technology Data Exchange (ETDEWEB)

    Drake, John B [ORNL; Worley, Patrick H [ORNL; Hoffman, Forrest M [ORNL; Jones, Phil [Los Alamos National Laboratory (LANL)

    2009-01-01

    This project employs multi-disciplinary teams to accelerate development of the Community Climate System Model (CCSM), based at the National Center for Atmospheric Research (NCAR). A consortium of eight Department of Energy (DOE) National Laboratories collaborate with NCAR and the NASA Global Modeling and Assimilation Office (GMAO). The laboratories are Argonne (ANL), Brookhaven (BNL) Los Alamos (LANL), Lawrence Berkeley (LBNL), Lawrence Livermore (LLNL), Oak Ridge (ORNL), Pacific Northwest (PNNL) and Sandia (SNL). The work plan focuses on scalablity for petascale computation and extensibility to a more comprehensive earth system model. Our stated goal is to support the DOE mission in climate change research by helping ... To determine the range of possible climate changes over the 21st century and beyond through simulations using a more accurate climate system model that includes the full range of human and natural climate feedbacks with increased realism and spatial resolution.

  14. Reliable software for unreliable hardware a cross layer perspective

    CERN Document Server

    Rehman, Semeen; Henkel, Jörg

    2016-01-01

    This book describes novel software concepts to increase reliability under user-defined constraints. The authors’ approach bridges, for the first time, the reliability gap between hardware and software. Readers will learn how to achieve increased soft error resilience on unreliable hardware, while exploiting the inherent error masking characteristics and error (stemming from soft errors, aging, and process variations) mitigations potential at different software layers. · Provides a comprehensive overview of reliability modeling and optimization techniques at different hardware and software levels; · Describes novel optimization techniques for software cross-layer reliability, targeting unreliable hardware.

  15. Environmental Friendly Coatings and Corrosion Prevention For Flight Hardware Project

    Science.gov (United States)

    Calle, Luz

    2014-01-01

    Identify, test and develop qualification criteria for environmentally friendly corrosion protective coatings and corrosion preventative compounds (CPC's) for flight hardware an ground support equipment.

  16. Magnetic qubits as hardware for quantum computers

    International Nuclear Information System (INIS)

    Tejada, J.; Chudnovsky, E.; Barco, E. del

    2000-01-01

    We propose two potential realisations for quantum bits based on nanometre scale magnetic particles of large spin S and high anisotropy molecular clusters. In case (1) the bit-value basis states vertical bar-0> and vertical bar-1> are the ground and first excited spin states S z = S and S-1, separated by an energy gap given by the ferromagnetic resonance (FMR) frequency. In case (2), when there is significant tunnelling through the anisotropy barrier, the qubit states correspond to the symmetric, vertical bar-0>, and antisymmetric, vertical bar-1>, combinations of the two-fold degenerate ground state S z = ± S. In each case the temperature of operation must be low compared to the energy gap, Δ, between the states vertical bar-0> and vertical bar-1>. The gap Δ in case (2) can be controlled with an external magnetic field perpendicular to the easy axis of the molecular cluster. The states of different molecular clusters and magnetic particles may be entangled by connecting them by superconducting lines with Josephson switches, leading to the potential for quantum computing hardware. (author)

  17. Magnetic qubits as hardware for quantum computers

    Energy Technology Data Exchange (ETDEWEB)

    Tejada, J.; Chudnovsky, E.; Barco, E. del [and others

    2000-07-01

    We propose two potential realisations for quantum bits based on nanometre scale magnetic particles of large spin S and high anisotropy molecular clusters. In case (1) the bit-value basis states vertical bar-0> and vertical bar-1> are the ground and first excited spin states S{sub z} = S and S-1, separated by an energy gap given by the ferromagnetic resonance (FMR) frequency. In case (2), when there is significant tunnelling through the anisotropy barrier, the qubit states correspond to the symmetric, vertical bar-0>, and antisymmetric, vertical bar-1>, combinations of the two-fold degenerate ground state S{sub z} = {+-} S. In each case the temperature of operation must be low compared to the energy gap, {delta}, between the states vertical bar-0> and vertical bar-1>. The gap {delta} in case (2) can be controlled with an external magnetic field perpendicular to the easy axis of the molecular cluster. The states of different molecular clusters and magnetic particles may be entangled by connecting them by superconducting lines with Josephson switches, leading to the potential for quantum computing hardware. (author)

  18. Nanorobot Hardware Architecture for Medical Defense

    Directory of Open Access Journals (Sweden)

    Luiz C. Kretly

    2008-05-01

    Full Text Available This work presents a new approach with details on the integrated platform and hardware architecture for nanorobots application in epidemic control, which should enable real time in vivo prognosis of biohazard infection. The recent developments in the field of nanoelectronics, with transducers progressively shrinking down to smaller sizes through nanotechnology and carbon nanotubes, are expected to result in innovative biomedical instrumentation possibilities, with new therapies and efficient diagnosis methodologies. The use of integrated systems, smart biosensors, and programmable nanodevices are advancing nanoelectronics, enabling the progressive research and development of molecular machines. It should provide high precision pervasive biomedical monitoring with real time data transmission. The use of nanobioelectronics as embedded systems is the natural pathway towards manufacturing methodology to achieve nanorobot applications out of laboratories sooner as possible. To demonstrate the practical application of medical nanorobotics, a 3D simulation based on clinical data addresses how to integrate communication with nanorobots using RFID, mobile phones, and satellites, applied to long distance ubiquitous surveillance and health monitoring for troops in conflict zones. Therefore, the current model can also be used to prevent and save a population against the case of some targeted epidemic disease.

  19. Hardware upgrade for A2 data acquisition

    Energy Technology Data Exchange (ETDEWEB)

    Ostrick, Michael; Gradl, Wolfgang; Otte, Peter-Bernd; Neiser, Andreas; Steffen, Oliver; Wolfes, Martin; Koerner, Tito [Institut fuer Kernphysik, Mainz (Germany); Collaboration: A2-Collaboration

    2014-07-01

    The A2 Collaboration uses an energy tagged photon beam which is produced via bremsstrahlung off the MAMI electron beam. The detector system consists of Crystal Ball and TAPS and covers almost the whole solid angle. A frozen-spin polarized target allows to perform high precision measurements of polarization observables in meson photo-production. During the last summer, a major upgrade of the data acquisition system was performed, both on the hardware and the software side. The goal of this upgrade was increased reliability of the system and an improvement in the data rate to disk. By doubling the number of readout CPUs and employing special VME crates with a split backplane, the number of bus accesses per readout cycle and crate was cut by a factor of two, giving almost a factor of two gain in the readout rate. In the course of the upgrade, we also switched most of the detector control system to using the distributed control system EPICS. For the upgraded control system, some new tools were developed to make full use of the capabilities of this decentralised slow control and monitoring system. The poster presents some of the major contributions to this project.

  20. Sindbad: a multi-purpose and scalable X-ray simulation tool for NDE and medical imaging

    Energy Technology Data Exchange (ETDEWEB)

    Guillemaud, R.; Tabary, J.; Hugonnard, P.; Mathy, F.; Koenig, A.; Gliere, A

    2003-07-01

    In a unified framework, S.i.n.d.b.a.d. is a multipurpose X-ray simulation software which provides scalable approach of computation and very efficient results by combining analytical and monte Carlo simulations. The software has been validated experimentally. it is also a easy to use software with a strong emphasize on user friendly GUI, simple description of object (CAD or volume) and visualization tools. The next developments will be focused on acceleration of Monte Carlo simulation for scatter fraction computation and the addition of new types of detector. (N.C.)

  1. Cosmic ray acceleration mechanisms

    International Nuclear Information System (INIS)

    Cesarsky, C.J.

    1982-09-01

    We present a brief summary of some of the most popular theories of cosmic ray acceleration: Fermi acceleration, its application to acceleration by shocks in a scattering medium, and impulsive acceleration by relativistic shocks

  2. Quality Scalability Compression on Single-Loop Solution in HEVC

    Directory of Open Access Journals (Sweden)

    Mengmeng Zhang

    2014-01-01

    Full Text Available This paper proposes a quality scalable extension design for the upcoming high efficiency video coding (HEVC standard. In the proposed design, the single-loop decoder solution is extended into the proposed scalable scenario. A novel interlayer intra/interprediction is added to reduce the amount of bits representation by exploiting the correlation between coding layers. The experimental results indicate that the average Bjøntegaard delta rate decrease of 20.50% can be gained compared with the simulcast encoding. The proposed technique achieved 47.98% Bjøntegaard delta rate reduction compared with the scalable video coding extension of the H.264/AVC. Consequently, significant rate savings confirm that the proposed method achieves better performance.

  3. A Scalable Parallel PWTD-Accelerated SIE Solver for Analyzing Transient Scattering from Electrically Large Objects

    KAUST Repository

    Liu, Yang; Yucel, Abdulkadir; Bagci, Hakan; Michielssen, Eric

    2015-01-01

    of processors by leveraging two mechanisms: (i) a hierarchical parallelization strategy to evenly distribute the computation and memory loads at all levels of the PWTD tree among processors, and (ii) a novel asynchronous communication scheme to reduce the cost

  4. Scalable DeNoise-and-Forward in Bidirectional Relay Networks

    DEFF Research Database (Denmark)

    Sørensen, Jesper Hemming; Krigslund, Rasmus; Popovski, Petar

    2010-01-01

    In this paper a scalable relaying scheme is proposed based on an existing concept called DeNoise-and-Forward, DNF. We call it Scalable DNF, S-DNF, and it targets the scenario with multiple communication flows through a single common relay. The idea of the scheme is to combine packets at the relay...... in order to save transmissions. To ensure decodability at the end-nodes, a priori information about the content of the combined packets must be available. This is gathered during the initial transmissions to the relay. The trade-off between decodability and number of necessary transmissions is analysed...

  5. FPGA BASED HARDWARE KEY FOR TEMPORAL ENCRYPTION

    Directory of Open Access Journals (Sweden)

    B. Lakshmi

    2010-09-01

    Full Text Available In this paper, a novel encryption scheme with time based key technique on an FPGA is presented. Time based key technique ensures right key to be entered at right time and hence, vulnerability of encryption through brute force attack is eliminated. Presently available encryption systems, suffer from Brute force attack and in such a case, the time taken for breaking a code depends on the system used for cryptanalysis. The proposed scheme provides an effective method in which the time is taken as the second dimension of the key so that the same system can defend against brute force attack more vigorously. In the proposed scheme, the key is rotated continuously and four bits are drawn from the key with their concatenated value representing the delay the system has to wait. This forms the time based key concept. Also the key based function selection from a pool of functions enhances the confusion and diffusion to defend against linear and differential attacks while the time factor inclusion makes the brute force attack nearly impossible. In the proposed scheme, the key scheduler is implemented on FPGA that generates the right key at right time intervals which is then connected to a NIOS – II processor (a virtual microcontroller which is brought out from Altera FPGA that communicates with the keys to the personal computer through JTAG (Joint Test Action Group communication and the computer is used to perform encryption (or decryption. In this case the FPGA serves as hardware key (dongle for data encryption (or decryption.

  6. Bayesian Estimation and Inference using Stochastic Hardware

    Directory of Open Access Journals (Sweden)

    Chetan Singh Thakur

    2016-03-01

    Full Text Available In this paper, we present the implementation of two types of Bayesian inference problems to demonstrate the potential of building probabilistic algorithms in hardware using single set of building blocks with the ability to perform these computations in real time. The first implementation, referred to as the BEAST (Bayesian Estimation and Stochastic Tracker, demonstrates a simple problem where an observer uses an underlying Hidden Markov Model (HMM to track a target in one dimension. In this implementation, sensors make noisy observations of the target position at discrete time steps. The tracker learns the transition model for target movement, and the observation model for the noisy sensors, and uses these to estimate the target position by solving the Bayesian recursive equation online. We show the tracking performance of the system and demonstrate how it can learn the observation model, the transition model, and the external distractor (noise probability interfering with the observations. In the second implementation, referred to as the Bayesian INference in DAG (BIND, we show how inference can be performed in a Directed Acyclic Graph (DAG using stochastic circuits. We show how these building blocks can be easily implemented using simple digital logic gates. An advantage of the stochastic electronic implementation is that it is robust to certain types of noise, which may become an issue in integrated circuit (IC technology with feature sizes in the order of tens of nanometers due to their low noise margin, the effect of high-energy cosmic rays and the low supply voltage. In our framework, the flipping of random individual bits would not affect the system performance because information is encoded in a bit stream.

  7. Performance and scalability of the back-end sub-system in the ATLAS DAQ/EF prototype

    CERN Document Server

    Alexandrov, I N; Badescu, E; Burckhart, Doris; Caprini, M; Cohen, L; Duval, P Y; Hart, R; Jones, R; Kazarov, A; Kolos, S; Kotov, V; Laugier, D; Mapelli, Livio P; Moneta, L; Qian, Z; Radu, A A; Ribeiro, C A; Roumiantsev, V; Ryabov, Yu; Schweiger, D; Soloviev, I V

    2000-01-01

    The DAQ group of the future ATLAS experiment has developed a prototype system based on the trigger/DAQ architecture described in the ATLAS Technical Proposal to support studies of the full system functionality, architecture as well as available hardware and software technologies. One sub-system of this prototype is the back- end which encompasses the software needed to configure, control and monitor the DAQ, but excludes the processing and transportation of physics data. The back-end consists of a number of components including run control, configuration databases and message reporting system. The software has been developed using standard, external software technologies such as OO databases and CORBA. It has been ported to several C++ compilers and operating systems including Solaris, Linux, WNT and LynxOS. This paper gives an overview of the back-end software, its performance, scalability and current status. (17 refs).

  8. Sharing open hardware through ROP, the robotic open platform

    NARCIS (Netherlands)

    Lunenburg, J.; Soetens, R.P.T.; Schoenmakers, F.; Metsemakers, P.M.G.; van de Molengraft, M.J.G.; Steinbuch, M.; Behnke, S.; Veloso, M.; Visser, A.; Xiong, R.

    2014-01-01

    The robot open source software community, in particular ROS, drastically boosted robotics research. However, a centralized place to exchange open hardware designs does not exist. Therefore we launched the Robotic Open Platform (ROP). A place to share and discuss open hardware designs. Among others

  9. Sharing open hardware through ROP, the Robotic Open Platform

    NARCIS (Netherlands)

    Lunenburg, J.J.M.; Soetens, R.P.T.; Schoenmakers, Ferry; Metsemakers, P.M.G.; Molengraft, van de M.J.G.; Steinbuch, M.

    2013-01-01

    The robot open source software community, in particular ROS, drastically boosted robotics research. However, a centralized place to exchange open hardware designs does not exist. Therefore we launched the Robotic Open Platform (ROP). A place to share and discuss open hardware designs. Among others

  10. The role of the visual hardware system in rugby performance ...

    African Journals Online (AJOL)

    This study explores the importance of the 'hardware' factors of the visual system in the game of rugby. A group of professional and club rugby players were tested and the results compared. The results were also compared with the established norms for elite athletes. The findings indicate no significant difference in hardware ...

  11. Hardware packet pacing using a DMA in a parallel computer

    Science.gov (United States)

    Chen, Dong; Heidelberger, Phillip; Vranas, Pavlos

    2013-08-13

    Method and system for hardware packet pacing using a direct memory access controller in a parallel computer which, in one aspect, keeps track of a total number of bytes put on the network as a result of a remote get operation, using a hardware token counter.

  12. Hardware/software virtualization for the reconfigurable multicore platform.

    NARCIS (Netherlands)

    Ferger, M.; Al Kadi, M.; Hübner, M.; Koedam, M.L.P.J.; Sinha, S.S.; Goossens, K.G.W.; Marchesan Almeida, Gabriel; Rodrigo Azambuja, J.; Becker, Juergen

    2012-01-01

    This paper presents the Flex Tiles approach for the virtualization of hardware and software for a reconfigurable multicore architecture. The approach enables the virtualization of a dynamic tile-based hardware architecture consisting of processing tiles connected via a network-on-chip and a

  13. Hardware and software for image acquisition in nuclear medicine

    International Nuclear Information System (INIS)

    Fideles, E.L.; Vilar, G.; Silva, H.S.

    1992-01-01

    A system for image acquisition and processing in nuclear medicine is presented, including the hardware and software referring to acquisition. The hardware is consisted of an analog-digital conversion card, developed in wire-wape. Its function is digitate the analogic signs provided by gamma camera. The acquisitions are made in list or frame mode. (C.G.C.)

  14. Hardware Abstraction and Protocol Optimization for Coded Sensor Networks

    DEFF Research Database (Denmark)

    Nistor, Maricica; Roetter, Daniel Enrique Lucani; Barros, João

    2015-01-01

    The design of the communication protocols in wireless sensor networks (WSNs) often neglects several key characteristics of the sensor's hardware, while assuming that the number of transmitted bits is the dominating factor behind the system's energy consumption. A closer look at the hardware speci...

  15. PCIU: Hardware Implementations of an Efficient Packet Classification Algorithm with an Incremental Update Capability

    Directory of Open Access Journals (Sweden)

    O. Ahmed

    2011-01-01

    Full Text Available Packet classification plays a crucial role for a number of network services such as policy-based routing, firewalls, and traffic billing, to name a few. However, classification can be a bottleneck in the above-mentioned applications if not implemented properly and efficiently. In this paper, we propose PCIU, a novel classification algorithm, which improves upon previously published work. PCIU provides lower preprocessing time, lower memory consumption, ease of incremental rule update, and reasonable classification time compared to state-of-the-art algorithms. The proposed algorithm was evaluated and compared to RFC and HiCut using several benchmarks. Results obtained indicate that PCIU outperforms these algorithms in terms of speed, memory usage, incremental update capability, and preprocessing time. The algorithm, furthermore, was improved and made more accessible for a variety of applications through implementation in hardware. Two such implementations are detailed and discussed in this paper. The results indicate that a hardware/software codesign approach results in a slower, but easier to optimize and improve within time constraints, PCIU solution. A hardware accelerator based on an ESL approach using Handel-C, on the other hand, resulted in a 31x speed-up over a pure software implementation running on a state of the art Xeon processor.

  16. A Practical Introduction to HardwareSoftware Codesign

    CERN Document Server

    Schaumont, Patrick R

    2013-01-01

    This textbook provides an introduction to embedded systems design, with emphasis on integration of custom hardware components with software. The key problem addressed in the book is the following: how can an embedded systems designer strike a balance between flexibility and efficiency? The book describes how combining hardware design with software design leads to a solution to this important computer engineering problem. The book covers four topics in hardware/software codesign: fundamentals, the design space of custom architectures, the hardware/software interface and application examples. The book comes with an associated design environment that helps the reader to perform experiments in hardware/software codesign. Each chapter also includes exercises and further reading suggestions. Improvements in this second edition include labs and examples using modern FPGA environments from Xilinx and Altera, which make the material applicable to a greater number of courses where these tools are already in use.  Mo...

  17. Accelerating ATM Optimization Algorithms Using High Performance Computing Hardware, Phase II

    Data.gov (United States)

    National Aeronautics and Space Administration — NASA is developing algorithms and methodologies for efficient air-traffic management. Several researchers have adopted an optimization framework for solving problems...

  18. Accelerating Wavelet-Based Video Coding on Graphics Hardware using CUDA

    NARCIS (Netherlands)

    Laan, Wladimir J. van der; Roerdink, Jos B.T.M.; Jalba, Andrei C.; Zinterhof, P; Loncaric, S; Uhl, A; Carini, A

    2009-01-01

    The Discrete Wavelet Transform (DWT) has a wide range of applications from signal processing to video and image compression. This transform, by means of the lifting scheme, can be performed in a memory mid computation efficient way on modern, programmable GPUs, which can be regarded as massively

  19. More power : Accelerating sequential Computer Vision algorithms using commodity parallel hardware

    NARCIS (Netherlands)

    Jaap van de Loosdrecht; K. Dijkstra

    2014-01-01

    The last decade has seen an increasing demand from the industrial field of computerized visual inspection. Applications rapidly become more complex and often with more demanding real time constraints. However, from 2004 onwards the clock frequency of CPUs has not increased significantly. Computer

  20. A Hardware-Accelerated Fast Adaptive Vortex-Based Flow Simulation Software, Phase I

    Data.gov (United States)

    National Aeronautics and Space Administration — Applied Scientific Research has recently developed a Lagrangian vortex-boundary element method for the grid-free simulation of unsteady incompressible...