WorldWideScience

Sample records for hardware accelerated scalable

  1. Hardware Accelerated Simulated Radiography

    Energy Technology Data Exchange (ETDEWEB)

    Laney, D; Callahan, S; Max, N; Silva, C; Langer, S; Frank, R

    2005-04-12

    We present the application of hardware accelerated volume rendering algorithms to the simulation of radiographs as an aid to scientists designing experiments, validating simulation codes, and understanding experimental data. The techniques presented take advantage of 32-bit floating point texture capabilities to obtain validated solutions to the radiative transport equation for X-rays. An unsorted hexahedron projection algorithm is presented for curvilinear hexahedra that produces simulated radiographs in the absorption-only regime. A sorted tetrahedral projection algorithm is presented that simulates radiographs of emissive materials. We apply the tetrahedral projection algorithm to the simulation of experimental diagnostics for inertial confinement fusion experiments on a laser at the University of Rochester. We show that the hardware accelerated solution is faster than the current technique used by scientists.

  2. Hardware Accelerated Sequence Alignment with Traceback

    Directory of Open Access Journals (Sweden)

    Scott Lloyd

    2009-01-01

    in a timely manner. Known methods to accelerate alignment on reconfigurable hardware only address sequence comparison, limit the sequence length, or exhibit memory and I/O bottlenecks. A space-efficient, global sequence alignment algorithm and architecture is presented that accelerates the forward scan and traceback in hardware without memory and I/O limitations. With 256 processing elements in FPGA technology, a performance gain over 300 times that of a desktop computer is demonstrated on sequence lengths of 16000. For greater performance, the architecture is scalable to more processing elements.
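
    To make the terms "forward scan" and "traceback" in the record above concrete, here is a minimal software sketch of textbook global alignment (Needleman-Wunsch). It is not the space-efficient algorithm or the FPGA architecture the abstract describes: it stores the full score matrix, and the function name and the scoring values (match, mismatch, gap) are assumptions chosen for illustration.

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Textbook Needleman-Wunsch global alignment (illustrative scoring values)."""
    rows, cols = len(a) + 1, len(b) + 1

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Forward scan: score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align two symbols
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Traceback: walk from the bottom-right cell back to the origin,
    # at each step following a move that reproduces the stored score.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[-1][-1], "".join(reversed(out_a)), "".join(reversed(out_b))
```

    The full matrix costs memory proportional to the product of the two sequence lengths, which is the kind of limitation the abstract says known methods run into on long sequences.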

  3. Hardware-Accelerated Simulated Radiography

    Energy Technology Data Exchange (ETDEWEB)

    Laney, D; Callahan, S; Max, N; Silva, C; Langer, S; Frank, R

    2005-08-04

    We present the application of hardware accelerated volume rendering algorithms to the simulation of radiographs as an aid to scientists designing experiments, validating simulation codes, and understanding experimental data. The techniques presented take advantage of 32-bit floating point texture capabilities to obtain solutions to the radiative transport equation for X-rays. The hardware accelerated solutions are accurate enough to enable scientists to explore the experimental design space with greater efficiency than the methods currently in use. An unsorted hexahedron projection algorithm is presented for curvilinear hexahedral meshes that produces simulated radiographs in the absorption-only regime. A sorted tetrahedral projection algorithm is presented that simulates radiographs of emissive materials. We apply the tetrahedral projection algorithm to the simulation of experimental diagnostics for inertial confinement fusion experiments on a laser at the University of Rochester.
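
    The split between an unsorted algorithm for the absorption-only regime and a sorted algorithm for emissive materials follows from the transport equation along a single ray. The sketch below is not the authors' GPU implementation. It assumes piecewise-constant cells, each given as a hypothetical tuple of absorption coefficient, path length and (for the emissive case) emission per unit length, and applies the standard closed-form solution for a homogeneous cell.

```python
import math


def transmitted_intensity(segments, i0=1.0):
    """Absorption only: I = I0 * exp(-sum(mu * length)) along one ray.

    `segments` is an iterable of (mu, length) pairs. The optical depths
    simply add, so the order of the cells does not matter.
    """
    optical_depth = sum(mu * length for mu, length in segments)
    return i0 * math.exp(-optical_depth)


def emissive_intensity(segments_back_to_front, i_back=0.0):
    """Absorption plus emission, composited from the far side to the viewer.

    Each segment is (mu, length, emission). Light emitted in one cell is
    attenuated by every cell in front of it, so the order matters.
    """
    intensity = i_back
    for mu, length, emission in segments_back_to_front:
        t = math.exp(-mu * length)
        if mu > 0:
            intensity = intensity * t + (emission / mu) * (1.0 - t)
        else:
            # Limit of the expression above as mu goes to zero.
            intensity = intensity + emission * length
    return intensity
```

    Swapping two cells leaves `transmitted_intensity` unchanged but generally changes `emissive_intensity`, which is why only the emissive case needs a visibility sort.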

  4. Scalable fast multipole accelerated vortex methods

    KAUST Repository

    Hu, Qi

    2014-05-01

    The fast multipole method (FMM) is often used to accelerate the calculation of particle interactions in particle-based methods to simulate incompressible flows. To evaluate the most time-consuming kernels, the Biot-Savart equation and the stretching term of the vorticity equation, we mathematically reformulated them so that only two Laplace scalar potentials are used instead of six. This automatically ensures divergence-free far-field computation. Based on this formulation, we developed a new FMM-based vortex method on heterogeneous architectures, which distributed the work between multicore CPUs and GPUs to best utilize the hardware resources and achieve excellent scalability. The algorithm uses new data structures which can dynamically manage inter-node communication and load balance efficiently, with only a small parallel construction overhead. This algorithm can scale to large-sized clusters showing both strong and weak scalability. Careful error and timing trade-off analysis is also performed for the cutoff functions induced by the vortex particle method. Our implementation can perform one time step of the velocity+stretching calculation for one billion particles on 32 nodes in 55.9 seconds, which yields 49.12 Tflop/s.

  5. Hybrid Interconnect Design for Heterogeneous Hardware Accelerators

    NARCIS (Netherlands)

    Pham-Quoc Cuong, P.

    2015-01-01

    Heterogeneous multicore systems are becoming increasingly important as the need for computation power grows, especially when we are entering into the big data era. As one of the main trends in heterogeneous multicore, hardware accelerator systems provide application specific hardware circuits and

  6. Implementation of Hardware Accelerators on Zynq

    OpenAIRE

    Toft, Jakob Kenn; Nannarelli, Alberto

    2016-01-01

    In recent years it has become obvious that the performance of general purpose processors is having trouble meeting the requirements of today's high performance computing applications. This is partly due to the relatively high power consumption, compared to the performance, of general purpose processors, which has made hardware accelerators an essential part of several datacentres and the world's fastest super-computers. In this work, two different hardware accelerators were implemented o...

  7. The hardware accelerator array for logic simulation

    Energy Technology Data Exchange (ETDEWEB)

    Hansen, N H [Washington State Univ., Pullman, WA (USA)]

    1991-05-01

    Hardware acceleration exploits the parallelism inherent in large circuit simulations to achieve significant increases in performance. Simulation accelerators have been developed based on the compiled code algorithm or the event-driven algorithm. The greater flexibility of the event-driven algorithm has resulted in several important developments in hardware acceleration architecture. Some popular commercial products have been developed based on the event-driven algorithm and data-flow architectures. Conventional data-flow architectures require complex switching networks to distribute operands among processing elements resulting in considerable overhead. An accelerator array architecture based on a nearest-neighbor communication has been developed in this thesis. The design is simulated in detail at the behavioral level. Its performance is evaluated and shown to be superior to that of a conventional data-flow accelerator. 14 refs., 48 figs., 5 tabs.

  8. Hardware Accelerated Point Rendering of Isosurfaces

    DEFF Research Database (Denmark)

    Bærentzen, Jakob Andreas; Christensen, Niels Jørgen

    2003-01-01

    and that the advantage of rendering points as opposed to triangles increases with the size and complexity of the volumes. To gauge the visual quality of future hardware accelerated point rendering schemes, we have implemented a software based point rendering method and compare the quality to both MC and our OpenGL based...

  9. Hardware Acceleration of Adaptive Neural Algorithms.

    Energy Technology Data Exchange (ETDEWEB)

    James, Conrad D. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)]

    2017-11-01

    As traditional numerical computing has faced challenges, researchers have turned towards alternative computing approaches to reduce power-per-computation metrics and improve algorithm performance. Here, we describe an approach towards non-conventional computing that strengthens the connection between machine learning and neuroscience concepts. The Hardware Acceleration of Adaptive Neural Algorithms (HAANA) project has developed neural machine learning algorithms and hardware for applications in image processing and cybersecurity. While machine learning methods are effective at extracting relevant features from many types of data, the effectiveness of these algorithms degrades when subjected to real-world conditions. Our team has generated novel neural-inspired approaches to improve the resiliency and adaptability of machine learning algorithms. In addition, we have also designed and fabricated hardware architectures and microelectronic devices specifically tuned towards the training and inference operations of neural-inspired algorithms. Finally, our multi-scale simulation framework allows us to assess the impact of microelectronic device properties on algorithm performance.

  10. Protection of Accelerator Hardware: RF systems

    CERN Document Server

    Kim, S.-H.

    2016-01-01

    The radio-frequency (RF) system is the key element that generates electric fields for beam acceleration. To keep the system reliable, a highly sophisticated protection scheme is required, which also should be designed to ensure a good balance between beam availability and machine safety. Since RF systems are complex, incorporating high-voltage and high-power equipment, a good portion of machine downtime typically comes from RF systems. Equipment and component damage in RF systems results in long and expensive repairs. Protection of RF system hardware is one of the oldest machine protection concepts, dealing with the protection of individual high-power RF equipment from breakdowns. As beam power increases in modern accelerators, the protection of accelerating structures from beam-induced faults also becomes a critical aspect of protection schemes. In this article, an overview of the RF system is given, and selected topics of failure mechanisms and examples of protection requirements are introduced.

  11. Implementing a hardware-friendly wavelet entropy codec for scalable video

    Science.gov (United States)

    Eeckhaut, Hendrik; Christiaens, Mark; Devos, Harald; Stroobandt, Dirk

    2005-11-01

    In the RESUME project (Reconfigurable Embedded Systems for Use in Multimedia Environments) we explore the benefits of an implementation of scalable multimedia applications using reconfigurable hardware by building an FPGA implementation of a scalable wavelet-based video decoder. The term "scalable" refers to a design that can easily accommodate changes in quality of service with minimal computational overhead. This is important for portable devices that have different Quality of Service (QoS) requirements and have varying power restrictions. The scalable video decoder consists of three major blocks: a Wavelet Entropy Decoder (WED), an Inverse Discrete Wavelet Transformer (IDWT) and a Motion Compensator (MC). The WED decodes entropy encoded parts of the video stream into wavelet transformed frames. These frames are decoded bitlayer per bitlayer. The more bitlayers are decoded the higher the image quality (scalability in image quality). Resolution scalability is obtained as an inherent property of the IDWT. Finally framerate scalability is achieved through hierarchical motion compensation. In this article we present the results of our investigation into the hardware implementation of such a scalable video codec. In particular we found that the implementation of the entropy codec is a significant bottleneck. We present an alternative, hardware-friendly algorithm for entropy coding with excellent data locality (both temporal and spatial), streaming capabilities, a high degree of parallelism, a smaller memory footprint and state-of-the-art compression while maintaining all required scalability properties. These claims are supported by an effective hardware implementation on an FPGA.
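
    The quality scalability described in this record (each additional decoded bitlayer raises image quality) can be illustrated with plain bit-plane refinement. This is only a sketch of the progressive-refinement idea, not the RESUME entropy codec: it leaves out entropy coding, sign handling and the wavelet transform, assumes non-negative integer coefficients, and uses hypothetical function names.

```python
def bit_planes(values, num_bits):
    """Split non-negative integers into bit planes, most significant plane first."""
    return [[(v >> b) & 1 for v in values] for b in range(num_bits - 1, -1, -1)]


def decode_planes(planes, num_bits, planes_used):
    """Rebuild approximate values from only the first `planes_used` planes."""
    out = [0] * len(planes[0])
    for k in range(planes_used):
        shift = num_bits - 1 - k  # weight of plane k
        for idx, bit in enumerate(planes[k]):
            out[idx] |= bit << shift
    return out


def max_error(values, approx):
    """Largest absolute reconstruction error over all coefficients."""
    return max(abs(v - a) for v, a in zip(values, approx))
```

    A decoder can stop after any plane and still hold a usable approximation. With 8-bit coefficients the worst-case error is below 128 after one plane and reaches zero after all eight.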

  12. Open Hardware For CERN's Accelerator Control Systems

    CERN Document Server

    van der Bij, E; Ayass, M; Boccardi, A; Cattin, M; Gil Soriano, C; Gousiou, E; Iglesias Gonsálvez, S; Penacoba Fernandez, G; Serrano, J; Voumard, N; Wlostowski, T

    2011-01-01

    The accelerator control systems at CERN will be renovated and many electronics modules will be redesigned as the modules they will replace cannot be bought anymore or use obsolete components. The modules used in the control systems are diverse: analog and digital I/O, level converters and repeaters, serial links and timing modules. Overall around 120 modules are supported that are used in systems such as beam instrumentation, cryogenics and power converters. Only a small percentage of the currently used modules are commercially available, while most of them had been specifically designed at CERN. The new developments are based on VITA and PCI-SIG standards such as FMC (FPGA Mezzanine Card), PCI Express and VME64x using transition modules. As system-on-chip interconnect, the public domain Wishbone specification is used. For the renovation, it is considered imperative to have for each board access to the full hardware design and its firmware so that problems could quickly be resolved by CERN engineers or its ...

  13. FPGA Acceleration by Dynamically-Loaded Hardware Libraries

    DEFF Research Database (Denmark)

    Lomuscio, Andrea; Nannarelli, Alberto; Re, Marco

    Hardware acceleration is a viable solution to obtain energy efficiency in data intensive computation. In this work, we present a hardware framework to dynamically load hardware libraries, HLL, on reconfigurable platforms (FPGAs). Provided a library of application-specific processors, we load on...

  14. A Software and Hardware IPTV Architecture for Scalable DVB Distribution

    Directory of Open Access Journals (Sweden)

    Georg Acher

    2009-01-01

    Many standards and even more proprietary technologies deal with IP-based television (IPTV). But none of them can transparently map popular public broadcast services such as DVB or ATSC to IPTV with acceptable effort. In this paper we explain why we believe that such a mapping using a light weight framework is an important step towards all-IP multimedia. We then present the NetCeiver architecture: it is based on well-known standards such as IPv6, and it allows zero configuration. The use of multicast streaming makes NetCeiver highly scalable. We also describe a low cost FPGA implementation of the proposed NetCeiver architecture, which can concurrently stream services from up to six full transponders.

  15. Cache-based memory copy hardware accelerator for multicore systems

    NARCIS (Netherlands)

    Duarte, F.; Wong, S.

    2010-01-01

    In this paper, we present a new architecture of the cache-based memory copy hardware accelerator in a multicore system supporting message passing. The accelerator is able to accelerate memory data movements, in particular memory copies. We perform an analytical analysis based on open-queuing theory

  16. A Cache-Based Hardware Accelerator for Memory Data Movements

    NARCIS (Netherlands)

    Duarte, F.

    2008-01-01

    This dissertation presents a hardware accelerator that is able to accelerate large (including non-parallel) memory data movements, in particular memory copies, performed traditionally by the processors. As today's processors are tied with or have integrated caches with varying sizes (from several

  17. Nios II hardware acceleration of the epsilon quadratic sieve algorithm

    Science.gov (United States)

    Meyer-Bäse, Uwe; Botella, Guillermo; Castillo, Encarnacion; García, Antonio

    2010-04-01

    The quadratic sieve (QS) algorithm is one of the most powerful algorithms to factor large composite numbers and is used to break RSA cryptographic systems. The hardware structure of the QS algorithm seems to be a good fit for FPGA acceleration. Our new ɛ-QS algorithm further simplifies the hardware architecture, making it an even better candidate for C2H acceleration. This paper shows our design results in FPGA resource and performance when implementing very long arithmetic on the Nios microprocessor platform with C2H acceleration for different libraries (GMP, LIP, FLINT, NRMP) and QS architecture choices for factoring 32-2048 bit RSA numbers.

  18. Implementation of Hardware Accelerators on Zynq

    DEFF Research Database (Denmark)

    Toft, Jakob Kenn

    In recent years it has become obvious that the performance of general purpose processors is having trouble meeting the requirements of today's high performance computing applications. This is partly due to the relatively high power consumption, compared to the performance, of general purpose... benchmarks, a Monte Carlo simulation of European stock options and a Telco telephone billing application. Each of the accelerators tests different aspects of the Zynq platform in terms of floating-point and binary coded decimal processing speed. The two accelerators are compared with the performance...

  19. An Impulse-C Hardware Accelerator for Packet Classification Based on Fine/Coarse Grain Optimization

    Directory of Open Access Journals (Sweden)

    O. Ahmed

    2013-01-01

    Current software-based packet classification algorithms exhibit relatively poor performance, prompting many researchers to concentrate on novel frameworks and architectures that employ both hardware and software components. The Packet Classification with Incremental Update (PCIU) algorithm, Ahmed et al. (2010), is a novel and efficient packet classification algorithm with a unique incremental update capability that demonstrated excellent results and was shown to be scalable for many different tasks and clients. While a pure software implementation can generate powerful results on a server machine, an embedded solution may be more desirable for some applications and clients. Embedded, specialized hardware accelerator based solutions are typically much more efficient in speed, cost, and size than solutions that are implemented on general-purpose processor systems. This paper seeks to explore the design space of translating the PCIU algorithm into hardware by utilizing several optimization techniques, ranging from fine grain to coarse grain and parallel coarse grain approaches. The paper presents a detailed implementation of a hardware accelerator of the PCIU based on an Electronic System Level (ESL) approach. Results obtained indicate that the hardware accelerator achieves on average 27x speedup over a state-of-the-art Xeon processor.

  20. Evaluating the scalability of HEP software and multi-core hardware

    CERN Document Server

    Jarp, S; Leduc, J; Nowak, A

    2011-01-01

    As researchers have reached the practical limits of processor performance improvements by frequency scaling, it is clear that the future of computing lies in the effective utilization of parallel and multi-core architectures. Since this significant change in computing is well underway, it is vital for HEP programmers to understand the scalability of their software on modern hardware and the opportunities for potential improvements. This work aims to quantify the benefit of new mainstream architectures to the HEP community through practical benchmarking on recent hardware solutions, including the usage of parallelized HEP applications.

  1. A Low-Power Scalable Stream Compute Accelerator for General Matrix Multiply (GEMM)

    Directory of Open Access Journals (Sweden)

    Antony Savich

    2014-01-01

    play an important role in determining the performance of such applications. This paper proposes a novel, efficient, highly scalable hardware accelerator that is of equivalent performance to a 2 GHz quad core PC but can be used in low-power applications targeting embedded systems requiring high performance computation. Power, performance, and resource consumption are demonstrated on a fully-functional prototype. The proposed hardware accelerator is 36× more energy efficient per unit of computation compared to a state-of-the-art Xeon processor of equal vintage and is 14× more efficient as a stand-alone platform with equivalent performance. An important comparison between simulated system estimates and real system performance is carried out.

  2. A Hardware Framework for on-Chip FPGA Acceleration

    DEFF Research Database (Denmark)

    Lomuscio, Andrea; Cardarilli, Gian Carlo; Nannarelli, Alberto

    2016-01-01

    In this work, we present a new framework to dynamically load hardware accelerators on reconfigurable platforms (FPGAs). Provided a library of application-specific processors, we load on-the-fly the specific processor in the FPGA, and we transfer the execution from the CPU to the FPGA-based accelerator. Results show that significant speed-up can be obtained by the proposed acceleration framework on system-on-chips where reconfigurable fabric is placed next to the CPUs. The speed-up is due to both the intrinsic acceleration in the application-specific processors, and to the increased parallelism.

  3. 3D IBFV : Hardware-Accelerated 3D Flow Visualization

    NARCIS (Netherlands)

    Telea, Alexandru; Wijk, Jarke J. van

    2003-01-01

    We present a hardware-accelerated method for visualizing 3D flow fields. The method is based on insertion, advection, and decay of dye. To this aim, we extend the texture-based IBFV technique for 2D flow visualization in two main directions. First, we decompose the 3D flow visualization problem in a

  4. FHAST: FPGA-Based Acceleration of Bowtie in Hardware.

    Science.gov (United States)

    Fernandez, Edward B; Villarreal, Jason; Lonardi, Stefano; Najjar, Walid A

    2015-01-01

    While the sequencing capability of modern instruments continues to increase exponentially, the computational problem of mapping short sequenced reads to a reference genome still constitutes a bottleneck in the analysis pipeline. A variety of mapping tools (e.g., Bowtie, BWA) is available for general-purpose computer architectures. These tools can take many hours or even days to deliver mapping results, depending on the number of input reads, the size of the reference genome and the number of allowed mismatches or insertion/deletions, making the mapping problem an ideal candidate for hardware acceleration. In this paper, we present FHAST (FPGA hardware accelerated sequence-matching tool), a drop-in replacement for Bowtie that uses a hardware design based on field programmable gate arrays (FPGA). Our architecture masks memory latency by executing multiple concurrent hardware threads accessing memory simultaneously. FHAST is composed of multiple parallel engines to exploit the parallelism available to us on an FPGA. We have implemented and tested FHAST on the Convey HC-1 and later ported it to the Convey HC-2ex, taking advantage of the large memory bandwidth available to these systems and the shared memory image between hardware and software. A preliminary version of FHAST running on the Convey HC-1 achieved up to 70x speedup compared to Bowtie (single-threaded). An improved version of FHAST running on the Convey HC-2ex FPGAs achieved up to 12x speedup compared to Bowtie running eight threads on an eight-core conventional architecture, while maintaining almost identical mapping accuracy. FHAST is a drop-in replacement for Bowtie, so it can be incorporated in any analysis pipeline that uses Bowtie (e.g., TopHat).

  5. The NIDS Cluster: Scalable, Stateful Network Intrusion Detection on Commodity Hardware

    Energy Technology Data Exchange (ETDEWEB)

    Tierney, Brian L; Vallentin, Matthias; Sommer, Robin; Lee, Jason; Leres, Craig; Paxson, Vern

    2007-09-19

    In this work we present a NIDS cluster as a scalable solution for realizing high-performance, stateful network intrusion detection on commodity hardware. The design addresses three challenges: (i) distributing traffic evenly across an extensible set of analysis nodes in a fashion that minimizes the communication required for coordination; (ii) adapting the NIDS's operation to support coordinating its low-level analysis rather than just aggregating alerts; and (iii) validating that the cluster produces sound results. Prototypes of our NIDS cluster now operate at the Lawrence Berkeley National Laboratory and the University of California at Berkeley. In both environments the clusters greatly enhance the power of the network security monitoring.

  6. Interfacing Hardware Accelerators to a Time-Division Multiplexing Network-on-Chip

    DEFF Research Database (Denmark)

    Pezzarossa, Luca; Sørensen, Rasmus Bo; Schoeberl, Martin

    2015-01-01

    This paper addresses the integration of stateless hardware accelerators into time-predictable multi-core platforms based on time-division multiplexing networks-on-chip. Stateless hardware accelerators, like floating-point units, are typically attached as co-processors to individual processors in the platform. Our design takes a different approach and connects the hardware accelerators to the network-on-chip in the same way as processor cores. Each processor that uses a hardware accelerator is assigned a virtual channel for sending instructions to the hardware accelerator and a virtual channel...

  7. Generating clock signals for a cycle accurate, cycle reproducible FPGA based hardware accelerator

    Science.gov (United States)

    Asaad, Sameth W.; Kapur, Mohit

    2016-01-05

    A method, system and computer program product are disclosed for generating clock signals for a cycle accurate FPGA based hardware accelerator used to simulate operations of a device-under-test (DUT). In one embodiment, the DUT includes multiple device clocks generating multiple device clock signals at multiple frequencies and at a defined frequency ratio; and the FPGA hardware accelerator includes multiple accelerator clocks generating multiple accelerator clock signals to operate the FPGA hardware accelerator to simulate the operations of the DUT. In one embodiment, operations of the DUT are mapped to the FPGA hardware accelerator, and the accelerator clock signals are generated at multiple frequencies and at the defined frequency ratio of the frequencies of the multiple device clocks, to maintain cycle accuracy between the DUT and the FPGA hardware accelerator. In an embodiment, the FPGA hardware accelerator may be used to control the frequencies of the multiple device clocks.

  8. GASPRNG: GPU accelerated scalable parallel random number generator library

    Science.gov (United States)

    Gao, Shuang; Peterson, Gregory D.

    2013-04-01

    Graphics processors represent a promising technology for accelerating computational science applications. Many computational science applications require fast and scalable random number generation with good statistical properties, so they use the Scalable Parallel Random Number Generators library (SPRNG). We present the GPU Accelerated SPRNG library (GASPRNG) to accelerate SPRNG in GPU-based high performance computing systems. GASPRNG includes code for a host CPU and CUDA code for execution on NVIDIA graphics processing units (GPUs) along with a programming interface to support various usage models for pseudorandom numbers and computational science applications executing on the CPU, GPU, or both. This paper describes the implementation approach used to produce high performance and also describes how to use the programming interface. The programming interface allows a user to use GASPRNG the same way as SPRNG on traditional serial or parallel computers as well as to develop tightly coupled programs executing primarily on the GPU. We also describe how to install GASPRNG and use it. To help illustrate linking with GASPRNG, various demonstration codes are included for the different usage models. GASPRNG on a single GPU shows up to 280x speedup over SPRNG on a single CPU core and is able to scale for larger systems in the same manner as SPRNG. Because GASPRNG generates the same streams of pseudorandom numbers as SPRNG, users can be confident about the quality of GASPRNG for scalable computational science applications. Catalogue identifier: AEOI_v1_0. Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEOI_v1_0.html. Program obtainable from: CPC Program Library, Queen’s University, Belfast, N. Ireland. Licensing provisions: UTK license. No. of lines in distributed program, including test data, etc.: 167900. No. of bytes in distributed program, including test data, etc.: 1422058. Distribution format: tar.gz. Programming language: C and CUDA. Computer: Any PC or

  9. Optimizing memory-bound SYMV kernel on GPU hardware accelerators

    KAUST Repository

    Abdelfattah, Ahmad

    2013-01-01

    Hardware accelerators are becoming ubiquitous in high performance scientific computing. They are capable of delivering an unprecedented level of concurrent execution contexts. High-level programming language extensions (e.g., CUDA) and profiling tools (e.g., PAPI-CUDA, CUDA Profiler) are paramount to improve productivity, while effectively exploiting the underlying hardware. We present an optimized numerical kernel for computing the symmetric matrix-vector product on nVidia Fermi GPUs. Due to its inherent memory-bound nature, this kernel is very critical in the tridiagonalization of a symmetric dense matrix, which is a preprocessing step to calculate the eigenpairs. Using a novel design to address the irregular memory accesses by hiding latency and increasing bandwidth, our preliminary asymptotic results show 3.5x and 2.5x speedups over the similar CUBLAS 4.0 kernel, and 7-8% and 30% improvement over the Matrix Algebra on GPU and Multicore Architectures (MAGMA) library in single and double precision arithmetic, respectively. © 2013 Springer-Verlag.
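
    For readers unfamiliar with SYMV, the following is a plain reference sketch of the symmetric matrix-vector product y = A x that reads only the upper triangle of A. It is not the GPU kernel from this record and says nothing about its memory-access design; the function name and the list-of-lists storage are assumptions for illustration.

```python
def symv_upper(a, x):
    """Compute y = A @ x for symmetric A, reading only entries a[i][j] with j >= i."""
    n = len(x)
    y = [0.0] * n
    for i in range(n):
        y[i] += a[i][i] * x[i]
        for j in range(i + 1, n):
            aij = a[i][j]
            # One stored entry stands for both A[i][j] and A[j][i].
            y[i] += aij * x[j]
            y[j] += aij * x[i]
    return y
```

    Each stored entry is loaded once and used for two multiply-adds, so the arithmetic per byte of matrix data is low. That low ratio is what "memory-bound" refers to in the abstract.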

  10. Final Scientific/Technical Report for "Enabling Exascale Hardware and Software Design through Scalable System Virtualization"

    Energy Technology Data Exchange (ETDEWEB)

    Dinda, Peter August [Northwestern Univ., Evanston, IL (United States)]

    2015-03-17

    This report describes the activities, findings, and products of the Northwestern University component of the "Enabling Exascale Hardware and Software Design through Scalable System Virtualization" project. The purpose of this project has been to extend the state of the art of systems software for high-end computing (HEC) platforms, and to use systems software to better enable the evaluation of potential future HEC platforms, for example exascale platforms. Such platforms, and their systems software, have the goal of providing scientific computation at new scales, thus enabling new research in the physical sciences and engineering. Over time, the innovations in systems software for such platforms also become applicable to more widely used computing clusters, data centers, and clouds. This was a five-institution project, centered on the Palacios virtual machine monitor (VMM) systems software, a project begun at Northwestern, and originally developed in a previous collaboration between Northwestern University and the University of New Mexico. In this project, Northwestern (including via our subcontract to the University of Pittsburgh) contributed to the continued development of Palacios, along with other team members. We took the leadership role in (1) continued extension of support for emerging Intel and AMD hardware, (2) integration and performance enhancement of overlay networking, (3) connectivity with architectural simulation, (4) binary translation, and (5) support for modern Non-Uniform Memory Access (NUMA) hosts and guests. We also took a supporting role in support for specialized hardware for I/O virtualization, profiling, configurability, and integration with configuration tools. The efforts we led (1-5) were largely successful and executed as expected, with code and papers resulting from them. The project demonstrated the feasibility of a virtualization layer for HEC computing, similar to such layers for cloud or datacenter computing. For effort (3

  11. A Hardware-Efficient Scalable Spike Sorting Neural Signal Processor Module for Implantable High-Channel-Count Brain Machine Interfaces.

    Science.gov (United States)

    Yang, Yuning; Boling, Sam; Mason, Andrew J

    2017-08-01

    Next-generation brain machine interfaces demand a high-channel-count neural recording system to wirelessly monitor activities of thousands of neurons. A hardware efficient neural signal processor (NSP) is greatly desirable to ease the data bandwidth bottleneck for a fully implantable wireless neural recording system. This paper demonstrates a complete multichannel spike sorting NSP module that incorporates all of the necessary spike detector, feature extractor, and spike classifier blocks. To meet high-channel-count and implantability demands, each block was designed to be highly hardware efficient and scalable while sharing resources efficiently among multiple channels. To process multiple channels in parallel, scalability analysis was performed, and the utilization of each block was optimized according to its input data statistics and the power, area and/or speed of each block. Based on this analysis, a prototype 32-channel spike sorting NSP scalable module was designed and tested on an FPGA using synthesized datasets over a wide range of signal to noise ratios. The design was mapped to 130 nm CMOS to achieve 0.75 μW power and 0.023 mm² area consumption per channel based on post synthesis simulation results, which permits scalability of digital processing to 690 channels on a 4×4 mm² electrode array.

  12. sBWT: memory efficient implementation of the hardware-acceleration-friendly Schindler transform for the fast biological sequence mapping.

    Science.gov (United States)

    Chang, Chia-Hua; Chou, Min-Te; Wu, Yi-Chung; Hong, Ting-Wei; Li, Yun-Lung; Yang, Chia-Hsiang; Hung, Jui-Hung

    2016-11-15

    The Full-text index in Minute space (FM-index) derived from the Burrows-Wheeler transform (BWT) is broadly used for fast string matching in large genomes or a huge set of sequencing reads. Several graphics processing unit (GPU) accelerated aligners based on the FM-index have been proposed recently; however, the construction of the index is still handled by the central processing unit (CPU), only parallelized at the data level (e.g. by performing blockwise suffix sorting on the GPU), or not scalable for large genomes. To fulfill the need for a more practical, hardware-parallelizable indexing and matching approach, we herein propose sBWT based on a BWT variant (i.e. Schindler transform) that can be built with highly simplified hardware-acceleration-friendly algorithms and still suffices for accurate and fast string matching in repetitive references. In our tests, the implementation achieves significant speedups in indexing and searching compared with other BWT-based tools and can be applied to a variety of domains. sBWT is implemented in C++ with CPU-only and GPU-accelerated versions. sBWT is open-source software and is available at http://jhhung.github.io/sBWT/. Supplementary information: Supplementary data are available at Bioinformatics online. Contact: chyee@ntu.edu.tw or jhhung@nctu.edu.tw (also juihunghung@gmail.com). © The Author 2016. Published by Oxford University Press.
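
    As a rough illustration of the Schindler transform named in the abstract above, the sketch below sorts the cyclic rotations of a text by their first k symbols only and breaks ties by text position, which is what distinguishes it from the full BWT. The function name `schindler_transform`, the `$` sentinel, and the toy input are assumptions made for illustration; this is not the sBWT implementation.

```python
def schindler_transform(text: str, k: int) -> str:
    """Order-k sort transform: sort cyclic rotations by their first k symbols only.

    Assumes 1 <= k <= len(text). With k == len(text) the result equals the BWT.
    """
    n = len(text)
    doubled = text + text  # lets us slice cyclic rotations without modular arithmetic
    # sorted() is stable, so rotations with equal k-symbol prefixes keep text order.
    order = sorted(range(n), key=lambda i: doubled[i:i + k])
    # Emit the symbol that cyclically precedes each rotation (the "last column").
    return "".join(doubled[i + n - 1] for i in order)


if __name__ == "__main__":
    print(schindler_transform("banana$", 2))  # bounded-context sort
    print(schindler_transform("banana$", 7))  # full-context sort, i.e. the BWT
```

    Because only a fixed-length prefix is compared, each comparison has bounded cost, which is one plausible reason such a transform maps well onto parallel hardware.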

  13. Energy Efficient FPGA based Hardware Accelerators for Financial Applications

    DEFF Research Database (Denmark)

    Kenn Toft, Jakob; Nannarelli, Alberto

    2014-01-01

    Field Programmable Gate Array (FPGA) based accelerators are well suited to implementing application-specific processors that use uncommon operations or number systems. In this work, we design FPGA-based accelerators for two financial computations with different characteristics and we compare...

  14. FPGA Implementation of Decimal Processors for Hardware Acceleration

    DEFF Research Database (Denmark)

    Borup, Nicolas; Dindorp, Jonas; Nannarelli, Alberto

    2011-01-01

    Applications in non-conventional number systems can benefit from accelerators implemented on reconfigurable platforms, such as Field Programmable Gate-Arrays (FPGAs). In this paper, we show that applications requiring decimal operations, such as the ones necessary in accounting or financial trans...... execution on the CPU of the hosting computer....

  15. Accelerating epistasis analysis in human genetics with consumer graphics hardware

    Directory of Open Access Journals (Sweden)

    Cancare Fabio

    2009-07-01

    Background: Human geneticists are now capable of measuring more than one million DNA sequence variations from across the human genome. The new challenge is to develop computationally feasible methods capable of analyzing these data for associations with common human disease, particularly in the context of epistasis. Epistasis describes the situation where multiple genes interact in a complex non-linear manner to determine an individual's disease risk and is thought to be ubiquitous for common diseases. Multifactor Dimensionality Reduction (MDR) is an algorithm capable of detecting epistasis. An exhaustive analysis with MDR is often computationally expensive, particularly for high order interactions. This challenge has previously been met with parallel computation and expensive hardware. The option we examine here exploits commodity hardware designed for computer graphics. In modern computers, Graphics Processing Units (GPUs) have more memory bandwidth and computational capability than Central Processing Units (CPUs) and are well suited to this problem. Advances in the video game industry have led to an economy of scale creating a situation where these powerful components are readily available at very low cost. Here we implement and evaluate the performance of the MDR algorithm on GPUs. Of primary interest are the time required for an epistasis analysis and the price to performance ratio of available solutions. Findings: We found that using MDR on GPUs consistently increased performance per machine over both a feature rich Java software package and a C++ cluster implementation. A GPU workstation running a GPU implementation reduces computation time by a factor of 160 compared to an 8-core workstation running the Java implementation on CPUs. This GPU workstation performs similarly to 150 cores running an optimized C++ implementation on a Beowulf cluster. Furthermore this GPU system provides extremely cost effective

  16. Tuple spaces in hardware for accelerated implicit routing

    Energy Technology Data Exchange (ETDEWEB)

    Baker, Zachary Kent [Los Alamos National Laboratory; Tripp, Justin [Los Alamos National Laboratory

    2010-12-01

    Organizing and optimizing data objects on networks with support for data migration and failing nodes is a complicated problem to handle as systems grow. The goal of this work is to demonstrate that high levels of speedup can be achieved by moving responsibility for finding, fetching, and staging data into an FPGA-based network card. We present a system for implicit routing of data via FPGA-based network cards. In this system, data structures are requested by name, and the network of FPGAs finds the data within the network and relays the structure to the requester. This is achieved through successive examination of hardware hash tables implemented in the FPGA. By avoiding software stacks between nodes, the data is quickly fetched entirely through FPGA-FPGA interaction. The performance of this system is orders of magnitude faster than software implementations due to the improved speed of the hash tables and lowered latency between the network nodes.

  17. A High Performance QDWH-SVD Solver using Hardware Accelerators

    KAUST Repository

    Sukkari, Dalal E.

    2015-04-08

    This paper describes a new high performance implementation of the QR-based Dynamically Weighted Halley Singular Value Decomposition (QDWH-SVD) solver on multicore architecture enhanced with multiple GPUs. The standard QDWH-SVD algorithm was introduced by Nakatsukasa and Higham (SIAM SISC, 2013) and combines three successive computational stages: (1) the polar decomposition calculation of the original matrix using the QDWH algorithm, (2) the symmetric eigendecomposition of the resulting polar factor to obtain the singular values and the right singular vectors and (3) the matrix-matrix multiplication to get the associated left singular vectors. A comprehensive test suite highlights the numerical robustness of the QDWH-SVD solver. Although it performs up to two times more flops when computing all singular vectors compared to the standard SVD solver algorithm, our new high performance implementation on a single GPU results in up to 3.8x improvements for asymptotic matrix sizes, compared to the equivalent routines from existing state-of-the-art open-source and commercial libraries. However, when only singular values are needed, QDWH-SVD is penalized by performing up to 14 times more flops. The singular-value-only implementation of QDWH-SVD on a single GPU can still run up to 18% faster than the best existing equivalent routines. Integrating mixed precision techniques in the solver can additionally provide up to 40% improvement at the price of losing a few digits of accuracy, compared to the full double precision floating point arithmetic. We further leverage the single GPU QDWH-SVD implementation by introducing the first multi-GPU SVD solver to study the scalability of the QDWH-SVD framework.

  18. Final Report: Enabling Exascale Hardware and Software Design through Scalable System Virtualization

    Energy Technology Data Exchange (ETDEWEB)

    Bridges, Patrick G.

    2015-02-01

    In this grant, we enhanced the Palacios virtual machine monitor to increase its scalability and suitability for addressing exascale system software design issues. This included a wide range of research on core Palacios features, large-scale system emulation, fault injection, performance monitoring, and VMM extensibility. This research resulted in a large number of high-impact publications in well-known venues, the support of a number of students, and the graduation of two Ph.D. students and one M.S. student. In addition, our enhanced version of the Palacios virtual machine monitor has been adopted as a core element of the Hobbes operating system under active DOE-funded research and development.

  19. Evaluation of GNU Radio Platform Enhanced for Hardware Accelerated Radio Design

    OpenAIRE

    Karve, Mrudula Prabhakar

    2010-01-01

    The advent of software radio technology has enabled radio developers to design and implement radios with great ease and flexibility. Software radios are effective in experimentation and development of radio designs. However, they have limitations when it comes to high-speed, high-throughput designs. This limitation can be overcome by introducing a hardware element to the software radio platform. Enhancing GNU Radio for Hardware Accelerated Radio Design project implements suc...

  20. Hardware accelerator of convolution with exponential function for image processing applications

    Science.gov (United States)

    Panchenko, Ivan; Bucha, Victor

    2015-12-01

    In this paper we describe a Hardware Accelerator (HWA) for fast recursive approximation of separable convolution with an exponential function. This filter can be used in many Image Processing (IP) applications, e.g. depth-dependent image blur, image enhancement and disparity estimation. We have adapted the RTL implementation of this filter to provide maximum throughput within the constraints of the required memory bandwidth and hardware resources, yielding a power-efficient VLSI implementation.
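
    To make the idea of a recursive approximation concrete, the sketch below convolves a 1-D signal with the kernel a^|k| using one causal and one anti-causal first-order recursion, and includes a direct O(n²) reference for comparison. The function names, the decay parameter `a`, and the zero-boundary handling are assumptions chosen for illustration; the abstract does not specify the accelerator's exact filter form.

```python
def exp_filter_recursive(x, a):
    """Convolve x with a**|k| using two first-order recursive passes (O(n))."""
    n = len(x)
    fwd = [0.0] * n
    bwd = [0.0] * n
    acc = 0.0
    for i in range(n):              # causal pass: sum over k >= 0 of a**k * x[i-k]
        acc = x[i] + a * acc
        fwd[i] = acc
    acc = 0.0
    for i in range(n - 1, -1, -1):  # anti-causal pass: sum over k >= 0 of a**k * x[i+k]
        acc = x[i] + a * acc
        bwd[i] = acc
    # Both passes include the k = 0 term, so subtract x once.
    return [f + b - v for f, b, v in zip(fwd, bwd, x)]


def exp_filter_direct(x, a):
    """Reference implementation: explicit convolution with a**|k| (O(n^2))."""
    n = len(x)
    return [sum(a ** abs(i - j) * x[j] for j in range(n)) for i in range(n)]


if __name__ == "__main__":
    signal = [0.0, 1.0, 0.0, 0.0, 2.0, 0.0]
    print(exp_filter_recursive(signal, 0.5))
    print(exp_filter_direct(signal, 0.5))
```

    For a separable 2-D filter, the same 1-D routine would be applied along every row and then along every column. The per-sample cost is independent of the kernel width, which is what makes the recursive form attractive under tight memory-bandwidth limits.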

  1. Towards Batched Linear Solvers on Accelerated Hardware Platforms

    Energy Technology Data Exchange (ETDEWEB)

    Haidar, Azzam [University of Tennessee (UT); Dong, Tingzing Tim [University of Tennessee (UT); Tomov, Stanimire [University of Tennessee (UT); Dongarra, Jack J [ORNL

    2015-01-01

    As hardware evolves, an increasingly effective approach to developing energy efficient, high-performance solvers is to design them to work on many small and independent problems. Indeed, many applications already need this functionality, especially for GPUs, which are known to be currently about four to five times more energy efficient than multicore CPUs for every floating-point operation. In this paper, we describe the development of the main one-sided factorizations (LU, QR, and Cholesky) that are needed for a set of small dense matrices to work in parallel. We refer to such algorithms as batched factorizations. Our approach is based on representing the algorithms as a sequence of batched BLAS routines for GPU-contained execution. Note that this is similar in functionality to the LAPACK and the hybrid MAGMA algorithms for large-matrix factorizations. But it is different from a straightforward approach, whereby each of the GPU's streaming multiprocessors factorizes a single problem at a time. We illustrate how our performance analysis together with the profiling and tracing tools guided the development of batched factorizations to achieve up to 2-fold speedup and 3-fold better energy efficiency compared to our highly optimized batched CPU implementations based on the MKL library on a two-socket Intel Sandy Bridge server. Compared to a batched LU factorization featured in NVIDIA's CUBLAS library for GPUs, we achieve up to 2.5-fold speedup on the K40 GPU.

  2. Hardware Accelerator for the Multifractal Analysis of DNA Sequences.

    Science.gov (United States)

    Duarte-Sanchez, Jorge E; Velasco-Medina, Jaime; Moreno, Pedro A

    2017-07-24

    Multifractal analysis has made it possible to quantify the genetic variability and non-linear stability along the human genome sequence. It has some implications in explaining several genetic diseases given by some chromosome abnormalities, among other genetic particularities. The multifractal analysis of a genome is carried out by dividing the complete DNA sequence into smaller fragments and calculating the generalized dimension spectrum of each fragment using the chaos game representation and the box-counting method. This is a time-consuming process because it involves the processing of large data sets using floating-point representation. In order to reduce the computation time, we designed an application-specific processor, here called the multifractal processor, which is based on our proposed hardware-oriented algorithm for efficiently calculating the generalized dimension spectrum of DNA sequences. The multifractal processor was implemented on a low-cost SoC-FPGA and was verified by processing a complete human genome. The execution time and numeric results of the multifractal processor were compared with the results obtained from the software implementation executed on a 20-core workstation, achieving a speedup of 2.6x and an average error of 0.0003%.
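
    The two building blocks named in the abstract above, the chaos game representation and box counting, can be sketched in a few lines. The corner assignment for A, C, G and T, the starting point, and the function names below are assumptions made for illustration (several corner conventions are in use), and the sketch stops at counting occupied boxes rather than computing the full generalized dimension spectrum.

```python
# Hypothetical corner convention for the unit square; other layouts are common.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def chaos_game_points(seq):
    """Map a DNA string to points: each base moves halfway toward its corner."""
    x, y = 0.5, 0.5  # start at the centre of the unit square
    points = []
    for base in seq:
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def occupied_boxes(points, level):
    """Count boxes of side 2**-level that contain at least one point."""
    side = 2 ** level
    return len({(int(px * side), int(py * side)) for px, py in points})


if __name__ == "__main__":
    pts = chaos_game_points("ACGTACGGTTAACC")
    for level in range(1, 5):
        print(level, occupied_boxes(pts, level))
```

    A box-counting dimension estimate would follow from the slope of log(occupied boxes) against log(2^level) over a range of levels; the generalized spectrum additionally weights each box by the fraction of points it holds.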

  3. Accelerating Popular Tomographic Reconstruction Algorithms on Commodity PC Graphics Hardware

    Science.gov (United States)

    Xu, Fang; Mueller, K.

    2005-06-01

    The task of reconstructing an object from its projections via tomographic methods is a time-consuming process due to the vast complexity of the data. For this reason, manufacturers of equipment for medical computed tomography (CT) rely mostly on special application-specific integrated circuits (ASICs) to obtain the fast reconstruction times required in clinical settings. Although modern CPUs have gained sufficient power in recent years to be competitive for two-dimensional (2D) reconstruction, this is not the case for three-dimensional (3D) reconstructions, especially not when iterative algorithms must be applied. The recent evolution of commodity PC computer graphics boards (GPUs) has the potential to change this picture in a very dramatic way. In this paper we will show how the new floating point GPUs can be exploited to perform both analytical and iterative reconstruction from X-ray and functional imaging data. For this purpose, we decompose three popular 3D reconstruction algorithms (Feldkamp filtered backprojection, the simultaneous algebraic reconstruction technique, and expectation maximization) into a common set of base modules, which all can be executed on the GPU and their output linked internally. Visualization of the reconstructed object is easily achieved since the object already resides in the graphics hardware, allowing one to run a visualization module at any time to view the reconstruction results. Our implementation allows speedups of over an order of magnitude with respect to CPU implementations, at comparable image quality.

  4. Logic and fault simulation on the Mach1000 hardware accelerator

    Energy Technology Data Exchange (ETDEWEB)

    Hofstadler, P.

    1988-03-01

    This document describes an interface between Mentor Graphics brand workstations and the Mach1000 simulation accelerator from Silicon Solutions (Zycad) Corp. It discusses logic and fault simulation concepts, system administration, simulation models, hierarchical fault reporting, and hierarchical fault analysis. The tools that are presented perform simulations (MX), report fault simulation results hierarchically (Fault←Analyze), and schedule jobs to share the Mach1000 (SSD). In addition, a method of hierarchical fault analysis is developed that allows the user to determine whether a fault is undetectable or at best possibly detectable (divergent). The tools that implement the hierarchical fault analysis method are also presented. Numerous example problems are worked throughout to clarify and demonstrate the concepts that are being discussed.

  5. Accelerating string set matching in FPGA hardware for bioinformatics research.

    Science.gov (United States)

    Dandass, Yoginder S; Burgess, Shane C; Lawrence, Mark; Bridges, Susan M

    2008-04-15

    This paper describes techniques for accelerating the performance of the string set matching problem with particular emphasis on applications in computational proteomics. The process of matching peptide sequences against a genome translated in six reading frames is part of a proteogenomic mapping pipeline that is used as a case-study. The Aho-Corasick algorithm is adapted for execution in field programmable gate array (FPGA) devices in a manner that optimizes space and performance. In this approach, the traditional Aho-Corasick finite state machine (FSM) is split into smaller FSMs, operating in parallel, each of which matches up to 20 peptides in the input translated genome. Each of the smaller FSMs is further divided into five simpler FSMs such that each simple FSM operates on a single bit position in the input (five bits are sufficient for representing all amino acids and special symbols in protein sequences). This bit-split organization of the Aho-Corasick implementation enables efficient utilization of the limited random access memory (RAM) resources available in typical FPGAs. The use of on-chip RAM as opposed to FPGA logic resources for FSM implementation also enables rapid reconfiguration of the FPGA without the place and routing delays associated with complex digital designs. Experimental results show storage efficiencies of over 80% for several data sets. Furthermore, the FPGA implementation executing at 100 MHz is nearly 20 times faster than an implementation of the traditional Aho-Corasick algorithm executing on a 2.67 GHz workstation.
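
    For readers unfamiliar with the baseline, the following is a minimal software sketch of the traditional (unsplit) Aho-Corasick automaton that the abstract above takes as its starting point. It does not implement the bit-split FPGA organization; the function names and the toy patterns are assumptions chosen for illustration.

```python
from collections import deque


def build_automaton(patterns):
    """Build goto, failure and output tables for a set of patterns."""
    goto, fail, out = [{}], [0], [[]]
    for pattern in patterns:                 # phase 1: insert patterns into a trie
        state = 0
        for ch in pattern:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pattern)
    queue = deque(goto[0].values())          # phase 2: breadth-first failure links
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]  # inherit matches ending here
            queue.append(nxt)
    return goto, fail, out


def search(text, automaton):
    """Return (start_index, pattern) for every occurrence, in one pass over text."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pattern in out[state]:
            hits.append((i - len(pattern) + 1, pattern))
    return hits


if __name__ == "__main__":
    automaton = build_automaton(["he", "she", "his", "hers"])
    print(search("ushers", automaton))
```

    In the bit-split design described in the abstract, one such state machine would be replaced by five simpler machines, each consuming a single bit of the 5-bit symbol code, with their partial match results combined afterwards.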

  6. Accelerating String Set Matching in FPGA Hardware for Bioinformatics Research

    Directory of Open Access Journals (Sweden)

    Burgess Shane C

    2008-04-01

    Background: This paper describes techniques for accelerating the performance of the string set matching problem with particular emphasis on applications in computational proteomics. The process of matching peptide sequences against a genome translated in six reading frames is part of a proteogenomic mapping pipeline that is used as a case-study. The Aho-Corasick algorithm is adapted for execution in field programmable gate array (FPGA) devices in a manner that optimizes space and performance. In this approach, the traditional Aho-Corasick finite state machine (FSM) is split into smaller FSMs, operating in parallel, each of which matches up to 20 peptides in the input translated genome. Each of the smaller FSMs is further divided into five simpler FSMs such that each simple FSM operates on a single bit position in the input (five bits are sufficient for representing all amino acids and special symbols in protein sequences). Results: This bit-split organization of the Aho-Corasick implementation enables efficient utilization of the limited random access memory (RAM) resources available in typical FPGAs. The use of on-chip RAM as opposed to FPGA logic resources for FSM implementation also enables rapid reconfiguration of the FPGA without the place and routing delays associated with complex digital designs. Conclusion: Experimental results show storage efficiencies of over 80% for several data sets. Furthermore, the FPGA implementation executing at 100 MHz is nearly 20 times faster than an implementation of the traditional Aho-Corasick algorithm executing on a 2.67 GHz workstation.

  7. A compact linear accelerator based on a scalable microelectromechanical-system RF-structure

    Science.gov (United States)

    Persaud, A.; Ji, Q.; Feinberg, E.; Seidl, P. A.; Waldron, W. L.; Schenkel, T.; Lal, A.; Vinayakumar, K. B.; Ardanuc, S.; Hammer, D. A.

    2017-06-01

    A new approach for a compact radio-frequency (RF) accelerator structure is presented. The new accelerator architecture is based on the Multiple Electrostatic Quadrupole Array Linear Accelerator (MEQALAC) structure that was first developed in the 1980s. The MEQALAC utilized RF resonators producing the accelerating fields and providing for higher beam currents through parallel beamlets focused using arrays of electrostatic quadrupoles (ESQs). While the early work obtained ESQs with lateral dimensions on the order of a few centimeters, using a printed circuit board (PCB), we reduce the characteristic dimension to the millimeter regime, while massively scaling up the potential number of parallel beamlets. Using scalable microelectromechanical systems (MEMS) fabrication approaches, we are working on further reducing the characteristic dimension to the sub-millimeter regime. The technology is based on RF-acceleration components and ESQs implemented in the PCB or silicon wafers where each beamlet passes through beam apertures in the wafer. The complete accelerator is then assembled by stacking these wafers. This approach has the potential for fast and inexpensive batch fabrication of the components and flexibility in system design for application specific beam energies and currents. For prototyping the accelerator architecture, the components have been fabricated using the PCB. In this paper, we present proof of concept results of the principal components using the PCB: RF acceleration and ESQ focusing. Ongoing developments on implementing components in silicon and scaling of the accelerator technology to high currents and beam energies are discussed.

  8. Protein-protein docking on hardware accelerators: comparison of GPU and MIC architectures

    Science.gov (United States)

    2015-01-01

    Background: Hardware accelerators will provide solutions to computationally complex problems in bioinformatics. However, the effect of acceleration depends on the nature of the application, thus selection of an appropriate accelerator requires some consideration. Results: In the present study, we compared the effects of acceleration using graphics processing unit (GPU) and many integrated core (MIC) on the speed of fast Fourier transform (FFT)-based protein-protein docking calculation. The GPU implementation performed the protein-protein docking calculations approximately five times faster than the MIC offload mode implementation. The MIC native mode implementation has an advantage in implementation cost. However, the performance was worse with larger protein pairs because of memory limitations. Conclusion: The results suggest that GPU is more suitable than MIC for accelerating FFT-based protein-protein docking applications. PMID:25707855

  9. Design of Application-Specific Instructions and Hardware Accelerator for Reed-Solomon Codecs

    Directory of Open Access Journals (Sweden)

    Lee Jaesung

    2003-01-01

    This paper presents new application-specific digital signal processor (ASDSP) instructions and their hardware accelerator to efficiently implement Reed-Solomon (RS) encoding and decoding, which is one of the most widely used forward error control (FEC) algorithms. The proposed ASDSP architecture can implement various programmable primitive polynomials, and thus, hardwired RS codecs can be replaced. The new instructions and their hardware accelerator perform Galois field (GF) operations using the proposed GF multiplier and adder. Therefore, the proposed digital signal processor (DSP) architecture can significantly reduce the number of clock cycles compared with existing DSP chips. The proposed GF multiplier was implemented using the Faraday 0.25 µm standard cell library and it can perform RS decoding at a rate up to 228.1 Mbps at 130 MHz.
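
    The Galois field operations mentioned above are simple to state in software, which helps show why a dedicated multiplier saves clock cycles. The sketch below is a generic shift-and-XOR multiplier over GF(2^m) with the reduction polynomial passed as a parameter, mirroring the idea of a programmable polynomial. The function names and the default polynomial 0x11d (a common choice for Reed-Solomon over GF(2^8)) are assumptions; the abstract does not state which polynomials or which multiplier structure the authors used.

```python
def gf_add(a: int, b: int) -> int:
    """Addition in GF(2^m) is a bitwise XOR."""
    return a ^ b


def gf_mul(a: int, b: int, poly: int = 0x11D, m: int = 8) -> int:
    """Multiply a and b in GF(2^m), reducing by the given polynomial.

    poly includes the x^m term, e.g. 0x11d = x^8 + x^4 + x^3 + x^2 + 1.
    """
    result = 0
    while b:
        if b & 1:              # add the current multiple of a if this bit of b is set
            result ^= a
        b >>= 1
        a <<= 1                # multiply a by x
        if a & (1 << m):       # degree reached m: reduce modulo poly
            a ^= poly
    return result


if __name__ == "__main__":
    print(hex(gf_mul(0x02, 0x80)))               # x * x^7 wraps around via 0x11d
    print(hex(gf_mul(0x57, 0x83, poly=0x11B)))   # same routine, different polynomial
```

    A general-purpose DSP needs a loop like this (or table lookups) for every symbol product, whereas a single-cycle GF multiply instruction collapses it into one operation.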

  10. Hardware design to accelerate PNG encoder for binary mask compression on FPGA

    Science.gov (United States)

    Kachouri, Rostom; Akil, Mohamed

    2015-02-01

    PNG (Portable Network Graphics) is a lossless compression method for real-world pictures. Since its specification, it continues to attract the interest of the image processing community. Indeed, PNG is an extensible file format for portable and well-compressed storage of raster images. In addition, it supports all of Black and White (binary mask), grayscale, indexed-color, and truecolor images. Within the framework of the Demat+ project, which intends to propose a complete solution for storage and retrieval of scanned documents, we address in this paper a hardware design to accelerate the PNG encoder for binary mask compression on FPGA. For this, an optimized architecture is proposed as part of a hybrid software and hardware co-operating system. For its evaluation, the newly designed PNG IP has been implemented on the ALTERA Arria II GX EP2AGX125EF35 FPGA. The experimental results show a good match between the achieved compression ratio, the computational cost and the used hardware resources.

  11. Fuzzy Logic Based Hardware Accelerator with Partially Reconfigurable Defuzzification Stage for Image Edge Detection

    Directory of Open Access Journals (Sweden)

    Aous H. Kurdi

    2017-01-01

    In this paper, the design and the implementation of a pipelined hardware accelerator based on a fuzzy logic approach for an edge detection system are presented. The fuzzy system comprises a preprocessing stage, a fuzzifier with four fuzzy inputs, an inference system with seven rules, and a defuzzification stage delivering a single crisp output, which represents the intensity value of a pixel in the output image. The hardware accelerator consists of seven stages with one clock cycle latency per stage. The defuzzification stage was implemented using three different defuzzification methods. These methods are the mean of maxima, the smallest of maxima, and the largest of maxima. The defuzzification modules are interchangeable while the system runs using partial reconfiguration design methodology. System development was carried out using Vivado High-Level Synthesis, Vivado Design Suite, Vivado Simulator, and a set of Xilinx 7000 FPGA devices. Depending upon the speed grade of the device that is employed, the system can operate at a frequency range from 83 MHz to 125 MHz. Its peak performance is up to 58 high definition frames per second. A comparison of this system’s performance and its software counterpart shows a significant speedup on the order of a hundred thousand times.
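
    The three defuzzification methods listed above differ only in how they pick a crisp value from the points where the aggregated membership function peaks. The sketch below shows them on a sampled output universe; the function names, the sample grid of pixel intensities, and the membership values are hypothetical and serve only to illustrate the definitions.

```python
def maxima(xs, mu):
    """Return the sample points at which the membership function attains its maximum."""
    peak = max(mu)
    return [x for x, m in zip(xs, mu) if m == peak]


def smallest_of_maxima(xs, mu):
    """Crisp output = smallest point with maximal membership."""
    return min(maxima(xs, mu))


def largest_of_maxima(xs, mu):
    """Crisp output = largest point with maximal membership."""
    return max(maxima(xs, mu))


def mean_of_maxima(xs, mu):
    """Crisp output = average of all points with maximal membership."""
    pts = maxima(xs, mu)
    return sum(pts) / len(pts)


if __name__ == "__main__":
    intensities = [0, 64, 128, 192, 255]       # hypothetical output universe
    membership = [0.2, 0.8, 0.8, 0.8, 0.1]     # hypothetical aggregated fuzzy output
    print(smallest_of_maxima(intensities, membership))
    print(mean_of_maxima(intensities, membership))
    print(largest_of_maxima(intensities, membership))
```

    Since all three share the same peak search and differ only in the final reduction, swapping them at run time, as the partially reconfigurable stage does, changes a small part of the datapath.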

  12. SensoTube: A Scalable Hardware Design Architecture for Wireless Sensors and Actuators Networks Nodes in the Agricultural Domain

    Science.gov (United States)

    Piromalis, Dimitrios; Arvanitis, Konstantinos

    2016-01-01

    Wireless Sensor and Actuators Networks (WSANs) constitute one of the most challenging technologies with tremendous socio-economic impact for the next decade. Functionally and energy optimized hardware systems and development tools are perhaps the most critical facet of this technology for the achievement of such prospects. Especially in the area of agriculture, where the hostile operating environment adds to the general technological and technical issues, reliable and robust WSAN systems are mandatory. This paper focuses on the hardware design architectures of the WSANs for real-world agricultural applications. It presents the available alternatives in hardware design and identifies their difficulties and problems for real-life implementations. The paper introduces SensoTube, a new WSAN hardware architecture, which is proposed as a solution to the various existing design constraints of WSANs. The establishment of the proposed architecture is based firstly on an abstraction approach in the functional requirements context and secondly on the standardization of subsystem connectivity, in order to allow for an open, expandable, flexible, reconfigurable, energy optimized, reliable and robust hardware system. The SensoTube implementation reference model together with its encapsulation design and installation are analyzed and presented in detail. Furthermore, as a proof of concept, certain use cases have been studied in order to demonstrate the benefits of migrating existing designs based on the available open-source hardware platforms to the SensoTube architecture. PMID:27527180

  13. SensoTube: A Scalable Hardware Design Architecture for Wireless Sensors and Actuators Networks Nodes in the Agricultural Domain.

    Science.gov (United States)

    Piromalis, Dimitrios; Arvanitis, Konstantinos

    2016-08-04

    Wireless Sensor and Actuators Networks (WSANs) constitute one of the most challenging technologies with tremendous socio-economic impact for the next decade. Functionally and energy optimized hardware systems and development tools are perhaps the most critical facet of this technology for the achievement of such prospects. Especially in the area of agriculture, where the hostile operating environment adds to the general technological and technical issues, reliable and robust WSAN systems are mandatory. This paper focuses on the hardware design architectures of the WSANs for real-world agricultural applications. It presents the available alternatives in hardware design and identifies their difficulties and problems for real-life implementations. The paper introduces SensoTube, a new WSAN hardware architecture, which is proposed as a solution to the various existing design constraints of WSANs. The establishment of the proposed architecture is based firstly on an abstraction approach in the functional requirements context and secondly on the standardization of subsystem connectivity, in order to allow for an open, expandable, flexible, reconfigurable, energy optimized, reliable and robust hardware system. The SensoTube implementation reference model together with its encapsulation design and installation are analyzed and presented in detail. Furthermore, as a proof of concept, certain use cases have been studied in order to demonstrate the benefits of migrating existing designs based on the available open-source hardware platforms to the SensoTube architecture.

  14. Software implementation and hardware acceleration of retinal vessel segmentation for diabetic retinopathy screening tests.

    Science.gov (United States)

    Cavinato, L; Fidone, I; Bacis, M; Del Sozzo, E; Durelli, G C; Santambrogio, M D

    2017-07-01

    Screening tests are an effective tool for the diagnosis and prevention of several diseases. Unfortunately, in order to produce an early diagnosis, the huge number of collected samples has to be processed faster than before. This issue particularly concerns image processing procedures, whose high computational demands are not met by modern software architectures. To this end, Field Programmable Gate Arrays (FPGAs) can be used to accelerate the computation partially or entirely. In this work, we demonstrate that FPGAs are suitable for biomedical applications by proposing a case study concerning the implementation of a vessel segmentation algorithm. The experimental results, computed on the DRIVE and STARE databases, show remarkable improvements in terms of both execution time and power efficiency (6X and 5.7X respectively) compared to the software implementation. Moreover, the proposed hardware approach outperforms works in the literature (3X speedup) without affecting the overall accuracy and sensitivity measures.

  15. SensoTube: A Scalable Hardware Design Architecture for Wireless Sensors and Actuators Networks Nodes in the Agricultural Domain

    OpenAIRE

    Piromalis, Dimitrios; Arvanitis, Konstantinos

    2016-01-01

    Wireless Sensor and Actuators Networks (WSANs) constitute one of the most challenging technologies with tremendous socio-economic impact for the next decade. Functionally optimized and energy-optimized hardware systems and development tools are perhaps the most critical facet of this technology for achieving such prospects. Especially in agriculture, where the hostile operating environment adds to the general technological and technical issues, reliable and robust WSAN system...

  16. A Hardware-Accelerated Quantum Monte Carlo framework (HAQMC) for N-body systems

    Science.gov (United States)

    Gothandaraman, Akila; Peterson, Gregory D.; Warren, G. Lee; Hinde, Robert J.; Harrison, Robert J.

    2009-12-01

    Interest in the study of structural and energetic properties of highly quantum clusters, such as inert gas clusters, has motivated the development of a hardware-accelerated framework for Quantum Monte Carlo simulations. In the Quantum Monte Carlo method, the properties of a system of atoms, such as the ground-state energies, are averaged over a number of iterations. Our framework is aimed at accelerating the computations in each iteration of the QMC application by offloading the calculation of properties, namely energy and trial wave function, onto reconfigurable hardware. This gives a user the capability to run simulations for a large number of iterations, thereby reducing the statistical uncertainty in the properties, and for larger clusters. This framework is designed to run on the Cray XD1 high performance reconfigurable computing platform, which exploits the coarse-grained parallelism of the processor along with the fine-grained parallelism of the reconfigurable computing devices available in the form of field-programmable gate arrays. In this paper, we illustrate the functioning of the framework, which can be used to calculate the energies for a model cluster of helium atoms. In addition, we present the capabilities of the framework that allow the user to vary the chemical identities of the simulated atoms.

    Program summary
    Program title: Hardware Accelerated Quantum Monte Carlo (HAQMC)
    Catalogue identifier: AEEP_v1_0
    Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEEP_v1_0.html
    Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
    Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
    No. of lines in distributed program, including test data, etc.: 691 537
    No. of bytes in distributed program, including test data, etc.: 5 031 226
    Distribution format: tar.gz
    Programming language: C/C++ for the QMC application, VHDL and Xilinx 8.1 ISE/EDK tools for FPGA design and development
    Computer: Cray XD
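
    The link between more iterations and lower statistical uncertainty can be illustrated with a minimal Python sketch. It is not taken from HAQMC: the function names, the Gaussian "energy" samples, and the assumption that samples are independent are all hypothetical simplifications (real QMC samples are usually correlated).

```python
import math
import random


def mean_and_standard_error(samples):
    """Return the sample mean and the standard error of that mean."""
    n = len(samples)
    mean = sum(samples) / n
    variance = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return mean, math.sqrt(variance / n)


def toy_energy_samples(n, rng):
    """Hypothetical stand-in for per-iteration energies (not a real QMC estimator)."""
    return [rng.gauss(-1.0, 0.5) for _ in range(n)]


rng = random.Random(0)
for iterations in (100, 10_000):
    mean, stderr = mean_and_standard_error(toy_energy_samples(iterations, rng))
    print(f"{iterations:>6} iterations: mean={mean:.4f}, standard error={stderr:.4f}")
```

    For independent samples the standard error shrinks roughly as one over the square root of the number of iterations, which is why a faster per-iteration computation translates into tighter estimates for a fixed time budget.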

  17. A hardware acceleration based on high-level synthesis approach for glucose-insulin analysis

    Science.gov (United States)

    Daud, Nur Atikah Mohd; Mahmud, Farhanahani; Jabbar, Muhamad Hairol

    2017-01-01

    This paper focuses on Type 1 Diabetes Mellitus (T1DM). Since this disease requires close attention to the blood glucose concentration, supported by insulin injection, it is important to have a tool that is able to predict that level when a certain amount of carbohydrate is consumed at mealtime. To this end, the Hovorka model, which targets T1DM, is chosen in this research. A high-level language, C++, is used to construct the mathematical model of the Hovorka model. This code is then converted into an intellectual property (IP) core, also known as a hardware accelerator, using a high-level synthesis (HLS) approach, which can improve the design and performance of the glucose-insulin analysis tool, as explained further in this paper. This is the first step in this research before implementing the design in a system-on-chip (SoC) to achieve a high-performance system for the glucose-insulin analysis tool.

  18. Scalable devices

    KAUST Repository

    Krüger, Jens J.

    2014-01-01

    In computer science in general, and in particular in the field of high performance computing and supercomputing, the term scalable plays an important role. It indicates that a piece of hardware, a concept, an algorithm, or an entire system scales with the size of the problem, i.e., it can be used not only in a very specific setting but is applicable to a wide range of problems, from small scenarios to possibly very large settings. In this spirit, there exist a number of fixed areas of research on scalability. There are works on scalable algorithms and scalable architectures, but what are scalable devices? In the context of this chapter, we are interested in a whole range of display devices, ranging from small scale hardware such as tablet computers, pads, smart-phones etc. up to large tiled display walls. What interests us most is not so much the hardware setup but the visualization algorithms behind these display systems that scale from your average smart phone up to the largest gigapixel display walls.

  19. A Scalable Parallel PWTD-Accelerated SIE Solver for Analyzing Transient Scattering from Electrically Large Objects

    KAUST Repository

    Liu, Yang

    2015-12-17

    A scalable parallel plane-wave time-domain (PWTD) algorithm for efficient and accurate analysis of transient scattering from electrically large objects is presented. The algorithm produces scalable communication patterns on very large numbers of processors by leveraging two mechanisms: (i) a hierarchical parallelization strategy to evenly distribute the computation and memory loads at all levels of the PWTD tree among processors, and (ii) a novel asynchronous communication scheme to reduce the cost and memory requirement of the communications between the processors. The efficiency and accuracy of the algorithm are demonstrated through its applications to the analysis of transient scattering from a perfect electrically conducting (PEC) sphere with a diameter of 70 wavelengths and a PEC square plate with a dimension of 160 wavelengths. Furthermore, the proposed algorithm is used to analyze transient fields scattered from realistic airplane and helicopter models under high frequency excitation.

  20. Acceleration of fluoro-CT reconstruction for a mobile C-Arm on GPU and FPGA hardware: a simulation study

    Science.gov (United States)

    Xue, Xinwei; Cheryauka, Arvi; Tubbs, David

    2006-03-01

    CT imaging in interventional and minimally-invasive surgery requires high-performance computing solutions that meet operating room demands, healthcare business requirements, and the constraints of a mobile C-arm system. The computational requirements of clinical procedures using CT-like data are increasing rapidly, mainly due to the need for rapid access to medical imagery during critical surgical procedures. The highly parallel nature of Radon transform and CT algorithms enables embedded computing solutions utilizing a parallel processing architecture to realize a significant gain of computational intensity with comparable hardware and program coding/testing expenses. In this paper, using a sample 2D and 3D CT problem, we explore the programming challenges and the potential benefits of embedded computing using commodity hardware components. The accuracy and performance results obtained on three computational platforms (a single CPU, a single GPU, and a solution based on FPGA technology) have been analyzed. We have shown that hardware-accelerated CT image reconstruction can be achieved with levels of noise and clarity of feature similar to those of program execution on a CPU, while running one or more orders of magnitude faster. 3D cone-beam or helical CT reconstruction and a variety of volumetric image processing applications will benefit from similar accelerations.

  1. Ab initio nonadiabatic dynamics of multichromophore complexes: a scalable graphical-processing-unit-accelerated exciton framework.

    Science.gov (United States)

    Sisto, Aaron; Glowacki, David R; Martinez, Todd J

    2014-09-16

    Conspectus Although advances in computer hardware and algorithms tuned for novel computer architectures are leading to significant increases in the size and time scale for molecular simulations, it remains true that new methods and algorithms will be needed to address some of the problems in complex chemical systems, such as electrochemistry, excitation energy transport, proton transport, and condensed phase reactivity. Ideally, these new methods would exploit the strengths of emerging architectures. Fragment based approaches for electronic structure theory decompose the problem of solving the electronic Schrodinger equation into a series of much smaller problems. Because each of these smaller problems is largely independent, this strategy is particularly well-suited to parallel architectures. It appears that the most significant advances in computer architectures will be toward increased parallelism, and therefore fragment-based approaches are an ideal match to these trends. When the computational effort involved scales with the third (or higher) power of the molecular size, there is a large benefit to fragment-based approaches even on serial architectures. This is the case for many of the well-known methods for solving the electronic structure theory problem, especially when wave function-based approaches including electron correlation are considered. A major issue in fragment-based approaches is determining or improving their accuracy. Since the Achilles' heel of any such method lies in the approximations used to stitch the smaller problems back together (i.e., in the treatment of the cross-fragment interactions), it can often be important to ensure that the size of the smaller problems is "large enough." Thus, there are two frontiers that need to be extended in order to enable molecular simulations for large systems and long times: the strongly coupled problem of medium sized molecules (100-500 atoms) and the more weakly coupled problem of decomposing

  2. Accelerated Degradation for Hardware in the Loop Simulation of Fuel Cell-Gas Turbine Hybrid System

    DEFF Research Database (Denmark)

    Abreu-Sepulveda, Maria A.; Harun, Nor Farida; Hackett, Gregory

    2015-01-01

    The U.S. Department of Energy (DOE)-National Energy Technology Laboratory (NETL) in Morgantown, WV has developed the hybrid performance (HyPer) project in which a solid oxide fuel cell (SOFC) one-dimensional (1D), real-time operating model is coupled to a gas turbine hardware system by utilizing hardware-in-the-loop simulation. To assess the long-term stability of the SOFC part of the system, electrochemical degradation due to operating conditions such as current density and fuel utilization has been incorporated into the SOFC model and successfully recreated in real time. The mathematical...

  3. Accelerating the Non-equispaced Fast Fourier Transform on Commodity Graphics Hardware

    DEFF Research Database (Denmark)

    Sørensen, Thomas Sangild; Schaeffter, Tobias; Noe, Karsten Østergaard

    2008-01-01

    We present a fast parallel algorithm to compute the Non-equispaced fast Fourier transform on commodity graphics hardware (the GPU). We focus particularly on a novel implementation of the convolution step in the transform, which was previously its most time consuming part. We describe...

  4. Projected Life of the SLAC Linac Braze Joints: Braze integrity and corrosion of cooling water hardware on accelerator sections

    Energy Technology Data Exchange (ETDEWEB)

    Glesener, W.F.; Garwin, E.L.; /SLAC

    2006-07-17

    The objective of this study was to ascertain the condition of braze joints and cooling water hardware from an accelerator section after prolonged use. Metallographic analysis was used to examine critical sites on an accelerator section that had been in use for more than 30 years. The end flange assembly showed no internal operational damage or external environmental effects. The cavity cylinder stack showed no internal operational damage; however, the internal surface was highly oxidized. The internal surface of the cooling water tubing was uniformly corroding at a rate of about 1 mil per year and showed no evidence of pitting. Tee fitting internal surfaces are corroding at non-uniform rates due to general corrosion and pitting. The remaining service life of the cooling water jacket is estimated to be about 20 years, or until 2027. At that time, water supply pressure will exceed allowable fitting pressure due to corrosion of tubing walls.

  5. Design of Power Efficient FPGA based Hardware Accelerators for Financial Applications

    DEFF Research Database (Denmark)

    Hegner, Jonas Stenbæk; Sindholt, Joakim; Nannarelli, Alberto

    2012-01-01

    Using Field Programmable Gate Arrays (FPGAs) to accelerate financial derivative calculations is becoming very common. In this work, we implement an FPGA-based specific processor for European option pricing using Monte Carlo simulations, and we compare its performance and power dissipation to the execution on a CPU. The experimental results show that impressive results, in terms of speed-up and energy savings, can be obtained by using FPGA-based accelerators at the expense of a longer development time.

  6. Hardware-accelerated Point Generation and Rendering of Point-based Impostors

    DEFF Research Database (Denmark)

    Bærentzen, Jakob Andreas

    2005-01-01

    This paper presents a novel scheme for generating points from triangle models. The method is fast and lends itself well to implementation using graphics hardware. The triangle to point conversion is done by rendering the models, and the rendering may be performed procedurally or by a black box API. I describe the technique in detail and discuss how the generated point sets can easily be used as impostors for the original triangle models used to create the points. Since the points reside solely in GPU memory, these impostors are fairly efficient. Source code is available online.

  7. Hardware Acceleration of SQL-Queries Processing in MDM-Systems Based on MISDSolution

    Directory of Open Access Journals (Sweden)

    V. E. Podol'skii

    2015-01-01

    In this article we examine the possibility of hardware support for the functions of a mobile device management platform (MDM-platform) using a Multiple Instructions and Single Data stream computer system developed within the framework of the project at Bauman Moscow State Technical University. At universities, the MDM-platform is used to provide various mobile services for the faculty, students and administration to facilitate the learning process: a mobile schedule, document sharing, text messages, and other interactive activities. Most of these services rely on extensive use of data stored in MDM-platform databases. These databases are commonly accessed with SQL queries, which comprise operators of the SQL language that are based on mathematical set theory. Hardware support for operations on sets is implemented in the Multiple Instructions and Single Data stream computer system (MISD system). This allows performance improvement of algorithms and operations on sets. Thus, hardware support for the processing of SQL queries in the MISD system allows us to benefit from implementing SQL queries in the MISD paradigm. The scientific novelty of the work lies in the fact that it is the first time a set of algorithms for basic SQL statements has been presented in a format supported by the MISD system. In addition, for the first time the operators INNER JOIN, LEFT JOIN and LEFT OUTER JOIN have been implemented for the MISD system and tested on it (testing was done on an FPGA Xilinx Virtex-II Pro XC2VP30 implementation of the MISD system). The practical significance of the work lies in the fact that the results of the study will be used in the project "Development of the Russian analogue of the system software for centralized management of personal devices and platforms in enterprise networks" of the St. Petersburg Polytechnic University (with the financial support of the state represented by the Ministry of Education and Science of the Russian
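
    The set-theoretic view of joins mentioned in this abstract can be sketched in a few lines of Python. This is only an illustration in software, not the MISD algorithms from the article: the helper names and the sample tables are hypothetical. Note that in standard SQL, LEFT JOIN and LEFT OUTER JOIN denote the same operation.

```python
def _index_by_key(rows, key):
    """Group rows by key value so that lookups act like set membership tests."""
    index = {}
    for row in rows:
        index.setdefault(row[key], []).append(row)
    return index


def inner_join(left, right, key):
    """Keep only rows whose key lies in the intersection of both key sets."""
    index = _index_by_key(right, key)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def left_join(left, right, key):
    """Keep every left row; pad with None where the right table has no match."""
    index = _index_by_key(right, key)
    right_columns = {c for row in right for c in row if c != key}
    joined = []
    for l in left:
        matches = index.get(l[key])
        if matches:
            joined.extend({**l, **r} for r in matches)
        else:
            joined.append({**l, **dict.fromkeys(right_columns)})
    return joined


# Hypothetical sample tables.
students = [{"id": 1, "name": "Ann"}, {"id": 2, "name": "Bo"}]
devices = [{"id": 1, "device": "tablet"}]

print(inner_join(students, devices, "id"))
print(left_join(students, devices, "id"))
```

    The inner join corresponds to the intersection of the two key sets, while the left join keeps the full left key set and fills the unmatched right-hand columns with nulls.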

  8. CUDA-BLASTP: accelerating BLASTP on CUDA-enabled graphics hardware.

    Science.gov (United States)

    Liu, Weiguo; Schmidt, Bertil; Müller-Wittig, Wolfgang

    2011-01-01

    Scanning protein sequence databases is an often repeated task in computational biology and bioinformatics. However, scanning large protein databases, such as GenBank, with popular tools such as BLASTP requires long runtimes on sequential architectures. Due to the continuing rapid growth of sequence databases, there is a high demand to accelerate this task. In this paper, we demonstrate how GPUs, powered by the Compute Unified Device Architecture (CUDA), can be used as an efficient computational platform to accelerate the BLASTP algorithm. In order to exploit the GPU’s capabilities for accelerating BLASTP, we have used a compressed deterministic finite state automaton for hit detection as well as a hybrid parallelization scheme. Our implementation achieves speedups of up to 10.0 on an NVIDIA GeForce GTX 295 GPU compared to the sequential NCBI BLASTP 2.2.22. CUDA-BLASTP source code is available at https://sites.google.com/site/liuweiguohome/software.

  9. Hardware acceleration of lucky-region fusion (LRF) algorithm for imaging

    Science.gov (United States)

    Jackson, Christopher R.; Ejzak, Garrett A.; Aubailly, Mathieu; Carhart, Gary W.; Liu, J. J.; Kiamilev, Fouad

    2014-06-01

    "Lucky-region" fusion (LRF) is a synthetic imaging technique that has proven successful in enhancing the quality of images distorted by atmospheric turbulence. The LRF algorithm extracts sharp regions of an image obtained from a series of short exposure frames, and fuses the sharp regions into a final, improved image. In our previous research, the LRF algorithm had been implemented on a PC using the C programming language. However, the PC did not have sufficient processing power to handle real-time extraction, processing and reduction required when the LRF algorithm was applied to real-time video from fast, high-resolution image sensors rather than single picture images. This document describes a hardware implementation of the LRF algorithm on a VIRTEX-7 field programmable gate array (FPGA) to achieve real-time image processing. The novelty in our approach is the creation of a "black box" LRF video processing system with a general camera link input, a user controller interface, and a camera link or DVI video output. We also describe a custom hardware simulation environment we have built to test our LRF implementation.

  10. A Framework for Dynamically-Loaded Hardware Library (HLL) in FPGA Acceleration

    DEFF Research Database (Denmark)

    Cardarilli, Gian Carlo; Di Carlo, Leonardo; Nannarelli, Alberto

    2016-01-01

    of the accelerators preliminarily requires also the profiling of both the SW (ARM CPU + NEON Units) and HW (FPGA) performance, an evaluation of the partial reconfiguration times and the development of an application-specific IP-cores library. This paper focuses on the profiling aspect of both the SW and HW...

  11. Requirements Analysis for a Hardware, Discrete-Event, Simulation Engine Accelerator.

    Science.gov (United States)

    1991-12-01

    event simulations are currently executed on the Intel iPSC/2 hypercube. Continued use of this distributed architecture, based on the Intel 80386 CPU ...using VHDL. A testbed was devised to evaluate the VHDL accelerator design. A VHDL behavioral model of the Intel 80386 CPU was not available, hence a...of the Intel 80386 (19:5-353). However, strict compliance to this standard is not required with an asynchronous interface, as the CPU inserts wait

  12. Acceleration of the matrix multiplication of Radiance three phase daylighting simulations with parallel computing on heterogeneous hardware of personal computer

    Energy Technology Data Exchange (ETDEWEB)

    Zuo, Wangda [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); McNeil, Andrew [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Wetter, Michael [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States); Lee, Eleanor S. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States)

    2013-05-23

    Building designers are increasingly relying on complex fenestration systems to reduce energy consumed for lighting and HVAC in low energy buildings. Radiance, a lighting simulation program, has been used to conduct daylighting simulations for complex fenestration systems. Depending on the configuration, the simulation can take hours or even days on a personal computer. This paper describes how to accelerate the matrix multiplication portion of a Radiance three-phase daylight simulation by conducting parallel computing on heterogeneous hardware of a personal computer. The algorithm was optimized and the computational part was implemented in parallel using OpenCL. The speed of the new approach was evaluated using various daylighting simulation cases on a multicore central processing unit and a graphics processing unit. Based on the measurements and analysis of the time usage for the Radiance daylighting simulation, further speedups can be achieved by using fast I/O devices and storing the data in a binary format.

  13. Design and implementation of embedded hardware accelerator for diagnosing HDL-CODE in assertion-based verification environment

    Directory of Open Access Journals (Sweden)

    C. U. Ngene

    2013-08-01

    The use of assertions for monitoring the designer’s intention in a hardware description language (HDL) model is gaining popularity, as it helps the designer to observe internal errors at the output ports of the device under verification. During verification, assertions are synthesised and the generated data are represented in tabular form. The amount of data generated can be enormous, depending on the size of the code and the number of modules that constitute the code. Furthermore, manually inspecting these data to diagnose the module with a functional violation is a time-consuming process which negatively affects the overall product development time. To locate the module with a functional violation within an acceptable diagnostic time, the data processing and analysis procedure must be accelerated. In this paper a multi-array processor (hardware accelerator) was designed and implemented in a Virtex6 field programmable gate array (FPGA), and it can be integrated into a verification environment. The design was captured in very high speed integrated circuit HDL (VHDL). The design was synthesised with Xilinx design suite ISE 13.1 and simulated with Xilinx ISIM. The multi-array processor (MAP) executes three logical operations (AND, OR, XOR) and a one’s compaction operation on arrays of data in parallel. An improvement in processing and analysis time was recorded compared to the manual procedure after the multi-array processor was integrated into the verification environment. It was also found that the multi-array processor, which was developed as an Intellectual Property (IP) core, can also be used in applications where output responses and a golden model represented in the form of matrices can be compared for searching, recognition and decision-making.

  14. A Hardware-Accelerated ECDLP with High-Performance Modular Multiplication

    Directory of Open Access Journals (Sweden)

    Lyndon Judge

    2012-01-01

    Elliptic curve cryptography (ECC) has become a popular public key cryptography standard. The security of ECC is due to the difficulty of solving the elliptic curve discrete logarithm problem (ECDLP). In this paper, we demonstrate a successful attack on ECC over a prime field using the Pollard rho algorithm implemented on a hardware-software cointegrated platform. We propose a high-performance architecture for multiplication over a prime field using specialized DSP blocks in the FPGA. We characterize this architecture by exploring the design space to determine the optimal integer basis for polynomial representation, and we demonstrate an efficient mapping of this design to multiple standard prime field elliptic curves. We use the resulting modular multiplier to demonstrate low-latency multiplications for curves secp112r1 and P-192. We apply our modular multiplier to implement a complete attack on secp112r1 using a Nallatech FSB-Compute platform with a Virtex-5 FPGA. The measured performance of the resulting design is 114 cycles per Pollard rho step at 100 MHz, which gives 878 K iterations per second per ECC core. We extend this design to a multicore ECDLP implementation that achieves 14.05 M iterations per second with 16 parallel point addition cores.
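
    The throughput figures quoted in this abstract follow from simple arithmetic, which the Python sketch below reproduces. The constants come from the abstract; the function name is hypothetical, and the multicore figure assumes ideal linear scaling across cores, which is an assumption rather than something the abstract states.

```python
CLOCK_HZ = 100_000_000   # 100 MHz clock, from the abstract
CYCLES_PER_STEP = 114    # cycles per Pollard rho step, from the abstract
CORES = 16               # parallel point addition cores, from the abstract


def steps_per_second(clock_hz, cycles_per_step):
    """Pollard rho steps one core completes per second."""
    return clock_hz / cycles_per_step


per_core = steps_per_second(CLOCK_HZ, CYCLES_PER_STEP)
ideal_total = per_core * CORES  # assumes perfect linear scaling (hypothetical)

print(f"per core:   {per_core / 1e3:.0f} K steps/s")
print(f"{CORES} cores:   {ideal_total / 1e6:.2f} M steps/s")
```

    The results (roughly 877 K steps per second per core and about 14.04 M steps per second for 16 cores) are within rounding of the reported 878 K and 14.05 M, which suggests the multicore design scales close to linearly.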

  15. Scalable rendering on PC clusters

    Energy Technology Data Exchange (ETDEWEB)

    WYLIE,BRIAN N.; LEWIS,VASILY; SHIRLEY,DAVID NOYES; PAVLAKOS,CONSTANTINE

    2000-04-25

    This case study presents initial results from research targeted at the development of cost-effective scalable visualization and rendering technologies. The implementations of two 3D graphics libraries based on the popular sort-last and sort-middle parallel rendering techniques are discussed. An important goal of these implementations is to provide scalable rendering capability for extremely large datasets (>> 5 million polygons). Applications can use these libraries for either run-time visualization, by linking to an existing parallel simulation, or for traditional post-processing by linking to an interactive display program. The use of parallel, hardware-accelerated rendering on commodity hardware is leveraged to achieve high performance. Current performance results show that, using current hardware (a small 16-node cluster), they can utilize up to 85% of the aggregate graphics performance and achieve rendering rates in excess of 20 million polygons/second using OpenGL® with lighting, Gouraud shading, and individually specified triangles (not t-stripped).

  16. GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping.

    Science.gov (United States)

    Alser, Mohammed; Hassan, Hasan; Xin, Hongyi; Ergin, Oguz; Mutlu, Onur; Alkan, Can

    2017-11-01

    High throughput DNA sequencing (HTS) technologies generate an excessive number of small DNA segments -called short reads- that cause significant computational burden. To analyze the entire genome, each of the billions of short reads must be mapped to a reference genome based on the similarity between a read and 'candidate' locations in that reference genome. The similarity measurement, called alignment, formulated as an approximate string matching problem, is the computational bottleneck because: (i) it is implemented using quadratic-time dynamic programming algorithms and (ii) the majority of candidate locations in the reference genome do not align with a given read due to high dissimilarity. Calculating the alignment of such incorrect candidate locations consumes an overwhelming majority of a modern read mapper's execution time. Therefore, it is crucial to develop a fast and effective filter that can detect incorrect candidate locations and eliminate them before invoking computationally costly alignment algorithms. We propose GateKeeper, a new hardware accelerator that functions as a pre-alignment step that quickly filters out most incorrect candidate locations. GateKeeper is the first design to accelerate pre-alignment using Field-Programmable Gate Arrays (FPGAs), which can perform pre-alignment much faster than software. When implemented on a single FPGA chip, GateKeeper maintains high accuracy (on average >96%) while providing, on average, 90-fold and 130-fold speedup over the state-of-the-art software pre-alignment techniques, Adjacency Filter and Shifted Hamming Distance (SHD), respectively. The addition of GateKeeper as a pre-alignment step can reduce the verification time of the mrFAST mapper by a factor of 10. https://github.com/BilkentCompGen/GateKeeper. mohammedalser@bilkent.edu.tr or onur.mutlu@inf.ethz.ch or calkan@cs.bilkent.edu.tr. Supplementary data are available at Bioinformatics online.
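
    The idea of a pre-alignment filter can be illustrated with a short Python sketch in the spirit of shifted-Hamming filtering. It is a simplified software illustration, not GateKeeper's hardware design or the published SHD algorithm: the function names are hypothetical, and positions that shift outside the reference window are treated as matches so that the filter stays permissive.

```python
def mismatch_mask(read, ref, shift):
    """1 where read[i] differs from ref[i + shift]; out-of-range positions count as 0."""
    mask = []
    for i, base in enumerate(read):
        j = i + shift
        if 0 <= j < len(ref):
            mask.append(0 if base == ref[j] else 1)
        else:
            mask.append(0)  # permissive choice: unknown positions never cause rejection
    return mask


def passes_filter(read, ref, max_edits):
    """Accept a candidate unless too many positions mismatch under every tested shift."""
    combined = [1] * len(read)
    for shift in range(-max_edits, max_edits + 1):
        mask = mismatch_mask(read, ref, shift)
        # A position stays 1 only if it mismatches under all shifts tried so far.
        combined = [c & m for c, m in zip(combined, mask)]
    return sum(combined) <= max_edits


print(passes_filter("ACGTACGT", "ACGAACGT", 1))  # one substitution
print(passes_filter("AAAAAAAA", "CCCCCCCC", 2))  # highly dissimilar candidate
```

    Candidates that pass would still go to a full dynamic-programming aligner; the filter only aims to discard clearly dissimilar locations cheaply, using operations (comparisons, shifts, ANDs, bit counts) that map naturally onto parallel hardware.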

  17. Scalable Reliable SD Erlang Design

    OpenAIRE

    Chechina, Natalia; Trinder, Phil; Ghaffari, Amir; Green, Rickard; Lundin, Kenneth; Virding, Robert

    2014-01-01

    This technical report presents the design of Scalable Distributed (SD) Erlang: a set of language-level changes that aims to enable Distributed Erlang to scale for server applications on commodity hardware with at most 100,000 cores. We cover a number of aspects, specifically anticipated architecture, anticipated failures, scalable data structures, and scalable computation. Two other components that guided us in the design of SD Erlang are design principles and typical Erlang applications. The...

  18. Establishing a novel modeling tool: a python-based interface for a neuromorphic hardware system

    Directory of Open Access Journals (Sweden)

    Daniel Brüderle

    2009-06-01

    Neuromorphic hardware systems provide new possibilities for the neuroscience modeling community. Due to the intrinsic parallelism of the micro-electronic emulation of neural computation, such models are highly scalable without a loss of speed. However, the communities of software simulator users and neuromorphic engineering in neuroscience are rather disjoint. We present a software concept that provides the possibility to establish such hardware devices as valuable modeling tools. It is based on the integration of the hardware interface into a simulator-independent language which allows for unified experiment descriptions that can be run on various simulation platforms without modification, implying experiment portability and a huge simplification of the quantitative comparison of hardware and simulator results. We introduce an accelerated neuromorphic hardware device and describe the implementation of the proposed concept for this system. An example setup and results acquired by utilizing both the hardware system and a software simulator are demonstrated.

  19. Establishing a Novel Modeling Tool: A Python-Based Interface for a Neuromorphic Hardware System

    Science.gov (United States)

    Brüderle, Daniel; Müller, Eric; Davison, Andrew; Muller, Eilif; Schemmel, Johannes; Meier, Karlheinz

    2008-01-01

    Neuromorphic hardware systems provide new possibilities for the neuroscience modeling community. Due to the intrinsic parallelism of the micro-electronic emulation of neural computation, such models are highly scalable without a loss of speed. However, the communities of software simulator users and neuromorphic engineering in neuroscience are rather disjoint. We present a software concept that provides the possibility to establish such hardware devices as valuable modeling tools. It is based on the integration of the hardware interface into a simulator-independent language which allows for unified experiment descriptions that can be run on various simulation platforms without modification, implying experiment portability and a huge simplification of the quantitative comparison of hardware and simulator results. We introduce an accelerated neuromorphic hardware device and describe the implementation of the proposed concept for this system. An example setup and results acquired by utilizing both the hardware system and a software simulator are demonstrated. PMID:19562085

  20. Scalable Resolution Display Walls

    KAUST Repository

    Leigh, Jason

    2013-01-01

    This article will describe the progress since 2000 on research and development in 2-D and 3-D scalable resolution display walls that are built from tiling individual lower resolution flat panel displays. The article will describe approaches and trends in display hardware construction, middleware architecture, and user-interaction design. The article will also highlight examples of use cases and the benefits the technology has brought to their respective disciplines. © 1963-2012 IEEE.

  1. A hardware-oriented histogram of oriented gradients algorithm and its VLSI implementation

    Science.gov (United States)

    Zhang, Xiangyu; An, Fengwei; Nakashima, Ikki; Luo, Aiwen; Chen, Lei; Ishii, Idaku; Jürgen Mattausch, Hans

    2017-04-01

    A challenging and important issue for object recognition is feature extraction on embedded systems. We report a hardware implementation of the histogram of oriented gradients (HOG) algorithm for real-time object recognition, which is known to provide high efficiency and accuracy. The developed hardware-oriented algorithm exploits the cell-based scan strategy which enables image-sensor synchronization and extraction-speed acceleration. Furthermore, buffers for image frames or integral images are avoided. An image-size scalable hardware architecture with an effective bin-decoder and a parallelized voting element (PVE) is developed and used to verify the hardware-oriented HOG implementation with the application of human detection. The fabricated test chip in 180 nm CMOS technology achieves fast processing speed and large flexibility for different image resolutions with substantially reduced hardware cost and energy consumption.
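
    As a point of reference for the HOG feature itself, the per-cell orientation histogram can be sketched in software as follows. This is a plain textbook-style version under stated assumptions (central-difference gradients on interior pixels, unsigned orientations, hard binning into 9 bins); it does not reproduce the cell-based scan strategy, bin-decoder, or parallelized voting element of the reported chip, and the name `cell_histogram` is hypothetical.

```python
import math


def cell_histogram(cell, num_bins=9):
    """Unsigned-orientation gradient histogram for one cell of grayscale pixels.

    `cell` is a list of equal-length rows. Only interior pixels are used so
    that central differences never read outside the cell.
    """
    rows, cols = len(cell), len(cell[0])
    hist = [0.0] * num_bins
    bin_width = 180.0 / num_bins
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = cell[r][c + 1] - cell[r][c - 1]
            gy = cell[r + 1][c] - cell[r - 1][c]
            magnitude = math.hypot(gx, gy)
            # Fold the angle into [0, 180) so opposite directions share a bin.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % num_bins] += magnitude
    return hist


if __name__ == "__main__":
    horizontal_ramp = [[c for c in range(4)] for _ in range(4)]
    print(cell_histogram(horizontal_ramp))
```

    Each pixel votes with its gradient magnitude into a single bin here; software HOG implementations often interpolate votes between neighbouring bins instead.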

  2. Scalable hardware verification with symbolic simulation

    CERN Document Server

    Bertacco, Valeria

    2006-01-01

    An innovative presentation of the theory of disjoint support decomposition, presenting novel results and algorithms, plus original and up-to-date techniques in formal verification. Provides an overview of current verification techniques, and unveils the inner workings of symbolic simulation. Focuses on new techniques that narrow the performance gap between the complexity of digital systems and the limited ability to verify them. Addresses key topics in need of future research.

  3. Next Processor Module: A Hardware Accelerator of UT699 LEON3-FT System for On-Board Computer Software Simulation

    Science.gov (United States)

    Langlois, Serge; Fouquet, Olivier; Gouy, Yann; Riant, David

    2014-08-01

    On-Board Computers (OBC) increasingly use integrated systems-on-chip (SoC) that embed processors running from 50 MHz up to several hundred MHz, around which are plugged dedicated communication controllers together with other input/output channels. For ground testing and On-Board SoftWare (OBSW) validation purposes, a representative simulation of these systems, faster than real-time and with cycle-true timing of execution, is not achieved with current purely software simulators. For a few years, some hybrid solutions have been put in place ([1], [2]), including hardware in the loop so as to add accuracy and performance to the computer software simulation. This paper presents the results of the work undertaken by Thales Alenia Space (TAS-F) at the end of 2010, which led to a validated HW simulator of the UT699 by mid-2012 that is now qualified and fully used in operational contexts.

  4. A new approach to fluid-structure interaction within graphics hardware accelerated smooth particle hydrodynamics considering heterogeneous particle size distribution

    Science.gov (United States)

    Eghtesad, Adnan; Knezevic, Marko

    2017-12-01

    A corrective smooth particle method (CSPM) within smooth particle hydrodynamics (SPH) is used to study the deformation of an aircraft structure under high-velocity water-ditching impact load. The CSPM-SPH method features a new approach for the prediction of two-way fluid-structure interaction coupling. Results indicate that the implementation is well suited for modeling the deformation of structures under high-velocity impact into water as evident from the predicted stress and strain localizations in the aircraft structure as well as the integrity of the impacted interfaces, which show no artificial particle penetrations. To reduce the simulation time, a heterogeneous particle size distribution over a complex three-dimensional geometry is used. The variable particle size is achieved from a finite element mesh with variable element size and, as a result, variable nodal (i.e., SPH particle) spacing. To further accelerate the simulations, the SPH code is ported to a graphics processing unit using the OpenACC standard. The implementation and simulation results are described and discussed in this paper.

  5. The VMTG Hardware Description

    CERN Document Server

    Puccio, B

    1998-01-01

    The document describes the hardware features of the CERN Master Timing Generator. This board is the common platform for the transmission of General Timing Machine required by the CERN accelerators. In addition, the paper shows the various jumper options to customise the card which is compliant to the VMEbus standard.

  6. Hardware Acceleration for Cyber Security

    Science.gov (United States)

    2010-11-01

    adapters from Napatech [23]. Platforms provided by the research community are COMBO cards [4] from CESNET and NetFPGA [24] cards from Stanford. Endace and... manager providing user interface on the client (SOC) side and the NETCONF agent application that controls configuration datastores on the device side. On...using NETCONF protocol. NETCONF uses simple Remote Procedure Call (RPC)-like approach to exchange messages between manager and agent application. This

  7. An Introduction to Parallelism, Concurrency and Acceleration (1/2)

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    Concurrency and parallelism are firm elements of any modern computing infrastructure, made even more prominent by the emergence of accelerators. These lectures offer an introduction to these important concepts. We will begin with a brief refresher of recent hardware offerings to modern-day programmers. We will then open the main discussion with an overview of the laws and practical aspects of scalability. Key parallelism data structures, patterns and algorithms will be shown. The main threats to scalability and mitigation strategies will be discussed in the context of real-life optimization problems.

  8. Hardly Hardware

    Science.gov (United States)

    Lott, Debra

    2007-01-01

    In a never-ending search for new and inspirational still-life objects, the author discovered that home improvement retailers make great resources for art teachers. Hardware and building materials are inexpensive and have interesting and variable shapes. She especially liked the dryer-vent coils and the electrical conduit. These items can be…

  9. How to create successful Open Hardware projects - About White Rabbits and open fields

    CERN Document Server

    van der Bij, E; Lewis, J; Stana, T; Wlostowski, T; Gousiou, E; Serrano, J; Arruat, M; Lipinski, M M; Daniluk, G; Voumard, N; Cattin, M

    2013-01-01

    CERN's accelerator control group has embraced "Open Hardware" (OH) to facilitate peer review, avoid vendor lock-in and make support tasks scalable. A web-based tool for easing collaborative work was set up and the CERN OH Licence was created. New ADC, TDC, fine delay and carrier cards based on VITA and PCI-SIG standards were designed and drivers for Linux were written. Often industry was paid for developments, while quality and documentation were controlled by CERN. An innovative timing network was also developed with the OH paradigm. Industry now sells and supports these designs that find their way into new fields.

  10. Hardware malware

    CERN Document Server

    Krieg, Christian

    2013-01-01

    In our digital world, integrated circuits are present in nearly every moment of our daily life. Even when using the coffee machine in the morning, or driving our car to work, we interact with integrated circuits. The increasing spread of information technology in virtually all areas of life in the industrialized world offers a broad range of attack vectors. So far, mainly software-based attacks have been considered and investigated, while hardware-based attacks have attracted comparatively little interest. The design and production process of integrated circuits is mostly decentralized due to

  11. Design of a pseudo-log image transform hardware accelerator in a high-level synthesis-based memory management framework

    Science.gov (United States)

    Butt, Shahzad Ahmad; Mancini, Stéphane; Rousseau, Frédéric; Lavagno, Luciano

    2014-09-01

    The pseudo-log image transform belongs to a class of image processing kernels that generate memory references which are nonlinear functions of loop indices. Due to the nonlinearity of the memory references, the usual design methodologies do not allow efficient hardware implementation for nonlinear kernels. For optimized hardware implementation, these kernels require the creation of a customized memory hierarchy and efficient data/memory management strategy. We present the design and real-time hardware implementation of a pseudo-log image transform IP (hardware image processing engine) using a memory management framework. The framework generates a controller which efficiently manages input data movement in the form of tiles between off-chip main memory, on-chip memory, and the core processing unit. The framework can jointly optimize the memory hierarchy and the tile computation schedule to reduce on-chip memory requirements, to maximize throughput, and to increase data reuse for reducing off-chip memory bandwidth requirements. The algorithmic C++ description of the pseudo-log kernel is profiled in the framework to generate an enhanced description with a customized memory hierarchy. The enhanced description of the kernel is then used for high-level synthesis (HLS) to perform architectural design space exploration in order to find an optimal implementation under given performance constraints. The optimized register transfer level implementation of the IP generated after HLS is used for performance estimation. The performance estimation is done in a simulation framework to characterize the IP with different external off-chip memory latencies and a variety of data transfer policies. Experimental results show that the designed IP can be used for real-time implementation and that the generated memory hierarchy is capable of feeding the IP with a sufficiently high bandwidth even in the presence of long external memory latencies.

  12. Real time mitigation of atmospheric turbulence in long distance imaging using the lucky region fusion algorithm with FPGA and GPU hardware acceleration

    Science.gov (United States)

    Jackson, Christopher Robert

    "Lucky-region" fusion (LRF) is a synthetic imaging technique that has proven successful in enhancing the quality of images distorted by atmospheric turbulence. The LRF algorithm selects sharp regions of an image obtained from a series of short exposure frames, and fuses the sharp regions into a final, improved image. In previous research, the LRF algorithm had been implemented on a PC using the C programming language. However, the PC did not have sufficient sequential processing power to handle real-time extraction, processing and reduction required when the LRF algorithm was applied to real-time video from fast, high-resolution image sensors. This thesis describes two hardware implementations of the LRF algorithm to achieve real-time image processing. The first was created with a VIRTEX-7 field programmable gate array (FPGA). The other developed using the graphics processing unit (GPU) of a NVIDIA GeForce GTX 690 video card. The novelty in the FPGA approach is the creation of a "black box" LRF video processing system with a general camera link input, a user controller interface, and a camera link video output. We also describe a custom hardware simulation environment we have built to test the FPGA LRF implementation. The advantage of the GPU approach is significantly improved development time, integration of image stabilization into the system, and comparable atmospheric turbulence mitigation.

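    The select-and-fuse step described in the lucky-region fusion abstract above can be sketched in a few lines. This is an illustrative simplification, not the thesis's FPGA or GPU implementation: it assumes grayscale frames stored as lists of rows, image dimensions that are multiples of the tile size, a sum-of-squared-differences sharpness measure, and hard tile selection without blending. The names `tile_sharpness` and `fuse_lucky_regions` are hypothetical.

```python
def tile_sharpness(frame, r0, c0, size):
    """Sum of squared horizontal and vertical pixel differences in one tile."""
    total = 0
    for r in range(r0, r0 + size):
        for c in range(c0, c0 + size):
            if c + 1 < c0 + size:
                total += (frame[r][c + 1] - frame[r][c]) ** 2
            if r + 1 < r0 + size:
                total += (frame[r + 1][c] - frame[r][c]) ** 2
    return total


def fuse_lucky_regions(frames, size):
    """For every tile, copy the pixels from whichever frame is sharpest there."""
    rows, cols = len(frames[0]), len(frames[0][0])
    fused = [[0] * cols for _ in range(rows)]
    for r0 in range(0, rows, size):
        for c0 in range(0, cols, size):
            best = max(frames, key=lambda f: tile_sharpness(f, r0, c0, size))
            for r in range(r0, r0 + size):
                fused[r][c0:c0 + size] = best[r][c0:c0 + size]
    return fused


if __name__ == "__main__":
    frame_a = [[0, 9, 5, 5], [9, 0, 5, 5]]  # sharp on the left, flat on the right
    frame_b = [[4, 4, 0, 9], [4, 4, 9, 0]]  # flat on the left, sharp on the right
    print(fuse_lucky_regions([frame_a, frame_b], 2))
```

    Every tile is independent of the others, which is one reason this kind of processing maps naturally onto parallel hardware such as FPGAs and GPUs.
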
  13. Scalable fast multipole methods for vortex element methods

    KAUST Repository

    Hu, Qi

    2012-11-01

    We use a particle-based method to simulate incompressible flows, where the Fast Multipole Method (FMM) is used to accelerate the calculation of particle interactions. The most time-consuming kernels (the Biot-Savart equation and stretching term of the vorticity equation) are mathematically reformulated so that only two Laplace scalar potentials are used instead of six, while automatically ensuring divergence-free far-field computation. Based on this formulation, and on our previous work for a scalar heterogeneous FMM algorithm, we develop a new FMM-based vortex method capable of simulating general flows including turbulence on heterogeneous architectures, which distributes the work between multi-core CPUs and GPUs to best utilize the hardware resources and achieve excellent scalability. The algorithm also uses new data structures which can dynamically manage inter-node communication and load balance efficiently but with only a small parallel construction overhead. This algorithm can scale to large-sized clusters showing both strong and weak scalability. Careful error and timing trade-off analysis are also performed for the cutoff functions induced by the vortex particle method. Our implementation can perform one time step of the velocity+stretching for one billion particles on 32 nodes in 55.9 seconds, which yields 49.12 Tflop/s. © 2012 IEEE.
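
    The closing performance figures can be unpacked with simple arithmetic. The sketch below uses only the three numbers quoted in the abstract (one billion particles, 55.9 seconds, 49.12 Tflop/s on 32 nodes); the derived quantities are implied by those figures rather than reported by the authors, and the variable names are illustrative.

```python
particles = 1_000_000_000      # particles per time step (from the abstract)
seconds_per_step = 55.9        # wall-clock time for one step (from the abstract)
rate_flop_per_s = 49.12e12     # reported aggregate rate, 49.12 Tflop/s
nodes = 32                     # cluster nodes used (from the abstract)

# Total floating-point work implied for a single time step.
total_flop = rate_flop_per_s * seconds_per_step

# Average work per particle and average rate per node.
flop_per_particle = total_flop / particles
per_node_rate = rate_flop_per_s / nodes

if __name__ == "__main__":
    print(f"total work per step : {total_flop:.3e} flop")
    print(f"work per particle   : {flop_per_particle:.3e} flop")
    print(f"rate per node       : {per_node_rate / 1e12:.3f} Tflop/s")
```

    Under these figures a time step corresponds to roughly 2.7 million floating-point operations per particle and about 1.5 Tflop/s per node.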

  14. A Hardware Filesystem Implementation with Multidisk Support

    National Research Council Canada - National Science Library

    Mendon, Ashwin A; Schmidt, Andrew G; Sass, Ron

    2009-01-01

    .... This article describes one such innovation: a filesystem implemented in hardware. This has the potential of improving the performance of data-intensive applications by connecting secondary storage directly to FPGA compute accelerators...

  15. Using the FLUKA Monte Carlo Code to Simulate the Interactions of Ionizing Radiation with Matter to Assist and Aid Our Understanding of Ground Based Accelerator Testing, Space Hardware Design, and Secondary Space Radiation Environments

    Science.gov (United States)

    Reddell, Brandon

    2015-01-01

    Designing hardware to operate in the space radiation environment is a very difficult and costly activity. Ground based particle accelerators can be used to test for exposure to the radiation environment, one species at a time; however, the actual space environment cannot be duplicated because of the range of energies and isotropic nature of space radiation. The FLUKA Monte Carlo code is an integrated physics package based at CERN that has been under development for the last 40+ years and includes the most up-to-date fundamental physics theory and particle physics data. This work presents an overview of FLUKA and how it has been used in conjunction with ground based radiation testing for NASA to improve our understanding of secondary particle environments resulting from the interaction of space radiation with matter.

  16. Scalability of the LEU-Modified Cintichem Process: 3-MeV Van de Graaff and 35-MeV Electron Linear Accelerator Studies

    Energy Technology Data Exchange (ETDEWEB)

    Rotsch, David A. [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division; Brossard, Tom [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division; Roussin, Ethan [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division; Quigley, Kevin [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division; Chemerisov, Sergey [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division; Gromov, Roman [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division; Jonah, Charles [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division; Hafenrichter, Lohman [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division; Tkac, Peter [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division; Krebs, John [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division; Vandegrift, George F. [Argonne National Lab. (ANL), Argonne, IL (United States). Nuclear Engineering Division

    2016-10-31

    Molybdenum-99, the mother of Tc-99m, can be produced from fission of U-235 in nuclear reactors and purified from fission products by the Cintichem process, later modified for low-enriched uranium (LEU) targets. The key step in this process is the precipitation of Mo with α-benzoin oxime (ABO). The stability of this complex to radiation has been examined. Molybdenum-ABO was irradiated with 3 MeV electrons produced by a Van de Graaff generator and 35 MeV electrons produced by a 50 MeV/25 kW electron linear accelerator. Dose equivalents of 1.7–31.2 kCi of Mo-99 were administered to freshly prepared Mo-ABO. Irradiated samples of Mo-ABO were processed according to the LEU Modified-Cintichem process. The Van de Graaff data indicated good radiation stability of the Mo-ABO complex up to ~15 kCi dose equivalents of Mo-99 and nearly complete destruction at doses >24 kCi Mo-99. The linear accelerator data indicate that even at 6.2 kCi of Mo-99 equivalence of dose, the sample lost ~20% of Mo-99. The 20% loss of Mo-99 at this low dose may be attributed to thermal decomposition of the product from the heat deposited in the sample during irradiation.

  17. Scalable Frequent Subgraph Mining

    KAUST Repository

    Abdelhamid, Ehab

    2017-06-19

    A graph is a data structure that contains a set of nodes and a set of edges connecting these nodes. Nodes represent objects while edges model relationships among these objects. Graphs are used in various domains due to their ability to model complex relations among several objects. Given an input graph, the Frequent Subgraph Mining (FSM) task finds all subgraphs with frequencies exceeding a given threshold. FSM is crucial for graph analysis, and it is an essential building block in a variety of applications, such as graph clustering and indexing. FSM is computationally expensive, and its existing solutions are extremely slow. Consequently, these solutions are incapable of mining modern large graphs. This slowness is caused by the underlying approaches of these solutions which require finding and storing an excessive amount of subgraph matches. This dissertation proposes a scalable solution for FSM that avoids the limitations of previous work. This solution is composed of four components. The first component is a single-threaded technique which, for each candidate subgraph, needs to find only a minimal number of matches. The second component is a scalable parallel FSM technique that utilizes a novel two-phase approach. The first phase quickly builds an approximate search space, which is then used by the second phase to optimize and balance the workload of the FSM task. The third component focuses on accelerating frequency evaluation, which is a critical step in FSM. To do so, a machine learning model is employed to predict the type of each graph node, and accordingly, an optimized method is selected to evaluate that node. The fourth component focuses on mining dynamic graphs, such as social networks. To this end, an incremental index is maintained during the dynamic updates. Only this index is processed and updated for the majority of graph updates. Consequently, search space is significantly pruned and efficiency is improved. 
The empirical evaluation shows that the

  18. Introduction to Hardware Security

    Directory of Open Access Journals (Sweden)

    Yier Jin

    2015-10-01

    Hardware security has become a hot topic recently with more and more researchers from related research domains joining this area. However, the understanding of hardware security is often mixed with cybersecurity and cryptography, especially cryptographic hardware. For the same reason, the research scope of hardware security has never been clearly defined. To help researchers who have recently joined in this area better understand the challenges and tasks within the hardware security domain and to help both academia and industry investigate countermeasures and solutions to solve hardware security problems, we will introduce the key concepts of hardware security as well as its relations to related research topics in this survey paper. Emerging hardware security topics will also be clearly depicted through which the future trend will be elaborated, making this survey paper a good reference for the continuing research efforts in this area.

  19. An Introduction to Parallelism, Concurrency and Acceleration (1/2)

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    Concurrency and parallelism are firm elements of any modern computing infrastructure, made even more prominent by the emergence of accelerators. These lectures offer an introduction to these important concepts. We will begin with a brief refresher of recent hardware offerings to modern-day programmers. We will then open the main discussion with an overview of the laws and practical aspects of scalability. Key parallelism data structures, patterns and algorithms will be shown. The main threats to scalability and mitigation strategies will be discussed in the context of real-life optimization problems. Lecturer's short bio: Andrzej Nowak has 10 years of experience in computing technologies, primarily from CERN openlab and Intel. At CERN, he managed a research lab collaborating with Intel and was part of the openlab Chief Technology Office. Andrzej also worked closely and initiated projects with the private sector (e.g. HP and Google), as well as international research institutes, such as EPFL. Current...

  20. Memory Based Machine Intelligence Techniques in VLSI hardware

    OpenAIRE

    James, Alex Pappachen

    2012-01-01

    We briefly introduce the memory based approaches to emulate machine intelligence in VLSI hardware, describing the challenges and advantages. Implementation of artificial intelligence techniques in VLSI hardware is a practical and difficult problem. Deep architectures, hierarchical temporal memories and memory networks are some of the contemporary approaches in this area of research. The techniques attempt to emulate low level intelligence tasks and aim at providing scalable solutions to high ...

  1. Scalable Parallel Distributed Coprocessor System for Graph Searching Problems with Massive Data

    Directory of Open Access Journals (Sweden)

    Wanrong Huang

    2017-01-01

    Internet applications, such as network searching, electronic commerce, and modern medical applications, produce and process massive data. Considerable data parallelism exists in the computation processes of data-intensive applications. A traversal algorithm, breadth-first search (BFS), is fundamental in many graph processing applications and metrics when a graph grows in scale. A variety of scientific programming methods have been proposed for accelerating and parallelizing BFS because of the poor temporal and spatial locality caused by inherent irregular memory access patterns. However, new parallel hardware could provide better improvement for scientific methods. To address small-world graph problems, we propose a scalable and novel field-programmable gate array-based heterogeneous multicore system for scientific programming. The core is multithreaded for streaming processing, and the InfiniBand communication network is adopted for scalability. We design a binary search algorithm to address mapping to unify all processor addresses. Within the limits permitted by the Graph500 test bench after 1D parallel hybrid BFS algorithm testing, our 8-core and 8-thread-per-core system achieved superior performance and efficiency compared with the prior work under the same degree of parallelism. Our system is efficient not as a special acceleration unit but as a processor platform that deals with graph searching applications.
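
    For readers unfamiliar with the traversal being accelerated, a minimal serial BFS is sketched below. It is the textbook queue-based algorithm, not the 1D parallel hybrid BFS or the FPGA multicore design of the paper; the function name `bfs_parents` and the dictionary-based adjacency format are assumptions made for illustration. It returns a parent map, which is the kind of output a Graph500-style BFS kernel produces.

```python
from collections import deque


def bfs_parents(adjacency, root):
    """Breadth-first search returning each reached vertex's BFS parent.

    `adjacency` maps a vertex to an iterable of its neighbours. The root is
    recorded as its own parent.
    """
    parent = {root: root}
    frontier = deque([root])
    while frontier:
        u = frontier.popleft()
        for v in adjacency.get(u, ()):
            if v not in parent:      # first visit fixes the BFS parent
                parent[v] = u
                frontier.append(v)
    return parent


if __name__ == "__main__":
    graph = {0: [1, 2], 1: [3], 2: [3], 3: []}
    print(bfs_parents(graph, 0))
```

    The neighbour lookups in the inner loop jump around memory according to the graph structure, which is the irregular access pattern the abstract identifies as the source of poor locality.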

  2. Use of hardware accelerators for ATLAS computing

    CERN Document Server

    Bauce, Matteo; Dankel, Maik; Howard, Jacob; Kama, Sami

    2015-01-01

    Modern HEP experiments produce tremendous amounts of data. These data are processed by in-house built software frameworks which have lifetimes longer than the detector itself. Such frameworks were traditionally based on serial code and relied on advances in CPU technologies, mainly clock frequency, to cope with increasing data volumes. With the advent of many-core architectures and GPGPUs this paradigm has to shift to parallel processing and has to include the use of co-processors. However, since the design of most existing frameworks is based on the assumption of frequency scaling and predates co-processors, parallelisation and integration of co-processors are not an easy task. The ATLAS experiment is an example of such a big experiment with a big software framework called Athena. In this talk we will present the studies on parallelisation and co-processor (GPGPU) use in data preparation and tracking for trigger and offline reconstruction as well as their integration into a multiple process based Athena frame...

  3. Hardware Acceleration of Sparse Cognitive Algorithms

    Science.gov (United States)

    2016-05-01

    specifications, or other data does not license the holder or any other person or corporation; or convey any rights or permission to manufacture, use, or sell...GPUs will be used for future comparisons avoiding any overhead associated with data transfers to and from the host CPU. Power was calculated using...central processing units (CPUs) and general-purpose graphic processing units (GPGPUs) is a very promising platform to support the development of

  4. Routing Aware Switch Hardware Customization for Networks on Chips

    OpenAIRE

    Meloni, Paolo; Murali, Srinivasan; Carta, Salvatore; Camplani, Massimo; Raffo, Luigi; Micheli, Giovanni,

    2006-01-01

    Networks on Chip (NoC) have been proposed as a scalable and reusable solution for interconnecting the ever-growing number of processor/memory cores on a single silicon die. As the hardware complexity of a NoC is significant, methods for designing a NoC with low hardware overhead, matching the application requirements, are essential. In this work, we present a method for reducing the hardware complexity of the NoC by automatically configuring the architecture of the NoC switches to suit the app...

  5. Language Classification using N-grams Accelerated by FPGA-based Bloom Filters

    Energy Technology Data Exchange (ETDEWEB)

    Jacob, A; Gokhale, M

    2007-09-13

    N-Gram (n-character sequences in text documents) counting is a well-established technique used in classifying the language of text in a document. In this paper, n-gram processing is accelerated through the use of reconfigurable hardware on the XtremeData XD1000 system. Our design employs parallelism at multiple levels, with parallel Bloom Filters accessing on-chip RAM, parallel language classifiers, and parallel document processing. In contrast to another hardware implementation (HAIL algorithm) that uses off-chip SRAM for lookup, our highly scalable implementation uses only on-chip memory blocks. Our implementation of end-to-end language classification runs at 85x comparable software and 1.45x the competing hardware design.
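
    The combination of n-gram counting and Bloom filters can be sketched in software as follows. This is a toy illustration, not the XD1000 design: the filter size, the number of hash functions, the SHA-256-based hashing, the trigram length, and the names `BloomFilter`, `ngrams`, `train`, and `classify` are all assumptions. One filter is built per language from sample text, and a document is assigned to the language whose filter matches the most of its n-grams.

```python
import hashlib


class BloomFilter:
    """Fixed-size Bloom filter; may report false positives, never false negatives."""

    def __init__(self, num_bits=4096, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for clarity

    def _positions(self, item):
        # Derive several hash positions by salting the item with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def __contains__(self, item):
        return all(self.bits[pos] for pos in self._positions(item))


def ngrams(text, n=3):
    """All overlapping n-character sequences in the text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def train(samples, n=3):
    """Build one Bloom filter per language from a {language: text} mapping."""
    filters = {}
    for language, text in samples.items():
        bloom = BloomFilter()
        for gram in ngrams(text, n):
            bloom.add(gram)
        filters[language] = bloom
    return filters


def classify(text, filters, n=3):
    """Pick the language whose filter contains the most n-grams of the text."""
    scores = {
        language: sum(gram in bloom for gram in ngrams(text, n))
        for language, bloom in filters.items()
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    samples = {
        "en": "the quick brown fox jumps over the lazy dog",
        "de": "der schnelle braune fuchs springt über den faulen hund",
    }
    print(classify("the lazy fox", train(samples)))
```

    Each membership test touches only a few bit positions in a small table, and the per-language filters can be queried independently for every n-gram, which fits the parallel on-chip memory organisation the abstract describes.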

  6. Open Hardware Business Models

    Directory of Open Access Journals (Sweden)

    Edy Ferreira

    2008-04-01

    In the September issue of the Open Source Business Resource, Patrick McNamara, president of the Open Hardware Foundation, gave a comprehensive introduction to the concept of open hardware, including some insights about the potential benefits for both companies and users. In this article, we present the topic from a different perspective, providing a classification of market offers from companies that are making money with open hardware.

  7. Evaluating the Scalability of Enterprise JavaBeans Technology

    Energy Technology Data Exchange (ETDEWEB)

    Liu, Yan (Jenny); Gorton, Ian; Liu, Anna; Chen, Shiping; Paul A Strooper; Pornsiri Muenchaisri

    2002-12-04

    One of the major problems in building large-scale distributed systems is to anticipate the performance of the eventual solution before it has been built. This problem is especially germane to Internet-based e-business applications, where failure to provide high performance and scalability can lead to application and business failure. The fundamental software engineering problem is compounded by many factors, including individual application diversity, software architecture trade-offs, COTS component integration requirements, and differences in performance of various software and hardware infrastructures. In this paper, we describe the results of an empirical investigation into the scalability of a widely used distributed component technology, Enterprise JavaBeans (EJB). A benchmark application is developed and tested to measure the performance of a system as both the client load and component infrastructure are scaled up. A scalability metric from the literature is then applied to analyze the scalability of the EJB component infrastructure under two different architectural solutions.

  8. Improving Hardware Reusability: Software Defined Hardware

    Science.gov (United States)

    2017-03-01

    performance improvements over software, specialization is likely the future of hardware design. This trend will manifest in an increased demand for chip ...design methodologies is critical to meeting the incoming demand for chip diversity. Acknowledgements Research partially funded by DARPA Award Number...DARPA; and ASPIRE Lab industrial sponsors and affiliates Intel, Google, HPE, Huawei, LGE, Nokia, NVIDIA, Oracle, and Samsung. References [1

  9. Hardware Testing and System Evaluation: Procedures to Evaluate Commodity Hardware for Production Clusters

    Energy Technology Data Exchange (ETDEWEB)

    Goebel, J

    2004-02-27

    Without stable hardware any program will fail. The frustration and expense of supporting bad hardware can drain an organization, delay progress, and frustrate everyone involved. At Stanford Linear Accelerator Center (SLAC), we have created a testing method that helps our group, SLAC Computer Services (SCS), weed out potentially bad hardware and purchase the best hardware at the best possible cost. Commodity hardware changes often, so new evaluations happen periodically each time we purchase systems and minor re-evaluations happen for revised systems for our clusters, about twice a year. This general framework helps SCS perform correct, efficient evaluations. This article outlines SCS's computer testing methods and our system acceptance criteria. We expanded the basic ideas to other evaluations such as storage, and we think the methods outlined in this article have helped us choose hardware that is much more stable and supportable than our previous purchases. We have found that commodity hardware ranges in quality, so a systematic method and tools for hardware evaluation were necessary. This article is based on one instance of a hardware purchase, but the guidelines apply to the general problem of purchasing commodity computer systems for production computational work.

  10. HARDWARE AND SOFTWARE STATUS OF QCDOC.

    Energy Technology Data Exchange (ETDEWEB)

    BOYLE,P.A.; CHEN,D.; CHRIST,N.H.; PETROV,K.; ET AL.

    2003-07-15

    QCDOC is a massively parallel supercomputer whose processing nodes are based on an application-specific integrated circuit (ASIC). This ASIC was custom-designed so that crucial lattice QCD kernels achieve an overall sustained performance of 50% on machines with several 10,000 nodes. This strong scalability, together with low power consumption and a price/performance ratio of $1 per sustained MFlops, enables QCDOC to attack the most demanding lattice QCD problems. The first ASICs became available in June of 2003, and the testing performed so far has shown all systems functioning according to specification. We review the hardware and software status of QCDOC and present performance figures obtained in real hardware as well as in simulation.

  11. PKI Scalability Issues

    OpenAIRE

    Slagell, Adam J.; Bonilla, Rafael

    2004-01-01

    This report surveys different PKI technologies such as PKIX and SPKI and the issues of PKI that affect scalability. Much focus is spent on certificate revocation methodologies and status verification systems such as CRLs, Delta-CRLs, CRS, Certificate Revocation Trees, Windowed Certificate Revocation, OCSP, SCVP and DVCS.

  12. DISP: Optimizations towards Scalable MPI Startup

    Energy Technology Data Exchange (ETDEWEB)

    Fu, Huansong [Florida State University, Tallahassee]; Pophale, Swaroop S [ORNL]; Gorentla Venkata, Manjunath [ORNL]; Yu, Weikuan [Florida State University, Tallahassee]

    2016-01-01

    Despite the popularity of MPI for high performance computing, the startup of MPI programs faces a scalability challenge as both the execution time and memory consumption increase drastically at scale. We have examined this problem using the collective modules of Cheetah and Tuned in Open MPI as representative implementations. Previous improvements for collectives have focused on algorithmic advances and hardware off-load. In this paper, we examine the startup cost of the collective module within a communicator and explore various techniques to improve its efficiency and scalability. Accordingly, we have developed a new scalable startup scheme with three internal techniques, namely Delayed Initialization, Module Sharing and Prediction-based Topology Setup (DISP). Our DISP scheme greatly benefits the collective initialization of the Cheetah module. At the same time, it helps boost the performance of non-collective initialization in the Tuned module. We evaluate the performance of our implementation on the Titan supercomputer at ORNL with up to 4096 processes. The results show that our delayed initialization can speed up the startup of Tuned and Cheetah by an average of 32.0% and 29.2%, respectively, our module sharing can reduce the memory consumption of Tuned and Cheetah by up to 24.1% and 83.5%, respectively, and our prediction-based topology setup can speed up the startup of Cheetah by up to 80%.
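
    The Delayed Initialization idea named in this abstract can be illustrated with a minimal Python sketch. The class, method, and function names below are hypothetical and are not Open MPI's API; the sketch only assumes that a module's costly setup can be deferred until the first collective call and then reused.

```python
class LazyCollectiveModule:
    """Hypothetical stand-in for a collective module with expensive setup."""

    def __init__(self, name, setup):
        self.name = name
        self._setup = setup   # callable that builds the module state
        self._state = None    # nothing is built when the module is created
        self.init_count = 0   # how many times the setup actually ran

    def _ensure_ready(self):
        # Pay the setup cost on first use only, then reuse the state.
        if self._state is None:
            self._state = self._setup()
            self.init_count += 1
        return self._state

    def allreduce_sum(self, values):
        # Toy "collective": the real work would use the prepared state.
        self._ensure_ready()
        return sum(values)


def expensive_setup():
    # Placeholder for topology discovery and buffer allocation.
    return {"topology": list(range(8))}


module = LazyCollectiveModule("demo", expensive_setup)
created_without_setup = module.init_count == 0
total = module.allreduce_sum([1, 2, 3])
total_again = module.allreduce_sum([4, 5])
```

    In this sketch a module that is never used never pays its setup cost, which is the general motivation for deferring initialization; the actual savings reported above come from the authors' implementation, not from this toy.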

  13. Hardware protection through obfuscation

    CERN Document Server

    Bhunia, Swarup; Tehranipoor, Mark

    2017-01-01

    This book introduces readers to various threats faced during design and fabrication by today’s integrated circuits (ICs) and systems. The authors discuss key issues, including illegal manufacturing of ICs or “IC Overproduction,” insertion of malicious circuits, referred to as “Hardware Trojans”, which cause in-field chip/system malfunction, and reverse engineering and piracy of hardware intellectual property (IP). The authors provide a timely discussion of these threats, along with techniques for IC protection based on hardware obfuscation, which makes reverse-engineering an IC design infeasible for adversaries and untrusted parties with any reasonable amount of resources. This exhaustive study includes a review of the hardware obfuscation methods developed at each level of abstraction (RTL, gate, and layout) for conventional IC manufacturing, new forms of obfuscation for emerging integration strategies (split manufacturing, 2.5D ICs, and 3D ICs), and on-chip infrastructure needed for secure exchange o...

  14. Hardware removal - extremity

    Science.gov (United States)

    ... Surgeons use hardware such as pins, plates, or screws to help fix a broken bone ... SW, Hotchkiss RN, Pederson WC, Kozin SH, Cohen MS, eds. Green's Operative Hand Surgery . 7th ed. Philadelphia, ...

  15. Open Hardware at CERN

    CERN Multimedia

    CERN Knowledge Transfer Group

    2015-01-01

    CERN is actively making its knowledge and technology available for the benefit of society and does so through a variety of different mechanisms. Open hardware has in recent years established itself as a very effective way for CERN to make electronics designs and in particular printed circuit board layouts, accessible to anyone, while also facilitating collaboration and design re-use. It is creating an impact on many levels, from companies producing and selling products based on hardware designed at CERN, to new projects being released under the CERN Open Hardware Licence. Today the open hardware community includes large research institutes, universities, individual enthusiasts and companies. Many of the companies are actively involved in the entire process from design to production, delivering services and consultancy and even making their own products available under open licences.

  16. Scalable resource management in high performance computers.

    Energy Technology Data Exchange (ETDEWEB)

    Frachtenberg, E. (Eitan); Petrini, F. (Fabrizio); Fernandez Peinador, J. (Juan); Coll, S. (Salvador)

    2002-01-01

    Clusters of workstations have emerged as an important platform for building cost-effective, scalable and highly-available computers. Although many hardware solutions are available today, the largest challenge in making large-scale clusters usable lies in the system software. In this paper we present STORM, a resource management tool designed to provide scalability, low overhead and the flexibility necessary to efficiently support and analyze a wide range of job scheduling algorithms. STORM achieves these feats by closely integrating the management daemons with the low-level features that are common in state-of-the-art high-performance system area networks. The architecture of STORM is based on three main technical innovations. First, a sizable part of the scheduler runs in the thread processor located on the network interface. Second, we use hardware collectives that are highly scalable both for implementing control heartbeats and to distribute the binary of a parallel job in near-constant time, irrespective of job and machine sizes. Third, we use an I/O bypass protocol that allows fast data movements from the file system to the communication buffers in the network interface and vice versa. The experimental results show that STORM can launch a job with a binary of 12MB on a 64 processor/32 node cluster in less than 0.25 sec on an empty network, in less than 0.45 sec when all the processors are busy computing other jobs, and in less than 0.65 sec when the network is flooded with background traffic. This paper provides experimental and analytical evidence that these results scale to a much larger number of nodes. To the best of our knowledge, STORM is at least two orders of magnitude faster than existing production schedulers in launching jobs, performing resource management tasks and gang scheduling.

  17. Travel Software using GPU Hardware

    CERN Document Server

    Szalwinski, Chris M; Dimov, Veliko Atanasov; CERN. Geneva. ATS Department

    2015-01-01

    Travel is the main multi-particle tracking code being used at CERN for the beam dynamics calculations through hadron and ion linear accelerators. It uses two routines for the calculation of space charge forces, namely, rings of charges and point-to-point. This report presents the studies to improve the performance of Travel using GPU hardware. The studies showed that the performance of Travel with the point-to-point simulations of space-charge effects can be sped up at least 72 times using current GPU hardware. Simple recompilation of the source code using an Intel compiler can improve performance at least 4 times without GPU support. The limited memory of the GPU is the bottleneck. Two algorithms were investigated on this point: repeated computation and tiling. The repeated computation algorithm is simpler and is the currently recommended solution. The tiling algorithm was more complicated and degraded performance. Both build and test instructions for the parallelized version of the software are inclu...

  18. NASA HUNCH Hardware

    Science.gov (United States)

    Hall, Nancy R.; Wagner, James; Phelps, Amanda

    2014-01-01

    What is NASA HUNCH? High School Students United with NASA to Create Hardware (HUNCH) is an instructional partnership between NASA and educational institutions. This partnership benefits both NASA and students. NASA receives cost-effective hardware and soft goods, while students receive real-world hands-on experiences. The 2014-2015 school year was the 12th year of the HUNCH Program. NASA Glenn Research Center joined the program that already included the NASA Johnson Space Flight Center, Marshall Space Flight Center, Langley Research Center and Goddard Space Flight Center. The program included 76 schools in 24 states and NASA Glenn worked with the following five schools in the HUNCH Build to Print Hardware Program: Medina Career Center, Medina, OH; Cattaraugus Allegheny-BOCES, Olean, NY; Orleans Niagara-BOCES, Medina, NY; Apollo Career Center, Lima, OH; Romeo Engineering and Tech Center, Washington, MI. The schools built various parts of an International Space Station (ISS) middeck stowage locker and learned about the manufacturing process and how best to build these components to NASA specifications. For the 2015-2016 school year the schools will be part of a larger group of schools building flight hardware consisting of 20 ISS middeck stowage lockers for the ISS Program. The HUNCH Program consists of: Build to Print Hardware; Build to Print Soft Goods; Design and Prototyping; Culinary Challenge; Implementation: Web Page and Video Production.

  19. Computer hardware fault administration

    Science.gov (United States)

    Archer, Charles J.; Megerian, Mark G.; Ratterman, Joseph D.; Smith, Brian E.

    2010-09-14

    Computer hardware fault administration carried out in a parallel computer, where the parallel computer includes a plurality of compute nodes. The compute nodes are coupled for data communications by at least two independent data communications networks, where each data communications network includes data communications links connected to the compute nodes. Typical embodiments carry out hardware fault administration by identifying a location of a defective link in the first data communications network of the parallel computer and routing communications data around the defective link through the second data communications network of the parallel computer.
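
    The rerouting described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: both networks are modelled as lists of undirected links, the defective link is already known, and a whole message is moved to the second network whenever its primary path would cross the defect. The function names `shortest_path` and `route` are hypothetical and are not taken from the record.

```python
from collections import deque


def shortest_path(links, src, dst):
    """Breadth-first search over an undirected collection of (a, b) links."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == dst:
            return path
        for nxt in sorted(neighbours.get(node, ())):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # destination unreachable on this network


def route(primary, secondary, defective, src, dst):
    """Use the primary network unless its path crosses a defective link."""
    path = shortest_path(primary, src, dst)
    crosses_defect = path is not None and any(
        frozenset(hop) in defective for hop in zip(path, path[1:])
    )
    if path is not None and not crosses_defect:
        return "primary", path
    return "secondary", shortest_path(secondary, src, dst)


# Toy example: four compute nodes, two independent networks.
primary_links = [(0, 1), (1, 2), (2, 3)]
secondary_links = [(0, 2), (1, 2), (2, 3)]
defective_links = {frozenset((1, 2))}  # defect located in the primary network

healthy = route(primary_links, secondary_links, defective_links, 0, 1)
rerouted = route(primary_links, secondary_links, defective_links, 0, 3)
```

    Here the message from node 0 to node 1 stays on the primary network, while the message from node 0 to node 3 is carried by the second network because its primary path would use the defective link.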

  20. ONMCGP: Orthogonal Neighbourhood Mutation Cartesian Genetic Programming for Evolvable Hardware

    Science.gov (United States)

    I, Fuchuan N.; I, Yuanxiang L.; E, Peng K.

    2014-03-01

    Evolvable Hardware is facing the problems of scalability and the stalling effect. This paper proposes a novel Orthogonal Neighbourhood Mutation (ONM) operator in Cartesian genetic programming (CGP) to reduce the stalling effect in CGP and improve the efficiency of the algorithms. The method incorporates a Differential Evolution strategy. As demonstrated by experiments on benchmarks, the proposed Orthogonal Neighbourhood Search can jump out of local optima, reduce the stalling effect in CGP, and make the algorithm converge faster.

  1. CERN Neutrino Platform Hardware

    CERN Document Server

    Nelson, Kevin

    2017-01-01

    My summer research was broadly in CERN's neutrino platform hardware efforts. This project had two main components: detector assembly and data analysis work for ICARUS. Specifically, I worked on assembly for the ProtoDUNE project and monitored the safety of ICARUS as it was transported to Fermilab by analyzing the accelerometer data from its move.

  2. Construction of a Smart Medication Dispenser with High Degree of Scalability and Remote Manageability

    OpenAIRE

    JuGeon Pak; KeeHyun Park

    2012-01-01

    We propose a smart medication dispenser having a high degree of scalability and remote manageability. We construct the dispenser to have extensible hardware architecture for achieving scalability, and we install an agent program in it for achieving remote manageability. The dispenser operates as follows: when the real-time clock reaches the predetermined medication time and the user presses the dispense button at that time, the predetermined medication is dispensed from the medication dispens...

  3. DCSP hardware maintenance system

    Energy Technology Data Exchange (ETDEWEB)

    Pazmino, M.

    1995-11-01

    This paper discusses the necessary changes to be implemented on the hardware side of the DCSP database. DCSP is currently tracking hardware maintenance costs in six separate databases. The goal is to develop a system that combines all data and works off a single database. Some of the tasks that will be discussed in this paper include adding the capability for report generation, creating a help package and preparing a user's guide, testing the executable file, and populating the new database with data taken from the old database. A brief description of the basic process used in developing the system is also given. Conclusions about the future of the database and the delivery of the final product are then addressed, based on research and the desired use of the system.

  4. Sterilization of space hardware.

    Science.gov (United States)

    Pflug, I. J.

    1971-01-01

    Discussion of various techniques of sterilization of space flight hardware using either destructive heating or the action of chemicals. Factors considered in the dry-heat destruction of microorganisms include the effects of microbial water content, temperature, the physicochemical properties of the microorganism and adjacent support, and nature of the surrounding gas atmosphere. Dry-heat destruction rates of microorganisms on the surface, between mated surface areas, or buried in the solid material of space vehicle hardware are reviewed, along with alternative dry-heat sterilization cycles, thermodynamic considerations, and considerations of final sterilization-process design. Discussed sterilization chemicals include ethylene oxide, formaldehyde, methyl bromide, dimethyl sulfoxide, peracetic acid, and beta-propiolactone.

  5. Scalable photoreactor for hydrogen production

    KAUST Repository

    Takanabe, Kazuhiro

    2017-04-06

    Provided herein are scalable photoreactors that can include a membrane-free water-splitting electrolyzer and systems that can include a plurality of membrane-free water-splitting electrolyzers. Also provided herein are methods of using the scalable photoreactors provided herein.

  6. Modular particle filtering FPGA hardware architecture for brain machine interfaces.

    Science.gov (United States)

    Mountney, John; Obeid, Iyad; Silage, Dennis

    2011-01-01

    As the computational complexities of neural decoding algorithms for brain machine interfaces (BMI) increase, their implementation through sequential processors becomes prohibitive for real-time applications. This work presents the field programmable gate array (FPGA) as an alternative to sequential processors for BMIs. The reprogrammable hardware architecture of the FPGA provides a near optimal platform for performing parallel computations in real-time. The scalability and reconfigurability of the FPGA accommodates diverse sets of neural ensembles and a variety of decoding algorithms. Throughput is significantly increased by decomposing computations into independent parallel hardware modules on the FPGA. This increase in throughput is demonstrated through a parallel hardware implementation of the auxiliary particle filtering signal processing algorithm.

  7. Heterogenous Acceleration for Linear Algebra in Multi-coprocessor Environments

    Energy Technology Data Exchange (ETDEWEB)

    Luszczek, Piotr R [ORNL; Tomov, Stanimire Z [ORNL; Dongarra, Jack J [ORNL

    2015-01-01

    We present an efficient and scalable programming model for the development of linear algebra in heterogeneous multi-coprocessor environments. The model incorporates some of the current best design and implementation practices for the heterogeneous acceleration of dense linear algebra (DLA). Examples are given for the algorithms that form the basis for solving linear systems: the LU, QR, and Cholesky factorizations. To generate the extreme level of parallelism needed for the efficient use of coprocessors, algorithms of interest are redesigned and then split into well-chosen computational tasks. Task execution is scheduled over the computational components of a hybrid system of multi-core CPUs and coprocessors using a light-weight runtime system. The use of lightweight runtime systems keeps scheduling overhead low, while enabling the expression of parallelism through otherwise sequential code. This simplifies the development efforts and allows the exploration of the unique strengths of the various hardware components.
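
    To illustrate what splitting a factorization into computational tasks can look like, the following Python sketch lists the tile-level tasks of a right-looking tiled Cholesky factorization in program order. It is an assumption-laden illustration, not the authors' programming model: the function name `cholesky_tasks` is hypothetical, the kernel labels follow the standard LAPACK/BLAS routine names (POTRF, TRSM, SYRK, GEMM), and no numerical work or scheduling is performed.

```python
def cholesky_tasks(num_tiles):
    """List the tile tasks of a right-looking tiled Cholesky factorization.

    Each task is (kernel, row_tile, col_tile) and names the tile it writes.
    The loops are plain sequential code; a runtime could derive the
    parallelism from which tiles each task reads and writes.
    """
    tasks = []
    for k in range(num_tiles):
        tasks.append(("POTRF", k, k))          # factor the diagonal tile
        for i in range(k + 1, num_tiles):
            tasks.append(("TRSM", i, k))       # solve the tiles below it
        for i in range(k + 1, num_tiles):
            for j in range(k + 1, i + 1):
                kernel = "SYRK" if i == j else "GEMM"
                tasks.append((kernel, i, j))   # update the trailing tiles
    return tasks


tasks = cholesky_tasks(3)
counts = {}
for kernel, _, _ in tasks:
    counts[kernel] = counts.get(kernel, 0) + 1
```

    For a 3 x 3 tile matrix this yields ten tasks. The TRSM tasks within one step write different tiles and depend only on the preceding POTRF, so they are candidates for running concurrently on different devices.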

  8. Compact hardware liquid state machines on FPGA for real-time speech recognition.

    Science.gov (United States)

    Schrauwen, Benjamin; D'Haene, Michiel; Verstraeten, David; Campenhout, Jan Van

    2008-01-01

    Hardware implementations of Spiking Neural Networks are numerous because they are well suited for implementation in digital and analog hardware, and outperform classic neural networks. This work presents an application driven digital hardware exploration where we implement real-time, isolated digit speech recognition using a Liquid State Machine. The Liquid State Machine is a recurrent neural network of spiking neurons where only the output layer is trained. First we test two existing hardware architectures which we improve and extend, but that appear to be too fast and thus too area-consuming for this application. Next, we present a scalable, serialized architecture that allows a very compact implementation of spiking neural networks that is still fast enough for real-time processing. All architectures support leaky integrate-and-fire membranes with exponential synaptic models. This work shows that there is actually a large hardware design space of Spiking Neural Network hardware that can be explored. Existing architectures have only spanned part of it.
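
    A leaky integrate-and-fire membrane with an exponential synapse, as mentioned in this abstract, can be sketched in discrete time as follows. This is a generic textbook-style model in Python, not the authors' hardware design; the function name `simulate_lif` and all parameter values (weight, time constants, threshold) are illustrative assumptions.

```python
import math


def simulate_lif(input_spikes, weight=0.6, tau_mem=10.0, tau_syn=5.0,
                 threshold=1.0, dt=1.0):
    """Simulate one leaky integrate-and-fire neuron with one input synapse."""
    mem_decay = math.exp(-dt / tau_mem)   # membrane leak per time step
    syn_decay = math.exp(-dt / tau_syn)   # exponential synaptic decay
    potential = 0.0
    current = 0.0
    output = []
    for spike in input_spikes:
        # Each input spike adds a weighted kick to the synaptic current.
        current = current * syn_decay + weight * spike
        # The membrane leaks, then integrates the synaptic current.
        potential = potential * mem_decay + current
        if potential >= threshold:
            output.append(1)
            potential = 0.0               # reset after firing
        else:
            output.append(0)
    return output


silent = simulate_lif([0] * 20)
driven = simulate_lif([1] * 20)
```

    Each step needs only two multiplications by constants, two additions, and one comparison per neuron, which gives a rough sense of why such models map well to compact digital hardware.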

  9. COMPUTER HARDWARE MARKING

    CERN Multimedia

    Groupe de protection des biens

    2000-01-01

    As part of the campaign to protect CERN property and for insurance reasons, all computer hardware belonging to the Organization must be marked with the words 'PROPRIETE CERN'. IT Division has recently introduced a new marking system that is both economical and easy to use. From now on all desktop hardware (PCs, Macintoshes, printers) issued by IT Division with a value equal to or exceeding 500 CHF will be marked using this new system. For equipment that is already installed but not yet marked, including UNIX workstations and X terminals, IT Division's Desktop Support Service offers the following services free of charge: Equipment-marking wherever the Service is called out to perform other work (please submit all work requests to the IT Helpdesk on 78888 or helpdesk@cern.ch; for unavoidable operational reasons, the Desktop Support Service will only respond to marking requests when these coincide with requests for other work such as repairs, system upgrades, etc.); Training of personnel designated by Division Leade...

  10. Scalable parallel communications

    Science.gov (United States)

    Maly, K.; Khanna, S.; Overstreet, C. M.; Mukkamala, R.; Zubair, M.; Sekhar, Y. S.; Foudriat, E. C.

    1992-01-01

    Coarse-grain parallelism in networking (that is, the use of multiple protocol processors running replicated software sending over several physical channels) can be used to provide gigabit communications for a single application. Since parallel network performance is highly dependent on real issues such as hardware properties (e.g., memory speeds and cache hit rates), operating system overhead (e.g., interrupt handling), and protocol performance (e.g., effect of timeouts), we have performed detailed simulation studies of both a bus-based multiprocessor workstation node (based on the Sun Galaxy MP multiprocessor) and a distributed-memory parallel computer node (based on the Touchstone DELTA) to evaluate the behavior of coarse-grain parallelism. Our results indicate: (1) coarse-grain parallelism can deliver multiple 100 Mbps with currently available hardware platforms and existing networking protocols (such as Transmission Control Protocol/Internet Protocol (TCP/IP) and parallel Fiber Distributed Data Interface (FDDI) rings); (2) scale-up is near linear in n, the number of protocol processors, and channels (for small n and up to a few hundred Mbps); and (3) since these results are based on existing hardware without specialized devices (except perhaps for some simple modifications of the FDDI boards), this is a low cost solution to providing multiple 100 Mbps on current machines. In addition, from both the performance analysis and the properties of these architectures, we conclude: (1) multiple processors providing identical services and the use of space division multiplexing for the physical channels can provide better reliability than monolithic approaches (it also provides graceful degradation and low-cost load balancing); (2) coarse-grain parallelism supports running several transport protocols in parallel to provide different types of service (for example, one TCP handles small messages for many users, other TCPs running in parallel provide high bandwidth

  11. Foundations of hardware IP protection

    CERN Document Server

    Torres, Lionel

    2017-01-01

    This book provides a comprehensive and up-to-date guide to the design of security-hardened, hardware intellectual property (IP). Readers will learn how IP can be threatened, as well as protected, by using means such as hardware obfuscation/camouflaging, watermarking, fingerprinting (PUF), functional locking, remote activation, hidden transmission of data, hardware Trojan detection, protection against hardware Trojan, use of secure element, ultra-lightweight cryptography, and digital rights management. This book serves as a single-source reference to design space exploration of hardware security and IP protection. · Provides readers with a comprehensive overview of hardware intellectual property (IP) security, describing threat models and presenting means of protection, from integrated circuit layout to digital rights management of IP; · Enables readers to transpose techniques fundamental to digital rights management (DRM) to the realm of hardware IP security; · Introduces designers to the concept of salutar...

  12. Open hardware for open science

    CERN Multimedia

    CERN Bulletin

    2011-01-01

    Inspired by the open source software movement, the Open Hardware Repository was created to enable hardware developers to share the results of their R&D activities. The recently published CERN Open Hardware Licence offers the legal framework to support this knowledge and technology exchange. Two years ago, a group of electronics designers led by Javier Serrano, a CERN engineer, working in experimental physics laboratories created the Open Hardware Repository (OHR). This project was initiated in order to facilitate the exchange of hardware designs across the community in line with the ideals of “open science”. The main objectives include avoiding duplication of effort by sharing results across different teams that might be working on the same need. “For hardware developers, the advantages of open hardware are numerous. For example, it is a great learning tool for technologies some developers would not otherwise master, and it avoids unnecessary work if someone ha...

  13. Scalable Nanomanufacturing—A Review

    Directory of Open Access Journals (Sweden)

    Khershed Cooper

    2017-01-01

    This article describes the field of scalable nanomanufacturing, its importance and need, its research activities and achievements. The National Science Foundation is taking a leading role in fostering basic research in scalable nanomanufacturing (SNM). From this effort several novel nanomanufacturing approaches have been proposed, studied and demonstrated, including scalable nanopatterning. This paper will discuss SNM research areas in materials, processes and applications, scale-up methods with project examples, and manufacturing challenges that need to be addressed to move nanotechnology discoveries closer to the marketplace.

  14. (Submitted) Scalable quantum circuit and control for a superconducting surface code

    NARCIS (Netherlands)

    Versluis, R.; Poletto, S.; Khammassi, N.; Haider, N.; Michalak, D.J.; Bruno, A.; Bertels, K.; DiCarlo, L.

    2016-01-01

    We present a scalable scheme for executing the error-correction cycle of a monolithic surface-code fabric composed of fast-flux-tuneable transmon qubits with nearest-neighbor coupling. An eight-qubit unit cell forms the basis for repeating both the quantum hardware and coherent control, enabling

  15. Scalable Gravity Offload System Project

    Data.gov (United States)

    National Aeronautics and Space Administration — A scalable gravity offload device simulates reduced gravity for the testing of various surface system elements such as mobile robots, excavators, habitats, and...

  16. A scalable healthcare information system based on a service-oriented architecture.

    Science.gov (United States)

    Yang, Tzu-Hsiang; Sun, Yeali S; Lai, Feipei

    2011-06-01

    Many existing healthcare information systems are composed of a number of heterogeneous systems and face the important issue of system scalability. This paper first describes the comprehensive healthcare information systems used in National Taiwan University Hospital (NTUH) and then presents a service-oriented architecture (SOA)-based healthcare information system (HIS) based on the service standard HL7. The proposed architecture focuses on system scalability, in terms of both hardware and software. Moreover, we describe how scalability is implemented in rightsizing, service groups, databases, and hardware scalability. Although SOA-based systems sometimes display poor performance, a performance evaluation of our SOA-based HIS shows that the average response times for the outpatient, inpatient, and emergency HL7Central systems are 0.035, 0.04, and 0.036 s, respectively. The outpatient, inpatient, and emergency WebUI average response times are 0.79, 1.25, and 0.82 s. The scalability of the rightsizing project and our evaluation results provide evidence that SOA can provide system scalability and sustainability in a highly demanding healthcare information system.

  17. Space hardware microbial contamination

    Science.gov (United States)

    Baker, A.; Kern, R.; Mancinelli, R.; Venkateswaren, K.; Wainwright, N.

    Planetary Protection (PP) requirements imposed on unmanned planetary missions require that the spacecraft undergo rigorous bioload reduction prior to launch. The ability to quantitate bioburden on such spacecraft is dependent on developing new analytical methodologies that can be used to identify and trace biological contamination on flight hardware. The focus of new method development is to move forward and to augment the current spore analysis method which was first used on Viking. The ultimate goal of the new techniques is not to increase the cleanliness requirement currently levied on various missions, but instead to better understand the nature of the bioburden through the use of well-characterized standard methods. Subsequently an array of standard techniques is needed to provide various analytical methodologies that can be used to assess bioburden, depending upon mission specifications. This poster will provide information on two workshops that have been held to review the status of the development of new quantitative techniques for determining the bioload on spacecraft at the time of launch. The purpose of the workshops was to review and revise NASA Standard Operation Procedure NPG:5340.1C "Microbiological Examination of Space Hardware and Associated Environments" to incorporate improvements in the procedure and to reflect current field practices. In addition, the panel reviewed the status of new analytical methods currently under study for planetary protection applications, defining expected research that would bring the individual methods to a point where they can be drafted for submittal to the NASA standard procedure process. The poster will highlight changes to current standard procedures as well as review the status of new methods currently being studied. Methods included Polymerase Chain Reaction (PCR), Epifluorescence Techniques, Live/Dead Cell Analysis, Capillary Electrophoresis of Amino Acids and Ionic Contaminants, High Sensitivity Assay for

  18. Hardware Support for Embedded Java

    DEFF Research Database (Denmark)

    Schoeberl, Martin

    2012-01-01

    The general Java runtime environment is resource hungry and unfriendly for real-time systems. To reduce the resource consumption of Java in embedded systems, direct hardware support of the language is a valuable option. Furthermore, an implementation of the Java virtual machine in hardware enables...... worst-case execution time analysis of Java programs. This chapter gives an overview of current approaches to hardware support for embedded and real-time Java....

  19. Compact FPGA hardware architecture for public key encryption in embedded devices.

    Science.gov (United States)

    Rodríguez-Flores, Luis; Morales-Sandoval, Miguel; Cumplido, René; Feregrino-Uribe, Claudia; Algredo-Badillo, Ignacio

    2018-01-01

    Security is a crucial requirement in the envisioned applications of the Internet of Things (IoT), where most of the underlying computing platforms are embedded systems with reduced computing capabilities and energy constraints. In this paper we present the design and evaluation of a scalable low-area FPGA hardware architecture that serves as a building block to accelerate the costly operations of exponentiation and multiplication in [Formula: see text], commonly required in security protocols relying on public key encryption, such as in key agreement, authentication and digital signature. The proposed design can process operands of different sizes using the same datapath, which yields a significant reduction in area without loss of efficiency compared to representative state-of-the-art designs. For example, our design uses 96% less standard logic than a similar design optimized for performance, and 46% fewer resources than another design optimized for area. Even using fewer area resources, our design still performs better than its embedded software counterparts (190x and 697x).
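
    Exponentiation in public key protocols is commonly built from repeated modular multiplication. The Python sketch below shows the classic left-to-right square-and-multiply method as a general illustration; it is not the architecture from this record. Because the algebraic structure is elided in the abstract, the sketch assumes plain integer arithmetic modulo a positive integer, and the function name `mod_exp` is hypothetical.

```python
def mod_exp(base, exponent, modulus):
    """Compute base**exponent mod modulus by square-and-multiply."""
    if modulus <= 0 or exponent < 0:
        raise ValueError("modulus must be positive, exponent non-negative")
    result = 1 % modulus
    base %= modulus
    # Scan the exponent bits from most significant to least significant.
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus    # always square
        if bit == "1":
            result = (result * base) % modulus  # multiply on a set bit
    return result


example = mod_exp(4, 13, 497)
matches_builtin = example == pow(4, 13, 497)
```

    Every step is a single modular multiplication, so one multiplier can serve both operations in sequence. The sketch makes no attempt to run in constant time and is not suitable for handling secret exponents.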

  20. Trainable hardware for dynamical computing using error backpropagation through physical media.

    Science.gov (United States)

    Hermans, Michiel; Burm, Michaël; Van Vaerenbergh, Thomas; Dambre, Joni; Bienstman, Peter

    2015-03-24

    Neural networks are currently implemented on digital Von Neumann machines, which do not fully leverage their intrinsic parallelism. We demonstrate how to use a novel class of reconfigurable dynamical systems for analogue information processing, mitigating this problem. Our generic hardware platform for dynamic, analogue computing consists of a reciprocal linear dynamical system with nonlinear feedback. Thanks to reciprocity, a ubiquitous property of many physical phenomena like the propagation of light and sound, the error backpropagation (a crucial step for tuning such systems towards a specific task) can happen in hardware. This can potentially speed up the optimization process significantly, offering important benefits for the scalability of neuro-inspired hardware. In this paper, we show, using one experimentally validated and one conceptual example, that such systems may provide a straightforward mechanism for constructing highly scalable, fully dynamical analogue computers.

  1. Hardware for soft computing and soft computing for hardware

    CERN Document Server

    Nedjah, Nadia

    2014-01-01

    Single and Multi-Objective Evolutionary Computation (MOEA), Genetic Algorithms (GAs), Artificial Neural Networks (ANNs), Fuzzy Controllers (FCs), Particle Swarm Optimization (PSO) and Ant Colony Optimization (ACO) are becoming omnipresent in almost every intelligent system design. Unfortunately, the application of the majority of these techniques is complex and so requires a huge computational effort to yield useful and practical results. Therefore, dedicated hardware for evolutionary, neural and fuzzy computation is a key issue for designers. With the spread of reconfigurable hardware such as FPGAs, digital as well as analog hardware implementations of such computation become cost-effective. The idea behind this book is to offer a variety of hardware designs for soft computing techniques that can be embedded in any final product, and also to introduce the successful application of soft computing techniques to solve many hard problems encountered during the design of embedded hardware. Reconfigurable em...

  2. LHCb: Hardware Data Injector

    CERN Multimedia

    Delord, V; Neufeld, N

    2009-01-01

    The LHCb High Level Trigger and Data Acquisition system selects about 2 kHz of events out of the 1 MHz of events, which have been selected previously by the first-level hardware trigger. The selected events are consolidated into files and then sent to permanent storage for subsequent analysis on the Grid. The goal of the upgrade of the LHCb readout is to lift the limitation to 1 MHz. This means speeding up the DAQ to 40 MHz. Such a DAQ system will certainly employ 10 Gigabit or technologies and might also need new networking protocols: a customized TCP or proprietary solutions. A test module is being presented, which integrates in the existing LHCb infrastructure. It is a 10-Gigabit traffic generator, flexible enough to generate LHCb's raw data packets using dummy data or simulated data. These data are seen as real data coming from sub-detectors by the DAQ. The implementation is based on an FPGA using 10 Gigabit Ethernet interface. This module is integrated in the experiment control system. The architecture, ...

  3. Design for scalability in 3D computer graphics architectures

    DEFF Research Database (Denmark)

    Holten-Lund, Hans Erik

    2002-01-01

    This thesis describes useful methods and techniques for designing scalable hybrid parallel rendering architectures for 3D computer graphics. Various techniques for utilizing parallelism in a pipelined system are analyzed. During the Ph.D study a prototype 3D graphics architecture named Hybris has...... been developed. Hybris is a prototype rendering architecture which can be tailored to many specific 3D graphics applications and implemented in various ways. Parallel software implementations for both single and multi-processor Windows 2000 systems have been demonstrated. Working hardware...... as a case study and an application of the Hybris graphics architecture....

  4. Hardware Removal in Craniomaxillofacial Trauma

    Science.gov (United States)

    Cahill, Thomas J.; Gandhi, Rikesh; Allori, Alexander C.; Marcus, Jeffrey R.; Powers, David; Erdmann, Detlev; Hollenbeck, Scott T.; Levinson, Howard

    2015-01-01

    Background Craniomaxillofacial (CMF) fractures are typically treated with open reduction and internal fixation. Open reduction and internal fixation can be complicated by hardware exposure or infection. The literature often does not differentiate between these 2 entities; so for this study, we have considered all hardware exposures as hardware infections. Approximately 5% of adults with CMF trauma are thought to develop hardware infections. Management consists of either removing the hardware or leaving it in situ. The optimal approach has not been investigated. Thus, a systematic review of the literature was undertaken and a resultant evidence-based approach to the treatment and management of CMF hardware infections was devised. Materials and Methods A comprehensive search of journal articles was performed in parallel using MEDLINE, Web of Science, and ScienceDirect electronic databases. Keywords and phrases used were maxillofacial injuries; facial bones; wounds and injuries; fracture fixation, internal; wound infection; and infection. Our search yielded 529 articles. To focus on CMF fractures with hardware infections, the full text of English-language articles was reviewed to identify articles focusing on the evaluation and management of infected hardware in CMF trauma. Each article’s reference list was manually reviewed and citation analysis performed to identify articles missed by the search strategy. There were 259 articles that met the full inclusion criteria and form the basis of this systematic review. The articles were rated based on the level of evidence. There were 81 grade II articles included in the meta-analysis. Result Our meta-analysis revealed that 7503 patients were treated with hardware for CMF fractures in the 81 grade II articles. Hardware infection occurred in 510 (6.8%) of these patients. Of those infections, hardware removal occurred in 264 (51.8%) patients; hardware was left in place in 166 (32.6%) patients; and in 80 (15.6%) cases

  5. An Integrated Hardware Array for Very High Speed Logic Simulation

    Directory of Open Access Journals (Sweden)

    E. Scott Fehr

    1996-01-01

    boolean evaluation and fanout switching circuits, while large scale parallelism is integrated at die level to reduce cost and communication delays. The results of this research form the basis for a multiple order of magnitude improvement in reported state-of-the-art cost-performance merit for hardware gate level simulation accelerators.

  6. BIOLOGICALLY INSPIRED HARDWARE CELL ARCHITECTURE

    DEFF Research Database (Denmark)

    2010-01-01

    Disclosed is a system comprising: - a reconfigurable hardware platform; - a plurality of hardware units defined as cells adapted to be programmed to provide self-organization and self-maintenance of the system by means of implementing a program expressed in a programming language defined as DNA...

  7. Secure coupling of hardware components

    NARCIS (Netherlands)

    Hoepman, J.H.; Joosten, H.J.M.; Knobbe, J.W.

    2011-01-01

    A method and a system for securing communication between at least a first and a second hardware components of a mobile device is described. The method includes establishing a first shared secret between the first and the second hardware components during an initialization of the mobile device and,

  8. Hardware support for CSP on a Java chip multiprocessor

    DEFF Research Database (Denmark)

    Gruian, Flavius; Schoeberl, Martin

    2013-01-01

    Due to memory bandwidth limitations, chip multiprocessors (CMPs) adopting the convenient shared memory model for their main memory architecture scale poorly. On-chip core-to-core communication is a solution to this problem that can lead to further performance increase for a number of multithreaded...... applications. Programmatically, the Communicating Sequential Processes (CSPs) paradigm provides a sound computational model for such an architecture with message based communication. In this paper we explore hardware support for CSP in the context of an embedded Java CMP. The hardware support for CSP is on-chip...... communication channels, implemented by a ring-based network-on-chip (NoC), to reduce the memory bandwidth pressure on the shared memory. The presented solution is scalable and also specific for our limited resources and real-time predictability requirements. CMP architectures of three to eight processors were...

  9. Binary Associative Memories as a Benchmark for Spiking Neuromorphic Hardware

    Directory of Open Access Journals (Sweden)

    Andreas Stöckel

    2017-08-01

    Large-scale neuromorphic hardware platforms, specialized computer systems for energy efficient simulation of spiking neural networks, are being developed around the world, for example as part of the European Human Brain Project (HBP). Due to conceptual differences, a universal performance analysis of these systems in terms of runtime, accuracy and energy efficiency is non-trivial, yet indispensable for further hard- and software development. In this paper we describe a scalable benchmark based on a spiking neural network implementation of the binary neural associative memory. We treat neuromorphic hardware and software simulators as black-boxes and execute exactly the same network description across all devices. Experiments on the HBP platforms under varying configurations of the associative memory show that the presented method allows testing the quality of the neuron model implementation and explaining significant deviations from the expected reference output.
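
    A binary associative memory can be illustrated without any spiking machinery. The sketch below is a plain, non-spiking reference version and is not the benchmark from the abstract; it assumes a Willshaw-style storage rule (a binary weight is set wherever an input bit and an output bit are both active), which the abstract does not spell out. The function names `train` and `recall` are hypothetical.

```python
def train(pairs, n_in, n_out):
    """Assumed Willshaw-style storage: w[i][j] = 1 if input bit i and output bit j co-occur."""
    w = [[0] * n_out for _ in range(n_in)]
    for x, y in pairs:
        for i in range(n_in):
            if x[i]:
                for j in range(n_out):
                    if y[j]:
                        w[i][j] = 1
    return w


def recall(w, x):
    """Sum the weights of active inputs per output and threshold at the number of active inputs."""
    theta = sum(x)
    n_out = len(w[0])
    sums = [sum(w[i][j] for i in range(len(x)) if x[i]) for j in range(n_out)]
    return [1 if s >= theta else 0 for s in sums]


# Two illustrative input/output pairs (made up for this sketch).
pairs = [
    ([1, 1, 0, 0], [1, 0, 0, 1]),
    ([0, 0, 1, 1], [0, 1, 1, 0]),
]
weights = train(pairs, n_in=4, n_out=4)
recalled = [recall(weights, x) for x, _ in pairs]
print(recalled)
```

    With only two non-overlapping patterns stored, recall is exact; errors appear as more patterns share weights, which is the kind of deviation from a reference output that such a benchmark would measure.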

  10. Scalable algorithms for contact problems

    CERN Document Server

    Dostál, Zdeněk; Sadowská, Marie; Vondrák, Vít

    2016-01-01

    This book presents a comprehensive and self-contained treatment of the authors’ newly developed scalable algorithms for the solutions of multibody contact problems of linear elasticity. The brand new feature of these algorithms is theoretically supported numerical scalability and parallel scalability demonstrated on problems discretized by billions of degrees of freedom. The theory supports solving multibody frictionless contact problems, contact problems with possibly orthotropic Tresca’s friction, and transient contact problems. It covers BEM discretization, jumping coefficients, floating bodies, mortar non-penetration conditions, etc. The exposition is divided into four parts, the first of which reviews appropriate facets of linear algebra, optimization, and analysis. The most important algorithms and optimality results are presented in the third part of the volume. The presentation is complete, including continuous formulation, discretization, decomposition, optimality results, and numerical experimen...

  11. RADIATION RESISTANT LED POWER SUPPLY RELEASED UNDER CERN OPEN HARDWARE LICENSE

    CERN Multimedia

    2016-01-01

    As part of the design of a new emergency lighting system for the CERN accelerator complex a new design for a radiation resistant power supply has been produced. The design is available from the Open Hardware Repository.

  12. Hardware Realization of an FPGA Processor - Operating System Call Offload and Experiences

    DEFF Research Database (Denmark)

    Hindborg, Andreas Erik; Karlsson, Sven

    2014-01-01

    on a microprocessor. It is therefore convenient for many applications to employ a synthesizable microprocessor to execute sequential tasks and custom hardware structures to accelerate parallel sections of an algorithm. In this paper, we discuss the hardware realization of Tinuso-I, a small synthesizable processor...

  13. Rapid Non-Cartesian Parallel Imaging Reconstruction on Commodity Graphics Hardware

    DEFF Research Database (Denmark)

    Sørensen, Thomas Sangild; Atkinson, David; Boubertakh, Redha

    2008-01-01

    This presentation describes an implementation of non-Cartesian SENSE and kt-SENSE accelerated on commodity graphics hardware. This inexpensive hardware platform is now fully programmable and very suited for solving reconstruction problems. We show that for both SENSE and kt-SENSE the reconstruction...

  14. Scalable shared-memory multiprocessing

    CERN Document Server

    Lenoski, Daniel E

    1995-01-01

    Dr. Lenoski and Dr. Weber have experience with leading-edge research and practical issues involved in implementing large-scale parallel systems. They were key contributors to the architecture and design of the DASH multiprocessor. Currently, they are involved with commercializing scalable shared-memory technology.

  15. Scalability study of solid xenon

    Energy Technology Data Exchange (ETDEWEB)

    Yoo, J.; Cease, H.; Jaskierny, W. F.; Markley, D.; Pahlka, R. B.; Balakishiyeva, D.; Saab, T.; Filipenko, M.

    2015-04-01

    We report a demonstration of the scalability of optically transparent xenon in the solid phase for use as a particle detector above a kilogram scale. We employed a cryostat cooled by liquid nitrogen combined with a xenon purification and chiller system. A modified Bridgeman's technique reproduces a large scale optically transparent solid xenon.

  16. Scalable MPEG-4 Encoder on FPGA Multiprocessor SOC

    Directory of Open Access Journals (Sweden)

    Kulmala Ari

    2006-01-01

    High computational requirements combined with rapidly evolving video coding algorithms and standards are a great challenge for contemporary encoder implementations. Rapid specification changes prefer full programmability and configurability both for software and hardware. This paper presents a novel scalable MPEG-4 video encoder on an FPGA-based multiprocessor system-on-chip (MPSOC). The MPSOC architecture is truly scalable and is based on a vendor-independent intellectual property (IP) block interconnection network. The scalability in video encoding is achieved by spatial parallelization where images are divided into horizontal slices. A case design is presented with up to four synthesized processors on an Altera Stratix 1S40 device. A truly portable ANSI-C implementation that supports an arbitrary number of processors gives 11 QCIF frames/s at 50 MHz without processor specific optimizations. The parallelization efficiency is 97% for two processors and 93% with three. The FPGA utilization is 70%, requiring 28 797 logic elements. The implementation effort is significantly lower compared to traditional multiprocessor implementations.
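
    The quoted parallelization efficiencies can be turned into speedups over a single processor. The arithmetic below assumes the usual definition, efficiency = speedup / number of processors, which the abstract does not state; only the two figures given above are used.

```python
def speedup(efficiency: float, processors: int) -> float:
    """Speedup implied by a parallel efficiency, assuming efficiency = speedup / processors."""
    return efficiency * processors


# Efficiencies quoted in the abstract, keyed by processor count.
reported = {2: 0.97, 3: 0.93}
implied = {n: speedup(e, n) for n, e in reported.items()}

for n, s in implied.items():
    print(f"{n} processors: about {s:.2f}x over one processor")
```

    Under that assumption the encoder would run roughly 1.94 times faster on two processors and 2.79 times faster on three.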

  17. Scalable MPEG-4 Encoder on FPGA Multiprocessor SOC

    Directory of Open Access Journals (Sweden)

    Marko Hännikäinen

    2006-10-01

    High computational requirements combined with rapidly evolving video coding algorithms and standards are a great challenge for contemporary encoder implementations. Rapid specification changes prefer full programmability and configurability both for software and hardware. This paper presents a novel scalable MPEG-4 video encoder on an FPGA-based multiprocessor system-on-chip (MPSOC). The MPSOC architecture is truly scalable and is based on a vendor-independent intellectual property (IP) block interconnection network. The scalability in video encoding is achieved by spatial parallelization where images are divided into horizontal slices. A case design is presented with up to four synthesized processors on an Altera Stratix 1S40 device. A truly portable ANSI-C implementation that supports an arbitrary number of processors gives 11 QCIF frames/s at 50 MHz without processor specific optimizations. The parallelization efficiency is 97% for two processors and 93% with three. The FPGA utilization is 70%, requiring 28 797 logic elements. The implementation effort is significantly lower compared to traditional multiprocessor implementations.

  18. Hardware for dynamic quantum computing.

    Science.gov (United States)

    Ryan, Colm A; Johnson, Blake R; Ristè, Diego; Donovan, Brian; Ohki, Thomas A

    2017-10-01

    We describe the hardware, gateware, and software developed at Raytheon BBN Technologies for dynamic quantum information processing experiments on superconducting qubits. In dynamic experiments, real-time qubit state information is fed back or fed forward within a fraction of the qubits' coherence time to dynamically change the implemented sequence. The hardware presented here covers both control and readout of superconducting qubits. For readout, we created a custom signal processing gateware and software stack on commercial hardware to convert pulses in a heterodyne receiver into qubit state assignments with minimal latency, alongside data taking capability. For control, we developed custom hardware with gateware and software for pulse sequencing and steering information distribution that is capable of arbitrary control flow in a fraction of superconducting qubit coherence times. Both readout and control platforms make extensive use of field programmable gate arrays to enable tailored qubit control systems in a reconfigurable fabric suitable for iterative development.

  19. Hardware for dynamic quantum computing

    Science.gov (United States)

    Ryan, Colm A.; Johnson, Blake R.; Ristè, Diego; Donovan, Brian; Ohki, Thomas A.

    2017-10-01

    We describe the hardware, gateware, and software developed at Raytheon BBN Technologies for dynamic quantum information processing experiments on superconducting qubits. In dynamic experiments, real-time qubit state information is fed back or fed forward within a fraction of the qubits' coherence time to dynamically change the implemented sequence. The hardware presented here covers both control and readout of superconducting qubits. For readout, we created a custom signal processing gateware and software stack on commercial hardware to convert pulses in a heterodyne receiver into qubit state assignments with minimal latency, alongside data taking capability. For control, we developed custom hardware with gateware and software for pulse sequencing and steering information distribution that is capable of arbitrary control flow in a fraction of superconducting qubit coherence times. Both readout and control platforms make extensive use of field programmable gate arrays to enable tailored qubit control systems in a reconfigurable fabric suitable for iterative development.

  20. NDAS Hardware Translation Layer Development

    Science.gov (United States)

    Nazaretian, Ryan N.; Holladay, Wendy T.

    2011-01-01

    The NASA Data Acquisition System (NDAS) project aims to replace all DAS software for NASA's Rocket Testing Facilities. There must be a software-hardware translation layer so the software can properly talk to the hardware. Since the hardware from each test stand varies, drivers for each stand have to be made. These drivers will act more like plugins for the software. If the software is being used in E3, then the software should point to the E3 driver package. If the software is being used at B2, then the software should point to the B2 driver package. The driver packages should also be filled with hardware drivers that are universal to the DAS system. For example, since A1, A2, and B2 all use the Preston 8300AU signal conditioners, the driver for those three stands should be the same and updated collectively.

  1. Hardware Middleware for Person Tracking on Embedded Distributed Smart Cameras

    Directory of Open Access Journals (Sweden)

    Ali Akbar Zarezadeh

    2012-01-01

    Tracking individuals is a prominent application in domains such as surveillance or smart environments. This paper provides a development of a multiple camera setup with jointed view that observes moving persons in a site. It focuses on a geometry-based approach to establish correspondence among different views. The expensive computational parts of the tracker are hardware accelerated via a novel system-on-chip (SoC) design. In conjunction with this vision application, a hardware object request broker (ORB) middleware is presented as the underlying communication system. The hardware ORB provides a hardware/software architecture to achieve real-time intercommunication among multiple smart cameras. Via a probing mechanism, a performance analysis is performed to measure network latencies, that is, time traversing the TCP/IP stack, in both software and hardware ORB approaches on the same smart camera platform. The empirical results show that using the proposed hardware ORB as client and server in separate smart camera nodes will reduce the network latency by up to 100 times compared to the software ORB.

  2. An FPGA-based quench detection and protection system for superconducting accelerator magnets

    Energy Technology Data Exchange (ETDEWEB)

    Carcagno, R.H.; Feher, S.; Lamm, M.; Makulski, A.; Nehring, R.; Orris, D.F.; Pischalnikov, Y.; Tartaglia, M.; /Fermilab

    2005-05-01

    A new quench detection and protection system for superconducting accelerator magnets was developed for Fermilab's Magnet Test Facility (MTF). This system is based on a Field-Programmable Gate Array (FPGA) module, and it is made of mostly commercially available, integrated hardware and software components. It provides all the functions of our existing VME-based quench detection and protection system, but in addition the new system is easily scalable to protect multiple magnets powered independently and offers a more powerful user interface and analysis tools. The new system has been used successfully for testing LHC Interaction Region Quadrupoles correctors and High Field Magnet HFDM04. In this paper we describe the system and present results.

  3. Diamon2- Improved Monitoring of CERN’s Accelerator Controls Infrastructure

    CERN Document Server

    Buczak, W; Ehm, F; Jurcso, P; Mitev, M

    2014-01-01

    Monitoring of heterogeneous systems in large organizations like CERN is always challenging. CERN's accelerator infrastructure includes a large amount of equipment (servers, consoles, FECs, PLCs), some still running legacy software like LynxOS 4 or Red Hat Enterprise Linux 4 on older hardware with very limited resources. DIAMON2 is based on the CERN Common Monitoring platform. Using Java industry standards, notably Spring, Ehcache and the Java Message Service, together with a small-footprint C++-based monitoring agent for real-time systems and a wide variety of additional data acquisition components (SNMP, JMS, JMX etc.), DIAMON2 targets CERN’s environment, providing an easily extensible, dynamically reconfigurable, reliable and scalable monitoring solution. This article explains the evolution of the CERN diagnostics and monitoring environment until DIAMON2, describes the overall system’s architecture, main components and their functionality as well as the first operational experiences with the new system, observed...

  4. Quality scalable video data stream

    OpenAIRE

    Wiegand, T.; Kirchhoffer, H.; Schwarz, H

    2008-01-01

    An apparatus for generating a quality-scalable video data stream (36) is described which comprises means (42) for coding a video signal (18) using block-wise transformation to obtain transform blocks (146, 148) of transformation coefficient values for a picture (140) of the video signal, a predetermined scan order (154, 156, 164, 166) with possible scan positions being defined among the transformation coefficient values within the transform blocks so that in each transform block, for each pos...

  5. Accelerating Wavelet Lifting on Graphics Hardware Using CUDA

    NARCIS (Netherlands)

    Laan, Wladimir J. van der; Jalba, Andrei C.; Roerdink, Jos B.T.M.

    The Discrete Wavelet Transform (DWT) has a wide range of applications from signal processing to video and image compression. We show that this transform, by means of the lifting scheme, can be performed in a memory and computation-efficient way on modern, programmable GPUs, which can be regarded as
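
    The lifting scheme mentioned in this abstract builds a wavelet transform from three in-place steps: split the signal into even and odd samples, predict the odd samples from the even ones, and update the even samples with the prediction residuals. The sketch below shows one level of the Haar wavelet in that form; it is a serial CPU illustration in Python, not the authors' CUDA implementation, and the choice of the Haar wavelet and the function names are assumptions made for brevity.

```python
def haar_lifting_forward(signal):
    """One level of the Haar wavelet transform via lifting (split, predict, update)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    # Split into even- and odd-indexed samples.
    even = list(signal[0::2])
    odd = list(signal[1::2])
    # Predict: each odd sample is predicted by its even neighbour; keep the residual.
    detail = [o - e for o, e in zip(odd, even)]
    # Update: shift each even sample so that it holds the pairwise average.
    approx = [e + d / 2 for e, d in zip(even, detail)]
    return approx, detail


def haar_lifting_inverse(approx, detail):
    """Undo the lifting steps in reverse order with the signs flipped."""
    even = [a - d / 2 for a, d in zip(approx, detail)]
    odd = [d + e for d, e in zip(detail, even)]
    signal = []
    for e, o in zip(even, odd):
        signal.extend([e, o])
    return signal


samples = [2.0, 4.0, 6.0, 6.0, 5.0, 1.0, 0.0, 8.0]
approx, detail = haar_lifting_forward(samples)
restored = haar_lifting_inverse(approx, detail)
print(approx, detail, restored)
```

    Because every predict and update operation touches only one pair of samples, the loop bodies are independent of each other, which is what makes lifting attractive for data-parallel hardware.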

  6. Graphics hardware accelerated panorama builder for mobile phones

    Science.gov (United States)

    Bordallo López, Miguel; Hannuksela, Jari; Silvén, Olli; Vehviläinen, Markku

    2009-02-01

    Modern mobile communication devices frequently contain built-in cameras allowing users to capture high-resolution still images, but at the same time the imaging applications are facing both usability and throughput bottlenecks. The difficulties in taking ad hoc pictures of printed paper documents with multi-megapixel cellular phone cameras, a common business use case, illustrate these problems for anyone. The result can be examined only after several seconds, and is often blurry, so a new picture is needed, although the view-finder image had looked good. The process can be a frustrating one with waits and the user not being able to predict the quality beforehand. The problems can be traced to the processor speed and camera resolution mismatch, and application interactivity demands. In this context we analyze building mosaic images of printed documents from frames selected from VGA resolution (640x480 pixel) video. High interactivity is achieved by providing real-time feedback on the quality, while simultaneously guiding the user actions. The graphics processing unit of the mobile device can be used to speed up the reconstruction computations. To demonstrate the viability of the concept, we present an interactive document scanning application implemented on a Nokia N95 mobile phone.

  7. Accelerating ATM Optimization Algorithms Using High Performance Computing Hardware Project

    Data.gov (United States)

    National Aeronautics and Space Administration — NASA is developing algorithms and methodologies for efficient air-traffic management. Several researchers have adopted an optimization framework for solving problems...

  8. Accelerating ATM Optimization Algorithms Using High Performance Computing Hardware Project

    Data.gov (United States)

    National Aeronautics and Space Administration — NASA is developing algorithms and methodologies for efficient air-traffic management (ATM). Several researchers have adopted an optimization framework for solving...

  9. Parallelized Local Volatility Estimation Using GP-GPU Hardware Acceleration

    KAUST Repository

    Douglas, Craig C.

    2010-01-01

    We introduce an inverse problem for the local volatility model in option pricing. We solve the problem using the Levenberg-Marquardt algorithm and use the notion of the Fréchet derivative when calculating the Jacobian matrix. We analyze the existence of the Fréchet derivative and its numerical computation. To reduce the computational time of the inverse problem, a GP-GPU environment is considered for parallel computation. Numerical results confirm the validity and efficiency of the proposed method. ©2010 IEEE.

  10. Basket Option Pricing Using GP-GPU Hardware Acceleration

    KAUST Repository

    Douglas, Craig C.

    2010-08-01

    We introduce a basket option pricing problem arising in financial mathematics. We discretize the problem using the alternating direction implicit (ADI) method, and parallel cyclic reduction is applied to solve the set of tridiagonal systems generated by the ADI method. To reduce the computational time of the problem, a general purpose graphics processing unit (GP-GPU) environment is considered. Numerical results confirm the convergence and efficiency of the proposed method. © 2010 IEEE.
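
    Cyclic reduction solves a tridiagonal system by eliminating every other unknown, which leaves a tridiagonal system of half the size, and then back-substituting. The sketch below is a serial, recursive Python version intended only to show the algebra; it is not the authors' GP-GPU code, and the function name and test matrix are made up for illustration. It assumes a system that needs no pivoting (for example a diagonally dominant one) and that `a[0]` and `c[-1]` are zero.

```python
def cyclic_reduction(a, b, c, d):
    """Solve a tridiagonal system by recursive cyclic reduction.

    a: sub-diagonal (a[0] = 0), b: diagonal, c: super-diagonal (c[-1] = 0),
    d: right-hand side. No pivoting is performed.
    """
    n = len(b)
    if n == 1:
        return [d[0] / b[0]]
    ra, rb, rc, rd = [], [], [], []
    # Eliminate the even-indexed unknowns from each odd-indexed row.
    # The iterations are independent of one another.
    for i in range(1, n, 2):
        has_next = i + 1 < n
        alpha = a[i] / b[i - 1]
        gamma = c[i] / b[i + 1] if has_next else 0.0
        ra.append(-alpha * a[i - 1])
        rb.append(b[i] - alpha * c[i - 1] - (gamma * a[i + 1] if has_next else 0.0))
        rc.append(-gamma * c[i + 1] if has_next else 0.0)
        rd.append(d[i] - alpha * d[i - 1] - (gamma * d[i + 1] if has_next else 0.0))
    x = [0.0] * n
    # Solve the half-size system for the odd-indexed unknowns.
    x[1::2] = cyclic_reduction(ra, rb, rc, rd)
    # Back-substitute for the even-indexed unknowns.
    for i in range(0, n, 2):
        left = a[i] * x[i - 1] if i > 0 else 0.0
        right = c[i] * x[i + 1] if i + 1 < n else 0.0
        x[i] = (d[i] - left - right) / b[i]
    return x


# Illustrative diagonally dominant system with a known solution.
n = 7
a = [0.0] + [-1.0] * (n - 1)
b = [4.0] * n
c = [-1.0] * (n - 1) + [0.0]
x_true = [float(k) for k in range(1, n + 1)]
d = [
    (a[i] * x_true[i - 1] if i > 0 else 0.0)
    + b[i] * x_true[i]
    + (c[i] * x_true[i + 1] if i + 1 < n else 0.0)
    for i in range(n)
]
x = cyclic_reduction(a, b, c, d)
print(x)
```

    In a parallel setting the elimination loop and the back-substitution loop could each be spread across threads, since no iteration depends on another within the same level.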

  11. Apple-CORE: Microgrids of SVP cores: flexible, general-purpose, fine-grained hardware concurrency management

    NARCIS (Netherlands)

    Poss, R.; Lankamp, M.; Yang, Q.; Fu, J.; van Tol, M.W.; Jesshope, C.; Nair, S.

    2012-01-01

    To harness the potential of CMPs for scalable, energy-efficient performance in general-purpose computers, the Apple-CORE project has co-designed a general machine model and concurrency control interface with dedicated hardware support for concurrency control across multiple cores. Its SVP interface

  12. Palacios and Kitten : high performance operating systems for scalable virtualized and native supercomputing.

    Energy Technology Data Exchange (ETDEWEB)

    Widener, Patrick (University of New Mexico); Jaconette, Steven (Northwestern University); Bridges, Patrick G. (University of New Mexico); Xia, Lei (Northwestern University); Dinda, Peter (Northwestern University); Cui, Zheng.; Lange, John (Northwestern University); Hudson, Trammell B.; Levenhagen, Michael J.; Pedretti, Kevin Thomas Tauke; Brightwell, Ronald Brian

    2009-09-01

    Palacios and Kitten are new open source tools that enable applications, whether ported or not, to achieve scalable high performance on large machines. They provide a thin layer over the hardware to support both full-featured virtualized environments and native code bases. Kitten is an OS under development at Sandia that implements a lightweight kernel architecture to provide predictable behavior and increased flexibility on large machines, while also providing Linux binary compatibility. Palacios is a VMM that is under development at Northwestern University and the University of New Mexico. Palacios, which can be embedded into Kitten and other OSes, supports existing, unmodified applications and operating systems by using virtualization that leverages hardware technologies. We describe the design and implementation of both Kitten and Palacios. Our benchmarks show that they provide near native, scalable performance. Palacios and Kitten provide an incremental path to using supercomputer resources that is not performance-compromised.

  13. GPU-accelerated Tersoff potentials for massively parallel Molecular Dynamics simulations

    Science.gov (United States)

    Nguyen, Trung Dac

    2017-03-01

    The Tersoff potential is one of the empirical many-body potentials that has been widely used in simulation studies at atomic scales. Unlike pair-wise potentials, the Tersoff potential involves three-body terms, which require many more arithmetic operations and introduce more data dependencies. In this contribution, we have implemented the GPU-accelerated version of several variants of the Tersoff potential for LAMMPS, an open-source massively parallel Molecular Dynamics code. Compared to the existing MPI implementation in LAMMPS, the GPU implementation exhibits better scalability and offers a speedup of 2.2X when run on 1000 compute nodes on the Titan supercomputer. On a single node, the speedup ranges from 2.0 to 8.0 times, depending on the number of atoms per GPU and hardware configurations. The most notable features of our GPU-accelerated version include its design for MPI/accelerator heterogeneous parallelism, its compatibility with other functionalities in LAMMPS, its ability to give deterministic results and to support both NVIDIA CUDA- and OpenCL-enabled accelerators. Our implementation is now part of the GPU package in LAMMPS and accessible for public use.

  14. Advanced technologies for scalable ATLAS conditions database access on the grid

    CERN Document Server

    Basset, R; Dimitrov, G; Girone, M; Hawkings, R; Nevski, P; Valassi, A; Vaniachine, A; Viegas, F; Walker, R; Wong, A

    2010-01-01

    During massive data reprocessing operations an ATLAS Conditions Database application must support concurrent access from numerous ATLAS data processing jobs running on the Grid. By simulating realistic work-flow, ATLAS database scalability tests provided feedback for Conditions Db software optimization and allowed precise determination of required distributed database resources. In distributed data processing one must take into account the chaotic nature of Grid computing characterized by peak loads, which can be much higher than average access rates. To validate database performance at peak loads, we tested database scalability at very high concurrent job rates. This has been achieved through coordinated database stress tests performed in a series of ATLAS reprocessing exercises at the Tier-1 sites. The goal of database stress tests is to detect scalability limits of the hardware deployed at the Tier-1 sites, so that the server overload conditions can be safely avoided in a production environment. Our analysi...

  15. Raspberry Pi hardware projects 1

    CERN Document Server

    Robinson, Andrew

    2013-01-01

    Learn how to take full advantage of all of Raspberry Pi's amazing features and functions-and have a blast doing it! Congratulations on becoming a proud owner of a Raspberry Pi, the credit-card-sized computer! If you're ready to dive in and start finding out what this amazing little gizmo is really capable of, this ebook is for you. Taken from the forthcoming Raspberry Pi Projects, Raspberry Pi Hardware Projects 1 contains three cool hardware projects that let you have fun with the Raspberry Pi while developing your Raspberry Pi skills. The authors - PiFace inventor, Andrew Robinson and Rasp

  16. Highly Scalable Multiplication for Distributed Sparse Multivariate Polynomials on Many-core Systems

    OpenAIRE

    Gastineau, Mickael; Laskar, Jacques

    2013-01-01

    We present a highly scalable algorithm for multiplying sparse multivariate polynomials represented in a distributed format. This algorithm targets not only shared-memory multicore computers, but also computer clusters or specialized hardware attached to a host computer, such as graphics processing units or many-core coprocessors. The scalability on a large number of cores is ensured by the lack of synchronizations, locks and false sharing during the main parallel step.
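
    As a point of reference for the record above, the following is a minimal serial sketch of sparse multivariate polynomial multiplication. It assumes that "distributed format" means each polynomial is stored as a flat collection of (exponent tuple, coefficient) terms; the function name `multiply_sparse` and the dictionary layout are illustrative choices, and the sketch does not attempt to reproduce the authors' parallel, lock-free scheme.

```python
from collections import defaultdict


def multiply_sparse(p, q):
    """Multiply two sparse multivariate polynomials.

    Each polynomial is a dict mapping an exponent tuple to a nonzero
    coefficient, e.g. {(1, 0): 2, (0, 1): 3} represents 2*x + 3*y.
    (Hypothetical layout chosen for illustration.)
    """
    result = defaultdict(int)
    for exp_p, coef_p in p.items():
        for exp_q, coef_q in q.items():
            # Multiplying two monomials adds their exponents per variable.
            exp = tuple(a + b for a, b in zip(exp_p, exp_q))
            result[exp] += coef_p * coef_q
    # Drop terms that cancelled to zero so the result stays sparse.
    return {e: c for e, c in result.items() if c != 0}


# (x + y) * (x - y): the cross terms cancel, leaving x^2 - y^2.
p = {(1, 0): 1, (0, 1): 1}
q = {(1, 0): 1, (0, 1): -1}
product = multiply_sparse(p, q)
```

    Every pair of input terms contributes to exactly one output exponent, so a parallel version has to decide how output terms are partitioned among workers; the sketch leaves that question open.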

  17. A Novel Scalable Deblocking-Filter Architecture for H.264/AVC and SVC Video Codecs

    OpenAIRE

    Cervero, Teresa; Otero Marnotes, Andres; López, S.; Torre Arnanz, Eduardo de la; Gallicó, G.; Sarmiento, Roberto; Riesgo Alcaide, Teresa

    2011-01-01

    A highly parallel and scalable Deblocking Filter (DF) hardware architecture for H.264/AVC and SVC video codecs is presented in this paper. The proposed architecture mainly consists of a coarse grain systolic array obtained by replicating a unique and homogeneous Functional Unit (FU), in which a whole Deblocking-Filter unit is implemented. The proposal is also based on a novel macroblock-level parallelization strategy of the filtering algorithm which improves the final performance by exploitin...

  18. 16 CFR 1508.6 - Hardware.

    Science.gov (United States)

    2010-01-01

    ... 16 Commercial Practices 2 2010-01-01 2010-01-01 false Hardware. 1508.6 Section 1508.6 Commercial... FULL-SIZE BABY CRIBS § 1508.6 Hardware. (a) A crib shall be designed and constructed in a manner that eliminates from any hardware accessible to a child within the crib the possibility of the hardware's...

  19. Compact accelerator for medical therapy

    Energy Technology Data Exchange (ETDEWEB)

    Caporaso, George J.; Chen, Yu-Jiuan; Hawkins, Steven A.; Sampayan, Stephen E.; Paul, Arthur C.

    2010-05-04

    A compact accelerator system having an integrated particle generator-linear accelerator with a compact, small-scale construction capable of producing an energetic (~70-250 MeV) proton beam or other nuclei and transporting the beam direction to a medical therapy patient without the need for bending magnets or other hardware often required for remote beam transport. The integrated particle generator-accelerator is actuable as a unitary body on a support structure to enable scanning of a particle beam by direction actuation of the particle generator-accelerator.

  20. Physical principles for scalable neural recording.

    Science.gov (United States)

    Marblestone, Adam H; Zamft, Bradley M; Maguire, Yael G; Shapiro, Mikhail G; Cybulski, Thaddeus R; Glaser, Joshua I; Amodei, Dario; Stranges, P Benjamin; Kalhor, Reza; Dalrymple, David A; Seo, Dongjin; Alon, Elad; Maharbiz, Michel M; Carmena, Jose M; Rabaey, Jan M; Boyden, Edward S; Church, George M; Kording, Konrad P

    2013-01-01

    Simultaneously measuring the activities of all neurons in a mammalian brain at millisecond resolution is a challenge beyond the limits of existing techniques in neuroscience. Entirely new approaches may be required, motivating an analysis of the fundamental physical constraints on the problem. We outline the physical principles governing brain activity mapping using optical, electrical, magnetic resonance, and molecular modalities of neural recording. Focusing on the mouse brain, we analyze the scalability of each method, concentrating on the limitations imposed by spatiotemporal resolution, energy dissipation, and volume displacement. Based on this analysis, all existing approaches require orders of magnitude improvement in key parameters. Electrical recording is limited by the low multiplexing capacity of electrodes and their lack of intrinsic spatial resolution, optical methods are constrained by the scattering of visible light in brain tissue, magnetic resonance is hindered by the diffusion and relaxation timescales of water protons, and the implementation of molecular recording is complicated by the stochastic kinetics of enzymes. Understanding the physical limits of brain activity mapping may provide insight into opportunities for novel solutions. For example, unconventional methods for delivering electrodes may enable unprecedented numbers of recording sites, embedded optical devices could allow optical detectors to be placed within a few scattering lengths of the measured neurons, and new classes of molecularly engineered sensors might obviate cumbersome hardware architectures. We also study the physics of powering and communicating with microscale devices embedded in brain tissue and find that, while radio-frequency electromagnetic data transmission suffers from a severe power-bandwidth tradeoff, communication via infrared light or ultrasound may allow high data rates due to the possibility of spatial multiplexing. 
The use of embedded local recording and

  1. VALU, AVX and GPU acceleration techniques for parallel FDTD methods

    CERN Document Server

    Yu, Wenhua

    2013-01-01

    This book introduces a general hardware acceleration technique that can significantly speed up FDTD simulations and their applications to engineering problems without requiring any additional hardware devices. This acceleration of complex problems can be efficient in saving both time and money and once learned these new techniques can be used repeatedly.

  2. No-hardware-signature cybersecurity-crypto-module: a resilient cyber defense agent

    Science.gov (United States)

    Zaghloul, A. R. M.; Zaghloul, Y. A.

    2014-06-01

    We present an optical cybersecurity-crypto-module as a resilient cyber defense agent. It has no hardware signature since it is bitstream reconfigurable, where a single hardware architecture functions as any selected device of all possible ones of the same number of inputs. For a two-input digital device, a 4-digit bitstream of 0s and 1s determines which device, of a total of 16 devices, the hardware performs as. Accordingly, the hardware itself is not physically reconfigured, but its performance is. Such a defense agent allows the attack to take place, rendering it harmless. On the other hand, if the system is already infected with malware sending out information, the defense agent allows the information to go out, rendering it meaningless. The hardware architecture is immune to side attacks since such an attack would reveal information on the attack itself and not on the hardware. This cyber defense agent can be used to secure a point-to-point link, a point-to-multipoint link, a whole network, and/or a single entity in cyberspace, thereby ensuring trust between cyber resources. It can provide secure communication in an insecure network. We provide the hardware design and explain how it works. Scalability of the design is briefly discussed. (Protected by United States Patents No.: US 8,004,734; US 8,325,404; and other National Patents worldwide.)
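
    The counting argument in the record above (a 4-digit bitstream selecting one of 16 two-input devices) can be illustrated in software as a truth-table lookup. The sketch below is only a model of that idea, not the optical design; the function name `make_device` and the bit ordering (output for inputs a, b stored at index 2*a + b) are assumptions made for illustration.

```python
def make_device(bitstream):
    """Return a two-input logic function selected by a 4-character bitstream.

    Assumed convention: bitstream[2*a + b] is the output for inputs (a, b).
    """
    if len(bitstream) != 4 or set(bitstream) - {"0", "1"}:
        raise ValueError("bitstream must be four characters of 0s and 1s")
    table = [int(ch) for ch in bitstream]

    def device(a, b):
        # The same lookup serves every device; only the table contents differ.
        return table[2 * a + b]

    return device


AND = make_device("0001")
XOR = make_device("0110")
NAND = make_device("1110")

# Four binary digits give 2**4 = 16 distinct truth tables.
all_bitstreams = {format(n, "04b") for n in range(16)}
```

    Under this convention, switching from AND to XOR changes only the stored bitstream, which mirrors the abstract's statement that the hardware is not physically reconfigured while its behavior is.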

  3. Scalable Techniques for Formal Verification

    CERN Document Server

    Ray, Sandip

    2010-01-01

    This book presents state-of-the-art approaches to formal verification techniques to seamlessly integrate different formal verification methods within a single logical foundation. It should benefit researchers and practitioners looking to get a broad overview of the spectrum of formal verification techniques, as well as approaches to combining such techniques within a single framework. Coverage includes a range of case studies showing how such combination is fruitful in developing a scalable verification methodology for industrial designs. This book outlines both theoretical and practical issue

  4. Flexible scalable photonic manufacturing method

    Science.gov (United States)

    Skunes, Timothy A.; Case, Steven K.

    2003-06-01

    A process for flexible, scalable photonic manufacturing is described. Optical components are actively pre-aligned and secured to precision mounts. In a subsequent operation, the mounted optical components are passively placed onto a substrate known as an Optical Circuit Board (OCB). The passive placement may be either manual for low volume applications or with a pick-and-place robot for high volume applications. Mating registration features on the component mounts and the OCB facilitate accurate optical alignment. New photonic circuits may be created by changing the layout of the OCB. Predicted yield data from Monte Carlo tolerance simulations for two fiber optic photonic circuits is presented.

  5. Highly Scalable Matching Pursuit Signal Decomposition Algorithm

    Data.gov (United States)

    National Aeronautics and Space Administration — In this research, we propose a variant of the classical Matching Pursuit Decomposition (MPD) algorithm with significantly improved scalability and computational...

  6. Perceptual compressive sensing scalability in mobile video

    Science.gov (United States)

    Bivolarski, Lazar

    2011-09-01

    Scalability features embedded within video sequences allow for streaming over heterogeneous networks to a variety of end devices. Compressive sensing techniques that lower the complexity and increase the robustness of video scalability are reviewed. Human visual system models are often used in establishing perceptual metrics that evaluate the quality of video. The combination of perceptual and compressive sensing approaches is outlined from recent investigations. The performance and the complexity of different scalability techniques are evaluated. The application of perceptual models to evaluating the quality of compressive sensing scalability in the near perceptually lossless case, and to the appropriate coding schemes, is reviewed.

  7. Scalable Quantum Circuit and Control for a Superconducting Surface Code

    Science.gov (United States)

    Versluis, R.; Poletto, S.; Khammassi, N.; Tarasinski, B.; Haider, N.; Michalak, D. J.; Bruno, A.; Bertels, K.; DiCarlo, L.

    2017-09-01

    We present a scalable scheme for executing the error-correction cycle of a monolithic surface-code fabric composed of fast-flux-tunable transmon qubits with nearest-neighbor coupling. An eight-qubit unit cell forms the basis for repeating both the quantum hardware and coherent control, enabling spatial multiplexing. This control uses three fixed frequencies for all single-qubit gates and a unique frequency-detuning pattern for each qubit in the cell. By pipelining the interaction and readout steps of ancilla-based X- and Z-type stabilizer measurements, we can engineer detuning patterns that avoid all second-order transmon-transmon interactions except those exploited in controlled-phase gates, regardless of fabric size. Our scheme is applicable to defect-based and planar logical qubits, including lattice surgery.

  8. The principles of computer hardware

    CERN Document Server

    Clements, Alan

    2000-01-01

    Principles of Computer Hardware, now in its third edition, provides a first course in computer architecture or computer organization for undergraduates. The book covers the core topics of such a course, including Boolean algebra and logic design; number bases and binary arithmetic; the CPU; assembly language; memory systems; and input/output methods and devices. It then goes on to cover the related topics of computer peripherals such as printers; the hardware aspects of the operating system; and data communications, and hence provides a broader overview of the subject. Its readable, tutorial-based approach makes it an accessible introduction to the subject. The book has extensive in-depth coverage of two microprocessors, one of which (the 68000) is widely used in education. All chapters in the new edition have been updated. Major updates include: powerful software simulations of digital systems to accompany the chapters on digital design; a tutorial-based introduction to assembly language, including many exam...

  9. Performance and Scalability Evaluation of the Ceph Parallel File System

    Energy Technology Data Exchange (ETDEWEB)

    Wang, Feiyi [ORNL; Nelson, Mark [Inktank Storage, Inc.; Oral, H Sarp [ORNL; Settlemyer, Bradley W [ORNL; Atchley, Scott [ORNL; Caldwell, Blake A [ORNL; Hill, Jason J [ORNL

    2013-01-01

    Ceph is an open-source and emerging parallel distributed file and storage system technology. By design, Ceph assumes that it runs on unreliable commodity storage and network hardware and provides reliability and fault-tolerance through controlled object placement and data replication. We evaluated the Ceph technology for scientific high-performance computing (HPC) environments. This paper presents our evaluation methodology, experiments, results and observations from mostly parallel I/O performance and scalability perspectives. Our work made two unique contributions. First, our evaluation is performed under a realistic setup for a large-scale capability HPC environment using a commercial high-end storage system. Second, our path of investigation, tuning efforts, and findings made direct contributions to Ceph's development and improved code quality, scalability, and performance. These changes should also benefit both Ceph and HPC communities at large. Throughout the evaluation, we observed that Ceph is still an evolving technology under fast-paced development and shows great promise.

  10. Selection Criteria for Computer Software and Hardware: A Case Study of Six University Libraries in Nigeria

    Directory of Open Access Journals (Sweden)

    Udoh-Ilomechine Queenette

    2011-12-01

    This paper investigates the criteria used in the selection of computer hardware and software in six university libraries in Nigeria. Six (6) copies of a questionnaire were sent to selected librarians in Edo and Delta states, Nigeria. All copies of the questionnaire were retrieved. The data collected were analyzed. The findings reveal that the respondents took into consideration such factors as memory, speed, capacity, durability, costs, reliability and standardization, brand and manufacturer, warranty, and scalability of the system before procuring computer hardware. The respondents also take into consideration the reliability and track record of the vendor, service and technical support, previews or sample sections, compatibility with other programs being used, product cost, and data migration before procuring computer software. It is also noteworthy that the respondents have encountered electricity failures, improper implementation, and difficulty finding qualified personnel to maintain and/or repair computer hardware when it breaks down.

  11. Scalable Performance Measurement and Analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gamblin, Todd [Univ. of North Carolina, Chapel Hill, NC (United States)

    2009-01-01

    Concurrency levels in large-scale, distributed-memory supercomputers are rising exponentially. Modern machines may contain 100,000 or more microprocessor cores, and the largest of these, IBM's Blue Gene/L, contains over 200,000 cores. Future systems are expected to support millions of concurrent tasks. In this dissertation, we focus on efficient techniques for measuring and analyzing the performance of applications running on very large parallel machines. Tuning the performance of large-scale applications can be a subtle and time-consuming task because application developers must measure and interpret data from many independent processes. While the volume of the raw data scales linearly with the number of tasks in the running system, the number of tasks is growing exponentially, and data for even small systems quickly becomes unmanageable. Transporting performance data from so many processes over a network can perturb application performance and make measurements inaccurate, and storing such data would require a prohibitive amount of space. Moreover, even if it were stored, analyzing the data would be extremely time-consuming. In this dissertation, we present novel methods for reducing performance data volume. The first draws on multi-scale wavelet techniques from signal processing to compress systemwide, time-varying load-balance data. The second uses statistical sampling to select a small subset of running processes to generate low-volume traces. A third approach combines sampling and wavelet compression to stratify performance data adaptively at run-time and to reduce further the cost of sampled tracing. We have integrated these approaches into Libra, a toolset for scalable load-balance analysis. We present Libra and show how it can be used to analyze data from large scientific applications scalably.

  12. Accelerating Cardiac Bidomain Simulations Using Graphics Processing Units

    Science.gov (United States)

    Neic, Aurel; Liebmann, Manfred; Hoetzl, Elena; Mitchell, Lawrence; Vigmond, Edward J.; Haase, Gundolf

    2013-01-01

    Anatomically realistic and biophysically detailed multiscale computer models of the heart are playing an increasingly important role in advancing our understanding of integrated cardiac function in health and disease. Such detailed simulations, however, are computationally vastly demanding, which is a limiting factor for a wider adoption of in-silico modeling. While current trends in high-performance computing (HPC) hardware promise to alleviate this problem, exploiting the potential of such architectures remains challenging since strongly scalable algorithms are necessitated to reduce execution times. Alternatively, acceleration technologies such as graphics processing units (GPUs) are being considered. While the potential of GPUs has been demonstrated in various applications, benefits in the context of bidomain simulations where large sparse linear systems have to be solved in parallel with advanced numerical techniques are less clear. In this study, the feasibility of multi-GPU bidomain simulations is demonstrated by running strong scalability benchmarks using a state-of-the-art model of rabbit ventricles. The model is spatially discretized using the finite element method (FEM) on fully unstructured grids. The GPU code is directly derived from a large pre-existing code, the Cardiac Arrhythmia Research Package (CARP), with very minor perturbation of the code base. Overall, bidomain simulations were sped up by a factor of 11.8 to 16.3 in benchmarks running on 6–20 GPUs compared to the same number of CPU cores. To match the fastest GPU simulation, which engaged 20 GPUs, 476 CPU cores were required on a national supercomputing facility. PMID:22692867

  13. Hardware and software reliability estimation using simulations

    Science.gov (United States)

    Swern, Frederic L.

    1994-01-01

    The simulation technique is used to explore the validation of both hardware and software. It was concluded that simulation is a viable means for validating both hardware and software and associating a reliability number with each. This is useful in determining the overall probability of system failure of an embedded processor unit, and improving both the code and the hardware where necessary to meet reliability requirements. The methodologies were proved using some simple programs, and simple hardware models.

  14. Accelerating Dense Linear Algebra on the GPU

    DEFF Research Database (Denmark)

    Sørensen, Hans Henrik Brandenborg

    GPUs have already become an integral part of high performance scientific computing, since they offer dedicated parallel hardware that can potentially accelerate the execution of many scientific applications. In this talk, I will consider the automatic performance acceleration of dense vector...... and matrix-vector operations on GPUs. Such operations form the backbone of level 1 and level 2 routines in the Basic Linear Algebra Subroutines (BLAS) library and are therefore of great importance in many scientific applications. The target hardware is the most recent NVIDIA Tesla 20-series (Fermi...... architecture). Most of the techniques I discuss for accelerating dense linear algebra are applicable to memory-bound GPU algorithms in general....

  15. 16 CFR 1509.7 - Hardware.

    Science.gov (United States)

    2010-01-01

    ... 16 Commercial Practices 2 2010-01-01 2010-01-01 false Hardware. 1509.7 Section 1509.7 Commercial Practices CONSUMER PRODUCT SAFETY COMMISSION FEDERAL HAZARDOUS SUBSTANCES ACT REGULATIONS REQUIREMENTS FOR NON-FULL-SIZE BABY CRIBS § 1509.7 Hardware. (a) The hardware in a non-full-size baby crib shall be...

  16. GENI: Grid Hardware and Software

    Energy Technology Data Exchange (ETDEWEB)

    None

    2012-01-09

    GENI Project: The 15 projects in ARPA-E’s GENI program, short for “Green Electricity Network Integration,” aim to modernize the way electricity is transmitted in the U.S. through advances in hardware and software for the electric grid. These advances will improve the efficiency and reliability of electricity transmission, increase the amount of renewable energy the grid can utilize, and provide energy suppliers and consumers with greater control over their power flows in order to better manage peak power demand and cost.

  17. Preparing the hardware of the CMS Electromagnetic Calorimeter control and safety systems for LHC Run 2

    CERN Document Server

    AUTHOR|(CDS)2068025; Di Calafiori, D.; Cirkovic, P.; Dissertori, G.; Djambazov, L.; Jovanovic, D.; Lustermann, W.; Zelepoukine, S.

    2016-01-01

    The Detector Control System of the CMS Electromagnetic Calorimeter has undergone significant improvements during the first LHC Long Shutdown. Based on the experience acquired during the first period of physics data taking of the LHC, several hardware projects were carried out to improve data accuracy, to minimise the impact of failures and to extend remote control possibilities in order to accelerate recovery from problematic situations. This paper outlines the hardware of the detector control and safety systems and explains in detail the requirements, design and commissioning of the new hardware projects.

  18. Hardware complications in scoliosis surgery

    Energy Technology Data Exchange (ETDEWEB)

    Bagchi, Kaushik; Mohaideen, Ahamed [Department of Orthopaedic Surgery and Musculoskeletal Services, Maimonides Medical Center, Brooklyn, NY (United States); Thomson, Jeffrey D. [Connecticut Children's Medical Center, Department of Orthopaedics, Hartford, CT (United States); Foley, Christopher L. [Department of Radiology, Connecticut Children's Medical Center, Hartford, Connecticut (United States)

    2002-07-01

    Background: Scoliosis surgery has undergone a dramatic evolution over the past 20 years with the advent of new surgical techniques and sophisticated instrumentation. Surgeons have realized scoliosis is a complex multiplanar deformity that requires thorough knowledge of spinal anatomy and pathophysiology in order to manage patients afflicted by it. Nonoperative modalities such as bracing and casting still play roles in the treatment of scoliosis; however, it is the operative treatment that has revolutionized the treatment of this deformity that affects millions worldwide. As part of the evolution of scoliosis surgery, newer implants have resulted in improved outcomes with respect to deformity correction, reliability of fixation, and paucity of complications. Each technique and implant has its own set of unique complications, and the surgeon must appreciate these when planning surgery. Materials and methods: Various surgical techniques and types of instrumentation typically used in scoliosis surgery are briefly discussed. Though scoliosis surgery is associated with a wide variety of complications, only those that directly involve the hardware are discussed. The current literature is reviewed and several illustrative cases of patients treated for scoliosis at the Connecticut Children's Medical Center and the Newington Children's Hospital in Connecticut are briefly presented. Conclusion: Spine surgeons and radiologists should be familiar with the different types of instrumentation in the treatment of scoliosis. Furthermore, they should recognize the clinical and roentgenographic signs of hardware failure as part of prompt and effective treatment of such complications. (orig.)

  19. Quality Scalability Aware Watermarking for Visual Content.

    Science.gov (United States)

    Bhowmik, Deepayan; Abhayaratne, Charith

    2016-11-01

    Scalable coding-based content adaptation poses serious challenges to traditional watermarking algorithms, which do not consider the scalable coding structure and hence cannot guarantee correct watermark extraction in the media consumption chain. In this paper, we propose a novel concept of scalable blind watermarking that ensures more robust watermark extraction at various compression ratios while not affecting the visual quality of host media. The proposed algorithm generates a scalable and robust watermarked image code-stream that allows the user to constrain embedding distortion for target content adaptations. The watermarked image code-stream consists of hierarchically nested joint distortion-robustness coding atoms. The code-stream is generated by proposing a new wavelet domain blind watermarking algorithm guided by a quantization based binary tree. The code-stream can be truncated at any distortion-robustness atom to generate the watermarked image with the desired distortion-robustness requirements. A blind extractor is capable of extracting watermark data from the watermarked images. The algorithm is further extended to incorporate a bit-plane discarding-based quantization model used in scalable coding-based content adaptation, e.g., JPEG2000. This improves the robustness against quality scalability of JPEG2000 compression. The simulation results verify the feasibility of the proposed concept, its applications, and its improved robustness against quality scalable content adaptation. Our proposed algorithm also outperforms existing methods showing 35% improvement. In terms of robustness to quality scalable video content adaptation using Motion JPEG2000 and wavelet-based scalable video coding, the proposed method shows major improvement for video watermarking.

  20. Imaging of current spinal hardware: lumbar spine.

    Science.gov (United States)

    Ha, Alice S; Petscavage-Thomas, Jonelle M

    2014-09-01

    The purposes of this article are to review the indications for and the materials and designs of hardware more commonly used in the lumbar spine; to discuss alternatives for each of the types of hardware; to review normal postoperative imaging findings; to describe the appropriateness of different imaging modalities for postoperative evaluation; and to show examples of hardware complications. Stabilization and fusion of the lumbar spine with intervertebral disk replacement, artificial ligaments, spinous process distraction devices, plate-and-rod systems, dynamic posterior fusion devices, and newer types of material incorporation are increasingly more common in contemporary surgical practice. These spinal hardware devices will be seen more often in radiology practice. Successful postoperative radiologic evaluation of this spinal hardware necessitates an understanding of fundamental hardware design, physiologic objectives, normal postoperative imaging appearances, and unique complications. Radiologists may have little training and experience with the new and modified types of hardware used in the lumbar spine.

  1. Fast and Reliable Mouse Picking Using Graphics Hardware

    Directory of Open Access Journals (Sweden)

    Hanli Zhao

    2009-01-01

    Mouse picking is the most commonly used intuitive operation to interact with 3D scenes in a variety of 3D graphics applications. High performance for such an operation is necessary in order to provide users with fast responses. This paper proposes a fast and reliable mouse picking algorithm using graphics hardware for 3D triangular scenes. Our approach uses a multi-layer rendering algorithm to perform the picking operation in linear time complexity. The object-space based ray-triangle intersection test is implemented in a highly parallelized geometry shader. After applying the hardware-supported occlusion queries, only a small number of objects (or sub-objects) are rendered in subsequent layers, which accelerates the picking efficiency. Experimental results demonstrate the high performance of our novel approach. Due to its simplicity, our algorithm can be easily integrated into existing real-time rendering systems.
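
    The record above relies on an object-space ray-triangle intersection test. The abstract does not say which test the authors use, so the sketch below substitutes the well-known Möller-Trumbore test, written as plain CPU code rather than a geometry shader; the helper names (`ray_triangle`, `pick`) are hypothetical.

```python
def sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def ray_triangle(origin, direction, v0, v1, v2, eps=1e-9):
    """Moller-Trumbore test: distance t along the ray to the hit, or None."""
    e1, e2 = sub(v1, v0), sub(v2, v0)
    pvec = cross(direction, e2)
    det = dot(e1, pvec)
    if abs(det) < eps:
        return None  # Ray is parallel to the triangle's plane.
    inv_det = 1.0 / det
    tvec = sub(origin, v0)
    u = dot(tvec, pvec) * inv_det  # First barycentric coordinate.
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, e1)
    v = dot(direction, qvec) * inv_det  # Second barycentric coordinate.
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(e2, qvec) * inv_det
    return t if t > eps else None  # Ignore hits behind the ray origin.


def pick(origin, direction, triangles):
    """Return the index of the nearest triangle hit by the ray, or None."""
    best = None
    for i, (a, b, c) in enumerate(triangles):
        t = ray_triangle(origin, direction, a, b, c)
        if t is not None and (best is None or t < best[0]):
            best = (t, i)
    return None if best is None else best[1]


# Two parallel triangles; the ray looks down the -z axis from z = 1.
far_tri = ((0, 0, 0), (1, 0, 0), (0, 1, 0))
near_tri = ((0, 0, 0.5), (1, 0, 0.5), (0, 1, 0.5))
picked = pick((0.25, 0.25, 1.0), (0, 0, -1), [far_tri, near_tri])
```

    The loop in `pick` tests every triangle; the paper's multi-layer rendering and occlusion queries are aimed at avoiding exactly that exhaustive scan, which this sketch does not model.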

  2. Reconfigurable hardware-software codesign methodology for protein identification.

    Science.gov (United States)

    Gudur, Venkateshwarlu Y; Thallada, Sandeep; Deevi, Abhinay R; Gande, Venkata Krishna; Acharyya, Amit; Bhandari, Vasundhra; Sharma, Paresh; Khursheed, Saqib; Naik, Ganesh R

    2016-08-01

    In this paper we propose an on-the-fly reconfigurable hardware-software codesign solution for real-time protein identification. Reconfigurable string matching is performed in the disciplines of protein identification and biomarker discovery. With the generation of a plethora of sequenced data and a growing number of biomarkers for several diseases, it is becoming necessary to have accelerated processing and an on-the-fly reconfigurable system design methodology that brings flexibility to its usage in the medical science community without the need to change the entire hardware every time a new biomarker or protein appears. The human proteome database at UniProtKB (Proteome ID up000005640), comprising 42132 canonical and isoform proteins with variable database size, is used for testing the proposed design, and the performance of the proposed system has been found to compare favorably with the state-of-the-art approaches, with the additional advantage of real-time reconfigurability.

  3. Grassmann Averages for Scalable Robust PCA

    DEFF Research Database (Denmark)

    Hauberg, Søren; Feragen, Aasa; Black, Michael J.

    2014-01-01

    As the collection of large datasets becomes increasingly automated, the occurrence of outliers will increase—“big data” implies “big outliers”. While principal component analysis (PCA) is often used to reduce the size of data, and scalable solutions exist, it is well-known that outliers can arbit......, making it scalable to “big noisy data.” We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie....

  4. Fast and scalable inequality joins

    KAUST Repository

    Khayyat, Zuhair

    2016-09-07

    Inequality joins, which join relations on inequality conditions, are used in various applications. Optimizing joins has been the subject of intensive research ranging from efficient join algorithms such as sort-merge join, to the use of efficient indices such as B+-tree, R*-tree and Bitmap. However, inequality joins have received little attention and queries containing such joins are notably very slow. In this paper, we introduce fast inequality join algorithms based on sorted arrays and space-efficient bit-arrays. We further introduce a simple method to estimate the selectivity of inequality joins which is then used to optimize multiple predicate queries and multi-way joins. Moreover, we study an incremental inequality join algorithm to handle scenarios where data keeps changing. We have implemented a centralized version of these algorithms on top of PostgreSQL, a distributed version on top of Spark SQL, and an existing data cleaning system, Nadeef. By comparing our algorithms against well-known optimization techniques for inequality joins, we show our solution is more scalable and several orders of magnitude faster. © 2016 Springer-Verlag Berlin Heidelberg
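
    To make the notion of an inequality join concrete, here is a minimal sketch that joins two lists of numbers on the single predicate r < s, using one sorted array and binary search. It is deliberately much simpler than the algorithms the record describes (no bit-arrays, one predicate only), and the function name `inequality_join` is hypothetical.

```python
from bisect import bisect_right


def inequality_join(r_values, s_values):
    """Return every pair (r, s) with r < s.

    Sorting s once lets each r locate its first qualifying partner by
    binary search instead of scanning all of s.
    """
    s_sorted = sorted(s_values)
    out = []
    for r in r_values:
        # Index of the first element strictly greater than r.
        start = bisect_right(s_sorted, r)
        out.extend((r, s) for s in s_sorted[start:])
    return out


pairs = inequality_join([1, 5, 9], [3, 5, 7])
# A nested-loop join gives the reference answer for comparison.
reference = [(r, s) for r in [1, 5, 9] for s in [3, 5, 7] if r < s]
```

    The binary search removes the per-row scan for the starting point, although the output itself can still be quadratic in size when most pairs satisfy the predicate.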

  5. Scalable encryption using alpha rooting

    Science.gov (United States)

    Wharton, Eric J.; Panetta, Karen A.; Agaian, Sos S.

    2008-04-01

    Full and partial encryption methods are important for subscription based content providers, such as internet and cable TV pay channels. Providers need to be able to protect their products while at the same time being able to provide demonstrations to attract new customers without giving away the full value of the content. If an algorithm were introduced which could provide any level of full or partial encryption in a fast and cost effective manner, the applications to real-time commercial implementation would be numerous. In this paper, we present a novel application of alpha rooting, using it to achieve fast and straightforward scalable encryption with a single algorithm. We further present use of the measure of enhancement, the Logarithmic AME, to select optimal parameters for the partial encryption. When parameters are selected using the measure, the output image achieves a balance between protecting the important data in the image while still containing a good overall representation of the image. We will show results for this encryption method on a number of images, using histograms to evaluate the effectiveness of the encryption.
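
    As a rough illustration of alpha rooting itself, the sketch below assumes the common definition in which each transform coefficient keeps its phase while its magnitude is raised to the power alpha. It works on a 1D signal with a naive DFT; the paper's transform, its parameter choices and its image pipeline are not given in the abstract, so everything here (function names, the 1D setting) is an assumption.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform of a sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def alpha_root(signal, alpha):
    """Raise each coefficient magnitude to `alpha`, keeping its phase."""
    scaled = []
    for c in dft(signal):
        mag = abs(c)
        # c * mag**(alpha - 1) has magnitude mag**alpha and the phase of c.
        scaled.append(c * mag ** (alpha - 1) if mag > 1e-12 else 0j)
    return [v.real for v in idft(scaled)]
```

    Under this definition, applying `alpha_root` with `alpha` and then with `1 / alpha` restores the signal as long as no coefficient is zero, which is what makes the operation reversible for a receiver who knows the parameter.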

  6. Finite Element Modeling on Scalable Parallel Computers

    Science.gov (United States)

    Cwik, T.; Zuffada, C.; Jamnejad, V.; Katz, D.

    1995-01-01

    A coupled finite element-integral equation was developed to model fields scattered from inhomogeneous, three-dimensional objects of arbitrary shape. This paper outlines how to implement the software on a scalable parallel processor.

  7. Chromium Renderserver: Scalable and Open Source Remote Rendering Infrastructure

    Energy Technology Data Exchange (ETDEWEB)

    Paul, Brian; Ahern, Sean; Bethel, E. Wes; Brugger, Eric; Cook, Rich; Daniel, Jamison; Lewis, Ken; Owen, Jens; Southard, Dale

    2007-12-01

    Chromium Renderserver (CRRS) is software infrastructure that provides the ability for one or more users to run and view image output from unmodified, interactive OpenGL and X11 applications on a remote, parallel computational platform equipped with graphics hardware accelerators via industry-standard Layer 7 network protocols and client viewers. The new contributions of this work include a solution to the problem of synchronizing X11 and OpenGL command streams, remote delivery of parallel hardware-accelerated rendering, and a performance analysis of several different optimizations that are generally applicable to a variety of rendering architectures. CRRS is fully operational, Open Source software.

  8. Corfu: A Platform for Scalable Consistency

    OpenAIRE

    Wei, Michael

    2017-01-01

    Corfu is a platform for building systems which are extremely scalable, strongly consistent and robust. Unlike other systems which weaken guarantees to provide better performance, we have built Corfu with a resilient fabric tuned and engineered for scalability and strong consistency at its core: the Corfu shared log. On top of the Corfu log, we have built a layer of advanced data services which leverage the properties of the Corfu log. Today, Corfu is already replacing data platforms in commer...

  9. Visual analytics in scalable visualization environments

    OpenAIRE

    Yamaoka, So

    2011-01-01

    Visual analytics is an interdisciplinary field that facilitates the analysis of the large volume of data through interactive visual interface. This dissertation focuses on the development of visual analytics techniques in scalable visualization environments. These scalable visualization environments offer a high-resolution, integrated virtual space, as well as a wide-open physical space that affords collaborative user interaction. At the same time, the sheer scale of these environments poses ...

  10. Hardware-and-software-based collective communication on the Quadrics network.

    Energy Technology Data Exchange (ETDEWEB)

    Petrini, F. (Fabrizio); Coll, S. (Salvador); Frachtemberg, E. (Eitan); Hoisie, A. (Adolfy)

    2001-01-01

    The efficient implementation of collective communication patterns in a parallel machine is a challenging design effort that requires the solution of many problems. In this paper we present an in-depth description of how the Quadrics network supports both hardware- and software-based collectives. We describe the main features of the two building blocks of this network, a network interface that can perform zero-copy user-level communication and a wormhole switch. We also focus our attention on the routing and flow control algorithms, deadlock avoidance and on how the processing nodes are integrated in a global, virtual shared memory. Experimental results conducted on a 64-node AlphaServer cluster indicate that the time to complete the hardware-based barrier synchronization on the whole network is as low as 6 μs, with very good scalability. Good latency and scalability are also achieved with the software-based synchronization, which takes about 15 μs. With the broadcast, similar performance is achieved by the hardware- and software-based implementations, which can deliver messages of up to 256 bytes in 13 μs and can get a sustained bandwidth of 288 Mbytes/sec on all the nodes, with messages larger than 64KB. The hardware-based barrier is almost insensitive to the network congestion, with 93% of the synchronizations taking less than 20 μs. On the other hand, the software-based implementation suffers from a significant performance degradation. In high load environments the hardware broadcast maintains a reasonably good performance, delivering messages up to 2KB in 200 μs, while the software broadcast suffers from slightly higher latencies inherited by the synchronization mechanism.

  11. Fully scalable video coding in multicast applications

    Science.gov (United States)

    Lerouge, Sam; De Sutter, Robbie; Lambert, Peter; Van de Walle, Rik

    2004-01-01

    The increasing diversity of the characteristics of the terminals and networks that are used to access multimedia content through the internet introduces new challenges for the distribution of multimedia data. Scalable video coding will be one of the elementary solutions in this domain. This type of coding allows to adapt an encoded video sequence to the limitations of the network or the receiving device by means of very basic operations. Algorithms for creating fully scalable video streams, in which multiple types of scalability are offered at the same time, are becoming mature. On the other hand, research on applications that use such bitstreams is only recently emerging. In this paper, we introduce a mathematical model for describing such bitstreams. In addition, we show how we can model applications that use scalable bitstreams by means of definitions that are built on top of this model. In particular, we chose to describe a multicast protocol that is targeted at scalable bitstreams. This way, we will demonstrate that it is possible to define an abstract model for scalable bitstreams, that can be used as a tool for reasoning about such bitstreams and related applications.

  12. FPGA-accelerated simulation of computer systems

    CERN Document Server

    Angepat, Hari; Chung, Eric S; Hoe, James C

    2014-01-01

    To date, the most common form of simulators of computer systems are software-based running on standard computers. One promising approach to improve simulation performance is to apply hardware, specifically reconfigurable hardware in the form of field programmable gate arrays (FPGAs). This manuscript describes various approaches of using FPGAs to accelerate software-implemented simulation of computer systems and selected simulators that incorporate those techniques. More precisely, we describe a simulation architecture taxonomy that incorporates a simulation architecture specifically designed f

  13. Hardware Resource Allocation for Hardware/Software Partitioning in the LYCOS System

    DEFF Research Database (Denmark)

    Grode, Jesper Nicolai Riis; Madsen, Jan; Knudsen, Peter Voigt

    1998-01-01

    This paper presents a novel hardware resource allocation technique for hardware/software partitioning. It allocates hardware resources to the hardware data-path using information such as data-dependencies between operations in the application, and profiling information. The algorithm is useful...... as a designer's/design tool's aid to generate good hardware allocations for use in hardware/software partitioning. The algorithm has been implemented in a tool under the LYCOS system. The results show that the allocations produced by the algorithm come close to the best allocations obtained by exhaustive search...

  14. Hardware Resource Allocation for Hardware/Software Partitioning in the LYCOS System

    DEFF Research Database (Denmark)

    Grode, Jesper Nicolai Riis; Knudsen, Peter Voigt; Madsen, Jan

    1998-01-01

    This paper presents a novel hardware resource allocation technique for hardware/software partitioning. It allocates hardware resources to the hardware data-path using information such as data-dependencies between operations in the application, and profiling information. The algorithm is useful...... as a designer's/design tool's aid to generate good hardware allocations for use in hardware/software partitioning. The algorithm has been implemented in a tool under the LYCOS system. The results show that the allocations produced by the algorithm come close to the best allocations obtained by exhaustive search....

  15. Projecto de hardware digital orientado por objectos (Object-oriented digital hardware design)

    OpenAIRE

    Fernandes, João M.; Machado, Ricardo J.

    1997-01-01

    The boundaries between the software and hardware domains are increasingly blurred, so techniques first tried in software have gradually been applied to hardware. This article describes the current state of the use of object-oriented programming technology in digital hardware design. The advantages and implications of introducing object-oriented concepts into hardware design projects are analyzed, and ...

  16. Open-source hardware for medical devices.

    Science.gov (United States)

    Niezen, Gerrit; Eslambolchilar, Parisa; Thimbleby, Harold

    2016-04-01

    Open-source hardware is hardware whose design is made publicly available so anyone can study, modify, distribute, make and sell the design or the hardware based on that design. Some open-source hardware projects can potentially be used as active medical devices. The open-source approach offers a unique combination of advantages, including reduced costs and faster innovation. This article compares 10 open-source healthcare projects in terms of how easy it is to obtain the required components and build the device.

  17. Thermal Hardware for the Thermal Analyst

    Science.gov (United States)

    Steinfeld, David

    2015-01-01

    The presentation will be given at the 26th Annual Thermal Fluids Analysis Workshop (TFAWS 2015) hosted by the Goddard Space Flight Center (GSFC) Thermal Engineering Branch (Code 545). NCTS 21070-1. Most thermal analysts do not have a good background in the hardware which thermally controls the spacecraft they design. SINDA and Thermal Desktop models are nice, but knowing how this applies to the actual thermal hardware (heaters, thermostats, thermistors, MLI blanketing, optical coatings, etc.) is just as important. The course will delve into the thermal hardware and its application techniques on actual spacecraft. Knowledge of how thermal hardware is used and applied will make a thermal analyst a better engineer.

  18. Future accelerators (?)

    Energy Technology Data Exchange (ETDEWEB)

    John Womersley

    2003-08-21

    I describe the future accelerator facilities that are currently foreseen for electroweak scale physics, neutrino physics, and nuclear structure. I will explore the physics justification for these machines, and suggest how the case for future accelerators can be made.

  19. Molecular Dynamics Simulations of Clathrate Hydrates on Specialised Hardware Platforms

    Directory of Open Access Journals (Sweden)

    Christian R. Trott

    2012-09-01

    Classical equilibrium molecular dynamics (MD) simulations have been performed to investigate the computational performance of the Simple Point Charge (SPC) and TIP4P water models applied to simulation of methane hydrates, and also of liquid water, on a variety of specialised hardware platforms, in addition to estimation of various equilibrium properties of clathrate hydrates. The FPGA-based accelerator MD-GRAPE 3 was used to accelerate substantially the computation of non-bonded forces, while GPU-based platforms were also used in conjunction with CUDA-enabled versions of the LAMMPS MD software packages to reduce computational time dramatically. The dependence of molecular system size and scaling with number of processors was also investigated. Considering performance relative to power consumption, it is seen that GPU-based computing is quite attractive.

  20. MR-Tree - A Scalable MapReduce Algorithm for Building Decision Trees

    Directory of Open Access Journals (Sweden)

    Vasile PURDILĂ

    2014-03-01

    Learning decision trees against very large amounts of data is not practical on single-node computers due to the huge amount of calculations required by this process. Apache Hadoop is a large-scale distributed computing platform that runs on commodity hardware clusters and can be used successfully for data mining tasks against very large datasets. This work presents a parallel decision tree learning algorithm expressed in the MapReduce programming model that runs on the Apache Hadoop platform and scales very well with dataset size.
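
    One common way to express decision-tree split selection in the MapReduce model is to have mappers emit a count for every (feature, value, label) triple and reducers sum those counts, after which the best split can be chosen from the aggregated statistics alone. The sketch below simulates that pattern in a single process. It is an illustration under stated assumptions, not the MR-Tree procedure: the abstract does not describe the split criterion, so Gini impurity and all function names are hypothetical choices.

```python
from collections import defaultdict


def map_phase(records, features):
    """Emit ((feature, value, label), 1) for every record and feature."""
    for row, label in records:
        for feature in features:
            yield (feature, row[feature], label), 1


def reduce_phase(pairs):
    """Sum the counts that share the same key."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return counts


def gini(label_counts):
    """Gini impurity of a label histogram."""
    total = sum(label_counts.values())
    return 1.0 - sum((c / total) ** 2 for c in label_counts.values())


def best_split(counts, features):
    """Pick the feature with the lowest weighted Gini impurity."""
    scores = {}
    for feature in features:
        by_value = defaultdict(lambda: defaultdict(int))
        for (feat, value, label), c in counts.items():
            if feat == feature:
                by_value[value][label] += c
        total = sum(sum(lc.values()) for lc in by_value.values())
        scores[feature] = sum(sum(lc.values()) / total * gini(lc)
                              for lc in by_value.values())
    return min(scores, key=scores.get), scores
```

    Because only the summed counts reach `best_split`, the raw records never have to be gathered on one node, which is the property that lets this style of algorithm scale with dataset size.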

  1. DARHT II Scaled Accelerator Tests on the ETA II Accelerator*

    Energy Technology Data Exchange (ETDEWEB)

    Weir, J T; Anaya Jr, E M; Caporaso, G J; Chambers, F W; Chen, Y; Falabella, S; Lee, B S; Paul, A C; Raymond, B A; Richardson, R A; Watson, J A; Chan, D; Davis, H A; Day, L A; Scarpetti, R D; Schultze, M E; Hughes, T P

    2005-05-26

    The DARHT II accelerator at LANL is preparing a series of preliminary tests at the reduced voltage of 7.8 MeV. The transport hardware between the end of the accelerator and the final target magnet was shipped to LLNL and installed on ETA II. Using the ETA II beam at 5.2 MeV we completed a set of experiments designed to reduce start-up time on the DARHT II experiments and run the equipment in a configuration adapted to the reduced energy. Results of the beam transport using a reduced energy beam, including the kicker and kicker pulser system, will be presented.

  2. A comprehensive workflow for general-purpose neural modeling with highly configurable neuromorphic hardware systems.

    Science.gov (United States)

    Brüderle, Daniel; Petrovici, Mihai A; Vogginger, Bernhard; Ehrlich, Matthias; Pfeil, Thomas; Millner, Sebastian; Grübl, Andreas; Wendt, Karsten; Müller, Eric; Schwartz, Marc-Olivier; de Oliveira, Dan Husmann; Jeltsch, Sebastian; Fieres, Johannes; Schilling, Moritz; Müller, Paul; Breitwieser, Oliver; Petkov, Venelin; Muller, Lyle; Davison, Andrew P; Krishnamurthy, Pradeep; Kremkow, Jens; Lundqvist, Mikael; Muller, Eilif; Partzsch, Johannes; Scholze, Stefan; Zühl, Lukas; Mayr, Christian; Destexhe, Alain; Diesmann, Markus; Potjans, Tobias C; Lansner, Anders; Schüffny, René; Schemmel, Johannes; Meier, Karlheinz

    2011-05-01

    In this article, we present a methodological framework that meets novel requirements emerging from upcoming types of accelerated and highly configurable neuromorphic hardware systems. We describe in detail a device with 45 million programmable and dynamic synapses that is currently under development, and we sketch the conceptual challenges that arise from taking this platform into operation. More specifically, we aim at the establishment of this neuromorphic system as a flexible and neuroscientifically valuable modeling tool that can be used by non-hardware experts. We consider various functional aspects to be crucial for this purpose, and we introduce a consistent workflow with detailed descriptions of all involved modules that implement the suggested steps: The integration of the hardware interface into the simulator-independent model description language PyNN; a fully automated translation between the PyNN domain and appropriate hardware configurations; an executable specification of the future neuromorphic system that can be seamlessly integrated into this biology-to-hardware mapping process as a test bench for all software layers and possible hardware design modifications; an evaluation scheme that deploys models from a dedicated benchmark library, compares the results generated by virtual or prototype hardware devices with reference software simulations and analyzes the differences. The integration of these components into one hardware-software workflow provides an ecosystem for ongoing preparative studies that support the hardware design process and represents the basis for the maturity of the model-to-hardware mapping software. The functionality and flexibility of the latter is proven with a variety of experimental results.

  3. Wanted: Scalable Tracers for Diffusion Measurements

    Science.gov (United States)

    2015-01-01

    Scalable tracers are potentially a useful tool to examine diffusion mechanisms and to predict diffusion coefficients, particularly for hindered diffusion in complex, heterogeneous, or crowded systems. Scalable tracers are defined as a series of tracers varying in size but with the same shape, structure, surface chemistry, deformability, and diffusion mechanism. Both chemical homology and constant dynamics are required. In particular, branching must not vary with size, and there must be no transition between ordinary diffusion and reptation. Measurements using scalable tracers yield the mean diffusion coefficient as a function of size alone; measurements using nonscalable tracers yield the variation due to differences in the other properties. Candidate scalable tracers are discussed for two-dimensional (2D) diffusion in membranes and three-dimensional diffusion in aqueous solutions. Correlations to predict the mean diffusion coefficient of globular biomolecules from molecular mass are reviewed briefly. Specific suggestions for the 3D case include the use of synthetic dendrimers or random hyperbranched polymers instead of dextran and the use of core–shell quantum dots. Another useful tool would be a series of scalable tracers varying in deformability alone, prepared by varying the density of crosslinking in a polymer to make say “reinforced Ficoll” or “reinforced hyperbranched polyglycerol.” PMID:25319586

  4. Scalable L-infinite coding of meshes.

    Science.gov (United States)

    Munteanu, Adrian; Cernea, Dan C; Alecu, Alin; Cornelis, Jan; Schelkens, Peter

    2010-01-01

    The paper investigates the novel concept of local-error control in mesh geometry encoding. In contrast to traditional mesh-coding systems that use the mean-square error as target distortion metric, this paper proposes a new L-infinite mesh-coding approach, for which the target distortion metric is the L-infinite distortion. In this context, a novel wavelet-based L-infinite-constrained coding approach for meshes is proposed, which ensures that the maximum error between the vertex positions in the original and decoded meshes is lower than a given upper bound. Furthermore, the proposed system achieves scalability in L-infinite sense, that is, any decoding of the input stream will correspond to a perfectly predictable L-infinite distortion upper bound. An instantiation of the proposed L-infinite-coding approach is demonstrated for MESHGRID, which is a scalable 3D object encoding system, part of MPEG-4 AFX. In this context, the advantages of scalable L-infinite coding over L-2-oriented coding are experimentally demonstrated. One concludes that the proposed L-infinite mesh-coding approach guarantees an upper bound on the local error in the decoded mesh, it enables a fast real-time implementation of the rate allocation, and it preserves all the scalability features and animation capabilities of the employed scalable mesh codec.
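
    The difference between an L-infinite guarantee and a mean-square-error target can be seen with plain uniform scalar quantization: with a step of twice the allowed error, rounding to the nearest level keeps every sample within the bound. The sketch below shows only that principle on a list of coordinates. It is not the wavelet-based MESHGRID codec described above, and the function names are hypothetical.

```python
def quantize(values, max_error):
    """Map each value to the index of the nearest level of width 2 * max_error."""
    step = 2.0 * max_error
    return [round(v / step) for v in values]


def dequantize(indices, max_error):
    """Reconstruct values from quantization indices."""
    step = 2.0 * max_error
    return [i * step for i in indices]


def linf_error(original, decoded):
    """Largest absolute error over all samples (L-infinite distortion)."""
    return max(abs(a - b) for a, b in zip(original, decoded))


def mean_square_error(original, decoded):
    """Average squared error (the usual L-2 oriented metric)."""
    return sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
```

    A low mean-square error says nothing about the worst vertex, whereas `linf_error` bounds every vertex individually, which is the local-error control the abstract refers to.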

  5. Emulated Muscle Spindle and Spiking Afferents Validates VLSI Neuromorphic Hardware as a Testbed for Sensorimotor Function and Disease

    Directory of Open Access Journals (Sweden)

    Chuanxin M. Niu

    2014-12-01

    The lack of multi-scale empirical measurements (e.g. recording simultaneously from neurons, muscles, whole body, etc.) complicates understanding of sensorimotor function in humans. This is particularly true for the understanding of development during childhood, which requires evaluation of measurements over many years. We have developed a synthetic platform for emulating multi-scale activity of the vertebrate sensorimotor system. Our design benefits from Very Large Scale Integrated-circuit (VLSI) technology to provide considerable scalability and high-speed, as much as 365x faster than real-time. An essential component of our design is the proprioceptive sensor, or muscle spindle. Here we demonstrate an accurate and extremely fast emulation of a muscle spindle and its spiking afferents, which are computationally expensive but fundamental for reflex functions. We implemented a well-known rate-based model of the spindle (Mileusnic et al., 2006) and a simplified spiking sensory neuron model using the Izhikevich approximation to the Hodgkin-Huxley model. The resulting behavior of our afferent sensory system is qualitatively compatible with classic cat soleus recording (Matthews, 1964; 1972; Crowe and Matthews, 1964b). Our results suggest that this simplified structure of the spindle and afferent neuron is sufficient to produce physiologically-realistic behavior. The VLSI technology allows us to accelerate this behavior beyond 365x real-time. Our goal is to use this testbed for predicting years of disease progression with only a few days of emulation. This is the first hardware emulation of the spindle afferent system, and it may have application not only for emulation of human health and disease, but also for the construction of compliant neuromorphic robotic systems.
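
    For readers unfamiliar with the Izhikevich neuron mentioned above, the following is a minimal software sketch of the published two-variable model, integrated with a simple forward Euler step. The parameter values are the commonly quoted "regular spiking" set; the time step, the input current and the function name are assumptions for illustration and say nothing about how the VLSI implementation is built.

```python
def simulate_izhikevich(current, duration_ms=1000.0, dt=0.5,
                        a=0.02, b=0.2, c=-65.0, d=8.0):
    """Return spike times (ms) of one Izhikevich neuron driven by a constant current."""
    v = c          # membrane potential (mV)
    u = b * v      # recovery variable
    spike_times = []
    for step in range(int(duration_ms / dt)):
        # Membrane and recovery dynamics of the Izhikevich model.
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + current)
        u += dt * a * (b * v - u)
        if v >= 30.0:
            # Spike: record it, then reset v and bump the recovery variable.
            spike_times.append(step * dt)
            v = c
            u += d
    return spike_times
```

    Calling `simulate_izhikevich(10.0)` produces a train of spikes, while `simulate_izhikevich(0.0)` stays at rest, which reflects the threshold-like behavior that makes the model a cheap stand-in for more detailed conductance-based neurons.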

  6. Emulated muscle spindle and spiking afferents validates VLSI neuromorphic hardware as a testbed for sensorimotor function and disease.

    Science.gov (United States)

    Niu, Chuanxin M; Nandyala, Sirish K; Sanger, Terence D

    2014-01-01

    The lack of multi-scale empirical measurements (e.g., recording simultaneously from neurons, muscles, whole body, etc.) complicates understanding of sensorimotor function in humans. This is particularly true for the understanding of development during childhood, which requires evaluation of measurements over many years. We have developed a synthetic platform for emulating multi-scale activity of the vertebrate sensorimotor system. Our design benefits from Very Large Scale Integrated-circuit (VLSI) technology to provide considerable scalability and high-speed, as much as 365× faster than real-time. An essential component of our design is the proprioceptive sensor, or muscle spindle. Here we demonstrate an accurate and extremely fast emulation of a muscle spindle and its spiking afferents, which are computationally expensive but fundamental for reflex functions. We implemented a well-known rate-based model of the spindle (Mileusnic et al., 2006) and a simplified spiking sensory neuron model using the Izhikevich approximation to the Hodgkin-Huxley model. The resulting behavior of our afferent sensory system is qualitatively compatible with classic cat soleus recording (Crowe and Matthews, 1964b; Matthews, 1964, 1972). Our results suggest that this simplified structure of the spindle and afferent neuron is sufficient to produce physiologically-realistic behavior. The VLSI technology allows us to accelerate this behavior beyond 365× real-time. Our goal is to use this testbed for predicting years of disease progression with only a few days of emulation. This is the first hardware emulation of the spindle afferent system, and it may have application not only for emulation of human health and disease, but also for the construction of compliant neuromorphic robotic systems.

  7. Relational Algebra as formalism for Hardware Design

    NARCIS (Netherlands)

    ten Berg, A.J.W.M.; Huijs, C.; Krol, Th.

    1993-01-01

    This paper introduces relational algebra as an elegant formalism to describe hardware behaviour. Hardware behaviour is modelled by functions that are represented by sets of tables. Relational algebra, developed for designing large and consistent databases is capable to operate on sets of tables and

  8. Scalable Multicasting over Next-Generation Internet Design, Analysis and Applications

    CERN Document Server

    Tian, Xiaohua

    2013-01-01

    Next-generation Internet providers face high expectations, as contemporary users worldwide expect high-quality multimedia functionality in a landscape of ever-expanding network applications. This volume explores the critical research issue of turning today’s greatly enhanced hardware capacity to good use in designing a scalable multicast protocol for supporting large-scale multimedia services. Linking new hardware to improved performance in the Internet’s next incarnation is a research hot-spot in the computer communications field. The methodical presentation deals with the key questions in turn: from the mechanics of multicast protocols to current state-of-the-art designs, and from methods of theoretical analysis of these protocols to applying them in the ns2 network simulator, known for being hard to extend. The authors’ years of research in the field inform this thorough treatment, which covers details such as applying AOM (application-oriented multicast) protocol to IPTV provision and resolving...

  9. Design Considerations for Scalable High-Performance Vision Systems Embedded in Industrial Print Inspection Machines

    Directory of Open Access Journals (Sweden)

    Rössler Peter

    2007-01-01

    This paper describes the design of a scalable high-performance vision system which is used in the application area of optical print inspection. The system is able to process hundreds of megabytes of image data per second coming from several high-speed/high-resolution cameras. Due to performance requirements, some functionality has been implemented on dedicated hardware based on a field programmable gate array (FPGA), which is coupled to a high-end digital signal processor (DSP). The paper discusses design considerations like partitioning of image processing algorithms between hardware and software. The main chapters focus on functionality implemented on the FPGA, including low-level image processing algorithms (flat-field correction, image pyramid generation, neighborhood operations) and advanced processing units (programmable arithmetic unit, geometry unit). Verification issues for the complex system are also addressed. The paper concludes with a summary of the FPGA resource usage and some performance results.
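
    Two of the low-level operations named above, flat-field correction and image pyramid generation, are simple enough to sketch in software. The version below uses the textbook flat-field formula (subtract a dark frame, divide by the per-pixel gain, rescale by the mean gain) and a 2x2 averaging pyramid step. It is an illustrative sketch only: the paper's FPGA implementation is not described in the abstract, the function names are hypothetical, and the code assumes even image dimensions and a nonzero gain at every pixel.

```python
def flat_field_correct(raw, flat, dark):
    """Correct per-pixel gain differences: (raw - dark) * mean(flat - dark) / (flat - dark)."""
    gain = [[f - d for f, d in zip(flat_row, dark_row)]
            for flat_row, dark_row in zip(flat, dark)]
    pixel_count = sum(len(row) for row in gain)
    mean_gain = sum(sum(row) for row in gain) / pixel_count
    return [[(p - d) * mean_gain / g
             for p, d, g in zip(raw_row, dark_row, gain_row)]
            for raw_row, dark_row, gain_row in zip(raw, dark, gain)]


def pyramid_level(image):
    """Halve the resolution by averaging each 2x2 block of pixels."""
    return [[(image[y][x] + image[y][x + 1]
              + image[y + 1][x] + image[y + 1][x + 1]) / 4.0
             for x in range(0, len(image[0]), 2)]
            for y in range(0, len(image), 2)]
```

    Both operations touch each pixel independently or in small fixed neighborhoods, which is why they map well onto streaming hardware placed directly behind the cameras.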

  10. Integrating reconfigurable hardware-based grid for high performance computing.

    Science.gov (United States)

    Dondo Gazzano, Julio; Sanchez Molina, Francisco; Rincon, Fernando; López, Juan Carlos

    2015-01-01

    FPGAs have shown several characteristics that make them very attractive for high performance computing (HPC). The impressive speed-up factors that they are able to achieve, the reduced power consumption, and the ease and flexibility of the design process with fast iterations between consecutive versions are examples of benefits obtained with their use. However, there are still some difficulties when using reconfigurable platforms as accelerators that need to be addressed: the need for an in-depth application study to identify potential acceleration, the lack of tools for the deployment of computational problems in distributed hardware platforms, and the low portability of components, among others. This work proposes a complete grid infrastructure for distributed high performance computing based on dynamically reconfigurable FPGAs. In addition, a set of services designed to facilitate application deployment is described. An example application and a comparison with other hardware and software implementations are shown. Experimental results show that the proposed architecture offers encouraging advantages for the deployment of high performance distributed applications and simplifies the development process.

  11. Integrating Reconfigurable Hardware-Based Grid for High Performance Computing

    Science.gov (United States)

    Dondo Gazzano, Julio; Sanchez Molina, Francisco; Rincon, Fernando; López, Juan Carlos

    2015-01-01

    FPGAs have shown several characteristics that make them very attractive for high performance computing (HPC). The impressive speed-up factors that they are able to achieve, the reduced power consumption, and the ease and flexibility of the design process with fast iterations between consecutive versions are examples of benefits obtained with their use. However, there are still some difficulties when using reconfigurable platforms as accelerators that need to be addressed: the need for an in-depth application study to identify potential acceleration, the lack of tools for the deployment of computational problems in distributed hardware platforms, and the low portability of components, among others. This work proposes a complete grid infrastructure for distributed high performance computing based on dynamically reconfigurable FPGAs. In addition, a set of services designed to facilitate application deployment is described. An example application and a comparison with other hardware and software implementations are shown. Experimental results show that the proposed architecture offers encouraging advantages for the deployment of high performance distributed applications and simplifies the development process. PMID:25874241

  12. Integration of an intelligent systems behavior simulator and a scalable soldier-machine interface

    Science.gov (United States)

    Johnson, Tony; Manteuffel, Chris; Brewster, Benjamin; Tierney, Terry

    2007-04-01

    As the Army's Future Combat Systems (FCS) introduce emerging technologies and new force structures to the battlefield, soldiers will increasingly face new challenges in workload management. The next generation warfighter will be responsible for effectively managing robotic assets in addition to performing other missions. Studies of future battlefield operational scenarios involving the use of automation, including the specification of existing and proposed technologies, will provide significant insight into potential problem areas regarding soldier workload. The US Army Tank Automotive Research, Development, and Engineering Center (TARDEC) is currently executing an Army technology objective program to analyze and evaluate the effect of automated technologies and their associated control devices with respect to soldier workload. The Human-Robotic Interface (HRI) Intelligent Systems Behavior Simulator (ISBS) is a human performance measurement simulation system that allows modelers to develop constructive simulations of military scenarios with various deployments of interface technologies in order to evaluate operator effectiveness. One such interface is TARDEC's Scalable Soldier-Machine Interface (SMI). The scalable SMI provides a configurable machine interface application that is capable of adapting to several hardware platforms by recognizing the physical space limitations of the display device. This paper describes the integration of the ISBS and Scalable SMI applications, which will ultimately benefit both systems. The ISBS will be able to use the Scalable SMI to visualize the behaviors of virtual soldiers performing HRI tasks, such as route planning, and the scalable SMI will benefit from stimuli provided by the ISBS simulation environment. The paper describes the background of each system and details of the system integration approach.

  13. Comparative Modal Analysis of Sieve Hardware Designs

    Science.gov (United States)

    Thompson, Nathaniel

    2012-01-01

    The CMTB Thwacker hardware operates as a testbed analogue for the Flight Thwacker and Sieve components of CHIMRA, a device on the Curiosity Rover. The sieve separates particles with a diameter smaller than 150 microns for delivery to onboard science instruments. The sieving behavior of the testbed hardware should be similar to the Flight hardware for the results to be meaningful. The elastodynamic behavior of both sieves was studied analytically using the Rayleigh Ritz method in conjunction with classical plate theory. Finite element models were used to determine the mode shapes of both designs, and comparisons between the natural frequencies and mode shapes were made. The analysis predicts that the performance of the CMTB Thwacker will closely resemble the performance of the Flight Thwacker within the expected steady state operating regime. Excitations of the testbed hardware that will mimic the flight hardware were recommended, as were those that will improve the efficiency of the sieving process.

  14. Rapid VLIW Processor Customization for Signal Processing Applications Using Combinational Hardware Functions

    Directory of Open Access Journals (Sweden)

    Hoare Raymond R

    2006-01-01

    This paper presents an architecture that combines VLIW (very long instruction word) processing with the capability to introduce application-specific customized instructions and highly parallel combinational hardware functions for the acceleration of signal processing applications. To support this architecture, a compilation and design automation flow is described for algorithms written in C. The key contributions of this paper are as follows: (1) a 4-way VLIW processor implemented in an FPGA, (2) large speedups through hardware functions, (3) a hardware/software interface with zero overhead, (4) a design methodology for implementing signal processing applications on this architecture, and (5) tractable design automation techniques for extracting and synthesizing hardware functions. Several design tradeoffs for the architecture were examined, including the number of VLIW functional units and register file size. The architecture was implemented on an Altera Stratix II FPGA. The Stratix II device was selected because it offers a large number of high-speed DSP (digital signal processing) blocks that execute multiply-accumulate operations. Using the MediaBench benchmark suite, we tested our methodology and architecture to accelerate software. Our combined VLIW processor with hardware functions was compared to that of software executing on a RISC processor, specifically the soft core embedded NIOS II processor. For software kernels converted into hardware functions, we show a hardware performance multiplier of up to times that of software with an average times faster. For the entire application in which only a portion of the software is converted to hardware, the performance improvement is as much as 30X over the nonaccelerated application, with a 12X improvement on average.

  15. Scalable Density-Based Subspace Clustering

    DEFF Research Database (Denmark)

    Müller, Emmanuel; Assent, Ira; Günnemann, Stephan

    2011-01-01

    For knowledge discovery in high dimensional databases, subspace clustering detects clusters in arbitrary subspace projections. Scalability is a crucial issue, as the number of possible projections is exponential in the number of dimensions. We propose a scalable density-based subspace clustering...... method that steers mining to few selected subspace clusters. Our novel steering technique reduces subspace processing by identifying and clustering promising subspaces and their combinations directly. Thereby, it narrows down the search space while maintaining accuracy. Thorough experiments on real...... and synthetic databases show that steering is efficient and scalable, with high quality results. For future work, our steering paradigm for density-based subspace clustering opens research potential for speeding up other subspace clustering approaches as well....

  16. Scalable Open Source Smart Grid Simulator (SGSim)

    DEFF Research Database (Denmark)

    Ebeid, Emad Samuel Malki; Jacobsen, Rune Hylsberg; Quaglia, Davide

    2017-01-01

    The future smart power grid will consist of an unlimited number of smart devices that communicate with control units to maintain the grid’s sustainability, efficiency, and balancing. In order to build and verify such controllers over a large grid, a scalable simulation environment is needed....... This paper presents an open source smart grid simulator (SGSim). The simulator is based on open source SystemC Network Simulation Library (SCNSL) and aims to model scalable smart grid applications. SGSim has been tested under different smart grid scenarios that contain hundreds of thousands of households...... and appliances. By using SGSim, different smart grid control strategies and protocols can be tested, validated and evaluated in a scalable environment....

  17. Scalable computing for evolutionary genomics.

    Science.gov (United States)

    Prins, Pjotr; Belhachemi, Dominique; Möller, Steffen; Smant, Geert

    2012-01-01

    Genomic data analysis in evolutionary biology is becoming so computationally intensive that analysis of multiple hypotheses and scenarios takes too long on a single desktop computer. In this chapter, we discuss techniques for scaling computations through parallelization of calculations, after giving a quick overview of advanced programming techniques. Unfortunately, parallel programming is difficult and requires special software design. The alternative, especially attractive for legacy software, is to introduce poor man's parallelization by running whole programs in parallel as separate processes, using job schedulers. Such pipelines are often deployed on bioinformatics computer clusters. Recent advances in PC virtualization have made it possible to run a full computer operating system, with all of its installed software, on top of another operating system, inside a "box," or virtual machine (VM). Such a VM can flexibly be deployed on multiple computers, in a local network, e.g., on existing desktop PCs, and even in the Cloud, to create a "virtual" computer cluster. Many bioinformatics applications in evolutionary biology can be run in parallel, running processes in one or more VMs. Here, we show how a ready-made bioinformatics VM image, named BioNode, effectively creates a computing cluster, and pipeline, in a few steps. This allows researchers to scale-up computations from their desktop, using available hardware, anytime it is required. BioNode is based on Debian Linux and can run on networked PCs and in the Cloud. Over 200 bioinformatics and statistical software packages, of interest to evolutionary biology, are included, such as PAML, Muscle, MAFFT, MrBayes, and BLAST. Most of these software packages are maintained through the Debian Med project. In addition, BioNode contains convenient configuration scripts for parallelizing bioinformatics software. 
    Where Debian Med encourages packaging free and open source bioinformatics software through one central project
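
    The "poor man's parallelization" described in this abstract, running whole programs in parallel as separate processes, can be sketched in a few lines. This is a minimal illustration and not BioNode's own configuration scripts: the helper names `run_job` and `run_all` are hypothetical, a local thread pool stands in for a job scheduler, and the jobs are stand-in commands (the Python interpreter squaring a number) where a real pipeline would call a tool such as an aligner.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_job(args):
    """Run one whole program as a separate OS process and return its stdout."""
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout.strip()


def run_all(jobs, max_workers=4):
    """Run independent command lines concurrently, preserving input order.

    Threads are enough here because each worker only waits on a child
    process; the actual computation happens in the child processes.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_job, jobs))


# Hypothetical stand-in jobs: each "analysis" just squares a number.
jobs = [[sys.executable, "-c", f"print({n} * {n})"] for n in range(4)]
outputs = run_all(jobs)
```

    Because each job is an unmodified program, this approach suits legacy software, at the cost of requiring the jobs to be independent of one another.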

  18. How well do STARLAB and NBODY compare? II. Hardware and accuracy

    Science.gov (United States)

    Anders, P.; Baumgardt, H.; Gaburov, E.; Portegies Zwart, S.

    2012-04-01

    Most recent progress in understanding the dynamical evolution of star clusters relies on direct N-body simulations. Owing to the computational demands, and the desire to model more complex and more massive star clusters, hardware calculational accelerators, such as Gravity Pipe (GRAPE) special-purpose hardware or, more recently, graphics processing units (GPUs) are generally utilized. In addition, simulations can be accelerated by adjusting parameters determining the calculation accuracy (i.e. changing the internal simulation time-step used for each star). We extend our previous thorough comparison of basic quantities as derived from simulations performed either with STARLAB/KIRA or NBODY6. Here we focus on differences arising from using different hardware accelerations (including the increasingly popular graphic card accelerations/GPUs) and different calculation accuracy settings. We use the large number of star cluster models (for a fixed stellar mass function, without stellar/binary evolution, primordial binaries, external tidal fields, etc.) already used in the previous paper, evolve them with STARLAB/KIRA (and NBODY6, where required), analyse them in a consistent way and compare the averaged results quantitatively. For this quantitative comparison, we apply the bootstrap algorithm for functional dependencies developed in our previous study. In general, we find very high comparability of the simulation results, independent of the computer hardware (including the hardware accelerators) and the N-body code used. For the tested accuracy settings, we find that for reduced accuracy (i.e. time-step at least a factor of 2.5 larger than the standard setting) most simulation results deviate significantly from the results using standard settings. The remaining deviations are comprehensible and explicable.
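
    To illustrate what "direct N-body" and "time-step" mean in this abstract, the sketch below sums pairwise gravitational forces directly and advances the system with a fixed-step leapfrog integrator. It is an assumption-laden toy, not the scheme used by STARLAB/KIRA or NBODY6 (production codes use higher-order integrators with per-star time-steps); the function names, units (G = 1) and the two-body test orbit are illustrative choices.

```python
import math


def accelerations(pos, mass, G=1.0):
    """Direct summation: every body feels every other body, O(N^2) pair terms."""
    n = len(pos)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = [pos[j][k] - pos[i][k] for k in range(3)]
            r2 = d[0] ** 2 + d[1] ** 2 + d[2] ** 2
            inv_r3 = r2 ** -1.5
            for k in range(3):
                acc[i][k] += G * mass[j] * d[k] * inv_r3
    return acc


def leapfrog_step(pos, vel, mass, dt):
    """One kick-drift-kick leapfrog step with a single shared time-step dt."""
    a0 = accelerations(pos, mass)
    half = [[v[k] + 0.5 * dt * a[k] for k in range(3)] for v, a in zip(vel, a0)]
    new_pos = [[p[k] + dt * h[k] for k in range(3)] for p, h in zip(pos, half)]
    a1 = accelerations(new_pos, mass)
    new_vel = [[h[k] + 0.5 * dt * a[k] for k in range(3)] for h, a in zip(half, a1)]
    return new_pos, new_vel


def total_energy(pos, vel, mass, G=1.0):
    """Kinetic plus pairwise potential energy; its drift gauges accuracy."""
    kinetic = sum(0.5 * m * sum(c * c for c in v) for m, v in zip(mass, vel))
    potential = 0.0
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            potential -= G * mass[i] * mass[j] / math.dist(pos[i], pos[j])
    return kinetic + potential


# Equal-mass binary on a circular orbit (separation 1, G = 1).
v = math.sqrt(0.5)
mass = [1.0, 1.0]
pos = [[-0.5, 0.0, 0.0], [0.5, 0.0, 0.0]]
vel = [[0.0, -v, 0.0], [0.0, v, 0.0]]
e_start = total_energy(pos, vel, mass)
for _ in range(200):
    pos, vel = leapfrog_step(pos, vel, mass, dt=0.01)
energy_drift = abs(total_energy(pos, vel, mass) - e_start)
```

    Rerunning the loop with a larger `dt` and comparing `energy_drift` gives a rough feel for the accuracy-versus-speed trade-off the abstract examines.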

  19. From Digital Disruption to Business Model Scalability

    DEFF Research Database (Denmark)

    Nielsen, Christian; Lund, Morten; Thomsen, Peter Poulsen

    2017-01-01

    a long time to replicate, business model scalability can be cornered into four dimensions. In many corporate restructuring exercises and Mergers and Acquisitions there is a tendency to look for synergies in the form of cost reductions, lean workflows and market segments. However, this state of mind......This article discusses the terms disruption, digital disruption, business models and business model scalability. It illustrates how managers should be using these terms for the benefit of their business by developing business models capable of achieving exponentially increasing returns to scale...

  20. From Digital Disruption to Business Model Scalability

    DEFF Research Database (Denmark)

    Nielsen, Christian; Lund, Morten; Thomsen, Peter Poulsen

    2017-01-01

    as a response to digital disruption. A series of case studies illustrate that besides frequent existing messages in the business literature relating to the importance of creating agile businesses, both in growing and declining economies, as well as hard to copy value propositions or value propositions that take......This article discusses the terms disruption, digital disruption, business models and business model scalability. It illustrates how managers should be using these terms for the benefit of their business by developing business models capable of achieving exponentially increasing returns to scale...... will seldom lead to business model scalability capable of competing with digital disruption(s)....

  1. Content-Aware Scalability-Type Selection for Rate Adaptation of Scalable Video

    Directory of Open Access Journals (Sweden)

    Tekalp A Murat

    2007-01-01

    Scalable video coders provide different scaling options, such as temporal, spatial, and SNR scalabilities, where rate reduction by discarding enhancement layers of different scalability types results in different kinds and/or levels of visual distortion, depending on the content and bitrate. This dependency between scalability type, video content, and bitrate is not well investigated in the literature. To this effect, we first propose an objective function that quantifies flatness, blockiness, blurriness, and temporal jerkiness artifacts caused by rate reduction by spatial size, frame rate, and quantization parameter scaling. Next, the weights of this objective function are determined for different content (shot) types and different bitrates using a training procedure with subjective evaluation. Finally, a method is proposed for choosing the best scaling type for each temporal segment that results in minimum visual distortion according to this objective function given the content type of temporal segments. Two subjective tests have been performed to validate the proposed procedure for content-aware selection of the best scalability type on soccer videos. Soccer videos scaled from 600 kbps to 100 kbps by the proposed content-aware selection of scalability type have been found visually superior to those that are scaled using a single scalability option over the whole sequence.
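
    The selection step described in this abstract can be sketched as a weighted sum over the four artifact measures, minimized over the candidate scaling types for each segment. The abstract does not give the form of the objective function, so the linear form is an assumption, and every weight, artifact score and content-type name below is invented for illustration (the paper obtains its weights from subjective training).

```python
ARTIFACTS = ("flatness", "blockiness", "blurriness", "jerkiness")


def distortion(measures, weights):
    """Weighted sum of artifact measures (assumed linear objective)."""
    return sum(weights[a] * measures[a] for a in ARTIFACTS)


def best_scaling(candidates, weights):
    """Return the scaling type whose artifacts give the lowest weighted score."""
    return min(candidates, key=lambda s: distortion(candidates[s], weights))


# Hypothetical weights per content type (not taken from the paper).
WEIGHTS = {
    "wide_shot": {"flatness": 0.2, "blockiness": 0.2, "blurriness": 0.5, "jerkiness": 0.1},
    "close_up": {"flatness": 0.1, "blockiness": 0.3, "blurriness": 0.1, "jerkiness": 0.5},
}

# Hypothetical artifact scores for one segment under each scaling option.
CANDIDATES = {
    "temporal": {"flatness": 0.1, "blockiness": 0.1, "blurriness": 0.1, "jerkiness": 0.9},
    "spatial": {"flatness": 0.2, "blockiness": 0.2, "blurriness": 0.8, "jerkiness": 0.1},
    "snr": {"flatness": 0.5, "blockiness": 0.7, "blurriness": 0.3, "jerkiness": 0.1},
}

choice_wide = best_scaling(CANDIDATES, WEIGHTS["wide_shot"])
choice_close = best_scaling(CANDIDATES, WEIGHTS["close_up"])
```

    With these made-up numbers the same segment scores would lead to different choices for the two content types, which is the content dependence the abstract argues for.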

  2. Hardware removal after osseous free flap reconstruction.

    Science.gov (United States)

    Day, Kristine E; Desmond, Renee; Magnuson, J Scott; Carroll, William R; Rosenthal, Eben L

    2014-01-01

    Identifying risk factors for hardware removal in patients undergoing mandibular reconstruction with vascularized osseous free flaps remains a challenge. The purpose of this study is to identify potential risk factors, including osteocutaneous radial forearm versus fibular flap, for need for removal and to describe the fate of implanted hardware. Case series with chart review. Setting: Academic tertiary care medical center. Two hundred thirteen patients undergoing 227 vascularized osseous mandibular reconstructions between the years 2004 and 2012. Data were compiled through a manual chart review, and patients incurring hardware removals were identified. Thirty-four of 213 evaluable vascularized osseous free flaps (16%) underwent surgical removal of hardware. The average length of time to removal was 16.2 months (median 10 months), with the majority of removals occurring within the first year. Osteocutaneous radial forearm free flaps (OCRFFF) incurred a slightly higher percentage of hardware removals (9.9%) compared to fibula flaps (6.1%). Partial removal was performed in 8 of 34 cases, and approximately 38% of these required additional surgery for removal. Hardware removal was associated with continued tobacco use after mandibular reconstruction (P = .03). Removal of the supporting hardware most commonly occurs from infection or exposure in the first year. In the majority of cases the bone is well healed and the problem resolves with removal.

  3. Electrostatic accelerators

    OpenAIRE

    Hinterberger, F.

    2006-01-01

    The principle of electrostatic accelerators is presented. We consider Cockcroft–Walton, Van de Graaff and Tandem Van de Graaff accelerators. We summarize high voltage generators such as cascade generators, Van de Graaff band generators, Pelletron generators, Laddertron generators and Dynamitron generators. The specific features of accelerating tubes, ion optics and methods of voltage stabilization are described. We discuss the characteristic beam properties and the variety of possible beams. We ...

  4. Criticality as a Set-Point for Adaptive Behavior in Neuromorphic Hardware

    Directory of Open Access Journals (Sweden)

    Narayan Srinivasa

    2015-12-01

    Neuromorphic hardware is designed by drawing inspiration from biology to overcome limitations of current computer architectures while forging the development of a new class of autonomous systems that can exhibit adaptive behaviors. Many such designs in the recent past are capable of emulating large scale networks but avoid complexity in network dynamics by minimizing the number of dynamic variables that are supported and tunable in hardware. We believe that this is due to the lack of a clear understanding of how to design self-tuning complex systems. It has been widely demonstrated that criticality appears to be the default state of the brain and manifests in the form of spontaneous scale-invariant cascades of neural activity. Experiment, theory and recent models have shown that neuronal networks at criticality demonstrate optimal information transfer, learning and information processing capabilities that affect behavior. In this perspective article, we argue that understanding how large scale neuromorphic electronics can be designed to enable emergent adaptive behavior will require an understanding of how networks emulated by such hardware can self-tune local parameters to maintain criticality as a set-point. We believe that such capability will enable the design of truly scalable intelligent systems using neuromorphic hardware that embrace complexity in network dynamics rather than avoid it.

  5. Criticality as a Set-Point for Adaptive Behavior in Neuromorphic Hardware.

    Science.gov (United States)

    Srinivasa, Narayan; Stepp, Nigel D; Cruz-Albrecht, Jose

    2015-01-01

    Neuromorphic hardware is designed by drawing inspiration from biology to overcome limitations of current computer architectures while forging the development of a new class of autonomous systems that can exhibit adaptive behaviors. Several designs in the recent past are capable of emulating large scale networks but avoid complexity in network dynamics by minimizing the number of dynamic variables that are supported and tunable in hardware. We believe that this is due to the lack of a clear understanding of how to design self-tuning complex systems. It has been widely demonstrated that criticality appears to be the default state of the brain and manifests in the form of spontaneous scale-invariant cascades of neural activity. Experiment, theory and recent models have shown that neuronal networks at criticality demonstrate optimal information transfer, learning and information processing capabilities that affect behavior. In this perspective article, we argue that understanding how large scale neuromorphic electronics can be designed to enable emergent adaptive behavior will require an understanding of how networks emulated by such hardware can self-tune local parameters to maintain criticality as a set-point. We believe that such capability will enable the design of truly scalable intelligent systems using neuromorphic hardware that embrace complexity in network dynamics rather than avoiding it.

  6. Electrostatic accelerators

    CERN Document Server

    Hinterberger, F

    2006-01-01

    The principle of electrostatic accelerators is presented. We consider Cockcroft–Walton, Van de Graaff and Tandem Van de Graaff accelerators. We summarize high voltage generators such as cascade generators, Van de Graaff band generators, Pelletron generators, Laddertron generators and Dynamitron generators. The specific features of accelerating tubes, ion optics and methods of voltage stabilization are described. We discuss the characteristic beam properties and the variety of possible beams. We sketch possible applications and the progress in the development of electrostatic accelerators.

  7. GenePING: secure, scalable management of personal genomic data

    Directory of Open Access Journals (Sweden)

    Kohane Isaac S

    2006-04-01

    Background: Patient genomic data are rapidly becoming part of clinical decision making. Within a few years, full genome expression profiling and genotyping will be affordable enough to perform on every individual. The management of such sizeable, yet fine-grained, data in compliance with privacy laws and best practices presents significant security and scalability challenges. Results: We present the design and implementation of GenePING, an extension to the PING personal health record system that supports secure storage of large, genome-sized datasets, as well as efficient sharing and retrieval of individual datapoints (e.g. SNPs, rare mutations, gene expression levels). Even with full access to the raw GenePING storage, an attacker cannot discover any stored genomic datapoint on any single patient. Given a large-enough number of patient records, an attacker cannot discover which data corresponds to which patient, or even the size of a given patient's record. The computational overhead of GenePING's security features is a small constant, making the system usable, even in emergency care, on today's hardware. Conclusion: GenePING is the first personal health record management system to support the efficient and secure storage and sharing of large genomic datasets. GenePING is available online at http://ping.chip.org/genepinghtml, licensed under the LGPL.

  8. A Scalable Distributed Approach to Mobile Robot Vision

    Science.gov (United States)

    Kuipers, Benjamin; Browning, Robert L.; Gribble, William S.

    1997-01-01

    This paper documents our progress during the first year of work on our original proposal entitled 'A Scalable Distributed Approach to Mobile Robot Vision'. We are pursuing a strategy for real-time visual identification and tracking of complex objects which does not rely on specialized image-processing hardware. In this system perceptual schemas represent objects as a graph of primitive features. Distributed software agents identify and track these features, using variable-geometry image subwindows of limited size. Active control of imaging parameters and selective processing makes simultaneous real-time tracking of many primitive features tractable. Perceptual schemas operate independently from the tracking of primitive features, so that real-time tracking of a set of image features is not hurt by latency in recognition of the object that those features make up. The architecture allows semantically significant features to be tracked with limited expenditure of computational resources, and allows the visual computation to be distributed across a network of processors. Early experiments are described which demonstrate the usefulness of this formulation, followed by a brief overview of our more recent progress (after the first year).

  9. Scalable multi-GPU implementation of the MAGFLOW simulator

    Directory of Open Access Journals (Sweden)

    Giovanni Gallo

    2011-12-01

    We have developed a robust and scalable multi-GPU (Graphics Processing Unit) version of the cellular-automaton-based MAGFLOW lava simulator. The cellular automaton is partitioned into strips that are assigned to different GPUs, with minimal overlapping. For each GPU, a host thread is launched to manage allocation, deallocation, data transfer and kernel launches; the main host thread coordinates all of the GPUs, to ensure temporal coherence and data integrity. The overlapping borders and maximum temporal step need to be exchanged among the GPUs at the beginning of every evolution of the cellular automaton; data transfers are asynchronous with respect to the computations, to cover the introduced overhead. It is not required to have GPUs of the same speed or capacity; the system runs flawlessly on homogeneous and heterogeneous hardware. The speed-up factor differs from the ideal (#GPUs×) only by a constant overhead loss of about 4E−2 · T · #GPUs, with T as the total simulation time.
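
    The strip partitioning with overlapping borders described in this abstract can be sketched as follows. This is a host-side illustration only, under stated assumptions: the function name `partition_strips`, the equal-size split and the one-row overlap (`halo=1`) are hypothetical choices, and the abstract does not say how MAGFLOW sizes strips for GPUs of different speeds.

```python
def partition_strips(n_rows, n_devices, halo=1):
    """Split grid rows into contiguous strips, one per device.

    Each strip owns the rows in `own` (half-open range) and additionally
    reads `halo` border rows from each neighbour, giving `with_halo`.
    Those border rows are what must be exchanged before every step.
    """
    if not 1 <= n_devices <= n_rows:
        raise ValueError("need between 1 and n_rows devices")
    base, extra = divmod(n_rows, n_devices)
    strips = []
    start = 0
    for device in range(n_devices):
        size = base + (1 if device < extra else 0)  # spread the remainder
        end = start + size
        strips.append({
            "device": device,
            "own": (start, end),
            "with_halo": (max(0, start - halo), min(n_rows, end + halo)),
        })
        start = end
    return strips


strips = partition_strips(n_rows=10, n_devices=3)
```

    Keeping the overlap to a single border row per neighbour keeps the exchanged data small relative to each strip, which is what makes it feasible to hide the transfers behind computation.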

  10. Toward a Scalable and Sustainable Intervention for Complementary Food Safety.

    Science.gov (United States)

    Rahman, Musarrat J; Nizame, Fosiul A; Nuruzzaman, Mohammad; Akand, Farhana; Islam, Mohammad Aminul; Parvez, Sarker Masud; Stewart, Christine P; Unicomb, Leanne; Luby, Stephen P; Winch, Peter J

    2016-06-01

    Contaminated complementary foods are associated with diarrhea and malnutrition among children aged 6 to 24 months. However, existing complementary food safety intervention models are likely not scalable and sustainable. To understand current behaviors, motivations for these behaviors, and the potential barriers to behavior change and to identify one or two simple actions that can address one or few food contamination pathways and have potential to be sustainably delivered to a larger population. Data were collected from 2 rural sites in Bangladesh through semistructured observations (12), video observations (12), in-depth interviews (18), and focus group discussions (3). Although mothers report preparing dedicated foods for children, observations show that these are not separate from family foods. Children are regularly fed store-bought foods that are perceived to be bad for children. Mothers explained that long storage durations, summer temperatures, flies, animals, uncovered food, and unclean utensils are threats to food safety. Covering foods, storing foods on elevated surfaces, and reheating foods before consumption are methods believed to keep food safe. Locally made cabinet-like hardware is perceived to be an acceptable solution to address reported food safety threats. Conventional approaches that include teaching food safety and highlighting benefits such as reduced contamination may be a disincentive for rural mothers who need solutions for their physical environment. We propose extending existing beneficial behaviors by addressing local preferences of taste and convenience. © The Author(s) 2016.

  11. Scalable Detection and Isolation of Phishing

    NARCIS (Netherlands)

    Moreira Moura, Giovane; Pras, Aiko

    2009-01-01

    This paper presents a proposal for scalable detection and isolation of phishing. The main ideas are to move the protection from end users towards the network provider and to employ the novel bad neighborhood concept, in order to detect and isolate both phishing e-mail senders and phishing web

  12. Scalable Open Source Smart Grid Simulator (SGSim)

    DEFF Research Database (Denmark)

    Ebeid, Emad Samuel Malki; Jacobsen, Rune Hylsberg; Stefanni, Francesco

    2017-01-01

    . This paper presents an open source smart grid simulator (SGSim). The simulator is based on open source SystemC Network Simulation Library (SCNSL) and aims to model scalable smart grid applications. SGSim has been tested under different smart grid scenarios that contain hundreds of thousands of households...

  13. Realization of a scalable airborne radar

    NARCIS (Netherlands)

    Halsema, D. van; Jongh, R.V. de; Es, J. van; Otten, M.P.G.; Vermeulen, B.C.B.; Liempt, L.J. van

    2008-01-01

    Modern airborne ground surveillance radar systems are increasingly based on Active Electronically Scanned Array (AESA) antennas. Efficient use of array technology and the need for radar solutions for various airborne platforms, manned and unmanned, leads to the design of scalable radar systems. The

  14. Scalable Domain Decomposed Monte Carlo Particle Transport

    Energy Technology Data Exchange (ETDEWEB)

    O'Brien, Matthew Joseph [Univ. of California, Davis, CA (United States)

    2013-12-05

    In this dissertation, we present the parallel algorithms necessary to run domain decomposed Monte Carlo particle transport on large numbers of processors (millions of processors). Previous algorithms were not scalable, and the parallel overhead became more computationally costly than the numerical simulation.

  15. Subjective comparison of temporal and quality scalability

    DEFF Research Database (Denmark)

    Korhonen, Jari; Reiter, Ulrich; You, Junyong

    2011-01-01

    and quality scalability. The practical experiments with low resolution video sequences show that in general, distortion is a more crucial factor for the perceived subjective quality than frame rate. However, the results also depend on the content. Moreover, we discuss the role of other different influence...

  16. Accelerating Value Creation with Accelerators

    DEFF Research Database (Denmark)

    Jonsson, Eythor Ivar

    2015-01-01

    accelerator programs. Microsoft runs accelerators in seven different countries. Accelerators have grown out of the infancy stage and are now an accepted approach to develop new ventures based on cutting-edge technology like the internet of things, mobile technology, big data and virtual reality. It is also...... an approach to facilitate implementation and realization of business ideas and is a lucrative approach to transform research into ventures and to revitalize regions and industries in transition. Investors have noticed that the accelerator approach is a way to increase the possibility of success by funnelling...... with the traditional audit and legal universes and industries are examples of emerging potentials both from a research and business point of view to exploit and explore further. The accelerator approach may therefore be an Idea Watch to consider, no matter which industry you are in, because in essence accelerators...

  17. Hardware device binding and mutual authentication

    Energy Technology Data Exchange (ETDEWEB)

    Hamlet, Jason R; Pierson, Lyndon G

    2014-03-04

    Detection and deterrence of device tampering and subversion by substitution may be achieved by including a cryptographic unit within a computing device for binding multiple hardware devices and mutually authenticating the devices. The cryptographic unit includes a physically unclonable function ("PUF") circuit disposed in or on the hardware device, which generates a binding PUF value. The cryptographic unit uses the binding PUF value during an enrollment phase and subsequent authentication phases. During a subsequent authentication phase, the cryptographic unit uses the binding PUF values of the multiple hardware devices to generate a challenge to send to the other device, and to verify a challenge received from the other device to mutually authenticate the hardware devices.
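
    The enrollment and mutual-authentication phases mentioned in this abstract can be illustrated with a generic HMAC challenge-response exchange. This is not the patented protocol, whose details the abstract does not give: the `Device` class, the key derivation from the two PUF values and the message layout are assumptions, and each PUF is simulated by a fixed random byte string (a real PUF output is noisy and needs error correction, and would not normally be shared in the clear).

```python
import hashlib
import hmac
import secrets


class Device:
    def __init__(self, name, puf_value):
        self.name = name
        self._puf_value = puf_value   # stand-in for a PUF circuit output
        self._binding_key = None

    def read_puf(self):
        return self._puf_value

    def bind(self, peer_puf_value):
        """Enrollment: derive a binding key from both devices' PUF values."""
        first, second = sorted([self._puf_value, peer_puf_value])
        self._binding_key = hashlib.sha256(first + second).digest()

    def respond(self, nonce):
        """Answer a challenge; the name is included to block reflection."""
        msg = self.name.encode() + nonce
        return hmac.new(self._binding_key, msg, hashlib.sha256).digest()

    def verify(self, peer_name, nonce, response):
        msg = peer_name.encode() + nonce
        expected = hmac.new(self._binding_key, msg, hashlib.sha256).digest()
        return hmac.compare_digest(expected, response)


def enroll(a, b):
    """Assumed to run once in a trusted setting."""
    a.bind(b.read_puf())
    b.bind(a.read_puf())


def mutually_authenticate(a, b):
    """Each side challenges the other with a fresh nonce."""
    nonce_a, nonce_b = secrets.token_bytes(16), secrets.token_bytes(16)
    b_ok = a.verify(b.name, nonce_a, b.respond(nonce_a))
    a_ok = b.verify(a.name, nonce_b, a.respond(nonce_b))
    return a_ok and b_ok


dev_a = Device("A", secrets.token_bytes(32))
dev_b = Device("B", secrets.token_bytes(32))
enroll(dev_a, dev_b)

# A substituted device claims the name "B" but has a different PUF value.
impostor = Device("B", secrets.token_bytes(32))
impostor.bind(dev_a.read_puf())
```

    In this sketch the impostor derives a different binding key because its PUF value differs, so its responses fail verification, which mirrors the substitution detection the abstract describes.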

  18. Hardware-in-the-Loop Testing

    Data.gov (United States)

    Federal Laboratory Consortium — RTC has a suite of Hardware-in-the Loop facilities that include three operational facilities that provide performance assessment and production acceptance testing of...

  19. LIBO accelerates

    CERN Multimedia

    2002-01-01

    The prototype module of LIBO, a linear accelerator project designed for cancer therapy, has passed its first proton-beam acceleration test. In parallel a new version - LIBO-30 - is being developed, which promises to open up even more interesting avenues.

  20. A New, Scalable and Low Cost Multi-Channel Monitoring System for Polymer Electrolyte Fuel Cells

    Directory of Open Access Journals (Sweden)

    Antonio José Calderón

    2016-03-01

    In this work a new, scalable and low cost multi-channel monitoring system for Polymer Electrolyte Fuel Cells (PEFCs) has been designed, constructed and experimentally validated. This developed monitoring system performs non-intrusive voltage measurement of each individual cell of a PEFC stack and it is scalable, in the sense that it is capable of carrying out measurements in stacks from 1 to 120 cells (from watts to kilowatts). The developed system comprises two main subsystems: hardware devoted to data acquisition (DAQ) and software devoted to real-time monitoring. The DAQ subsystem is based on the low-cost open-source platform Arduino and the real-time monitoring subsystem has been developed using the high-level graphical language NI LabVIEW. Such integration can be considered a novelty in scientific literature for PEFC monitoring systems. An original amplifying and multiplexing board has been designed to increase the Arduino input port availability. Data storage and real-time monitoring have been performed with an easy-to-use interface. Graphical and numerical visualization allows a continuous tracking of cell voltage. Scalability, flexibility, ease of use, versatility and low cost are the main features of the proposed approach. The system is described and experimental results are presented. These results demonstrate its suitability to monitor the voltage in a PEFC at cell level.

  1. A New, Scalable and Low Cost Multi-Channel Monitoring System for Polymer Electrolyte Fuel Cells

    Science.gov (United States)

    Calderón, Antonio José; González, Isaías; Calderón, Manuel; Segura, Francisca; Andújar, José Manuel

    2016-01-01

    In this work a new, scalable and low cost multi-channel monitoring system for Polymer Electrolyte Fuel Cells (PEFCs) has been designed, constructed and experimentally validated. This developed monitoring system performs non-intrusive voltage measurement of each individual cell of a PEFC stack and it is scalable, in the sense that it is capable of carrying out measurements in stacks from 1 to 120 cells (from watts to kilowatts). The developed system comprises two main subsystems: hardware devoted to data acquisition (DAQ) and software devoted to real-time monitoring. The DAQ subsystem is based on the low-cost open-source platform Arduino and the real-time monitoring subsystem has been developed using the high-level graphical language NI LabVIEW. Such integration can be considered a novelty in scientific literature for PEFC monitoring systems. An original amplifying and multiplexing board has been designed to increase the Arduino input port availability. Data storage and real-time monitoring have been performed with an easy-to-use interface. Graphical and numerical visualization allows a continuous tracking of cell voltage. Scalability, flexibility, ease of use, versatility and low cost are the main features of the proposed approach. The system is described and experimental results are presented. These results demonstrate its suitability to monitor the voltage in a PEFC at cell level. PMID:27005630

  2. IDD Archival Hardware Architecture and Workflow

    Energy Technology Data Exchange (ETDEWEB)

    Mendonsa, D; Nekoogar, F; Martz, H

    2008-10-09

    This document describes the functionality of every component in the DHS/IDD archival and storage hardware system shown in Fig. 1. The document describes the step-by-step process by which image data is received at LLNL, processed, and made available to authorized personnel and collaborators. Throughout this document, references are made to one of two figures: Fig. 1, which describes the elements of the architecture, and Fig. 2, which describes the workflow and how the project utilizes the available hardware.

  3. Cooperative communications hardware, channel and PHY

    CERN Document Server

    Dohler, Mischa

    2010-01-01

    Facilitating Cooperation for Wireless Systems. Cooperative Communications: Hardware, Channel & PHY focuses on issues pertaining to the PHY layer of wireless communication networks, offering a rigorous taxonomy of this dispersed field, along with a range of application scenarios for cooperative and distributed schemes, demonstrating how these techniques can be employed. The authors discuss hardware, complexity and power consumption issues, which are vital for understanding what can be realized at the PHY layer, showing how wireless channel models differ from more traditional

  4. Accelerating Inspire

    CERN Document Server

    AUTHOR|(CDS)2266999

    2017-01-01

    CERN has been involved in the dissemination of scientific results since its early days and has continuously updated the distribution channels. Currently, Inspire hosts catalogues of articles, authors, institutions, conferences, jobs, experiments, journals and more. Successful orientation among this amount of data requires comprehensive linking between the content. Inspire has lacked a system for linking experiments and articles together based on which accelerator they were conducted at. The purpose of this project has been to create such a system. Records for 156 accelerators were created and all 2913 experiments on Inspire were given corresponding MARC tags. Records of 18404 accelerator physics related bibliographic entries were also tagged with corresponding accelerator tags. Finally, as a part of the endeavour to broaden CERN's presence on Wikipedia, existing Wikipedia articles of accelerators were updated with short descriptions and links to Inspire. In total, 86 Wikipedia articles were updated. This repo...

  5. Induction accelerators

    CERN Document Server

    Takayama, Ken

    2011-01-01

    A broad class of accelerators rests on the induction principle whereby the accelerating electrical fields are generated by time-varying magnetic fluxes. Particularly suitable for the transport of bright and high-intensity beams of electrons, protons or heavy ions in any geometry (linear or circular) the research and development of induction accelerators is a thriving subfield of accelerator physics. This text is the first comprehensive account of both the fundamentals and the state of the art about the modern conceptual design and implementation of such devices. Accordingly, the first part of the book is devoted to the essential features of and key technologies used for induction accelerators at a level suitable for postgraduate students and newcomers to the field. Subsequent chapters deal with more specialized and advanced topics.

  6. Software for Managing Inventory of Flight Hardware

    Science.gov (United States)

    Salisbury, John; Savage, Scott; Thomas, Shirman

    2003-01-01

    The Flight Hardware Support Request System (FHSRS) is a computer program that relieves engineers at Marshall Space Flight Center (MSFC) of most of the non-engineering administrative burden of managing an inventory of flight hardware. The FHSRS can also be adapted to perform similar functions for other organizations. The FHSRS affords a combination of capabilities, including those formerly provided by three separate programs in purchasing, inventorying, and inspecting hardware. The FHSRS provides a Web-based interface with a server computer that supports a relational database of inventory; electronic routing of requests and approvals; and electronic documentation from initial request through implementation of quality criteria, acquisition, receipt, inspection, storage, and final issue of flight materials and components. The database lists both hardware acquired for current projects and residual hardware from previous projects. The increased visibility of residual flight components provided by the FHSRS has dramatically improved the re-utilization of materials in lieu of new procurements, resulting in a cost savings of over $1.7 million. The FHSRS includes subprograms for manipulating the data in the database, informing of the status of a request or an item of hardware, and searching the database on any physical or other technical characteristic of a component or material. The software structure forces normalization of the data to facilitate inquiries and searches for which users have entered mixed or inconsistent values.

  7. Accelerating artificial intelligence with reconfigurable computing

    Science.gov (United States)

    Cieszewski, Radoslaw

    Reconfigurable computing is emerging as an important area of research in computer architectures and software systems. Many algorithms can be greatly accelerated by placing the computationally intense portions of an algorithm into reconfigurable hardware. Reconfigurable computing combines many benefits of both software and ASIC implementations. Like software, the mapped circuit is flexible, and can be changed over the lifetime of the system. Similar to an ASIC, reconfigurable systems provide a method to map circuits into hardware. Reconfigurable systems therefore have the potential to achieve far greater performance than software as a result of bypassing the fetch-decode-execute operations of traditional processors, and possibly exploiting a greater level of parallelism. Artificial intelligence is one such field, in which many different algorithms can be accelerated. This paper presents example hardware implementations of Artificial Neural Networks, Genetic Algorithms and Expert Systems.

  8. AVR microcontroller simulator for software implemented hardware fault tolerance algorithms research

    Science.gov (United States)

    Piotrowski, Adam; Tarnowski, Szymon; Napieralski, Andrzej

    2008-01-01

    Reliability of new, advanced electronic systems becomes a serious problem, especially in places like accelerators and synchrotrons, where sophisticated digital devices operate close to radiation sources. One of the possible solutions to harden a microprocessor-based system is a strict programming approach known as Software Implemented Hardware Fault Tolerance. Unfortunately, in real environments it is not possible to perform precise and accurate tests of the new algorithms due to hardware limitations. This paper highlights the AVR-family microcontroller simulator project, equipped with appropriate monitoring and SEU injection systems.

  9. Comparison Of Hybrid Sorting Algorithms Implemented On Different Parallel Hardware Platforms

    Directory of Open Access Journals (Sweden)

    Dominik Zurek

    2013-01-01

    Full Text Available Sorting is a common problem in computer science. There are a lot of well-known sorting algorithms created for sequential execution on a single processor. Recently, hardware platforms have made it possible to create widely parallel algorithms: standard processors consist of multiple cores, and hardware accelerators such as GPUs are available. Graphics cards, with their parallel architecture, offer new possibilities for speeding up many algorithms. In this paper we describe the results of implementing a few different sorting algorithms on GPU cards and multicore processors. A hybrid algorithm is then presented, consisting of parts executed on both platforms, a standard CPU and a GPU.

  10. Cosmic Acceleration

    Science.gov (United States)

    Bean, Rachel

    2011-03-01

    In this series of lectures we review observational evidence for, and theoretical investigations into, cosmic acceleration and dark energy. The notes are in four sections. First I review the basic cosmological formalism to describe the expansion history of the universe and how distance measures are defined. The second section covers the evidence for cosmic acceleration from cosmic distance measurements. Section 3 discusses the theoretical avenues being considered to explain the cosmological observations. Section 4 discusses how the growth of inhomogeneities and large scale structure observations might help us pin down the theoretical origin of cosmic acceleration.

  11. Real-time multiprocessor architecture for sharing stream processing accelerators

    NARCIS (Netherlands)

    Dekens, B.H.J.; Bekooij, Marco Jan Gerrit; Smit, Gerardus Johannes Maria

    2015-01-01

    Stream processing accelerators are often applied in MPSoCs for software defined radios. Sharing of these accelerators between different streams could improve their utilization and reduce thereby the hardware cost but is challenging under real-time constraints. In this paper we introduce entry- and

  12. Optimization of the Felix Accelerator with Respect to Laser Performance

    NARCIS (Netherlands)

    van der Meer, A. F. G.; Bakker, R. J.; van der Geer, C. A. J.; Oepts, D.; van Amersfoort, P. W.; Gillespie, W. A.; Martin, P. F.; Saxon, G.

    1993-01-01

    In this paper we discuss the performance of the FELIX accelerator in relation to the laser performance. Over the past year, a number of improvements have been made to the accelerator, both to the hardware and to the way in which it was operated, that have resulted in a reduction of the time needed

  13. Computing requirements for S. S. C. accelerator design and studies

    Energy Technology Data Exchange (ETDEWEB)

    Dragt, A.; Talman, R.; Siemann, R.; Dell, G.F.; Leemann, B.; Leemann, C.; Nauenberg, U.; Peggs, S.; Douglas, D.

    1984-01-01

    We estimate the computational hardware resources that will be required for accelerator physics studies during the design of the Superconducting SuperCollider. It is found that both Class IV and Class VI facilities (1) will be necessary. We describe a user environment for these facilities that is desirable within the context of accelerator studies. An acquisition scenario for these facilities is presented.

  14. Horizontal Accelerator

    Data.gov (United States)

    Federal Laboratory Consortium — The Horizontal Accelerator (HA) Facility is a versatile research tool available for use on projects requiring simulation of the crash environment. The HA Facility is...

  15. Accelerated construction

    Science.gov (United States)

    2004-01-01

    Accelerated Construction Technology Transfer (ACTT) is a strategic process that uses various innovative techniques, strategies, and technologies to minimize actual construction time, while enhancing quality and safety on today's large, complex multip...

  16. A Modular Framework for Modeling Hardware Elements in Distributed Engine Control Systems

    Science.gov (United States)

    Zinnecker, Alicia M.; Culley, Dennis E.; Aretskin-Hariton, Eliot D.

    2015-01-01

    Progress toward the implementation of distributed engine control in an aerospace application may be accelerated through the development of a hardware-in-the-loop (HIL) system for testing new control architectures and hardware outside of a physical test cell environment. One component required in an HIL simulation system is a high-fidelity model of the control platform: sensors, actuators, and the control law. The control system developed for the Commercial Modular Aero-Propulsion System Simulation 40k (C-MAPSS40k) provides a verifiable baseline for development of a model for simulating a distributed control architecture. This distributed controller model will contain enhanced hardware models, capturing the dynamics of the transducer and the effects of data processing, and a model of the controller network. A multilevel framework is presented that establishes three sets of interfaces in the control platform: communication with the engine (through sensors and actuators), communication between hardware and controller (over a network), and the physical connections within individual pieces of hardware. This introduces modularity at each level of the model, encouraging collaboration in the development and testing of various control schemes or hardware designs. At the hardware level, this modularity is leveraged through the creation of a Simulink® library containing blocks for constructing smart transducer models complying with the IEEE 1451 specification. These hardware models were incorporated in a distributed version of the baseline C-MAPSS40k controller and simulations were run to compare the performance of the two models. The overall tracking ability differed only due to quantization effects in the feedback measurements in the distributed controller. Additionally, it was also found that the added complexity of the smart transducer models did not prevent real-time operation of the distributed controller model, a requirement of an HIL system.

  17. A scalable parallel black oil simulator on distributed memory parallel computers

    Science.gov (United States)

    Wang, Kun; Liu, Hui; Chen, Zhangxin

    2015-11-01

    This paper presents our work on developing a parallel black oil simulator for distributed memory computers based on our in-house parallel platform. The parallel simulator is designed to overcome the performance issues of common simulators that are implemented for personal computers and workstations. The finite difference method is applied to discretize the black oil model. In addition, some advanced techniques are employed to strengthen the robustness and parallel scalability of the simulator, including an inexact Newton method, matrix decoupling methods, and algebraic multigrid methods. A new multi-stage preconditioner is proposed to accelerate the solution of linear systems from the Newton methods. Numerical experiments show that our simulator is scalable and efficient, and is capable of simulating extremely large-scale black oil problems with tens of millions of grid blocks using thousands of MPI processes on parallel computers.

  18. A Scalable Data Access Layer to Manage Structured Heterogeneous Biomedical Data.

    Directory of Open Access Journals (Sweden)

    Giovanni Delussu

    Full Text Available This work presents a scalable data access layer, called PyEHR, designed to support the implementation of data management systems for secondary use of structured heterogeneous biomedical and clinical data. PyEHR adopts the openEHR's formalisms to guarantee the decoupling of data descriptions from implementation details and exploits structure indexing to accelerate searches. Data persistence is guaranteed by a driver layer with a common driver interface. Interfaces for two NoSQL Database Management Systems are already implemented: MongoDB and Elasticsearch. We evaluated the scalability of PyEHR experimentally through two types of tests, called "Constant Load" and "Constant Number of Records", with queries of increasing complexity on synthetic datasets of ten million records each, containing very complex openEHR archetype structures, distributed on up to ten computing nodes.

  19. A Scalable Data Access Layer to Manage Structured Heterogeneous Biomedical Data.

    Science.gov (United States)

    Delussu, Giovanni; Lianas, Luca; Frexia, Francesca; Zanetti, Gianluigi

    2016-01-01

    This work presents a scalable data access layer, called PyEHR, designed to support the implementation of data management systems for secondary use of structured heterogeneous biomedical and clinical data. PyEHR adopts the openEHR's formalisms to guarantee the decoupling of data descriptions from implementation details and exploits structure indexing to accelerate searches. Data persistence is guaranteed by a driver layer with a common driver interface. Interfaces for two NoSQL Database Management Systems are already implemented: MongoDB and Elasticsearch. We evaluated the scalability of PyEHR experimentally through two types of tests, called "Constant Load" and "Constant Number of Records", with queries of increasing complexity on synthetic datasets of ten million records each, containing very complex openEHR archetype structures, distributed on up to ten computing nodes.

  20. A Scalable Data Access Layer to Manage Structured Heterogeneous Biomedical Data

    Science.gov (United States)

    Lianas, Luca; Frexia, Francesca; Zanetti, Gianluigi

    2016-01-01

    This work presents a scalable data access layer, called PyEHR, designed to support the implementation of data management systems for secondary use of structured heterogeneous biomedical and clinical data. PyEHR adopts the openEHR’s formalisms to guarantee the decoupling of data descriptions from implementation details and exploits structure indexing to accelerate searches. Data persistence is guaranteed by a driver layer with a common driver interface. Interfaces for two NoSQL Database Management Systems are already implemented: MongoDB and Elasticsearch. We evaluated the scalability of PyEHR experimentally through two types of tests, called “Constant Load” and “Constant Number of Records”, with queries of increasing complexity on synthetic datasets of ten million records each, containing very complex openEHR archetype structures, distributed on up to ten computing nodes. PMID:27936191

  1. Hardware and Initial Beam Commissioning of the LHC RF Systems

    CERN Document Server

    Linnecar, T; Arnaudon, L; Baudrenghien, P; Bohl, T; Brunner, O; Butterworth, A; Ciapala, Edmond; Dubouchet, F; Ferreira-Bento, J; Glenat, D; Hagmann, G; Höfle, Wolfgang; Julie, C; Killing, F; Kotzian, G; Landre, D; Louwerse, R; Maesen, P; Martinez-Yanez, P; Molendijk, J; Montesinos, E; Nicou, C; Noirjean, J; Papotti, G; Pashnin, A; Pechaud, G; Pradier, J; Rossi, V; Sanchez-Quesada, J; Schokker, M; Shaposhnikova, E; Sorokoletev, R; Stellfeld, D; Tückmantel, Joachim; Valuch, D; Wehrle, U; Weierud, F

    2008-01-01

    Hardware commissioning of the LHC RF Systems, the ACS Superconducting RF systems, ADT Transverse Dampers and APWL Wideband Longitudinal Monitors, started in late 2007 and was completed in time for the first LHC beams in 2008. The RF inter-machine synchronisation systems were in place and operational for the LHC synchronization tests in August 2008. The very first beams through IP4 were observed on the RF monitors and beam 2 was captured on 11th September. Measurements with beam on the damper systems were also possible, preparing the way for closing the damper loop with beam. Major milestones during commissioning the ACS and ADT systems and results obtained during first capture tests are presented. Preparatory work for acceleration and multi-bunch operation is described as are the beam tests foreseen for 2009.

  2. Scalable Atomistic Simulation Algorithms for Materials Research

    Directory of Open Access Journals (Sweden)

    Aiichiro Nakano

    2002-01-01

    Full Text Available A suite of scalable atomistic simulation programs has been developed for materials research based on space-time multiresolution algorithms. Design and analysis of parallel algorithms are presented for molecular dynamics (MD) simulations and quantum-mechanical (QM) calculations based on the density functional theory. Performance tests have been carried out on 1,088-processor Cray T3E and 1,280-processor IBM SP3 computers. The linear-scaling algorithms have enabled 6.44-billion-atom MD and 111,000-atom QM calculations on 1,024 SP3 processors with parallel efficiency well over 90%. The production-quality programs also feature wavelet-based computational-space decomposition for adaptive load balancing, spacefilling-curve-based adaptive data compression with user-defined error bound for scalable I/O, and octree-based fast visibility culling for immersive and interactive visualization of massive simulation data.

  3. Declarative and Scalable Selection for Map Visualizations

    DEFF Research Database (Denmark)

    Kefaloukos, Pimin Konstantin Balic

    supports the PostgreSQL dialect of SQL. The prototype implementation is a compiler that translates CVL into SQL and stored procedures. (c) TileHeat is a framework and basic algorithm for partial materialization of hot tile sets for scalable map distribution. The framework predicts future map workloads......, there are indications that the method is scalable for databases that contain millions of records, especially if the target language of the compiler is substituted by a cluster-ready variant of SQL. While several realistic use cases for maps have been implemented in CVL, additional non-geographic data visualization uses...... goal. The results for Tileheat show that the prediction method offers a substantial improvement over the current method used by the Danish Geodata Agency. Thus, a large amount of computations can potentially be saved by this public institution, who is responsible for the distribution of government...

  4. A Scalability Model for ECS's Data Server

    Science.gov (United States)

    Menasce, Daniel A.; Singhal, Mukesh

    1998-01-01

    This report presents in four chapters a model for the scalability analysis of the Data Server subsystem of the Earth Observing System Data and Information System (EOSDIS) Core System (ECS). The model analyzes if the planned architecture of the Data Server will support an increase in the workload with the possible upgrade and/or addition of processors, storage subsystems, and networks. The approaches in the report include a summary of the architecture of ECS's Data server as well as a high level description of the Ingest and Retrieval operations as they relate to ECS's Data Server. This description forms the basis for the development of the scalability model of the data server and the methodology used to solve it.

  5. Stencil Lithography for Scalable Micro- and Nanomanufacturing

    Directory of Open Access Journals (Sweden)

    Ke Du

    2017-04-01

    Full Text Available In this paper, we review the current development of stencil lithography for scalable micro- and nanomanufacturing as a resistless and reusable patterning technique. We first introduce the motivation and advantages of stencil lithography for large-area micro- and nanopatterning. Then we review the progress of using rigid membranes such as SiNx and Si as stencil masks as well as stacking layers. We also review the current use of flexible membranes including a compliant SiNx membrane with springs, polyimide film, polydimethylsiloxane (PDMS) layer, and photoresist-based membranes as stencil lithography masks to address problems such as blurring and non-planar surface patterning. Moreover, we discuss the dynamic stencil lithography technique, which significantly improves the patterning throughput and speed by moving the stencil over the target substrate during deposition. Lastly, we discuss the future advancement of stencil lithography for a resistless, reusable, scalable, and programmable nanolithography method.

  6. LINEAR ACCELERATOR

    Science.gov (United States)

    Christofilos, N.C.; Polk, I.J.

    1959-02-17

    Improvements in linear particle accelerators are described. A drift tube system for a linear ion accelerator reduces gap capacity between adjacent drift tube ends. This is accomplished by reducing the ratio of the diameter of the drift tube to the diameter of the resonant cavity. Concentration of magnetic field intensity at the longitudinal midpoint of the external surface of each drift tube is reduced by increasing the external drift tube diameter at the longitudinal center region.

  7. SPRNG Scalable Parallel Random Number Generator Library

    Energy Technology Data Exchange (ETDEWEB)

    2010-03-16

    This revision corrects some errors in SPRNG 1. Users of newer SPRNG versions can obtain the corrected files and build their version with it. This version also improves the scalability of some of the application-based tests in the SPRNG test suite. It also includes an interface to a parallel Mersenne Twister, so that if users install the Mersenne Twister, then they can test this generator with the SPRNG test suite and also use some SPRNG features with that generator.

  8. Bitcoin-NG: A Scalable Blockchain Protocol

    OpenAIRE

    Eyal, Ittay; Gencer, Adem Efe; Sirer, Emin Gun; Renesse, Robbert,

    2015-01-01

    Cryptocurrencies, based on and led by Bitcoin, have shown promise as infrastructure for pseudonymous online payments, cheap remittance, trustless digital asset exchange, and smart contracts. However, Bitcoin-derived blockchain protocols have inherent scalability limits that trade-off between throughput and latency and withhold the realization of this potential. This paper presents Bitcoin-NG, a new blockchain protocol designed to scale. Based on Bitcoin's blockchain protocol, Bitcoin-NG is By...

  9. Stencil Lithography for Scalable Micro- and Nanomanufacturing

    OpenAIRE

    Ke Du; Junjun Ding; Yuyang Liu; Ishan Wathuthanthri; Chang-Hwan Choi

    2017-01-01

    In this paper, we review the current development of stencil lithography for scalable micro- and nanomanufacturing as a resistless and reusable patterning technique. We first introduce the motivation and advantages of stencil lithography for large-area micro- and nanopatterning. Then we review the progress of using rigid membranes such as SiNx and Si as stencil masks as well as stacking layers. We also review the current use of flexible membranes including a compliant SiNx membrane with spring...

  10. Scalable robotic biofabrication of tissue spheroids

    Energy Technology Data Exchange (ETDEWEB)

    Mehesz, A Nagy; Hajdu, Z; Visconti, R P; Markwald, R R; Mironov, V [Advanced Tissue Biofabrication Center, Department of Regenerative Medicine and Cell Biology, Medical University of South Carolina, Charleston, SC (United States); Brown, J [Department of Mechanical Engineering, Clemson University, Clemson, SC (United States); Beaver, W [York Technical College, Rock Hill, SC (United States); Da Silva, J V L, E-mail: mironovv@musc.edu [Renato Archer Information Technology Center-CTI, Campinas (Brazil)

    2011-06-15

    Development of methods for scalable biofabrication of uniformly sized tissue spheroids is essential for tissue spheroid-based bioprinting of large size tissue and organ constructs. The most recent scalable technique for tissue spheroid fabrication employs a micromolded recessed template prepared in a non-adhesive hydrogel, wherein the cells loaded into the template self-assemble into tissue spheroids due to gravitational force. In this study, we present an improved version of this technique. A new mold was designed to enable generation of 61 microrecessions in each well of a 96-well plate. The microrecessions were seeded with cells using an EpMotion 5070 automated pipetting machine. After 48 h of incubation, tissue spheroids formed at the bottom of each microrecession. To assess the quality of constructs generated using this technology, 600 tissue spheroids made by this method were compared with 600 spheroids generated by the conventional hanging drop method. These analyses showed that tissue spheroids fabricated by the micromolded method are more uniform in diameter. Thus, use of micromolded recessions in a non-adhesive hydrogel, combined with automated cell seeding, is a reliable method for scalable robotic fabrication of uniform-sized tissue spheroids.

  11. A scalable distributed RRT for motion planning

    KAUST Repository

    Jacobs, Sam Ade

    2013-05-01

    Rapidly-exploring Random Tree (RRT), like other sampling-based motion planning methods, has been very successful in solving motion planning problems. Even so, sampling-based planners cannot solve all problems of interest efficiently, so attention is increasingly turning to parallelizing them. However, one challenge in parallelizing RRT is the global computation and communication overhead of nearest neighbor search, a key operation in RRTs. This is a critical issue as it limits the scalability of previous algorithms. We present two parallel algorithms to address this problem. The first algorithm extends existing work by introducing a parameter that adjusts how much local computation is done before a global update. The second algorithm radially subdivides the configuration space into regions, constructs a portion of the tree in each region in parallel, and connects the subtrees, removing cycles if they exist. By subdividing the space, we increase computation locality enabling a scalable result. We show that our approaches are scalable. We present results demonstrating almost linear scaling to hundreds of processors on a Linux cluster and a Cray XE6 machine. © 2013 IEEE.
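
    The radial subdivision mentioned in this abstract can be pictured with a small sketch. It is only an illustration under loud assumptions: a 2D configuration space, equal angular wedges around the tree root, and a hypothetical helper name `radial_region`. The abstract does not say how the authors define or balance their regions.

```python
import math


def radial_region(point, root, num_regions):
    """Return the index of the angular wedge around `root` that contains `point`.

    Hypothetical helper: assumes a 2D space split into `num_regions` equal wedges.
    """
    # Angle of the point as seen from the root, in (-pi, pi].
    angle = math.atan2(point[1] - root[1], point[0] - root[0])
    # Map the angle onto [0, 1) so it can be scaled to a wedge index.
    fraction = (angle % (2 * math.pi)) / (2 * math.pi)
    # Clamp to guard against floating-point rounding at the upper boundary.
    return min(int(fraction * num_regions), num_regions - 1)
```

    In a sketch like this, each worker would grow a subtree using only the samples whose wedge index matches its own, which keeps nearest-neighbor searches local until the subtrees are connected.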

  12. Accelerating Climate Simulations Through Hybrid Computing

    Science.gov (United States)

    Zhou, Shujia; Sinno, Scott; Cruz, Carlos; Purcell, Mark

    2009-01-01

    Unconventional multi-core processors (e.g., IBM Cell B/E and NVIDIA GPU) have emerged as accelerators in climate simulation. However, climate models typically run on parallel computers with conventional processors (e.g., Intel and AMD) using MPI. Connecting accelerators to this architecture efficiently and easily becomes a critical issue. When using MPI for connection, we identified two challenges: (1) identical MPI implementation is required in both systems, and (2) existing MPI code must be modified to accommodate the accelerators. In response, we have extended and deployed IBM Dynamic Application Virtualization (DAV) in a hybrid computing prototype system (one blade with two Intel quad-core processors, two IBM QS22 Cell blades, connected with Infiniband), allowing for seamlessly offloading compute-intensive functions to remote, heterogeneous accelerators in a scalable, load-balanced manner. Currently, a climate solar radiation model running with multiple MPI processes has been offloaded to multiple Cell blades with approx. 10% network overhead.

  13. Numeric Analysis for Relationship-Aware Scalable Streaming Scheme

    Directory of Open Access Journals (Sweden)

    Heung Ki Lee

    2014-01-01

    Full Text Available Frequent packet loss of media data is a critical problem that degrades the quality of streaming services over mobile networks. Packet loss invalidates frames containing lost packets and other related frames at the same time. Indirect loss caused by losing packets decreases the quality of streaming. A scalable streaming service can decrease the amount of dropped multimedia resulting from a single packet loss. Content providers typically divide one large media stream into several layers through a scalable streaming service and then provide each scalable layer to the user depending on the mobile network. Also, a scalable streaming service makes it possible to decode partial multimedia data depending on the relationship between frames and layers. Therefore, a scalable streaming service provides a way to decrease the wasted multimedia data when one packet is lost. However, the hierarchical structure between frames and layers of scalable streams determines the service quality of the scalable streaming service. Even if whole packets of layers are transmitted successfully, they cannot be decoded as a result of the absence of reference frames and layers. Therefore, the complicated relationship between frames and layers in a scalable stream increases the volume of abandoned layers. For providing a high-quality scalable streaming service, we choose a proper relationship between scalable layers as well as the amount of transmitted multimedia data depending on the network situation. We prove that a simple scalable scheme outperforms a complicated scheme in an error-prone network. We suggest an adaptive set-top box (AdaptiveSTB) to lower the dependency between scalable layers in a scalable stream. Also, we provide a numerical model to obtain the indirect loss of multimedia data and apply it to various multimedia streams. Our AdaptiveSTB enhances the quality of a scalable streaming service by removing indirect loss.
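
    The notion of indirect loss described in this abstract can be made concrete with a short sketch. It is not the paper's numerical model: the frame names, the dependency tables, and the helpers `undecodable_frames` and `indirect_loss` are all hypothetical, chosen only to show how a lost frame invalidates the frames that reference it.

```python
def undecodable_frames(depends_on, lost):
    """Return every frame that cannot be decoded once the frames in `lost` are missing.

    `depends_on` maps a frame name to the frames it references (hypothetical layout).
    """
    broken = set(lost)
    changed = True
    # Repeat until no further frame is invalidated by a broken reference.
    while changed:
        changed = False
        for frame, refs in depends_on.items():
            if frame not in broken and any(r in broken for r in refs):
                broken.add(frame)
                changed = True
    return broken


def indirect_loss(depends_on, lost):
    """Frames that arrived intact but are unusable because a reference was lost."""
    return undecodable_frames(depends_on, lost) - set(lost)


# Hypothetical two-layer stream: base frames B0..B2 and enhancement frames E0..E2.
simple = {
    "B0": [], "B1": ["B0"], "B2": ["B1"],
    "E0": ["B0"], "E1": ["B1"], "E2": ["B2"],
}
# Same frames, but each enhancement frame also references the previous one.
complicated = {
    "B0": [], "B1": ["B0"], "B2": ["B1"],
    "E0": ["B0"], "E1": ["B1", "E0"], "E2": ["B2", "E1"],
}
```

    With these made-up tables, losing only `E0` costs nothing extra in the simple layout but also invalidates `E1` and `E2` in the complicated one, which is the kind of effect the abstract attributes to complicated frame and layer relationships.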

  14. Quantitative hardware prediction modeling for hardware/software co-design

    NARCIS (Netherlands)

    Meeuws, R.J.

    2012-01-01

    Hardware estimation is an important factor in Hardware/Software Co-design. In this dissertation, we present the Quipu Modeling Approach, a high-level quantitative prediction model for HW/SW Partitioning using statistical methods. Our approach uses linear regression between software complexity

  15. A Hardware Abstraction Layer in Java

    DEFF Research Database (Denmark)

    Schoeberl, Martin; Korsholm, Stephan; Kalibera, Tomas

    2011-01-01

    Embedded systems use specialized hardware devices to interact with their environment, and since they have to be dependable, it is attractive to use a modern, type-safe programming language like Java to develop programs for them. Standard Java, as a platform-independent language, delegates access...... to devices, direct memory access, and interrupt handling to some underlying operating system or kernel, but in the embedded systems domain resources are scarce and a Java Virtual Machine (JVM) without an underlying middleware is an attractive architecture. The contribution of this article is a proposal...... for Java packages with hardware objects and interrupt handlers that interface to such a JVM. We provide implementations of the proposal directly in hardware, as extensions of standard interpreters, and finally with an operating system middleware. The latter solution is mainly seen as a migration path...

  16. Efficient k-Winner-Take-All Competitive Learning Hardware Architecture for On-Chip Learning

    Science.gov (United States)

    Ou, Chien-Min; Li, Hui-Ya; Hwang, Wen-Jyi

    2012-01-01

    A novel k-winners-take-all (k-WTA) competitive learning (CL) hardware architecture is presented for on-chip learning in this paper. The architecture is based on an efficient pipeline allowing k-WTA competition processes associated with different training vectors to be performed concurrently. The pipeline architecture employs a novel codeword swapping scheme so that neurons failing the competition for a training vector are immediately available for the competitions for the subsequent training vectors. The architecture is implemented by the field programmable gate array (FPGA). It is used as a hardware accelerator in a system on programmable chip (SOPC) for realtime on-chip learning. Experimental results show that the SOPC has significantly lower training time than that of other k-WTA CL counterparts operating with or without hardware support.
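
    Illustrative sketch (software only, not the paper's pipelined architecture): one k-WTA competitive learning step selects the k codewords closest to a training vector and moves each of them toward it. The function name, the squared Euclidean distance, and the learning-rate parameter are assumptions; the codeword swapping scheme of the hardware pipeline is not modeled here.

```python
def kwta_step(codewords, x, k, lr):
    """Update the k codewords nearest to training vector x; return their indices."""
    # Squared Euclidean distance from x to every codeword (neuron).
    dists = [sum((c - xi) ** 2 for c, xi in zip(cw, x)) for cw in codewords]
    # The k neurons with the smallest distance win the competition.
    winners = sorted(range(len(codewords)), key=dists.__getitem__)[:k]
    for i in winners:
        # Move each winner a fraction lr of the way toward x.
        codewords[i] = [c + lr * (xi - c) for c, xi in zip(codewords[i], x)]
    return winners


codebook = [[0.0, 0.0], [10.0, 10.0], [1.0, 1.0]]
winners = kwta_step(codebook, [2.0, 2.0], k=2, lr=0.5)
```

    Here the two codewords nearest to the training vector are updated in place, and the distant one is left unchanged.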

  17. High-performance image reconstruction in fluorescence tomography on desktop computers and graphics hardware.

    Science.gov (United States)

    Freiberger, Manuel; Egger, Herbert; Liebmann, Manfred; Scharfetter, Hermann

    2011-11-01

    Image reconstruction in fluorescence optical tomography is a three-dimensional nonlinear ill-posed problem governed by a system of partial differential equations. In this paper we demonstrate that a combination of state-of-the-art numerical algorithms and a careful hardware-optimized implementation makes it possible to solve this large-scale inverse problem in a few seconds on standard desktop PCs with modern graphics hardware. In particular, we present methods to solve not only the forward but also the non-linear inverse problem by massively parallel programming on graphics processors. A comparison of optimized CPU and GPU implementations shows that the reconstruction can be accelerated by factors of about 15 through the use of the graphics hardware without compromising the accuracy in the reconstructed images.

  18. High-performance image reconstruction in fluorescence tomography on desktop computers and graphics hardware

    Science.gov (United States)

    Freiberger, Manuel; Egger, Herbert; Liebmann, Manfred; Scharfetter, Hermann

    2011-01-01

    Image reconstruction in fluorescence optical tomography is a three-dimensional nonlinear ill-posed problem governed by a system of partial differential equations. In this paper we demonstrate that a combination of state-of-the-art numerical algorithms and a careful hardware-optimized implementation makes it possible to solve this large-scale inverse problem in a few seconds on standard desktop PCs with modern graphics hardware. In particular, we present methods to solve not only the forward but also the non-linear inverse problem by massively parallel programming on graphics processors. A comparison of optimized CPU and GPU implementations shows that the reconstruction can be accelerated by factors of about 15 through the use of the graphics hardware without compromising the accuracy in the reconstructed images. PMID:22076279

  19. Scalable and balanced dynamic hybrid data assimilation

    Science.gov (United States)

    Kauranne, Tuomo; Amour, Idrissa; Gunia, Martin; Kallio, Kari; Lepistö, Ahti; Koponen, Sampsa

    2017-04-01

    Scalability of complex weather forecasting suites is dependent on the technical tools available for implementing highly parallel computational kernels, but to an equally large extent also on the dependence patterns between various components of the suite, such as observation processing, data assimilation and the forecast model. Scalability is a particular challenge for 4D variational assimilation methods that necessarily couple the forecast model into the assimilation process and subject this combination to an inherently serial quasi-Newton minimization process. Ensemble based assimilation methods are naturally more parallel, but large models force ensemble sizes to be small and that results in poor assimilation accuracy, somewhat akin to shooting with a shotgun in a million-dimensional space. The Variational Ensemble Kalman Filter (VEnKF) is an ensemble method that can attain the accuracy of 4D variational data assimilation with a small ensemble size. It achieves this by processing a Gaussian approximation of the current error covariance distribution, instead of a set of ensemble members, analogously to the Extended Kalman Filter (EKF). Ensemble members are re-sampled every time a new set of observations is processed from a new approximation of that Gaussian distribution which makes VEnKF a dynamic assimilation method. After this a smoothing step is applied that turns VEnKF into a dynamic Variational Ensemble Kalman Smoother (VEnKS). In this smoothing step, the same process is iterated with frequent re-sampling of the ensemble but now using past iterations as surrogate observations until the end result is a smooth and balanced model trajectory. In principle, VEnKF could suffer from similar scalability issues as 4D-Var. However, this can be avoided by isolating the forecast model completely from the minimization process by implementing the latter as a wrapper code whose only link to the model is calling for many parallel and totally independent model runs, all of them

  20. Economic impact of syndesmosis hardware removal.

    Science.gov (United States)

    Lalli, Trapper A J; Matthews, Leslie J; Hanselman, Andrew E; Hubbard, David F; Bramer, Michelle A; Santrock, Robert D

    2015-09-01

    Ankle syndesmosis injuries are commonly seen with 5-10% of sprains and 10% of ankle fractures involving injury to the ankle syndesmosis. Anatomic reduction has been shown to be the most important predictor of clinical outcomes. Optimal surgical management has been a subject of debate in the literature. The method of fixation, number of screws, screw size, and number of cortices are all controversial. Postoperative hardware removal has also been widely debated in the literature. Some surgeons advocate for elective hardware removal prior to resuming full weightbearing. Returning to the operating room for elective hardware removal results in increased cost to the patient, potential for infection or complication(s), and missed work days for the patient. Suture button devices and bioabsorbable screw fixation present other options, but cortical screw fixation remains the gold standard. This retrospective review was designed to evaluate the economic impact of a second operative procedure for elective removal of 3.5 mm cortical syndesmosis screws. Two hundred and two patients with ICD-9 code for "open treatment of distal tibiofibular joint (syndesmosis) disruption" were identified. The medical records were reviewed for those who underwent elective syndesmosis hardware removal. The primary outcome measurements included total hospital billing charges and total hospital billing collection. Secondary outcome measurements included average individual patient operative costs and average operating room time. Fifty-six patients were included in the study. Our institution billed a total of $188,271 (USD) and collected $106,284 (55%). The average individual patient operating room cost was $3579. The average operating room time was 67.9 min. To the best of our knowledge, no study has previously provided cost associated with syndesmosis hardware removal. Our study shows elective syndesmosis hardware removal places substantial economic burden on both the patient and the healthcare system

  1. Quantum neuromorphic hardware for quantum artificial intelligence

    Science.gov (United States)

    Prati, Enrico

    2017-08-01

    The development of machine learning methods based on deep learning boosted the field of artificial intelligence towards unprecedented achievements and application in several fields. Such prominent results were made in parallel with the first successful demonstrations of fault tolerant hardware for quantum information processing. To which extent deep learning can take advantage of the existence of a hardware based on qubits behaving as a universal quantum computer is an open question under investigation. Here I review the convergence between the two fields towards implementation of advanced quantum algorithms, including quantum deep learning.

  2. Human Centered Hardware Modeling and Collaboration

    Science.gov (United States)

    Stambolian, Damon; Lawrence, Brad; Stelges, Katrine; Henderson, Gena

    2013-01-01

    In order to collaborate engineering designs among NASA Centers and customers, to include hardware and human activities from multiple remote locations, live human-centered modeling and collaboration across several sites has been successfully facilitated by Kennedy Space Center. The focus of this paper includes innovative approaches to engineering design analyses and training, along with research being conducted to apply new technologies for tracking, immersing, and evaluating humans as well as rocket, vehicle, component, or facility hardware utilizing high resolution cameras, motion tracking, ergonomic analysis, biomedical monitoring, work instruction integration, head-mounted displays, and other innovative human-system integration modeling, simulation, and collaboration applications.

  3. From experiment to design -- Fault characterization and detection in parallel computer systems using computational accelerators

    Science.gov (United States)

    Yim, Keun Soo

    This dissertation summarizes experimental validation and co-design studies conducted to optimize the fault detection capabilities and overheads in hybrid computer systems (e.g., using CPUs and Graphics Processing Units, or GPUs), and consequently to improve the scalability of parallel computer systems using computational accelerators. The experimental validation studies were conducted to help us understand the failure characteristics of CPU-GPU hybrid computer systems under various types of hardware faults. The main characterization targets were faults that are difficult to detect and/or recover from, e.g., faults that cause long latency failures (Ch. 3), faults in dynamically allocated resources (Ch. 4), faults in GPUs (Ch. 5), faults in MPI programs (Ch. 6), and microarchitecture-level faults with specific timing features (Ch. 7). The co-design studies were based on the characterization results. One of the co-designed systems has a set of source-to-source translators that customize and strategically place error detectors in the source code of target GPU programs (Ch. 5). Another co-designed system uses an extension card to learn the normal behavioral and semantic execution patterns of message-passing processes executing on CPUs, and to detect abnormal behaviors of those parallel processes (Ch. 6). The third co-designed system is a co-processor that has a set of new instructions in order to support software-implemented fault detection techniques (Ch. 7). The work described in this dissertation gains more importance because heterogeneous processors have become an essential component of state-of-the-art supercomputers. GPUs were used in three of the five fastest supercomputers that were operating in 2011. Our work included comprehensive fault characterization studies in CPU-GPU hybrid computers. In CPUs, we monitored the target systems for a long period of time after injecting faults (a temporally comprehensive experiment), and injected faults into various types of

  4. Scalable System Design for Covert MIMO Communications

    Science.gov (United States)

    2014-06-01

    Chuck E. Cheese tokens. I’d also like to thank the CCR and the CCR staff for their help particularly in the area of hardware design. Finally, my committee...Communication Systems, pp. 349–352, November 2007. [46] Y. Sun and J. Cavallaro, “Trellis-Search Based Soft-Input Soft-Output MIMO Detector: Algorithm and VLSI

  5. Broadband accelerator control network

    Energy Technology Data Exchange (ETDEWEB)

    Skelly, J.; Clifford, T.; Frankel, R.

    1983-01-01

    A broadband data communications network has been implemented at BNL for control of the Alternating Gradient Synchrotron (AGS) proton accelerator, using commercial CATV hardware, dual coaxial cables as the communications medium, and spanning 2.0 km. A 4 MHz bandwidth Digital Control channel using CSMA-CA protocol is provided for digital data transmission, with 8 access nodes available over the length of the RELWAY. Each node consists of an rf modem and a microprocessor-based store-and-forward message handler which interfaces the RELWAY to a branch line implemented in GPIB. A gateway to the RELWAY control channel for the (preexisting) AGS Computerized Accelerator Operating system has been constructed using an LSI-11/23 microprocessor as a device in a GPIB branch line. A multilayer communications protocol has been defined for the Digital Control Channel, based on the ISO Open Systems Interconnect layered model, and a RELWAY Device Language defined as the required universal language for device control on this channel.

  6. Design considerations for space flight hardware

    Science.gov (United States)

    Glover, Daniel

    1990-01-01

    The environmental and design constraints are reviewed along with some insight into the established design and quality assurance practices that apply to low earth orbit (LEO) space flight hardware. It is intended as an introduction for people unfamiliar with space flight considerations. Some basic data and a bibliography are included.

  7. Enabling Open Hardware through FOSS tools

    CERN Multimedia

    CERN. Geneva

    2016-01-01

    Software developers often take open file formats and tools for granted. When you publish code on github, you do not ask yourself if somebody will be able to open it and modify it. We need the same freedom in the open hardware world, to make it truly accessible for everyone.

  8. Remote hardware-reconfigurable robotic camera

    Science.gov (United States)

    Arias-Estrada, Miguel; Torres-Huitzil, Cesar; Maya-Rueda, Selene E.

    2001-10-01

    In this work, a camera with integrated image processing capabilities is discussed. The camera is based on an imager coupled to an FPGA device (Field Programmable Gate Array) which contains an architecture for real-time computer vision low-level processing. The architecture can be reprogrammed remotely for application specific purposes. The system is intended for rapid modification and adaptation for inspection and recognition applications, with the flexibility of hardware and software reprogrammability. FPGA reconfiguration allows the same ease of upgrade in hardware as a software upgrade process. The camera is composed of a digital imager coupled to an FPGA device, two memory banks, and a microcontroller. The microcontroller is used for communication tasks and FPGA programming. The system implements a software architecture to handle multiple FPGA architectures in the device, and the possibility to download a software/hardware object from the host computer into its internal context memory. System advantages are: small size, low power consumption, and a library of hardware/software functionalities that can be exchanged during run time. The system has been validated with an edge detection and a motion processing architecture, which will be presented in the paper. Applications targeted are in robotics, mobile robotics, and vision based quality control.

  9. QCE : A Simulator for Quantum Computer Hardware

    NARCIS (Netherlands)

    Michielsen, Kristel; Raedt, Hans De

    2003-01-01

    The Quantum Computer Emulator (QCE) described in this paper consists of a simulator of a generic, general purpose quantum computer and a graphical user interface. The latter is used to control the simulator, to define the hardware of the quantum computer and to debug and execute quantum algorithms.
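
    Illustrative sketch (not QCE's implementation, which the abstract does not describe): one common way to simulate a small quantum computer in software is to store the full state vector and apply gates to pairs of amplitudes. The function name, the bit-ordering convention (qubit 0 is the least significant bit of the index), and the choice of the Hadamard gate are assumptions for this example.

```python
import math


def apply_single_qubit_gate(state, gate, target):
    """Apply a 2x2 gate to qubit `target` of a state vector.

    state: list of 2**n amplitudes; qubit 0 is the least significant index bit.
    gate: 2x2 matrix given as nested lists.
    """
    new_state = list(state)
    stride = 1 << target
    for i in range(len(state)):
        # Visit each amplitude pair once, starting from the index with target bit 0.
        if (i & stride) == 0:
            a0, a1 = state[i], state[i | stride]
            new_state[i] = gate[0][0] * a0 + gate[0][1] * a1
            new_state[i | stride] = gate[1][0] * a0 + gate[1][1] * a1
    return new_state


S = 1.0 / math.sqrt(2.0)
HADAMARD = [[S, S], [S, -S]]

# Two-qubit register initialised to |00>.
start = [1.0, 0.0, 0.0, 0.0]
superposed = apply_single_qubit_gate(start, HADAMARD, target=0)
restored = apply_single_qubit_gate(superposed, HADAMARD, target=0)
```

    Applying the Hadamard gate once puts qubit 0 into an equal superposition; applying it a second time returns the register to its starting state up to rounding error.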

  10. Proof Carrying Hardware based IP Protection

    Science.gov (United States)

    2017-03-01

    service to the hardware. Note that in this paper, we only consider Trojans which can be activated by a specific digital input vector. Further, we...acquisition,” IEEE Transactions on Information Forensics and Security, vol. 7, no. 1, pp. 25–40, 2012. [7] Y. Jin, B. Yang, and Y. Makris, “Cycle-accurate

  11. Efficient Runtime Management of Reconfigurable Hardware Resources

    NARCIS (Netherlands)

    Marconi, T.

    2011-01-01

    Runtime reconfigurable systems built upon devices with partial reconfiguration can provide reduction in overall hardware area, power efficiency, and economic cost in addition to the performance improvements due to better customization. However, the users of such systems have to be able to afford

  12. Environmental Control System Software & Hardware Development

    Science.gov (United States)

    Vargas, Daniel Eduardo

    2017-01-01

    ECS hardware: (1) Provides controlled purge to SLS Rocket and Orion spacecraft. (2) Provide mission-focused engineering products and services. ECS software: (1) NASA requires Compact Unique Identifiers (CUIs); fixed-length identifier used to identify information items. (2) CUI structure; composed of nine semantic fields that aid the user in recognizing its purpose.

  13. Microprocessor Design Using Hardware Description Language

    Science.gov (United States)

    Mita, Rosario; Palumbo, Gaetano

    2008-01-01

    The following paper has been conceived to deal with the contents of some lectures aimed at enhancing courses on digital electronic, microelectronic or VLSI systems. Those lectures show how to use a hardware description language (HDL), such as the VHDL, to specify, design and verify a custom microprocessor. The general goal of this work is to teach…

  14. Digital Hardware Design Teaching: An Alternative Approach

    Science.gov (United States)

    Benkrid, Khaled; Clayton, Thomas

    2012-01-01

    This article presents the design and implementation of a complete review of undergraduate digital hardware design teaching in the School of Engineering at the University of Edinburgh. Four guiding principles have been used in this exercise: learning-outcome driven teaching, deep learning, affordability, and flexibility. This has identified…

  15. Computer hardware for radiologists: Part I

    Directory of Open Access Journals (Sweden)

    Indrajit I

    2010-01-01

    Computers are an integral part of modern radiology practice. They are used in different radiology modalities to acquire, process, and postprocess imaging data. They have had a dramatic influence on contemporary radiology practice. Their impact has extended further with the emergence of Digital Imaging and Communications in Medicine (DICOM), Picture Archiving and Communication System (PACS), Radiology information system (RIS) technology, and Teleradiology. A basic overview of computer hardware relevant to radiology practice is presented here. The key hardware components in a computer are the motherboard, central processor unit (CPU), the chipset, the random access memory (RAM), the memory modules, bus, storage drives, and ports. The personal computer (PC) has a rectangular case that contains important components called hardware, many of which are integrated circuits (ICs). The fiberglass motherboard is the main printed circuit board and has a variety of important hardware mounted on it, which are connected by electrical pathways called "buses". The CPU is the largest IC on the motherboard and contains millions of transistors. Its principal function is to execute "programs". A Pentium® 4 CPU has transistors that execute a billion instructions per second. The chipset is completely different from the CPU in design and function; it controls data and interaction of buses between the motherboard and the CPU. Memory (RAM) is fundamentally semiconductor chips storing data and instructions for access by a CPU. RAM is classified by storage capacity, access speed, data rate, and configuration.

  16. PET protection optimization for streaming scalable videos with multiple transmissions.

    Science.gov (United States)

    Xiong, Ruiqin; Taubman, David S; Sivaraman, Vijay

    2013-11-01

    This paper investigates priority encoding transmission (PET) protection for streaming scalably compressed video streams over erasure channels, for the scenarios where a small number of retransmissions are allowed. In principle, the optimal protection depends not only on the importance of each stream element, but also on the expected channel behavior. By formulating a collection of hypotheses concerning its own behavior in future transmissions, limited-retransmission PET (LR-PET) effectively constructs channel codes spanning multiple transmission slots and thus offers better protection efficiency than the original PET. As the number of transmission opportunities increases, the optimization for LR-PET becomes very challenging because the number of hypothetical retransmission paths increases exponentially. As a key contribution, this paper develops a method to derive the effective recovery-probability versus redundancy-rate characteristic for the LR-PET procedure with any number of transmission opportunities. This significantly accelerates the protection assignment procedure in the original LR-PET with only two transmissions, and also makes a quick and optimal protection assignment feasible for scenarios where more transmissions are possible. This paper also gives a concrete proof to the redundancy embedding property of the channel codes formed by LR-PET, which allows for a decoupled optimization for sequentially dependent source elements with convex utility-length characteristic. This essentially justifies the source-independent construction of the protection convex hull for LR-PET.

  17. Internet-based hardware/software co-design framework for embedded 3D graphics applications

    Directory of Open Access Journals (Sweden)

    Wong Weng-Fai

    2011-01-01

    Advances in technology are making it possible to run three-dimensional (3D) graphics applications on embedded and handheld devices. In this article, we propose a hardware/software co-design environment for 3D graphics application development that includes the 3D graphics software, OpenGL ES application programming interface (API), device driver, and 3D graphics hardware simulators. We developed a 3D graphics system-on-a-chip (SoC) accelerator using transaction-level modeling (TLM). This gives software designers early access to the hardware even before it is ready. On the other hand, hardware designers also stand to gain from the more complex test benches made available in the software for verification. A unique aspect of our framework is that it allows hardware and software designers from geographically dispersed areas to cooperate and work on the same framework. Designs can be entered and executed from anywhere in the world without full access to the entire framework, which may include proprietary components. This results in controlled and secure transparency and reproducibility, granting leveled access to users of various roles.

  18. Combined Scalable Video Coding Method for Wireless Transmission

    Directory of Open Access Journals (Sweden)

    Achmad Affandi

    2011-08-01

    Mobile video streaming is one of the multimedia services that has developed very rapidly. Recently, bandwidth utilization for wireless transmission has become the main problem in the field of multimedia communications. In this research, we offer a combination of scalable methods as the most attractive solution to this problem. A scalable method for wireless communication should adapt to the input video sequence. The standard ITU (International Telecommunication Union) Joint Scalable Video Model (JSVM) is employed to produce a combined scalable video coding (CSVC) method that matches the required quality of video streaming services for wireless transmission. The investigation in this paper shows that the combined scalable technique outperforms the non-scalable one in using bit rate capacity at a certain layer.

  19. Towards a Scalable, Biomimetic, Antibacterial Coating

    Science.gov (United States)

    Dickson, Mary Nora

    Corneal afflictions are the second leading cause of blindness worldwide. When a corneal transplant is unavailable or contraindicated, an artificial cornea device is the only chance to save sight. Bacterial or fungal biofilm build up on artificial cornea devices can lead to serious complications including the need for systemic antibiotic treatment and even explantation. As a result, much emphasis has been placed on anti-adhesion chemical coatings and antibiotic leaching coatings. These methods are not long-lasting, and microorganisms can eventually circumvent these measures. Thus, I have developed a surface topographical antimicrobial coating. Various surface structures including rough surfaces, superhydrophobic surfaces, and the natural surfaces of insects' wings and sharks' skin are promising anti-biofilm candidates; however, none meet the criteria necessary for implementation on the surface of an artificial cornea device. In this thesis I: 1) developed scalable fabrication protocols for a library of biomimetic nanostructure polymer surfaces 2) assessed the potential of these poly(methyl methacrylate) nanopillars to kill or prevent formation of biofilm by E. coli bacteria and species of Pseudomonas and Staphylococcus bacteria and improved upon a proposed mechanism for the rupture of Gram-negative bacterial cell walls 3) developed a scalable, commercially viable method for producing antibacterial nanopillars on a curved, PMMA artificial cornea device and 4) developed scalable fabrication protocols for implantation of antibacterial nanopatterned surfaces on the surfaces of thermoplastic polyurethane materials, commonly used in catheter tubings. This project constitutes a first step towards fabrication of the first entirely PMMA artificial cornea device. The major finding of this work is that by precisely controlling the topography of a polymer surface at the nano-scale, we can kill adherent bacteria and prevent biofilm formation of certain pathogenic bacteria

  20. Programming Scala Scalability = Functional Programming + Objects

    CERN Document Server

    Wampler, Dean

    2009-01-01

    Learn how to be more productive with Scala, a new multi-paradigm language for the Java Virtual Machine (JVM) that integrates features of both object-oriented and functional programming. With this book, you'll discover why Scala is ideal for highly scalable, component-based applications that support concurrency and distribution. Programming Scala clearly explains the advantages of Scala as a JVM language. You'll learn how to leverage the wealth of Java class libraries to meet the practical needs of enterprise and Internet projects more easily. Packed with code examples, this book provides us

  1. Scalable and Anonymous Group Communication with MTor

    Directory of Open Access Journals (Sweden)

    Lin Dong

    2016-04-01

    This paper presents MTor, a low-latency anonymous group communication system. We construct MTor as an extension to Tor, allowing the construction of multi-source multicast trees on top of the existing Tor infrastructure. MTor does not depend on an external service to broker the group communication, and avoids central points of failure and trust. MTor’s substantial bandwidth savings and graceful scalability enable new classes of anonymous applications that are currently too bandwidth-intensive to be viable through traditional unicast Tor communication, e.g., group file transfer, collaborative editing, streaming video, and real-time audio conferencing.

  2. Scalable conditional induction variables (CIV) analysis

    DEFF Research Database (Denmark)

    Oancea, Cosmin Eugen; Rauchwerger, Lawrence

    2015-01-01

    representation. Our technique requires no modifications of our dependence tests, which are agnostic to the original shape of the subscripts, and is more powerful than previously reported dependence tests that rely on the pairwise disambiguation of read-write references. We have implemented the CIV analysis in our...... parallelizing compiler and evaluated its impact on five Fortran benchmarks. We have found that there are many important loops using CIV subscripts and that our analysis can lead to their scalable parallelization. This in turn has led to the parallelization of the benchmark programs they appear in....

  3. Tip-Based Nanofabrication for Scalable Manufacturing

    Directory of Open Access Journals (Sweden)

    Huan Hu

    2017-03-01

    Tip-based nanofabrication (TBN) is a family of emerging nanofabrication techniques that use a nanometer scale tip to fabricate nanostructures. In this review, we first introduce the history of TBN and the technology development. We then briefly review various TBN techniques that use different physical or chemical mechanisms to fabricate features and discuss some of the state-of-the-art techniques. Subsequently, we focus on those TBN methods that have demonstrated potential to scale up the manufacturing throughput. Finally, we discuss several research directions that are essential for making TBN a scalable nano-manufacturing technology.

  4. Scalable Engineering of Quantum Optical Information Processing Architectures (SEQUOIA)

    Science.gov (United States)

    2016-12-13

    scalable architecture for LOQC and cluster state quantum computing (Ballistic or non-ballistic) - With parametric nonlinearities (Kerr, chi-2...Scalable Engineering of Quantum Optical Information-Processing Architectures (SEQUOIA) 5a. CONTRACT NUMBER W31-P4Q-15-C-0045 5b. GRANT NUMBER 5c...Technologies 13 December 2016 “Scalable Engineering of Quantum Optical Information-Processing Architectures (SEQUOIA)” Final R&D Status Report

  5. Big data integration: scalability and sustainability

    KAUST Repository

    Zhang, Zhang

    2016-01-26

    Integration of various types of omics data is critically indispensable for addressing most important and complex biological questions. In the era of big data, however, data integration becomes increasingly tedious, time-consuming and expensive, posing a significant obstacle to fully exploiting the wealth of big biological data. Here we propose a scalable and sustainable architecture that integrates big omics data through community-contributed modules. Community modules are contributed and maintained by different committed groups and each module corresponds to a specific data type, deals with data collection, processing and visualization, and delivers data on-demand via web services. Based on this community-based architecture, we build Information Commons for Rice (IC4R; http://ic4r.org), a rice knowledgebase that integrates a variety of rice omics data from multiple community modules, including genome-wide expression profiles derived entirely from RNA-Seq data, resequencing-based genomic variations obtained from re-sequencing data of thousands of rice varieties, plant homologous genes covering multiple diverse plant species, post-translational modifications, rice-related literature, and community annotations. Taken together, such architecture achieves integration of different types of data from multiple community-contributed modules and accordingly features scalable, sustainable and collaborative integration of big data as well as low costs for database update and maintenance, thus helpful for building IC4R into a comprehensive knowledgebase covering all aspects of rice data and beneficial for both basic and translational research.

  6. Using MPI to Implement Scalable Libraries

    Science.gov (United States)

    Lusk, Ewing

    MPI is an instantiation of a general-purpose programming model, and high-performance implementations of the MPI standard have provided scalability for a wide range of applications. Ease of use was not an explicit goal of the MPI design process, which emphasized completeness, portability, and performance. Thus it is not surprising that MPI is occasionally criticized for being inconvenient to use and thus a drag on software developer productivity. One approach to the productivity issue is to use MPI to implement simpler programming models. Such models may limit the range of parallel algorithms that can be expressed, yet provide sufficient generality to benefit a significant number of applications, even from different domains. We illustrate this concept with the ADLB (Asynchronous, Dynamic Load-Balancing) library, which can be used to express manager/worker algorithms in such a way that their execution is scalable, even on the largest machines. ADLB makes sophisticated use of MPI functionality while providing an extremely simple API for the application programmer. We will describe it in the context of solving Sudoku puzzles and a nuclear physics Monte Carlo application currently running on tens of thousands of processors.
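
    The manager/worker pattern described above can be illustrated with a minimal sketch. It is not ADLB and does not use MPI: it assumes a single process with Python threads and a shared `queue.Queue` standing in for the distributed work pool, and the names `run_work_pool` and `split_sum` are hypothetical. What it shows is the shape of the model: workers repeatedly take a work item, process it, and may put new items back into the pool.

```python
import queue
import threading


def run_work_pool(initial_items, process, num_workers=4):
    """Drain a shared work pool; process(item) returns (result, new_items)."""
    pool = queue.Queue()
    for item in initial_items:
        pool.put(item)
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            item = pool.get()
            if item is None:  # sentinel: the pool has been drained
                pool.task_done()
                return
            try:
                result, new_items = process(item)
                with results_lock:
                    results.append(result)
                # New work is queued before this item is marked done,
                # so pool.join() cannot return while work is pending.
                for new_item in new_items:
                    pool.put(new_item)
            finally:
                pool.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    pool.join()  # waits for initial and dynamically added items
    for _ in threads:
        pool.put(None)
    for t in threads:
        t.join()
    return results


def split_sum(item):
    """Hypothetical task: sum the integers in [lo, hi) by recursive splitting."""
    lo, hi = item
    if hi - lo <= 2:
        return sum(range(lo, hi)), []
    mid = (lo + hi) // 2
    return 0, [(lo, mid), (mid, hi)]


if __name__ == "__main__":
    print(sum(run_work_pool([(0, 100)], split_sum)))
```

    The application code (`split_sum`) only ever sees "take an item, return a result and more items"; all coordination lives in the pool, which is the kind of simplification the abstract attributes to building a narrower model on top of MPI.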

  7. Using the scalable nonlinear equations solvers package

    Energy Technology Data Exchange (ETDEWEB)

    Gropp, W.D.; McInnes, L.C.; Smith, B.F.

    1995-02-01

    SNES (Scalable Nonlinear Equations Solvers) is a software package for the numerical solution of large-scale systems of nonlinear equations on both uniprocessors and parallel architectures. SNES also contains a component for the solution of unconstrained minimization problems, called SUMS (Scalable Unconstrained Minimization Solvers). Newton-like methods, which are known for their efficiency and robustness, constitute the core of the package. As part of the multilevel PETSc library, SNES incorporates many features and options from other parts of PETSc. In keeping with the spirit of the PETSc library, the nonlinear solution routines are data-structure-neutral, making them flexible and easily extensible. This user's guide contains a detailed description of uniprocessor usage of SNES, with some added comments regarding multiprocessor usage. At this time the parallel version is undergoing refinement and extension, as we work toward a common interface for the uniprocessor and parallel cases. Thus, forthcoming versions of the software will contain additional features, and changes to the parallel interface may result at any time. The new parallel version will employ the MPI (Message Passing Interface) standard for interprocessor communication. Since most of these details will be hidden, users will need to perform only minimal message-passing programming.

  8. Towards Scalable Graph Computation on Mobile Devices

    Science.gov (United States)

    Chen, Yiqi; Lin, Zhiyuan; Pienta, Robert; Kahng, Minsuk; Chau, Duen Horng

    2015-01-01

    Mobile devices have become increasingly central to our everyday activities, due to their portability, multi-touch capabilities, and ever-improving computational power. Such attractive features have spurred research interest in leveraging mobile devices for computation. We explore a novel approach that aims to use a single mobile device to perform scalable graph computation on large graphs that do not fit in the device's limited main memory, opening up the possibility of performing on-device analysis of large datasets, without relying on the cloud. Based on the familiar memory mapping capability provided by today's mobile operating systems, our approach to scale up computation is powerful and intentionally kept simple to maximize its applicability across the iOS and Android platforms. Our experiments demonstrate that an iPad mini can perform fast computation on large real graphs with as many as 272 million edges (Google+ social graph), at a speed that is only a few times slower than a 13″ Macbook Pro. Through creating a real world iOS app with this technique, we demonstrate the strong potential application for scalable graph computation on a single mobile device using our approach. PMID:25859564
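
    The memory-mapping idea can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: it assumes the graph is stored as a flat binary edge list of little-endian `uint32` (source, target) pairs, and the helper names `write_edges` and `out_degrees` are hypothetical. The operating system pages the file in on demand, so the edge list itself never has to fit in main memory.

```python
import mmap
import os
import struct
import tempfile

# Assumed on-disk layout: one (source, target) pair of little-endian uint32 per edge.
EDGE = struct.Struct("<II")


def write_edges(path, edges):
    """Write an iterable of (source, target) pairs as a binary edge list."""
    with open(path, "wb") as f:
        for src, dst in edges:
            f.write(EDGE.pack(src, dst))


def out_degrees(path, num_nodes):
    """Scan the memory-mapped edge list once and count out-degrees."""
    degrees = [0] * num_nodes
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for offset in range(0, len(mm), EDGE.size):
                src, _dst = EDGE.unpack_from(mm, offset)
                degrees[src] += 1
    return degrees


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "edges.bin")
    write_edges(path, [(0, 1), (0, 2), (1, 2)])
    print(out_degrees(path, 3))
```

    Only the per-node array stays resident; the edge data is touched sequentially through the mapping, which is the access pattern that makes this approach practical on devices with limited memory.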

  9. Scalability Optimization of Seamless Positioning Service

    Directory of Open Access Journals (Sweden)

    Juraj Machaj

    2016-01-01

    Positioning services have recently been attracting more attention, not only within the research community but also from service providers. From the service providers' point of view, a positioning service that works seamlessly in all environments, for example indoor, dense urban, and rural, has huge potential to open new markets. However, such a system must not only provide accurate position estimates but also be scalable and resistant to fake positioning requests. In previous work we proposed a modular system that provides seamless positioning in various environments. The system automatically selects the optimal positioning module based on the available radio signals. It currently consists of three positioning modules: GPS, GSM-based positioning, and Wi-Fi-based positioning. In this paper we propose an algorithm that reduces the time needed for position estimation, which increases the scalability of the modular system and allows positioning services to be provided to a larger number of users. Such an improvement is extremely important for real-world applications in which a large number of users require position estimates, since positioning error is affected by the response time of the positioning server.

  10. An Open Infrastructure for Scalable, Reconfigurable Analysis

    Energy Technology Data Exchange (ETDEWEB)

    de Supinski, B R; Fowler, R; Gamblin, T; Mueller, F; Ratn, P; Schulz, M

    2008-05-15

    Petascale systems will have hundreds of thousands of processor cores so their applications must be massively parallel. Effective use of petascale systems will require efficient interprocess communication through memory hierarchies and complex network topologies. Tools to collect and analyze detailed data about this communication would facilitate its optimization. However, several factors complicate tool design. First, large-scale runs on petascale systems will be a precious commodity, so scalable tools must have almost no overhead. Second, the volume of performance data from petascale runs could easily overwhelm hand analysis and, thus, tools must collect only data that is relevant to diagnosing performance problems. Analysis must be done in-situ, when available processing power is proportional to the data. We describe a tool framework that overcomes these complications. Our approach allows application developers to combine existing techniques for measurement, analysis, and data aggregation to develop application-specific tools quickly. Dynamic configuration enables application developers to select exactly the measurements needed and generic components support scalable aggregation and analysis of this data with little additional effort.

  11. Highly scalable Ab initio genomic motif identification

    KAUST Repository

    Marchand, Benoit

    2011-01-01

    We present results of scaling an ab initio motif family identification system, Dragon Motif Finder (DMF), to 65,536 processor cores of IBM Blue Gene/P. DMF seeks groups of mutually similar polynucleotide patterns within a set of genomic sequences and builds various motif families from them. Such information is of relevance to many problems in life sciences. Prior attempts to scale such ab initio motif-finding algorithms achieved limited success. We solve the scalability issues using a combination of mixed-mode MPI-OpenMP parallel programming, master-slave work assignment, multi-level workload distribution, multi-level MPI collectives, and serial optimizations. While the scalability of our algorithm was excellent (94% parallel efficiency on 65,536 cores relative to 256 cores on a modest-size problem), the final speedup with respect to the original serial code exceeded 250,000 when serial optimizations are included. This enabled us to carry out many large-scale ab initio motif-finding simulations in a few hours while the original serial code would have needed decades of execution time. Copyright 2011 ACM.
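
    To make the quoted efficiency figure concrete, here is a short worked calculation. It assumes the usual fixed-problem-size (strong-scaling) definition of relative parallel efficiency, which the abstract does not state explicitly; the symbols $T_p$ (run time on $p$ cores) and $E$ are introduced here for illustration only.

```latex
% Relative parallel efficiency between a baseline of p_0 cores and p cores
% (assumed strong-scaling definition):
\[
  E \;=\; \frac{p_0 \, T_{p_0}}{p \, T_p},
  \qquad p_0 = 256, \quad p = 65{,}536 .
\]
% With E = 0.94 and p / p_0 = 256, the speedup relative to the 256-core run is
\[
  \frac{T_{256}}{T_{65536}} \;=\; E \cdot \frac{p}{p_0}
  \;=\; 0.94 \times 256 \;\approx\; 241 ,
\]
% compared with an ideal value of 256.
```

    Under this reading, increasing the core count 256-fold shortened the run by a factor of roughly 241. The separate figure of more than 250,000 is measured against the original serial code and also includes the serial optimizations, so the two numbers are not directly comparable.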

  12. Load Generation for Investigating Game System Scalability

    OpenAIRE

    Halvorsen, Stig Magnus

    2014-01-01

    Video games have proven to be an interesting platform for computer scientists, as many games demand the latest technology, fast response times and effective utilization of hardware. Video games have been used both as a topic of and a tool for computer science (CS). Finding the right games to perform experiments on is however difficult. An important reason is the lack of suitable games for research. Open source games are attractive candidates as their availability and openness is crucial to pr...

  13. Advanced hardware design for error correcting codes

    CERN Document Server

    Coussy, Philippe

    2015-01-01

    This book provides thorough coverage of error correcting techniques. It includes essential basic concepts and the latest advances on key topics in design, implementation, and optimization of hardware/software systems for error correction. The book’s chapters are written by internationally recognized experts in this field. Topics include evolution of error correction techniques, industrial user needs, architectures, and design approaches for the most advanced error correcting codes (Polar Codes, Non-Binary LDPC, Product Codes, etc). This book provides access to recent results, and is suitable for graduate students and researchers of mathematics, computer science, and engineering. • Examines how to optimize the architecture of hardware design for error correcting codes; • Presents error correction codes from theory to optimized architecture for the current and the next generation standards; • Provides coverage of industrial user needs advanced error correcting techniques.

  14. Laser acceleration

    Science.gov (United States)

    Tajima, T.; Nakajima, K.; Mourou, G.

    2017-02-01

    The fundamental idea of Laser Wakefield Acceleration (LWFA) is reviewed. An ultrafast intense laser pulse drives coherent wakefield with a relativistic amplitude robustly supported by the plasma. While the large amplitude of wakefields involves collective resonant oscillations of the eigenmode of the entire plasma electrons, the wake phase velocity ∼ c and ultrafastness of the laser pulse introduce the wake stability and rigidity. A large number of worldwide experiments show a rapid progress of this concept realization toward both the high-energy accelerator prospect and broad applications. The strong interest in this has been spurring and stimulating novel laser technologies, including the Chirped Pulse Amplification, the Thin Film Compression, the Coherent Amplification Network, and the Relativistic Mirror Compression. These in turn have created a conglomerate of novel science and technology with LWFA to form a new genre of high field science with many parameters of merit in this field increasing exponentially lately. This science has triggered a number of worldwide research centers and initiatives. Associated physics of ion acceleration, X-ray generation, and astrophysical processes of ultrahigh energy cosmic rays are reviewed. Applications such as X-ray free electron laser, cancer therapy, and radioisotope production are considered. A new avenue of LWFA using nanomaterials is also emerging.

  15. Particle Transport Simulation on Heterogeneous Hardware

    CERN Multimedia

    CERN. Geneva

    2014-01-01

    CPUs and GPGPUs. About the speaker Vladimir Koylazov is CTO and founder of Chaos Software and one of the original developers of the V-Ray raytracing software. Passionate about 3D graphics and programming, Vlado is the driving force behind Chaos Group's software solutions. He participated in the implementation of algorithms for accurate light simulations and support for different hardware platforms, including CPU and GPGPU, as well as distributed calculat...

  16. Hex-Chrome Free Hardware - BAE Experience

    Science.gov (United States)

    2010-06-01

    Trane S 3201063A1 • TRW TS 2-25-60, Class A • Volkswagen TL 233 • Volvo VCS5737.29, .19 6/23/2010 Magni is one of several coatings, others such...installation and part must be revised. • Example: Panther FOV identified approximately 500 fasteners/ hardware that are being updated to “clean” within...particular program require coordination and funding to revise/ update (ex: MMPV common with MRAP) COTS, Government furnished, proprietary items and

  17. Instrumentation Hardware Abstraction Language (IHAL) Handbook

    Science.gov (United States)

    2017-01-01

    guidelines and thereby eliminating any misinterpretations that may exist. The RCC IRIG 106 sets forth standards for various aspects of telemetry (TM... community . At the time the task was initiated, IHAL had been shown to support configuration of analog signal conditioning hardware and pulse code...configurations were displayed in a single view. The settings on each device were then changed and immediately communicated to the appropriate vendor

  18. A hardware implementation of neural network with modified HANNIBAL architecture

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Bum youb; Chung, Duck Jin [Inha University, Inchon (Korea, Republic of)

    1996-03-01

    A digital hardware architecture for an artificial neural network with learning capability is described in this paper. It is a modification of the hardware architecture known as HANNIBAL (Hardware Architecture for Neural Networks Implementing Back propagation Algorithm Learning). To implement efficient neural network hardware, we analyzed various types of multiplier, which is the major function block of the neuro-processor cell. Based on this result, we designed an efficient digital neural network hardware using a serial/parallel multiplier and tested its operation. We also analyzed the hardware efficiency with logic-level simulation. (author). 14 refs., 10 figs., 3 tabs.

  19. A Hardware Lab Anywhere At Any Time

    Directory of Open Access Journals (Sweden)

    Tobias Schubert

    2004-12-01

    Scientific technical courses are an important component in any student's education. These courses are usually characterised by the fact that the students execute experiments in special laboratories. This leads to extremely high costs and a reduction in the maximum number of possible participants. From this traditional point of view, it doesn't seem possible to realise the concepts of a Virtual University in the context of sophisticated technical courses since the students must be "on the spot". In this paper we introduce the so-called Mobile Hardware Lab, which makes student participation possible at any time and from any place. This lab nevertheless transfers a feeling of being present in a laboratory. This is accomplished with a special Learning Management System in combination with hardware components which correspond to a fully equipped laboratory workstation and are lent out to the students for the duration of the lab. The experiments are performed and solved at home, then handed in electronically. Judging and marking are also both performed electronically. Since 2003 the Mobile Hardware Lab has been offered in a completely web-based form.

  20. Scalable Light Module for Low-Cost, High-Efficiency Light-Emitting Diode Luminaires

    Energy Technology Data Exchange (ETDEWEB)

    Tarsa, Eric [Cree, Inc., Goleta, CA (United States)

    2015-08-31

    During this two-year program Cree developed a scalable, modular optical architecture for low-cost, high-efficacy light emitting diode (LED) luminaires. Stated simply, the goal of this architecture was to efficiently and cost-effectively convey light from LEDs (point sources) to broad luminaire surfaces (area sources). By simultaneously developing warm-white LED components and low-cost, scalable optical elements, a high system optical efficiency resulted. To meet program goals, Cree evaluated novel approaches to improve LED component efficacy at high color quality while not sacrificing LED optical efficiency relative to conventional packages. Meanwhile, efficiently coupling light from LEDs into modular optical elements, followed by optimally distributing and extracting this light, were challenges that were addressed via novel optical design coupled with frequent experimental evaluations. Minimizing luminaire bill of materials and assembly costs were two guiding principles for all design work, in the effort to achieve luminaires with significantly lower normalized cost ($/klm) than existing LED fixtures. Chief project accomplishments included the achievement of >150 lm/W warm-white LEDs having primary optics compatible with low-cost modular optical elements. In addition, a prototype Light Module optical efficiency of over 90% was measured, demonstrating the potential of this scalable architecture for ultra-high-efficacy LED luminaires. Since the project ended, Cree has continued to evaluate optical element fabrication and assembly methods in an effort to rapidly transfer this scalable, cost-effective technology to Cree production development groups. The Light Module concept is likely to make a strong contribution to the development of new cost-effective, high-efficacy luminaires, thereby accelerating widespread adoption of energy-saving SSL in the U.S.

  1. CUDAMPF: a multi-tiered parallel framework for accelerating protein sequence search in HMMER on CUDA-enabled GPU.

    Science.gov (United States)

    Jiang, Hanyu; Ganesan, Narayan

    2016-02-27

    The HMMER software suite is widely used for analysis of homologous protein and nucleotide sequences with high sensitivity. The latest version of hmmsearch in HMMER 3.x utilizes a heuristic pipeline which consists of the MSV/SSV (Multiple/Single ungapped Segment Viterbi) stage, the P7Viterbi stage and the Forward scoring stage to accelerate homology detection. Since the latest version is highly optimized for performance on modern multi-core CPUs with SSE capabilities, only a few acceleration attempts report speedup. However, the most compute-intensive tasks within the pipeline (viz., the MSV/SSV and P7Viterbi stages) still stand to benefit from the computational capabilities of massively parallel processors. A Multi-Tiered Parallel Framework (CUDAMPF) implemented on CUDA-enabled GPUs, presented here, offers finer-grained parallelism for the MSV/SSV and Viterbi algorithms. We couple the SIMT (Single Instruction Multiple Threads) mechanism with SIMD (Single Instruction Multiple Data) video instructions with warp-synchronism to achieve high-throughput processing and eliminate thread idling. We also propose a hardware-aware optimal allocation scheme for scarce resources like on-chip memory and caches in order to boost the performance and scalability of CUDAMPF. In addition, runtime compilation via NVRTC, available with CUDA 7.0, is incorporated into the presented framework; it not only helps unroll the innermost loop to yield up to a 2- to 3-fold speedup over static compilation but also enables dynamic loading and switching of kernels depending on the query model size, in order to achieve optimal performance. CUDAMPF is designed as a hardware-aware parallel framework for accelerating computational hotspots within the hmmsearch pipeline as well as other sequence alignment applications. It achieves significant speedup by exploiting hierarchical parallelism on a single GPU and takes full advantage of limited resources based on their own performance features. In addition to exceeding performance of other

  2. Advanced technologies for scalable ATLAS conditions database access on the grid

    Science.gov (United States)

    Basset, R.; Canali, L.; Dimitrov, G.; Girone, M.; Hawkings, R.; Nevski, P.; Valassi, A.; Vaniachine, A.; Viegas, F.; Walker, R.; Wong, A.

    2010-04-01

    During massive data reprocessing operations an ATLAS Conditions Database application must support concurrent access from numerous ATLAS data processing jobs running on the Grid. By simulating realistic work-flow, ATLAS database scalability tests provided feedback for Conditions Db software optimization and allowed precise determination of required distributed database resources. In distributed data processing one must take into account the chaotic nature of Grid computing characterized by peak loads, which can be much higher than average access rates. To validate database performance at peak loads, we tested database scalability at very high concurrent job rates. This has been achieved through coordinated database stress tests performed in a series of ATLAS reprocessing exercises at the Tier-1 sites. The goal of database stress tests is to detect scalability limits of the hardware deployed at the Tier-1 sites, so that the server overload conditions can be safely avoided in a production environment. Our analysis of server performance under stress tests indicates that Conditions Db data access is limited by the disk I/O throughput. An unacceptable side-effect of the disk I/O saturation is a degradation of the WLCG 3D Services that update Conditions Db data at all ten ATLAS Tier-1 sites using the technology of Oracle Streams. To avoid such bottlenecks we prototyped and tested a novel approach for database peak load avoidance in Grid computing. Our approach is based upon the proven idea of pilot job submission on the Grid: instead of the actual query, an ATLAS utility library sends to the database server a pilot query first.

  3. Construction of a smart medication dispenser with high degree of scalability and remote manageability.

    Science.gov (United States)

    Pak, JuGeon; Park, KeeHyun

    2012-01-01

    We propose a smart medication dispenser having a high degree of scalability and remote manageability. We construct the dispenser to have extensible hardware architecture for achieving scalability, and we install an agent program in it for achieving remote manageability. The dispenser operates as follows: when the real-time clock reaches the predetermined medication time and the user presses the dispense button at that time, the predetermined medication is dispensed from the medication dispensing tray (MDT). In the proposed dispenser, the medication for each patient is stored in an MDT. One smart medication dispenser contains mainly one MDT; however, the dispenser can be extended to include more MDTs in order to support multiple users using one dispenser. For remote management, the proposed dispenser transmits the medication status and the system configurations to the monitoring server. In the case of a specific event such as a shortage of medication, memory overload, software error, or non-adherence, the event is transmitted immediately. All these operations are performed automatically without the intervention of patients, through the agent program installed in the dispenser. Results of implementation and verification show that the proposed dispenser operates normally and performs the management operations from the medication monitoring server suitably.
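
    The dispensing rule and event reporting described above can be sketched as a small state check. This is an illustrative sketch only, not the authors' firmware: the `Tray` fields, the 30-minute window, the shortage threshold, and the event names are all assumptions introduced here, and only two of the listed event types are modeled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Tray:
    """One medication dispensing tray (MDT) for a single patient."""
    doses_left: int
    scheduled_time: datetime
    window: timedelta = timedelta(minutes=30)  # assumed grace period
    low_threshold: int = 1                     # assumed shortage level
    taken: bool = False


def check(tray, now, button_pressed):
    """Run one check of the tray; return (dispensed, events_to_report)."""
    events = []
    dispensed = False
    deadline = tray.scheduled_time + tray.window
    due = tray.scheduled_time <= now <= deadline
    # Dispense only when the clock has reached the scheduled time
    # and the user presses the button within the window.
    if due and button_pressed and not tray.taken and tray.doses_left > 0:
        tray.doses_left -= 1
        tray.taken = True
        dispensed = True
        if tray.doses_left <= tray.low_threshold:
            events.append("medication_shortage")
    # A missed window is reported as non-adherence.
    if now > deadline and not tray.taken:
        events.append("non_adherence")
    return dispensed, events


if __name__ == "__main__":
    t0 = datetime(2012, 1, 1, 8, 0)
    tray = Tray(doses_left=2, scheduled_time=t0)
    print(check(tray, t0 + timedelta(minutes=5), True))
```

    In a device like the one described, the returned events would be handed to the agent program for immediate transmission to the monitoring server; supporting several users would amount to holding one `Tray` per MDT.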

  4. Construction of a Smart Medication Dispenser with High Degree of Scalability and Remote Manageability

    Directory of Open Access Journals (Sweden)

    JuGeon Pak

    2012-01-01

    We propose a smart medication dispenser having a high degree of scalability and remote manageability. We construct the dispenser to have extensible hardware architecture for achieving scalability, and we install an agent program in it for achieving remote manageability. The dispenser operates as follows: when the real-time clock reaches the predetermined medication time and the user presses the dispense button at that time, the predetermined medication is dispensed from the medication dispensing tray (MDT). In the proposed dispenser, the medication for each patient is stored in an MDT. One smart medication dispenser contains mainly one MDT; however, the dispenser can be extended to include more MDTs in order to support multiple users using one dispenser. For remote management, the proposed dispenser transmits the medication status and the system configurations to the monitoring server. In the case of a specific event such as a shortage of medication, memory overload, software error, or non-adherence, the event is transmitted immediately. All these operations are performed automatically without the intervention of patients, through the agent program installed in the dispenser. Results of implementation and verification show that the proposed dispenser operates normally and performs the management operations from the medication monitoring server suitably.

  5. Network selection, Information filtering and Scalable computation

    Science.gov (United States)

    Ye, Changqing

    -complete factorizations, possibly with a high percentage of missing values. This promotes additional sparsity beyond rank reduction. Computationally, we design methods based on a "decomposition and combination" strategy to break large-scale optimization into many small subproblems that are solved recursively and in parallel. On this basis, we implement the proposed methods through multi-platform shared-memory parallel programming, and through Mahout, a library for scalable machine learning and data mining, for MapReduce computation. For example, our methods scale to a dataset of three billion observations on a single machine with sufficient memory, with good timings. Both theoretical and numerical investigations show that the proposed methods exhibit significant improvement in accuracy over state-of-the-art scalable methods.

  6. Accelerators and the Accelerator Community

    Energy Technology Data Exchange (ETDEWEB)

    Malamud, Ernest; Sessler, Andrew

    2008-06-01

    In this paper, standing back--looking from afar--and adopting a historical perspective, the field of accelerator science is examined. How it grew, what are the forces that made it what it is, where it is now, and what it is likely to be in the future are the subjects explored. Clearly, a great deal of personal opinion is invoked in this process.

  7. Scalable Transactions for Web Applications in the Cloud

    NARCIS (Netherlands)

    Zhou, W.; Pierre, G.E.O.; Chi, C.-H.

    2009-01-01

    Cloud Computing platforms provide scalability and high availability properties for web applications but they sacrifice data consistency at the same time. However, many applications cannot afford any data inconsistency. We present a scalable transaction manager for NoSQL cloud database services to

  8. New Complexity Scalable MPEG Encoding Techniques for Mobile Applications

    Directory of Open Access Journals (Sweden)

    Stephan Mietens

    2004-03-01

    Complexity scalability offers the advantage of one-time design of video applications for a large product family, including mobile devices, without the need to redesign the applications on the algorithmic level to meet the requirements of the different products. In this paper, we present complexity scalable MPEG encoding having core modules with modifications for scalability. The interdependencies of the scalable modules and the system performance are evaluated. Experimental results show scalability giving a smooth change in complexity and corresponding video quality. Scalability is basically achieved by varying the number of computed DCT coefficients and the number of evaluated motion vectors, but other modules are designed such that they scale with the previous parameters. In the experiments using the “Stefan” sequence, the elapsed execution time of the scalable encoder, reflecting the computational complexity, can be gradually reduced to roughly 50% of its original execution time. The video quality scales between 20 dB and 48 dB PSNR with unity quantizer setting, and between 21.5 dB and 38.5 dB PSNR for different sequences targeting 1500 kbps. The implemented encoder and the scalability techniques can be successfully applied in mobile systems based on MPEG video compression.
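
    One of the two scaling knobs mentioned above, the number of computed DCT coefficients, can be illustrated with a direct (unoptimized) 8×8 DCT-II that evaluates only a subset of coefficients. The choice of subset here, the top-left k×k low-frequency square, is an assumption made for illustration; the abstract does not say which coefficients the encoder keeps, and `partial_dct` is a hypothetical name.

```python
import math

N = 8  # MPEG operates on 8x8 blocks


def partial_dct(block, k):
    """Compute only the top-left k x k coefficients of the 8x8 DCT-II.

    Coefficients outside that square are left at 0.0, so the work done
    grows with k*k instead of always covering all 64 coefficients.
    """
    def scale(u):
        return 1.0 / math.sqrt(2.0) if u == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(k):
        for v in range(k):
            total = 0.0
            for i in range(N):
                for j in range(N):
                    total += (block[i][j]
                              * math.cos((2 * i + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * j + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * scale(u) * scale(v) * total
    return out


if __name__ == "__main__":
    flat = [[1.0] * N for _ in range(N)]
    coeffs = partial_dct(flat, 2)
    print(round(coeffs[0][0], 6), round(coeffs[0][1], 6))
```

    With k = 8 this reduces to the full transform; smaller k trades reconstructed quality for less computation, which is the kind of smooth complexity/quality trade-off the abstract reports. A real encoder would use a fast DCT rather than this quadruple loop.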

  9. Scalable DeNoise-and-Forward in Bidirectional Relay Networks

    DEFF Research Database (Denmark)

    Sørensen, Jesper Hemming; Krigslund, Rasmus; Popovski, Petar

    2010-01-01

    In this paper a scalable relaying scheme is proposed based on an existing concept called DeNoise-and-Forward, DNF. We call it Scalable DNF, S-DNF, and it targets the scenario with multiple communication flows through a single common relay. The idea of the scheme is to combine packets at the relay...

  10. Building scalable apps with Redis and Node.js

    CERN Document Server

    Johanan, Joshua

    2014-01-01

    If the phrase scalability sounds alien to you, then this is an ideal book for you. You will not need much Node.js experience as each framework is demonstrated in a way that requires no previous knowledge of the framework. You will be building scalable Node.js applications in no time! Knowledge of JavaScript is required.

  11. The Fermilab Accelerator control system

    Science.gov (United States)

    Bogert, Dixon

    1986-06-01

    With the advent of the Tevatron, considerable upgrades have been made to the controls of all the Fermilab Accelerators. The current system is based on making as large an amount of data as possible available to many operators or end-users. Specifically there are about 100 000 separate readings, settings, and status and control registers in the various machines, all of which can be accessed by seventeen consoles, some in the Main Control Room and others distributed throughout the complex. A "Host" computer network of approximately eighteen PDP-11/34's, seven PDP-11/44's, and three VAX-11/785's supports a distributed data acquisition system including Lockheed MAC-16's left from the original Main Ring and Booster instrumentation and upwards of 1000 Z80, Z8002, and M68000 microprocessors in dozens of configurations. Interaction of the various parts of the system is via a central data base stored on the disk of one of the VAXes. The primary computer-hardware communication is via CAMAC for the new Tevatron and Antiproton Source; certain subsystems, among them vacuum, refrigeration, and quench protection, reside in the distributed microprocessors and communicate via GAS, an in-house protocol. An important hardware feature is an accurate clock system making a large number of encoded "events" in the accelerator supercycle available for both hardware modules and computers. System software features include the ability to save the current state of the machine or any subsystem and later restore it or compare it with the state at another time, a general logging facility to keep track of specific variables over long periods of time, detection of "exception conditions" and the posting of alarms, and a central filesharing capability in which files on VAX disks are available for access by any of the "Host" processors.

  12. BASSET: Scalable Gateway Finder in Large Graphs

    Energy Technology Data Exchange (ETDEWEB)

    Tong, H; Papadimitriou, S; Faloutsos, C; Yu, P S; Eliassi-Rad, T

    2010-11-03

    Given a social network, who is the best person to introduce you to, say, Chris Ferguson, the poker champion? Or, given a network of people and skills, who is the best person to help you learn about, say, wavelets? The goal is to find a small group of 'gateways': persons who are close enough to us, as well as close enough to the target (person, or skill) or, in other words, are crucial in connecting us to the target. The main contributions are the following: (a) we show how to formulate this problem precisely; (b) we show that it is sub-modular and thus it can be solved near-optimally; (c) we give fast, scalable algorithms to find such gateways. Experiments on real data sets validate the effectiveness and efficiency of the proposed methods, achieving up to 6,000,000x speedup.
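
    Point (b) relies on a standard result: for a monotone submodular objective, greedily adding the element with the largest marginal gain gives a solution within a factor of 1 − 1/e of the best set of the same size. The sketch below shows that generic greedy loop only. It is not the BASSET algorithm: the objective is a simple coverage count used as a stand-in for the paper's gateway score, and `greedy_select`, `coverage`, and the `REACH` data are hypothetical.

```python
def greedy_select(candidates, score, k):
    """Pick up to k candidates, each time adding the largest marginal gain."""
    chosen = []
    for _ in range(k):
        base = score(chosen)
        best, best_gain = None, 0
        for c in candidates:
            if c in chosen:
                continue
            gain = score(chosen + [c]) - base
            if gain > best_gain:  # ties keep the earlier candidate
                best, best_gain = c, gain
        if best is None:  # no candidate improves the score any further
            break
        chosen.append(best)
    return chosen


# Stand-in objective (an assumption): each candidate "reaches" a set of
# nodes, and the score of a group is how many distinct nodes it reaches.
REACH = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1, 2},
}


def coverage(chosen):
    reached = set()
    for c in chosen:
        reached |= REACH[c]
    return len(reached)


if __name__ == "__main__":
    print(greedy_select(list(REACH), coverage, 2))
```

    Coverage is a textbook example of a monotone submodular function (adding a candidate to a larger group never helps more than adding it to a smaller one), which is why it serves as a reasonable placeholder here.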

  13. The Concept of Business Model Scalability

    DEFF Research Database (Denmark)

    Nielsen, Christian; Lund, Morten

    2015-01-01

    The power of business models lies in their ability to visualize and clarify how firms may configure their value creation processes. Among the key aspects of business model thinking are a focus on what the customer values, how this value is best delivered to the customer and how strategic partners are leveraged in this value creation, delivery and realization exercise. Central to the mainstream understanding of business models is the value proposition towards the customer and the hypothesis generated is that if the firm delivers to the customer what he/she requires, then there is a good foundation for a long-term profitable business. However, the message conveyed in this article is that while providing a good value proposition may help the firm ‘get by’, the really successful businesses of today are those able to reach the sweet-spot of business model scalability. This article introduces and discusses...

  14. Towards scalable Byzantine fault-tolerant replication

    Science.gov (United States)

    Zbierski, Maciej

    2017-08-01

    Byzantine fault-tolerant (BFT) replication is a powerful technique, enabling distributed systems to remain available and correct even in the presence of arbitrary faults. Unfortunately, existing BFT replication protocols are mostly load-unscalable, i.e. they fail to respond with adequate performance increase whenever new computational resources are introduced into the system. This article proposes a universal architecture facilitating the creation of load-scalable distributed services based on BFT replication. The suggested approach exploits parallel request processing to fully utilize the available resources, and uses a load balancer module to dynamically adapt to the properties of the observed client workload. The article additionally provides a discussion on selected deployment scenarios, and explains how the proposed architecture could be used to increase the dependability of contemporary large-scale distributed systems.

  15. A graph algebra for scalable visual analytics.

    Science.gov (United States)

    Shaverdian, Anna A; Zhou, Hao; Michailidis, George; Jagadish, Hosagrahar V

    2012-01-01

    Visual analytics (VA), which combines analytical techniques with advanced visualization features, is fast becoming a standard tool for extracting information from graph data. Researchers have developed many tools for this purpose, suggesting a need for formal methods to guide these tools' creation. Increased data demands on computing require redesigning VA tools to consider performance and reliability in the context of analysis of exascale datasets. Furthermore, visual analysts need a way to document their analyses for reuse and results justification. A VA graph framework encapsulated in a graph algebra helps address these needs. Its atomic operators include selection and aggregation. The framework employs a visual operator and supports dynamic attributes of data to enable scalable visual exploration of data.

  16. Declarative and Scalable Selection for Map Visualizations

    DEFF Research Database (Denmark)

    Kefaloukos, Pimin Konstantin Balic

    ...foreground layers is merited. (2) The typical map making professional has changed from a GIS specialist to a busy person with map making as a secondary skill. Today, thematic maps are produced by journalists, aid workers, amateur data enthusiasts, and scientists alike. Therefore it is crucial that this diverse group of map makers is provided with easy-to-use and expressible thematic map design tools. Such tools should support customized selection of data for maps in scenarios where developer time is a scarce resource. (3) The Web provides access to massive data repositories for thematic maps... based on an access log of recent requests. The results show that Glossy SQL and CVL can be used to compute cartographic selection by processing one or more complex queries in a relational database. The scalability of the approach has been verified up to half a million objects in the database. Furthermore...

  17. Scalable and Media Aware Adaptive Video Streaming over Wireless Networks

    Science.gov (United States)

    Tizon, Nicolas; Pesquet-Popescu, Béatrice

    2008-12-01

    This paper proposes an advanced video streaming system based on scalable video coding in order to optimize resource utilization in wireless networks with retransmission mechanisms at radio protocol level. The key component of this system is a packet scheduling algorithm which operates on the different substreams of a main scalable video stream and which is implemented in a so-called media aware network element. The transport channel considered is a dedicated channel subject to long-run variations in its parameters (bitrate, loss rate). Moreover, we propose a combined scalability approach in which common temporal and SNR scalability features can be used jointly with a partitioning of the image into regions of interest. Simulation results show that our approach provides substantial quality gain compared to classical packet transmission methods, and they demonstrate how ROI coding combined with SNR scalability further improves the visual quality.

  18. accelerating cavity

    CERN Multimedia

    On the inside of the cavity there is a layer of niobium. Operating at 4.2 degrees above absolute zero, the niobium is superconducting and carries an accelerating field of 6 million volts per metre with negligible losses. Each cavity has a surface of 6 m². The niobium layer is only 1.2 microns thick, ten times thinner than a hair. Such a large area had never been coated to such a high accuracy. A speck of dust could ruin the performance of the whole cavity so the work had to be done in an extremely clean environment.

  19. Pre-Hardware Optimization of Spacecraft Image Processing Software Algorithms and Hardware Implementation

    Science.gov (United States)

    Kizhner, Semion; Flatley, Thomas P.; Hestnes, Phyllis; Jentoft-Nilsen, Marit; Petrick, David J.; Day, John H. (Technical Monitor)

    2001-01-01

    Spacecraft telemetry rates have steadily increased over the last decade presenting a problem for real-time processing by ground facilities. This paper proposes a solution to a related problem for the Geostationary Operational Environmental Spacecraft (GOES-8) image processing application. Although large super-computer facilities are the obvious heritage solution, they are very costly, making it imperative to seek a feasible alternative engineering solution at a fraction of the cost. The solution is based on a Personal Computer (PC) platform and synergy of optimized software algorithms and re-configurable computing hardware technologies, such as Field Programmable Gate Arrays (FPGA) and Digital Signal Processing (DSP). It has been shown in [1] and [2] that this configuration can provide superior inexpensive performance for a chosen application on the ground station or on-board a spacecraft. However, since this technology is still maturing, intensive pre-hardware steps are necessary to achieve the benefits of hardware implementation. This paper describes these steps for the GOES-8 application, a software project developed using Interactive Data Language (IDL) (Trademark of Research Systems, Inc.) on a Workstation/UNIX platform. The solution involves converting the application to a PC/Windows/RC platform, selected mainly by the availability of low cost, adaptable high-speed RC hardware. In order for the hybrid system to run, the IDL software was modified to account for platform differences. It was interesting to examine the gains and losses in performance on the new platform, as well as unexpected observations before implementing hardware. After substantial pre-hardware optimization steps, the necessity of hardware implementation for bottleneck code in the PC environment became evident and solvable beginning with the methodology described in [1], [2], and implementing a novel methodology for this specific application [6]. 
    The PC-RC interface bandwidth problem for the...

  20. Trusted Module Acquisition Through Proof-Carrying Hardware Intellectual Property

    Science.gov (United States)

    2015-05-22

    By borrowing ideas from the proof carrying code (PCC) in software domain, in this project we introduced the proof carrying hardware intellectual property (PCHIP) framework, which aims to ensure the trustworthiness of third-party hardware IPs utilizing formal methods. We...

  1. The Impact of Flight Hardware Scavenging on Space Logistics

    Science.gov (United States)

    Oeftering, Richard C.

    2011-01-01

    For a given fixed launch vehicle capacity the logistics payload delivered to the moon may be only roughly 20 percent of the payload delivered to the International Space Station (ISS). This is compounded by the much lower flight frequency to the moon and thus low availability of spares for maintenance. This implies that lunar hardware is much more scarce and more costly per kilogram than ISS and thus there is much more incentive to preserve hardware. The Constellation Lunar Surface System (LSS) program is considering ways of utilizing hardware scavenged from vehicles including the Altair lunar lander. In general, the hardware will have only had a matter of hours of operation yet there may be years of operational life remaining. By scavenging this hardware the program, in effect, is treating vehicle hardware as part of the payload. Flight hardware may provide logistics spares for system maintenance and reduce the overall logistics footprint. This hardware has a wide array of potential applications including expanding the power infrastructure, and exploiting in-situ resources. Scavenging can also be seen as a way of recovering the value of, literally, billions of dollars worth of hardware that would normally be discarded. Scavenging flight hardware adds operational complexity and steps must be taken to augment the crew's capability with robotics, capabilities embedded in flight hardware itself, and external processes. New embedded technologies are needed to make hardware more serviceable and scavengable. Process technologies are needed to extract hardware, evaluate hardware, reconfigure or repair hardware, and reintegrate it into new applications. This paper also illustrates how scavenging can be used to drive down the cost of the overall program by exploiting the intrinsic value of otherwise discarded flight hardware.

  2. Center for Programming Models for Scalable Parallel Computing - Towards Enhancing OpenMP for Manycore and Heterogeneous Nodes

    Energy Technology Data Exchange (ETDEWEB)

    Barbara Chapman

    2012-02-01

    OpenMP was not well recognized at the beginning of the project, around year 2003, because of its limited use in DoE production applications and the immature hardware support for an efficient implementation. Yet in recent years, it has been gradually adopted both in HPC applications, mostly in the form of MPI+OpenMP hybrid code, and in mid-scale desktop applications for scientific and experimental studies. We have observed this trend and worked diligently to improve our OpenMP compiler and runtimes, as well as to work with the OpenMP standard organization to make sure OpenMP evolves in a direction close to DoE missions. In the Center for Programming Models for Scalable Parallel Computing project, the HPCTools team at the University of Houston (UH), directed by Dr. Barbara Chapman, has been working with project partners, external collaborators and hardware vendors to increase the scalability and applicability of OpenMP for multi-core (and future manycore) platforms and for distributed memory systems by exploring different programming models, language extensions, compiler optimizations, as well as runtime library support.

  3. Safe to Fly: Certifying COTS Hardware for Spaceflight

    Science.gov (United States)

    Fichuk, Jessica L.

    2011-01-01

    Providing hardware for the astronauts to use on board the Space Shuttle or International Space Station (ISS) involves a certification process that entails evaluating hardware safety, weighing risks, providing mitigation, and verifying requirements. Upon completion of this certification process, the hardware is deemed safe to fly. This process from start to finish can be completed as quickly as 1 week or can take several years in length depending on the complexity of the hardware and whether the item is a unique custom design. One area of cost and schedule savings that NASA implements is buying Commercial Off the Shelf (COTS) hardware and certifying it for human spaceflight as safe to fly. By utilizing commercial hardware, NASA saves time not having to develop, design and build the hardware from scratch, as well as a timesaving in the certification process. By utilizing COTS hardware, the current detailed certification process can be simplified which results in schedule savings. Cost savings is another important benefit of flying COTS hardware. Procuring COTS hardware for space use can be more economical than custom building the hardware. This paper will investigate the cost savings associated with certifying COTS hardware to NASA's standards rather than performing a custom build.

  4. Is Hardware Removal Recommended after Ankle Fracture Repair?

    Directory of Open Access Journals (Sweden)

    Hong-Geun Jung

    2016-01-01

    The indications and clinical necessity for routine hardware removal after treating ankle or distal tibia fracture with open reduction and internal fixation are disputed even when hardware-related pain is insignificant. Thus, we determined the clinical effects of routine hardware removal irrespective of the degree of hardware-related pain, especially in the perspective of patients’ daily activities. This study was conducted on 80 consecutive cases (78 patients) treated by surgery and hardware removal after bony union. There were 56 ankle and 24 distal tibia fractures. The hardware-related pain, ankle joint stiffness, discomfort on ambulation, and patient satisfaction were evaluated before and at least 6 months after hardware removal. Pain score before hardware removal was 3.4 (range 0 to 6) and decreased to 1.3 (range 0 to 6) after removal. 58 (72.5%) patients experienced improved ankle stiffness and 65 (81.3%) less discomfort while walking on uneven ground and 63 (80.8%) patients were satisfied with hardware removal. These results suggest that routine hardware removal after ankle or distal tibia fracture could ameliorate hardware-related pain and improve daily activities and patient satisfaction even when the hardware-related pain is minimal.

  5. Unifying Approach to Software and Hardware Design for Scientific Calculations

    OpenAIRE

    Litvinov, G. L.; Maslov, V. P.; Rodionov, A. Ya.

    1999-01-01

    A unifying approach to software and hardware design generated by ideas of Idempotent Mathematics is discussed. The so-called idempotent correspondence principle for algorithms, programs and hardware units is described. A software project based on this approach is presented. Key words: universal algorithms, idempotent calculus, software design, hardware design, object oriented programming

  6. Is Hardware Removal Recommended after Ankle Fracture Repair?

    Science.gov (United States)

    Jung, Hong-Geun; Kim, Jin-Il; Park, Jae-Yong; Park, Jong-Tae; Eom, Joon-Sang; Lee, Dong-Oh

    2016-01-01

    The indications and clinical necessity for routine hardware removal after treating ankle or distal tibia fracture with open reduction and internal fixation are disputed even when hardware-related pain is insignificant. Thus, we determined the clinical effects of routine hardware removal irrespective of the degree of hardware-related pain, especially in the perspective of patients' daily activities. This study was conducted on 80 consecutive cases (78 patients) treated by surgery and hardware removal after bony union. There were 56 ankle and 24 distal tibia fractures. The hardware-related pain, ankle joint stiffness, discomfort on ambulation, and patient satisfaction were evaluated before and at least 6 months after hardware removal. Pain score before hardware removal was 3.4 (range 0 to 6) and decreased to 1.3 (range 0 to 6) after removal. 58 (72.5%) patients experienced improved ankle stiffness and 65 (81.3%) less discomfort while walking on uneven ground and 63 (80.8%) patients were satisfied with hardware removal. These results suggest that routine hardware removal after ankle or distal tibia fracture could ameliorate hardware-related pain and improve daily activities and patient satisfaction even when the hardware-related pain is minimal.

  7. A Survey of Software and Hardware Approaches to Performing Read Alignment in Next Generation Sequencing.

    Science.gov (United States)

    Al Kawam, Ahmad; Khatri, Sunil; Datta, Aniruddha

    2017-01-01

    Computational genomics is an emerging field that is enabling us to reveal the origins of life and the genetic basis of diseases such as cancer. Next Generation Sequencing (NGS) technologies have unleashed a wealth of genomic information by producing immense amounts of raw data. Before any functional analysis can be applied to this data, read alignment is applied to find the genomic coordinates of the produced sequences. Alignment algorithms have evolved rapidly with the advancement in sequencing technology, striving to achieve biological accuracy at the expense of increasing space and time complexities. Hardware approaches have been proposed to accelerate the computational bottlenecks created by the alignment process. Although several hardware approaches have achieved remarkable speedups, most have overlooked important biological features, which have hampered their widespread adoption by the genomics community. In this paper, we provide a brief biological introduction to genomics and NGS. We discuss the most popular next generation read alignment tools and algorithms. Furthermore, we provide a comprehensive survey of the hardware implementations used to accelerate these algorithms.

  8. Computer hardware for radiologists: Part 2

    Directory of Open Access Journals (Sweden)

    Indrajit I

    2010-01-01

    Computers are an integral part of modern radiology equipment. In the first half of this two-part article, we dwelt upon some fundamental concepts regarding computer hardware, covering components like motherboard, central processing unit (CPU), chipset, random access memory (RAM), and memory modules. In this article, we describe the remaining computer hardware components that are of relevance to radiology. "Storage drive" is a term describing a "memory" hardware used to store data for later retrieval. Commonly used storage drives are hard drives, floppy drives, optical drives, flash drives, and network drives. The capacity of a hard drive is dependent on many factors, including the number of disk sides, number of tracks per side, number of sectors on each track, and the amount of data that can be stored in each sector. "Drive interfaces" connect hard drives and optical drives to a computer. The connections of such drives require both a power cable and a data cable. The four most popular "input/output devices" used commonly with computers are the printer, monitor, mouse, and keyboard. The "bus" is a built-in electronic signal pathway in the motherboard to permit efficient and uninterrupted data transfer. A motherboard can have several buses, including the system bus, the PCI express bus, the PCI bus, the AGP bus, and the (outdated) ISA bus. "Ports" are the location at which external devices are connected to a computer motherboard. All commonly used peripheral devices, such as printers, scanners, and portable drives, need ports. A working knowledge of computers is necessary for the radiologist if the workflow is to realize its full potential and, besides, this knowledge will prepare the radiologist for the coming innovations in the 'ever increasing' digital future.

  9. Static Scheduling of Periodic Hardware Tasks with Precedence and Deadline Constraints on Reconfigurable Hardware Devices

    Directory of Open Access Journals (Sweden)

    Ikbel Belaid

    2011-01-01

    Task graph scheduling for reconfigurable hardware devices can be defined as finding a schedule for a set of periodic tasks with precedence, dependence, and deadline constraints as well as their optimal allocations on the available heterogeneous hardware resources. This paper proposes a new methodology comprising three main stages. Using these three main stages, dynamic partial reconfiguration and mixed integer programming, pipelined scheduling and efficient placement are achieved and enable parallel computing of the task graph on the reconfigurable devices by optimizing placement/scheduling quality. Experiments on an application of heterogeneous hardware tasks demonstrate an improvement of resource utilization of 12.45% of the available reconfigurable resources corresponding to a resource gain of 17.3% compared to a static design. The configuration overhead is reduced to 2% of the total running time. Due to pipelined scheduling, the task graph spanning is minimized by 4% compared to sequential execution of the graph.

  10. Space Telecommunications Radio Systems (STRS) Hardware Architecture Standard: Release 1.0 Hardware Section

    Science.gov (United States)

    Reinhart, Richard C.; Kacpura, Thomas J.; Smith, Carl R.; Liebetreu, John; Hill, Gary; Mortensen, Dale J.; Andro, Monty; Scardelletti, Maximilian C.; Farrington, Allen

    2008-01-01

    This report defines a hardware architecture approach for software-defined radios to enable commonality among NASA space missions. The architecture accommodates a range of reconfigurable processing technologies including general-purpose processors, digital signal processors, field programmable gate arrays, and application-specific integrated circuits (ASICs) in addition to flexible and tunable radiofrequency front ends to satisfy varying mission requirements. The hardware architecture consists of modules, radio functions, and interfaces. The modules are a logical division of common radio functions that compose a typical communication radio. This report describes the architecture details, the module definitions, the typical functions on each module, and the module interfaces. Tradeoffs between component-based, custom architecture and a functional-based, open architecture are described. The architecture does not specify a physical implementation internally on each module, nor does the architecture mandate the standards or ratings of the hardware used to construct the radios.

  11. Methodology for Assessing Reusability of Spaceflight Hardware

    Science.gov (United States)

    Childress-Thompson, Rhonda; Thomas, L. Dale; Farrington, Phillip

    2017-01-01

    In 2011 the Space Shuttle, the only Reusable Launch Vehicle (RLV) in the world, returned to earth for the final time. Upon retirement of the Space Shuttle, the United States (U.S.) no longer possessed a reusable vehicle or the capability to send American astronauts to space. With the National Aeronautics and Space Administration (NASA) out of the RLV business and now only pursuing Expendable Launch Vehicles (ELV), not only did companies within the U.S. start to actively pursue the development of either RLVs or reusable components, but entities around the world began to venture into the reusable market. For example, SpaceX and Blue Origin are developing reusable vehicles and engines. The Indian Space Research Organization is developing a reusable space plane and Airbus is exploring the possibility of reusing its first stage engines and avionics housed in the flyback propulsion unit referred to as the Advanced Expendable Launcher with Innovative engine Economy (Adeline). Even United Launch Alliance (ULA) has announced plans for eventually replacing the Atlas and Delta expendable rockets with a family of RLVs called Vulcan. Reuse can be categorized as either fully reusable, the situation in which the entire vehicle is recovered, or partially reusable such as the National Space Transportation System (NSTS) where only the Space Shuttle, Space Shuttle Main Engines (SSME), and Solid Rocket Boosters (SRB) are reused. With this influx of renewed interest in reusability for space applications, it is imperative that a systematic approach be developed for assessing the reusability of spaceflight hardware. The partially reusable NSTS offered many opportunities to glean lessons learned; however, when it came to efficient operability for reuse the Space Shuttle and its associated hardware fell short primarily because of its two to four-month turnaround time. Although there have been several attempts at designing RLVs in the past with the X-33, Venture Star and Delta Clipper

  12. List search hardware for interpretive software

    CERN Document Server

    Altaber, Jacques; Mears, B; Rausch, R

    1979-01-01

    Interpreted languages, e.g. BASIC, are simple to learn, easy to use, quick to modify and in general 'user-friendly'. However, a critically time consuming process during interpretation is that of list searching. A special microprogrammed device for fast list searching has therefore been developed at the SPS Division of CERN. It uses bit-sliced hardware. Fast algorithms perform search, insert and delete of a six-character name and its value in a list of up to 1000 pairs. The prototype shows retrieval times of the order of 10-30 microseconds. (11 refs).

  13. Development of Hardware Dual Modality Tomography System

    Directory of Open Access Journals (Sweden)

    R. M. Zain

    2009-06-01

    The paper describes the hardware development and performance of the Dual Modality Tomography (DMT) system. DMT consists of optical and capacitance sensors. The optical sensors consist of 16 LEDs and 16 photodiodes. The Electrical Capacitance Tomography (ECT) electrode design uses eight electrode plates as the detecting sensor. The digital timing and control unit has been developed in order to control the light projection of the optical emitters, switch the capacitance electrodes, and synchronize the operation of data acquisition. As a result, the developed system is able to provide a maximum of 529 data sets per second from the signal conditioning circuit to the computer.

  14. Hardware Trigger Processor for the MDT System

    CERN Document Server

    Costa De Paiva, Thiago; The ATLAS collaboration

    2017-01-01

    We are developing a low-latency hardware trigger processor for the Monitored Drift Tube system in the ATLAS Muon spectrometer. The processor will fit candidate Muon tracks in the drift tubes in real time, improving significantly the momentum resolution provided by the dedicated trigger chambers. We present a novel pure-FPGA implementation of a Legendre transform segment finder, an associative-memory alternative implementation, an ARM (Zynq) processor-based track fitter, and compact ATCA carrier board architecture. The ATCA architecture is designed to allow a modular, staged approach to deployment of the system and exploration of alternative technologies.

  15. A PUFs-based hardware authentication BLAKE algorithm in 65 nm CMOS

    Science.gov (United States)

    Zhang, Yuejun; Wang, Pengjun; Zhang, Xuelong; Weng, Xinqian; Yu, Zhiyi

    2016-06-01

    This paper presents a hardware authentication BLAKE algorithm based on physical unclonable functions (PUFs) in Taiwan Semiconductor Manufacturing Company low-power 65 nm CMOS. To support the hardware authentication feature, PUFs have been organised in the BLAKE algorithm as the salt value. The trials table method is used to improve the robustness of the PUFs, resulting in approximately 100% stability against supply voltage variations from 0.7 V to 1.6 V. By discussing the G-function of the BLAKE algorithm, the hardware implementation is considered for acceleration, resulting in significant performance improvements. The die occupies 2.62 mm² and operates at a maximum frequency of 1.0 GHz at 1.6 V. Measured results show that the PUFs have good random characteristics and the authentication chip dissipates an average power of 91 mW under typical conditions at 1.2 V and 780 MHz. In comparison with other works, the PUFs-based BLAKE algorithm has a hardware authentication feature and improves throughput by about 45%.

  16. Embedded Hardware-Efficient Real-Time Classification With Cascade Support Vector Machines.

    Science.gov (United States)

    Kyrkou, Christos; Bouganis, Christos-Savvas; Theocharides, Theocharis; Polycarpou, Marios M

    2016-01-01

    Cascade support vector machines (SVMs) are optimized to efficiently handle problems, where the majority of the data belong to one of the two classes, such as image object classification, and hence can provide speedups over monolithic (single) SVM classifiers. However, SVM classification is a computationally demanding task and existing hardware architectures for SVMs only consider monolithic classifiers. This paper proposes the acceleration of cascade SVMs through a hybrid processing hardware architecture optimized for the cascade SVM classification flow, accompanied by a method to reduce the required hardware resources for its implementation, and a method to improve the classification speed utilizing cascade information to further discard data samples. The proposed SVM cascade architecture is implemented on a Spartan-6 field-programmable gate array (FPGA) platform and evaluated for object detection on 800×600 (Super Video Graphics Array) resolution images. The proposed architecture, boosted by a neural network that processes cascade information, achieves a real-time processing rate of 40 frames/s for the benchmark face detection application. Furthermore, the hardware-reduction method results in the utilization of 25% less FPGA custom-logic resources and 20% peak power reduction compared with a baseline implementation.

  17. VLSI realization of learning vector quantization with hardware/software co-design for different applications

    Science.gov (United States)

    An, Fengwei; Akazawa, Toshinobu; Yamasaki, Shogo; Chen, Lei; Jürgen Mattausch, Hans

    2015-04-01

    This paper reports a VLSI realization of learning vector quantization (LVQ) with high flexibility for different applications. It is based on a hardware/software (HW/SW) co-design concept for on-chip learning and recognition and designed as a SoC in 180 nm CMOS. The time consuming nearest Euclidean distance search in the LVQ algorithm’s competition layer is efficiently implemented as a pipeline with parallel p-word input. Since neuron number in the competition layer, weight values, input and output number are scalable, the requirements of many different applications can be satisfied without hardware changes. Classification of a d-dimensional input vector is completed in $n \times \lceil d/p \rceil + R$ clock cycles, where R is the pipeline depth, and n is the number of reference feature vectors (FVs). Adjustment of stored reference FVs during learning is done by the embedded 32-bit RISC CPU, because this operation is not time critical. The high flexibility is verified by the application of human detection with different numbers for the dimensionality of the FVs.
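    As a rough software illustration of the two ideas in this abstract, the sketch below evaluates the stated cycle count, n times the ceiling of d/p plus R, and performs the nearest Euclidean distance search that the competition layer carries out. The function names and the example numbers are assumptions chosen for illustration; nothing here models the actual chip pipeline.

```python
def classification_cycles(n, d, p, R):
    """Clock cycles for one classification: n * ceil(d / p) + R.

    n: number of reference feature vectors, d: input dimensionality,
    p: words consumed in parallel per cycle, R: pipeline depth.
    """
    words_per_vector = (d + p - 1) // p  # integer ceiling of d / p
    return n * words_per_vector + R


def nearest_reference(x, references):
    """Index of the reference vector closest to x in squared Euclidean distance."""
    best_index, best_dist = -1, float("inf")
    for i, ref in enumerate(references):
        dist = sum((a - b) ** 2 for a, b in zip(x, ref))
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index


# Hypothetical example values, not taken from the paper.
cycles = classification_cycles(n=10, d=64, p=16, R=5)
winner = nearest_reference([0, 0], [[1, 1], [0, 1], [3, 3]])
print(cycles, winner)
```

    Because the ceiling is taken per vector, a dimensionality that is not a multiple of p costs one extra cycle for every reference vector, which is why the formula grows in steps rather than linearly in d.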

  18. A low power biomedical signal processor ASIC based on hardware software codesign.

    Science.gov (United States)

    Nie, Z D; Wang, L; Chen, W G; Zhang, T; Zhang, Y T

    2009-01-01

    A low power biomedical digital signal processor ASIC based on a hardware and software codesign methodology is presented in this paper. The codesign methodology was used to achieve higher system performance and design flexibility. The hardware implementation included a low power 32-bit RISC CPU (ARM7TDMI), a low power AHB-compatible bus, and a scalable digital co-processor that was optimized for low power Fast Fourier Transform (FFT) calculations. The co-processor could be scaled for 8-point, 16-point and 32-point FFTs, taking approximately 50, 100 and 150 clock cycles, respectively. The complete design was intensively simulated using the ARM DSM model and emulated on the ARM Versatile platform before being committed to silicon. The multi-million-gate ASIC was fabricated using SMIC 0.18 µm mixed-signal CMOS 1P6M technology. The die area measures 5,000 µm x 2,350 µm. The power consumption was approximately 3.6 mW at a 1.8 V power supply and 1 MHz clock rate. The power consumption for FFT calculations was less than 1.5% compared with the conventional embedded software-based solution.

  19. Building Scalable Knowledge Graphs for Earth Science

    Science.gov (United States)

    Ramachandran, Rahul; Maskey, Manil; Gatlin, Patrick; Zhang, Jia; Duan, Xiaoyi; Miller, J. J.; Bugbee, Kaylin; Christopher, Sundar; Freitag, Brian

    2017-01-01

    Knowledge Graphs link key entities in a specific domain with other entities via relationships. From these relationships, researchers can query knowledge graphs for probabilistic recommendations to infer new knowledge. Scientific papers are an untapped resource which knowledge graphs could leverage to accelerate research discovery. Goal: Develop an end-to-end (semi) automated methodology for constructing Knowledge Graphs for Earth Science.

  20. Introduction to Hardware Security and Trust

    CERN Document Server

    Wang, Cliff

    2012-01-01

    The emergence of a globalized, horizontal semiconductor business model raises a set of concerns involving the security and trust of the information systems on which modern society is increasingly reliant for mission-critical functionality. Hardware-oriented security and trust issues span a broad range including threats related to the malicious insertion of Trojan circuits designed, e.g., to act as a ‘kill switch’ to disable a chip, to integrated circuit (IC) piracy, and to attacks designed to extract encryption keys and IP from a chip. This book provides the foundations for understanding hardware security and trust, which have become major concerns for national security over the past decade. Coverage includes security and trust issues in all types of electronic devices and systems such as ASICs, COTS, FPGAs, microprocessors/DSPs, and embedded systems. This serves as an invaluable reference to the state-of-the-art research that is of critical significance to the security of, and trust in, modern society...

  1. ISS Logistics Hardware Disposition and Metrics Validation

    Science.gov (United States)

    Rogers, Toneka R.

    2010-01-01

    I was assigned to the Logistics Division of the International Space Station (ISS)/Spacecraft Processing Directorate. The Division consists of eight NASA engineers and specialists that oversee the logistics portion of the Checkout, Assembly, and Payload Processing Services (CAPPS) contract. Boeing, their sub-contractors and the Boeing Prime contract out of Johnson Space Center, provide the Integrated Logistics Support for the ISS activities at Kennedy Space Center. Essentially they ensure that spares are available to support flight hardware processing and the associated ground support equipment (GSE). Boeing maintains a Depot for electrical, mechanical and structural modifications and/or repair capability as required. My assigned task was to learn project management techniques utilized by NASA and its contractors to provide an efficient and effective logistics support infrastructure to the ISS program. Within the Space Station Processing Facility (SSPF) I was exposed to Logistics support components, such as the NASA Spacecraft Services Depot (NSSD) capabilities, Mission Processing tools, techniques and Warehouse support issues, required for integrating Space Station elements at the Kennedy Space Center. I also supported the identification of near-term ISS Hardware and Ground Support Equipment (GSE) candidates for excessing/disposition prior to October 2010; and the validation of several Logistics Metrics used by the contractor to measure logistics support effectiveness.

  2. ARM assembly language with hardware experiments

    CERN Document Server

    Elahi, Ata

    2015-01-01

    This book provides a hands-on approach to learning ARM assembly language with the use of a TI microcontroller. The book starts with an introduction to computer architecture and then discusses number systems and digital logic. The text covers ARM Assembly Language, ARM Cortex Architecture and its components, and Hardware Experiments using TI LM3S1968. Written for those interested in learning embedded programming using an ARM Microcontroller. · Introduces number systems and signal transmission methods · Reviews logic gates, registers, multiplexers, decoders and memory · Provides an overview and examples of ARM instruction set · Uses Keil development tools for writing and debugging ARM assembly language Programs · Hardware experiments using a Mbed NXP LPC1768 microcontroller; including General Purpose Input/Output (GPIO) configuration, real time clock configuration, binary input to 7-segment display, creating ...

  3. Fast and Scalable Computation of the Forward and Inverse Discrete Periodic Radon Transform.

    Science.gov (United States)

    Carranza, Cesar; Llamocca, Daniel; Pattichis, Marios

    2016-01-01

    The discrete periodic Radon transform (DPRT) has extensively been used in applications that involve image reconstructions from projections. Beyond classic applications, the DPRT can also be used to compute fast convolutions that avoid the use of floating-point arithmetic associated with the use of the fast Fourier transform. Unfortunately, the use of the DPRT has been limited by the need to compute a large number of additions and the need for a large number of memory accesses. This paper introduces a fast and scalable approach for computing the forward and inverse DPRT that is based on the use of: a parallel array of fixed-point adder trees; circular shift registers to remove the need for accessing external memory components when selecting the input data for the adder trees; an image block-based approach to DPRT computation that can fit the proposed architecture to available resources; and fast transpositions that are computed in one or a few clock cycles that do not depend on the size of the input image. As a result, for an N × N image (N prime), the proposed approach can compute up to N² additions per clock cycle. Compared with the previous approaches, the scalable approach provides the fastest known implementations for different amounts of computational resources. For example, for a 251×251 image, with approximately 25% fewer flip-flops than required for a systolic implementation, the scalable DPRT is computed 36 times faster. For the fastest case, we introduce optimized architectures that can compute the DPRT and its inverse in just 2N + ⌈log₂ N⌉ + 1 and 2N + 3⌈log₂ N⌉ + B + 2 cycles, respectively, where B is the number of bits used to represent each input pixel. On the other hand, the scalable DPRT approach requires more 1-b additions than the systolic implementation and provides a tradeoff between speed and additional 1-b additions. All of the proposed DPRT architectures were implemented in VHSIC Hardware Description Language
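
    The two cycle counts quoted for the fastest architectures, 2N + ⌈log₂ N⌉ + 1 for the forward DPRT and 2N + 3⌈log₂ N⌉ + B + 2 for the inverse, can be evaluated as follows. This is only an arithmetic sketch: the function names are hypothetical, N = 251 is the image size mentioned in the abstract, and B = 8 bits per pixel is an assumption made here for illustration.

```python
def ceil_log2(n: int) -> int:
    """Return ceil(log2(n)) for n >= 1 using integer arithmetic only."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return (n - 1).bit_length()


def is_prime(n: int) -> bool:
    """Trial-division primality check; the formulas assume a prime image size N."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def fastest_dprt_cycles(n: int, bits_per_pixel: int) -> tuple:
    """Return (forward, inverse) cycle counts from the abstract's formulas."""
    if not is_prime(n):
        raise ValueError("N must be prime")
    log_n = ceil_log2(n)
    forward = 2 * n + log_n + 1
    inverse = 2 * n + 3 * log_n + bits_per_pixel + 2
    return forward, inverse


# N = 251 comes from the abstract; B = 8 is an assumed pixel depth.
print(fastest_dprt_cycles(251, 8))
```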

  4. Hardware/Software Co-Design of a Traffic Sign Recognition System Using Zynq FPGAs

    Directory of Open Access Journals (Sweden)

    Yan Han

    2015-12-01

    Traffic sign recognition (TSR), taken as an important component of an intelligent vehicle system, has been an emerging research topic in recent years. In this paper, a traffic sign detection system based on color segmentation, speeded-up robust features (SURF) detection and the k-nearest neighbor classifier is introduced. The proposed system benefits from the SURF detection algorithm, which achieves invariance to rotated, skewed and occluded signs. In addition to the accuracy and robustness issues, a TSR system should target a real-time implementation on an embedded system. Therefore, a hardware/software co-design architecture for a Zynq-7000 FPGA is presented as a major objective of this work. The sign detection operations are accelerated by programmable hardware logic that searches the potential candidates for sign classification. Sign recognition and classification uses a feature extraction and matching algorithm, which is implemented as a software component that runs on the embedded ARM CPU.

  5. Oracle database performance and scalability a quantitative approach

    CERN Document Server

    Liu, Henry H

    2011-01-01

    A data-driven, fact-based, quantitative text on Oracle performance and scalability. With database concepts and theories clearly explained in Oracle's context, readers quickly learn how to fully leverage Oracle's performance and scalability capabilities at every stage of designing and developing an Oracle-based enterprise application. The book is based on the author's more than ten years of experience working with Oracle, and is filled with dependable, tested, and proven performance optimization techniques. Oracle Database Performance and Scalability is divided into four parts that enable reader

  6. A novel 3D scalable video compression algorithm

    Science.gov (United States)

    Somasundaram, Siva; Subbalakshmi, Koduvayur P.

    2003-05-01

    In this paper we propose a scalable video coding scheme that utilizes the embedded block coding with optimal truncation (EBCOT) compression algorithm. Three-dimensional spatio-temporal decomposition of the video sequence followed by compression using EBCOT generates an SNR- and resolution-scalable bit stream. The proposed video coding algorithm not only performs closer to the MPEG-4 video coding standard in compression efficiency but also provides better SNR and resolution scalability. Experimental results show that the proposed algorithm outperforms the 3-D SPIHT (Set Partitioning in Hierarchical Trees) algorithm by 1.5 dB.

  7. Application of recursive manipulator dynamics to hybrid software/hardware simulation

    Science.gov (United States)

    Hill, Christopher J.; Hopping, Kenneth A.; Price, Charles R.

    1989-01-01

    Computer simulations of robotic mechanisms have traditionally solved the dynamic equations of motion for an N degree of freedom manipulator by formulating an N dimensional matrix equation combining the accelerations and torques (forces) for all joints. The use of an alternative formulation that is strictly recursive is described. The dynamic solution proceeds on a joint by joint basis, so it is possible to perform inverse dynamics at arbitrary joints. The dynamic formulation is generalized with respect to both rotational and translational joints, and it is also directly extendable to branched manipulator chains. A hardware substitution test is described in which a servo drive motor was integrated with a simulated manipulator arm. The form of the dynamic equation permits calculation of acceleration given torque or vice versa. Computing torque as a function of acceleration is required for the hybrid software/hardware simulation test described. For this test, a joint servo motor is controlled in conjunction with the simulation, and the dynamic torque on the servo motor is provided by a load motor on a common driveshaft.

  8. Scalable, remote administration of Windows NT.

    Energy Technology Data Exchange (ETDEWEB)

    Gomberg, M.; Stacey, C.; Sayre, J.

    1999-06-08

    In the UNIX community there is an overwhelming perception that NT is impossible to manage remotely and that NT administration doesn't scale. This was essentially true with earlier versions of the operating system. Even today, out of the box, NT is difficult to manage remotely. Many tools, however, now make remote management of NT not only possible, but under some circumstances very easy. In this paper we discuss how we at Argonne's Mathematics and Computer Science Division manage all our NT machines remotely from a single console, with minimum locally installed software overhead. We also present NetReg, which is a locally developed tool for scalable registry management. NetReg allows us to apply a registry change to a specified set of machines. It is a command line utility that can be run in either interactive or batch mode and is written in Perl for Win32, taking heavy advantage of the Win32::TieRegistry module.

  9. Scalable conditional induction variables (CIV) analysis

    KAUST Repository

    Oancea, Cosmin E.

    2015-02-01

    Subscripts using induction variables that cannot be expressed as a formula in terms of the enclosing-loop indices appear in the low-level implementation of common programming abstractions such as Alter, or stack operations and pose significant challenges to automatic parallelization. Because the complexity of such induction variables is often due to their conditional evaluation across the iteration space of loops, we name them Conditional Induction Variables (CIV). This paper presents a flow-sensitive technique that summarizes both such CIV-based and affine subscripts to program level, using the same representation. Our technique requires no modifications of our dependence tests, which are agnostic to the original shape of the subscripts, and is more powerful than previously reported dependence tests that rely on the pairwise disambiguation of read-write references. We have implemented the CIV analysis in our parallelizing compiler and evaluated its impact on five Fortran benchmarks. We have found that there are many important loops using CIV subscripts and that our analysis can lead to their scalable parallelization. This in turn has led to the parallelization of the benchmark programs they appear in.
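
    To make the notion of a conditional induction variable concrete, the sketch below shows a compaction loop in which the write subscript advances only on some iterations, so it has no closed-form expression in the loop index. The function name and data are hypothetical, and Python is used purely for illustration (the paper evaluates Fortran benchmarks); the sketch shows the kind of loop the analysis targets, not the analysis itself.

```python
def compact_positive(values):
    """Copy the positive entries of `values` to the front of a new buffer.

    `civ` is a conditional induction variable: it is incremented only when
    the branch is taken, so out[civ] is not an affine function of i.
    """
    out = [0] * len(values)
    civ = 0
    for i in range(len(values)):
        if values[i] > 0:
            out[civ] = values[i]  # subscript depends on earlier branch outcomes
            civ += 1
    return out[:civ]


print(compact_positive([3, -1, 4, 0, 5]))
```

    Each write goes to a distinct, monotonically increasing position, which is the sort of property a dependence test would need to establish before the loop could be parallelized.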

  10. Scalable Notch Antenna System for Multiport Applications

    Directory of Open Access Journals (Sweden)

    Abdurrahim Toktas

    2016-01-01

    A novel and compact scalable antenna system is designed for multiport applications. The basic design is built on a square patch with an electrical size of 0.82λ0×0.82λ0 (at 2.4 GHz) on a dielectric substrate. The design consists of four symmetrical and orthogonal triangular notches with circular feeding slots at the corners of the common patch. The 4-port antenna can be simply rearranged to 8-port and 12-port systems. The operating band of the system can be tuned by scaling (S) the size of the system while fixing the thickness of the substrate. The antenna system with S: 1/1 in size of 103.5×103.5 mm² operates at the frequency band of 2.3–3.0 GHz. By scaling the antenna with S: 1/2.3, a system of 45×45 mm² is achieved, and thus the operating band is tuned to 4.7–6.1 GHz with the same scattering characteristic. A parametric study is also conducted to investigate the effects of changing the notch dimensions. The performance of the antenna is verified in terms of the antenna characteristics as well as diversity and multiplexing parameters. The antenna system can be tuned by scaling so that it is applicable to multiport WLAN, WiMAX, and LTE devices with port upgradability.

  11. Scalable inference for stochastic block models

    KAUST Repository

    Peng, Chengbin

    2017-12-08

    Community detection in graphs is widely used in social and biological networks, and the stochastic block model is a powerful probabilistic tool for describing graphs with community structures. However, in the era of "big data," traditional inference algorithms for such a model are increasingly limited due to their high time complexity and poor scalability. In this paper, we propose a multi-stage maximum likelihood approach to recover the latent parameters of the stochastic block model, in time linear with respect to the number of edges. We also propose a parallel algorithm based on message passing. Our algorithm can overlap communication and computation, providing speedup without compromising accuracy as the number of processors grows. For example, to process a real-world graph with about 1.3 million nodes and 10 million edges, our algorithm requires about 6 seconds on 64 cores of a contemporary commodity Linux cluster. Experiments demonstrate that the algorithm can produce high quality results on both benchmark and real-world graphs. An example of finding more meaningful communities is illustrated consequently in comparison with a popular modularity maximization algorithm.

  12. A Programmable, Scalable-Throughput Interleaver

    Directory of Open Access Journals (Sweden)

    Rijshouwer EJC

    2010-01-01

    The interleaver stages of digital communication standards show a surprisingly large variation in throughput, state sizes, and permutation functions. Furthermore, data rates for 4G standards such as LTE-Advanced will exceed typical baseband clock frequencies of handheld devices. Multistream operation for Software Defined Radio and iterative decoding algorithms will call for ever higher interleave data rates. Our interleave machine is built around 8 single-port SRAM banks and can be programmed to generate up to 8 addresses every clock cycle. The scalable architecture combines SIMD and VLIW concepts with an efficient resolution of bank conflicts. A wide range of cellular, connectivity, and broadcast interleavers have been mapped on this machine, with throughputs up to more than 0.5 Gsymbol/second. Although it was designed for channel interleaving, the application domain of the interleaver extends also to Turbo interleaving. The presented configuration of the architecture is designed as a part of a programmable outer receiver on a prototype board. It offers (near) universal programmability to enable the implementation of new interleavers. The interleaver measures 2.09 mm² in 65 nm CMOS (including memories) and proves functional on silicon.
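
    Why bank conflicts matter for a machine with 8 single-port banks and up to 8 addresses per cycle can be seen in a small sketch. The bank mapping used here (address modulo 8) and the function names are assumptions for illustration only; the abstract does not describe the actual mapping or the conflict-resolution scheme.

```python
from collections import Counter

NUM_BANKS = 8  # the abstract's 8 single-port SRAM banks


def bank_of(address: int) -> int:
    """Assumed low-order interleaving: bank index = address mod NUM_BANKS."""
    return address % NUM_BANKS


def cycles_for_batch(addresses) -> int:
    """Cycles needed to serve a batch when each bank handles one access per cycle.

    A single-port bank serializes the accesses that hit it, so the batch
    takes as many cycles as the most heavily loaded bank receives accesses.
    """
    load = Counter(bank_of(a) for a in addresses)
    return max(load.values()) if load else 0


print(cycles_for_batch(range(8)))       # eight distinct banks, no conflict
print(cycles_for_batch([0, 8, 16, 3]))  # three accesses collide on bank 0
```

    Under this assumed mapping, a permutation with a stride that is a multiple of 8 would serialize completely, which is why some form of conflict resolution is needed to sustain 8 addresses per cycle.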

  13. SCTP as scalable video coding transport

    Science.gov (United States)

    Ortiz, Jordi; Graciá, Eduardo Martínez; Skarmeta, Antonio F.

    2013-12-01

    This study presents an evaluation of the Stream Control Transmission Protocol (SCTP) for the transport of the scalable video codec (SVC), proposed by MPEG as an extension to H.264/AVC. Both technologies fit together properly. On the one hand, SVC makes it easy to split the bitstream into substreams carrying different video layers, each with different importance for the reconstruction of the complete video sequence at the receiver end. On the other hand, SCTP includes features, such as the multi-streaming and multi-homing capabilities, that allow the SVC layers to be transported robustly and efficiently. Several transmission strategies supported on baseline SCTP and its concurrent multipath transfer (CMT) extension are compared with the classical solutions based on the Transmission Control Protocol (TCP) and the Real-time Transport Protocol (RTP). Using ns-2 simulations, it is shown that CMT-SCTP outperforms TCP and RTP in error-prone networking environments. The comparison is established according to several performance measurements, including delay, throughput, packet loss, and peak signal-to-noise ratio of the received video.

  14. Scalable Combinatorial Tools for Health Disparities Research

    Directory of Open Access Journals (Sweden)

    Michael A. Langston

    2014-10-01

    Despite staggering investments made in unraveling the human genome, current estimates suggest that as much as 90% of the variance in cancer and chronic diseases can be attributed to factors outside an individual’s genetic endowment, particularly to environmental exposures experienced across his or her life course. New analytical approaches are clearly required as investigators turn to complicated systems theory and ecological, place-based and life-history perspectives in order to understand more clearly the relationships between social determinants, environmental exposures and health disparities. While traditional data analysis techniques remain foundational to health disparities research, they are easily overwhelmed by the ever-increasing size and heterogeneity of available data needed to illuminate latent gene x environment interactions. This has prompted the adaptation and application of scalable combinatorial methods, many from genome science research, to the study of population health. Most of these powerful tools are algorithmically sophisticated, highly automated and mathematically abstract. Their utility motivates the main theme of this paper, which is to describe real applications of innovative transdisciplinary models and analyses in an effort to help move the research community closer toward identifying the causal mechanisms and associated environmental contexts underlying health disparities. The public health exposome is used as a contemporary focus for addressing the complex nature of this subject.

  15. Scalability and interoperability within glideinWMS

    Energy Technology Data Exchange (ETDEWEB)

    Bradley, D.; /Wisconsin U., Madison; Sfiligoi, I.; /Fermilab; Padhi, S.; /UC, San Diego; Frey, J.; /Wisconsin U., Madison; Tannenbaum, T.; /Wisconsin U., Madison

    2010-01-01

    Physicists have access to thousands of CPUs in grid federations such as OSG and EGEE. With the start-up of the LHC, it is essential for individuals or groups of users to wrap together available resources from multiple sites across multiple grids under a higher user-controlled layer in order to provide a homogeneous pool of available resources. One such system is glideinWMS, which is based on the Condor batch system. A general discussion of glideinWMS can be found elsewhere. Here, we focus on recent advances in extending its reach: scalability and integration of heterogeneous compute elements. We demonstrate that the new developments exceed the design goal of over 10,000 simultaneous running jobs under a single Condor schedd, using strong security protocols across global networks, and sustaining a steady-state job completion rate of a few Hz. We also show interoperability across heterogeneous computing elements achieved using client-side methods. We discuss this technique and the challenges in direct access to NorduGrid and CREAM compute elements, in addition to Globus based systems.

  16. GPU-Accelerated Text Mining

    Energy Technology Data Exchange (ETDEWEB)

    Cui, Xiaohui [ORNL; Mueller, Frank [North Carolina State University; Zhang, Yongpeng [ORNL; Potok, Thomas E [ORNL

    2009-01-01

    Accelerating hardware devices represent a novel promise for improving performance in many problem domains, but it is not clear which accelerators are suitable for which domains. While there is no room in general-purpose processor design to significantly increase the processor frequency, developers are instead resorting to multi-core chips duplicating conventional computing capabilities on a single die. Yet, accelerators offer more radical designs with a much higher level of parallelism and novel programming environments. The present work assesses the viability of text mining on CUDA. Text mining is one of the key concepts that has become prominent as an effective means to index the Internet, but its applications range beyond this scope and extend to providing document similarity metrics, the subject of this work. We have developed and optimized text search algorithms for GPUs to exploit their potential for massive data processing. We discuss the algorithmic challenges of parallelization for text search problems on GPUs and demonstrate the potential of these devices in experiments by reporting significant speedups. Our study may be one of the first to assess more complex text search problems for suitability for GPU devices, and it may also be one of the first to exploit and report on the use of atomic instructions that have recently become available in NVIDIA devices.
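
    As a point of reference for what a document similarity metric computes, here is a minimal CPU-only sketch. The abstract does not say which metric the authors use; cosine similarity over raw term counts is an assumption made here, and the function names are hypothetical. It is not the authors' GPU algorithm.

```python
import math
from collections import Counter


def term_frequencies(text: str) -> Counter:
    """Count whitespace-separated, lower-cased terms (a deliberately naive tokenizer)."""
    return Counter(text.lower().split())


def cosine_similarity(doc_a: str, doc_b: str) -> float:
    """Cosine of the angle between the two documents' term-count vectors."""
    tf_a = term_frequencies(doc_a)
    tf_b = term_frequencies(doc_b)
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


print(cosine_similarity("gpu text mining", "text mining on a gpu"))
```

    Every document pair can be scored independently, which is the kind of data parallelism that makes such workloads candidates for GPU execution.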

  17. Locating hardware faults in a parallel computer

    Science.gov (United States)

    Archer, Charles J.; Megerian, Mark G.; Ratterman, Joseph D.; Smith, Brian E.

    2010-04-13

    Locating hardware faults in a parallel computer, including defining within a tree network of the parallel computer two or more sets of non-overlapping test levels of compute nodes of the network that together include all the data communications links of the network, each non-overlapping test level comprising two or more adjacent tiers of the tree; defining test cells within each non-overlapping test level, each test cell comprising a subtree of the tree including a subtree root compute node and all descendant compute nodes of the subtree root compute node within a non-overlapping test level; performing, separately on each set of non-overlapping test levels, an uplink test on all test cells in a set of non-overlapping test levels; and performing, separately from the uplink tests and separately on each set of non-overlapping test levels, a downlink test on all test cells in a set of non-overlapping test levels.

  18. Communication Estimation for Hardware/Software Codesign

    DEFF Research Database (Denmark)

    Knudsen, Peter Voigt; Madsen, Jan

    1998-01-01

    This paper presents a general high level estimation model of communication throughput for the implementation of a given communication protocol. The model, which is part of a larger model that includes component price, software driver object code size and hardware driver area, is intended to be general enough to be able to capture the characteristics of a wide range of communication protocols and yet to be sufficiently detailed as to allow the designer or design tool to efficiently explore tradeoffs between throughput, bus widths, burst/non-burst transfers and data packing strategies. Thus it provides a basis for decision making with respect to communication protocols/components and communication driver design in the initial design space exploration phase of a co-synthesis process where a large number of possibilities must be examined and where fast estimators are therefore necessary. The fill...

  19. Hardware codec for digital HDTV recording

    Science.gov (United States)

    Stammnitz, Peter; Boettcher, K.; Grueneberg, Kirsten A.; Hoefker, U.; Klein, H.

    1993-11-01

    For the purpose of digital recording of HDTV signals (EUREKA standard, 1250/50/2:1), a codec has been realized (HDI-codec) which can reduce the initial data rate of 1.152 Gbit/s to one fifth. According to the desired reduction, the playtime of a digital VCR (Video Cassette Recorder) can be increased from about 40 - 60 minutes up to at least the length of a feature film. This paper describes the hardware realization of the data rate reduction codec. Algorithms utilized for data rate reduction are adaptive intraframe/intrafield discrete cosine transform (DCT), adaptive quantization and variable length encoding (VLC). Interframe editing, multiple copy and shuttle mode are supported by a special codec architecture.

  20. Theorem Proving in Intel Hardware Design

    Science.gov (United States)

    O'Leary, John

    2009-01-01

    For the past decade, a framework combining model checking (symbolic trajectory evaluation) and higher-order logic theorem proving has been in production use at Intel. Our tools and methodology have been used to formally verify execution cluster functionality (including floating-point operations) for a number of Intel products, including the Pentium® 4 and Core™ i7 processors. Hardware verification in 2009 is much more challenging than it was in 1999 - today's CPU chip designs contain many processor cores and significant firmware content. This talk will attempt to distill the lessons learned over the past ten years, discuss how they apply to today's problems, and outline some future directions.

  1. Compressive Sensing Image Sensors-Hardware Implementation

    Directory of Open Access Journals (Sweden)

    Shahram Shirani

    2013-04-01

    The compressive sensing (CS) paradigm uses simultaneous sensing and compression to provide an efficient image acquisition technique. The main advantages of the CS method include high resolution imaging using low resolution sensor arrays and faster image acquisition. Since the imaging philosophy in CS imagers is different from conventional imaging systems, new physical structures have been developed for cameras that use the CS technique. In this paper, a review of different hardware implementations of CS encoding in optical and electrical domains is presented. Considering the recent advances in CMOS (complementary metal–oxide–semiconductor) technologies and the feasibility of performing on-chip signal processing, important practical issues in the implementation of CS in CMOS sensors are emphasized. In addition, the CS coding for video capture is discussed.

  2. Handbook of hardware/software codesign

    CERN Document Server

    Teich, Jürgen

    2017-01-01

    This handbook presents fundamental knowledge on the hardware/software (HW/SW) codesign methodology. Contributing expert authors look at key techniques in the design flow as well as selected codesign tools and design environments, building on basic knowledge to consider the latest techniques. The book enables readers to gain real benefits from the HW/SW codesign methodology through explanations and case studies which demonstrate its usefulness. Readers are invited to follow the progress of design techniques through this work, which assists readers in following current research directions and learning about state-of-the-art techniques. Students and researchers will appreciate the wide spectrum of subjects that belong to the design methodology from this handbook.

  3. Current conveyors variants, applications and hardware implementations

    CERN Document Server

    Senani, Raj; Singh, A K

    2015-01-01

    This book serves as a single-source reference to Current Conveyors and their use in modern Analog Circuit Design. The authors describe the various types of current conveyors discovered over the past 45 years, details of all currently available, off-the-shelf integrated circuit current conveyors, and implementations of current conveyors using other, off-the-shelf IC building blocks. Coverage includes prominent bipolar/CMOS/Bi-CMOS architectures of current conveyors, as well as all varieties, from third-generation current conveyors to universal current conveyors, their implementations and applications. • Describes all commercially available off-the-shelf IC current conveyors, as well as hardware implementations of current conveyors using other off-the-shelf ICs; • Describes numerous variants of current conveyors evolved over the past forty-five years; • Describes a number of Bipolar/CMOS/Bi-CMOS architectures of current conveyors, along with their characteristic features; • Includes a comprehe...

  4. Perspectives in Simulation Hardware and Software Architecture

    Directory of Open Access Journals (Sweden)

    W.O. Grierson

    1985-10-01

    Historically, analog and hybrid computer systems have provided effective real-time solutions for the simulation of large dynamic systems. In the mid 1970s, ADI concluded that these systems were no longer adequate to meet the demands of larger, more complex models and the demand for greater simulation accuracy. The decision was to design an all-digital system to satisfy these growing requirements (see Gilbert and Howe, 1978). This all-digital approach was called the SYSTEM 10. The SYSTEM 10 has been effective in solving time-critical simulation problems and in replacing the previous approach of utilizing hybrid computers. Recent advances in 100 K emitter coupled logic (ECL) now make it possible to support a new generation of equipment that expands modeling capabilities to serve simulation needs. The hardware and software concepts of this system, called the SYSTEM 100, are the subject of this paper.

  5. Extravehicular Activity (EVA) Hardware & Operations Overview

    Science.gov (United States)

    Moore, Sandra; Marmolejo, Jose

    2014-01-01

    The objectives of this presentation are to: Define Extravehicular Activity (EVA), identify the reasons for conducting an EVA, and review the role that EVA has played in the space program; Identify the types of EVAs that may be performed; Describe some of the U.S. Space Station equipment and tools that are used during an EVA, such as the Extravehicular Mobility Unit (EMU), the Simplified Aid For EVA Rescue (SAFER), the International Space Station (ISS) Joint Airlock and Russian Docking Compartment 1 (DC-1), and EVA Tools & Equipment; Outline the methods and procedures of EVA Preparation, EVA, and Post-EVA operations; Describe the Russian spacesuit used to perform an EVA; Provide a comparison between U.S. and Russian spacesuit hardware and EVA support; and Define the roles that different training facilities play in EVA training.

  6. Scan image compression-encryption hardware system

    Science.gov (United States)

    Bourbakis, Nikolaos G.; Brause, R.; Alexopoulos, C.

    1995-04-01

    This paper deals with the hardware design of an image compression/encryption scheme called SCAN. The scheme is based on the principles and ideas reflected by the specification of the SCAN language. SCAN is a fractal based context-free language which accesses sequentially the data of a 2D array, by describing and generating a wide range (near (nxn)) of space filling curves (or SCAN patterns) from a short set of simple ones. The SCAN method uses the algorithmic description of each 2D image as SCAN patterns combinations for the compression and encryption of the image data. Note that each SCAN letter or word accesses the image data with a different order (or sequence), thus the application of a variety of SCAN words associated with the compression scheme will produce various compressed versions of the same image. The compressed versions are compared in memory size and the best of them with the smallest size in bits could be used for the image compression/encryption. Note that the encryption of the image data is a result of the great number of possible space filling curves which could be generated by SCAN. Since the software implementation of the SCAN compression/encryption scheme requires some time, the hardware design and implementation of the SCAN scheme is necessary in order to reduce the image compression/encryption time to the real-time one. The development of such an image compression encryption system will have a significant impact on the transmission and storage of images. It will be applicable in multimedia and transmission of images through communication lines.
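
    The selection step described above (scan the same image along several paths, compress each sequence, keep the smallest) can be sketched as follows. This is only an illustration under stated assumptions: the three scan orders and the run-length encoder are stand-ins chosen for brevity, not the actual SCAN language patterns or the compressor used in the paper, and all function names are hypothetical.

```python
# Illustrative sketch only: simple scan orders plus run-length encoding,
# standing in for SCAN patterns and the paper's compression scheme.

def raster_order(n):
    """Visit rows top to bottom, each row left to right."""
    return [(r, c) for r in range(n) for c in range(n)]

def snake_order(n):
    """Visit rows top to bottom, reversing direction on every other row."""
    return [(r, c if r % 2 == 0 else n - 1 - c)
            for r in range(n) for c in range(n)]

def column_order(n):
    """Visit columns left to right, each column top to bottom."""
    return [(r, c) for c in range(n) for r in range(n)]

def run_length_encode(values):
    """Collapse consecutive equal values into [value, count] runs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs

def best_scan(image, orders):
    """Return the name of the scan order giving the fewest runs, plus all sizes."""
    n = len(image)
    sizes = {}
    for name, order_fn in orders.items():
        sequence = [image[r][c] for r, c in order_fn(n)]
        sizes[name] = len(run_length_encode(sequence))
    best = min(sizes, key=sizes.get)
    return best, sizes

# A 4x4 image of vertical stripes: column-wise scanning yields long runs.
image = [
    [0, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 1, 0, 1],
]
ORDERS = {"raster": raster_order, "snake": snake_order, "column": column_order}
best, sizes = best_scan(image, ORDERS)
print(best, sizes)
```

    For the striped test image, the column-wise path produces far fewer runs than the row-wise paths, which shows why the choice of scan path changes the compressed size.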

  7. ARC Code TI: Block-GP: Scalable Gaussian Process Regression

    Data.gov (United States)

    National Aeronautics and Space Administration — Block GP is a Gaussian Process regression framework for multimodal data, that can be an order of magnitude more scalable than existing state-of-the-art nonlinear...

  8. Scalable pattern recognition algorithms applications in computational biology and bioinformatics

    CERN Document Server

    Maji, Pradipta

    2014-01-01

    Reviews the development of scalable pattern recognition algorithms for computational biology and bioinformatics Includes numerous examples and experimental results to support the theoretical concepts described Concludes each chapter with directions for future research and a comprehensive bibliography

  9. Scalability of telecom cloud architectures for live-TV distribution

    OpenAIRE

    Asensio Carmona, Adrian; Contreras, Luis Miguel; Ruiz Ramírez, Marc; López Álvarez, Victor; Velasco Esteban, Luis Domingo

    2015-01-01

    A hierarchical distributed telecom cloud architecture for live-TV distribution exploiting flexgrid networking and SBVTs is proposed. Its scalability is compared to that of a centralized architecture. Cost savings as high as 32 % are shown. Peer Reviewed

  10. Scalable RFCMOS Model for 90 nm Technology

    Directory of Open Access Journals (Sweden)

    Ah Fatt Tong

    2011-01-01

    Full Text Available This paper presents the formation of the parasitic components that exist in the RF MOSFET structure during its high-frequency operation. The parasitic components are extracted from the transistor's S-parameter measurement, and their geometry dependence is studied with respect to the layout structure. Physical geometry equations are proposed to represent these parasitic components, and by implementing them into the RF model, a scalable RFCMOS model that is valid up to 49.85 GHz is demonstrated. A new verification technique is proposed to verify the quality of the developed scalable RFCMOS model. The proposed technique can shorten the verification time of the scalable RFCMOS model and ensure that the coded scalable model file is error-free and thus more reliable to use.

  11. Scalable-to-lossless transform domain distributed video coding

    DEFF Research Database (Denmark)

    Huang, Xin; Ukhanova, Ann; Veselov, Anton

    2010-01-01

    Distributed video coding (DVC) is a novel approach providing new features as low complexity encoding by mainly exploiting the source statistics at the decoder based on the availability of decoder side information. In this paper, scalable-to-lossless DVC is presented based on extending a lossy...... Transform Domain Wyner-Ziv (TDWZ) distributed video codec with feedback. The lossless coding is obtained by using a reversible integer DCT. Experimental results show that the performance of the proposed scalable-to-lossless TDWZ video codec can outperform alternatives based on the JPEG 2000 standard. The TDWZ...... codec provides frame by frame encoding. Comparing the lossless coding efficiency, the proposed scalable-to-lossless TDWZ video codec can save up to 5%-13% bits compared to JPEG LS and H.264 Intra frame lossless coding and do so as a scalable-to-lossless coding....

  12. Improving the Performance Scalability of the Community Atmosphere Model

    Energy Technology Data Exchange (ETDEWEB)

    Mirin, Arthur [Lawrence Livermore National Laboratory (LLNL); Worley, Patrick H [ORNL

    2012-01-01

    The Community Atmosphere Model (CAM), which serves as the atmosphere component of the Community Climate System Model (CCSM), is the most computationally expensive CCSM component in typical configurations. On current and next-generation leadership class computing systems, the performance of CAM is tied to its parallel scalability. Improving performance scalability in CAM has been a challenge, due largely to algorithmic restrictions necessitated by the polar singularities in its latitude-longitude computational grid. Nevertheless, through a combination of exploiting additional parallelism, implementing improved communication protocols, and eliminating scalability bottlenecks, we have been able to more than double the maximum throughput rate of CAM on production platforms. We describe these improvements and present results on the Cray XT5 and IBM BG/P. The approaches taken are not specific to CAM and may inform similar scalability enhancement activities for other codes.

  13. Parallelism and Scalability in an Image Processing Application

    DEFF Research Database (Denmark)

    Rasmussen, Morten Sleth; Stuart, Matthias Bo; Karlsson, Sven

    2008-01-01

    parallel programs. This paper investigates parallelism and scalability of an embedded image processing application. The major challenges faced when parallelizing the application were to extract enough parallelism from the application and to reduce load imbalance. The application has limited immediately...

  14. Scalable Multiple-Description Image Coding Based on Embedded Quantization

    Directory of Open Access Journals (Sweden)

    Moerman Ingrid

    2007-01-01

    Full Text Available Scalable multiple-description (MD) coding allows for fine-grain rate adaptation as well as robust coding of the input source. In this paper, we present a new approach for scalable MD coding of images, which couples the multiresolution nature of the wavelet transform with the robustness and scalability features provided by embedded multiple-description scalar quantization (EMDSQ). Two coding systems are proposed that rely on quadtree coding to compress the side descriptions produced by EMDSQ. The proposed systems are capable of dynamically adapting the bitrate to the available bandwidth while providing robustness to data losses. Experiments performed under different simulated network conditions demonstrate the effectiveness of the proposed scalable MD approach for image streaming over error-prone channels.

  15. Scalable Multiple-Description Image Coding Based on Embedded Quantization

    Directory of Open Access Journals (Sweden)

    Augustin I. Gavrilescu

    2007-02-01

    Full Text Available Scalable multiple-description (MD) coding allows for fine-grain rate adaptation as well as robust coding of the input source. In this paper, we present a new approach for scalable MD coding of images, which couples the multiresolution nature of the wavelet transform with the robustness and scalability features provided by embedded multiple-description scalar quantization (EMDSQ). Two coding systems are proposed that rely on quadtree coding to compress the side descriptions produced by EMDSQ. The proposed systems are capable of dynamically adapting the bitrate to the available bandwidth while providing robustness to data losses. Experiments performed under different simulated network conditions demonstrate the effectiveness of the proposed scalable MD approach for image streaming over error-prone channels.

  16. Automatic generation of application specific FPGA multicore accelerators

    DEFF Research Database (Denmark)

    Hindborg, Andreas Erik; Schleuniger, Pascal; Jensen, Nicklas Bo

    2014-01-01

    High performance computing systems make increasing use of hardware accelerators to improve performance and power properties. For large high-performance FPGAs to be successfully integrated in such computing systems, methods to raise the abstraction level of FPGA programming are required. In this p...

  17. TriG: Next Generation Scalable Spaceborne GNSS Receiver

    Science.gov (United States)

    Tien, Jeffrey Y.; Okihiro, Brian Bachman; Esterhuizen, Stephan X.; Franklin, Garth W.; Meehan, Thomas K.; Munson, Timothy N.; Robison, David E.; Turbiner, Dmitry; Young, Lawrence E.

    2012-01-01

    TriG is the next generation NASA scalable space GNSS Science Receiver. It will track all GNSS and additional signals (i.e. GPS, GLONASS, Galileo, Compass and DORIS). Its scalable 3U architecture is fully software and firmware reconfigurable, enabling optimization to meet specific mission requirements. TriG GNSS EM is currently undergoing testing and is expected to complete full performance testing later this year.

  18. SDC: Scalable description coding for adaptive streaming media

    OpenAIRE

    Quinlan, Jason J.; Zahran, Ahmed H.; Sreenan, Cormac J.

    2012-01-01

    Video compression techniques enable adaptive media streaming over heterogeneous links to end-devices. Scalable Video Coding (SVC) and Multiple Description Coding (MDC) represent well-known techniques for video compression with distinct characteristics in terms of bandwidth efficiency and resiliency to packet loss. In this paper, we present Scalable Description Coding (SDC), a technique to compromise the tradeoff between bandwidth efficiency and error resiliency without sacrificing user-percei...

  19. Scalable persistent identifier systems for dynamic datasets

    Science.gov (United States)

    Golodoniuc, P.; Cox, S. J. D.; Klump, J. F.

    2016-12-01

    Reliable and persistent identification of objects, whether tangible or not, is essential in information management. Many Internet-based systems have been developed to identify digital data objects, e.g., PURL, LSID, Handle, ARK. These were largely designed for identification of static digital objects. The amount of data made available online has grown exponentially over the last two decades and fine-grained identification of dynamically generated data objects within large datasets using conventional systems (e.g., PURL) has become impractical. We have compared capabilities of various technological solutions to enable resolvability of data objects in dynamic datasets, and developed a dataset-centric approach to resolution of identifiers. This is particularly important in Semantic Linked Data environments where dynamic frequently changing data is delivered live via web services, so registration of individual data objects to obtain identifiers is impractical. We use identifier patterns and pattern hierarchies for identification of data objects, which allows relationships between identifiers to be expressed, and also provides means for resolving a single identifier into multiple forms (i.e. views or representations of an object). The latter can be implemented through (a) HTTP content negotiation, or (b) use of URI querystring parameters. The pattern and hierarchy approach has been implemented in the Linked Data API supporting the United Nations Spatial Data Infrastructure (UNSDI) initiative and later in the implementation of geoscientific data delivery for the Capricorn Distal Footprints project using International Geo Sample Numbers (IGSN). This enables flexible resolution of multi-view persistent identifiers and provides a scalable solution for large heterogeneous datasets.

  20. Microscopic Characterization of Scalable Coherent Rydberg Superatoms

    Directory of Open Access Journals (Sweden)

    Johannes Zeiher

    2015-08-01

    Full Text Available Strong interactions can amplify quantum effects such that they become important on macroscopic scales. Controlling these coherently on a single-particle level is essential for the tailored preparation of strongly correlated quantum systems and opens up new prospects for quantum technologies. Rydberg atoms offer such strong interactions, which lead to extreme nonlinearities in laser-coupled atomic ensembles. As a result, multiple excitation of a micrometer-sized cloud can be blocked while the light-matter coupling becomes collectively enhanced. The resulting two-level system, often called a “superatom,” is a valuable resource for quantum information, providing a collective qubit. Here, we report on the preparation of 2 orders of magnitude scalable superatoms utilizing the large interaction strength provided by Rydberg atoms combined with precise control of an ensemble of ultracold atoms in an optical lattice. The latter is achieved with sub-shot-noise precision by local manipulation of a two-dimensional Mott insulator. We microscopically confirm the superatom picture by in situ detection of the Rydberg excitations and observe the characteristic square-root scaling of the optical coupling with the number of atoms. Enabled by the full control over the atomic sample, including the motional degrees of freedom, we infer the overlap of the produced many-body state with a W state from the observed Rabi oscillations and deduce the presence of entanglement. Finally, we investigate the breakdown of the superatom picture when two Rydberg excitations are present in the system, which leads to dephasing and a loss of coherence.

  1. Microscopic Characterization of Scalable Coherent Rydberg Superatoms

    Science.gov (United States)

    Zeiher, Johannes; Schauß, Peter; Hild, Sebastian; Macrì, Tommaso; Bloch, Immanuel; Gross, Christian

    2015-07-01

    Strong interactions can amplify quantum effects such that they become important on macroscopic scales. Controlling these coherently on a single-particle level is essential for the tailored preparation of strongly correlated quantum systems and opens up new prospects for quantum technologies. Rydberg atoms offer such strong interactions, which lead to extreme nonlinearities in laser-coupled atomic ensembles. As a result, multiple excitation of a micrometer-sized cloud can be blocked while the light-matter coupling becomes collectively enhanced. The resulting two-level system, often called a "superatom," is a valuable resource for quantum information, providing a collective qubit. Here, we report on the preparation of 2 orders of magnitude scalable superatoms utilizing the large interaction strength provided by Rydberg atoms combined with precise control of an ensemble of ultracold atoms in an optical lattice. The latter is achieved with sub-shot-noise precision by local manipulation of a two-dimensional Mott insulator. We microscopically confirm the superatom picture by in situ detection of the Rydberg excitations and observe the characteristic square-root scaling of the optical coupling with the number of atoms. Enabled by the full control over the atomic sample, including the motional degrees of freedom, we infer the overlap of the produced many-body state with a W state from the observed Rabi oscillations and deduce the presence of entanglement. Finally, we investigate the breakdown of the superatom picture when two Rydberg excitations are present in the system, which leads to dephasing and a loss of coherence.

  2. Myria: Scalable Analytics as a Service

    Science.gov (United States)

    Howe, B.; Halperin, D.; Whitaker, A.

    2014-12-01

    At the UW eScience Institute, we're working to empower non-experts, especially in the sciences, to write and use data-parallel algorithms. To this end, we are building Myria, a web-based platform for scalable analytics and data-parallel programming. Myria's internal model of computation is the relational algebra extended with iteration, such that every program is inherently data-parallel, just as every query in a database is inherently data-parallel. But unlike databases, iteration is a first class concept, allowing us to express machine learning tasks, graph traversal tasks, and more. Programs can be expressed in a number of languages and can be executed on a number of execution environments, but we emphasize a particular language called MyriaL that supports both imperative and declarative styles and a particular execution engine called MyriaX that uses an in-memory column-oriented representation and asynchronous iteration. We deliver Myria over the web as a service, providing an editor, performance analysis tools, and catalog browsing features in a single environment. We find that this web-based "delivery vector" is critical in reaching non-experts: they are insulated from irrelevant technical work associated with installation, configuration, and resource management. The MyriaX backend, one of several execution runtimes we support, is a main-memory, column-oriented, RDBMS-on-the-worker system that supports cyclic data flows as a first-class citizen and has been shown to outperform competitive systems on 100-machine cluster sizes. I will describe the Myria system, give a demo, and present some new results in large-scale oceanographic microbiology.

  3. Memory-Scalable GPU Spatial Hierarchy Construction.

    Science.gov (United States)

    Qiming Hou; Xin Sun; Kun Zhou; Lauterbach, C; Manocha, D

    2011-04-01

    Recent GPU algorithms for constructing spatial hierarchies have achieved promising performance for moderately complex models by using the breadth-first search (BFS) construction order. While being able to exploit the massive parallelism on the GPU, the BFS order also consumes excessive GPU memory, which becomes a serious issue for interactive applications involving very complex models with more than a few million triangles. In this paper, we propose to use the partial breadth-first search (PBFS) construction order to control memory consumption while maximizing performance. We apply the PBFS order to two hierarchy construction algorithms. The first algorithm is for kd-trees that automatically balances between the level of parallelism and intermediate memory usage. With PBFS, peak memory consumption during construction can be efficiently controlled without costly CPU-GPU data transfer. We also develop memory allocation strategies to effectively limit memory fragmentation. The resulting algorithm scales well with GPU memory and constructs kd-trees of models with millions of triangles at interactive rates on GPUs with 1 GB memory. Compared with existing algorithms, our algorithm is an order of magnitude more scalable for a given GPU memory bound. The second algorithm is for out-of-core bounding volume hierarchy (BVH) construction for very large scenes based on the PBFS construction order. At each iteration, all constructed nodes are dumped to the CPU memory, and the GPU memory is freed for the next iteration's use. In this way, the algorithm is able to build trees that are too large to be stored in the GPU memory. Experiments show that our algorithm can construct BVHs for scenes with up to 20 M triangles, several times larger than previous GPU algorithms.

  4. Delayless acceleration measurement method for motion control applications

    Energy Technology Data Exchange (ETDEWEB)

    Vaeliviita, S.; Ovaska, S.J. [Helsinki University of Technology, Otaniemi (Finland). Institute of Intelligent Power Electronics

    1997-12-31

    Delayless and accurate sensing of angular acceleration can improve the performance of motion control in motor drives. Acceleration control is, however, seldom implemented in practical drive systems due to prohibitively high costs or unsatisfactory results of most acceleration measurement methods. In this paper we propose an efficient and accurate acceleration measurement method based on direct differentiation of the corresponding velocity signal. Polynomial predictive filtering is used to smooth the resulting noisy signal without delay. This type of prediction is justified by noticing that a low-degree polynomial can usually be fitted into the primary acceleration curve. No additional hardware is required to implement the procedure if the velocity signal is already available. The performance of the acceleration measurement method is evaluated by applying it to a demanding motion control application. (orig.) 12 refs.
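
    A minimal sketch of the idea in the abstract above: differentiate the velocity signal, then smooth the noisy result by fitting a low-degree polynomial to the most recent samples and evaluating the fit at the newest sample, so that no lag is introduced for signals the polynomial can represent. The first-degree fit, the window length of 8, and all function names are assumptions made for illustration; the paper's actual filter design is not given in this record.

```python
# Sketch under assumptions: first-degree least-squares fit over a sliding
# window, evaluated at the newest sample (not the paper's exact filter).

def backward_difference(velocity, dt):
    """Raw acceleration estimate from consecutive velocity samples."""
    return [(velocity[i] - velocity[i - 1]) / dt
            for i in range(1, len(velocity))]

def linear_fit_at_newest(window):
    """Least-squares line through the window, evaluated at its last point."""
    n = len(window)
    x_mean = (n - 1) / 2.0
    y_mean = sum(window) / n
    sxx = sum((x - x_mean) ** 2 for x in range(n))
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(window))
    slope = sxy / sxx
    return y_mean + slope * ((n - 1) - x_mean)

def smooth_acceleration(velocity, dt, window_len=8):
    """Return (raw, smoothed) acceleration; window_len must be at least 2."""
    raw = backward_difference(velocity, dt)
    smoothed = []
    for i in range(len(raw)):
        if i + 1 < window_len:
            # Not enough history yet: pass the raw estimate through.
            smoothed.append(raw[i])
        else:
            smoothed.append(linear_fit_at_newest(raw[i + 1 - window_len:i + 1]))
    return raw, smoothed

if __name__ == "__main__":
    dt = 0.01
    # Velocity with linearly increasing acceleration plus alternating noise.
    velocity = [0.5 * 3.0 * (i * dt) ** 2 + 1e-4 * (-1) ** i for i in range(50)]
    raw, smoothed = smooth_acceleration(velocity, dt)
    print(raw[-1], smoothed[-1])
```

    Because the fit is evaluated at the newest sample rather than at the window centre, a linearly changing acceleration passes through unchanged while sample-to-sample noise is attenuated.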

  5. PACE: A dynamic programming algorithm for hardware/software partitioning

    DEFF Research Database (Denmark)

    Knudsen, Peter Voigt; Madsen, Jan

    1996-01-01

    This paper presents the PACE partitioning algorithm which is used in the LYCOS co-synthesis system for partitioning control/dataflow graphs into hardware and software parts. The algorithm is a dynamic programming algorithm which solves both the problem of minimizing system execution time with a hardware area constraint and the problem of minimizing hardware area with a system execution time constraint. The target architecture consists of a single microprocessor and a single hardware chip (ASIC, FPGA, etc.) which are connected by a communication channel. The algorithm incorporates a realistic communication model and thus attempts to minimize communication overhead. The time-complexity of the algorithm is O(n²·𝒜) and the space-complexity is O(n·𝒜) where 𝒜 is the total area of the hardware chip and n the number of code fragments which may be placed in either hardware or software...
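
    To illustrate the kind of dynamic program the abstract above refers to, the sketch below minimizes total execution time under a hardware area constraint using a knapsack-style table indexed by area. It is not the PACE algorithm itself: it deliberately ignores the communication model (which is what accounts for the extra factor of n in the stated time complexity), so it runs in O(n·𝒜) time. The block tuples and the function name are hypothetical.

```python
# Simplified illustration, NOT the PACE algorithm: a knapsack-style dynamic
# program that ignores communication costs between hardware and software.

def partition(blocks, area_limit):
    """Choose which blocks go to hardware.

    blocks: list of (sw_time, hw_time, hw_area) with integer hw_area.
    Returns (minimal total time, set of block indices placed in hardware).
    """
    n = len(blocks)
    # best[a]: minimal total time of the blocks seen so far using <= a area.
    best = [0] * (area_limit + 1)
    # choice[i][a]: True if block i is in hardware in the optimum for budget a.
    choice = [[False] * (area_limit + 1) for _ in range(n)]
    for i, (sw_time, hw_time, hw_area) in enumerate(blocks):
        new = [0] * (area_limit + 1)
        for a in range(area_limit + 1):
            new[a] = best[a] + sw_time          # block i stays in software
            if hw_area <= a and best[a - hw_area] + hw_time < new[a]:
                new[a] = best[a - hw_area] + hw_time  # block i in hardware
                choice[i][a] = True
        best = new
    # Walk the decisions backwards to recover the hardware set.
    in_hardware = set()
    a = area_limit
    for i in range(n - 1, -1, -1):
        if choice[i][a]:
            in_hardware.add(i)
            a -= blocks[i][2]
    return best[area_limit], in_hardware

# Hypothetical example data: (software time, hardware time, hardware area).
blocks = [(10, 2, 4), (8, 3, 3), (6, 5, 1), (7, 1, 5)]
total_time, in_hardware = partition(blocks, area_limit=8)
print(total_time, sorted(in_hardware))
```

    The dual problem mentioned in the abstract (minimizing area under a time constraint) can be read off the same table by scanning for the smallest area budget whose entry meets the time bound.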

  6. Expert System analysis of non-fuel assembly hardware and spent fuel disassembly hardware: Its generation and recommended disposal

    Energy Technology Data Exchange (ETDEWEB)

    Williamson, Douglas Alan [Univ. of Florida, Gainesville, FL (United States)

    1991-01-01

    Almost all of the effort being expended on radioactive waste disposal in the United States is being focused on the disposal of spent nuclear fuel, with little consideration for other areas that will have to be disposed of in the same facilities. One area of radioactive waste that has not been addressed adequately because it is considered a secondary part of the waste issue is the disposal of the various Non-Fuel Bearing Components of the reactor core. These hardware components fall somewhat arbitrarily into two categories: Non-Fuel Assembly (NFA) hardware and Spent Fuel Disassembly (SFD) hardware. This work provides a detailed examination of the generation and disposal of NFA hardware and SFD hardware by the nuclear utilities of the United States as it relates to the Civilian Radioactive Waste Management Program. All available sources of data on NFA and SFD hardware are analyzed with particular emphasis given to the Characteristics Data Base developed by Oak Ridge National Laboratory and the characterization work performed by Pacific Northwest Laboratories and Rochester Gas & Electric. An Expert System developed as a portion of this work is used to assist in the prediction of quantities of NFA hardware and SFD hardware that will be generated by the United States' utilities. Finally, the hardware waste management practices of the United Kingdom, France, Germany, Sweden, and Japan are studied for possible application to the disposal of domestic hardware wastes. As a result of this work, a general classification scheme for NFA and SFD hardware was developed. Only NFA and SFD hardware constructed of zircaloy and experiencing a burnup of less than 70,000 MWD/MTIHM and PWR control rods constructed of stainless steel are considered Low-Level Waste. All other hardware is classified as Greater-Than-Class-C waste.

  7. GRAAL : A framework for low-power 3D graphics accelerators

    NARCIS (Netherlands)

    Juurlink, B.; Antochi, I.; Crisu, D.; Cotofana, S.; Vassiliadis, S.

    2008-01-01

    The GRAphics AcceLerator (GRAAL) design-exploration framework is an open system that offers a coherent development methodology for hardware/software cosimulation and codesign of embedded 3D graphics accelerators. GRAAL incorporates tools to help visually debug graphics algorithms implemented in

  8. Why Open Source Hardware matters and why you should care

    OpenAIRE

    Gürkaynak, Frank K.

    2017-01-01

    Open source hardware is currently where open source software was about 30 years ago. The idea is well received by enthusiasts, there is interest and the open source hardware has gained visible momentum recently, with several well-known universities including UC Berkeley, Cambridge and ETH Zürich actively working on large projects involving open source hardware, attracting the attention of companies big and small. But it is still not quite there yet. In this talk, based on my experience on the...

  9. Hardware-Enabled Security Through On-Chip Reconfigurable Fabric

    Science.gov (United States)

    2016-02-05

    The goal of this project was to enable hardware-based security techniques on future microprocessors in a way that they...can be added and updated after fabrication, similar to software, while maintaining the efficiency and the security of hardware. For this purpose, the...Mar-2011 31-May-2014 Approved for Public Release; Distribution Unlimited Final Report: Hardware-Enabled Security Through On-Chip Reconfigurable Fabric

  10. Accelerating Science Driven System Design With RAMP

    Energy Technology Data Exchange (ETDEWEB)

    Wawrzynek, John [Univ. of California, Berkeley, CA (United States)

    2015-05-01

    Researchers from UC Berkeley, in collaboration with the Lawrence Berkeley National Lab, are engaged in developing an Infrastructure for Synthesis with Integrated Simulation (ISIS). The ISIS Project was a cooperative effort for “application-driven hardware design” that engages application scientists in the early parts of the hardware design process for future generation supercomputing systems. This project served to foster development of computing systems that are better tuned to the application requirements of demanding scientific applications and result in more cost-effective and efficient HPC system designs. In order to overcome long conventional design-cycle times, we leveraged reconfigurable devices to aid in the design of high-efficiency systems, including conventional multi- and many-core systems. The resulting system emulation/prototyping environment, in conjunction with the appropriate intermediate abstractions, provided both a convenient user programming experience and retained flexibility, and thus efficiency, of a reconfigurable platform. We initially targeted the Berkeley RAMP system (Research Accelerator for Multiple Processors) as that hardware emulation environment to facilitate and ultimately accelerate the iterative process of science-driven system design. Our goal was to develop and demonstrate a design methodology for domain-optimized computer system architectures. The tangible outcome is a methodology and tools for rapid prototyping and design-space exploration, leading to highly optimized and efficient HPC systems.

  11. Reliable software for unreliable hardware a cross layer perspective

    CERN Document Server

    Rehman, Semeen; Henkel, Jörg

    2016-01-01

    This book describes novel software concepts to increase reliability under user-defined constraints. The authors’ approach bridges, for the first time, the reliability gap between hardware and software. Readers will learn how to achieve increased soft error resilience on unreliable hardware, while exploiting the inherent error masking characteristics and the error mitigation potential (for errors stemming from soft errors, aging, and process variations) at different software layers. · Provides a comprehensive overview of reliability modeling and optimization techniques at different hardware and software levels; · Describes novel optimization techniques for software cross-layer reliability, targeting unreliable hardware.

  12. NIOS II processor-based acceleration of motion compensation techniques

    Science.gov (United States)

    González, Diego; Botella, Guillermo; Mookherjee, Soumak; Meyer-Bäse, Uwe; Meyer-Bäse, Anke

    2011-06-01

    This paper focuses on the hardware acceleration of motion compensation techniques suitable for MPEG video compression. A range of representative motion estimation search algorithms and new perspectives are introduced. The methods and designs described here are suited to medical imaging, where larger images are involved. The structure of the processing systems considered is a good fit for reconfigurable acceleration. The system is based on an FPGA platform using the Nios II microprocessor with C2H acceleration. The paper reports results in terms of performance and resources needed.

  13. Accelerating Wavelet-Based Video Coding on Graphics Hardware using CUDA

    NARCIS (Netherlands)

    Laan, Wladimir J. van der; Roerdink, Jos B.T.M.; Jalba, Andrei C.; Zinterhof, P; Loncaric, S; Uhl, A; Carini, A

    2009-01-01

    The Discrete Wavelet Transform (DWT) has a wide range of applications from signal processing to video and image compression. This transform, by means of the lifting scheme, can be performed in a memory and computation efficient way on modern, programmable GPUs, which can be regarded as massively
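
    For readers unfamiliar with the lifting scheme mentioned above, here is a minimal CPU-side sketch of one lifting level: split the signal into even and odd samples, predict the odd samples from the even ones, then update the even samples with the resulting details. The Haar wavelet is assumed purely because it is the simplest case; the record does not state which wavelet the paper uses, and this sketch says nothing about the GPU mapping.

```python
# Sketch of one lifting level using the Haar wavelet (an assumption chosen
# for simplicity; not necessarily the wavelet used in the paper).

def haar_lift_forward(signal):
    """One lifting level; len(signal) must be even.

    Returns (approximation, detail) coefficient lists.
    """
    even = list(signal[0::2])
    odd = list(signal[1::2])
    # Predict step: each odd sample is predicted by its even neighbour.
    detail = [o - e for o, e in zip(odd, even)]
    # Update step: adjust even samples so they hold the pairwise averages.
    approx = [e + d / 2 for e, d in zip(even, detail)]
    return approx, detail

def haar_lift_inverse(approx, detail):
    """Undo the lifting steps in reverse order with flipped signs."""
    even = [a - d / 2 for a, d in zip(approx, detail)]
    odd = [d + e for d, e in zip(detail, even)]
    signal = []
    for e, o in zip(even, odd):
        signal.extend([e, o])
    return signal

samples = [4, 6, 10, 12, 8, 8]
approx, detail = haar_lift_forward(samples)
restored = haar_lift_inverse(approx, detail)
print(approx, detail, restored)
```

    Each lifting step only reads one channel to modify the other, which is why the transform can be computed in place and inverted exactly by replaying the steps backwards.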

  14. Real-time medical video processing, enabled by hardware accelerated correlations

    DEFF Research Database (Denmark)

    Savarimuthu, T. R.; Kjaer-Nielsen, A.; Sorensen, A. S.

    2011-01-01

    Image processing involving correlation based filter algorithms have proved extremely useful for image enhancement, feature extraction and recognition, in a wide range of medical applications, but is almost exclusively used with still images due to the amount of computations required by the correl...

  15. Efficient Sphere Detector Algorithm for Massive MIMO using GPU Hardware Accelerator

    KAUST Repository

    Arfaoui, Mohamed-Amine

    2016-06-01

    To further enhance the capacity of next generation wireless communication systems, massive MIMO has recently appeared as a necessary enabling technology to achieve high performance signal processing for large-scale multiple antennas. However, massive MIMO systems inevitably generate signal processing overheads, which translate into an ever-increasing rate of complexity, and therefore such systems may not maintain the inherent real-time requirement of wireless systems. We redesign the non-linear sphere decoder method to increase the performance of the system, cast most memory-bound computations into compute-bound operations to reduce the overall complexity, and maintain the real-time processing thanks to the GPU computational power. We show a comprehensive complexity and performance analysis on an unprecedented MIMO system scale, which can ease the design phase toward simulating future massive MIMO wireless systems.

  16. A Hardware-Accelerated Fast Adaptive Vortex-Based Flow Simulation Software Project

    Data.gov (United States)

    National Aeronautics and Space Administration — Applied Scientific Research has recently developed a Lagrangian vortex-boundary element method for the grid-free simulation of unsteady incompressible...

  17. Wire like link for cycle reproducible and cycle accurate hardware accelerator

    Science.gov (United States)

    Asaad, Sameh; Kapur, Mohit; Parker, Benjamin D

    2015-04-07

    First and second field programmable gate arrays are provided which implement first and second blocks of a circuit design to be simulated. The field programmable gate arrays are operated at a first clock frequency and a wire like link is provided to send a plurality of signals between them. The wire like link includes a serializer, on the first field programmable gate array, to serialize the plurality of signals; a deserializer on the second field programmable gate array, to deserialize the plurality of signals; and a connection between the serializer and the deserializer. The serializer and the deserializer are operated at a second clock frequency, greater than the first clock frequency, and the second clock frequency is selected such that latency of transmission and reception of the plurality of signals is less than the period corresponding to the first clock frequency.
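The clock-selection condition in this abstract reduces to simple arithmetic. The sketch below assumes a single serial lane that moves one signal bit per link clock cycle, plus a fixed number of overhead cycles for framing and pipeline delay. Neither assumption comes from the record, and the function and parameter names are hypothetical.

```python
def link_meets_timing(f_design_hz, f_link_hz, n_signals, overhead_cycles=0):
    """True if serialising n_signals fits inside one design-clock period.

    Latency is (n_signals + overhead_cycles) / f_link_hz and must be strictly
    less than 1 / f_design_hz. Cross-multiplying avoids floating-point division.
    """
    cycles_needed = n_signals + overhead_cycles
    return cycles_needed * f_design_hz < f_link_hz


def min_link_clock_hz(f_design_hz, n_signals, overhead_cycles=0):
    """Lower bound on the link clock; the actual clock must exceed this value."""
    return (n_signals + overhead_cycles) * f_design_hz


# Example: 32 signals, 8 overhead cycles, 10 MHz design clock.
bound = min_link_clock_hz(10_000_000, 32, 8)
ok = link_meets_timing(10_000_000, 500_000_000, 32, 8)
```

Under these assumptions the link clock must run more than 40 times faster than the design clock, which shows why the second clock frequency has to be chosen well above the first.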

  18. A Hardware Track Finder for ATLAS Trigger

    CERN Document Server

    Volpi, G; The ATLAS collaboration; Andreazza, A; Citterio, M; Favareto, A; Liberali, V; Meroni, C; Riva, M; Sabatini, F; Stabile, A; Annovi, A; Beretta, M; Castegnaro, A; Bevacqua, V; Crescioli, F; Francesco, C; Dell'Orso, M; Giannetti, P; Magalotti, D; Piendibene, M; Roda, C; Sacco, I; Tripiccione, R; Fabbri, L; Franchini, M; Giorgi, F; Giannuzzi, F; Lasagni, F; Sbarra, C; Valentinetti, S; Villa, M; Zoccoli, A; Lanza, A; Negri, A; Vercesi, V; Bogdan, M; Boveia, A; Canelli, F; Cheng, Y; Dunford, M; Li, H L; Kapliy, A; Kim, Y K; Melachrinos, C; Shochet, M; Tang, F; Tang, J; Tuggle, J; Tompkins, L; Webster, J; Atkinson, M; Cavaliere, V; Chang, P; Kasten, M; McCarn, A; Neubauer, M; Hoff, J; Liu, T; Okumura, Y; Olsen, J; Penning, B; Todri, A; Wu, J; Drake, G; Proudfoot, J; Zhang, J; Blair, R; Anderson, J; Auerbach, B; Blazey, G; Kimura, N; Yorita, K; Sakurai, Y; Mitani, T; Iizawa, T

    2012-01-01

    The existing three level ATLAS trigger system is deployed to reduce the event rate from the bunch crossing rate of 40 MHz to ~400 Hz for permanent storage at the LHC design luminosity of 10^34 cm^-2 s^-1. When the LHC reaches beyond the design luminosity, the load on the Level-2 trigger system will significantly increase due to both the need for more sophisticated algorithms to suppress background and the larger event sizes. The Fast TracKer (FTK) is a custom electronics system that will operate at the full Level-1 accepted rate of 100 kHz and provide high quality tracks at the beginning of processing in the Level-2 trigger, by performing track reconstruction in hardware with massive parallelism of associative memories and FPGAs. The performance in important physics areas including b-tagging, tau-tagging and lepton isolation will be demonstrated with the ATLAS MC simulation at different LHC luminosities. The system design will be overviewed. The latest R&D progress of individual components...

  19. Mechanics of Granular Materials labeled hardware

    Science.gov (United States)

    2000-01-01

    Mechanics of Granular Materials (MGM) flight hardware takes two twin double locker assemblies in the Space Shuttle middeck or the Spacehab module. Sand and soil grains have faces that can cause friction as they roll and slide against each other, or even cause sticking and form small voids between grains. This complex behavior can cause soil to behave like a liquid under certain conditions such as earthquakes or when powders are handled in industrial processes. MGM experiments aboard the Space Shuttle use the microgravity of space to simulate this behavior under conditions that cannot be achieved in laboratory tests on Earth. MGM is shedding light on the behavior of fine-grain materials under low effective stresses. Applications include earthquake engineering, granular flow technologies (such as powder feed systems for pharmaceuticals and fertilizers), and terrestrial and planetary geology. Nine MGM specimens have flown on two Space Shuttle flights. Another three are scheduled to fly on STS-107. The principal investigator is Stein Sture of the University of Colorado at Boulder. (Credit: NASA/MSFC).

  20. Hardware platform for multiple mobile robots

    Science.gov (United States)

    Parzhuber, Otto; Dolinsky, D.

    2004-12-01

    This work is concerned with software and communications architectures that might facilitate the operation of several mobile robots. The vehicles should be remotely piloted or tele-operated via a wireless link between the operator and the vehicles. The wireless link will carry control commands from the operator to the vehicle, telemetry data from the vehicle back to the operator and frequently also a real-time video stream from an on-board camera. For autonomous driving the link will carry commands and data between the vehicles. For this purpose we have developed a hardware platform which consists of a powerful microprocessor, different sensors, a stereo camera and a Wireless Local Area Network (WLAN) for communication. The adoption of the IEEE 802.11 standard for the physical and access layer protocols allows a straightforward integration with the internet protocols TCP/IP. For the inspection of the environment the robots are equipped with a wide variety of sensors such as ultrasonic and infrared proximity sensors and a small inertial measurement unit. Stereo cameras make it feasible to detect obstacles, measure distance and create a map of the room.

  1. Nanorobot Hardware Architecture for Medical Defense.

    Science.gov (United States)

    Cavalcanti, Adriano; Shirinzadeh, Bijan; Zhang, Mingjun; Kretly, Luiz C

    2008-05-06

    This work presents a new approach with details on the integrated platform and hardware architecture for nanorobot applications in epidemic control, which should enable real time in vivo prognosis of biohazard infection. The recent developments in the field of nanoelectronics, with transducers progressively shrinking down to smaller sizes through nanotechnology and carbon nanotubes, are expected to result in innovative biomedical instrumentation possibilities, with new therapies and efficient diagnosis methodologies. The use of integrated systems, smart biosensors, and programmable nanodevices is advancing nanoelectronics, enabling the progressive research and development of molecular machines. It should provide high precision pervasive biomedical monitoring with real time data transmission. The use of nanobioelectronics as embedded systems is the natural pathway towards a manufacturing methodology that brings nanorobot applications out of laboratories as soon as possible. To demonstrate the practical application of medical nanorobotics, a 3D simulation based on clinical data addresses how to integrate communication with nanorobots using RFID, mobile phones, and satellites, applied to long distance ubiquitous surveillance and health monitoring for troops in conflict zones. Therefore, the current model can also be used to protect a population against a targeted epidemic disease.

  2. Nanorobot Hardware Architecture for Medical Defense

    Directory of Open Access Journals (Sweden)

    Luiz C. Kretly

    2008-05-01

    This work presents a new approach with details on the integrated platform and hardware architecture for nanorobot applications in epidemic control, which should enable real time in vivo prognosis of biohazard infection. The recent developments in the field of nanoelectronics, with transducers progressively shrinking down to smaller sizes through nanotechnology and carbon nanotubes, are expected to result in innovative biomedical instrumentation possibilities, with new therapies and efficient diagnosis methodologies. The use of integrated systems, smart biosensors, and programmable nanodevices is advancing nanoelectronics, enabling the progressive research and development of molecular machines. It should provide high precision pervasive biomedical monitoring with real time data transmission. The use of nanobioelectronics as embedded systems is the natural pathway towards a manufacturing methodology that brings nanorobot applications out of laboratories as soon as possible. To demonstrate the practical application of medical nanorobotics, a 3D simulation based on clinical data addresses how to integrate communication with nanorobots using RFID, mobile phones, and satellites, applied to long distance ubiquitous surveillance and health monitoring for troops in conflict zones. Therefore, the current model can also be used to protect a population against a targeted epidemic disease.

  3. Employing ISRU Models to Improve Hardware Design

    Science.gov (United States)

    Linne, Diane L.

    2010-01-01

    An analytical model for hydrogen reduction of regolith was used to investigate the effects of several key variables on the energy and mass performance of reactors for a lunar in-situ resource utilization oxygen production plant. Reactor geometry, reaction time, number of reactors, heat recuperation, heat loss, and operating pressure were all studied to guide hardware designers who are developing future prototype reactors. The effects of heat recuperation where the incoming regolith is pre-heated by the hot spent regolith before transfer was also investigated for the first time. In general, longer reaction times per batch provide a lower overall energy, but also result in larger and heavier reactors. Three reactors with long heat-up times results in similar energy requirements as a two-reactor system with all other parameters the same. Three reactors with heat recuperation results in energy reductions of 20 to 40 percent compared to a three-reactor system with no heat recuperation. Increasing operating pressure can provide similar energy reductions as heat recuperation for the same reaction times.

  4. Flow testing rear face hardware combinations

    Energy Technology Data Exchange (ETDEWEB)

    Haun, F.E. Jr.

    1962-06-01

    The purpose of these tests is to provide necessary laboratory data in support of an R,PEO program in determining the energy loss associated with various hardware size combinations on the rear face of the B-D-F reactors. The original method used to check for critical flow was determined to be faulty. A revised method demonstrated critical flow did occur in the 5/8-inch inconel connector and combination 1 fittings. The remaining fitting combinations with the 5/8-inch inconel and 3/4-inch aluminum connector were not rechecked because of the reaming of the I.D. to permit the continuation of the original tests. During test number 6, audible cavitation was heard with the highest severity at a point midway between pressure points 3 and 4 on the connector. This condition appeared again in tests 6A, 7, and 7A, with incipient cavitation at approximately 40 gpm in each test, regardless of the rear header pressure and/or temperature.

  5. Hardware Architectures for Data-Intensive Computing Problems: A Case Study for String Matching

    Energy Technology Data Exchange (ETDEWEB)

    Tumeo, Antonino; Villa, Oreste; Chavarría-Miranda, Daniel

    2012-12-28

    DNA analysis is an emerging application of high performance bioinformatics. Modern sequencing machinery is able to provide, in a few hours, large input streams of data, which need to be matched against exponentially growing databases of known fragments. The ability to recognize these patterns effectively and quickly may allow extending the scale and the reach of the investigations performed by biology scientists. Aho-Corasick is an exact, multiple pattern matching algorithm often at the base of this application. High performance systems are a promising platform to accelerate this algorithm, which is computationally intensive but also inherently parallel. Nowadays, high performance systems also include heterogeneous processing elements, such as Graphic Processing Units (GPUs), to further accelerate parallel algorithms. Unfortunately, the Aho-Corasick algorithm exhibits large performance variability, depending on the size of the input streams, on the number of patterns to search and on the number of matches, and poses significant challenges on current high performance software and hardware implementations. An adequate mapping of the algorithm on the target architecture, coping with the limits of the underlying hardware, is required to reach the desired high throughputs. In this paper, we discuss the implementation of the Aho-Corasick algorithm for GPU-accelerated high performance systems. We present an optimized implementation of Aho-Corasick for GPUs and discuss its tradeoffs on the Tesla T10 and the new Tesla T20 (codename Fermi) GPUs. We then integrate the optimized GPU code, respectively, in an MPI-based and in a pthreads-based load balancer to enable execution of the algorithm on clusters and large shared-memory multiprocessors (SMPs) accelerated with multiple GPUs.
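For readers unfamiliar with Aho-Corasick, the following is a minimal single-threaded sketch of the textbook algorithm: a trie of patterns, failure links computed breadth-first, then one pass over the text. It is not the GPU implementation the abstract describes, and the function names are hypothetical.

```python
from collections import deque


def build_automaton(patterns):
    """Build the goto trie, failure links and output lists for the patterns."""
    goto, fail, out = [{}], [0], [[]]
    for pat in patterns:
        state = 0
        for ch in pat:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append(pat)
    # Breadth-first pass: depth-1 states keep failure link 0 (the root).
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            # Inherit the matches that end at the failure target.
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def search(text, automaton):
    """Return (start_index, pattern) for every match, in order of end position."""
    goto, fail, out = automaton
    state, hits = 0, []
    for i, ch in enumerate(text):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for pat in out[state]:
            hits.append((i - len(pat) + 1, pat))
    return hits


hits = search("ushers", build_automaton(["he", "she", "his", "hers"]))
```

The inner loops show where the performance variability noted in the abstract comes from: the work per input character depends on how many failure links are followed and how many matches are reported.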

  6. Progress Report 2008: A Scalable and Extensible Earth System Model for Climate Change Science

    Energy Technology Data Exchange (ETDEWEB)

    Drake, John B [ORNL; Worley, Patrick H [ORNL; Hoffman, Forrest M [ORNL; Jones, Phil [Los Alamos National Laboratory (LANL)

    2009-01-01

    This project employs multi-disciplinary teams to accelerate development of the Community Climate System Model (CCSM), based at the National Center for Atmospheric Research (NCAR). A consortium of eight Department of Energy (DOE) National Laboratories collaborate with NCAR and the NASA Global Modeling and Assimilation Office (GMAO). The laboratories are Argonne (ANL), Brookhaven (BNL), Los Alamos (LANL), Lawrence Berkeley (LBNL), Lawrence Livermore (LLNL), Oak Ridge (ORNL), Pacific Northwest (PNNL) and Sandia (SNL). The work plan focuses on scalability for petascale computation and extensibility to a more comprehensive earth system model. Our stated goal is to support the DOE mission in climate change research by helping ... To determine the range of possible climate changes over the 21st century and beyond through simulations using a more accurate climate system model that includes the full range of human and natural climate feedbacks with increased realism and spatial resolution.

  7. Sindbad: a multi-purpose and scalable X-ray simulation tool for NDE and medical imaging

    Energy Technology Data Exchange (ETDEWEB)

    Guillemaud, R.; Tabary, J.; Hugonnard, P.; Mathy, F.; Koenig, A.; Gliere, A

    2003-07-01

    In a unified framework, Sindbad is a multipurpose X-ray simulation software which provides a scalable approach to computation and very efficient results by combining analytical and Monte Carlo simulations. The software has been validated experimentally. It is also easy to use, with a strong emphasis on a user-friendly GUI, simple description of objects (CAD or volume) and visualization tools. The next developments will focus on acceleration of Monte Carlo simulation for scatter fraction computation and the addition of new types of detector. (N.C.)

  8. Ultrasound and clinical evaluation of soft-tissue versus hardware biceps tenodesis: is hardware tenodesis worth the cost?

    Science.gov (United States)

    Elkousy, Hussein; Romero, Jose A; Edwards, T Bradley; Gartsman, Gary M; O'Connor, Daniel P

    2014-02-01

    This study assesses the failure rate of soft-tissue versus hardware fixation of biceps tenodesis by ultrasound to determine if the expense of a hardware tenodesis technique is warranted. Seventy-two patients that underwent arthroscopic biceps tenodesis over a 3-year period were evaluated using postoperative ultrasonography and clinical examination. The tenodesis technique employed was either a soft-tissue technique with sutures or an interference screw technique using hardware based on surgeon preference. Patient age was 57.9 years on average with ultrasound and clinical examination done at an average of 9.3 months postoperatively. Thirty-one patients had a hardware technique and 41 a soft-tissue technique. Overall, 67.7% of biceps tenodesis done with hardware were intact, compared with 75.6% for the soft-tissue technique by ultrasound (P = .46). Clinical evaluation indicated that 80.7% of hardware techniques and 78% of soft-tissue techniques were intact. Average material cost to the hospital for the hardware technique was $514.32, compared with $32.05 for the soft-tissue technique. Biceps tenodesis success, as determined by clinical deformity and ultrasound, was not improved using hardware as compared to soft-tissue techniques. Soft-tissue techniques are equally efficacious and more cost effective than hardware techniques.
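The reported P value can be checked from the figures in this abstract. The counts below are inferred from the stated percentages (21 of 31 hardware and 31 of 41 soft-tissue tenodeses intact on ultrasound), and the choice of an uncorrected chi-square test is an assumption, since the abstract does not name the test used.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (chi2, p). With one degree of freedom the upper-tail probability
    of chi2 equals erfc(sqrt(chi2 / 2)).
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


# Rows: hardware, soft tissue. Columns: intact, not intact (inferred counts).
chi2, p = chi_square_2x2(21, 10, 31, 10)
print(f"chi2 = {chi2:.3f}, p = {p:.2f}")
```

Under these assumptions the result agrees with the reported P = .46.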

  9. An acceleration framework for synthetic aperture radar algorithms

    Science.gov (United States)

    Kim, Youngsoo; Gloster, Clay S.; Alexander, Winser E.

    2017-04-01

    Algorithms for radar signal processing, such as Synthetic Aperture Radar (SAR), are computationally intensive and require considerable execution time on a general purpose processor. Reconfigurable logic can be used to off-load the primary computational kernel onto a custom computing machine in order to reduce execution time by an order of magnitude as compared to kernel execution on a general purpose processor. Specifically, Field Programmable Gate Arrays (FPGAs) can be used to accelerate these kernels using hardware-based custom logic implementations. In this paper, we demonstrate a framework for algorithm acceleration. We used SAR as a case study to illustrate the potential for algorithm acceleration offered by FPGAs. Initially, we profiled the SAR algorithm and implemented a homomorphic filter using a hardware implementation of the natural logarithm. Experimental results show a linear speedup from adding reasonably small processing elements in an FPGA, as opposed to using a software implementation running on a typical general purpose processor.

  10. PCIU: Hardware Implementations of an Efficient Packet Classification Algorithm with an Incremental Update Capability

    Directory of Open Access Journals (Sweden)

    O. Ahmed

    2011-01-01

    Packet classification plays a crucial role for a number of network services such as policy-based routing, firewalls, and traffic billing, to name a few. However, classification can be a bottleneck in the above-mentioned applications if not implemented properly and efficiently. In this paper, we propose PCIU, a novel classification algorithm, which improves upon previously published work. PCIU provides lower preprocessing time, lower memory consumption, ease of incremental rule update, and reasonable classification time compared to state-of-the-art algorithms. The proposed algorithm was evaluated and compared to RFC and HiCut using several benchmarks. Results obtained indicate that PCIU outperforms these algorithms in terms of speed, memory usage, incremental update capability, and preprocessing time. The algorithm, furthermore, was improved and made more accessible for a variety of applications through implementation in hardware. Two such implementations are detailed and discussed in this paper. The results indicate that a hardware/software codesign approach results in a slower, but easier to optimize and improve within time constraints, PCIU solution. A hardware accelerator based on an ESL approach using Handel-C, on the other hand, resulted in a 31x speed-up over a pure software implementation running on a state of the art Xeon processor.

  11. Using inverted indices for accelerating LINGO calculations.

    Science.gov (United States)

    Kristensen, Thomas G; Nielsen, Jesper; Pedersen, Christian N S

    2011-03-28

    The ever growing size of chemical databases calls for the development of novel methods for representing and comparing molecules. One such method called LINGO is based on fragmenting the SMILES string representation of molecules. Comparison of molecules can then be performed by calculating the Tanimoto coefficient, which is called LINGOsim when used on LINGO multisets. This paper introduces a verbose representation for storing LINGO multisets, which makes it possible to transform them into sparse fingerprints such that fingerprint data structures and algorithms can be used to accelerate queries. The previous best method for rapidly calculating the LINGOsim similarity matrix required specialized hardware to yield a significant speedup over existing methods. By representing LINGO multisets in the verbose representation and using inverted indices, it is possible to calculate LINGOsim similarity matrices roughly 2.6 times faster than existing methods without relying on specialized hardware.
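A small sketch may help make the LINGOsim computation concrete. It assumes LINGOs are overlapping substrings of length q = 4 and omits any SMILES preprocessing. The `verbose` function shows one plausible way to turn a multiset into a plain set by numbering repeated occurrences. That construction is an assumption made for illustration rather than the paper's confirmed encoding, and all names are hypothetical.

```python
from collections import Counter


def lingos(smiles, q=4):
    """Multiset of all overlapping length-q substrings of a SMILES string."""
    return Counter(smiles[i:i + q] for i in range(len(smiles) - q + 1))


def lingosim(a, b, q=4):
    """Tanimoto coefficient on LINGO multisets: |A & B| / |A | B|."""
    la, lb = lingos(a, q), lingos(b, q)
    union = sum((la | lb).values())       # per-LINGO maximum of the counts
    return sum((la & lb).values()) / union if union else 0.0


def verbose(multiset):
    """Set of (lingo, k) pairs, one for the k-th occurrence of each LINGO."""
    return {(lingo, k) for lingo, n in multiset.items() for k in range(n)}


def set_tanimoto(x, y):
    """Ordinary set Tanimoto, as used with sparse fingerprints."""
    return len(x & y) / len(x | y) if x or y else 0.0


sim = lingosim("CCCCCC", "CCCCC")
same = set_tanimoto(verbose(lingos("CCCCCC")), verbose(lingos("CCCCC")))
```

Once every molecule is a plain set of identifiers, an inverted index from identifier to molecule list can be used to count shared elements for a query, which is the usual route to accelerating fingerprint searches.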

  12. Hardware packet pacing using a DMA in a parallel computer

    Science.gov (United States)

    Chen, Dong; Heidelberger, Phillip; Vranas, Pavlos

    2013-08-13

    Method and system for hardware packet pacing using a direct memory access controller in a parallel computer which, in one aspect, keeps track of a total number of bytes put on the network as a result of a remote get operation, using a hardware token counter.

  13. Teaching Robotics Software with the Open Hardware Mobile Manipulator

    Science.gov (United States)

    Vona, M.; Shekar, N. H.

    2013-01-01

    The "open hardware mobile manipulator" (OHMM) is a new open platform with a unique combination of features for teaching robotics software and algorithms. On-board low- and high-level processors support real-time embedded programming and motor control, as well as higher-level coding with contemporary libraries. Full hardware designs and…

  14. The role of the visual hardware system in rugby performance ...

    African Journals Online (AJOL)

    This study explores the importance of the 'hardware' factors of the visual system in the game of rugby. A group of professional and club rugby players were tested and the results compared. The results were also compared with the established norms for elite athletes. The findings indicate no significant difference in hardware ...

  15. The role of the visual hardware system in rugby performance ...

    African Journals Online (AJOL)

    This suggests that in the game of rugby the hardware skills may be of less importance and that visual enhancement programmes should focus more on improving the players' software skills. Key words: Vision, hardware, rugby, sports performance. (Af. J. Physical, Health Education, Recreation and Dance: 2003 Special ...

  16. [Hardware and software for X-ray therapy planning].

    Science.gov (United States)

    Zhizniakov, A L; Semenov, S I; Sushkova, L T; Troitskii, D P; Chirkov, K V

    2007-01-01

    Hardware, circuitry, and software suggested in this work make it possible to use the SLS-9 X-ray simulator for classical and computer tomographic imaging. The suggested hardware and software can be used as a basis for designing special-purpose tomographic systems.

  17. Accelerator Mass Spectrometry on SIRIUS: New 6 MV spectrometer at ANSTO

    Science.gov (United States)

    Wilcken, K. M.; Fink, D.; Hotchkis, M. A. C.; Garton, D.; Button, D.; Mann, M.; Kitchen, R.; Hauser, T.; O'Connor, A.

    2017-09-01

    The Centre for Accelerator Science at ANSTO operates four tandem accelerator systems for Accelerator Mass Spectrometry (AMS) and Ion Beam Analysis (IBA). The latest addition to the fleet is SIRIUS, a 6 MV combined IBA and AMS system. Following initial ion beam testing, conditioning and debugging software and hardware, SIRIUS is now commissioned. Details of the instrument design and performance data for 10Be, 26Al and 36Cl are presented.

  18. Monitoring Particulate Matter with Commodity Hardware

    Science.gov (United States)

    Holstius, David

    Health effects attributed to outdoor fine particulate matter (PM 2.5) rank it among the risk factors with the highest health burdens in the world, annually accounting for over 3.2 million premature deaths and over 76 million lost disability-adjusted life years. Existing PM2.5 monitoring infrastructure cannot, however, be used to resolve variations in ambient PM2.5 concentrations with adequate spatial and temporal density, or with adequate coverage of human time-activity patterns, such that the needs of modern exposure science and control can be met. Small, inexpensive, and portable devices, relying on newly available off-the-shelf sensors, may facilitate the creation of PM2.5 datasets with improved resolution and coverage, especially if many such devices can be deployed concurrently with low system cost. Datasets generated with such technology could be used to overcome many important problems associated with exposure misclassification in air pollution epidemiology. Chapter 2 presents an epidemiological study of PM2.5 that used data from ambient monitoring stations in the Los Angeles basin to observe a decrease of 6.1 g (95% CI: 3.5, 8.7) in population mean birthweight following in utero exposure to the Southern California wildfires of 2003, but was otherwise limited by the sparsity of the empirical basis for exposure assessment. Chapter 3 demonstrates technical potential for remedying PM2.5 monitoring deficiencies, beginning with the generation of low-cost yet useful estimates of hourly and daily PM2.5 concentrations at a regulatory monitoring site. The context (an urban neighborhood proximate to a major goods-movement corridor) and the method (an off-the-shelf sensor costing approximately USD $10, combined with other low-cost, open-source, readily available hardware) were selected to have special significance among researchers and practitioners affiliated with contemporary communities of practice in public health and citizen science. As operationalized by

  19. FPGA BASED HARDWARE KEY FOR TEMPORAL ENCRYPTION

    Directory of Open Access Journals (Sweden)

    B. Lakshmi

    2010-09-01

    In this paper, a novel encryption scheme with a time based key technique on an FPGA is presented. The time based key technique ensures that the right key is entered at the right time and hence, vulnerability of encryption through brute force attack is eliminated. Presently available encryption systems suffer from brute force attack and in such a case, the time taken for breaking a code depends on the system used for cryptanalysis. The proposed scheme provides an effective method in which the time is taken as the second dimension of the key so that the same system can defend against brute force attack more vigorously. In the proposed scheme, the key is rotated continuously and four bits are drawn from the key with their concatenated value representing the delay the system has to wait. This forms the time based key concept. Also the key based function selection from a pool of functions enhances the confusion and diffusion to defend against linear and differential attacks while the time factor inclusion makes the brute force attack nearly impossible. In the proposed scheme, the key scheduler is implemented on an FPGA that generates the right key at the right time intervals, which is then connected to a NIOS II processor (a virtual microcontroller which is brought out from the Altera FPGA) that communicates the keys to the personal computer through JTAG (Joint Test Action Group) communication, and the computer is used to perform encryption (or decryption). In this case the FPGA serves as a hardware key (dongle) for data encryption (or decryption).
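The rotate-and-draw step described in this abstract can be sketched as follows. The abstract does not say which four bits are drawn, how far the key rotates per step, or what time unit the delay uses. The low four bits, a one-bit left rotation and unitless delays are all assumptions here, and the function names are hypothetical.

```python
def rotate_left(key, width, shift=1):
    """Rotate a width-bit integer left by shift bits."""
    shift %= width
    mask = (1 << width) - 1
    return ((key << shift) | (key >> (width - shift))) & mask


def key_schedule(key, width, steps):
    """Yield (key, delay) pairs; delay is the value of four bits of the key.

    Assumption: the four bits are the least significant ones and the key
    rotates by one bit between steps.
    """
    for _ in range(steps):
        yield key, key & 0xF
        key = rotate_left(key, width)


# Example with an 8-bit toy key; a real key would be far wider.
schedule = list(key_schedule(0b10010110, 8, 3))
```

In this toy run the key must be presented after delays of 6, 13 and 10 time units in turn, so an attacker has to guess the timing as well as the key bits.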

  20. A Practical Introduction to Hardware/Software Codesign

    CERN Document Server

    Schaumont, Patrick R

    2013-01-01

    This textbook provides an introduction to embedded systems design, with emphasis on integration of custom hardware components with software. The key problem addressed in the book is the following: how can an embedded systems designer strike a balance between flexibility and efficiency? The book describes how combining hardware design with software design leads to a solution to this important computer engineering problem. The book covers four topics in hardware/software codesign: fundamentals, the design space of custom architectures, the hardware/software interface and application examples. The book comes with an associated design environment that helps the reader to perform experiments in hardware/software codesign. Each chapter also includes exercises and further reading suggestions. Improvements in this second edition include labs and examples using modern FPGA environments from Xilinx and Altera, which make the material applicable to a greater number of courses where these tools are already in use.  Mo...

  1. A computer control system for the PNC high power cw electron linac. Concept and hardware

    Energy Technology Data Exchange (ETDEWEB)

    Emoto, T.; Hirano, K.; Takei, Hayanori; Nomura, Masahiro; Tani, S. [Power Reactor and Nuclear Fuel Development Corp., Oarai, Ibaraki (Japan). Oarai Engineering Center; Kato, Y.; Ishikawa, Y.

    1998-06-01

    Design and construction of a high power cw (Continuous Wave) electron linac for studying the feasibility of nuclear waste transmutation was started in 1989 at PNC. The PNC accelerator (10 MeV, 20 mA average current, 4 ms pulse width, 50 Hz repetition) is a dedicated machine for developing the high current acceleration technology needed in the future. The computer control system is responsible for accelerator control and for supporting experiments in high power operation. The system features simultaneous measurement of accelerator status and modular software and hardware that can be easily modified or expanded. A high speed network (SCRAM Net, approximately 15 MB/s), Ethernet, and front end processors (Digital Signal Processors) were employed for high speed data taking and control. The system was designed with standard modules and a software-implemented man-machine interface. Thanks to the graphical user interface and object-oriented programming, the software development environment makes programming and maintenance easy. (author)

  2. Piezoelectric particle accelerator

    Science.gov (United States)

    Kemp, Mark A.; Jongewaard, Erik N.; Haase, Andrew A.; Franzi, Matthew

    2017-08-29

    A particle accelerator is provided that includes a piezoelectric accelerator element, where the piezoelectric accelerator element includes a hollow cylindrical shape, and an input transducer, where the input transducer is disposed to provide an input signal to the piezoelectric accelerator element, where the input signal induces a mechanical excitation of the piezoelectric accelerator element, where the mechanical excitation is capable of generating a piezoelectric electric field proximal to an axis of the cylindrical shape, where the piezoelectric accelerator is configured to accelerate a charged particle longitudinally along the axis of the cylindrical shape according to the piezoelectric electric field.

  3. Quality Scalability Compression on Single-Loop Solution in HEVC

    Directory of Open Access Journals (Sweden)

    Mengmeng Zhang

    2014-01-01

    This paper proposes a quality scalable extension design for the upcoming high efficiency video coding (HEVC) standard. In the proposed design, the single-loop decoder solution is extended into the proposed scalable scenario. A novel interlayer intra/inter prediction is added to reduce the number of bits needed by exploiting the correlation between coding layers. The experimental results indicate that an average Bjøntegaard delta rate decrease of 20.50% can be gained compared with simulcast encoding. The proposed technique achieved a 47.98% Bjøntegaard delta rate reduction compared with the scalable video coding extension of H.264/AVC. Consequently, significant rate savings confirm that the proposed method achieves better performance.

  4. Technical Report: Scalable Parallel Algorithms for High Dimensional Numerical Integration

    Energy Technology Data Exchange (ETDEWEB)

    Masalma, Yahya [Universidad del Turabo; Jiao, Yu [ORNL

    2010-10-01

    We implemented a scalable parallel quasi-Monte Carlo numerical high-dimensional integration for tera-scale data points. The implemented algorithm uses Sobol's quasi-sequences to generate random samples. Sobol's sequence was used to avoid clustering effects in the generated random samples and to produce low-discrepancy random samples which cover the entire integration domain. The performance of the algorithm was tested. The results obtained demonstrate the scalability and accuracy of the implemented algorithms. The implemented algorithm could be used in different applications where a huge data volume is generated and numerical integration is required. We suggest using the hybrid MPI and OpenMP programming model to improve the performance of the algorithms. If the mixed model is used, attention should be paid to the scalability and accuracy.
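
    The quasi-Monte Carlo idea in this abstract can be illustrated with a minimal sketch. Note the substitution: the report uses Sobol's sequence, whereas the sketch below uses the simpler Halton sequence (another low-discrepancy sequence) so that it stays self-contained. The function names and the test integrand are illustrative assumptions, not taken from the report.

```python
def radical_inverse(index, base):
    """Van der Corput radical inverse: mirror the base-`base` digits of
    `index` around the radix point, giving a value in [0, 1)."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton_point(index, bases=(2, 3)):
    """One point of the Halton sequence (a stand-in for Sobol's sequence)."""
    return tuple(radical_inverse(index, b) for b in bases)


def qmc_integrate(f, n_points, bases=(2, 3)):
    """Estimate the integral of f over the unit hypercube by averaging f
    at the first n_points Halton points (indices start at 1)."""
    total = 0.0
    for i in range(1, n_points + 1):
        total += f(*halton_point(i, bases))
    return total / n_points


# Hypothetical test integrand: the exact integral of x*y over [0,1]^2 is 0.25.
estimate = qmc_integrate(lambda x, y: x * y, 4096)
```

    Because each point depends only on its index, the index range can be split into independent chunks, which is one reason such methods are attractive for the kind of parallel decomposition the abstract suggests.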

  5. Concurrent heterogeneous neural model simulation on real-time neuromimetic hardware.

    Science.gov (United States)

    Rast, Alexander; Galluppi, Francesco; Davies, Sergio; Plana, Luis; Patterson, Cameron; Sharp, Thomas; Lester, David; Furber, Steve

    2011-11-01

    Dedicated hardware is becoming increasingly essential to simulate emerging very-large-scale neural models. Equally, however, it needs to be able to support multiple models of the neural dynamics, possibly operating simultaneously within the same system. This may be necessary either to simulate large models with heterogeneous neural types, or to simplify simulation and analysis of detailed, complex models in a large simulation by isolating the new model to a small subpopulation of a larger overall network. The SpiNNaker neuromimetic chip is a dedicated neural processor able to support such heterogeneous simulations. Implementing these models on-chip uses an integrated library-based tool chain incorporating the emerging PyNN interface that allows a modeller to input a high-level description and use an automated process to generate an on-chip simulation. Simulations using both LIF and Izhikevich models demonstrate the ability of the SpiNNaker system to generate and simulate heterogeneous networks on-chip, while illustrating, through the network-scale effects of wavefront synchronisation and burst gating, methods that can provide effective behavioural abstractions for large-scale hardware modelling. SpiNNaker's asynchronous virtual architecture permits greater scope for model exploration, with scalable levels of functional and temporal abstraction, than conventional (or neuromorphic) computing platforms. The complete system illustrates a potential path to understanding the neural model of computation, by building (and breaking) neural models at various scales, connecting the blocks, then comparing them against the biology: computational cognitive neuroscience. Copyright © 2011 Elsevier Ltd. All rights reserved.
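
    To make the notion of a heterogeneous simulation concrete, here is a minimal sketch that advances one leaky integrate-and-fire (LIF) neuron and one Izhikevich neuron in the same loop. It uses the standard textbook forms of both models with forward-Euler integration; the parameter values, function names, and drive levels are assumptions for illustration and do not reflect the SpiNNaker tool chain or PyNN.

```python
def step_lif(v, input_mv, dt, tau=20.0, v_rest=-65.0,
             v_thresh=-50.0, v_reset=-65.0):
    """One Euler step of a LIF neuron; input_mv is the drive expressed
    as a voltage (R*I, in mV). Returns (new_v, fired)."""
    v += dt * (-(v - v_rest) + input_mv) / tau
    if v >= v_thresh:
        return v_reset, True
    return v, False


def step_izhikevich(v, u, current, dt, a=0.02, b=0.2, c=-65.0, d=8.0):
    """One Euler step of the Izhikevich model (regular-spiking parameters).
    Returns (new_v, new_u, fired)."""
    dv = 0.04 * v * v + 5.0 * v + 140.0 - u + current
    du = a * (b * v - u)
    v, u = v + dt * dv, u + dt * du
    if v >= 30.0:              # spike cutoff: reset v, bump recovery variable
        return c, u + d, True
    return v, u, False


def simulate(lif_drive_mv, izh_current, steps=2000, dt=0.5):
    """Run both neuron types side by side; return (lif_spikes, izh_spikes)."""
    lif_v = -65.0
    izh_v, izh_u = -65.0, 0.2 * -65.0
    lif_spikes = izh_spikes = 0
    for _ in range(steps):
        lif_v, fired = step_lif(lif_v, lif_drive_mv, dt)
        lif_spikes += fired
        izh_v, izh_u, fired = step_izhikevich(izh_v, izh_u, izh_current, dt)
        izh_spikes += fired
    return lif_spikes, izh_spikes


lif_count, izh_count = simulate(lif_drive_mv=20.0, izh_current=10.0)
```

    Both update rules share the same time step and loop, which is the essential requirement for mixing neuron types in one network.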

  6. Current parallel I/O limitations to scalable data analysis.

    Energy Technology Data Exchange (ETDEWEB)

    Mascarenhas, Ajith Arthur; Pebay, Philippe Pierre

    2011-07-01

    This report describes the limitations to parallel scalability which we have encountered when applying our otherwise optimally scalable parallel statistical analysis tool kit to large data sets distributed across the parallel file system of the current premier DOE computational facility. This report describes our study to evaluate the effect of parallel I/O on the overall scalability of a parallel data analysis pipeline using our scalable parallel statistics tool kit [PTBM11]. To this end, we tested it using the Jaguar-pf DOE/ORNL peta-scale platform on large combustion simulation data under a variety of process counts and domain decomposition scenarios. In this report we have recalled the foundations of the parallel statistical analysis tool kit which we have designed and implemented, with the specific double intent of reproducing typical data analysis workflows, and achieving optimal design for scalable parallel implementations. We have briefly reviewed those earlier results and publications which allow us to conclude that we have achieved both goals. However, in this report we have further established that, when used in conjunction with a state-of-the-art parallel I/O system, as can be found on the premier DOE peta-scale platform, the scaling properties of the overall analysis pipeline comprising parallel data access routines degrade rapidly. This finding is problematic and must be addressed if peta-scale data analysis is to be made scalable, or even possible. In order to attempt to address these parallel I/O limitations, we will investigate the use of the Adaptable IO System (ADIOS) [LZL+10] to improve I/O performance, while maintaining flexibility for a variety of IO options, such as MPI IO and POSIX IO. This system is developed at ORNL and other collaborating institutions, and is being tested extensively on Jaguar-pf. Simulation code being developed on these systems will also use ADIOS to output the data thereby making it easier for other systems, such as ours, to

  7. The Art of Space Flight Exercise Hardware: Design and Implementation

    Science.gov (United States)

    Beyene, Nahom M.

    2004-01-01

    The design of space flight exercise hardware depends on experience with crew health maintenance in a microgravity environment, history in development of flight-quality exercise hardware, and a foundation for certifying proper project management and design methodology. Developed over the past 40 years, the expertise in designing exercise countermeasures hardware at the Johnson Space Center stems from these three aspects of design. The medical community has steadily pursued an understanding of physiological changes in humans in a weightless environment and methods of counteracting negative effects on the cardiovascular and musculoskeletal systems. The effects of weightlessness extend to the pulmonary and neurovestibular systems as well, with conditions ranging from motion sickness to loss of bone density. Results have shown losses in water weight and muscle mass in antigravity muscle groups. With the support of university-based research groups and partner space agencies, NASA has identified exercise to be the primary countermeasure for long-duration space flight. The history of exercise hardware began during the Apollo Era and leads directly to the present hardware on the International Space Station. Under the classifications of aerobic and resistive exercise, there is a clear line of development from the early devices to the countermeasures hardware used today. In support of all engineering projects, the engineering directorate has created a structured framework for project management. Engineers have identified standards and "best practices" to promote efficient and elegant design of space exercise hardware. The quality of space exercise hardware depends on how well hardware requirements are justified by exercise performance guidelines and crew health indicators. When considering the microgravity environment of the device, designers must consider performance of hardware separately from the combined human-in-hardware system. Astronauts are the caretakers of the hardware

  8. Natural product synthesis in the age of scalability.

    Science.gov (United States)

    Kuttruff, Christian A; Eastgate, Martin D; Baran, Phil S

    2014-04-01

    The ability to procure useful quantities of a molecule by simple, scalable routes is emerging as an important goal in natural product synthesis. Approaches to molecules that yield substantial material enable collaborative investigations (such as SAR studies or eventual commercial production) and inherently spur innovation in chemistry. As such, when evaluating a natural product synthesis, scalability is becoming an increasingly important factor. In this Highlight, we discuss recent examples of natural product synthesis from our laboratory and others, where the preparation of gram-scale quantities of a target compound or a key intermediate allowed for a deeper understanding of biological activities or enabled further investigational collaborations.

  9. Providing scalable system software for high-end simulations

    Energy Technology Data Exchange (ETDEWEB)

    Greenberg, D. [Sandia National Labs., Albuquerque, NM (United States)

    1997-12-31

    Detailed, full-system, complex physics simulations have been shown to be feasible on systems containing thousands of processors. In order to manage these computer systems it has been necessary to create scalable system services. In this talk Sandia's research on scalable systems will be described. The key concepts of low overhead data movement through portals and of flexible services through multi-partition architectures will be illustrated in detail. The talk will conclude with a discussion of how these techniques can be applied outside of the standard monolithic MPP system.

  10. Scalable and Hybrid Radio Resource Management for Future Wireless Networks

    DEFF Research Database (Denmark)

    Mino, E.; Luo, Jijun; Tragos, E.

    2007-01-01

    The concept of a ubiquitous and scalable system is applied in the IST WINNER II [1] project to deliver optimum performance for different deployment scenarios, from local area to wide area wireless networks. The integration of cellular and local area type networks in a single radio system offers a great advantage for the final user and for the operator, compared with the current situation of disconnected systems, usually with different subscriptions, radio interfaces and terminals. To be a ubiquitous wireless system, the IST project WINNER II has defined three system modes. This contribution describes a proposal for scalable and hybrid radio resource management to efficiently integrate the different WINNER system modes. Index...

  11. Scalability limitations of VIA-based technologies in supporting MPI

    Energy Technology Data Exchange (ETDEWEB)

    BRIGHTWELL,RONALD B.; MACCABE,ARTHUR BERNARD

    2000-04-17

    This paper analyzes the scalability limitations of networking technologies based on the Virtual Interface Architecture (VIA) in supporting the runtime environment needed for an implementation of the Message Passing Interface. The authors present an overview of the important characteristics of VIA and an overview of the runtime system being developed as part of the Computational Plant (Cplant) project at Sandia National Laboratories. They discuss the characteristics of VIA that prevent implementations based on this system from meeting the scalability and performance requirements of Cplant.

  12. A Scalable Smart Meter Data Generator Using Spark

    DEFF Research Database (Denmark)

    Iftikhar, Nadeem; Liu, Xiufeng; Danalachi, Sergiu

    2017-01-01

    Today, smart meters are being used worldwide. As a matter of fact, smart meters produce large volumes of data. Thus, it is important for smart meter data management and analytics systems to process petabytes of data. Benchmarking and testing of these systems require scalable data; however, it can be challenging to get large data sets due to privacy and/or data protection regulations. This paper presents a scalable smart meter data generator using Spark that can generate realistic data sets. The proposed data generator is based on a supervised machine learning method that can generate data of any size...

  13. Production, Characterization, and Acceleration of Optical Microbunches

    Energy Technology Data Exchange (ETDEWEB)

    Sears, Christopher M.S. [Stanford Univ., CA (United States)

    2008-06-20

    Optical microbunches with a spacing of 800 nm have been produced for laser acceleration research. The microbunches are produced using an inverse Free-Electron-Laser (IFEL) followed by a dispersive chicane. The microbunched electron beam is characterized by coherent optical transition radiation (COTR) with good agreement to the analytic theory for bunch formation. In a second experiment the bunches are accelerated in a second stage to achieve for the first time direct net acceleration of electrons traveling in a vacuum with visible light. This dissertation presents the theory of microbunch formation and characterization of the microbunches. It also presents the design of the experimental hardware from magnetostatic and particle tracking simulations, to fabrication and measurement of the undulator and chicane magnets. Finally, the dissertation discusses three experiments aimed at demonstrating the IFEL interaction, microbunch production, and the net acceleration of the microbunched beam. At the close of the dissertation, a separate but related research effort on the tight focusing of electrons for coupling into optical scale, Photonic Bandgap, structures is presented. This includes the design and fabrication of a strong focusing permanent magnet quadrupole triplet and an outline of an initial experiment using the triplet to observe wakefields generated by an electron beam passing through an optical scale accelerator.

  14. Scalable Track Initiation for Optical Space Surveillance

    Science.gov (United States)

    Schumacher, P.; Wilkins, M. P.

    2012-09-01

    least cubic and commonly quartic or higher. Therefore, practical implementations require attention to the scalability of the algorithms, when one is dealing with the very large number of observations from large surveillance telescopes. We address two broad categories of algorithms. The first category includes and extends the classical methods of Laplace and Gauss, as well as the more modern method of Gooding, in which one solves explicitly for the apparent range to the target in terms of the given data. In particular, recent ideas offered by Mortari and Karimi allow us to construct a family of range-solution methods that can be scaled to many processors efficiently. We find that the orbit solutions (data association hypotheses) can be ranked by means of a concept we call persistence, in which a simple statistical measure of likelihood is based on the frequency of occurrence of combinations of observations in consistent orbit solutions. Of course, range-solution methods can be expected to perform poorly if the orbit solutions of most interest are not well conditioned. The second category of algorithms addresses this difficulty. Instead of solving for range, these methods attach a set of range hypotheses to each measured line of sight. Then all pair-wise combinations of observations are considered and the family of Lambert problems is solved for each pair. These algorithms also have polynomial complexity, though now the complexity is quadratic in the number of observations and also quadratic in the number of range hypotheses. We offer a novel type of admissible-region analysis, constructing partitions of the orbital element space and deriving rigorous upper and lower bounds on the possible values of the range for each partition. This analysis allows us to parallelize with respect to the element partitions and to reduce the number of range hypotheses that have to be considered in each processor simply by making the partitions smaller. Naturally, there are many ways to

  15. Approaches for scalable modeling and emulation of cyber systems : LDRD final report.

    Energy Technology Data Exchange (ETDEWEB)

    Mayo, Jackson R.; Minnich, Ronald G.; Armstrong, Robert C.; Rudish, Don W.

    2009-09-01

    The goal of this research was to combine theoretical and computational approaches to better understand the potential emergent behaviors of large-scale cyber systems, such as networks of ~10^6 computers. The scale and sophistication of modern computer software, hardware, and deployed networked systems have significantly exceeded the computational research community's ability to understand, model, and predict current and future behaviors. This predictive understanding, however, is critical to the development of new approaches for proactively designing new systems or enhancing existing systems with robustness to current and future cyber threats, including distributed malware such as botnets. We have developed preliminary theoretical and modeling capabilities that can ultimately answer questions such as: How would we reboot the Internet if it were taken down? Can we change network protocols to make them more secure without disrupting existing Internet connectivity and traffic flow? We have begun to address these issues by developing new capabilities for understanding and modeling Internet systems at scale. Specifically, we have addressed the need for scalable network simulation by carrying out emulations of a network with ~10^6 virtualized operating system instances on a high-performance computing cluster - a 'virtual Internet'. We have also explored mappings between previously studied emergent behaviors of complex systems and their potential cyber counterparts. Our results provide foundational capabilities for further research toward understanding the effects of complexity in cyber systems, to allow anticipating and thwarting hackers.

  16. A robust and scalable neuromorphic communication system by combining synaptic time multiplexing and MIMO-OFDM.

    Science.gov (United States)

    Srinivasa, Narayan; Zhang, Deying; Grigorian, Beayna

    2014-03-01

    This paper describes a novel architecture for enabling robust and efficient neuromorphic communication. The architecture combines two concepts: 1) synaptic time multiplexing (STM) that trades space for speed of processing to create an intragroup communication approach that is firing rate independent and offers more flexibility in connectivity than cross-bar architectures and 2) a wired multiple input multiple output (MIMO) communication with orthogonal frequency division multiplexing (OFDM) techniques to enable a robust and efficient intergroup communication for neuromorphic systems. The MIMO-OFDM concept for the proposed architecture was analyzed by simulating large-scale spiking neural network architecture. Analysis shows that the neuromorphic system with MIMO-OFDM exhibits robust and efficient communication while operating in real time with a high bit rate. Through combining STM with MIMO-OFDM techniques, the resulting system offers a flexible and scalable connectivity as well as a power and area efficient solution for the implementation of very large-scale spiking neural architectures in hardware.

  17. A Scalable Data Taking System at a Test Beam for LHC

    CERN Multimedia

    2002-01-01

    RD-13: A Scalable Data Taking System at a Test Beam for LHC. We have installed a test beam read-out facility for the simultaneous test of LHC detectors, trigger and read-out electronics, together with the development of the supporting architecture in a multiprocessor environment. The aim of the project is to build a system which incorporates all the functionality of a complete read-out chain. Emphasis is put on a highly modular design, such that new hardware and software developments can be conveniently introduced. Exploiting this modularity, the set-up will evolve driven by progress in technologies and new software developments. One of the main thrusts of the project is modelling and integration of different read-out architectures to provide a valuable training ground for new techniques. To address these aspects in a realistic manner, we collaborate with detector R&D projects in order to test higher level trigger systems, event building and high rate data transfers, once the techniques involve...

  18. Using common table expressions to build a scalable Boolean query generator for clinical data warehouses.

    Science.gov (United States)

    Harris, Daniel R; Henderson, Darren W; Kavuluru, Ramakanth; Stromberg, Arnold J; Johnson, Todd R

    2014-09-01

    We present a custom, Boolean query generator utilizing common-table expressions (CTEs) that is capable of scaling with big datasets. The generator maps user-defined Boolean queries, such as those interactively created in clinical-research and general-purpose healthcare tools, into SQL. We demonstrate the effectiveness of this generator by integrating our study into the Informatics for Integrating Biology and the Bedside (i2b2) query tool and show that it is capable of scaling. Our custom generator replaces and outperforms the default query generator found within the Clinical Research Chart cell of i2b2. In our experiments, 16 different types of i2b2 queries were identified by varying four constraints: date, frequency, exclusion criteria, and whether selected concepts occurred in the same encounter. We generated nontrivial, random Boolean queries based on these 16 types; the corresponding SQL queries produced by both generators were compared by execution times. The CTE-based solution significantly outperformed the default query generator and provided a much more consistent response time across all query types (M = 2.03, SD = 6.64 versus M = 75.82, SD = 238.88 s). Without costly hardware upgrades, we provide a scalable solution based on CTEs with very promising empirical results centered on performance gains. The evaluation methodology used for this provides a means of profiling clinical data warehouse performance.
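
    The mapping from a Boolean query to SQL built on common table expressions can be sketched as follows. This is a minimal illustration under stated assumptions: the `facts` table, its `patient_id` and `concept` columns, and the function names are hypothetical and are not the i2b2 schema or the authors' generator. Each concept becomes one CTE; required concepts are combined with INTERSECT and excluded ones with EXCEPT.

```python
import sqlite3


def build_cte_query(include_all, exclude=()):
    """Return (sql, params) selecting patients that have every concept in
    include_all and none of the concepts in exclude."""
    if not include_all:
        raise ValueError("at least one included concept is required")
    concepts = list(include_all) + list(exclude)
    # One CTE per concept; values are bound as parameters, never inlined.
    ctes = [
        f"c{i} AS (SELECT DISTINCT patient_id FROM facts WHERE concept = ?)"
        for i in range(len(concepts))
    ]
    n_inc = len(include_all)
    body = " INTERSECT ".join(
        f"SELECT patient_id FROM c{i}" for i in range(n_inc)
    )
    for j in range(n_inc, len(concepts)):
        body += f" EXCEPT SELECT patient_id FROM c{j}"
    sql = "WITH " + ", ".join(ctes) + " " + body + " ORDER BY patient_id"
    return sql, concepts


def run_demo():
    """Run the generated query against a tiny in-memory table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE facts (patient_id INTEGER, concept TEXT)")
    conn.executemany(
        "INSERT INTO facts VALUES (?, ?)",
        [(1, "diabetes"), (1, "hypertension"),
         (2, "diabetes"),
         (3, "diabetes"), (3, "hypertension"), (3, "smoker")],
    )
    sql, params = build_cte_query(["diabetes", "hypertension"], ["smoker"])
    rows = conn.execute(sql, params).fetchall()
    conn.close()
    return [row[0] for row in rows]
```

    The sketch relies on SQLite evaluating compound SELECT operators left to right; other database engines may give INTERSECT higher precedence, which happens to yield the same result for this query shape.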

  19. Steps Towards Scalable and Modularized Flight Software for Unmanned Aircraft Systems

    Directory of Open Access Journals (Sweden)

    Johann C. Dauer

    2014-05-01

    Unmanned aircraft (UA) applications impose a variety of computing tasks on the on-board computer system. From a research perspective, it is often more convenient to evaluate algorithms on bigger aircraft as they are capable of lifting heavier loads and thus more powerful computational units. On the other hand, smaller systems are often less expensive and operation is less restricted in many countries. This paper thus presents a conceptual design for flight software that can be evaluated on a UA of convenient size. The integration effort required to transfer the algorithm to different sized UA is significantly reduced. This scalability is achieved by using exchangeable payload modules and a flexible process distribution on different processing units. The presented approach is discussed using the example of the flight software of a 14 kg unmanned helicopter and an equivalent of 1.5 kg. The proof of concept is shown by means of flight performance in a hardware-in-the-loop simulation.

  20. Hardware Realization of Chaos Based Symmetric Image Encryption

    KAUST Repository

    Barakat, Mohamed L.

    2012-06-01

    This thesis presents novel work on hardware realization of symmetric image encryption utilizing chaos-based continuous systems as pseudo random number generators. Digital implementation of chaotic systems results in serious degradations in the dynamics of the system. Such defects are eliminated through a new technique of generalized post-processing with very low hardware cost. The thesis further discusses two encryption algorithms designed and implemented as a block cipher and a stream cipher. The security of both systems is thoroughly analyzed and the performance is compared with other reported systems, showing superior results. Both systems are realized on a Xilinx Virtex-4 FPGA with hardware and throughput performance surpassing known encryption systems.

  1. Hardware support for collecting performance counters directly to memory

    Science.gov (United States)

    Gara, Alan; Salapura, Valentina; Wisniewski, Robert W.

    2012-09-25

    Hardware support for collecting performance counters directly to memory, in one aspect, may include a plurality of performance counters operable to collect one or more counts of one or more selected activities. A first storage element may be operable to store an address of a memory location. A second storage element may be operable to store a value indicating whether the hardware should begin copying. A state machine may be operable to detect the value in the second storage element and trigger hardware copying of data in selected one or more of the plurality of performance counters to the memory location whose address is stored in the first storage element.
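
    The mechanism described above can be modelled in software as a small state machine. The following is a behavioural sketch only: the class and attribute names are hypothetical, memory is a plain list, and clearing the enable flag after one copy is an assumption that the abstract does not state.

```python
class CounterUnit:
    """Toy model: performance counters plus a copy-to-memory state machine."""

    def __init__(self, n_counters, memory_size):
        self.counters = [0] * n_counters
        self.memory = [0] * memory_size
        self.address_reg = 0   # "first storage element": destination address
        self.copy_enable = 0   # "second storage element": begin-copy flag
        self.state = "IDLE"

    def count(self, index, amount=1):
        """Record `amount` occurrences of the activity tracked by a counter."""
        self.counters[index] += amount

    def tick(self):
        """Advance the state machine by one clock cycle."""
        if self.state == "IDLE":
            if self.copy_enable:
                self.state = "COPY"
        elif self.state == "COPY":
            base = self.address_reg
            if base + len(self.counters) > len(self.memory):
                raise IndexError("copy would run past the end of memory")
            for offset, value in enumerate(self.counters):
                self.memory[base + offset] = value
            self.copy_enable = 0   # assumption: one-shot copy per request
            self.state = "IDLE"


unit = CounterUnit(n_counters=3, memory_size=8)
unit.count(0)
unit.count(2, 5)
unit.address_reg = 4
unit.copy_enable = 1
unit.tick()   # IDLE -> COPY
unit.tick()   # counters written to memory[4:7]
```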

  2. 2014 CERN Accelerator Schools: Plasma Wake Acceleration

    CERN Multimedia

    2014-01-01

    A specialised school on Plasma Wake Acceleration will be held at CERN, Switzerland from 23-29 November, 2014. This course will be of interest to staff and students in accelerator laboratories, university departments and companies working in or having an interest in the field of new acceleration techniques. Following introductory lectures on plasma and laser physics, the course will cover the different components of a plasma wake accelerator and plasma beam systems. An overview of the experimental studies, diagnostic tools and state of the art wake acceleration facilities, both present and planned, will complement the theoretical part. Topical seminars and a visit to CERN will complete the programme. Further information can be found at: http://cas.web.cern.ch/cas/PlasmaWake2014/CERN-advert.html http://indico.cern.ch/event/285444/

  3. A software and hardware architecture for a high-availability PACS.

    Science.gov (United States)

    Gutiérrez-Martínez, Josefina; Núñez-Gaona, Marco Antonio; Aguirre-Meneses, Heriberto; Delgado-Esquerra, Ruth Evelin

    2012-08-01

    The increasing number of radiology studies has led to new requirements for the management of medical information, mainly affecting the storage of digital images. Today, interaction between workflow management and the legal rules that govern it is necessary to allow efficient control of medical technology and associated costs. Another topic that is growing in importance within the healthcare sector is compliance, which includes the retention of studies, information security, and patient privacy. Previously, we conducted a series of extensive analyses and measurements of pre-existing operating conditions. These studies and projects have been described in other papers. The first phase, consisting of hardware and software installation and initial tests, was completed in March 2006. The storage phase was built step by step until the PACS-INR was totally completed. Two important aspects were considered in the integration of components: (1) the reliability and performance of the system to transfer and display DICOM images, and (2) the availability of data backups for disaster recovery and downtime scenarios. This paper describes the high-availability model for a large-scale PACS to support the storage and retrieval of data using CAS and DAS technologies to provide an open storage platform. This solution offers a simple framework that integrates and automates the information at low cost and minimum risk. Likewise, the model allows an optimized use of the information infrastructure in the clinical environment. The tests of the model include massive data migration, openness, scalability, and standard compatibility to avoid locking data into a proprietary technology.

  4. IPbus: A flexible Ethernet-based control system for xTCA hardware

    CERN Document Server

    Williams, Thomas Stephen

    2014-01-01

    The ATCA and uTCA standards include industry-standard data pathway technologies such as Gigabit Ethernet which can be used for control communication, but no specific hardware control protocol is defined. The IPbus suite of software and firmware implements a reliable high-performance control link for particle physics electronics, and has successfully replaced VME control in several large projects. In this paper, we outline the IPbus system architecture, and describe recent developments in the reliability, scalability and performance of IPbus systems, carried out in preparation for deployment of uTCA-based CMS upgrades before the LHC 2015 run. We also discuss plans for future development of the IPbus suite. SUMMARY: IPbus will be used for controlling the uTCA electronics in the CMS HCAL, TCDS, Pixel and Level-1 trigger upgrades. IPbus control has already been extensively used in the work of these upgrade projects so far, and final uTCA systems will be deployed in the experiment starting from Autumn 2014. IPbus is...

  5. Combining Topological Hardware and Topological Software: Color-Code Quantum Computing with Topological Superconductor Networks

    Directory of Open Access Journals (Sweden)

    Daniel Litinski

    2017-09-01

    We present a scalable architecture for fault-tolerant topological quantum computation using networks of voltage-controlled Majorana Cooper pair boxes and topological color codes for error correction. Color codes have a set of transversal gates which coincides with the set of topologically protected gates in Majorana-based systems, namely, the Clifford gates. In this way, we establish color codes as providing a natural setting in which advantages offered by topological hardware can be combined with those arising from topological error-correcting software for full-fledged fault-tolerant quantum computing. We provide a complete description of our architecture, including the underlying physical ingredients. We start by showing that in topological superconductor networks, hexagonal cells can be employed to serve as physical qubits for universal quantum computation, and we present protocols for realizing topologically protected Clifford gates. These hexagonal-cell qubits allow for a direct implementation of open-boundary color codes with ancilla-free syndrome read-out and logical T gates via magic-state distillation. For concreteness, we describe how the necessary operations can be implemented using networks of Majorana Cooper pair boxes, and we give a feasibility estimate for error correction in this architecture. Our approach is motivated by nanowire-based networks of topological superconductors, but it could also be realized in alternative settings such as quantum-Hall–superconductor hybrids.

  6. Hardware Implementation of Serially Concatenated PPM Decoder

    Science.gov (United States)

    Moision, Bruce; Hamkins, Jon; Barsoum, Maged; Cheng, Michael; Nakashima, Michael

    2009-01-01

    A prototype decoder for a serially concatenated pulse position modulation (SCPPM) code has been implemented in a field-programmable gate array (FPGA). At the time of this reporting, this is the first known hardware SCPPM decoder. The SCPPM coding scheme, conceived for free-space optical communications with both deep-space and terrestrial applications in mind, is an improvement of several dB over the conventional Reed-Solomon PPM scheme. The design of the FPGA SCPPM decoder is based on a turbo decoding algorithm that requires relatively low computational complexity while delivering error-rate performance within approximately 1 dB of channel capacity. The SCPPM encoder consists of an outer convolutional encoder, an interleaver, an accumulator, and an inner modulation encoder (more precisely, a mapping of bits to PPM symbols). Each code is describable by a trellis (a finite directed graph). The SCPPM decoder consists of an inner soft-in-soft-out (SISO) module, a de-interleaver, an outer SISO module, and an interleaver connected in a loop (see figure). Each SISO module applies the Bahl-Cocke-Jelinek-Raviv (BCJR) algorithm to compute a-posteriori bit log-likelihood ratios (LLRs) from a-priori LLRs by traversing the code trellis in forward and backward directions. The SISO modules iteratively refine the LLRs by passing the estimates between one another, much like the working of a turbine engine. Extrinsic information (the difference between the a-posteriori and a-priori LLRs) is exchanged rather than the a-posteriori LLRs to minimize undesired feedback. All computations are performed in the logarithmic domain, wherein multiplications are translated into additions, thereby reducing complexity and sensitivity to fixed-point implementation roundoff errors. To lower the required memory for storing channel likelihood data and the amounts of data transfer between the decoder and the receiver, one can discard the majority of channel likelihoods, using only the remainder in
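
    The log-domain arithmetic mentioned in this abstract can be illustrated with a short sketch. It shows the generic identities used in log-domain BCJR decoding (a product becomes a sum, a sum becomes the Jacobian logarithm, often written max*) and the extrinsic-information subtraction; the function names are assumptions, and this is not the FPGA design itself.

```python
import math


def max_star(a, b):
    """Jacobian logarithm: log(exp(a) + exp(b)), computed stably."""
    if a == -math.inf:
        return b
    if b == -math.inf:
        return a
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))


def log_product(log_p, log_q):
    """A product of probabilities is a sum of their logarithms."""
    return log_p + log_q


def extrinsic_llr(a_posteriori, a_priori):
    """Extrinsic information: the new evidence a SISO module contributes,
    i.e. its output LLR minus the LLR it was given as input."""
    return a_posteriori - a_priori


# Adding probabilities 0.2 and 0.3 in the log domain gives log(0.5).
log_sum = max_star(math.log(0.2), math.log(0.3))
```

    In hardware the correction term log(1 + exp(-|a - b|)) is commonly approximated with a small lookup table, which is one reason the log-domain form suits fixed-point implementations.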

  7. Quicksilver: Middleware for Scalable Self-Regenerative Systems

    Science.gov (United States)

    2006-04-01

    standard best practice in the area, and hence helped us identify problems that can be justified in terms of real user needs. Our own group may write a...semantics, generally lack efficient, scalable implementations. Systems approaches usually lack a precise formal specification, limiting the

  8. Scalable learning of probabilistic latent models for collaborative filtering

    DEFF Research Database (Denmark)

    Langseth, Helge; Nielsen, Thomas Dyhre

    2015-01-01

    Collaborative filtering has emerged as a popular way of making user recommendations, but with the increasing sizes of the underlying databases, scalability is becoming a crucial issue. In this paper we focus on a recently proposed probabilistic collaborative filtering model that explicitly...

  9. PSOM2—partitioning-based scalable ontology matching using ...

    Indian Academy of Sciences (India)

    B Sathiya

    2017-11-16

    The growth and use of the semantic web have led to a drastic increase in the size, heterogeneity and number of ontologies that are available on the web. Correspondingly, scalable ontology matching algorithms that will eliminate the heterogeneity among large ontologies have become a necessity.

  10. Cognition-inspired Descriptors for Scalable Cover Song Retrieval

    NARCIS (Netherlands)

    van Balen, J.M.H.; Bountouridis, D.; Wiering, F.; Veltkamp, R.C.

    2014-01-01

    Inspired by representations used in music cognition studies and computational musicology, we propose three simple and interpretable descriptors for use in mid- to high-level computational analysis of musical audio and applications in content-based retrieval. We also argue that the task of scalable

  11. Scalable Directed Self-Assembly Using Ultrasound Waves

    Science.gov (United States)

    2015-09-04

    at Aberdeen Proving Grounds (APG), to discuss a possible collaboration. The idea is to integrate the ultrasound directed self-assembly technique...difference between the ultrasound technology studied in this project, and other directed self-assembly techniques is its scalability and...deliverable: A scientific tool to predict particle organization, pattern, and orientation, based on the operating and design parameters of the ultrasound

  12. Scalable Robust Principal Component Analysis Using Grassmann Averages.

    Science.gov (United States)

    Hauberg, Søren; Feragen, Aasa; Enficiaud, Raffi; Black, Michael J

    2016-11-01

    In large datasets, manual data verification is impossible, and we must expect the number of outliers to increase with data size. While principal component analysis (PCA) can reduce data size, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA are not scalable. We note that in a zero-mean dataset, each observation spans a one-dimensional subspace, giving a point on the Grassmann manifold. We show that the average subspace corresponds to the leading principal component for Gaussian data. We provide a simple algorithm for computing this Grassmann Average (GA), and show that the subspace estimate is less sensitive to outliers than PCA for general distributions. Because averages can be efficiently computed, we immediately gain scalability. We exploit robust averaging to formulate the Robust Grassmann Average (RGA) as a form of robust PCA. The resulting Trimmed Grassmann Average (TGA) is appropriate for computer vision because it is robust to pixel outliers. The algorithm has linear computational complexity and minimal memory requirements. We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie; a task beyond any current method. Source code is available online.
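
    The averaging idea in this abstract can be sketched in a few lines of Python. This is an illustrative reading, not the authors' released code: the function name, the initialisation from the largest observation, and the stopping rule are assumptions, and only the plain average (GA) is shown, not the trimmed variant (TGA). Each zero-mean observation is treated as a one-dimensional subspace, so its sign is arbitrary; the loop flips every observation to agree with the current estimate and then averages.

```python
import math


def grassmann_average(data, max_iterations=100, tol=1e-12):
    """Estimate a leading direction by sign-aligned averaging.

    data: list of zero-mean observations (equal-length lists of floats).
    Assumption: weighting each unit direction by the observation's norm,
    which reduces to summing the sign-aligned observations themselves.
    """
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def normalise(v):
        length = math.sqrt(dot(v, v))
        return [a / length for a in v]

    # Hypothetical initialisation: the observation with the largest norm.
    q = normalise(max(data, key=lambda x: dot(x, x)))
    for _ in range(max_iterations):
        total = [0.0] * len(q)
        for x in data:
            sign = 1.0 if dot(x, q) >= 0.0 else -1.0
            total = [t + sign * a for t, a in zip(total, x)]
        q_new = normalise(total)
        converged = max(abs(a - b) for a, b in zip(q, q_new)) < tol
        q = q_new
        if converged:
            break
    return q


if __name__ == "__main__":
    points = [[2.0, 0.1], [-2.0, -0.1], [1.0, -0.1], [-1.0, 0.1]]
    print(grassmann_average(points))
```

    Each pass touches every observation once, which matches the linear cost the abstract claims. A robust variant would replace the plain sum with a trimmed or otherwise robust average per coordinate.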

  13. Scalable electro-photonic integration concept based on polymer waveguides

    NARCIS (Netherlands)

    Bosman, E.; Steenberge, G. van; Boersma, A.; Wiegersma, S.; Harmsma, P.J.; Karppinen, M.; Korhonen, T.; Offrein, B.J.; Dangel, R.; Daly, A.; Ortsiefer, M.; Justice, J.; Corbett, B.; Dorrestein, S.; Duis, J.

    2016-01-01

    A novel method for fabricating a single mode optical interconnection platform is presented. The method comprises the miniaturized assembly of optoelectronic single dies, the scalable fabrication of polymer single mode waveguides and the coupling to glass fiber arrays providing the I/O's. The low

  14. Coilable Crystalline Fiber (CCF) Lasers and their Scalability

    Science.gov (United States)

    2014-03-01

    highly power scalable, nearly diffraction-limited output laser. 37 References 1. Snitzer, E. Optical Maser Action of Nd3+ in A Barium Crown Glass...Electron Devices Directorate Helmuth Meissner Onyx Optics Approved for public release; distribution...lasers, but their composition (glass) poses significant disadvantages in pump absorption, gain, and thermal conductivity. All-crystalline fiber lasers

  15. Efficient Enhancement for Spatial Scalable Video Coding Transmission

    Directory of Open Access Journals (Sweden)

    Mayada Khairy

    2017-01-01

    Scalable Video Coding (SVC) is an international standard technique for video compression. It is an extension of H.264 Advanced Video Coding (AVC). In the encoding of video streams by SVC, it is suitable to employ the macroblock (MB) mode because it affords superior coding efficiency. However, the exhaustive mode decision technique that is usually used for SVC increases the computational complexity, resulting in a longer encoding time (ET). Many other algorithms have been proposed to solve this problem, but they have the drawback of increasing the transmission time (TT) across the network. To minimize the ET and TT, this paper introduces four efficient algorithms based on spatial scalability. The algorithms utilize the mode-distribution correlation between the base layer (BL) and enhancement layers (ELs) and interpolation between the EL frames. The proposed algorithms are of two categories. Those of the first category are based on interlayer residual SVC spatial scalability. They employ two methods, namely, interlayer interpolation (ILIP) and the interlayer base mode (ILBM) method, and enable ET and TT savings of up to 69.3% and 83.6%, respectively. The algorithms of the second category are based on full-search SVC spatial scalability. They utilize two methods, namely, full interpolation (FIP) and the full-base mode (FBM) method, and enable ET and TT savings of up to 55.3% and 76.6%, respectively.

  16. Scalable power selection method for wireless mesh networks

    CSIR Research Space (South Africa)

    Olwal, TO

    2009-01-01

    This paper addresses the problem of scalable dynamic power control (SDPC) for wireless mesh networks (WMNs) based on IEEE 802.11 standards. An SDPC model that accounts for architectural complexities witnessed in multiple radios and hops...

  17. Estimates of the Sampling Distribution of Scalability Coefficient H

    Science.gov (United States)

    Van Onna, Marieke J. H.

    2004-01-01

    Coefficient "H" is used as an index of scalability in nonparametric item response theory (NIRT). It indicates the degree to which a set of items rank orders examinees. Theoretical sampling distributions, however, have only been derived asymptotically and only under restrictive conditions. Bootstrap methods offer an alternative possibility to…
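
    A bootstrap estimate of the sampling distribution of H can be sketched as follows. This is a hedged illustration, not the procedure evaluated in the article: it assumes dichotomous (0/1) items and the usual Loevinger/Mokken definition of H as the sum of inter-item covariances divided by the sum of their maxima given the item marginals, and the function names and replicate count are hypothetical.

```python
import random


def coefficient_h(scores):
    """Scalability coefficient H for dichotomous item scores.

    scores: list of examinee rows, each a list of 0/1 item responses.
    Returns None when the maximum covariances sum to zero.
    """
    n = len(scores)
    k = len(scores[0])
    p = [sum(row[i] for row in scores) / n for i in range(k)]
    observed = 0.0
    maximum = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            p_both = sum(row[i] * row[j] for row in scores) / n
            observed += p_both - p[i] * p[j]
            # Largest covariance attainable with these marginals.
            maximum += min(p[i], p[j]) * (1.0 - max(p[i], p[j]))
    return observed / maximum if maximum > 0.0 else None


def bootstrap_h(scores, replicates=1000, seed=0):
    """Resample examinees with replacement and recompute H each time."""
    rng = random.Random(seed)
    n = len(scores)
    estimates = []
    for _ in range(replicates):
        sample = [scores[rng.randrange(n)] for _ in range(n)]
        h = coefficient_h(sample)
        if h is not None:  # skip degenerate resamples
            estimates.append(h)
    return estimates


if __name__ == "__main__":
    guttman = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [1, 1, 1]]
    print(coefficient_h(guttman))
    print(len(bootstrap_h(guttman, replicates=200)))
```

    The spread of the returned estimates (for example their standard deviation or percentiles) then serves as an empirical stand-in for the sampling distribution when the asymptotic theory's conditions are in doubt.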

  18. Hardware locks for a real-time Java chip multiprocessor

    DEFF Research Database (Denmark)

    Strøm, Torur Biskopstø; Puffitsch, Wolfgang; Schoeberl, Martin

    2016-01-01

    A software locking mechanism commonly protects shared resources for multithreaded applications. This mechanism can, especially in chip-multiprocessor systems, result in a large synchronization overhead. For real-time systems in particular, this overhead increases the worst-case execution time... and may void a task set's schedulability. This paper presents 2 hardware locking mechanisms to reduce the worst-case time required to acquire and release synchronization locks. These solutions are implemented for the chip-multiprocessor version of the Java Optimized Processor. The 2 hardware locking... mechanisms are compared with a software locking solution as well as the original locking system of the processor. The hardware cost and performance are evaluated for all presented locking mechanisms. The performance of the better-performing hardware locks is comparable with that of the original single global...

  19. A versatile hardware platform for brain computer interfaces.

    Science.gov (United States)

    Garcia, Pablo A; Haberman, Marcelo; Spinelli, Enrique M

    2010-01-01

    This article presents the development of a versatile hardware platform for brain computer interfaces (BCI). The aim of this work is to produce a small, autonomous and configurable BCI platform adaptable to the user's needs.

  20. Scientific Computing Using Consumer Video-Gaming Hardware Devices

    CERN Document Server

    Volkema, Glenn

    2016-01-01

    Commodity video-gaming hardware (consoles, graphics cards, tablets, etc.) performance has been advancing at a rapid pace owing to strong consumer demand and stiff market competition. Gaming hardware devices are currently amongst the most powerful and cost-effective computational technologies available in quantity. In this article, we evaluate a sample of current generation video-gaming hardware devices for scientific computing and compare their performance with specialized supercomputing general purpose graphics processing units (GPGPUs). For this evaluation we use the OpenCL SHOC benchmark suite, which measures the performance of compute hardware on a variety of scientific application kernels, together with Einstein@Home, a popular public distributed computing application in the field of gravitational physics.